Intelligent Memory

A technical précis of "Intelligent Memory".

Introduction

There are currently three memory architectures deployed in computer systems: "Traditional (dumb)" memory (DRAM, SRAM, etc.), "Content Addressable Memory" (binary, ternary, etc.), and "Intelligent Memory".

Traditional memory has Address, Data and Control buses, where the Address bus identifies the memory word accessed by the host in each transaction. The Address bus width determines the maximum addressable memory and therefore sets a size limit on the system. This was a major issue in the transition from 16-bit to 32-bit processors, and we are now crossing the capacity bridge again in the transition from 32-bit to 64-bit processors. Access to memory of this type is by nature sequential, so even the most efficient searching mechanisms will encounter many "miss" locations before identifying the correct location of the data.
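The two points above (the capacity limit set by the Address bus width, and the "miss" locations met during a sequential search) can be illustrated with a minimal Python sketch. The function names addressable_words and sequential_search are hypothetical, chosen for this illustration only, and the memory is modelled as a plain list with one word per address.

```python
from typing import List, Optional, Tuple


def addressable_words(address_bits: int) -> int:
    """Number of distinct words an Address bus of this width can select."""
    return 2 ** address_bits


def sequential_search(memory: List[int], target: int) -> Tuple[Optional[int], int]:
    """Scan memory one address at a time, as a host must with traditional memory.

    Returns (address of the first match or None, number of "miss" locations
    read before the search ended).
    """
    misses = 0
    for address, word in enumerate(memory):
        if word == target:
            return address, misses
        misses += 1
    return None, misses


if __name__ == "__main__":
    for bits in (16, 32, 64):
        print(f"{bits}-bit Address bus: {addressable_words(bits):,} words")
    # Illustrative data only: the host reads each location until it finds 42.
    print(sequential_search([17, 4, 99, 23, 42, 8], 42))
```

Smarter search schemes (indexes, hashing, binary search over sorted data) reduce the number of misses, but each probe is still a separate transaction on the Address bus.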

Content Addressable Memory (CAM) has Data, Control and AddressHit buses. During certain operational phases, internal comparators compare the Data bus value with the data stored in each word and, where they match, the address of the match is presented on the AddressHit bus. This address is used to drive a traditional memory device which holds the related data. Searches are carried out in parallel, as each storage element contains a comparator, thus providing a very high performance search engine. This type of memory is used extensively in microprocessor cache (binary) and IP routing (ternary) applications, but its high cost prohibits widespread use in database systems. CAM applications are only as scalable as the specific product design allows, as the maximum capacity is limited by the width of the AddressHit bus.
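As a rough software model of the lookup described above, the sketch below stores each CAM word as a (value, care mask) pair, so that a binary CAM is simply the case where every mask bit is set. The names cam_search and lookup, the mask encoding, and the rule that the lowest matching address wins are assumptions made for this illustration; real devices differ in how they resolve multiple matches.

```python
from typing import List, Optional, Sequence, Tuple

# Each entry is (value, care_mask). A mask bit of 1 means "this bit must match";
# a mask bit of 0 means "don't care" (ternary). Binary CAM: mask is all ones.
Entry = Tuple[int, int]


def cam_search(entries: List[Entry], key: int) -> Optional[int]:
    """Return the AddressHit value for key, or None if no word matches.

    Hardware performs every comparison at once; the list comprehension stands
    in for that bank of comparators. Picking the lowest matching address is an
    assumed tie-break rule, not something the text specifies.
    """
    hits = [address for address, (value, mask) in enumerate(entries)
            if (key & mask) == (value & mask)]
    return min(hits) if hits else None


def lookup(entries: List[Entry], data_ram: Sequence[str], key: int) -> Optional[str]:
    """Use the match address to index a traditional memory holding related data."""
    hit = cam_search(entries, key)
    return None if hit is None else data_ram[hit]


if __name__ == "__main__":
    # Illustrative 8-bit ternary table, most specific pattern first.
    entries = [(0b10100000, 0b11110000),
               (0b10000000, 0b11000000),
               (0b00000000, 0b00000000)]
    data_ram = ["port A", "port B", "default"]
    for key in (0b10101111, 0b10011111, 0b01000000):
        print(f"{key:08b} -> {lookup(entries, data_ram, key)}")
```

The software loop takes time proportional to the number of entries; the point of the hardware is that the same comparison completes for every word in a single search cycle.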

"Intelligent Memory" is a generic term used to describe a memory system that combines a processing capability with a storage capability in a cell. The memory (and processing) system of the computer is then made up of an array of cells and some form of host processor that provides an I/O controller. "Intelligent Memory" systems are characterised by the complexity of the processing element and the size of the storage element in each cell.

Large Cell Systems (MPP)

The only commercially available "Intelligent Memory" systems are often categorised as "Massively Parallel Processors". Each cell consists of a full-capability RISC processor attached to a DRAM memory array of significant size (64Mb). While this design is very effective at general purpose "Super-Computing" tasks, the high unit cost of the cells significantly limits the number of cells that can be deployed and therefore the degree of parallelism that can be achieved. There are also issues with inter-cell communication, which need to be resolved by specifically rewriting programs to take advantage of these resources. This process is expensive and not applicable to the wider commercial database market.

Memory and Re-configurable logic systems

Researchers at the University of California, Davis have proposed a simpler, more scalable technology called Active Pages (http://citeseer.nj.nec.com/oskin98active.html). In this technology, a DRAM block (64Mb) is attached to a re-configurable logic block to make up a cell. This technology is more cost effective, but is difficult to configure for practical applications beyond the algorithms developed by the researchers.

Small Cell Systems (SIMD) - Intelligent Memory

Our technology extends the principle of reducing complexity. To maximise the cell count and therefore the degree of parallelism achieved, we have opted for a storage cell comparable in size to a single record within the database (2Kb), and have reduced the processing element to a simple magnitude comparator, which provides for all of the common functions executed by a relational database (Select, Insert, Delete, Update). The result is a Single Instruction Multiple Data (SIMD) co-processor. Our design therefore aims for the best balance between cost effectiveness and performance in database applications.
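To make the small-cell idea concrete, the following Python sketch models an array in which every cell holds one record and a magnitude comparator, and the host broadcasts the same comparison to all cells. It is an illustration under stated assumptions rather than a description of the actual hardware: the class names Cell and CellArray, the dictionary record format, and the convention of passing the wanted comparator results (-1, 0, 1) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Cell:
    """One small cell: storage for a single record plus a magnitude comparator."""
    record: Optional[dict] = None  # None marks an empty cell

    def compare(self, field: str, operand) -> Optional[int]:
        """Return -1, 0 or 1 for record[field] <, ==, > operand; None if empty."""
        if self.record is None:
            return None
        value = self.record[field]
        return (value > operand) - (value < operand)


class CellArray:
    """Hypothetical host-side view of the cell array."""

    def __init__(self, size: int) -> None:
        self.cells: List[Cell] = [Cell() for _ in range(size)]

    def _matches(self, field: str, operand, wanted: Tuple[int, ...]) -> List[Cell]:
        # Every cell receives the same instruction (SIMD); this loop stands in
        # for comparators that would all operate in the same cycle.
        return [c for c in self.cells if c.compare(field, operand) in wanted]

    def insert(self, record: dict) -> bool:
        """Store the record in the first empty cell; False if the array is full."""
        for cell in self.cells:
            if cell.record is None:
                cell.record = dict(record)
                return True
        return False

    def select(self, field: str, operand, wanted: Tuple[int, ...] = (0,)) -> List[dict]:
        return [dict(c.record) for c in self._matches(field, operand, wanted)]

    def update(self, field: str, operand, changes: dict,
               wanted: Tuple[int, ...] = (0,)) -> int:
        hits = self._matches(field, operand, wanted)
        for cell in hits:
            cell.record.update(changes)
        return len(hits)

    def delete(self, field: str, operand, wanted: Tuple[int, ...] = (0,)) -> int:
        hits = self._matches(field, operand, wanted)
        for cell in hits:
            cell.record = None
        return len(hits)


if __name__ == "__main__":
    array = CellArray(4)
    for row in ({"id": 1, "qty": 5}, {"id": 2, "qty": 12}, {"id": 3, "qty": 30}):
        array.insert(row)
    print(array.select("qty", 10, wanted=(1,)))  # records with qty > 10
```

In this model a Select with wanted=(1,) corresponds to "greater than", (0,) to "equals" and (-1, 0) to "less than or equal", which is how a single magnitude comparator can serve the range and equality predicates used by Select, Update and Delete.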