Enumera CPU
Low cost super computing solutions.
History of the CPU Core
1968 Chuck Moore invents the FORTH language
1980 Developed by Chuck Moore at Forth Inc.
1983 Broke from Forth Inc to become Novix
1985 Harris buys Novix; chip becomes Harris RTX family
Low cost super computing solutions
US05070451 12/03/1991 Forth specific language microprocessor
US05319757 06/07/1994 FORTH specific language microprocessor
US05440749 08/08/1995 High performance, low cost microprocessor architecture
US05530890 06/25/1996 High performance, low cost microprocessor
US05604915 02/18/1997 Data processing system having load dependent bus timing
US05659703 08/19/1997 Microprocessor system with hierarchical stack and method of operation
US05784584 07/21/1998 High performance microprocessor using instructions that operate within instruction groups
US05809336 09/15/1998 High performance microprocessor having variable speed system clock
These chips are stack-based processors.
There are no registers in the "normal sense"; they use a stack instead.
They are often referred to as FORTH chips, but the language they use is very different from FORTH. The CPUs will also support more common languages such as C.
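To make the "stack instead of registers" idea concrete, here is a minimal Python sketch of a stack machine. The operation names (`+`, `*`, `dup`, `swap`) are illustrative assumptions and are not the F21 instruction set; the point is only that every operation finds its operands at the top of the stack.

```python
# Minimal stack machine sketch. The opcodes are hypothetical and chosen
# for illustration only; they are NOT the F21 instruction set.
def run(program):
    """Execute a list of literals and opcodes; return the final data stack."""
    stack = []
    for op in program:
        if isinstance(op, int):
            stack.append(op)              # literal: push onto the data stack
        elif op == "+":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)           # operands come from the top of stack
        elif op == "*":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif op == "dup":
            stack.append(stack[-1])       # duplicate the top item
        elif op == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:
            raise ValueError(f"unknown opcode: {op!r}")
    return stack


# (2 + 3) * 4 written in postfix order, with no named registers involved.
print(run([2, 3, "+", 4, "*"]))
```

Because no instruction names a register, the instructions themselves can be very short, which is one reason stack code tends to be compact.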
The Book cover to the Left is by
Context switching for interrupts takes essentially zero time
Does not need to be pipelined for ALU and operands, because the operands are immediately available in the top of stack
Code is smaller than CISC programs by a factor of 2.5 to 8
Comments from Phil Koopman’s Book:
Size and Weight
Power and Cooling
Process: 0.8 um (current state of the art: 0.18 um)
MIPS: 500 (asynchronous design, no crystal, voltage dependent)
The F21 comes with a network co-processor for parallel processing
Video co-processor (NTSC out in the I21 Design)
Analog processor (6-bit, 40 MHz sample rate)
1 instruction per clock cycle. A total of 16,000 transistors!
Very predictable behavior. Full code simulator for Windows.
Designed with OKAD (pronounced "Oh CAD"), a VLSI design environment.
Custom software written by Chuck
Will allow for full electrical simulation
Will allow for macro and micro thermal simulation
(this allows for better optimization of design performance)
John Sokol started Enumera to be able to expand research into parallel computing.
This began as an extension from using clusters for video compression and high performance web and video serving.
The first project was 100 Pentium CPUs in a 6" rack, with Ethernet multicast boot on diskless motherboards. Designs for over 300 CPUs in a 6" (single depth) rack were drafted but dropped when we found the Chuck Moore CPUs.
1980 Experimented with 8x 6809 CPUs with shared memory
(wire wrapped); developed a RAM memory bank passing system.
1984 Built a cluster of 20+ Commodore 64s.
1988-92 Had the opportunity to work on a Cray Y-MP. Then ported finite element code for Maxwell's equations to an SGI 4-CPU SMP box.
1993 At Sun Microsystems, used 384 globally distributed servers to deliver live and recorded video to 13,000 viewers.
1994 At Wells Fargo, worked on an architecture to use 600 HP-UX boxes located in the bank branches to reduce load on the mainframes.
1995 to present: have been using clusters of loosely coupled servers for
video compression, serving, neural networks and genetic programming.
These numbers are proprietary.
Process    V     MIPS    mA    mW
0.8 um     5.0    500    20    100    F21 (tested)
0.35 um    3.0   1100    15     45
0.35 um    3.3   1200    16     54
0.18 um    1.5   2000    12     18
0.18 um    1.8   2400    12     21
Numbers are guesstimates pending simulation.
Speed can be throttled by a factor of 1024, reducing power to uW.
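The table figures can be sanity-checked with a short Python sketch: power in mW is volts times mA, and MIPS per mW gives a rough efficiency figure. The throttling function assumes power scales linearly with speed, which is an assumption made here for illustration and is not stated on the slide.

```python
# Rows copied from the table: (process, volts, MIPS, mA, mW as listed).
ROWS = [
    ("0.8 um",  5.0,  500, 20, 100),
    ("0.35 um", 3.0, 1100, 15,  45),
    ("0.35 um", 3.3, 1200, 16,  54),
    ("0.18 um", 1.5, 2000, 12,  18),
    ("0.18 um", 1.8, 2400, 12,  21),
]


def power_mw(volts, milliamps):
    """Electrical power in mW from supply voltage (V) and current (mA)."""
    return volts * milliamps


def mips_per_mw(mips, milliwatts):
    """Rough efficiency figure: MIPS delivered per mW."""
    return mips / milliwatts


def throttled_power_uw(milliwatts, factor=1024):
    """Power in microwatts after throttling speed by `factor`.

    ASSUMPTION: power scales linearly with speed. The slide only says
    that throttling reduces power to the microwatt range.
    """
    return milliwatts * 1000.0 / factor


for process, volts, mips, ma, listed_mw in ROWS:
    computed = power_mw(volts, ma)
    print(f"{process:8s} listed {listed_mw:4d} mW, V x mA = {computed:6.1f} mW, "
          f"{mips_per_mw(mips, listed_mw):6.1f} MIPS/mW")
```

Under the linear assumption, the tested F21 figure of 100 mW throttled by 1024 comes to a little under 100 microwatts, which is consistent with the slide's claim.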
1.) As low cost, low power devices, using a single core and co-processors.
2.) As an ultra high performance cluster on a chip, still low power and cost.
As a stand-alone processor the F21 already uses less power than the DragonBall, ARM processors, or MIPS cores such as the NEC VR4181.
The F21 is also much lower cost, and runs over 100x faster than a typical Motorola DragonBall (2.7 MIPS).
Enumera plans to release a version of this without the video and some internal DRAM as a single-chip MP3 player.
Co-processor dedicated to audio or I/O processing can be
As a single-chip computing cluster (multiple cores on one chip) there are many advantages.
The Cores are asynchronous so this will average out the draw on the power supply.
Each core will have 4 to 6 bi-directional serial interfaces, each running at 1+ gigabit per second, to its neighboring CPU cores.
Each Gbps serial will be addressable as a simple register read and write.
A patent is being filed for a cooling and power delivery system to allow for enormous clusters and RAM on very large chips.
In theory it would be possible to build a single device running at 10 TeraFlops!
On a chip of this size, a 7x7 array (49 CPUs) with RAM could be
built. Co-processors could also be added.
Each CPU would operate at 2400 MIPS, so 49 CPUs give a total of about 117 billion operations per second.
The power consumption would be about 1 watt (1.8 volts at roughly 500 mA).
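The array arithmetic can be reproduced with a few lines of Python. The per-core current of 12 mA is taken from the 0.18 um / 1.8 V row of the table and is itself a guesstimate; multiplying it out gives a figure slightly above the rounded 500 mA and 1 watt quoted on the slide.

```python
# Back-of-envelope check of the 7x7 array figures.
# ASSUMPTION: each core draws the 12 mA listed in the 0.18 um / 1.8 V
# table row; the slide rounds the resulting total.
CORES = 7 * 7
MIPS_PER_CORE = 2400
VOLTS = 1.8
MA_PER_CORE = 12


def array_totals(cores, mips_per_core, volts, ma_per_core):
    """Return (total MIPS, total mA, total mW) for an array of cores."""
    total_mips = cores * mips_per_core
    total_ma = cores * ma_per_core
    total_mw = volts * total_ma
    return total_mips, total_ma, total_mw


mips, ma, mw = array_totals(CORES, MIPS_PER_CORE, VOLTS, MA_PER_CORE)
print(f"{CORES} cores: {mips / 1000:.1f} billion operations per second")
print(f"current: {ma} mA, power: {mw / 1000:.2f} W")
```

The throughput figure assumes every core runs at full speed at once, so it is a peak number, not a sustained one.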
With this level of computing power new applications that were unthinkable before, now become possible.
at over 1 GHz! A new tracking A/D design is also expected to increase performance further.
Ultra High speed PWM output design will reduce the size and complexity of power drivers for motors and actuators.
Disposable computing (MP3 players and PDAs, as calculators and watches are today).
Improved servos and more advanced coding systems than Viterbi.
(increase storage capacity and reduce cost.)
Reduction of component count.
Possibly eliminate external RAM.
Addition of Video and Audio compression.
Addition of TCP/IP servers on the drive itself.
Addition of RAID on the drive.