Download Presentation
## Physics of Information Technology

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -

**Physics of Information Technology**Crystalline Computation Norm Margolus**Crystalline Computation**Spatial Computers • Architectures and algorithms for large-scale spatial computation Emulating Physics • Finite-state, locality, invertibility, and conservation laws Physical Worlds • Incorporating comp-universality at small and large scales see: http://people.csail.mit.edu/nhm/cc.pdf**Spatial Computers**• Finite-state computations on lattices are interesting • Crystalline computers will one day have highest performance • Today’s ordinary computers are terrible at these computations**Tom Toffoli and I wanted to see CA-TV**Stimulated by new rev rules Rule = a few TTL chips! Later, rule = SRAM LUT LGA’s needed more “serious” machine! CAMs 1 to 6**Lattice example**//* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step **Lattice example**//* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step **Lattice example**//* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step **Lattice example**//* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step **Lattice example**//* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step **Lattice example**//* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step **Lattice example**//* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step**Lattice Gas Machines**Better for modeling: • Flexible neighborhoods • Incorporate physics Better for hardware: • Easier communication • Flexible parallelism The lattice gas programming model: alternate steps of movement and interaction.**CAM-8 node**• maps directly onto programming model (bit fields) • 16-bit lattice sites (use internal dimension to get more) • each node handles a chunk of a larger space**CAM-8: discrete hydrodynamics**Six direction LGA flow past a half-cylinder. System is 2Kx1K.**Crystallization**Crystallization using irreversible forces (Jeff Yepez, AFOSR)**CAM-8: materials simulation**512x512x64 spin system, 6up/sec w rendering 256x256x256 reaction-diffusion (Kapral)**Physical sim**Logical sim Wavefront sim Microprocessor at right runs at 1KHz CAM-8: logic simulation A simple logic simulation in a gate-array-like lattice dynamics. Wavefront simulation (R. Agin)**CAM-8: porous flow**• Used MRI data from a rock (Fontainebleau sandstone) • Simulation of flow of oil through wet sandstone agrees with experiment within 10% • 3D LGA’s used for chemical reactions, other complex fluids 64x64x64 simulation (.5mm/side)**Local dynamics can be used to find and mark features**Bit maps can be “continuously” rotated using 3 shears CAM-8: image processing (Toffoli)**25 Million 16-bit site-updates/sec**8 node prototype arbitrary shifts of all bit fields at no extra cost limited by mem bandwidth still 100x PC! CAM-8 node (1988 technology) 6MB/sec x 16 DRAMs + 16 ASICs 25 MHz 64Kx16**100 x faster per DRAM chip**No custom hardware Needed fast FPGA interface to DRAM FPGA node design (1997) RDRAM FPGA 600MB/sec 1 RDRAM + 1 FPGA 150 MHz 64Kx16 pipelined**Embedded DRAM (2000)**4 Mbit Block of DRAM: 20 Blocks of DRAM (80 Mbits total): 2K 40K 2K 2K 2K bits / 40 ns 40K bits / 40 ns = 1 Tbit/sec per chip! Each chip would be 1000x faster than 128-DRAM CAM-8!**SPACERAM: A general purpose crystalline lattice computer**• Large scale prob with spatial regularity • 1 Tbit/sec ≈ 1010 32-bit mult-accum / sec (sustained) for finite difference • Many image processing and rendering algorithms have spatial regularity • Brute-force physical simulations (complex MD, fields, etc.) • Bit-mapped 3D games and virtual reality (simpler and more direct) • Logic simulation (physical, wavefront) • Pyramid, multigrid and multiresolution computations**The Task: Large-scale brute force computations with a**crystalline-lattice structure The Opportunity: Exploit row-at-a-time access in DRAM to achieve speed and size. The Challenge: Dealing efficiently with memory granularity, commun, processing The Trick: Data movement by layout and addressing The Result: Essentially perfect use of memory and communications bandwidth, without buffering SPACERAM . 18µm CMOS, 10 Mbytes DRAM, 1 Tbit/sec to mem, 215 mm2, 20W http://people.csail.mit.edu/nhm/isca.pdf**Lattice computation model**1D example: the rows are bit-fieldsand columns sites A B Shift bit-fields periodically and uniformly (may be large) A B Process sites independently and identically (SIMD)**Mapping the lattice into memory**A A B B A A B B **Mapping the lattice into memory**CAM-8 approach: group adjacent into memory words A B Shifts can be long but re-aligning messy chaining A B Can process word-wide (CAM-8 did site-at-a-time)**Mapping the lattice into memory**CAM-8: Adjacent bits of bit-field are stored together = + + +**Mapping the lattice into memory**SPACERAM: bits stored as evenly spread skip samples. = + + +**…**module #00 … module #01 … module #02 … … module #03 … module #19 PE0 PE63 The SPACERAM chip**Virtual Processor**Space-efficient parallelism Static routing (systolic) Lattice Gas Machines Simpler & more general Ideal for physical modeling Virtual SIMD programming General Lattice Computer Perfect memory embedding Finer-grained virtual-SIMD Computing Crystals Architecture in software Easier to design/build/test Hardware Evolution**Crystalline Computation**Spatial Computers • Architectures and algorithms for large-scale spatial computation Emulating Physics • Finite-state, locality, invertibility, and conservation laws Physical Worlds • Incorporating comp-universality at small and large scales see: http://people.csail.mit.edu/nhm/cc.pdf**Why emulate physics?**• Comp must adapt to microscopic physics • Comp models may help us understand nature • Rich dynamics • Start with locality: Cellular Automata**Conway’s “Game of Life”**In each 3x3 neighborhood, count the ones, not including the center: If total = 2:center unchanged If total = 3:center becomes 1 Else:center becomes 0 256x256 region of a larger grid. Glider gun inserted near middle.**Conway’s “Game of Life”**• Captures physical locality and finite-state But, • Not reversible (doesn’t map well onto microscopic physics) • No conservation laws (nothing like momentum or energy) • No interesting large-scale behavior 256x256 region of a larger grid. About 1500 steps later.**Reversibility & other conservations**• Reversibility is conservation of information • Why does exact conservation seem hard? The same information is visible at multiple positions For rev, one nth of the neighbor info must be left at the center**Adding conservations**• With traditional CA’s, conservations are a non-local property of the dynamics. • Simplest solution: redefine CA’s so that conservation is a manifestly local property • CA regular computation in space & time • Regular in space: repeated structure • Regular in time: repeated sequence of steps**Diffusion rule**Even steps:rotate cw or ccw a b c a b d or c d d b a c Odd steps:rotate cw or ccw a b c a b d or c d d b a c We “randomly” choose to rotate blocks 90-degrees cw or ccw (we actually use a fixed sequence of choices for each spot). Use 2x2 blockings. Use solid blocks on even time steps, use dotted blocks on odd steps.**Diffusion rule**Even steps:rotate cw or ccw a b c a b d or c d d b a c Odd steps:rotate cw or ccw a b c a b d or c d d b a c We “randomly” choose to rotate blocks 90-degrees cw or ccw (we actually use a fixed sequence of choices for each spot).**Diffusion rule**Even steps:rotate cw or ccw a b c a b d or c d d b a c Odd steps:rotate cw or ccw a b c a b d or c d d b a c We “randomly” choose to rotate blocks 90-degrees cw or ccw (we actually use a fixed sequence of choices for each spot).**TM Gas rule**Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Use 2x2 blockings. Use solid blocks on even time steps, use dotted blocks on odd steps.**TM Gas rule**Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Even step: update solid blocks.**TM Gas rule**Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Odd step: update dotted blocks**TM Gas rule**Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Even step: update solid blocks**TM Gas rule**Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Odd step: update dotted blocks**TM Gas rule**Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Even step: update solid blocks**TM Gas rule**Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Odd step: update dotted blocks**a**b c a c d d b TM Gas rule Even steps:rotate cw Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc