Computer Systems – the impact of caches

Introduction
Different sorts of memory, with approximate access cost:
• On-die: 0/1/10 cycles
• On-board: 100 cycles
• On-disk: 10,000 cycles
• Off-machine: 1,000,000 cycles

The CPU-Memory Gap
• The gap between disk and DRAM speeds on one side and SRAM and CPU speeds on the other keeps increasing.

Storage Trends: bigger, not faster
(Culled from back issues of Byte and PC Magazine)

DRAM
metric              1980    1985    1990    1995    2000    2000:1980
$/MB               8,000     880     100      30       1        8,000
access (ns)          375     200     100      70      60            6
typical size (MB)  0.064   0.256       4      16      64        1,000

Disk
metric              1980    1985    1990    1995    2000    2000:1980
$/MB                 500     100       8    0.30    0.05       10,000
access (ms)           87      75      28      10       8           11
typical size (MB)      1      10     160   1,000   9,000        9,000

Processor trends: faster

SRAM
metric              1980    1985    1990    1995    2000    2000:1980
$/MB              19,200   2,900     320     256     100          190
access (ns)          300     150      35      15       2          100
typical size (MB): 0.008, 0.016, 0.032

Processor
metric              1980    1985    1990    1995    2000    2000:1980
processor           8080     286     386    Pent   P-III
clock rate (MHz)       1       6      20     150     750          750
cycle time (ns)    1,000     166      50       6     1.6          750

Intel Processors: Cache (SRAM)
http://www.intel.com/pressroom/kits/quickreffam.htm

Memory Hierarchy
Toward L0: smaller, faster, and costlier (per byte) storage devices.
Toward L5: larger, slower, and cheaper (per byte) storage devices.
• L0: Registers. CPU registers hold words retrieved from cache memory.
• L1: On-chip L1 cache (SRAM). L1 cache holds cache lines retrieved from the L2 cache.
• L2: Off-chip L2 cache (SRAM). L2 cache holds cache lines retrieved from memory.
• L3: Main memory (DRAM). Main memory holds disk blocks retrieved from local disks.
• L4: Local secondary storage (local disks). Local disks hold files retrieved from disks on remote network servers.
• L5: Remote secondary storage (distributed file systems, Web servers).

Pay the price
• SRAM: 4 MB for ~$500
• DRAM: 1 GB for ~$200
• Disk: 80 GB for ~$110
• To access large amounts of data in a cost-effective manner, the bulk of the data must be stored on disk.

Locality
• Principle of Locality: programs tend to reuse data and instructions near those they have used recently, or that were recently referenced themselves.
• Temporal locality: recently referenced items are likely to be referenced again in the near future.
• Spatial locality: items with nearby addresses tend to be referenced close together in time.

Locality Example
    sum = 0;
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum;
• Data
  • Reference array elements in succession (stride-1 reference pattern): spatial locality
  • Reference sum each iteration: temporal locality
• Instructions
  • Reference instructions in sequence: spatial locality
  • Cycle through loop repeatedly: temporal locality

Power Programmer
• Claim: Being able to look at code and get a qualitative sense of its locality is a key skill for a professional programmer.
• Good locality?
    int sumarrayrows(int a[M][N])
    {
        int i, j, sum = 0;

        /* The inner loop walks along a row: consecutive addresses. */
        for (i = 0; i < M; i++)
            for (j = 0; j < N; j++)
                sum += a[i][j];
        return sum;
    }

Stride-N example
• Question: Does this function have good locality?
    int sumarraycols(int a[M][N])
    {
        int i, j, sum = 0;

        /* The inner loop walks down a column: each step skips N ints. */
        for (j = 0; j < N; j++)
            for (i = 0; i < M; i++)
                sum += a[i][j];
        return sum;
    }

Matrix M=2, N=3
[Figure: int sumarrayrows() and int sumarraycols()]

Expect: Stride-1 is better!
• int A[2][4]

Reality: small matrices fit in cache
• int A[32][32]

Reality: the performance drop from L1 to L2 cache is not dramatic
• int A[180][180]

Reality: the penalty only becomes visible when DRAM is accessed
• int A[512][512]

Memory Mountain

Summary
• As long as your data fits in the cache and your program shows good locality, good performance is guaranteed.

Assignment
• Practice Problem 6.9 (p. 624): 'Order three functions with respect to the spatial locality enjoyed by each.'
• Practice Problem 6.22 (p. 659): 'Estimate the time, in CPU cycles, to read an 8-byte word from the L1 d-cache of an i7 processor.'