Restrictive compression techniques that increase L1 cache capacity and reduce tag space requirements, presented at ICCD'05, San Jose. The talk covers the AWN, AHS, AAHS, and OATS methods for efficient cache block utilization.
Restrictive Compression Techniques to Increase Level 1 Cache Capacity
Prateek Pujara, Aneesh Aggarwal
{prateek, aneesh}@binghamton.edu
State University of New York at Binghamton
Presented by: Prateek Pujara
ICCD'05 San Jose
OUTLINE
• Introduction
• Motivation
• Restrictive Compression Schemes
  • AWN - All Words Narrow
  • AHS - Additional Half-word Storage
• Enhanced Techniques
  • AAHS - Adaptive AHS
  • OATS - Optimizing Address TagS
• Conclusion
Processor Memory Gap
• Performance gap between processor and memory is increasing.
• Cache memory is used to bridge this gap.
Cache Problems
• Access Latency
• Energy consumption
• Suggested Solutions
  • Small size
  • Pipelining
  • Decoupled tag and data
Pipelined Cache Access
Is that enough?
• Pipelining the cache prevents a reduction in throughput.
• Decoupling the tag and data accesses keeps energy consumption minimal.
• However, a small cache size can result in performance loss.
Alternative solutions
• Cache compression
• Many elaborate techniques have been proposed for the L2 cache and memory.
• The L2 cache can tolerate the overhead. Can the L1 cache tolerate it?
Previous work
• Frequent Value Cache (FVC)
• Small cache is provided to store frequently seen values in a cache block.
• Higher order insignificant bits are compressed to save energy.
Problem with L1 cache compression
• The L1 cache cannot tolerate an increase in latency.
• Elaborate techniques cannot be used.
• Compression should not change the byte-offset.
• For example:
  • A block is compressed by ignoring all the insignificant higher order bits/bytes.
  • The byte-offset of each word then depends on the sizes of the words before it.
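The byte-offset problem in the example above can be made concrete with a small sketch. The size rule here (unsigned words with leading zero bytes dropped) and the helper names `min_bytes` and `packed_offsets` are assumptions made for illustration, not a scheme taken from the slides.

```python
def min_bytes(word: int) -> int:
    """Bytes needed for an unsigned 32-bit word once leading zero bytes are dropped (at least 1)."""
    return max(1, (word.bit_length() + 7) // 8)


def packed_offsets(words):
    """Byte-offset at which each word starts when words are packed back to back."""
    offsets, pos = [], 0
    for w in words:
        offsets.append(pos)
        pos += min_bytes(w)
    return offsets


# Two blocks that differ only in their first word.
block_a = [0x000003AF, 0x00000000, 0x00007401, 0x12345678]
block_b = [0x00000001, 0x00000000, 0x00007401, 0x12345678]

print(packed_offsets(block_a))  # [0, 2, 3, 5]
print(packed_offsets(block_b))  # [0, 1, 2, 4]
```

Changing a single word shifts the position of every word after it, so the offset from the address can no longer be used directly to locate a word.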
Contribution of this work
• We investigate techniques to increase the L1 cache capacity.
  • The narrow width of the data is explored.
• Our compression techniques:
  • AWN (All Words Narrow)
  • AHS (Additional Half-word Storage)
  • AAHS (Adaptive AHS)
• We also propose OATS to reduce the additional tag space requirement, which is inevitable with any compression technique.
Narrow width data
Narrow Word: A word that can be represented using half the number of bits (16 bits on a 32-bit architecture).
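As a minimal sketch of the definition, the check below treats a 32-bit word as narrow when it equals the sign extension of its lower 16 bits. The slide only says the word "can be represented using half the number of bits", so the sign-extension rule and the name `is_narrow` are assumptions.

```python
WORD_BITS = 32
HALF_BITS = WORD_BITS // 2


def is_narrow(word: int) -> bool:
    """True if the 32-bit word equals the sign extension of its lower 16 bits (assumed rule)."""
    word &= 0xFFFFFFFF
    upper = word >> HALF_BITS                  # upper half-word
    sign = (word >> (HALF_BITS - 1)) & 1       # top bit of the lower half-word
    return upper == (0xFFFF if sign else 0x0000)


print(is_narrow(0x000003AF))  # True
print(is_narrow(0xFFFF93AF))  # True
print(is_narrow(0x01F09401))  # False
```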
TERMS USED IN THE PAPER
• Narrow Cache Block: A cache block that contains only narrow words, i.e. all the words are represented by half the number of bits.
• Normal Cache Block: A cache block in which all the words are represented using the entire set of bits.
• Physical Cache Block: The physical space provided in the cache to store a cache block.
AWN: All Words Narrow
• A block is compressed only if all the words in it are narrow words.
• Each narrow word is compressed to half its size.
• Thus the size of the cache block is reduced to half.
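The AWN rule can be sketched as follows. This is an illustrative model, assuming 4-word blocks and a sign-extension test for narrow words; `awn_compress` and `awn_decompress` are hypothetical names, not taken from the paper.

```python
def is_narrow(word: int) -> bool:
    """Assumed rule: the word equals the sign extension of its lower 16 bits."""
    word &= 0xFFFFFFFF
    sign = (word >> 15) & 1
    return (word >> 16) == (0xFFFF if sign else 0x0000)


def awn_compress(words):
    """Return the lower half-words if every word is narrow, otherwise None."""
    if all(is_narrow(w) for w in words):
        return [w & 0xFFFF for w in words]
    return None  # a single normal word keeps the whole block normal


def awn_decompress(halves):
    """Rebuild 32-bit words by sign-extending each stored half-word."""
    return [(h | 0xFFFF0000) if (h & 0x8000) else h for h in halves]


block = [0x000003AF, 0x00000000, 0xFFFF93AF, 0x00007401]
halves = awn_compress(block)
print([f"{h:04x}" for h in halves])   # ['03af', '0000', '93af', '7401']
print(awn_decompress(halves) == block)  # True
```

Because every word shrinks by the same factor, the word position inside the block is unchanged, which is the property the restrictive schemes rely on.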
AWN: All Words Narrow
[Figure: bit patterns of the words in a cache block (extracted labels: ..00 ..00 0000 ..00 0000 ..11 0111 1001).]
[Figure: a physical cache block holding a narrow cache block (extracted half-words: ffff 0000 03af 0000 0000 93af 0000 7401), with space left for another narrow cache block.]
[Figure: bit patterns of the words in a normal cache block (extracted labels: ..00 0000 ..00 0000 ..11 ..00 1001 1001).]
Additional Hardware
• Additional tag space is provided for each physical cache block.
• A width bit is provided for each physical cache block.
  • width bit = 0: 1 normal cache block
  • width bit = 1: 2 narrow cache blocks
Implementation details
[Figure: the byte-offset decoder with byte-offset = 3 and size = 32 bits, shown for two cases over a physical cache block holding 03af0000 93af7401 2794ffff 98f14000.
• Conventional case, or compressed block with width bit = 0 (1 normal block): 32 bits are read (2794fff in the extracted figure text).
• Compressed block with width bit = 1 (2 narrow blocks): 16 bits are read from each narrow block (93af and 98f1).]
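The read path in the figure can be approximated with the sketch below. It assumes the figure's offset of 3 selects the third word of a block (index 2 when counting from zero) and models the physical block as eight half-words; `read_candidates` is a hypothetical name.

```python
# Physical block from the figure, split into 16-bit half-words.
PHYSICAL = [0x03AF, 0x0000, 0x93AF, 0x7401, 0x2794, 0xFFFF, 0x98F1, 0x4000]
WORDS_PER_BLOCK = 4


def read_candidates(halves, width_bit, word_index):
    """Return the value(s) selected by the same word index for either width-bit setting."""
    if width_bit == 0:
        # One normal block: the word spans two adjacent half-words.
        hi, lo = halves[2 * word_index], halves[2 * word_index + 1]
        return [(hi << 16) | lo]
    # Two narrow blocks: one half-word is read from each block.
    return [halves[word_index], halves[WORDS_PER_BLOCK + word_index]]


print([f"{v:08x}" for v in read_candidates(PHYSICAL, 0, 2)])  # ['2794ffff']
print([f"{v:04x}" for v in read_candidates(PHYSICAL, 1, 2)])  # ['93af', '98f1']
```

In both cases the word index from the address is used unchanged. With width bit = 1, the tag comparison would presumably pick one of the two candidates, which is then extended back to 32 bits.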
Implementation details: Replacement Policy
• The replacement policy is still LRU.
• If the new cache block is a narrow cache block, then the conventional LRU policy is used.
• If the new cache block is a normal cache block, MRU information is used for replacement.
• This ensures that the technique does not perform worse than the conventional cache.
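The slides do not spell out how the MRU information is used, so the following is only one plausible reading, offered as a sketch: a new narrow block evicts the least recently used block in the set, while a new normal block evicts the physical block whose most recently used occupant is the oldest. The function names and the timestamp representation are assumptions.

```python
def victim_for_narrow(physical_blocks):
    """Return (physical index, slot index) of the least recently used block in the set.

    physical_blocks: list of lists of last-use times, one entry per stored block
    (one for a normal block, one or two for narrow blocks); higher means more recent.
    """
    slots = ((i, j) for i, times in enumerate(physical_blocks) for j in range(len(times)))
    return min(slots, key=lambda ij: physical_blocks[ij[0]][ij[1]])


def victim_for_normal(physical_blocks):
    """Return the physical index whose most recently used occupant is the oldest (assumed rule)."""
    return min(range(len(physical_blocks)), key=lambda i: max(physical_blocks[i]))


# A 2-way set: way 0 holds two narrow blocks, way 1 holds one normal block.
cache_set = [[16, 6], [10]]
print(victim_for_narrow(cache_set))  # (0, 1)
print(victim_for_normal(cache_set))  # 1
```

Under this reading, a recently used narrow block is not evicted just because it shares a physical block with a stale one.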
Implementation details: Replacement Policy
[Figure: replacement example for a new normal cache block. Extracted labels: physical cache blocks with width bit 1 holding narrow cache blocks marked 16 and 6; after replacement, the new normal cache block is marked 0 and stored with width bit 0.]
AHS: Additional Half-word Storage
• Limitation of AWN:
  • Even a single normal word makes the whole block a normal block.
• Additional half-word storage is provided so that such blocks can be stored as narrow blocks.
• An extra storage bit is provided for each word in the physical cache block.
AHS: Additional Half-word Storage
[Figure: a narrow cache block (half-words 0000 03af 0000 0000 ffff 93af 0000 7401) stored in a physical cache block, leaving space for another narrow cache block. Extra storage bits: 0 0 0 0 0 0 0 0. There are 2 extra half-words per physical cache block.]
[Figure: a normal cache block (half-words 0000 03af 0000 ffff 93af 01f0 9401 0000) stored in a physical cache block as 03af 0000 93af 9401, leaving space for another narrow cache block. Extracted extra half-words: xxxx 01f0; extracted extra storage bits: 1 0. There are 2 extra half-words per physical cache block.]
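A minimal sketch of AHS, assuming 4-word blocks, a sign-extension test for narrow words, and one extra half-word per block (2 per physical block, divided equally). The names `ahs_compress` and `ahs_decompress` and the layout of the returned tuple are illustrative, not the paper's hardware organization.

```python
def is_narrow(word: int) -> bool:
    """Assumed rule: the word equals the sign extension of its lower 16 bits."""
    word &= 0xFFFFFFFF
    sign = (word >> 15) & 1
    return (word >> 16) == (0xFFFF if sign else 0x0000)


def ahs_compress(words, extra_per_block=1):
    """Return (lower halves, extra storage bits, extra half-words), or None if too many normal words."""
    flags = [0 if is_narrow(w) else 1 for w in words]
    if sum(flags) > extra_per_block:
        return None
    lower = [w & 0xFFFF for w in words]
    extras = [(w >> 16) & 0xFFFF for w, f in zip(words, flags) if f]
    return lower, flags, extras


def ahs_decompress(lower, flags, extras):
    """Rebuild the words: flagged words take their upper half from the extra storage."""
    remaining = iter(extras)
    words = []
    for h, f in zip(lower, flags):
        if f:
            words.append((next(remaining) << 16) | h)
        else:
            words.append((h | 0xFFFF0000) if (h & 0x8000) else h)
    return words


block = [0x000003AF, 0x00000000, 0xFFFF93AF, 0x01F09401]  # one normal word
lower, flags, extras = ahs_compress(block)
print([f"{h:04x}" for h in lower], flags, [f"{e:04x}" for e in extras])
# ['03af', '0000', '93af', '9401'] [0, 0, 0, 1] ['01f0']
```

The lower half-words stay at their usual positions, so the offset into the block is still unaffected; only the flagged word needs a second access into the extra storage.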
Increase in cache capacity
Limitations of AHS
• The additional half-word storage space is not optimally utilized.
• The extra half-word space is divided equally among the potential narrow cache blocks that can occupy the physical cache block.
• Thus, when 4 extra half-words are provided, the physical cache block cannot hold one block with 1 normal-sized word together with another block with 3 normal-sized words.
Adaptive AHS - AAHS
• An adaptive scheme that allows the blocks to take a varied number of extra half-words.
• 2 extra storage bits are required per word because a cache block can use more than 1 extra half-word.
• To avoid a further increase in extra storage bits, a cache block is restricted to at most 3 extra half-words when 4 extra half-words are provided.
Adaptive AHS - AAHS
[Figure: a physical cache block with 4 extra half-words, shared by a narrow cache block with 1 normal-sized word and a narrow cache block with 3 normal-sized words.
• Blocks before compression (extracted half-words): 3801 0100 0010 0000 ffff f8b2 0000 9401 and 0000 03af 0000 0000 00ff 7401 ffff 93af
• Physical cache block: 93af 7401 3801 0000 f8b2 9401 03af 0000
• Extra half-words (as extracted): xxxx xxxx xxxx xxxx 00ff 0100 0010 0000
• Extra storage bits: 01 00 01 10 11 00 00 00]
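The difference between the fixed split of AHS and the adaptive split of AAHS can be sketched as a capacity check. The sign-extension test, the function names, and the example word values (loosely based on the half-words in the figure) are assumptions. The cap of 3 matches what two bits can encode: one value for "narrow" and three for extra half-words.

```python
EXTRA_HALFWORDS = 4   # extra half-words per physical cache block
MAX_PER_BLOCK = 3     # cap per block under AAHS


def is_narrow(word: int) -> bool:
    """Assumed rule: the word equals the sign extension of its lower 16 bits."""
    word &= 0xFFFFFFFF
    sign = (word >> 15) & 1
    return (word >> 16) == (0xFFFF if sign else 0x0000)


def count_normal(words) -> int:
    """Number of words that need an extra half-word."""
    return sum(not is_narrow(w) for w in words)


def fits_ahs(block_a, block_b) -> bool:
    """AHS: the extra half-words are divided equally between the two blocks."""
    share = EXTRA_HALFWORDS // 2
    return count_normal(block_a) <= share and count_normal(block_b) <= share


def fits_aahs(block_a, block_b) -> bool:
    """AAHS: blocks draw from a shared pool, each capped at MAX_PER_BLOCK."""
    a, b = count_normal(block_a), count_normal(block_b)
    return a <= MAX_PER_BLOCK and b <= MAX_PER_BLOCK and a + b <= EXTRA_HALFWORDS


one_normal = [0xFFFF93AF, 0x00FF7401, 0x000003AF, 0x00000000]    # 1 normal word
three_normal = [0x01003801, 0x00100000, 0xFFFFF8B2, 0x00009401]  # 3 normal words

print(fits_ahs(one_normal, three_normal))   # False
print(fits_aahs(one_normal, three_normal))  # True
```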
Optimizing Address TagS - OATS
• Additional tag space and tag comparisons are required for the AWN and AHS techniques.
• Intuitively, the higher order bits of the address tags in a set are expected to be the same.
• Instead of providing the entire set of bits used for the address tag, only a small number of additional tag bits are provided for each physical cache block.
Optimizing Address TagS - OATS
• A physical cache block with a 22-bit address tag in the conventional cache may be provided with 24 tag bits, partitioned into 3 parts:
  • 20 higher order bits, common to both blocks.
  • 2 lower order bits for each of the two blocks, stored separately.
• A physical cache block can hold
  • 1 normal cache block, or
  • 2 narrow cache blocks that have the same 20 higher order tag bits.
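The tag partitioning above can be sketched as follows, using the 22-bit tag and the 20 + 2 + 2 split from the slide. The function names and the lookup structure are illustrative assumptions rather than the actual comparator design.

```python
TAG_BITS = 22
LOW_BITS = 2
HIGH_BITS = TAG_BITS - LOW_BITS  # 20 bits shared by both narrow blocks


def split_tag(tag: int):
    """Split a 22-bit tag into (20 higher order bits, 2 lower order bits)."""
    return tag >> LOW_BITS, tag & ((1 << LOW_BITS) - 1)


def can_share(tag_a: int, tag_b: int) -> bool:
    """Two narrow blocks can share a physical block only if their high tag bits match."""
    return split_tag(tag_a)[0] == split_tag(tag_b)[0]


def lookup(entry_high: int, entry_lows, tag: int):
    """Return the index of the matching narrow block in the entry, or None on a miss."""
    high, low = split_tag(tag)
    if high != entry_high or low not in entry_lows:
        return None
    return entry_lows.index(low)


stored_bits = HIGH_BITS + 2 * LOW_BITS  # 24 bits, versus 2 * 22 = 44 for two full tags
print(stored_bits)                      # 24
print(can_share(0x2ABCD1, 0x2ABCD3))    # True
print(can_share(0x2ABCD1, 0x2ABCD5))    # False
```

The saving comes at the cost of flexibility: two narrow blocks whose tags differ in the upper 20 bits cannot be placed in the same physical cache block.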
Increase in cache capacity
CONCLUSION
• We proposed restrictive compression techniques that do not require updating the byte-offset and hence do not impact the cache access latency.
• Our basic technique, AWN, compresses a block only if all the words in the block are narrow, and results in a 20% increase in cache capacity.
• We extended the AWN technique by providing some additional space for upper half-words (AHS, AAHS), which resulted in about a 50% increase in cache capacity while incurring a 38% increase in storage space.
• To reduce the additional tag requirement (which is inevitable with compression), we proposed the OATS technique, which reduces the overhead of AHS to about 30%.
Questions/Comments?
Prateek Pujara - prateek@binghamton.edu
Aneesh Aggarwal - aneesh@binghamton.edu
Electrical and Computer Engineering Department
State University of New York at Binghamton