This presentation introduces a component-based definition of spatial locality. It starts from the traditional view (when a data element is accessed, nearby elements will also be accessed, with layout quality judged by the overall miss rate) and asks whether spatial locality can be decomposed into finer components. Using reuse distance and reuse signatures measured on SPEC benchmarks, it classifies spatial reuse as effective or ineffective and groups reuses into components. It closes with possible uses in user tuning, superpage management, and data-based cache hints.
Traditional Definition of Spatial locality • When a data element is accessed, the nearby data elements will also be accessed • Overall miss rate • The fewer the misses, the better the layout
Questions • Can the overall spatial locality be decomposed into finer components? • How much can the locality of a given data layout be improved? • Can a data layout be improved if the miss rate cannot be lowered?
A Component-based Definition of Spatial Locality • Based on the reuse distance • Based on components
Reuse Distance • The reuse distance of a memory access is the number of distinct data elements accessed between this and the previous access to the same data.
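As an illustration of the definition above, here is a minimal Python sketch that computes the reuse distance of every access in a short trace. The function name `reuse_distances` and the use of `None` for a first access are assumptions made for this example. It is a naive quadratic scan, not the analyzer used in the experiments.

```python
def reuse_distances(trace):
    """Return the reuse distance of each access; None marks a first access."""
    last_pos = {}    # element -> index of its most recent access
    distances = []
    for i, elem in enumerate(trace):
        if elem in last_pos:
            # Distinct elements strictly between the previous access and this one.
            between = set(trace[last_pos[elem] + 1 : i])
            distances.append(len(between))
        else:
            distances.append(None)
        last_pos[elem] = i
    return distances


if __name__ == "__main__":
    # Trace: a b c a a b
    print(reuse_distances(list("abcaab")))
```

In the trace a b c a a b, the second access to a has distance 2 (b and c lie in between), the third access to a has distance 0, and the second access to b has distance 2 (c and a lie in between).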
Reuse Signature • The distribution of all reuse distances • In our experiment, we use log sized bins
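A reuse signature with log-sized bins can be sketched as a histogram over reuse distances. The slides do not state the logarithm base or the bin boundaries, so this example assumes base-2 bins: bin 0 holds distance 0, bin 1 holds distance 1, bin 2 holds distances 2 to 3, bin 3 holds distances 4 to 7, and so on. The name `reuse_signature` is hypothetical.

```python
from collections import Counter


def reuse_signature(distances):
    """Fraction of reuses per log2-sized bin; first accesses (None) are skipped."""
    finite = [d for d in distances if d is not None]
    if not finite:
        return {}
    # int.bit_length() maps 0 -> 0, 1 -> 1, 2..3 -> 2, 4..7 -> 3, ...
    counts = Counter(d.bit_length() for d in finite)
    return {b: counts[b] / len(finite) for b in sorted(counts)}


if __name__ == "__main__":
    print(reuse_signature([None, 0, 1, 2, 3, 4]))
```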
A Component-based Definition of Spatial Locality • Spatial locality measures the change in reuse distance when the data block size changes from b1 to twice the size, b2 = 2*b1.
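To make the effect of doubling the block size concrete, the sketch below measures reuse distance at block granularity for two block sizes. Mapping an address to a block with integer division, the function name `block_reuse_distances`, and the sample addresses are all assumptions for illustration.

```python
def block_reuse_distances(addresses, block_size):
    """Reuse distance of each access, counted in distinct blocks of block_size."""
    blocks = [a // block_size for a in addresses]
    last_pos, out = {}, []
    for i, blk in enumerate(blocks):
        if blk in last_pos:
            # Distinct blocks touched strictly between the two accesses.
            out.append(len(set(blocks[last_pos[blk] + 1 : i])))
        else:
            out.append(None)    # first access to this block
        last_pos[blk] = i
    return out


if __name__ == "__main__":
    addrs = [0, 8, 16, 24, 0]
    b1 = 8
    print(block_reuse_distances(addrs, b1))        # block size b1
    print(block_reuse_distances(addrs, 2 * b1))    # block size b2 = 2*b1
```

With b1 = 8, the final access to address 0 has reuse distance 3. With b2 = 16, its distance drops to 1, and the accesses to addresses 8 and 24 become reuses with distance 0 because each now shares a block with its predecessor.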
Cases of Change • [Figure: the trace x ... x under the data layout with block size b1, and two cases under the data layout with block size b2. Case 1: x ... x. Case 2: x y x ...]
Data Layout Quality • Effective spatial reuse: a reuse whose bin moves left by at least C bins when the block size is doubled • Ineffective spatial reuse: all other reuses • In our experiment, we set C to 3 • The data layout quality of a bin: 2 * effective spatial reuse / total spatial reuse
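The quality formula can be sketched as follows. The input format is an assumption: each reuse in a bin is given as a pair (bin at block size b1, bin at block size b2), and every reuse in the list counts toward the total. The name `layout_quality` is hypothetical; only the threshold C = 3 and the formula itself come from the slide.

```python
C = 3  # bin-shift threshold used in the slides' experiment


def layout_quality(bin_pairs, c=C):
    """2 * effective / total for one bin.

    bin_pairs: list of (bin_at_b1, bin_at_b2), one pair per reuse (assumed format).
    A reuse is effective if its bin moves left by at least c bins.
    """
    if not bin_pairs:
        return 0.0
    effective = sum(1 for bin_b1, bin_b2 in bin_pairs if bin_b1 - bin_b2 >= c)
    return 2 * effective / len(bin_pairs)


if __name__ == "__main__":
    # Four reuses that start in bin 10; two of them move left by 3 or more bins.
    print(layout_quality([(10, 7), (10, 10), (10, 9), (10, 5)]))
```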
Component • Data reuses of nearby bins that have a similar proportion of effective spatial reuse • The spatial locality of a component is the weighted average of the spatial locality of each bin • We manually examine the bins and group them into components
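The weighted average for a component can be sketched as below. The slides do not say what the weights are; this example assumes each bin is weighted by its number of reuses. The name `component_locality` and the sample values are hypothetical.

```python
def component_locality(bins):
    """Weighted average of per-bin spatial locality.

    bins: list of (weight, locality) pairs, one per bin in the component.
    The weight is assumed here to be the number of reuses in the bin.
    """
    total_weight = sum(weight for weight, _ in bins)
    if total_weight == 0:
        return 0.0
    return sum(weight * locality for weight, locality in bins) / total_weight


if __name__ == "__main__":
    # Two adjacent bins grouped into one component.
    print(component_locality([(100, 0.5), (300, 1.0)]))
```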
Experimental Setting • 7 SPEC2000 benchmarks (equake, art, swim, gzip, mcf, crafty, twolf) and 1 SPEC2006 benchmark (milc) • Data size varies from 1.2MB to 72MB • Trace length varies from 7.7 billion to 400 billion • Valgrind [NethercoteSeward’07] to collect traces • Augmented reuse distance analyzer [DingZhong’03] • Several hundred times slower
Why Optimized Swim Is Better • L1 miss rate reduced by 6.7% • Performance increased by 8.1% • 4% of references: 0.16 -> 0.99
[Figure labels: Good spatial locality • Good temporal locality • Poor spatial locality]
Possible Uses • User tuning [Levinthal] • Superpage management • Data-based cache hints [Fang’05, BeylsD’Hollander’05]
Related Work • Spatial locality [BuntMurphy’84, Weinberg+’05] • Component-based analysis [DingZhong’03, Shen+’03] • Spatial uses within a loop [WolfLam’91, McKinley+’96]
Summary • A quantitative model of spatial locality • We have tested our new model on 8 SPEC benchmark programs • Among 18 components, 2 have good temporal locality, 6 have good spatial locality, 4 have poor spatial locality/poor temporal locality/large • Possible uses