This presentation discusses a methodology for minimizing clock skew in synchronous circuits under time-variant temperature gradients. By building a parameterized macromodel and applying a re-embedding algorithm called PECO, the work addresses temperature-induced skew that earlier methods, which consider only spatial temperature variation, overlook. It shows how exploiting correlations in temperature variation makes clock tree re-embedding tractable, with experimental results showing up to a fivefold reduction in worst-case skew compared with the ZST/DME algorithm.
Minimal Skew Clock Embedding Considering Time-Variant Temperature Gradient Hao Yu, Yu Hu, Chun-Chen Liu and Lei He EE Department, UCLA Presented by Yu Hu Partially supported by NSF and UC MICRO funds.
Outline • Backgrounds and Motivations • Modeling and Problem Formulation • Algorithms • Experimental Results • Conclusions
Clock Tree Synthesis in Synchronous Circuits • Clock signals synchronize data transfer between functional elements in a synchronous design • Different clock structures exist [Tree, Mesh, Hybrid, etc.] • Clock skew is the delay difference between two sinks of the clock tree • Clock skew has become one of the most significant concerns in clock tree synthesis for high-performance designs [Figure: block diagram; labels: PLL, Disp, AUDIO, MEM-ctrl, VIDEO, Sys; source: Intel]
Methodologies for Clock Skew Minimization • The sources of skew • Unbalanced clock distribution • Process, supply voltage and temperature (PVT) variation • Uncertainty from loading • Methodologies • Active de-skew circuit using micro-controller [Rusu’00] • Passive balanced embedding by CAD algorithms [Tsay'91] [Edahiro'91] [Chao'92] [Boese'92] [Cong’98] • Variation-induced skew needs to be considered! [Figure: clock tree over nodes s0 to s4, a, b and v, shown for the two steps Topo-Gen and Embedding]
Existing Work and Our Contributions • This work focuses on reducing temperature-variation-induced skew • Existing work on temperature-aware clock skew minimization [Cho:ICCAD’05] • Considered only spatial temperature variations • Ignored time-variant temperature variation • Assumed the worst-case temperature map was given • The major contributions of this work • Build a parameterized macromodel for temperature variations • Present an effective algorithm, PECO, which considers time-variant temperature variation with correlation • PECO reduces worst-case skew by up to 5x compared with the ZST/DME algorithm
Outline • Backgrounds and Motivations • On-chip Temperature Variation Modeling • Variation Sources: Spatial & Temporal • Temperature Correlations • Algorithms • Experimental Results • Conclusions
Spatial Temperature Variation Induced Skew • Spatial variation: non-uniform power density generates an on-chip temperature gradient • Clock tree embedding considering the spatial temperature variation: TACO [Cho:ICCAD’05] • TACO ignores the time-variant temperature under different workloads
Temporal Temperature Variation Induced Skew • Significantly different temperature maps from two SPEC2000 applications: Ammp and Gzip • Dilemma: optimizing skew for one application hurts the other.
Problem Formulation • Given: • The source, sinks and an initial embedding of the clock tree • Each region modeled by the mean and variance of its temperature, and the correlation between variations • To find: • A re-embedding of the clock tree • To minimize: • The worst-case skew under all temperature variations
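The objective above can be sketched as a min-max selection. This is a minimal illustration, not the PECO algorithm itself: the candidate names, sink delays and temperature maps below are hypothetical values chosen only to show how the worst-case skew is evaluated across maps.

```python
def skew(sink_delays):
    """Skew of one clock tree under one temperature map: max minus min sink delay."""
    return max(sink_delays) - min(sink_delays)


def worst_case_skew(delays_per_map):
    """Worst skew over all temperature maps (one list of sink delays per map)."""
    return max(skew(d) for d in delays_per_map)


def best_embedding(candidates):
    """Pick the candidate embedding whose worst-case skew is smallest."""
    return min(candidates, key=lambda name: worst_case_skew(candidates[name]))


# Hypothetical sink delays in ps: two embeddings, each under two temperature maps.
candidates = {
    "initial":     [[100.0, 104.0, 101.0], [100.0, 97.0, 103.0]],
    "re-embedded": [[101.0, 102.0, 101.5], [100.5, 99.5, 101.5]],
}

chosen = best_embedding(candidates)
print(chosen, worst_case_skew(candidates[chosen]))
```

The point of the min-max form is that an embedding tuned for a single map (for example, one workload) can still lose to one that is slightly worse on that map but better on the others.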
Correlations in Temperature Variation • Spatial and temporal correlation: strong correlations exist between temperatures for different workloads and different regions on chip • Resource sharing between workloads causes temporal correlation • (i,j): correlation between areas i and j • Considering temperature correlations during optimization can compress the search space!
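One way to obtain such a correlation between two areas is the Pearson correlation of their temperatures over a set of sampled profiles. The sketch below assumes that choice; the slides do not say which estimator was used, and the temperature values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long series (neither may be constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def correlation_matrix(temps_by_region):
    """Entry (i, j) is the correlation between the temperatures of regions i and j."""
    n = len(temps_by_region)
    return [[pearson(temps_by_region[i], temps_by_region[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical temperatures (deg C) of three regions over four sampled profiles.
temps = [
    [60.0, 70.0, 80.0, 90.0],
    [61.0, 71.0, 81.0, 91.0],  # tracks region 0
    [90.0, 80.0, 70.0, 60.0],  # moves opposite to region 0
]
corr = correlation_matrix(temps)
```

Regions 0 and 1 come out almost perfectly correlated, so treating them identically during optimization loses little, which is the intuition behind compressing the search space.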
Outline • Backgrounds and Motivations • Modeling and Problem Formulation • Re-embedding Algorithm • Experimental Results • Conclusions
Re-embedding Process (An example) [Figure: clock tree with nodes v, x, y, a, b, c, d; legend: perturbation option, sink, original merging point]
Re-embedding Process (An example) [Figure: the same tree, nodes v, x, y, a, b, c, d; legend: new merging point]
Delay, Skew Calculation for Clock Tree • The clock tree is a single-input multiple-output (SIMO) linear system • We care about the impulse response at each sink • Perturbed Modified Nodal Analysis (MNA) • x holds the state variables for the source, sinks and merging points • L selects the sink responses • Define a new state variable containing both the nominal (x) and perturbed (Δx) state variables • This gives a structured and parameterized state matrix • The number of perturbation configurations, I = 5^N, is huge! (N is the number of merging points)
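The slide's equations are not preserved in this transcript. As a hedged sketch, a standard frequency-domain MNA system with a first-order perturbation, written with the symbols named above (x, Δx, L), could look as follows. The matrices G (conductance), C (capacitance), B (input selector) and the perturbations ΔG, ΔC are assumed notation, not taken from the slides; C here is unrelated to the correlation matrix C used on later slides.

```latex
% Nominal system (assumed notation): input u, sink outputs y
(G + sC)\,x(s) = B\,u(s), \qquad y(s) = L^{T} x(s)

% Perturb G and C, expand (G + \Delta G + s(C + \Delta C))(x + \Delta x) = B u,
% subtract the nominal equation and drop the second-order term
(G + sC)\,\Delta x(s) = -(\Delta G + s\,\Delta C)\,x(s)

% Stacking nominal and perturbed states gives a block lower-triangular system
\begin{bmatrix} G + sC & 0 \\ \Delta G + s\,\Delta C & G + sC \end{bmatrix}
\begin{bmatrix} x \\ \Delta x \end{bmatrix}
=
\begin{bmatrix} B \\ 0 \end{bmatrix} u
```

In this form the perturbed sink responses are read out as L^T (x + Δx), and each perturbation configuration changes only the off-diagonal block.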
Compressing State Matrix by Temperature Correlation • Motivations • Spatial and temporal correlation of the temperature values removes the need to exhaustively evaluate all perturbation combinations • Highly correlated merging points should be perturbed in the same fashion • Solution • Cluster merging points based on correlation strength • Perform the same perturbation for all points within one cluster
Merging Points Clustering by Temperature Correlation • Objective • Given the correlation matrix C of the N merging points, which is a low-rank matrix of rank K with N >> K • Partition the N merging points into K clusters • Maximize the correlation strength within each of the K clusters
Merging Points Clustering by Temperature Correlation • Objective • Given the correlation matrix C of the N merging points, which is a low-rank matrix of rank K with N >> K • Partition the N merging points into K clusters • Decide the number of clusters K • Singular Value Decomposition (SVD) reveals the real rank (K) of C (low-rank approximation) • Partition the merging points into K clusters • The K-Means clustering algorithm is employed • Example: K = 4, N = 70 • Perturbation configurations reduced from 5^70 to 5^4
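The clustering step can be sketched with a plain Lloyd's k-means over the rows of the correlation matrix, so that merging points with similar correlation patterns land in the same cluster. This is an illustration under assumptions: K is taken as already chosen (the SVD-based rank estimate is not implemented here), clustering on matrix rows and the farthest-point initialization are choices made for this sketch, and the 4x4 matrix is hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_rows(rows, k, iters=20):
    """Lloyd's k-means on matrix rows; returns one cluster label per row."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(dist2(r, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest center for every row.
        labels = [min(range(k), key=lambda c: dist2(r, centers[c])) for r in rows]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical correlation matrix of four merging points forming two groups.
corr = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
labels = kmeans_rows(corr, k=2)

# With 5 perturbation options per point, clustering shrinks the search space
# from 5**N configurations to 5**K (the slide's example: N = 70, K = 4).
configs_before, configs_after = 5 ** 70, 5 ** 4
```

Since all points in a cluster receive the same perturbation, only one choice per cluster remains, which is where the 5^K count comes from.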
Structural Reduction & Transient Time Analysis • Cluster-based reduction (SVD + K-Means) • Structural reduction [Hao Yu, DAC’06] • Transient time analysis (backward Euler)
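To show what the backward Euler step does, the sketch below applies it to the simplest possible case, a single RC stage driven by a unit step, and measures the 50% delay. The real flow integrates the reduced matrix system; the scalar circuit, the function name and the component values here are assumptions made for illustration.

```python
import math


def step_delay_backward_euler(r_ohm, c_farad, h, threshold=0.5, max_steps=1_000_000):
    """Time for a single RC stage to reach `threshold` of a unit step input.

    ODE:            C dv/dt = (u - v) / R, with u = 1 for t > 0
    Backward Euler: v[n+1] = (v[n] + a * u) / (1 + a), where a = h / (R * C)
    """
    a = h / (r_ohm * c_farad)
    v = 0.0
    for n in range(1, max_steps + 1):
        v = (v + a * 1.0) / (1.0 + a)
        if v >= threshold:
            return n * h
    raise RuntimeError("threshold not reached within max_steps")


# Hypothetical values: R = 1 kOhm, C = 1 pF (RC = 1 ns), time step h = 1 ps.
delay = step_delay_backward_euler(1000.0, 1e-12, 1e-12)
analytic = 1000.0 * 1e-12 * math.log(2.0)  # exact 50% delay of one RC stage
```

Backward Euler is implicit and unconditionally stable for this kind of system, which is why it is a common choice for transient analysis of stiff RC networks; the price is a linear solve per step in the matrix case.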
Outline • Backgrounds and Motivations • Modeling and Problem Formulation • Algorithms • Experimental Results • Conclusions
Experimental Settings • Temperature variation profiles are obtained by a micro-architecture level power-temperature transient simulator [Liao,TCAD’05] with 6 SPEC2000 applications • 100 temperature profiles are collected at intervals of 10 million clock cycles • Compare two algorithms: • DME method: minimizes wire-length for zero skew under the Elmore delay model at nominal temperature • Our PECO: minimizes skew under a more accurate high-order macromodel with temperature variations
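The limitation of the DME baseline can be illustrated with a small Elmore delay calculation: a tree balanced to zero skew at nominal temperature picks up skew once one wire heats up. This is a sketch under assumptions: the tree, the resistance and capacitance values, and the 10% resistance increase are hypothetical, and the only physical fact relied on is that metal wire resistance rises with temperature.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay from the root to every node of an RC tree.

    parent[n]: parent of node n (None for the root)
    res[n]:    resistance of the edge from parent[n] to n
    cap[n]:    capacitance lumped at node n
    """
    children = {n: [] for n in parent}
    for n, p in parent.items():
        if p is not None:
            children[p].append(n)

    def downstream_cap(n):
        return cap[n] + sum(downstream_cap(c) for c in children[n])

    delays = {}

    def visit(n, d):
        delays[n] = d
        for c in children[n]:
            # Each edge contributes R_edge times all capacitance below it.
            visit(c, d + res[c] * downstream_cap(c))

    root = next(n for n, p in parent.items() if p is None)
    visit(root, 0.0)
    return delays


# Hypothetical tree v -> x -> {a, b}; units kOhm and fF, so delays are in ps.
parent = {"v": None, "x": "v", "a": "x", "b": "x"}
cap = {"v": 0.0, "x": 0.0, "a": 10.0, "b": 10.0}
res_nominal = {"x": 0.05, "a": 0.10, "b": 0.10}
res_heated = dict(res_nominal, b=0.11)  # wire to b runs hotter: +10% resistance

d_nom = elmore_delays(parent, res_nominal, cap)
d_hot = elmore_delays(parent, res_heated, cap)
skew_nom = abs(d_nom["a"] - d_nom["b"])
skew_hot = abs(d_hot["a"] - d_hot["b"])
```

At nominal temperature the two sinks see identical delays; with the heated wire the same embedding has nonzero skew, and the amount depends on which temperature map is active.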
Skew Distribution • Under 100 temperature maps, PECO reduces both the worst-case skew and the mean skew
Experimental Results (cont.) • PECO reduces the worst-case skew by up to 5X (for net r5) • Skew is measured with a higher-order delay model considering temperature variations for all applications • Skew reduction increases for larger clock nets • PECO increases wire-length by less than 1% • Runtime • Optimization time of PECO is less than that of DME • Model building time is still long, but the model is more accurate
Outline • Backgrounds and Motivations • Modeling and Problem Formulation • Algorithms • Experimental Results • Conclusions
Conclusions • Studied clock optimization under workload-dependent temperature variation • Reduced the worst-case skew by up to 5X with only 1% wire-length overhead compared to the best existing method • The methodology can be extended to handle • PVT variations with spatial correlations • Other design freedoms such as floorplanning, power/ground optimization, etc.
Thank you! Minimal Skew Clock Embedding Considering Time-Variant Temperature Gradient • Hao Yu (graduated), Yu Hu, Chun-Chen Liu and Lei He • ACM International Symposium on Physical Design 2007