Lecture 2: Intro to Computer Architecture

Lecture 2: Intro to Computer Architecture. Michael B. Greenwald Computer Architecture CIS 501 Fall 1999. General Information. Class: TR 1:30-3, in LRSM Auditorium Recitation: T 10:30-12 in Moore 225


Presentation Transcript


  1. Lecture 2: Intro to Computer Architecture Michael B. Greenwald Computer Architecture CIS 501 Fall 1999

  2. General Information
     • Class: TR 1:30-3, in LRSM Auditorium
       Recitation: T 10:30-12 in Moore 225
     • Instructor: Professor Michael Greenwald
       Office: Moore (GRW), room 260
       email: cis501@cis.upenn.edu
       Office hours: R 10:30-12 noon or by appt.
     • TA: Sotiris Ioannidis
       Office: Moore, room 102e
       email: sotiris@dsl.cis.upenn.edu
       Office hours: TR 5-6PM or by appt.
     • Secretary: Christine Metz
       Office: Moore, room 556

  3. Outline • Review • Quantitative principles of computer design • Amdahl’s law • CPU performance equation • Quantitative measurements • Costs • Performance

  4. Typos in HW 3c. • New version on web page. • D = defects/ • Defects per layer

  5. Technology Trends: Microprocessor Capacity
     [Figure: microprocessor transistor counts plotted against Moore’s Law, with a “Graduation Window” marked. Labeled points: Alpha 21264: 15 million; Pentium Pro: 5.5 million; PowerPC 620: 6.9 million; Alpha 21164: 9.3 million; Sparc Ultra: 5.2 million]
     • CMOS improvements:
       • Die size: 2X every 3 yrs
       • Line width: halve / 7 yrs

  6. Trends in application demands • Programs increase memory demands by a factor of 1.5-2 per year (1/2 to 1 address bit per year) • Available disk space (or network bandwidth) is always consumed. • User I/O bandwidth grows: tty -> crt -> bitmap -> video -> ?virtual reality? • Processing power: it is cheapest to produce one version of a program, so it is optimized for the mid-range: slow on low-end, fast on high-end. Are these demands growing because of increased capabilities or increased appetites?

  7. The Quantitative Approach

  8. Measurement and Evaluation: Quantitative Approach • Architecture is an iterative process: • Searching the space of possible designs • At all levels of computer systems [Figure: loop between Creativity and Cost / Performance Analysis, sorting ideas into Good Ideas, Mediocre Ideas, Bad Ideas]

  9. Measurement and Evaluation: Quantitative Approach • Not a guarantee of good ideas, just a way to discard bad ideas. [Figure: loop between Creativity and Cost / Performance Analysis, sorting ideas into Good Ideas, Mediocre Ideas, Bad Ideas]

  10. Computer Engineering Methodology Technology Trends

  11. Computer Engineering Methodology [Figure: design cycle, partially revealed. Stage: Evaluate Existing Systems for Bottlenecks. Inputs: Benchmarks, Technology Trends]

  12. Computer Engineering Methodology [Figure: design cycle, partially revealed. Stages: Evaluate Existing Systems for Bottlenecks; Simulate New Designs and Organizations. Inputs: Benchmarks, Technology Trends, Workloads]

  13. Computer Engineering Methodology [Figure: design cycle. Stages: Evaluate Existing Systems for Bottlenecks; Simulate New Designs and Organizations; Implement Next Generation System. Inputs: Benchmarks, Technology Trends, Workloads, Implementation Complexity]

  14. Measurement Tools • Benchmarks, Traces, Mixes • Hardware: Cost, delay, area, power estimation • Simulation (many levels): ISA, RT, Gate, Circuit • Queuing Theory • Rules of Thumb • Fundamental “Laws”/Principles [Figure labels: Measure, Experiment, Analyze, Design]

  15. Measurement Tools • Benchmarks, Traces, Mixes • Hardware: Cost, delay, area, power estimation • Simulation (many levels): ISA, RT, Gate, Circuit • Queuing Theory • Rules of Thumb • Fundamental “Laws”/Principles [Figure labels: Measure, Experiment, Analyze, Design] All produce “measures”: what do measures mean? How do they compare?

  16. The Bottom Line: Performance (and Cost)
      Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
      Boeing 747         6.5 hours     610 mph    470          286,700
      BAC/Sud Concorde   3 hours       1350 mph   132          178,200
      • Time to run the task (ExTime)
        • Execution time, response time, latency
      • Tasks per day, hour, week, sec, ns … (Performance)
        • Throughput, bandwidth

  17. The Bottom Line: Performance (and Cost)
      Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
      Boeing 747         6.5 hours     610 mph    470          286,700
      BAC/Sud Concorde   3 hours       1350 mph   132          178,200
      • Which is better?

  18. The Bottom Line: Performance (and Cost)
      Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
      Boeing 747         6.5 hours     610 mph    470          286,700
      BAC/Sud Concorde   3 hours       1350 mph   132          178,200
      • Which is better? It depends on whether you are trying to win a race from DC to Paris or trying to move the most people.

  19. The Bottom Line: Performance (and Cost)
      Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
      Boeing 747         6.5 hours     610 mph    470          286,700
      BAC/Sud Concorde   3 hours       1350 mph   132          178,200
      • Even if you are trying to move the most people, performance is useless without understanding cost. Otherwise, why not just fly two Concordes at once, doubling throughput?
      • 747-400: $160M in ‘98

  20. Costs • Performance metrics are mostly useless without understanding costs.

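  21. Integrated Circuits Costs
      $$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$$
      $$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
      [Figure labels: Wafer, Defect, Die]
      Smaller dies are cheaper, and reduce cost per defect.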

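  22. Integrated Circuits Costs
      $$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$$
      $$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
      [Figure label: Defect]
      Smaller dies are cheaper, and reduce cost per defect.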

  23. IC Cost parameters
      • $\alpha$ = number of masking levels (a measure of manufacturing complexity); was typically 3.0, growing
      • Wafer yield = wafers that are not completely bad; typically close to 100%
      • Defects per unit area = 0.6 to 1.2 per cm²; drops with the learning curve
      Die cost goes roughly with $(\text{die area})^4$

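  24. Integrated Circuits Costs
      $$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$$
      $$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
      $$\text{Dies per wafer} = \frac{\pi \times (\text{Wafer\_diam}/2)^2}{\text{Die\_area}} - \frac{\pi \times \text{Wafer\_diam}}{\sqrt{2 \times \text{Die\_area}}} - \text{Test dies}$$
      $$\text{Die yield} = \text{Wafer yield} \times \left\{ 1 + \frac{\text{Defects\_per\_unit\_area} \times \text{Die\_area}}{\alpha} \right\}^{-\alpha}$$
      Die cost goes roughly with $(\text{die area})^4$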

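  25. Integrated Circuits Costs
      $$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
      $$\text{Dies per wafer} = \frac{\pi \times (\text{Wafer\_diam}/2)^2}{\text{Die\_area}} - \frac{\pi \times \text{Wafer\_diam}}{\sqrt{2 \times \text{Die\_area}}} - \text{Test dies}$$
      $$\text{Die yield} = \text{Wafer yield} \times \left\{ 1 + \frac{\text{Defects\_per\_unit\_area} \times \text{Die\_area}}{\alpha} \right\}^{-\alpha}$$
      Substituting, with wafer yield taken as 1 and test dies ignored:
      $$\text{Die cost} = \text{Wafer cost} \times \frac{\left\{ 1 + \frac{\text{Defects\_per\_unit\_area} \times \text{Die\_area}}{\alpha} \right\}^{\alpha}}{\frac{\pi \times (\text{Wafer\_diam}/2)^2}{\text{Die\_area}} - \frac{\pi \times \text{Wafer\_diam}}{\sqrt{2 \times \text{Die\_area}}}}$$
      Die cost goes roughly with $(\text{die area})^4$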
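The die cost model on these slides can be sketched in a few lines of Python. This is only an illustration: the function and parameter names are made up for this sketch, and the 20 cm wafer, $1500 wafer cost, and 1.0 defects/cm² in the usage lines are hypothetical inputs rather than values from the lecture.

```python
import math


def dies_per_wafer(wafer_diam_cm, die_area_cm2, test_dies=0):
    """Wafer area over die area, minus dies lost at the round edge and test dies."""
    wafer_area = math.pi * (wafer_diam_cm / 2) ** 2
    edge_loss = math.pi * wafer_diam_cm / math.sqrt(2 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss - test_dies


def die_yield(defects_per_cm2, die_area_cm2, alpha=3.0, wafer_yield=1.0):
    """Fraction of dies that work; alpha is the masking-level parameter."""
    return wafer_yield * (1 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def die_cost(wafer_cost, wafer_diam_cm, die_area_cm2, defects_per_cm2,
             alpha=3.0, wafer_yield=1.0, test_dies=0):
    """Wafer cost spread over the good dies on the wafer."""
    good_dies = (dies_per_wafer(wafer_diam_cm, die_area_cm2, test_dies)
                 * die_yield(defects_per_cm2, die_area_cm2, alpha, wafer_yield))
    return wafer_cost / good_dies


# Hypothetical inputs: 20 cm wafer, $1500 per wafer, 1.0 defects per cm^2.
for area in (0.5, 1.0, 2.0):
    cost = die_cost(1500, 20.0, area, 1.0)
    print(f"die area {area:.1f} cm^2 -> die cost ${cost:.2f}")
```

With these inputs, doubling the die area more than doubles the die cost, because fewer dies fit on the wafer and a smaller fraction of them work.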

  26. IC Cost parameters
      Defects per unit area = 0.6 to 1.2 per cm²
      Technologies that can fix defects (e.g. lasers, à la Lincoln Labs (MIT)) reduce effective defects per unit area and increase yield. However, their costs differ from the formula and need to be understood separately.
      Still: die cost goes roughly with $(\text{die area})^{\alpha+1}$

  27. Real World Examples (circa ‘93)
      Chip          Metal    Line width   Wafer   Defects   Area    Dies/   Yield   Die
                    layers   (µm)         cost    /cm²      (mm²)   wafer           cost
      386DX         2        0.90         $900    1.0       43      360     71%     $4
      486DX2        3        0.80         $1200   1.0       81      181     54%     $12
      PowerPC 601   4        0.80         $1700   1.3       121     115     28%     $53
      HP PA 7100    3        0.80         $1300   1.0       196     66      27%     $73
      DEC Alpha     3        0.70         $1500   1.2       234     53      19%     $149
      SuperSPARC    3        0.70         $1700   1.6       256     48      13%     $272
      Pentium       3        0.80         $1500   1.5       296     40      9%      $417
      • From "Estimating IC Manufacturing Costs," by Linley Gwennap, Microprocessor Report, August 2, 1993, p. 15
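As a quick check, the die cost column in this table can be recomputed from the wafer cost, dies per wafer, and yield columns using Die cost = Wafer cost / (Dies per wafer × Die yield). The sketch below copies the table rows by hand; the variable and function names are illustrative only.

```python
# (chip, wafer cost in $, dies per wafer, die yield, listed die cost in $),
# copied from the 1993 table above.
ROWS = [
    ("386DX", 900, 360, 0.71, 4),
    ("486DX2", 1200, 181, 0.54, 12),
    ("PowerPC 601", 1700, 115, 0.28, 53),
    ("HP PA 7100", 1300, 66, 0.27, 73),
    ("DEC Alpha", 1500, 53, 0.19, 149),
    ("SuperSPARC", 1700, 48, 0.13, 272),
    ("Pentium", 1500, 40, 0.09, 417),
]


def die_cost(wafer_cost, dies_per_wafer, die_yield):
    """Wafer cost divided by the number of good dies."""
    return wafer_cost / (dies_per_wafer * die_yield)


for chip, wafer_cost, dies, yield_, listed in ROWS:
    computed = die_cost(wafer_cost, dies, yield_)
    print(f"{chip:12s} computed ${computed:7.2f}   listed ${listed}")
```

Each computed value rounds to the listed die cost, so the last column follows directly from the three columns before it.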

  28. Other Costs
      $$\text{Die test cost} = \frac{\text{Test jig cost} \times \text{Ave. test time}}{\text{Die yield}}$$
      Packaging cost: depends on pins, heat dissipation
      Chip          Die cost  Pins  Package type  Package cost  Test & Assembly  Total cost
      386DX         $4        132   QFP           $1            $4               $9
      486DX2        $12       168   PGA           $11           $12              $35
      PowerPC 601   $53       304   QFP           $3            $21              $77
      HP PA 7100    $73       504   PGA           $35           $16              $124
      DEC Alpha     $149      431   PGA           $30           $23              $202
      SuperSPARC    $272      293   PGA           $20           $34              $326
      Pentium       $417      273   PGA           $19           $37              $473

  29. Cost/Performance: What is Relationship of Cost to Price?
      • Component Costs
      • Direct Costs (add 25% to 40%): recurring costs: labor, purchasing, scrap, warranty
      • Gross Margin (add 82% to 186%): nonrecurring costs: R&D, marketing, sales, equipment maintenance, rental, financing cost, pretax profits, taxes
      • Average Discount to get List Price (add 33% to 66%): volume discounts and/or retailer markup
      [Figure: stacked bar showing shares of List Price: Average Discount 25% to 40%; Gross Margin 34% to 39%; Direct Cost 6% to 8%; Component Cost 15% to 33%. Everything below Average Discount makes up the Avg. Selling Price.]

  30. Cost/Performance: What is Relationship of Cost to Price?
      • Component Costs
      • Direct Costs (add 25% to 40%): recurring costs: labor, purchasing, scrap, warranty
      • Gross Margin (add 82% to 186%): nonrecurring costs: R&D, marketing, sales, equipment maintenance, rental, financing cost, pretax profits, taxes
      • Average Discount to get List Price (add 33% to 66%): volume discounts and/or retailer markup
      [Figure: stacked bar of Component Cost, Direct Cost, Gross Margin, Average Discount, with labels List Price, Avg. Selling Price, Discretion, Direct Cost]
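One way to read the "add X%" figures is as successive markups, each applied to the running total from the previous step. That reading is an assumption of this sketch, not something the slide states, and the function name and dictionary keys are invented for illustration.

```python
def price_buildup(component_cost, direct_pct, margin_pct, discount_pct):
    """Apply the three markups in sequence (assumed to compound)."""
    with_direct = component_cost * (1 + direct_pct)      # add direct costs
    avg_selling_price = with_direct * (1 + margin_pct)   # add gross margin
    list_price = avg_selling_price * (1 + discount_pct)  # add average discount
    return {
        "component": component_cost,
        "with_direct": with_direct,
        "avg_selling_price": avg_selling_price,
        "list_price": list_price,
    }


# Low end of each markup range, starting from a component cost of 33.
low = price_buildup(33, 0.25, 0.82, 0.33)
# High end of each markup range, starting from a component cost of 15.
high = price_buildup(15, 0.40, 1.86, 0.66)
print(low)
print(high)
```

Both cases land at a list price close to 100. Under the compounding assumption, a component cost of 15% to 33% of list price is therefore consistent with the markup ranges quoted in the bullets.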

  31. Chip Prices (August 1993)
      Chip          Area (mm²)  Mfg. cost  Price   Multiplier  Comment
      386DX         43          $9         $31     3.4         Intense Competition
      486DX2        81          $35        $245    7.0         No Competition
      PowerPC 601   121         $77        $280    3.6
      DEC Alpha     234         $202       $1231   6.1         Recoup R&D?
      Pentium       296         $473       $965    2.0         Early in shipments
      • Assume purchase of 10,000 units
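The manufacturing cost and multiplier columns can be reproduced from the earlier cost breakdown: manufacturing cost is die cost plus package cost plus test and assembly, and the multiplier is price divided by manufacturing cost. The sketch below copies the numbers from the two tables; the names are illustrative.

```python
# (chip, die cost, package cost, test & assembly cost, price), all in dollars,
# copied from the "Other Costs" and "Chip Prices" tables.
CHIPS = [
    ("386DX", 4, 1, 4, 31),
    ("486DX2", 12, 11, 12, 245),
    ("PowerPC 601", 53, 3, 21, 280),
    ("DEC Alpha", 149, 30, 23, 1231),
    ("Pentium", 417, 19, 37, 965),
]


def mfg_cost(die, package, test_assembly):
    """Total manufacturing cost of one packaged, tested chip."""
    return die + package + test_assembly


def multiplier(price, cost):
    """How many times the manufacturing cost the chip sells for."""
    return price / cost


for chip, die, package, test_assembly, price in CHIPS:
    cost = mfg_cost(die, package, test_assembly)
    print(f"{chip:12s} cost ${cost:4d}  price ${price:5d}  "
          f"multiplier {multiplier(price, cost):.1f}")
```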

  32. Summary: Price vs. Cost

  33. Cost/Price/Profit: How is R&D funded? • R&D 4% to 12%, contributes to gross margin (it is an indirect cost) • Two views: • Only 4% of income on R&D! • Investment: every $1 spent on R&D should lead to $8 to $25 in sales!

  34. PERFORMANCE

  35. The Bottom Line: Performance (and Cost)
      Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
      Boeing 747         6.5 hours     610 mph    470          286,700
      BAC/Sud Concorde   3 hours       1350 mph   132          178,200
      • Even if you are trying to move the most people, performance is useless without understanding cost. Otherwise, why not just fly two Concordes at once, doubling throughput?
      • 747-400: $160M in ‘98

  36. Performance Terminology • Time versus Performance: duration vs. rate. • Time: response time = execution time • Rate: throughput • Reciprocals: there is both a time and a performance measure for any performance metric. • “Improve performance”: time decreases, performance increases For computer systems the key performance metric is total execution time

  37. Meaning of “Execution Time” (a.k.a. Response time) • Wall-clock time, response time, elapsed time: latency (including idle time) • vs. CPU time: non-idle • System vs. User time: both elapsed and CPU • System performance: elapsed time on unloaded system (includes OS + idle time) • CPU performance: user CPU time on unloaded system
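The difference between elapsed time and CPU time is easy to see on a real system. The sketch below uses Python's standard `time.perf_counter` for wall-clock time and `os.times` for user and system CPU time. The workload, a short computation followed by a sleep, is a made-up example.

```python
import os
import time


def measure(task):
    """Run task once; return (elapsed, user_cpu, system_cpu) in seconds."""
    wall_start = time.perf_counter()
    cpu_start = os.times()
    task()
    cpu_end = os.times()
    elapsed = time.perf_counter() - wall_start
    user = cpu_end.user - cpu_start.user
    system = cpu_end.system - cpu_start.system
    return elapsed, user, system


def busy_then_idle():
    """Hypothetical workload: some computation, then idle waiting."""
    sum(i * i for i in range(300_000))  # consumes user CPU time
    time.sleep(0.2)                     # idle: counts toward elapsed time only


elapsed, user, system = measure(busy_then_idle)
print(f"elapsed {elapsed:.3f}s  user CPU {user:.3f}s  system CPU {system:.3f}s")
```

The elapsed time includes the 0.2 s of idle waiting, while the user and system CPU times do not. Note that `os.times` can have coarse resolution on some systems.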

  38. Terminology • What do we mean when we compare two measures and say that “X is n times faster than Y”?

  39. The Bottom Line: Performance (and Cost)
      • "X is n times faster than Y" means
        $$n = \frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)}$$
      • Speed of Boeing 747 vs. Concorde
      • Throughput of Boeing 747 vs. Concorde

  40. The Bottom Line: Performance (and Cost)
      • "X is n times faster than Y" means
        $$n = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{286{,}700}{178{,}200} \approx 1.61$$
      • Speed of Boeing 747 vs. Concorde
      • Throughput of Boeing 747 vs. Concorde

  41. The Bottom Line: Performance (and Cost)
      • "X is n times faster than Y" means
        $$n = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{286{,}700}{178{,}200} \approx 1.61$$
      • Speed of Boeing 747 vs. Concorde
      • Throughput of Boeing 747 vs. Concorde
      Note: use natural or meaningful units. Hours per passenger-mile is slightly weirder than passenger-miles per hour.
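The two comparisons on this slide can be worked through with the numbers from the plane table. The helper name `times_faster` is made up for this sketch; it simply takes the ratio of two performance values, or equivalently the inverse ratio of two execution times.

```python
def times_faster(perf_x, perf_y):
    """n such that "X is n times faster than Y", from performance (rate) values."""
    return perf_x / perf_y


# Figures from the plane table: trip time (hours), speed (mph), passengers.
B747 = {"hours": 6.5, "mph": 610, "passengers": 470}
CONCORDE = {"hours": 3.0, "mph": 1350, "passengers": 132}

# Throughput in passenger-miles per hour (pmph) = speed x passengers.
throughput_747 = B747["mph"] * B747["passengers"]
throughput_concorde = CONCORDE["mph"] * CONCORDE["passengers"]

# Latency view: ratio of execution times, ExTime(747) / ExTime(Concorde).
by_time = B747["hours"] / CONCORDE["hours"]
# Rate views: ratio of performance values.
by_speed = times_faster(CONCORDE["mph"], B747["mph"])
by_throughput = times_faster(throughput_747, throughput_concorde)

print(f"Concorde vs 747, by trip time:  {by_time:.2f}x")
print(f"Concorde vs 747, by speed:      {by_speed:.2f}x")
print(f"747 vs Concorde, by throughput: {by_throughput:.2f}x")
```

Which plane counts as "faster" flips depending on whether the metric is latency or throughput, which is the point of the slide.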

  42. Measurement Tools • Benchmarks, Traces, Mixes • Hardware: Cost, delay, area, power estimation • Simulation (many levels): ISA, RT, Gate, Circuit • Queuing Theory • Rules of Thumb • Fundamental “Laws”/Principles [Figure labels: Measure, Experiment, Analyze, Design] ENGINEERING: Convert this to that

  43. Fundamental Principle of Computer Design • Make the common case fast • In every trade-off, favor the frequent case over the infrequent case. • But how do we quantify this? At what point is the cost to the infrequent case sufficiently large as to offset speedups to the frequent case?

  44. Fundamental Principle of Computer Design • Make the common case fast • In every trade-off, favor the frequent case over the infrequent case. • But how do we quantify this? At what point is the cost to the infrequent case sufficiently large as to offset speedups to the frequent case? Amdahl’s Law quantifies this principle

  45. Amdahl's Law
      Speedup due to enhancement E:
      $$\text{Speedup}(E) = \frac{\text{ExTime w/o E}}{\text{ExTime w/ E}} = \frac{\text{Performance w/ E}}{\text{Performance w/o E}}$$
      Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.

  46. Amdahl's Law
      Speedup due to enhancement E:
      $$\text{Speedup}(E) = \frac{\text{ExTime w/o E}}{\text{ExTime w/ E}} = \frac{\text{Performance w/ E}}{\text{Performance w/o E}}$$
      Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.

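  47. Amdahl’s Law
      $$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]$$
      $$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$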

  48. Amdahl’s Law: Example
      • Floating point instructions are improved to run 2X faster, but only 10% of actual instructions are FP
      $$\text{ExTime}_{\text{new}} = \;?$$
      $$\text{Speedup}_{\text{overall}} = \;?$$

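  49. Amdahl’s Law: Example
      • Floating point instructions are improved to run 2X faster, but only 10% of actual instructions are FP
      $$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times (0.9 + 0.1/2) = 0.95 \times \text{ExTime}_{\text{old}}$$
      $$\text{Speedup}_{\text{overall}} = \frac{1}{0.95} = 1.053$$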

  50. Amdahl’s Law: Example
      • Suppose fetching a page from a web cache is 1000 times faster than getting the page over the net, but the hit rate on the cache is only 30%
      $$\text{ExTime}_{\text{new}} = \;?$$
      $$\text{Speedup}_{\text{overall}} = \;?$$
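Both Amdahl's Law examples can be worked with one small function. In this sketch the function name is made up, and the web cache case assumes that the 30% hit rate is also the fraction of the original fetch time that the cache speeds up.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when a fraction of execution time is sped up by a factor."""
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time  # ExTime_old / ExTime_new, with ExTime_old = 1


# Floating point example: 10% of the time, 2x faster.
fp = amdahl_speedup(0.10, 2.0)

# Web cache example (assumption: hit rate = fraction of time enhanced).
cache = amdahl_speedup(0.30, 1000.0)

# Upper bound for the cache case: even an infinitely fast cache
# cannot beat 1 / (1 - fraction_enhanced).
cache_limit = 1.0 / (1.0 - 0.30)

print(f"FP speedup:        {fp:.3f}")
print(f"Web cache speedup: {cache:.3f}")
print(f"Cache upper bound: {cache_limit:.3f}")
```

Under that assumption the cache yields an overall speedup of about 1.43, barely below the 1/(1 - 0.3) limit. The 1000x factor hardly matters because 70% of requests still go over the net.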
