Computer Architecture Research Overview Rajeev Balasubramonian School of Computing, University of Utah http://www.cs.uta
Presentation Transcript

  1. Computer Architecture Research Overview Rajeev Balasubramonian School of Computing, University of Utah http://www.cs.utah.edu/~rajeev

  2. What is Computer Architecture?

  3. What is Computer Architecture? • If the Intel Pentium4 has a faster clock speed than the IBM Power4, does it execute your programs faster?

  4. What is Computer Architecture? • If the Intel Pentium4 has a faster clock speed than the IBM Power4, does it execute your programs faster? • [Figure: Case 1 and Case 2 timelines marking clock ticks and completing instructions along a time axis]

  5. What is Computer Architecture? • To a large extent, computer architecture determines: • the number of instructions used to execute a program • the time each instruction takes to execute • the idle cycles when no work gets done • the number of instructions that can execute in parallel
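
The factors listed on this slide combine in the standard CPU performance equation: execution time = instruction count × cycles per instruction / clock rate. A minimal sketch follows, using entirely hypothetical numbers (they are not measurements of the Pentium4 or the Power4), of why a faster clock alone does not guarantee a faster program:

```python
def execution_time(instructions, cpi, clock_hz):
    """Seconds to run a program: instruction count x cycles per instruction / clock rate."""
    return instructions * cpi / clock_hz


# Two hypothetical processors running the same 1-billion-instruction program.
# The CPI (average cycles per instruction) values are illustrative assumptions.
fast_clock = execution_time(instructions=1_000_000_000, cpi=2.0, clock_hz=3.0e9)
slow_clock = execution_time(instructions=1_000_000_000, cpi=1.0, clock_hz=2.0e9)

print(f"3.0 GHz, CPI 2.0: {fast_clock:.3f} s")
print(f"2.0 GHz, CPI 1.0: {slow_clock:.3f} s")
```

Under these assumed values, the processor with the slower clock finishes first because it wastes fewer cycles per instruction.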

  6. A Typical Microprocessor • [Figure: block diagram with branch predictor, L1 instruction cache, decode & rename, issue logic, register file, four ALUs, L1 data cache, and L2 cache]

  7. Architecture Trends in the 90s • Performance was the ultimate metric • Transistors were a limiting factor • As on-chip transistors became available in the 90s, more functionality and complex circuitry were added to boost performance; most of the low-hanging fruit has now been picked

  8. Hitting the Wall • We have now hit the following walls: • Single core performance • Memory • Complexity • Power, temperature

  9. Hitting the Power Wall • [Figure from Shekhar Borkar, MICRO’99] • Power is as important a metric today as performance

  10. The Advent of Multi-Core Chips • [Figure: multi-core chip layout showing cores and cache banks] • In the past, performance magically increased by 50% every year • In the future, this improvement will be only ~20% every year • … unless … the application is multi-threaded!
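
The gap between 50% and ~20% annual improvement compounds quickly. A small sketch of that arithmetic, assuming the gains compound year over year (the helper name `relative_performance` is made up for this illustration):

```python
def relative_performance(annual_gain, years):
    """Speedup over the starting point after compounding annual_gain for `years` years."""
    return (1.0 + annual_gain) ** years


for years in (1, 5, 10):
    past = relative_performance(0.50, years)    # the historical 50% per year
    future = relative_performance(0.20, years)  # the projected ~20% per year
    print(f"after {years:2d} years: 50%/yr -> {past:6.1f}x, 20%/yr -> {future:5.1f}x")
```

Over a decade the two rates differ by roughly an order of magnitude, which is the motivation for looking to multi-threaded applications for the remaining gains.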

  11. Upcoming Architecture Challenges • Improving single core performance • Functionalities in multi-core chips • Simplifying the programmer’s task • Efficient interconnects • Power and temperature-efficient designs • Designs tolerant of errors • For publications, see http://www.cs.utah.edu/~rajeev/research.html

  12. Interconnects as a Bottleneck • In the past, on-chip data transmission on wires cost almost nothing • Interconnect speed and power have been improving, but not at the same rate as transistor speeds • Hence, relative to computation, communication is much more expensive • In the near future, it will take 100 cycles to travel across the chip • 50% of chip power can be attributed to interconnects
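
One way to see why cross-chip latency grows when counted in cycles: wire delay in nanoseconds stays roughly tied to distance, while the clock period keeps shrinking. A rough sketch follows; the chip size, per-millimetre wire delay, and clock rates are hypothetical values chosen only to illustrate the arithmetic, not figures from the talk:

```python
def cross_chip_cycles(distance_mm, wire_delay_ns_per_mm, clock_ghz):
    """Clock cycles a signal needs to cover distance_mm of wire."""
    delay_ns = distance_mm * wire_delay_ns_per_mm
    return delay_ns * clock_ghz  # nanoseconds x cycles per nanosecond


# Hypothetical parameters: a 20 mm path and 0.5 ns of delay per mm of wire.
slow = cross_chip_cycles(distance_mm=20, wire_delay_ns_per_mm=0.5, clock_ghz=1.0)
fast = cross_chip_cycles(distance_mm=20, wire_delay_ns_per_mm=0.5, clock_ghz=10.0)

print(f"at  1 GHz: {slow:.0f} cycles")
print(f"at 10 GHz: {fast:.0f} cycles")
```

The same wire costs ten times as many cycles when the clock is ten times faster, so communication becomes more expensive relative to computation.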

  13. Interconnects in Multi-Core Chips • [Figure: CPU 1, CPU 2, and CPU 3 with L1 caches, connected to an L2 cache and its L2 control blocks]

  14. Not all Wires are Created Equal

                                B-Wires   L-Wires   W-Wires   PW-Wires
      Relative latency          1x        0.5x      1.6x      3.2x
      Relative area             1x        4x        0.5x      0.5x
      Dynamic power (W/m)       2.65α     1.46α     2.9α      0.87α
      Static power (W/m)        1.02      0.57      1.16      0.31
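
The trade-offs in this table can be explored programmatically. The sketch below copies the table's values into a dictionary (the dynamic-power entries are the coefficients of the activity term shown in the table) and asks two simple questions: which wire type is best on a given metric, and how many wires of each type fit in the metal area of 64 B-Wires. The helper names and the 64-wire budget are assumptions made for this illustration:

```python
# Values copied from the table; "dynamic" is the coefficient of the activity term.
WIRES = {
    "B":  {"latency": 1.0, "area": 1.0, "dynamic": 2.65, "static": 1.02},
    "L":  {"latency": 0.5, "area": 4.0, "dynamic": 1.46, "static": 0.57},
    "W":  {"latency": 1.6, "area": 0.5, "dynamic": 2.90, "static": 1.16},
    "PW": {"latency": 3.2, "area": 0.5, "dynamic": 0.87, "static": 0.31},
}


def best_wire(metric):
    """Wire type with the smallest value for the given metric."""
    return min(WIRES, key=lambda name: WIRES[name][metric])


def wires_in_budget(name, b_wire_budget=64):
    """How many wires of type `name` fit in the area of `b_wire_budget` B-Wires."""
    return int(b_wire_budget / WIRES[name]["area"])


print("lowest latency:      ", best_wire("latency"))
print("lowest dynamic power:", best_wire("dynamic"))
for name in WIRES:
    print(f"{name:>2}-Wires in the area of 64 B-Wires: {wires_in_budget(name)}")
```

No single wire type wins on every metric: L-Wires are fastest but cost four times the area, while PW-Wires save power at over three times the latency.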

  15. Data Transfers have Varying Needs • Example of a cache coherence transaction: • Read exclusive request for a shared block
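
To make the example concrete, here is a sketch of the messages such a transaction might generate. It assumes a directory-based protocol in which the directory invalidates the other sharers, the sharers acknowledge directly to the requester, and the directory supplies the data; the slide does not specify these details, and all names below are hypothetical. The point is that one transaction mixes many short control messages with a single wide data message:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    src: str
    dst: str
    kind: str
    carries_data: bool  # True only for messages that carry a full cache block


def read_exclusive(requester, directory, sharers):
    """Messages for a read-exclusive request to a block held in shared state.

    The list order is for readability; it does not model timing.
    """
    msgs = [Message(requester, directory, "ReadExclusive", False)]
    for sharer in sharers:
        if sharer == requester:
            continue  # the requester does not invalidate itself
        msgs.append(Message(directory, sharer, "Invalidate", False))
        msgs.append(Message(sharer, requester, "InvalidateAck", False))
    msgs.append(Message(directory, requester, "DataReply", True))
    return msgs


msgs = read_exclusive("P1", "Dir", ["P2", "P3"])
for m in msgs:
    width = "cache block" if m.carries_data else "control only"
    print(f"{m.src:>3} -> {m.dst:<3} {m.kind:<14} ({width})")
```

In this sketch, five of the six messages are narrow control messages, which suggests why different transfers could be served by wires with different latency, bandwidth, and power characteristics.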

  16. Other Interconnect Choices • Optical interconnects: speed of light, cost in converting between optical and electrical domains • 3D chips: reduce communication distances, low cost for vertical signal transmission, increase in power density

  17. 3D Layouts • [Figure: two stacked dies (Die 0, Die 1) of clusters and cache banks, linked by intra-die horizontal wires and inter-die vertical wires: (a) Arch-1 (cache-on-cluster), (b) Arch-2 (cluster-on-cluster), (c) Arch-3 (staggered)]

  18. Upcoming Architecture Challenges • Improving single core performance • Functionalities in multi-core chips • Simplifying the programmer’s task • Efficient interconnects • Power and temperature-efficient designs • Designs tolerant of errors • [Callout] Clustered architectures: relatively low complexity; scalable solution; easily handles multiple threads

  19. Upcoming Architecture Challenges • Improving single core performance • Functionalities in multi-core chips • Simplifying the programmer’s task • Efficient interconnects • Power and temperature-efficient designs • Designs tolerant of errors • [Callout] Heterogeneous perf/power; cores that execute the OS; cores that verify results

  20. Upcoming Architecture Challenges • Improving single core performance • Functionalities in multi-core chips • Simplifying the programmer’s task • Efficient interconnects • Power and temperature-efficient designs • Designs tolerant of errors • [Callout] Hardware to support transactional memory
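
For readers unfamiliar with transactional memory, the sketch below illustrates the programming model in software: a transaction buffers its writes, remembers the version of everything it read, and commits only if none of those locations changed in the meantime. This is a simplified software illustration using per-address version counters, not a description of any hardware design from the talk; all class and variable names are invented for the example:

```python
class Conflict(Exception):
    """Raised when a transaction's reads were overwritten before it committed."""


class Memory:
    def __init__(self):
        self.values = {}    # address -> value
        self.versions = {}  # address -> number of committed writes


class Transaction:
    def __init__(self, memory):
        self.memory = memory
        self.read_versions = {}  # address -> version observed at first read
        self.writes = {}         # buffered writes, invisible until commit

    def read(self, addr):
        if addr in self.writes:
            return self.writes[addr]
        self.read_versions.setdefault(addr, self.memory.versions.get(addr, 0))
        return self.memory.values.get(addr, 0)

    def write(self, addr, value):
        self.writes[addr] = value

    def commit(self):
        # Validate: abort if anything we read has been committed over since.
        for addr, seen in self.read_versions.items():
            if self.memory.versions.get(addr, 0) != seen:
                raise Conflict(addr)
        for addr, value in self.writes.items():
            self.memory.values[addr] = value
            self.memory.versions[addr] = self.memory.versions.get(addr, 0) + 1


mem = Memory()
t1, t2 = Transaction(mem), Transaction(mem)
t1.write("counter", t1.read("counter") + 1)
t2.write("counter", t2.read("counter") + 1)  # overlaps with t1
t1.commit()
try:
    t2.commit()
    t2_aborted = False
except Conflict:
    t2_aborted = True
    retry = Transaction(mem)  # re-execute the aborted transaction
    retry.write("counter", retry.read("counter") + 1)
    retry.commit()

print("counter =", mem.values["counter"], "| t2 aborted:", t2_aborted)
```

The programmer writes no locks; the conflict is detected and the losing transaction simply re-executes, which is the kind of bookkeeping hardware support would aim to make cheap.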

  21. Upcoming Architecture Challenges • Improving single core performance • Functionalities in multi-core chips • Simplifying the programmer’s task • Efficient interconnects • Power and temperature-efficient designs • Designs tolerant of errors • [Callout] Faults are caused by high-energy particles that deposit enough charge to toggle bits; variations in conditions may cause a circuit to not produce its result in time
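
A toggled bit can be modeled as an XOR with a one-bit mask. The sketch below injects such a fault into a stored word and detects it with a single parity bit. Parity is used here only as a simple, well-known detection technique chosen for illustration; the slide does not say which mechanisms the research uses:

```python
def parity(word):
    """Even-parity bit: 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") % 2


def flip_bit(word, bit):
    """Model a particle strike that toggles one bit of the word."""
    return word ^ (1 << bit)


stored = 0b1011_0010
stored_parity = parity(stored)  # computed when the word is written

corrupted = flip_bit(stored, 5)
single_detected = parity(corrupted) != stored_parity

double_corrupted = flip_bit(corrupted, 0)  # a second flip in the same word
double_detected = parity(double_corrupted) != stored_parity

print("single-bit fault detected:", single_detected)
print("double-bit fault detected:", double_detected)
```

A single parity bit catches any one-bit flip but misses two flips in the same word, which is one reason error-tolerant designs often need stronger codes or redundant execution.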

  22. Research Methodologies • It’s all about the simulators! • SimpleScalar, Wattch, and HotSpot: about 10,000 lines of C code that model the flow of instructions through a modern processor • Inputs: a configuration file that specifies processor parameters, and a benchmark program (say, gzip) • Outputs: how long the program runs on the simulated processor (SimpleScalar), how much power is consumed (Wattch), and what the peak temperature is (HotSpot)
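
The inputs-and-outputs structure described above can be illustrated with a toy model. The sketch below is not the interface of SimpleScalar, Wattch, or HotSpot; it is a made-up, drastically simplified simulator that charges each instruction in a trace a fixed latency taken from a configuration dictionary and reports total cycles and run time. All parameter names and values are hypothetical:

```python
def simulate(config, trace):
    """Toy in-order model: each instruction occupies the core for its full latency."""
    cycles = sum(config["latency_cycles"][op] for op in trace)
    time_ns = cycles / config["clock_ghz"]
    return cycles, time_ns


# "Configuration file": processor parameters (hypothetical values).
config = {
    "clock_ghz": 2.0,
    "latency_cycles": {"alu": 1, "load_hit": 2, "load_miss": 100, "branch": 1},
}

# "Benchmark program": a tiny instruction trace standing in for a real workload.
trace = ["alu", "load_hit", "alu", "load_miss", "branch"]

cycles, time_ns = simulate(config, trace)
print(f"{cycles} cycles, {time_ns:.1f} ns")
```

Changing one entry in the configuration, such as the miss latency, and re-running the same trace is the basic experiment loop; real simulators model far more detail, including overlapping instructions, caches, and branch prediction.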

  23. Evaluating a New Idea • Lots of reading (it’s better than waiting for divine inspiration) • Identify bottlenecks, identify problems, develop an idea, repeatedly question that idea • Understand the simulator • Engineer a solution, modify simulator code (perhaps writing fewer than 1000 lines of C code) • Analyze data (things never work the first time), engineer/optimize/debug your solution • Write papers • Implement in silicon?

  24. To Learn More… • CS/EE 3810: Computer Organization • CS/EE 6810: Computer Architecture • CS/EE 7810: Advanced Computer Architecture • CS/EE 7820: Parallel Computer Architecture • CS 7937 / 7940: Architecture Reading Seminar
