
Architectural Musings Rethinking Computer Systems Architecture

Architectural Musings Rethinking Computer Systems Architecture. Christopher Vick cvick@qualcomm.com June 3, 2012. Introduction. Vision Talk Mobile computing and current technologies fundamentally change key parameters and constraints for computer system architecture


Presentation Transcript


  1. Architectural Musings Rethinking Computer Systems Architecture Christopher Vick cvick@qualcomm.com June 3, 2012

  2. Introduction • Vision Talk • Mobile computing and current technologies fundamentally change key parameters and constraints for computer system architecture • Vast new opportunities for research of great interest to and great relevance for industry

  3. Outline • Computer System Architecture • Then (Circa 1970) • Scarce Resources & Bottlenecks • Optimizations • Now (Mobile Computing Platforms) • Scarce Resources & Bottlenecks • Optimizations? • Qualcomm Research • Questions?

  4. Computer System Architecture

  5. Computer System Architecture • Hardware • The 5 classic components (Patterson & Hennessy) • Input, Output, Memory, Datapath, Control • Software • System Virtual Machine (Hypervisor, VM, or VMM) • Operating System • Compilers & Tools • Definitions • The way components fit together • The arrangement of the various devices in a complete computer system or network • The instruction set plus a model of the execution of the instruction set (Amdahl et al.) • Computer System Architecture • The selection and combination of hardware and software components to assemble an effective computer system

  6. Combination

  7. Effective • An optimization problem • Many variables • Selection of hardware/software components • Selection of interfaces/interconnects • Many constraints • Physical, sociological, technical & cost constraints • Scarce Resources and Bottlenecks • Maximize utilization of scarce resources • Minimize impact of bottlenecks

  8. Then (Circa 1970)

  9. Scarce Resources • CPU Cycles • CPUs expensive • Slow clock rates • Memory Locations • Random Access Memory expensive • Address/Data paths into CPU expensive • Skilled Programmers • Relatively new discipline • Poor language and tools support

  10. Bottlenecks • Programmer Productivity • Software development slow and expensive • Low level programming paradigms • Memory Latency • RAM latency gated overall speed (~2-3 MHz) • Small RAM backed by vastly slower storage • I/O Bandwidth • Limited CPU connectivity • Crude communication mechanisms

  11. Optimizations • Time Sharing • Effective sharing of limited resource • Virtual Memory • Effective sharing, and backing with cheaper alternative • Hardware Improvements • Smaller features provide more resource and faster clock • Large Scale Integration • Better signaling to improve bandwidth • High Level Programming Languages • Broadens productive programmer community • Abstracts away some hardware complexity

  12. Examples • Digital PDP-11 • 16-bit address space • Orthogonal instruction set • Memory mapped I/O • Unix, DOS, many others • IBM System/370 • 24-bit address space • Virtual Memory • MVS, VM/370, DOS/VS • Backward compatibility with System/360

  13. Now (Mobile Computing)

  14. Scarce Resources • Energy • Fixed Energy Budget for mobile devices • Thermal issues at all scales • Tradeoff between performance and energy • Shrinks no longer significantly improving consumption • Memory Bandwidth • Providing bandwidth is expensive • Memory interconnect consumes significant energy

  15. Bottlenecks • Memory Latency • Increasing gap between CPU speed and DRAM latency • Physical distance to DRAM devices a factor • Concurrency • Shortage of programmers who can handle this • Inadequate language/tools support • I/O Bandwidth/Latency • Wireless bandwidth lower than wired • Consumes large amounts of energy

  16. Example • HTC One • Processor: 1.5 GHz Dual Core Qualcomm MSM8960 • OS: Android™ 4.0 (ICS) • Memory RAM: 1 GB DDR2 • Memory Storage: 16 GB onboard storage • Display: 4.7" HD super LCD 1280 x 720 • Network: LTE CAT3 - DL 100 /UL 50 LTE: 700/AWS WCDMA: 2100/1900/AWS/850 EDGE: 850/900/1800/1900 • Battery: 1800 mAh • Camera (Main): 8 MP, f/2.0, BSI, 1080p HD Video (Front): 1.3 MP with 720p video • Dimensions: 134.8 x 69.9 x 8.9mm • This is a General Purpose Computer!
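The fixed energy budget on the "Scarce Resources" slide can be made concrete from the battery figure above. The following is a rough sketch only: the 3.7 V nominal cell voltage is an assumption (the slide gives capacity in mAh but no voltage), and the function name is illustrative.

```python
# Rough energy budget for the 1800 mAh battery listed on the slide.
# ASSUMPTION: 3.7 V nominal cell voltage; the slide does not state a voltage.
BATTERY_MAH = 1800
ASSUMED_NOMINAL_VOLTS = 3.7


def battery_energy_joules(capacity_mah, volts):
    """Convert a capacity in mAh at a given voltage to joules (1 Wh = 3600 J)."""
    watt_hours = (capacity_mah / 1000.0) * volts
    return watt_hours * 3600.0


budget_j = battery_energy_joules(BATTERY_MAH, ASSUMED_NOMINAL_VOLTS)

# Average power the whole device may draw if one charge must last 24 hours.
avg_watts_for_one_day = budget_j / (24 * 3600)

print(f"budget: {budget_j:.0f} J, average draw for 24 h: {avg_watts_for_one_day:.3f} W")
```

Under that voltage assumption the budget is roughly 24 kJ, so display, radio, memory and CPU together must average well under a watt for a full day on one charge.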

  17. Optimizations? • Multi-core • Aggressive addition of cores and threads • Hardware concurrency outstripping software • New Concurrent Programming Models/Tools? • Memory Subsystem • Significant contributor to total energy consumption • Adding bandwidth is expensive • New technologies addressing some energy issues • Wireless bandwidth enhancements (LTE Advanced, etc.) • Solutions from desktop/server or embedded worlds may not directly apply in mobile space!

  18. Memory System Energy • Retaining data (one second) • DRAM: ~1-10 pJ/bit self-refresh • SRAM: 1200+ pJ/bit, and rising over time [ITRS 2009] • 4 pJ/bit (45nm LP, standby) [Barasinski et al., ESSCIRC ’08] • Flash, PCM, STT RAM…: Zero! • Moving Data • 32-bit value: • Recompute: 60 pJ (Razor) • Send 1mm: 10 pJ • Retain in cache for 1 ms: 38 pJ • Retain in DRAM for 1 second: 32+ pJ
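The 32-bit figures on this slide appear to follow from the per-bit retention numbers by simple scaling. The sketch below reproduces that arithmetic. It assumes retention energy scales linearly with bit count and time, and it takes the low end of the DRAM range (1 pJ/bit per second); the function and constant names are illustrative.

```python
# Retention figures quoted on the slide, in pJ per bit per second.
DRAM_PJ_PER_BIT_S = 1.0     # low end of the ~1-10 pJ/bit self-refresh range
SRAM_PJ_PER_BIT_S = 1200.0  # ITRS 2009 figure quoted on the slide

# Per-operation figures for a 32-bit value, also from the slide.
RECOMPUTE_PJ = 60.0
SEND_1MM_PJ = 10.0
WORD_BITS = 32


def retention_energy_pj(bits, seconds, pj_per_bit_s):
    """Energy in pJ to hold `bits` bits for `seconds`, assuming linear scaling."""
    return bits * seconds * pj_per_bit_s


cache_1ms_pj = retention_energy_pj(WORD_BITS, 1e-3, SRAM_PJ_PER_BIT_S)
dram_1s_pj = retention_energy_pj(WORD_BITS, 1.0, DRAM_PJ_PER_BIT_S)

# Time after which holding the word in SRAM costs more than recomputing it.
break_even_s = RECOMPUTE_PJ / (WORD_BITS * SRAM_PJ_PER_BIT_S)

print(f"cache, 1 ms: {cache_1ms_pj:.1f} pJ")
print(f"DRAM, 1 s:   {dram_1s_pj:.1f} pJ")
print(f"SRAM retention exceeds recompute after {break_even_s * 1e3:.2f} ms")
```

With these inputs the cache and DRAM results land on the slide's 38 pJ and 32 pJ. The break-even line is an extrapolation from the slide's numbers, not a figure the talk states.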

  19. Reducing Memory System Energy • Move less! • Caches physically close to CPU • Locality, locality, locality (the first rule of chip real estate) • Retain less! • Power off unused cache lines [Kaxiras et al., ISCA ’01] • “Drowsy” caches [Flautner et al., ISCA ’02] • … with compiler analysis [Zhang et al., Trans. Emb. Comp. Sys. 4(3) 2005] • Don’t refresh unused DRAM • … e.g. with garbage collection [Chen et al., CODES+ISSS ’03]
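The "power off unused cache lines" idea can be illustrated with a toy model. This is a deliberately simplified sketch, not the mechanism from the cited papers: it assumes a fully associative cache with unlimited capacity, one time tick per access, and a single fixed idle threshold. The class and method names are hypothetical.

```python
class DecayCache:
    """Toy cache that powers off any line idle for more than `decay_interval` ticks."""

    def __init__(self, decay_interval):
        self.decay_interval = decay_interval
        self.last_access = {}  # line address -> tick of its most recent access
        self.now = 0

    def access(self, line_addr):
        """Advance time one tick, expire idle lines, and return True on a hit."""
        self.now += 1
        self._decay()
        hit = line_addr in self.last_access
        self.last_access[line_addr] = self.now
        return hit

    def _decay(self):
        # A powered-off line loses its contents, so it is simply dropped here.
        expired = [addr for addr, tick in self.last_access.items()
                   if self.now - tick > self.decay_interval]
        for addr in expired:
            del self.last_access[addr]

    def powered_lines(self):
        """Number of lines currently drawing retention power."""
        return len(self.last_access)


cache = DecayCache(decay_interval=2)
trace = [0xA, 0xB, 0xB, 0xB, 0xA]
hits = [cache.access(addr) for addr in trace]
print(hits, cache.powered_lines())
```

The trade-off the slide points at shows up in the final access: line `0xA` was switched off while idle, which saves retention energy but turns what would have been a hit into a miss.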

  20. Extending the Memory Model • Maintaining the illusion of a single flat memory address space is too expensive • On-chip caches can be major consumers of area and energy • Coherence protocols are expensive and difficult to scale • Alternative: software-managed memory hierarchies • Tightly-coupled memory (TCM), scratchpads • Do not require tag memory, address comparison logic • More area- and energy-efficient • Help bridge gap between bandwidth and throughput

  21. New Challenges and Opportunities • Different programming paradigm: software explicitly orchestrates all transfers between on-chip and off-chip memory areas • Major implications on memory management • Scratchpad allocation strategies • Data partitioning strategies • Dynamic relocation between scratchpad and DRAM to track the program’s locality characteristics • Opportunities for compile-time and runtime optimization • Challenges in both Hardware and Software!
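The programming paradigm described here, where software rather than a hardware cache orchestrates every transfer, can be sketched as explicit tiling. This is an illustrative model only: the scratchpad is a plain Python list, its 4-word capacity is a made-up figure, and `scale_in_tiles` is a hypothetical name.

```python
SCRATCHPAD_WORDS = 4  # hypothetical scratchpad capacity, chosen for illustration


def scale_in_tiles(dram, factor, tile_words=SCRATCHPAD_WORDS):
    """Scale every element of `dram` in place, staging one tile at a time.

    Returns the number of words moved between DRAM and the scratchpad,
    which is the traffic a real system would pay energy for.
    """
    words_moved = 0
    for start in range(0, len(dram), tile_words):
        # Explicit copy-in: software decides what enters the scratchpad.
        scratchpad = dram[start:start + tile_words]
        words_moved += len(scratchpad)

        # Compute touches only scratchpad-resident data.
        scratchpad = [value * factor for value in scratchpad]

        # Explicit copy-out back to off-chip memory.
        dram[start:start + len(scratchpad)] = scratchpad
        words_moved += len(scratchpad)
    return words_moved


data = list(range(10))
moved = scale_in_tiles(data, 2)
print(data, moved)
```

Because every transfer is visible in the code, a compiler or runtime can count and reorder them, which is where the optimization opportunities listed on the slide come from.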

  22. Qualcomm Research Excellence in Wireless May 2012 www.qualcomm.com/research

  23. State of the Art Capabilities Fostering Innovation • Human Resources • 30% of engineers with PhD, 50% Masters • Systems, HW, SW, Standards, Test Engineering • Ventures, Bus Dev, Technical Marketing, Program Mgmt. • Complete Development Labs • Prototype Development Facilities • CPU Simulation Clusters • Antenna Ranges • Outdoor Field Systems

  24. Global Research and Development Organization

  25. Qualcomm Research & University Relations • Academic collaboration to foster advanced research • Ongoing relations with more than 30 US and 25 international universities • Current funding includes MIT, UC Berkeley, Stanford, UCSD, UT Austin, ASU, UIUC, Univ. of Michigan, EPFL, IISc Bangalore, KAIST, Tsinghua • Research collaboration spans a variety of technical areas • Computer vision, multicore processing, context aware computing, machine learning, low power devices, wireless networks and signal processing, etc. • Qualcomm Innovation Fellowship (QInF) invests in innovative ideas • Close interactions between Qualcomm Research engineers, graduate students and professors

  26. Qualcomm Research For The Wireless Future • Take WWAN to the next level • Innovate beyond WAN • Enable Smart Applications • Breakthrough performance • Application Enablers • Processors & Devices • Wireless Local Area • 3G/4G • RE-ARCHITECTING NEXT-GEN MOBILE DEVICES • EXCELLING IN ALL FORMS OF WIRELESS • TRANSFORMING THE MOBILE USER EXPERIENCE • IMPROVING WWAN TECHNOLOGY

  27. Innovate Beyond WAN Wireless Local Area

  28. Enable Smart Applications Elevate the wireless user experience

  29. Breakthrough Device Performance Re-architecting next-gen devices

  30. Thank You
