
Scalable Processor Design

Kshitij Bantupalli, Peter Ding, Teddy Mopewou


Presentation Transcript


  1. Scalable Processor Design • Kshitij Bantupalli • Peter Ding • Teddy Mopewou

  2. The Processor • Electronic circuit • External data source • Memory or data stream • Central processing unit (CPU) • Specialized Processors • Graphics processing unit (GPU) • Neural processing unit (NPU)

  3. CPU • Fetch • Retrieves an instruction from memory • Program counter • Decode • Instruction decoder • Instruction set architecture (ISA) • Convert the instruction into signals that control other parts of CPU • Execute • Perform the instruction
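
The fetch, decode, and execute steps above can be sketched as a small loop. This is only an illustration on a made-up accumulator machine: the opcodes (`LOAD`, `ADD`, `STORE`, `HALT`), the tuple encoding, and the `run` function are assumptions, not part of the slides or of any real ISA.

```python
# Toy accumulator machine; opcodes and encoding are invented for illustration.
def run(program, memory):
    pc = 0    # program counter: index of the next instruction to fetch
    acc = 0   # single accumulator register
    while True:
        instruction = program[pc]        # fetch the instruction at the PC
        pc += 1
        opcode, operand = instruction    # decode into opcode and operand
        if opcode == "LOAD":             # execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")


memory = run(
    [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)],
    {0: 7, 1: 5, 2: 0},
)
print(memory[2])  # 12
```

A real decoder turns bit fields into control signals; the tuple unpacking here only stands in for that step.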

  4. CPU Components • Control unit • Datapath • Arithmetic logic unit and pipelines • Memory • Register files, caches, memory management unit • Clock • Synchronous circuits

  5. ISA • Interface between software and hardware • Specifies the instructions the computer can perform and the format of each instruction • Complex instruction set computer (CISC) • Reduced instruction set computer (RISC) • Very long instruction word (VLIW) • Explicitly parallel instruction computing (EPIC)

  6. CISC • Attempts to minimize the number of instructions per program • Sacrifices number of cycles per instruction • Multiple operations are embedded in one instruction • Complex instructions of varying lengths • Large number of instructions • Transistors for storing complex instructions
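
The trade-off on this slide is commonly written as CPU time = instructions per program × cycles per instruction × clock period. A minimal sketch of that product follows; the function name and every number in it are hypothetical and chosen only to show how the factors interact, not to measure any real processor.

```python
def cpu_time(instruction_count, cycles_per_instruction, clock_hz):
    """Execution time in seconds: instructions x CPI / clock frequency."""
    return instruction_count * cycles_per_instruction / clock_hz


# Hypothetical workload: fewer but slower instructions versus
# more but faster instructions, both at the same 1 GHz clock.
few_complex = cpu_time(1_000_000, 4.0, 1e9)
many_simple = cpu_time(1_500_000, 1.2, 1e9)
print(few_complex, many_simple)
```

With other assumed numbers the comparison can go the other way; the point is only that instruction count and cycles per instruction multiply.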

  7. Advantages of CISC • Microprogramming is as easy as assembly language to implement, and less expensive than hardwiring a control unit. • As each instruction became more capable, fewer instructions were needed to implement a given task.

  8. Disadvantages of CISC • Performance suffers because different instructions take different numbers of clock cycles. • Only about 20% of the available instructions are used in a typical program; many specialized instructions are rarely used at all.

  9. RISC • Attempts to reduce the cycles per instruction • Sacrificing number of instructions per program • Pipelining • Overlapping the execution of several instructions in a pipeline • One cycle execution time • Fixed length instructions
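
The benefit of overlapping instructions can be estimated with a simple cycle count. The sketch below assumes an idealized pipeline in which every stage takes one cycle and there are no stalls or hazards; the function names are illustrative.

```python
def cycles_unpipelined(n_instructions, n_stages):
    # Each instruction passes through every stage before the next one starts.
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages):
    # The first instruction needs n_stages cycles to fill the pipeline;
    # after that, one instruction completes per cycle.
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


print(cycles_unpipelined(100, 5))  # 500
print(cycles_pipelined(100, 5))    # 104
```

Under these assumptions the pipelined machine approaches one completed instruction per cycle as the instruction count grows, which is the "one cycle execution time" goal on the slide.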

  10. Advantages of RISC over CISC • Many RISC processors use registers for passing arguments and holding local variables • A reduced instruction set requires fewer transistors • Leaves more room for general-purpose registers • Fixed-length instructions are easy to pipeline • Operation speed can be maximized and execution time minimized • More power efficient • Well suited to mobile devices

  11. CISC vs. RISC Implementation • CISC • MULT 2:3, 5:2 • RISC • LOAD A, 2:3 • LOAD B, 5:2 • PROD A,B • STORE 2:3, A
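
The two instruction sequences on this slide can be mimicked in a few lines. This is a sketch only: memory is modeled as a dictionary keyed by the slide's location labels (`"2:3"`, `"5:2"`), and the function names and initial values are assumptions.

```python
def cisc_mult(memory, dst, src):
    # MULT 2:3, 5:2 : one instruction loads both operands, multiplies,
    # and stores the product back to the first location.
    memory[dst] = memory[dst] * memory[src]


def risc_mult(memory, dst, src):
    registers = {}
    registers["A"] = memory[dst]                       # LOAD A, 2:3
    registers["B"] = memory[src]                       # LOAD B, 5:2
    registers["A"] = registers["A"] * registers["B"]   # PROD A, B
    memory[dst] = registers["A"]                       # STORE 2:3, A
    return registers


cisc_memory = {"2:3": 6, "5:2": 7}
risc_memory = {"2:3": 6, "5:2": 7}
cisc_mult(cisc_memory, "2:3", "5:2")
left_in_registers = risc_mult(risc_memory, "2:3", "5:2")
print(cisc_memory["2:3"], risc_memory["2:3"])  # 42 42
```

Both versions leave the same product in memory. The RISC version takes four instructions, but the operands stay in registers afterwards and can be reused without another load.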

  12. Register Windows • Register window overflow and underflow • Register window management • Register window incremental compilation

  13. Introduction to SPARC • Short for Scalable Processor Architecture. • A reduced instruction set computing (RISC) architecture originally developed by Sun Microsystems. • An implementation may have from 72 to 640 general-purpose 64-bit registers. • It is scalable because the same instruction set is shared from embedded processors up to large server processors. • The number of implemented register windows changes with scale.
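
Register windows, and the overflow and underflow cases named on the previous slide, can be modeled with a circular register file. The sketch below is a simplified toy, not SPARC's actual trap mechanism: it assumes the conventional layout of 8 in, 8 local, and 8 out registers per window, omits the global registers, advances the window pointer upward on a call, and uses a hypothetical `RegisterWindows` class.

```python
class RegisterWindows:
    """Toy model of overlapping register windows (8 in, 8 local, 8 out)."""

    def __init__(self, n_windows):
        if n_windows < 2:
            raise ValueError("need at least two windows")
        self.n = n_windows
        self.regs = [0] * (16 * n_windows)  # windowed registers only, no globals
        self.cwp = 0          # current window pointer
        self.resident = 1     # frames currently held in the register file
        self.depth = 1        # total active procedure frames
        self.stack = []       # windows spilled to memory
        self.overflows = 0
        self.underflows = 0

    def _index(self, kind, i):
        # The outs of window w are the same physical registers as the
        # ins of window w + 1, which is how arguments are passed.
        offset = {"in": 0, "local": 8, "out": 16}[kind]
        return (16 * self.cwp + offset + i) % len(self.regs)

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def call(self):
        if self.resident == self.n - 1:
            # Overflow: spill the oldest window's ins and locals to memory.
            oldest = (self.cwp - self.resident + 1) % self.n
            self.stack.append(self.regs[16 * oldest:16 * oldest + 16])
            self.resident -= 1
            self.overflows += 1
        self.cwp = (self.cwp + 1) % self.n
        self.resident += 1
        self.depth += 1

    def ret(self):
        if self.depth == 1:
            raise RuntimeError("return from outermost frame")
        if self.resident == 1:
            # Underflow: the caller's window was spilled, so refill it.
            caller = (self.cwp - 1) % self.n
            self.regs[16 * caller:16 * caller + 16] = self.stack.pop()
            self.resident += 1
            self.underflows += 1
        self.cwp = (self.cwp - 1) % self.n
        self.resident -= 1
        self.depth -= 1


rw = RegisterWindows(3)
rw.write("out", 0, 42)   # caller places an argument in its out register
rw.call()
print(rw.read("in", 0))  # 42: the callee sees it as an in register
```

In this model an implementation with more windows tolerates deeper call chains before spilling, which is one way the register count can scale while the instruction set stays the same.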

  14. VLIW • Processors that execute one instruction after another may use resources inefficiently • Poor performance • Introduced by Josh Fisher of Yale University in the early 1980s • Exploits instruction-level parallelism • Executes more than one operation at a time, as a superscalar processor does, but with the schedule fixed at compile time • No interdependencies among the operations in one instruction word • Moves the complexity from the hardware to the software • The compiler handles the rest
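
The compiler's job of grouping independent operations into long instruction words can be sketched with a greedy, in-order packer. This is an illustration under stated assumptions: the `(name, dest, sources)` tuple format, the `pack_bundles` function, and the conservative rule of rejecting any register overlap within a word are all hypothetical, and real VLIW compilers use far more sophisticated scheduling.

```python
def pack_bundles(instructions, width):
    """Greedily pack (name, dest, sources) operations into words of `width`."""
    bundles, current = [], []
    for name, dest, sources in instructions:
        written = {d for _, d, _ in current}
        read = {s for _, _, srcs in current for s in srcs}
        # Conservative check: no register written in this word may be read
        # or written by the new operation, and the new operation may not
        # overwrite a register the word still reads.
        conflict = bool(written & set(sources)) or dest in written or dest in read
        if conflict or len(current) == width:
            bundles.append(current)
            current = []
        current.append((name, dest, sources))
    if current:
        bundles.append(current)
    # Unused slots are padded with explicit no-ops.
    return [[op[0] for op in b] + ["nop"] * (width - len(b)) for b in bundles]


program = [
    ("i1", "r1", ["r2", "r3"]),
    ("i2", "r4", ["r5", "r6"]),
    ("i3", "r7", ["r1", "r4"]),  # depends on i1 and i2
    ("i4", "r8", ["r2", "r5"]),
]
print(pack_bundles(program, 2))  # [['i1', 'i2'], ['i3', 'i4']]
```

The hardware can issue each word without checking dependencies because the packer has already guaranteed there are none inside it.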

  15. VLIW Disadvantages • VLIW instruction sets are not backwards compatible between implementations • Load responses from memory do not have a deterministic delay • Static scheduling of load instructions is therefore very difficult for the compiler

  16. EPIC • Hewlett-Packard (HP) • Each group of multiple software instructions (a bundle) has a stop bit • Dependency information • Software prefetch instruction • Speculative load instruction • Check load instruction

  17. Multithreading • A single CPU core executes multiple processes or threads concurrently • Better utilization of CPU resources • Multiple threads contending for shared resources can degrade performance

  18. Multithreading Types • Coarse-grained multithreading • Interleaved multithreading • Simultaneous multithreading
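
The first two types differ in when the core switches threads, which a small issue-order simulation can show. The sketch below is a simplification: threads are lists of instruction labels, the label `"miss"` stands in for a long-latency event such as a cache miss, and the function names are assumptions.

```python
def interleaved(threads):
    """Switch threads every cycle (round robin over threads with work left)."""
    queues = [list(t) for t in threads]
    order = []
    while any(queues):
        for tid, queue in enumerate(queues):
            if queue:
                order.append((tid, queue.pop(0)))
    return order


def coarse_grained(threads):
    """Run one thread until it hits a long stall ("miss"), then switch."""
    queues = [list(t) for t in threads]
    order = []
    tid = 0
    while any(queues):
        queue = queues[tid]
        while queue:
            op = queue.pop(0)
            order.append((tid, op))
            if op == "miss":
                break
        tid = (tid + 1) % len(queues)
    return order


threads = [["a0", "miss", "a1"], ["b0", "b1"]]
print(interleaved(threads))
# [(0, 'a0'), (1, 'b0'), (0, 'miss'), (1, 'b1'), (0, 'a1')]
print(coarse_grained(threads))
# [(0, 'a0'), (0, 'miss'), (1, 'b0'), (1, 'b1'), (0, 'a1')]
```

Simultaneous multithreading is not modeled here; it issues instructions from several threads in the same cycle, so it would need a per-cycle list of issue slots rather than a single issue order.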

  19. Multi-core Processors • Implement multiprocessing in a single physical package • Cache coherency circuitry can operate at a significantly higher clock rate than off-chip signaling allows • Signals travel shorter distances, leading to less degradation • Use less power than an equivalent set of separate processors

  20. Multi-core Processors Disadvantages • Maximizing the potential of multi-core processors requires adjustments to the operating system and application software • More difficult to manage thermally • Lower chip production yields

  21. Asynchronous CPUs • No central clock • “Pipeline controls” or “FIFO sequencers” • Each stage of logic starts as soon as the previous stage has completed • Lower power consumption and electromagnetic interference • Speed limited only by the propagation delays of the logic gates • Components can run at different speeds • By contrast, clocked CPU components are all synchronized to the central clock • Biggest disadvantage is that most CPU design and testing tools are made for clocked CPUs

  22. Caltech Asynchronous Microprocessor • World’s first asynchronous microprocessor in 1988 • When hot coffee was placed on the chip, the pulse rate slowed down • When liquid nitrogen was poured on the chip, the pulse rate increased • Ran on a potato

  23. Optical Processors • Use light instead of electricity for digital logic • Up to 30% faster • Use less power • However, current electronic computing elements are far cheaper, faster, and more reliable • Economically unfeasible for the foreseeable future

  24. Processor Design • Instructions per second • Floating point operations per second (FLOPS) • Performance per watt • Parallel computing • Low power consumption • Small size or low weight • Portable embedded systems
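
Two of the metrics on this slide reduce to simple arithmetic. The sketch below assumes the usual peak estimate of cores × clock rate × floating-point operations per cycle; the function names and all input values are hypothetical and do not describe any real chip.

```python
def peak_flops(cores, clock_hz, flops_per_cycle):
    """Theoretical peak floating-point operations per second."""
    return cores * clock_hz * flops_per_cycle


def performance_per_watt(flops, watts):
    """FLOPS delivered for each watt of power drawn."""
    return flops / watts


# Hypothetical 4-core part at 2 GHz doing 8 FLOPs per cycle per core, at 32 W.
peak = peak_flops(4, 2.0e9, 8)
efficiency = performance_per_watt(peak, 32.0)
print(peak, efficiency)
```

Sustained performance on real workloads is normally below such a peak figure, so measured results are what matter when comparing designs for portable or embedded use.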

  25. References • http://www.edgefxkits.com/blog/what-is-risc-and-cisc-architecture/ • https://cs.stanford.edu/people/eroberts/courses/soco/projects/risc/risccisc/ • http://www.async.caltech.edu/cam.html • http://www.async.caltech.edu/~mika/potato/potato.html • https://dl.acm.org/citation.cfm?doid=800046.801649 • https://courses.cs.washington.edu/courses/cse471/01au/epic_cgi.pdf
