
AMD Fusion APU




  1. AMD Fusion APU Johnathan Alsop, Jereme Lamps, James Szczypta

  2. What is an AMD APU?
     • Accelerated Processing Unit (APU)
     • Mixture of processor, graphics, and multimedia resources
     • Better performance
     • Increased battery life
     • Lower cost

  3. Architecture (Llano)

  4. CPU Features 1/2
     • Instruction-pointer-based hardware prefetcher
     • Learns a ‘stride’ associated with an operation at the instruction pointer
     • Tracks a wider range of strides than its predecessor
     Source: https://wiki.engr.illinois.edu/download/attachments/217842128/7-Memory.pdf?version=2&modificationDate=1362506665000
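The stride-learning idea on this slide can be illustrated with a minimal Python sketch. This is a toy model, not Llano's actual hardware design: the table layout, the confidence counter, and the `threshold` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StrideEntry:
    """Per-instruction-pointer state (hypothetical layout)."""
    last_addr: int
    stride: int = 0
    confidence: int = 0


class IPStridePrefetcher:
    """Toy instruction-pointer-indexed stride prefetcher."""

    def __init__(self, threshold=2):
        # threshold: how many times a stride must repeat before prefetching
        # (an assumed tuning knob, not a documented Llano parameter).
        self.table = {}
        self.threshold = threshold

    def access(self, ip, addr):
        """Record a load issued at `ip`; return an address to prefetch, or None."""
        entry = self.table.get(ip)
        if entry is None:
            # First time this instruction is seen: just remember the address.
            self.table[ip] = StrideEntry(last_addr=addr)
            return None
        stride = addr - entry.last_addr
        if stride != 0 and stride == entry.stride:
            # Same stride as last time: grow confidence.
            entry.confidence += 1
        else:
            # New stride: start learning again.
            entry.stride = stride
            entry.confidence = 0
        entry.last_addr = addr
        if entry.confidence >= self.threshold:
            return addr + entry.stride
        return None


def run_trace(ip, addrs, threshold=2):
    """Feed a list of addresses from one instruction; collect the predictions."""
    pf = IPStridePrefetcher(threshold)
    return [pf.access(ip, a) for a in addrs]
```

For a load that walks an array in 64-byte steps, `run_trace(0x400, [1000, 1064, 1128, 1192, 1256])` issues no prefetch for the first three accesses while the stride is being learned, then predicts the next address on each later access.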

  5. CPU Features 2/2
     • ROB holds 84 operations
     • Reservation stations hold 30 operations
     • Load/store buffer holds 52 operations

  6. GPU Features 1/2
     AMD Radeon VLIW-5 Core:
     • 4 stream cores, 1 special-function core, branch unit, general-purpose registers
     • Co-issue MUL and dependent ADD in a single clock
     • 4 24-bit Int MUL or ADD per clock
     • 2 64-bit FP MUL or ADD per clock
     • 4 32-bit FP MULADD per clock

  7. GPU Features 2/2
     • Combine 16 of these to form a SIMD
     • GPU contains 5 SIMDs
     • 400 processing units, ~480 GFLOPS throughput
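The 400-unit and ~480 GFLOPS figures can be reproduced with a short calculation. The sketch below assumes a 600 MHz GPU clock (not stated on the slides) and assumes that each of the 400 processing units is counted as one multiply-add, i.e. 2 floating-point operations, per clock; the function names are illustrative.

```python
UNITS_PER_VLIW5 = 5        # 4 stream cores + 1 special-function core (slide 6)
VLIW5_PER_SIMD = 16        # slide 7
SIMDS_PER_GPU = 5          # slide 7
FLOPS_PER_MULADD = 2       # one multiply plus one add
ASSUMED_CLOCK_MHZ = 600    # assumption: clock is not given in the slides


def processing_units():
    """Total processing units in the GPU."""
    return UNITS_PER_VLIW5 * VLIW5_PER_SIMD * SIMDS_PER_GPU


def peak_gflops(clock_mhz=ASSUMED_CLOCK_MHZ):
    """Peak single-precision throughput in GFLOPS under the stated assumptions."""
    return processing_units() * FLOPS_PER_MULADD * clock_mhz / 1000


if __name__ == "__main__":
    print(processing_units(), "units,", peak_gflops(), "GFLOPS")
```

Under these assumptions, 5 × 16 × 5 gives 400 units, and 400 × 2 × 600 MHz gives the quoted 480 GFLOPS.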

  8. Additional Features
     • Turbo Core
       • Automatically overclocks the processor when under heavy load
     • Power Gating
       • All 4 CPU cores and the GPU can be individually power gated

  9. Memory Requirements
     • GPU
       • Many cores, many threads per core
       • High memory intensity
       • Can hide memory latency with useful work
       • Data-level parallelism
     • CPU
       • Fewer threads per core
       • Low memory intensity
       • Needs low-latency memory access
       • Utilizes large cache, ILP

  10. GPUs: typically discrete accelerators
     • Usually GPU on a separate chip
     • Has its own RAM
     • Message passing model (argument lists abbreviated):
         // copy data to GPU
         clEnqueueWriteBuffer(host_mem, device_mem, size);
         // launch kernel
         clEnqueueNDRangeKernel(kernel);
         // copy data back to CPU
         clEnqueueReadBuffer(device_mem, host_mem, size);
     Image from AMD fusion whitepaper: http://sites.amd.com/cn/Documents/48423B_fusion_whitepaper_WEB.pdf

  11. Fusion memory system: on-chip GPU
     • GPU on the same chip
     • Shares interface to RAM
     • Shared memory model (almost)
     Image from AMD fusion whitepaper: http://sites.amd.com/cn/Documents/48423B_fusion_whitepaper_WEB.pdf
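The practical difference between the message passing model of slide 10 and the shared memory model of slide 11 can be sketched with a rough cost model. Everything here is a simplifying assumption for illustration: the function names are hypothetical, transfers are treated as pure bandwidth-limited copies, and the "almost" on the slide (the shared model still has access-path costs) is ignored.

```python
def discrete_gpu_time(bytes_in, bytes_out, kernel_seconds, link_bytes_per_second):
    """Copy inputs to device RAM, run the kernel, copy results back."""
    copy_in = bytes_in / link_bytes_per_second
    copy_out = bytes_out / link_bytes_per_second
    return copy_in + kernel_seconds + copy_out


def shared_memory_time(kernel_seconds):
    """CPU and GPU share the interface to RAM, so no explicit copies are modeled."""
    return kernel_seconds


def copy_overhead_fraction(bytes_in, bytes_out, kernel_seconds, link_bytes_per_second):
    """Fraction of the discrete-GPU run time spent on data transfer."""
    total = discrete_gpu_time(bytes_in, bytes_out, kernel_seconds, link_bytes_per_second)
    return (total - kernel_seconds) / total
```

With made-up inputs such as 4 GB in, 2 GB out, an 8 GB/s link, and a 1 s kernel, the model spends 0.75 s of 1.75 s on copies, which is the kind of overhead an on-chip GPU aims to avoid.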

  12. AMD Fusion Llano memory diagram
     • 4 CPU cores
     • 64 KB I-cache
     • 64 KB D-cache
     • 1 MB L2 cache
     • 4 write-combine buffers
     • GPU cores, I/O
     • Memory interconnect
     • Unified Northbridge
     • Fusion Compute Link
     • Radeon Memory Bus
     Image from "AMD’s ‘Llano’ Fusion APU." In Hot Chips, vol. 23. 2011

  13. Memory Types
     • Cacheable
     • Uncacheable
     • Local

  14. Cacheable Memory – CPU access Image from "Memory system on fusion APUs." AMD Fusion developer summit (2011)

  15. Cacheable Memory – GPU access FCL Image from "Memory system on fusion APUs." AMD Fusion developer summit (2011)

  16. Uncacheable Memory – CPU access Image from "Memory system on fusion APUs." AMD Fusion developer summit (2011)

  17. Uncacheable Memory – GPU access RMB Image from "Memory system on fusion APUs." AMD Fusion developer summit (2011)

  18. Local Memory – CPU access Image from "Memory system on fusion APUs." AMD Fusion developer summit (2011)

  19. Local Memory – GPU access RMB Image from "Memory system on fusion APUs." AMD Fusion developer summit (2011)

  20. Memory Bandwidth Overview Image from "Memory system on fusion APUs." AMD Fusion developer summit (2011)

  21. References
     • Branover, Alexander, Denis Foley, and Maurice Steinman. "AMD Fusion APU: Llano." Micro, IEEE 32.2 (2012): 28-37.
     • https://www.kernel.org/pub/linux/kernel/people/geoff/cell/ps3-linux-docs/CellProgrammingTutorial/BasicsOfSIMDProgramming.html
     • https://wiki.engr.illinois.edu/download/attachments/217842128/7-Memory.pdf?version=2&modificationDate=1362506665000
     • Foley, Denis, Maurice Steinman, Alex Branover, Greg Smaus, Antonio Asaro, Swamy Punyamurtula, and Ljubisa Bajic. "AMD’s ‘Llano’ Fusion APU." In Hot Chips, vol. 23. 2011.
     • Boudier, Pierre, and Graham Sellers. "Memory system on fusion APUs." AMD Fusion developer summit (2011).
     • Daga, Mayank, Ashwin M. Aji, and Wu-chun Feng. "On the efficacy of a fused CPU+GPU processor (or APU) for parallel computing." Application Accelerators in High-Performance Computing (SAAHPC), 2011 Symposium on. IEEE, 2011.

  22. Supplementary Slides

  23. Separate CPU GPU TLBs
