
CPU Computing: Roadmap of the Future
Devandran Pandian, Channel Platform Manager, SMG, Intel India



Presentation Transcript


    2. The Technology Catalyst

    4. Parallel Programming Challenge: Ease of Use and Flexibility
    We must now look at the applications that support HPC and make sure they take advantage of the technology designed into Nehalem (NHM). Is the code parallelized? Is it optimized for NHM? For many years, applications could rely on rising clock frequency to improve performance. Now performance gains come from additional cores, so ISVs are taking their serial code and parallelizing it. Intel is trying to make this challenge as simple as possible.
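
To make the serial-to-parallel step concrete, here is a minimal sketch in Python. It is not from the presentation: the function names and the sum-of-squares workload are hypothetical, chosen only to show a serial loop split into independent partial results that are then combined. Note that in CPython a thread pool will not spread pure-Python arithmetic across cores; for CPU-bound work one would typically pass a process pool (for example `concurrent.futures.ProcessPoolExecutor`) as `executor_cls`.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares_serial(values):
    """Serial baseline: one loop running on one core."""
    total = 0
    for v in values:
        total += v * v
    return total


def split_into_chunks(values, n_chunks):
    """Split a list into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def sum_of_squares_parallel(values, workers=4, executor_cls=ThreadPoolExecutor):
    """Same result as the serial loop, computed as independent partial sums.

    executor_cls is a hypothetical knob for this sketch: any executor class
    from concurrent.futures that accepts max_workers can be passed in.
    """
    chunks = split_into_chunks(values, workers)
    with executor_cls(max_workers=workers) as pool:
        partial_sums = pool.map(sum_of_squares_serial, chunks)
        return sum(partial_sums)


data = list(range(1_000))
serial_result = sum_of_squares_serial(data)
parallel_result = sum_of_squares_parallel(data)
```

The design point is that each chunk is processed without touching shared state, so the only coordination needed is the final `sum` over the partial results.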

    5. The Technical Computing Architecture
    Speaker note: communicate how HPC and workstations work together. Technical computing combines workstations with high-performance computing (HPC) clusters. The technical computing industry is driven to deliver results fast. Workstations are required for creation, and HPC clusters are needed for simulation and analysis. After analyzing the data, you can visualize the results to enable faster innovation and discovery.

    6. Insatiable Demand for Performance, Density, and Efficiency
    The insatiable desire for performance will continue for the foreseeable future. The graph on the left takes the #1 systems on the Top500 list and projects future performance through 2029. From weather modeling to understanding how detergent flows through a dishwasher, the need for compute performance isn't going away any time soon. To deliver the added performance, we need to be aware of the power it requires. Intel strives to deliver more performance at similar or lower power than previous-generation platforms. To meet the performance needs of 2029, Intel will continue to explore ways to deliver greater performance within power envelopes similar to or smaller than today's.

    7. Data Center Convergence
    Speaker notes: Data centers continue to face new demands from their customers: scalability on demand, pressure to deliver "IT as a service", and the lowest possible TCO. This creates challenges in manageability, security, flexibility, and affordability, so IT managers are looking for simplification. Fortunately, a major technology transition is underway that will simplify IT managers' lives: the convergence of compute, networking, and storage.
    Compute: Intel's heritage is compute performance, but with the Nehalem microarchitecture we brought the intelligence to adapt to workloads with dynamic features like Turbo.
    Storage: Standard building blocks are transforming the storage market, which puts more emphasis on the responsiveness of the storage compute engine. We expect Intel architecture to drive 7 out of every 10 external storage systems shipped by the end of 2010. Industry leaders like EMC have chosen Xeon as their architecture of choice for storage solutions, as shown by their recent announcement of EMC Symmetrix solutions.
    Networking: The industry continues to drive bandwidth up and latency down via Ethernet, and the world is moving to converged fabrics. Many people don't know that Intel is the world's leading supplier of LAN connections; we shipped over half a billion connections over the past 10 years. Intel is committed to leading the transition to converged networks with our leadership LAN products and technologies.
    Close: Xeon is the cornerstone of next-generation intelligent data centers.

    9. Intel Processor Product Launch Roadmap
    Principle: Baseline - Sustaining $10-12M. Media to figure out how to use the rest with Burst. Content deal funded by BTL. Launch Bursts - Leverage WSJ Opinion Leader; JMP; BTL. 50% in SEM and Contextual for hot topics, imperatives, Ravi projects; 25% on client burst; 25% on server burst. Cloud and Security via Global Engagement. Topic: Cloud Mar-client aware, Security (McAfee).

    12. Sandy Bridge Server Platform Summary
    New microarchitecture on the 32nm process technology.
    [1] The lower platform power claim is based on a Xeon® 5600 CPU and a Sandy Bridge-EP CPU with the same TDP specification and comparable platform configurations. The platform power reduction is primarily attributed to the TDP reduction from a two-chip solution, based on the Intel 5520 chipset and ICH-10R, down to a one-chip south bridge solution (the Patsburg chip) on the Sandy Bridge platform.

    13. Xeon® E5 Platform Roadmap
    *For a full list of technologies, see WW45 NDA Data Center Group Roadmap on SMCR.Intel.com

    14. Xeon® 2S Platform Comparison

    15. Romley EP (Socket R) vs. Romley EN (Socket B2)

    16. Intel® Advanced Vector Extensions (Intel® AVX)
    Extension to the 128-bit SSE instruction set. Support for 256-bit wide vectors and SIMD register set. Targets floating-point operations. Benefits these applications: engineering, visual processing/recognition, data mining, physics, cryptography.
    Speaker notes: The slide's example uses the VADDPS (packed single-precision add) instruction; ymm1 and ymm2 are AVX registers. SSE stands for Streaming SIMD Extensions, and SIMD for Single Instruction, Multiple Data. Legacy SSE was 128 bits wide; the new AVX instructions have been widened to 256 bits, are targeted at floating-point-intensive operations, and can as much as double performance for such code. The gain comes from wider vectors, which benefits data handling and general-purpose applications such as image, audio, and video processing, scientific simulations, financial analytics, and 3D modeling and analysis.
    Non-destructive source operands: previously a register copy was needed before an operation that overwrote its source. The new form means less code and makes vectorization and optimization easier for the compiler.
    AVX needs operating system support: Linux 2.6.30 or later, Windows 7 SP1 or later, or Windows Server 2008 R2 SP1 or later. To use AVX, compile with the right switch, or rewrite the assembly or intrinsics. The 128-bit XMM registers are the lower halves of the 256-bit YMM registers.
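
The register widths and the non-destructive operand form can be illustrated with a small model. This is a sketch in plain Python, not real AVX code: `vaddps` here is a hypothetical function that only mimics the lane-by-lane behaviour of a three-operand packed single-precision add, and `to_float32` rounds values the way a 32-bit lane would store them.

```python
import struct

YMM_BITS = 256    # AVX register width
XMM_BITS = 128    # legacy SSE register width
FLOAT_BITS = 32   # one single-precision lane


def to_float32(x):
    """Round a Python float to single precision, as a 32-bit lane stores it."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lanes(register_bits):
    """Number of single-precision values that fit in one register."""
    return register_bits // FLOAT_BITS


def vaddps(src1, src2):
    """Model of a three-operand packed add.

    Returns a new destination list and leaves both sources unchanged,
    which is the non-destructive behaviour described in the notes.
    """
    if len(src1) != len(src2):
        raise ValueError("source registers must have the same lane count")
    return [to_float32(to_float32(a) + to_float32(b)) for a, b in zip(src1, src2)]


# Eight lanes per 256-bit register versus four per 128-bit register.
ymm1 = [float(i) for i in range(1, lanes(YMM_BITS) + 1)]
ymm2 = [0.5] * lanes(YMM_BITS)
ymm0 = vaddps(ymm1, ymm2)   # ymm1 and ymm2 keep their values
```

One modelled instruction handles eight single-precision additions where a 128-bit SSE register would handle four, which is where the "up to double" figure comes from.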

    17. Intel Technology is Changing HPC TCO, Performance, Reliability
    SSDs. Extreme performance: >100x IOPS performance gain vs. a 15K HDD. Power efficient: >5x lower power vs. a 15K HDD. Increased reliability: 2.0M hours MTBF vs. 1.2M hours MTBF for a 7.2K WD RE2. Reduced system cost: replace HDDs and memory with SSDs.
    10GbE. Extreme performance: iWARP provides low latency over 10GbE, with low overhead and high bandwidth. Increased reliability: over 25 years of delivering leading Ethernet products, broad OS support, designed for multi-core. Power efficient: low-power design, <3.5W. Lower TCO: consolidated fabric through industry-standardized technology.
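
To put the MTBF figures in more familiar terms, the sketch below converts them to an approximate annualized failure rate. This is an illustration, not something the slide states: it assumes a constant failure rate (the exponential model) and continuous 24x7 operation, and the function name is hypothetical.

```python
import math

HOURS_PER_YEAR = 24 * 365  # continuous operation, an assumption of this sketch


def annualized_failure_rate(mtbf_hours):
    """Probability of at least one failure in a year, assuming a constant
    failure rate of 1 / MTBF (exponential model)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


ssd_afr = annualized_failure_rate(2.0e6)  # 2.0M hours MTBF quoted for the SSD
hdd_afr = annualized_failure_rate(1.2e6)  # 1.2M hours MTBF quoted for the HDD
```

Under these assumptions the rates come out at roughly 0.44% per year for the SSD and roughly 0.73% per year for the HDD.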

    18. Scaling Performance Forward: One Development Environment, Multi- to Many-core
    Debug and tune become equally important to carry forward to many-core. This is the heterogeneous tool set now, as many-core applications scale to terascale on clients and these terascale nodes make up clusters of petascale machines. New versions of the software tools, released in Nov. 08, bring better performance, multi-core advancements, and support for Intel® Core™ i7 processors.
    The first step in the cycle is to gain insight into your code by analyzing it with tools such as VTune Performance Analyzer and/or Thread Checker. Next, you parallelize your code with Intel tools such as Intel® Threading Building Blocks, the compilers, and the performance libraries. After you parallelize your code, you review the results for correctness and confidence. If you do not achieve the results you expect, you can begin the cycle again with insight. Once you have achieved the desired results, you perform a final optimization with Intel® VTune Performance Analyzer and Thread Profiler to ensure peak performance.
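
The "gain insight" step of this cycle can be sketched with a generic profiler. The example below uses Python's standard `cProfile` module as a stand-in for the Intel tools named in the notes; it is not VTune, and the workload functions are hypothetical, chosen so that one of them clearly dominates the run time.

```python
import cProfile
import io
import pstats


def cheap_setup(n):
    """Inexpensive preparation step."""
    return list(range(n))


def slow_kernel(n):
    """Deliberately heavy loop: the hotspot a profiler should point at."""
    total = 0.0
    for i in range(n):
        total += (i % 7) * 0.5
    return total


def workload():
    cheap_setup(1_000)
    return slow_kernel(200_000)


def profile_report(func, top=5):
    """Run func under the profiler and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


result, report = profile_report(workload)
```

Printing `report` lists the functions by cumulative time, so the loop worth parallelizing or tuning shows up near the top before any code is changed.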

    19. Solving Your HPC Challenges
    Intelligent performance helps deliver a lower TCO as well as ~3x the performance of previous-generation processors. Intel software tools enable users to easily optimize their software to maximize performance on current and future generations of IA hardware. Intel Cluster Ready makes deploying a cluster easy.
