Lecture 3: Computer Architectures
Basic Computer Architecture
• Von Neumann Architecture
[Figure: Von Neumann machine — Input unit and Output unit attached to Memory, which holds both instructions and data; the Processor contains the Control Unit (CU), ALU, and Registers (Reg.).]
Levels of Parallelism
• Bit level parallelism
  • Within arithmetic logic circuits
• Instruction level parallelism
  • Multiple instructions execute per clock cycle
• Memory system parallelism
  • Overlap of memory operations with computation
• Operating system parallelism
  • More than one processor
  • Multiple jobs run in parallel on SMP
  • Loop level
  • Procedure level
Levels of Parallelism
Bit Level Parallelism
• Within arithmetic logic circuits
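A minimal sketch of what "parallelism within the arithmetic logic circuits" means: a 64-bit ALU operates on all 64 bit positions of its operands at once, so a single machine instruction performs 64 independent one-bit operations. (The variable names here are illustrative, not from the slides.)

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Two sets of 64 independent boolean values, one per bit. */
    uint64_t a = 0x0F0F0F0F0F0F0F0FULL;
    uint64_t b = 0x00FF00FF00FF00FFULL;

    /* One 64-bit XOR performs 64 one-bit XORs in parallel:
       bit-level parallelism inside the ALU. */
    uint64_t c = a ^ b;

    printf("%016llx\n", (unsigned long long)c);
    return 0;
}
```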
Levels of Parallelism
Instruction Level Parallelism (ILP)
Multiple instructions execute per clock cycle
• Pipelining (instruction - data)
• Multiple Issue (VLIW)
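A sketch of how source code can expose ILP to a pipelined, multiple-issue core: using several independent accumulators breaks the dependence chain of a naive summation loop, so the hardware can have several additions in flight per cycle. The function name is illustrative.

```c
/* Four independent accumulators let a superscalar core
   issue several adds in the same cycle. */
double sum4(const double *x, long n) {
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    long i;
    for (i = 0; i + 3 < n; i += 4) {
        s0 += x[i];     /* these four adds have no mutual  */
        s1 += x[i + 1]; /* dependences, so they can execute */
        s2 += x[i + 2]; /* in parallel in the pipeline      */
        s3 += x[i + 3];
    }
    for (; i < n; i++)
        s0 += x[i];     /* scalar tail */
    return (s0 + s1) + (s2 + s3);
}
```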
Levels of Parallelism
Memory System Parallelism
• Overlap of memory operations with computation
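One way software can encourage this overlap is an explicit prefetch hint: the memory system fetches data for future iterations while the ALU computes on the current ones. This sketch assumes a GCC/Clang compiler, whose `__builtin_prefetch` extension exists for exactly this purpose; the function name and the lookahead distance of 16 are illustrative.

```c
/* Overlap memory access with computation via software prefetch. */
double dot(const double *a, const double *b, long n) {
    double s = 0.0;
    for (long i = 0; i < n; i++) {
        /* Hint: start loading data 16 iterations ahead
           (read access, low temporal locality). Prefetches
           past the end of the array are harmless hints.   */
        __builtin_prefetch(&a[i + 16], 0, 1);
        __builtin_prefetch(&b[i + 16], 0, 1);
        s += a[i] * b[i];   /* compute while later lines arrive */
    }
    return s;
}
```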
Levels of Parallelism
Operating System Parallelism
• More than one processor
• Multiple jobs run in parallel on SMP
• Loop level
• Procedure level
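A minimal illustration of the OS running multiple jobs in parallel, using the standard POSIX `fork`/`wait` calls: the parent and child are independent processes that the scheduler can place on different processors of an SMP at the same time.

```c
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>

int main(void) {
    pid_t pid = fork();      /* create a second job */
    if (pid == 0) {
        printf("child job running\n");
        _exit(0);
    }
    printf("parent job running\n");
    wait(NULL);              /* reap the child job */
    return 0;
}
```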
Flynn’s Taxonomy
• Single Instruction stream - Single Data stream (SISD)
• Single Instruction stream - Multiple Data stream (SIMD)
• Multiple Instruction stream - Single Data stream (MISD)
• Multiple Instruction stream - Multiple Data stream (MIMD)
Single Instruction stream - Single Data stream (SISD)
• Von Neumann Architecture
[Figure: SISD — one Processor (CU + ALU) fetches a single instruction stream and a single data stream from Memory.]
Single Instruction stream - Multiple Data stream (SIMD)
• Instructions of the program are broadcast to more than one processor
• Each processor executes the same instruction synchronously, but using different data
• Used for applications that operate upon arrays of data
[Figure: SIMD — one CU broadcasts the instruction stream to several PEs, each of which fetches its own data from Memory.]
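Modern CPUs expose SIMD through vector instructions. A sketch using the x86 SSE intrinsics from `<immintrin.h>`: a single `_mm_add_ps` executes one instruction (ADDPS) on four pairs of floats at once, i.e. the same instruction applied to multiple data. The function name is illustrative; this assumes an x86 target.

```c
#include <immintrin.h>

/* c[i] = a[i] + b[i], four elements per instruction. */
void add4(const float *a, const float *b, float *c, long n) {
    long i;
    for (i = 0; i + 3 < n; i += 4) {
        __m128 va = _mm_loadu_ps(&a[i]);   /* load 4 floats   */
        __m128 vb = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&c[i], _mm_add_ps(va, vb)); /* 4 adds   */
    }
    for (; i < n; i++)
        c[i] = a[i] + b[i];                /* scalar tail     */
}
```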
Multiple Instruction stream - Multiple Data stream (MIMD)
• Each processor has a separate program
• An instruction stream is generated for each program on each processor
• Each instruction operates upon different data
Multiple Instruction stream - Multiple Data stream (MIMD)
• Shared memory
• Distributed memory
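MIMD in miniature, as a sketch with POSIX threads: two threads run different programs (instruction streams) on different data, asynchronously. The task names and data are illustrative; compile with `-lpthread`.

```c
#include <pthread.h>
#include <stdio.h>

void *sum_task(void *arg) {            /* program 1 */
    int *v = arg, s = 0;
    for (int i = 0; i < 4; i++) s += v[i];
    printf("sum = %d\n", s);
    return NULL;
}

void *max_task(void *arg) {            /* program 2 */
    int *v = arg, m = v[0];
    for (int i = 1; i < 4; i++) if (v[i] > m) m = v[i];
    printf("max = %d\n", m);
    return NULL;
}

int main(void) {
    int d1[4] = {1, 2, 3, 4}, d2[4] = {7, 5, 9, 2};
    pthread_t t1, t2;
    pthread_create(&t1, NULL, sum_task, d1); /* stream 1, data 1 */
    pthread_create(&t2, NULL, max_task, d2); /* stream 2, data 2 */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```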
Shared vs Distributed Memory
• Distributed memory
  • Each processor has its own local memory
  • Message-passing is used to exchange data between processors
• Shared memory
  • Single address space
  • All processes have access to the pool of shared memory
[Figure: shared memory — processors (P) on a Bus to one Memory; distributed memory — nodes, each pairing a processor (P) with a local memory (M), connected by a Network.]
Distributed Memory
• Processors cannot directly access another processor’s memory
• Each node has a network interface (NI) for communication and synchronization
[Figure: each node connects a memory (M) and processor (P) through an NI to the Network.]
Distributed Memory
• Each processor executes different instructions asynchronously, using different data
[Figure: four independent nodes, each with its own memory (M), CU, and PE, exchanging data over the Network.]
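Message passing is the standard programming model here; a minimal sketch with MPI, whose send/receive calls are shown below. Each rank owns a private memory, and data moves between ranks only via explicit messages (compile with `mpicc`, launch with `mpirun -np 2`).

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        value = 42;   /* exists only in rank 0's local memory... */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value); /* ...until sent */
    }
    MPI_Finalize();
    return 0;
}
```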
Shared Memory
• Each processor executes different instructions asynchronously, using different data
[Figure: several CU + PE pairs all reading and writing one shared Memory.]
Shared Memory
• Uniform memory access (UMA)
  • Each processor has uniform access to memory (symmetric multiprocessor - SMP)
• Non-uniform memory access (NUMA)
  • Time for memory access depends on the location of data
  • Local access is faster than non-local access
  • Easier to scale than SMPs
[Figure: UMA — all processors (P) share one Bus and one Memory; NUMA — several Bus + Memory groups of processors joined by a Network.]
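Because all processors see one address space, shared-memory programs need no explicit messages. A minimal OpenMP sketch: the threads all access the same array `a` directly, and the `reduction` clause combines their partial sums (compile with `-fopenmp` on GCC/Clang).

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    double a[1000], s = 0.0;
    for (int i = 0; i < 1000; i++) a[i] = i;

    /* All threads read the shared array a[]; each keeps a
       private partial sum that the reduction combines into s. */
    #pragma omp parallel for reduction(+:s)
    for (int i = 0; i < 1000; i++)
        s += a[i];

    printf("sum = %f\n", s);
    return 0;
}
```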
Distributed Shared Memory
• Makes the main memory of a cluster of computers look like a single memory with a single address space
• Shared memory programming techniques can be used
Multicore Systems
• Many general purpose processors
• GPU (Graphics Processor Unit)
• GPGPU (General Purpose GPU)
• Hybrid Memory
• The trend is:
  • Board composed of multiple manycore chips sharing memory
  • Rack composed of multiple boards
  • A room full of these racks
Distributed Systems
• Clusters
  • Individual computers, tightly coupled by software in a local environment, that work together on single problems or on related problems
• Grid
  • Many individual systems, geographically distributed and tightly coupled by software, that work together on single problems or on related problems