ADVANCED COMPUTER ARCHITECTURE CSE-401 E Faculty: Rajendra Saxena

L T P: 3 1 -    Class Work: 50    Examination: 100    Total: 150

Syllabus
• Unit-1: Architecture and machines: some definitions and terms, interpretation and microprogramming. The instruction set, basic data types, instructions, addressing and memory. Virtual to real mapping. Basic instruction timing.
• Unit-2: Time, area and instruction sets: time, cost-area, technology state of the art, the economics of a processor project: a study, instruction sets, processor evaluation matrix.
• Unit-3: Cache memory notion: basic notion, cache organization, cache data, adjusting the data for cache organization, write policies, strategies for line replacement at miss time, cache environment, other types of cache. Split I- and D-caches, on-chip caches, two-level caches, write assembly cache, cache references per instruction, technology dependent cache considerations, virtual to real translation, overlapping the Tcycle in V-R translation, studies. Design summary.
• Unit-4: Memory system design: the physical memory, models of simple processor memory interaction, processor memory modeling using queueing theory, open, closed and mixed-queue models, waiting time, performance, and buffer size, review and selection of queueing models, processors with cache.
• Unit-5: Concurrent processors: vector processors, vector memory, multiple issue machines, comparing vector and multiple issue processors.
• Shared memory multiprocessors: basic issues, partitioning, synchronization and coherency, types of shared memory multiprocessors, memory coherence in shared memory multiprocessors.
Text books:
• Advanced Computer Architecture by Hwang & Briggs, 1993, TMH
• Computer Architecture by Michael J. Flynn

Computer Architecture & Organization
Computer Architecture: Those attributes of a system that are visible to a machine language programmer and have a direct impact on the logical execution of a program. These attributes include the instruction set, word size, the number of bits used to represent various data types, techniques for addressing memory, etc.
Computer Organization: The operational units and their interconnections that realize the architecture. These include control signals, memory technology, interfaces between the computer and peripherals, etc.
Example: Whether a computer has a Multiply instruction is an architectural design issue. Whether that instruction is implemented by a separate multiply unit or by repeated use of the add function is an organizational issue.
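The Multiply example can be sketched in code. The two functions below are hypothetical stand-ins (their names are not from the slides) for two organizations behind one architectural Multiply instruction: one repeats the add function, the other uses shift-and-add steps of the kind a dedicated multiply unit might perform. A programmer sees the same result either way; only cost and speed differ.

```python
def multiply_repeated_add(a: int, b: int) -> int:
    """Organization 1 (assumed): add `a` to an accumulator `b` times."""
    if b < 0:                # fold the sign into `a` so the loop count is non-negative
        a, b = -a, -b
    acc = 0
    for _ in range(b):
        acc += a
    return acc


def multiply_shift_add(a: int, b: int) -> int:
    """Organization 2 (assumed): one conditional add and one shift per bit of `b`."""
    negative = (a < 0) != (b < 0)
    a, b = abs(a), abs(b)
    acc = 0
    while b:
        if b & 1:            # current multiplier bit is 1: add the shifted multiplicand
            acc += a
        a <<= 1
        b >>= 1
    return -acc if negative else acc
```

Both functions return identical values for any pair of integers, which is the sense in which the architecture is unchanged while the organization differs.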

Computer Architecture & Organization (Contd..)
• Major computer manufacturers offer a family of computer models based on the same architecture but with different organizations.
• Various Intel CPUs are based on the same architecture but have different organizations, offering different levels of performance and price.
• The IBM System/370 architecture, introduced in 1970, has survived to this day as the architecture of the IBM mainframe product line.
• Various implementations of RISC architecture are available in the market, such as Sun SPARC and PowerPC.

Why Study Computer Organization & Architecture
As a professional in the field of computing, one should:
• Not regard the computer as a black box that executes programs by magic.
• Acquire some understanding and appreciation of a computer system's functional components, their characteristics, their performance and their interactions.
• Understand computer architecture in order to structure a program so that it runs more efficiently on a real machine.
• Understand how to select a computer system for personal or organizational use by properly weighing the tradeoffs among components such as CPU clock speed, cache size and memory size.

Course Objective
• The objective of this course is to provide a thorough discussion of the fundamentals of computer organization and architecture. After this course you will be able to appreciate the following:
• The nature and characteristics of modern day computer systems.
• The tremendous variety that exists, from single chip microprocessors to supercomputers. The various systems differ not only in cost but also in size, performance and applications.
• The impact of the rapid pace of change covering all aspects of computer technology, from the underlying integrated circuit technology to the increasing use of parallel organization concepts in combining those components.
• Certain fundamental concepts that apply to all types of computers.
• All the basic performance characteristics of computer systems, such as processor speed, memory speed, memory capacity, and interconnection data rate, are increasing rapidly, but at different rates. Designing a balanced system that maximizes the performance and utilization of all elements is therefore a challenge.

Computer Organization & Architecture
A computer is a complex system; modern day computers contain millions of elementary electronic components. The problem is how to describe them all clearly. Recognizing the hierarchical nature of most complex systems, including computers, we employ the top-down approach: we break a typical computer system into interrelated subsystems, each of which is in turn hierarchical in structure, until we reach some lowest level of elementary subsystem. We begin with the major components of a computer, describing their function and structure, and proceed to successively lower layers of the hierarchy.

Basic Functions of a Computer
The basic functions that a computer can perform are:
• Data Processing
• Data Movement
• Data Storage
• Control
[Figure: functional view of a computer showing Data Movement, Control, Data Storage and Data Processing]

Basic Components of a Computer
The basic components of a computer are:
• CPU – Controls the operation of the computer and performs its data processing functions.
• Main Memory – Storage of data.
• I/O Subsystem – Data movement between the computer and its external environment.
• System Interconnection – Some mechanism that provides for communication between all the above units.

Basic Components of a Computer
[Figure: top-level structure. Computer: CPU, Main Memory, I/O Subsystem, System Interconnect]

Basic Components of a Computer
[Figure: CPU structure. CPU: ALU, Register Set, Control Unit, Internal CPU Interconnections]

Basic Components of a Computer
[Figure: Control Unit structure. Control Unit: Sequencing Logic, Control Unit Registers & Decoders, Control Memory]

Basic Components of a Computer
The basic functional units of a computer consist of:
Control Unit: It contains the registers and decoding hardware required to interpret the current instruction (held in the Instruction Register). It controls the sequence of actions in the data paths to provide correct instruction execution.
Data Paths: These consist of the ALU (Arithmetic Logic Unit), any other specialized execution unit (floating point etc.), address generation hardware, data and address registers, and the interconnect between all these units.
Both these units are generally combined in one unit called the CPU, and in the case of microprocessors it is fabricated on a single chip.
Memory: The memory unit is another crucial piece of hardware. It includes a Memory Address Register (MAR), a Storage Register (SR) and memory cells.

Some Definitions and Terms
State: A particular configuration of storage units such as registers or memory. A state transition is a change in that configuration.
Cycle: The time between state transitions. If storage registers are being reconfigured it is called a machine cycle; if memory is being reconfigured it is called a memory cycle.
Command: A term used to describe the various instructions; a command is responsible for effecting state changes.
Process: A sequence of commands and an initial state. The sequence of commands is applied to the initial state and generates a final state.
Machine: The implementation that interprets the commands and makes the state transitions happen. This implementation can in turn be implemented using another machine having its own storage and instruction set. In such circumstances the outermost machine is called the Image (or Micro) Machine and the other is called the host machine. The set of all image commands and storage is defined as the architecture of the machine.

Some Definitions and Terms (Contd.)
Storage: This is the storage referred to by the instruction set of the machine; it includes memory and the register set. There can be some hidden registers that cannot be addressed by the instruction set. Such registers are not considered part of storage but are part of the implementation.

The Machine: Interpretation & Microprogramming
[Figure: instruction format with fields OP CODE, A, B, C]
The instruction decoder (a part of the implementation mechanism) controls the data paths, which consist of combinational logic and connect the output of one register to the input of other registers and vice versa. Each OP code defines which of the various data paths will be used in its execution. The collection of all OP codes (the instruction set) defines all the data paths required by a specific architecture. A particular data path is activated through a control point, which the instruction decoder activates and defines for each cycle of operation. The interpretation process begins when the instruction, stored in memory, is fetched (transferred to the Instruction Register) and its OP code field is decoded by the decoder.

The Machine: Interpretation & Microprogramming (Contd..)
The decoder activates storage and registers for a series of state transitions that correspond to the action of the OP code. The storage and registers used by instructions can be both explicit and implicit.
Explicit registers include:
• General Purpose Registers (GPR)
• Accumulators (ACC)
• Address Registers (index or base registers)
Implicit registers include:
• PC (Program or Instruction Counter) – Contains the address of the next instruction in sequence. Most instruction formats imply this to be the current location plus the length of the current instruction.
• Instruction Register – Holds the instruction being interpreted or executed.
• Memory Address Register (MAR) – Its contents are used as the address to locate information in memory.
• Storage Register – Also referred to as the memory buffer register; used to read or write data to memory.
• Special Use Registers – Usage depends on the instruction.

The Machine: Interpretation & Microprogramming
The instruction decoder, which is responsible for activating and defining every control point in the processor for every cycle of operation, can be implemented either directly or as microprogrammed storage.
Direct decoders are designed using combinational logic (usually PLAs) to represent the various desired control point actions. The logical inputs come from the OP code (the type of instruction to be performed), the sequence counter (a small counter that keeps track of which cycle within an instruction's execution is active), and some test information from the data registers (e.g. sign value), so that the next control action is set correctly.

The Machine: Interpretation & Microprogramming (Contd..)
[Figure: direct decoder, showing OP, Sequence Counter, Decoder, Control Points, Data Register, Destination Register A and Destination Register B]

The Machine: Interpretation & Microprogramming
Microprogrammed decoders are designed using ROM. The OP code provides an initial address to an entry that specifies the control point values as well as the address of the next microinstruction. In microprogrammed machines the microinstruction defines the control point values required throughout the system and also controls the sequencing of the interpretation of an operation. In most machines the control points are encoded in some fashion in the microinstruction representation, and most microinstruction formats include the address of the next microinstruction to perform the desired sequencing.
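A minimal sketch of this sequencing follows, assuming a tiny made-up control store. The opcode names, addresses and control point names are all hypothetical; they illustrate only how the OP code selects an entry address and how each microinstruction supplies both control point values and the next address.

```python
from typing import NamedTuple, Optional


class MicroInstruction(NamedTuple):
    control_points: tuple          # control points asserted during this cycle
    next_address: Optional[int]    # address of the next microinstruction; None ends the sequence


# Hypothetical microprogram storage (the ROM contents).
CONTROL_STORE = {
    10: MicroInstruction(("regA_to_alu", "regB_to_alu"), 11),
    11: MicroInstruction(("alu_add", "alu_to_regA"), None),
    20: MicroInstruction(("addr_to_MAR",), 21),
    21: MicroInstruction(("read_memory",), 22),
    22: MicroInstruction(("SR_to_reg",), None),
}

# Hypothetical mapping from OP code to the initial microprogram address.
OPCODE_ENTRY = {"ADD": 10, "LD": 20}


def interpret(opcode: str) -> list:
    """Return the control points activated in each cycle while interpreting `opcode`."""
    address = OPCODE_ENTRY[opcode]
    cycles = []
    while address is not None:
        micro = CONTROL_STORE[address]     # fetch into the micro instruction register
        cycles.append(micro.control_points)
        address = micro.next_address       # sequencing comes from the microinstruction itself
    return cycles
```

Here `interpret("LD")` walks three microinstructions, one per cycle. Changing the behaviour of an instruction means editing `CONTROL_STORE`, not rewiring logic, which is the practical appeal of the microprogrammed approach.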

The Machine: Interpretation & Microprogramming (Contd..)
[Figure: microprogrammed decoder, showing OP, Micro MAR, Microprogram Storage, Micro Instruction Register, Next Micro Instruction Address, C.P.S. and Additional Decode]

Direct Decoders Vs Micro programmed Decoders

The Instruction Set
Instruction sets define the many different kinds of data and their manipulation by different processors. Since instruction set details vary widely from processor to processor, three generic approaches are used to describe the different architecture types. Consistent with most modern machines, each of these generic approaches is based on a register set that holds operands and addresses. These register sets vary from 8 to 32 words, with each word consisting of 32 bits. Additional sets of floating point registers and associated floating point execution hardware are assumed to be available whenever floating point arithmetic operations are available in the architecture. (These can be provided as a separate chip closely coupled to the microprocessor, or integrated on the main processor chip as a floating point unit.)

The Instruction Set (Contd..)
The three major instruction set types are:
The L/S Architecture: The L/S or Load/Store architecture specifies that all operand values must be loaded from memory into registers before an execution can take place. An ALU ADD instruction must have both operands and the result specified as registers (three-address format); an operand in memory is not allowed.
[Figure: instruction format OP | Reg | Reg | Reg]
Mostly used in RISC machines. RISC architecture tries to reduce the complexity of the instruction set itself and to regularize the instruction format so as to simplify the decoding of instructions.

The Instruction Set (Contd..)
The R/M Architecture: The R/M or Register/Memory architecture includes instructions that can operate both on registers and on one operand in memory. In an ALU ADD instruction, one source operand lies in memory and the other source operand lies in a register, which also serves as the destination (two-address format).
[Figure: instruction format OP | Reg | Reg/Mem]
Most general purpose modern mainframe computers, such as those from IBM, Hitachi and Fujitsu, as well as several microprocessors (Intel x86 series), follow the R/M style.

The Instruction Set (Contd..)
The R+M Architecture: The R+M or Register Plus Memory architecture includes instructions that can operate on operands both in registers and in memory. In an ALU ADD instruction all operands may lie in memory, in registers, or in any combination thereof. Two formats are used:
• Two-address format: one source operand, in a register or in memory, is also the destination.
• Three-address format: three operands are independently specified and each may be a register or memory.
[Figure: instruction format OP | Reg/Mem | Reg/Mem | Reg/Mem]
Digital Equipment's (DEC) VAX series of machines and the Motorola M680X0 series of microprocessors use this architecture.

Basic Data Types
The most important aspect of an architecture is the format of the data values that are operated on by the instruction set. A data type defines the format and use of data objects and implies the operations that are valid for that type. The data types available on most machines can be broken into the following classes:
• Integers
• Floating Point (Real) Numbers
• Decimal Digits
• Characters
• Bit / Logical

Integers
[Figure: 16-bit and 32-bit integer formats, each with a leading sign bit S]
Integers are the fundamental data types used in computers. Different formats may be used to represent signed numbers, all of which involve treating the most significant (leftmost) bit as a sign bit. The number is treated as negative if this bit is '1'.
Sign-Magnitude Representation: This is the simplest form of representation, where the rightmost n-1 bits of an n-bit number represent the magnitude in binary format and the leftmost bit decides whether the number is positive or negative.
+18 = 00010010
-18 = 10010010

Integers (Contd..)
Sign-magnitude representation has several drawbacks, such as cumbersome arithmetic and two representations of zero. Due to these drawbacks it is rarely used to represent integers in computers.
The most popular method of integer representation is called two's complement representation. Like sign-magnitude representation, it uses the most significant bit as a sign bit, making it easy to see whether a number is positive or negative. The remaining bits of a negative number, however, hold the two's complement of the number's magnitude.
+18 = 00010010
-18 = 11101110
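The two encodings of +18 and -18 can be reproduced with a short sketch. The function names and the default width of 8 bits are choices made for this illustration only.

```python
def sign_magnitude(value: int, bits: int = 8) -> str:
    """Encode `value` with a leading sign bit followed by the magnitude in binary."""
    magnitude = abs(value)
    if magnitude >= 1 << (bits - 1):
        raise ValueError("magnitude does not fit in the available bits")
    sign = 1 if value < 0 else 0
    return format((sign << (bits - 1)) | magnitude, f"0{bits}b")


def twos_complement(value: int, bits: int = 8) -> str:
    """Encode `value` in two's complement by reducing it modulo 2**bits."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value out of range for the available bits")
    return format(value & ((1 << bits) - 1), f"0{bits}b")
```

For positive numbers the two functions agree; they differ only for negative values, where sign-magnitude keeps the magnitude bits unchanged and two's complement does not.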

Integers (Contd..)
Two's complement representation is best understood by defining it in terms of a weighted sum of bits. In signed integer representation the weight of the most significant bit is $-2^{n-1}$. So an n-bit integer A with bits $a_{n-1} \dots a_1 a_0$ can be represented as
$$A = -2^{n-1} a_{n-1} + \sum_{i=0}^{n-2} 2^{i} a_i$$
For a positive integer $a_{n-1} = 0$, so
$$A = \sum_{i=0}^{n-2} 2^{i} a_i$$
The range of positive numbers is from 0 to $2^{n-1} - 1$.

Integers (Contd..)
The range of negative numbers is from -1 to $-2^{n-1}$.
Let us consider an example: representing -18 as an 8-bit integer in two's complement representation. Since it is a negative number the sign bit is '1', so the first term of the weighted sum, $-2^{n-1} a_{n-1}$, is $-2^{7} = -128$. The weighted sum of the remaining seven bits must therefore be 110, because $-128 + 110 = -18$. 110 in seven-bit binary is 110 1110, so the full word is 1110 1110, which is the two's complement of 18 (0001 0010).

Integers (Contd..)
The advantage of two's complement representation is that arithmetic can be handled in a straightforward manner. To get the two's complement representation of a negative number, take the binary representation of the number's absolute value, invert all the bits and add 1. To subtract integer B from A, we simply take the two's complement of B and add it to A. Addition of any two numbers (whether positive or negative) is also straightforward. In some machines multiplication is implemented as repeated addition and division as repeated subtraction.
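A short sketch of subtraction by complement-and-add, assuming 8-bit words held as Python integers in the range 0 to 255. The helper names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1              # 0xFF: keeps results within one 8-bit word


def negate(word: int) -> int:
    """Two's complement of a word: invert all the bits, then add 1."""
    return (~word + 1) & MASK


def subtract(a_word: int, b_word: int) -> int:
    """Compute A - B by adding the two's complement of B to A."""
    return (a_word + negate(b_word)) & MASK


def to_signed(word: int) -> int:
    """Decode a word as a weighted sum; the sign bit has weight -2**(BITS-1)."""
    sign_bit = (word >> (BITS - 1)) & 1
    return -(sign_bit << (BITS - 1)) + (word & (MASK >> 1))
```

For instance, `subtract(5, 18)` yields the word 1111 0011, which `to_signed` decodes as -128 + 115 = -13. The same adder therefore serves both addition and subtraction.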

Reals - Fixed Point Representation
In fixed point representation the radix point is fixed (in the case of integers it is assumed to be to the right of the rightmost digit). The same representation can be used for binary fractions by scaling the numbers so that the binary point is implicitly positioned at some other location.
EXAMPLE: The binary fraction 0101.01 represents $2^{2} + 2^{0} + 2^{-2} = 5.25$.
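The scaling idea can be sketched as follows: the bits are stored as an ordinary integer, and the position of the binary point is kept only as a convention. Both function names are made up for this illustration.

```python
def fixed_point_value(text: str) -> float:
    """Evaluate a binary string such as '0101.01' as a weighted sum of its bits."""
    whole, _, fraction = text.partition(".")
    value = float(int(whole, 2)) if whole else 0.0
    for position, bit in enumerate(fraction, start=1):
        value += int(bit) * 2.0 ** -position   # bits after the point weigh 2^-1, 2^-2, ...
    return value


def scaled_integer(text: str) -> tuple:
    """Return (stored integer, number of fraction bits) for the same binary string."""
    whole, _, fraction = text.partition(".")
    return int(whole + fraction, 2), len(fraction)
```

For 0101.01 the stored integer is 010101 = 21 with two fraction bits, and 21 / 2^2 = 5.25, matching the weighted sum.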

Reals - Floating Point Representation
Fixed point representation has limitations: it cannot be used to represent very large numbers or very small fractions. For such numbers the floating point format is used. Any number can be represented in the form $\pm S \times B^{\pm E}$. There are various binary representations of floating point; the most popular one has the following format for a 32-bit word.
S (1 bit) | Biased Exponent (8 bits) | Significand (23 bits)
The number is stored in a binary word with the following three fields.
1. Sign: A one-bit field indicating a positive or negative number.
2. Biased Exponent: An eight-bit field storing the exponent plus a bias. Since the exponent can be either negative or positive, a biased exponent is stored instead of using two's complement. A bias, typically equal to $2^{k-1} - 1$ where k is the number of bits in the exponent field, is added to the real exponent value. In the above case this value is 127, so the range of the exponent is from -127 to +128.
3. Significand (Mantissa): A 23-bit field storing the significand. The base is assumed to be 2 and is not stored.
To simplify operations on floating point numbers, it is typically required that they be normalized. A normalized number is one where the exponent is adjusted so that the most significant bit of the significand is '1'. Since this bit is always '1' it need not be stored, and 23 bits are used to store a 24-bit significand.

Reals - Floating Point Representation
Let us look at one typical example of floating point representation: the number $1.1010001_2 \times 2^{20}$.
Sign Bit = 0 (positive number)
Biased Exponent = 127 + 20 = 147 = 10010011
Normalized Significand = 1.1010 0010 0000 0000 0000 000, of which the 23 bits after the leading '1' are stored
Stored word: 0 10010011 10100010000000000000000
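The example word can be checked with a short sketch. `decode_fields` is a hypothetical helper that applies the field definitions by hand and deliberately ignores the reserved exponent values (all zeros and all ones) that real formats treat specially; `decode_with_struct` cross-checks the result with Python's standard `struct` module, which interprets the same 32 bits as a single precision number.

```python
import struct


def decode_fields(bits: str) -> tuple:
    """Split a 32-bit string into (sign, true exponent, value) for a normalized number."""
    bits = bits.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected exactly 32 bits")
    sign = int(bits[0], 2)
    exponent = int(bits[1:9], 2) - 127       # remove the bias of 127
    fraction = int(bits[9:], 2) / 2 ** 23    # the 23 stored significand bits
    value = (-1) ** sign * (1 + fraction) * 2.0 ** exponent   # restore the hidden leading 1
    return sign, exponent, value


def decode_with_struct(bits: str) -> float:
    """Let the standard library interpret the same 32 bits as a big-endian float."""
    raw = int(bits.replace(" ", ""), 2).to_bytes(4, "big")
    return struct.unpack(">f", raw)[0]
```

Applied to `0 10010011 10100010000000000000000`, both routes give 1.6328125 x 2^20 = 1712128.0.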

Decimals
Binary Coded Decimal Representation
[Figure: decimal operand layout, showing Starting Address, Length in Bytes, digits from MSD to LSD, and SIGN]
Decimal numbers are stored in two formats.
1. Packed Format: Two binary coded decimal digits per byte.
Example: The number -123 is stored as 0001 0010 0011 1011, i.e. hex 12 3B.

Decimals (Contd..)
2. Unpacked Format: One digit per byte in ASCII format.
Example: The number -123 is stored as 0011 0001 0011 0010 0011 0011 0010 1101, i.e. hex 31 32 33 2D.
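Both layouts of -123 can be reproduced with a small sketch. The negative sign nibble 1011 (hex B) and the trailing ASCII sign follow the examples above; the positive sign nibble 1100 (hex C) is an assumption made for this illustration, since manufacturers differ.

```python
def pack_bcd(number: int, negative_sign: int = 0xB, positive_sign: int = 0xC) -> bytes:
    """Packed format: two 4-bit digits per byte, with the sign in the last nibble."""
    nibbles = [int(digit) for digit in str(abs(number))]
    nibbles.append(negative_sign if number < 0 else positive_sign)
    if len(nibbles) % 2:                 # pad with a leading zero to fill whole bytes
        nibbles.insert(0, 0)
    return bytes((high << 4) | low for high, low in zip(nibbles[0::2], nibbles[1::2]))


def unpack_ascii(number: int) -> bytes:
    """Unpacked format: one ASCII digit per byte, followed by an ASCII sign byte."""
    sign = b"-" if number < 0 else b"+"
    return str(abs(number)).encode("ascii") + sign
```

`pack_bcd(-123)` occupies two bytes where `unpack_ascii(-123)` needs four, which is the space saving the packed format offers.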

Decimals (Contd..)
Advantages:
• Used in calculations performed by business applications.
• No loss of precision through data conversion.
Disadvantages:
• Not natural for most machines to perform calculations on.
• Specific instructions are needed to deal with these numbers.
• There is no representation standard; manufacturers choose different implementations for storing and processing decimal data.
Many early microprocessors used this format, and high end business machines such as IBM mainframes often implement features to process these numbers efficiently.

Characters
Character strings may be used to represent decimal or text information. A character string is simply a sequence of a variable number of bytes. The codes available in a byte are assigned by a standard such as ASCII to represent the various upper and lower case letters, numerals and symbols. Compatibility between machines is an issue, as some use 6-bit codes, some 7-bit ASCII and some 8-bit codes, while IBM uses EBCDIC. Byte ordering also varies. Some machines (Sun SPARC) store the most significant byte first (called big-endian), while others (DEC, Intel) store the least significant byte first (called little-endian).
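The difference in byte ordering can be seen by laying out one 32-bit value both ways. This sketch uses Python's built-in `int.to_bytes`; the value 0x12345678 and the function name are arbitrary choices for illustration.

```python
def byte_layouts(value: int, size: int = 4) -> dict:
    """Return the bytes of `value` in memory order for both byte-ordering conventions."""
    return {
        "big": value.to_bytes(size, "big").hex(),        # most significant byte first
        "little": value.to_bytes(size, "little").hex(),  # least significant byte first
    }


layouts = byte_layouts(0x12345678)
```

The bits inside each byte are identical in both layouts; only the order of the bytes in memory changes, which is why data exchanged between such machines needs an agreed byte order.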

Bits
Strings of bits (generally limited to the word size) are used to represent vectors of single-bit elements, which may be tested and changed, mostly using logical instructions. The main application of bit strings is the communication with and control of input/output devices.

Instructions
• An instruction set that defines all actions for all data types is said to have the orthogonal property.
• Most machines have instruction sets that perform the following common core of operations:
  • Integer arithmetic: add, subtract, multiply, divide
  • Floating point arithmetic: add, subtract, multiply, divide, square root
  • Logical: and, or, nor, xor, shift, rotate
  • Bit manipulations: extract, insert, test, set, clear
  • Control transfer: jump, branch, trap
  • Comparison tests: less than or equal to, odd parity, carry

Instructions (Contd..)
Some machines use complex instructions to perform certain specific operations, and some use combination instructions such as test and branch. Restricting the core processor to commonly used operations results in significant performance improvement in the majority of applications. There is considerable diversity among machines even with regard to simple operations. The IBM S/370 uses about 10 ADD instructions, while the VAX machines have more than 25 different forms of ADD instruction. Instruction mnemonics and assembly language syntax also vary widely among machines, as does the convention used to define the destination in arithmetic operations.

Instructions (Contd..)
As per general machine conventions, an instruction mnemonic consists of an operation and a data type specification joined by a "." (if there is no explicit data type specification, the data type is assumed to be the standard machine word). A similar format is used for branch conditions: in place of the data type specification, a condition code is specified.
Data Type Specifications (OP.Modifiers)
B    Byte
UB   Unsigned byte
H    Half word
UH   Unsigned half word
W    Word
UW   Unsigned word
F    Floating point
D    Double precision floating point
C    Character or decimal
P    Decimal in a packed format

Instructions (Contd..)
Branch Conditions
T    True               LE   Less than or equal
F    False              LT   Less than
V    Overflow           EQ   Equal
C    Carry or borrow    NE   Not equal
PE   Even parity        GE   Greater than or equal
PO   Odd parity         GT   Greater than
Destination Convention: ALU Instructions
Case 1: OP.X Destination, Source 1, Source 2 (three-operand format)
Case 2: OP.X Destination, Source (two operands)
Case 3: OP.X Destination / Source 1, Source 2 (result in Source 1 location)

Instructions (Contd..)
Some Common Instructions:
ST A, R1        Store the contents of register R1 in memory location A
ST.F A, R1      Store the contents of floating point register R1 in location A
MOVE A, B       Replace the contents at location A with the contents at location B
MOVE.C A, B     Move the character string starting at B to location A
ZMOVE.P A, B    The string length at A is greater; all leading digits are to be zeroed

Instructions (Contd..)
Branch or Jump Instructions: These instructions determine program control flow. There are mainly two types: BR (unconditional branch) and BC (conditional branch). BC tests the state of the condition code or CC (four bits that reside in the PSW and are set by ALU instructions).
Branch Conventions
BR Target                     Unconditional branch to the instruction contained in Target
BC Target                     A conditional branch without a specific condition code
BC.CC Target                  Same as BC
BC.NE Target                  Conditional branch on satisfying the condition specified
BCT.NE R1, Target             The count in R1 is decremented and control goes to Target if the result is not equal to zero; used for loop control
BAL & BALR Target / Register  Unconditional branch, saving the current IC in the implied register

Instructions (Contd..)
Register Sets and Addressing Modes
• The simplest form of data addressing is accessing registers.
• Some processors use numbered registers while others use named registers.
• Some instructions use implied registers.
• Some processors define register 0 (R0) to have the value '0' stored in it.
• Addressing Mode Summary

Instructions (Contd..)
Instruction Code Example: The following code example implements a vector summation (for an R/M architecture).
Entry:  LD.W   R1, xCounter       : Get x size from memory and load in R1
        LD.W   R2, xBaseAddress   : Get the base value and load in R2
        LD.W   R3, #0             : Initialize sum register to zero
Loop:   ADD.W  R3, [R2]           : Add the next element
        ADD.W  R2, #WordSize      : Contents of R2 point to next element
        SUB.W  R1, #1             : Decrement length counter
        BC.NE  Loop               : If R1 is not zero go to 'Loop'
        ST.W   xSumAddress, R3    : Write out the sum
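The same loop can be traced in a high-level language. This is only a sketch: memory is modeled as a Python dictionary from byte address to word, and a word size of 4 bytes is assumed. The variables r1, r2 and r3 mirror the registers of the listing.

```python
WORD_SIZE = 4   # assumed word size in bytes


def vector_sum(memory: dict, x_counter: int, x_base_address: int) -> int:
    """Mirror the assembly loop, one statement per instruction."""
    r1 = x_counter           # LD.W  R1, xCounter
    r2 = x_base_address      # LD.W  R2, xBaseAddress
    r3 = 0                   # LD.W  R3, #0
    while True:
        r3 += memory[r2]     # ADD.W R3, [R2]   (register plus memory operand)
        r2 += WORD_SIZE      # ADD.W R2, #WordSize
        r1 -= 1              # SUB.W R1, #1
        if r1 == 0:          # BC.NE Loop falls through once R1 reaches zero
            break
    return r3                # ST.W  xSumAddress, R3
```

Like the assembly version, the loop body runs before the counter is tested, so the sketch assumes the vector has at least one element.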
