
System Software (CS 1203) Assemblers


anoush

Presentation Transcript


  1. System Software (CS 1203) Assemblers

  2. Outline • Basic Assembler Functions (Sec. 2.1) • Machine-dependent Assembler Features (Sec. 2.2) • Machine-independent Assembler Features (Sec. 2.3) • Assembler Design Options (Sec. 2.4)

  3. Basic Assembler Functions Section 2.1

  4. Introduction to Assemblers • Fundamental functions • Assign machine addresses to symbolic labels used by the programmers • Translate mnemonic operation codes to their machine language equivalents • Machine dependency • Depend heavily on the source language it translates and the machine language it produces • Ex. different machine instruction formats and codes

  5. Role of Assemblers Source Program Assembler Object Code Linker Executable Code Loader

  6. SIC Example Program (Fig. 2.1) • Purpose • Read records from input device (code F1) • Copy them to output device (code 05) • Repeat the above steps until encountering EOF • Write EOF to the output device • RSUB to the operating system

  7. SIC Example Program (Fig. 2.1) (contd.) • Consists of a main routine that reads records from the input device (F1) and copies them to the output device (05) • The main routine calls subroutine RDREC to read a record into a buffer and subroutine WRREC to write the record from the buffer to the output device • Each subroutine must transfer the record one character at a time, because the only I/O instructions available are RD and WD • The buffer is necessary because the I/O rates of the two devices, such as a disk and a slow printing terminal, may differ

  8. SIC Example Program (Fig. 2.1) (contd.) • The end of each record is marked by a null character (hexadecimal 00) • If a record is longer than the buffer (4096 bytes), only the first 4096 bytes are copied • The end of the file to be copied is indicated by a zero-length record • When end-of-file is detected, the program writes EOF on the output device and terminates by executing an RSUB instruction

  9. Assembler Directives(or Pseudo-Instructions) • Assembler directives • Not translated into machine instructions • Provides instructions to the assembler • Basic assembler directives • START • Specify name and starting address for the program • END • Indicate the end of the source program, and (optionally) the first executable instruction in the program

  10. Assembler Directives (cont.) • BYTE • Generate character or hexadecimal constant, occupying as many bytes as needed to represent the constant • WORD • Generate one-word integer constant • RESB • Reserve the indicated number of bytes for a data area • RESW • Reserve the indicated number of words for a data area • BYTE & WORD – directs the assembler to generate constants • RESB & RESW – instructs the assembler to reserve memory locations without generating data values
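As a rough illustration of the four data directives above, the Python sketch below computes how many bytes each directive occupies and, for BYTE and WORD, the hex constant it generates. The function names (`byte_constant`, `directive_length`, `directive_object_code`) are hypothetical, and the sketch assumes a 3-byte SIC word and operands written as C'...' or X'...'.

```python
WORD_SIZE = 3  # assumption: a SIC word is 3 bytes (24 bits)


def byte_constant(operand):
    """Return the bytes for a BYTE operand such as C'EOF' or X'F1'."""
    kind, text = operand[0], operand[2:-1]  # drop the leading C'/X' and the closing quote
    if kind == "C":
        return text.encode("ascii")         # one byte per character
    if kind == "X":
        return bytes.fromhex(text)          # two hex digits per byte
    raise ValueError(f"unsupported BYTE operand: {operand}")


def directive_length(opcode, operand):
    """Number of bytes the directive adds to the program."""
    if opcode == "BYTE":
        return len(byte_constant(operand))
    if opcode == "WORD":
        return WORD_SIZE
    if opcode == "RESB":
        return int(operand)
    if opcode == "RESW":
        return WORD_SIZE * int(operand)
    raise ValueError(f"not a data directive: {opcode}")


def directive_object_code(opcode, operand):
    """Hex object code for BYTE/WORD; RESB/RESW reserve space and generate nothing."""
    if opcode == "BYTE":
        return byte_constant(operand).hex().upper()
    if opcode == "WORD":
        return format(int(operand) & 0xFFFFFF, "06X")
    return ""


print(directive_object_code("BYTE", "C'EOF'"))  # 454F46
print(directive_object_code("WORD", "3"))       # 000003
print(directive_length("RESB", "4096"))         # 4096
```

BYTE and WORD return object code, while RESB and RESW only change the length, which mirrors the distinction drawn in the last two bullets of the slide.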

  11. SIC Example Program (Fig. 2.1) (cont.) [Source listing figure not reproduced. Callouts: specify name and starting address for the program; main program; end-of-file; end-of-record; call subroutine; forward reference; when end-of-file is reached; char.] Line numbers are not part of the program. They are for reference only.

  12. SIC Example Program (cont.) [Source listing figure not reproduced. Callouts: comment line; “<” means ready; end-of-record null character; indexed addressing; hexadecimal number.]

  13. SIC Example Program (cont.) [Source listing figure not reproduced. Callouts: subroutine entry point; subroutine return point.]

  14. SIC Example Program (cont.) • Data transfer • A record is a stream of bytes with a null character (hexadecimal 00) at the end • If a record is longer than 4096 bytes, only the first 4096 bytes are copied • EOF is indicated by a zero-length record (i.e., a byte stream with only a null character, hexadecimal 00) • Because the speeds of the input and output devices may differ, a buffer is used to temporarily store the record • Subroutine call and return • On line 10, “STL RETADR” is executed to save the return address that is already stored in register L • Otherwise, after calling RDREC or WRREC, COPY could not return to its caller, because each JSUB overwrites register L

  15. An Assembler’s Job • Convert mnemonic operation codes to their machine language codes (e.g., translate STL to 14 on line 10) • Convert symbolic operands (e.g., jump labels, variable names) to their machine addresses (e.g., translate RETADR to 1033 on line 10) • Use proper addressing modes and formats to build efficient machine instructions • Translate data constants in the source program into internal machine representations (e.g., translate EOF to 454F46 on line 80) • Output the object program and provide other information (e.g., for the linker and loader)

  16. An Assembler’s Job (contd.) • All of these functions except the second (converting symbolic operands to addresses) can be accomplished easily by sequential processing of the source program, one line at a time • Forward reference • Consider line 10: 10 1000 FIRST STL RETADR 141033 • This contains a forward reference, i.e., a reference to a label (RETADR) that is defined later in the program • Line 10 stores the value of register L in RETADR, but RETADR is not defined until line 95 • If we attempt to translate the program line by line, we cannot process this statement, because we do not yet know the address that will be assigned to RETADR • For this reason, most assemblers make two passes • 1st pass: scan the source program for label definitions and assign addresses • 2nd pass: perform most of the actual translation

  17. Fig. 2.1 with Object Code There is no object code corresponding to addresses 1033-2038. This storage is simply reserved by the loader for use by the program during execution.

  18. Fig. 2.1 with Object Code (cont.)

  19. Fig. 2.1 with Object Code (cont.)

  20. Examples • Mnemonic code (or instruction name) -> opcode • Examples: • STL RETADR -> 14 10 33 = 0001 0100 0 001 0000 0011 0011 • STCH BUFFER,X -> 54 90 39 = 0101 0100 1 001 0000 0011 1001
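The bit patterns above split into an 8-bit opcode, a 1-bit index flag (x), and a 15-bit address. The following Python sketch packs those three fields into the 6 hex digits of a standard SIC instruction; the function name `assemble_sic` is hypothetical, and the opcode and address values are the ones quoted on the slide.

```python
def assemble_sic(opcode, address, indexed=False):
    """Pack an 8-bit opcode, the x bit, and a 15-bit address into 6 hex digits."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode does not fit in 8 bits")
    if not 0 <= address < 2 ** 15:
        raise ValueError("address does not fit in 15 bits")
    word = (opcode << 16) | (int(indexed) << 15) | address
    return format(word, "06X")


print(assemble_sic(0x14, 0x1033))                # 141033  (STL RETADR)
print(assemble_sic(0x54, 0x1039, indexed=True))  # 549039  (STCH BUFFER,X)
```

In the second call the x bit turns the address field 1039 into 9039, which is why the indexed instruction reads 54 90 39.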

  21. Object Program Format
      • Header record
        Col. 1       H
        Col. 2~7     Program name
        Col. 8~13    Starting address of object program (hex)
        Col. 14~19   Length of object program in bytes (hex)
      • Text record
        Col. 1       T
        Col. 2~7     Starting address in this record (hex)
        Col. 8~9     Length of object code in this record in bytes (hex)
        Col. 10~69   Object code in hex (2 columns per byte of object code)
      • End record
        Col. 1       E
        Col. 2~7     Address of first executable instruction in object program (hex)

  22. Object Program Format (contd.)
      H^COPY ^001000^00107A
      T^001000^1E^141033^482039^001036^281030^301015^482061^3C1003^00102A^0C1039^00102D
      T^00101E^15^0C1036^482061^081033^4C0000^454F46^000003^000000
      T^002039^1E^041030^001030^E0205D^30203F^D8205D^281030^302057^549039^2C205E^38203F
      T^002057^1C^101036^4C0000^F1^001000^041030^E02079^302064^509039^DC2079^2C1036
      T^002073^07^382064^4C0000^05
      E^001000
      • Length of object program in bytes (header record): 207A - 1000 = 107A
      • Length of object code in each text record, in bytes = Hex(number of hex digits / 2): Hex(60/2) = Hex(30) = 1E, Hex(42/2) = 15, Hex(60/2) = 1E, Hex(56/2) = 1C, Hex(14/2) = 07
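To make the column layout and the length arithmetic concrete, here is a small Python sketch that formats the three record types. The helper names are hypothetical, and "^" is used as the field separator only because the slide prints it that way. The program name is padded to the six columns of Col. 2~7.

```python
MAX_TEXT_BYTES = 30  # columns 10~69 hold 60 hex digits, i.e. 30 bytes


def header_record(name, start, length):
    """H record: program name (6 columns), starting address, program length."""
    return f"H^{name:<6}^{start:06X}^{length:06X}"


def text_record(start, codes):
    """T record: starting address, byte count, then the object codes."""
    n_bytes = sum(len(code) for code in codes) // 2  # two hex digits per byte
    if n_bytes > MAX_TEXT_BYTES:
        raise ValueError("a text record holds at most 30 bytes of object code")
    return f"T^{start:06X}^{n_bytes:02X}^" + "^".join(codes)


def end_record(first_instruction):
    """E record: address of the first executable instruction."""
    return f"E^{first_instruction:06X}"


print(header_record("COPY", 0x1000, 0x207A - 0x1000))
print(text_record(0x2073, ["382064", "4C0000", "05"]))  # T^002073^07^382064^4C0000^05
print(end_record(0x1000))                               # E^001000
```

The last text record contains 14 hex digits, so its length field is 07, matching Hex(14/2) on the slide.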

  23. Symbolic Operands • Writing memory addresses directly in the program is inconvenient • Instead, we define variable names • Other examples of symbolic operands • Labels (for jump instructions) • Subroutines • Constants

  24. Converting Symbols to Values or Addresses • Isn’t it simply the sequential processing of the source program, one line at a time? • Not so, if there are forward references: the value of the symbol is unknown at that point, because it is defined later in the code • Forward reference: a reference to a label that is defined later in the program • Example:
      COPY  START  1000
            …
            LDA    LEN
            …
            …
      LEN   RESW   1
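A toy Python sketch of why one sequential scan is not enough: it reduces the example to three statements (the elided "…" lines are left out, so the addresses are illustrative only) and assumes every instruction is 3 bytes. The `scan` function and its `resolve_immediately` flag are hypothetical names.

```python
program = [
    ("COPY", "START", "1000"),
    ("",     "LDA",   "LEN"),    # forward reference: LEN is defined two lines later
    ("LEN",  "RESW",  "1"),
]


def statement_length(opcode, operand):
    # Toy rule: RESW reserves 3 bytes per word; every instruction is 3 bytes.
    return 3 * int(operand) if opcode == "RESW" else 3


def scan(program, resolve_immediately):
    """Walk the program once; optionally try to resolve operands on the spot."""
    symtab, resolved, locctr = {}, [], 0
    for label, opcode, operand in program:
        if opcode == "START":
            locctr = int(operand, 16)
            continue
        if label:
            symtab[label] = locctr
        if opcode == "LDA" and resolve_immediately:
            resolved.append((opcode, symtab[operand]))  # KeyError on a forward reference
        locctr += statement_length(opcode, operand)
    return symtab, resolved


try:
    scan(program, resolve_immediately=True)
except KeyError as missing:
    print("one pass fails, symbol not yet defined:", missing)

# Two passes: first collect every label, then look the operands up.
symtab, _ = scan(program, resolve_immediately=False)
resolved = [(op, symtab[arg]) for _, op, arg in program if op == "LDA"]
print(resolved)  # [('LDA', 4099)], i.e. LEN = hex 1003
```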

  25. Two-Pass Assemblers • Pass 1 • Assign addresses to all statements in the program • Save the values (addresses) assigned to all labels (including jump labels and variable names) for use in Pass 2 (to deal with forward references) • Perform some processing of assembler directives (e.g., BYTE and RESW, which affect address assignment) • Pass 2 • Assemble instructions (generate opcodes and look up addresses) • Generate data values defined by BYTE and WORD • Perform processing of assembler directives not done in Pass 1 • Write the object program and the assembly listing

  26. Two-Pass Assembler (cont.) • From input line: LABEL, OPCODE, OPERAND • Operation Code Table (OPTAB) • Symbol Table (SYMTAB) • Location Counter (LOCCTR) • The information in OPTAB is predefined, when the assembler itself is written

  27. Two-Pass Assembler (cont.) [Diagram: Source program -> Pass 1 -> Intermediate file -> Pass 2 -> Object code, with OPTAB and SYMTAB attached to the passes] • OPTAB is used to look up mnemonic opcodes and translate them to their machine language equivalents • SYMTAB stores values (addresses) assigned to labels

  28. Operation Code Table (OPTAB) • In pass 1, OPTAB is used to look up and validate mnemonic opcodes in the source program • In pass 2, OPTAB is used to translate mnemonic opcodes to machine instructions • In SIC, both of these steps could be done together, in either pass 1 or pass 2 • For SIC/XE, however, instructions have different lengths, so OPTAB is needed in both passes • In pass 1, search OPTAB to find the instruction length for incrementing LOCCTR • In pass 2, use OPTAB to tell which instruction format to use to assemble the instruction

  29. Operation Code Table (OPTAB) (contd.) • Content • The mapping between mnemonic and machine code. Also include the instruction format, available addressing modes, and length information • Characteristic • Static table • The content will never change • Contents are not normally added/deleted (predefined) • Implementation • Array or hash table, easy for search • Gives optimum performance for the particular set of keys being stored
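One possible way to picture a static OPTAB is a read-only hash table keyed by mnemonic. The Python sketch below is only an illustration: it holds a handful of entries, stores just the opcode and instruction length, and the `lookup` helper is a hypothetical name. The opcode values are the standard SIC/XE ones (the format 3 entries also appear in the object code on the earlier slides).

```python
from types import MappingProxyType

# mnemonic -> (opcode, instruction length in bytes)
OPTAB = MappingProxyType({
    "LDA":   (0x00, 3),
    "STL":   (0x14, 3),
    "JSUB":  (0x48, 3),
    "COMP":  (0x28, 3),
    "RSUB":  (0x4C, 3),
    "COMPR": (0xA0, 2),  # register-to-register (format 2)
    "TIXR":  (0xB8, 2),  # register-to-register (format 2)
})


def lookup(mnemonic):
    """Return (opcode, length_in_bytes), or None if the mnemonic is not valid."""
    return OPTAB.get(mnemonic)


print(lookup("STL"))    # (20, 3)
print(lookup("COMPR"))  # (160, 2)
print(lookup("MOVE"))   # None
```

`MappingProxyType` gives a read-only view, so an accidental `OPTAB["X"] = ...` raises `TypeError`; that matches the "contents are predefined" characteristic, while the dictionary underneath gives hash-table lookup.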

  30. Symbol Table (SYMTAB) • Content • Include the label name and value (address) for each label in the source program • Include data type and length information • With flags to indicate errors (e.g., a symbol defined in two places) • In pass 1, labels are entered into SYMTAB as they are encountered in the source program, along with assigned addresses from LOCCTR • In pass 2, symbols used as operands are looked up in SYMTAB to obtain the addresses to be inserted in the assembled instructions
      Symbol   Address
      COPY     1000
      FIRST    1000
      CLOOP    1003
      ENDFIL   1015
      EOF      102A
      THREE    102D
      ZERO     1030
      RETADR   1033
      LENGTH   1036
      BUFFER   1039
      RDREC    2039
      WRREC    2061

  31. Symbol Table (SYMTAB) (contd.) • Characteristic • Dynamic table (i.e., symbols may be inserted, deleted, or searched in the table) • Implementation • Hash table can be used to speed up search • Organized generally as hash table, for efficiency of insertion & retrieval • Because variable names may be very similar (e.g., LOOP1, LOOP2), the selected hash function must perform well with such non-random keys

  32. Location Counter (LOCCTR) • This variable is used to help in the assignment of addresses • It is initialized to the beginning address specified in the START statement • After each source statement is processed, the length of the assembled instruction or data area to be generated is added to LOCCTR • When a label in the source program is reached, the current value of LOCCTR gives the address associated with that label

  33. Pseudo Code for Pass 1 (SIC) • 1st find starting address of the program • START – its operand will be the starting address

  34. Pseudo Code for Pass 1 (contd.) • Whenever we find a label, save it in the symbol table • Set the error flag if an unrecognized opcode is found OR if a symbol is encountered more than 1 time

  35. Pseudo Code for Pass 1 (contd.)
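The pseudo code figures for Pass 1 are not reproduced in this transcript. As a hedged stand-in, the Python sketch below follows the steps the slides do describe: take the starting address from START, enter each label into SYMTAB with the current LOCCTR, advance LOCCTR by the statement length, and flag duplicate symbols and unrecognized opcodes. The statement list is the main routine reconstructed from the object code and symbol addresses shown on the earlier slides, so treat it as an assumption rather than the original Fig. 2.1 text. The sketch also skips writing the intermediate file.

```python
# Subset of OPTAB: the mnemonics used by the main routine (all 3 bytes in SIC).
OPTAB = {"STL", "JSUB", "LDA", "COMP", "JEQ", "J", "STA", "LDL", "RSUB"}

# (label, opcode, operand) triples
MAIN_ROUTINE = [
    ("COPY",   "START", "1000"),
    ("FIRST",  "STL",   "RETADR"),
    ("CLOOP",  "JSUB",  "RDREC"),
    ("",       "LDA",   "LENGTH"),
    ("",       "COMP",  "ZERO"),
    ("",       "JEQ",   "ENDFIL"),
    ("",       "JSUB",  "WRREC"),
    ("",       "J",     "CLOOP"),
    ("ENDFIL", "LDA",   "EOF"),
    ("",       "STA",   "BUFFER"),
    ("",       "LDA",   "THREE"),
    ("",       "STA",   "LENGTH"),
    ("",       "JSUB",  "WRREC"),
    ("",       "LDL",   "RETADR"),
    ("",       "RSUB",  ""),
    ("EOF",    "BYTE",  "C'EOF'"),
    ("THREE",  "WORD",  "3"),
    ("ZERO",   "WORD",  "0"),
    ("RETADR", "RESW",  "1"),
    ("LENGTH", "RESW",  "1"),
    ("BUFFER", "RESB",  "4096"),
    ("",       "END",   "FIRST"),
]


def statement_length(opcode, operand):
    """Bytes occupied by one statement, or None for an unrecognized opcode."""
    if opcode in OPTAB or opcode == "WORD":
        return 3
    if opcode == "RESW":
        return 3 * int(operand)
    if opcode == "RESB":
        return int(operand)
    if opcode == "BYTE":
        body = operand[2:-1]                      # text between the quotes
        return len(body) if operand[0] == "C" else len(body) // 2
    return None


def pass1(lines):
    """Return (SYMTAB, program length in bytes, list of error messages)."""
    symtab, errors = {}, []
    locctr = start = 0
    for label, opcode, operand in lines:
        if opcode == "START":
            start = locctr = int(operand, 16)     # operand is the starting address
        if opcode == "END":
            break
        if label:
            if label in symtab:
                errors.append(f"duplicate symbol: {label}")
            else:
                symtab[label] = locctr            # label gets the current LOCCTR
        if opcode == "START":
            continue
        length = statement_length(opcode, operand)
        if length is None:
            errors.append(f"invalid operation code: {opcode}")
        else:
            locctr += length
    return symtab, locctr - start, errors


symtab, length, errors = pass1(MAIN_ROUTINE)
for name, address in symtab.items():
    print(f"{name:<8}{address:04X}")
print(f"length = {length:04X}, errors = {errors}")
```

RDREC and WRREC are referenced but not defined in this fragment; Pass 1 does not notice, because operands are only looked up in Pass 2.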

  36. Pseudo Code for Pass 2 (SIC) • Write the HEADER

  37. Pseudo Code for Pass 2 (contd.)

  38. Pseudo Code for Pass 2 (contd.)
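The Pass 2 pseudo code figures are likewise missing from this transcript. The Python sketch below is one possible reading of what Pass 2 does for standard SIC: look up each opcode in OPTAB and each operand in SYMTAB, build the object code, and group it into text records of at most 30 bytes. The table contents come from the earlier slides; the intermediate-file layout (address, opcode, operand) and the function names are assumptions, and header/end records and the listing are left out.

```python
OPTAB = {"STL": 0x14, "JSUB": 0x48, "LDA": 0x00, "COMP": 0x28, "JEQ": 0x30,
         "J": 0x3C, "STA": 0x0C, "LDL": 0x08, "RSUB": 0x4C}
SYMTAB = {"FIRST": 0x1000, "CLOOP": 0x1003, "ENDFIL": 0x1015, "EOF": 0x102A,
          "THREE": 0x102D, "ZERO": 0x1030, "RETADR": 0x1033, "LENGTH": 0x1036,
          "BUFFER": 0x1039, "RDREC": 0x2039, "WRREC": 0x2061}

# Hypothetical intermediate file from Pass 1: (address, opcode, operand)
INTERMEDIATE = [
    (0x1000, "STL", "RETADR"), (0x1003, "JSUB", "RDREC"), (0x1006, "LDA", "LENGTH"),
    (0x1009, "COMP", "ZERO"), (0x100C, "JEQ", "ENDFIL"), (0x100F, "JSUB", "WRREC"),
    (0x1012, "J", "CLOOP"), (0x1015, "LDA", "EOF"), (0x1018, "STA", "BUFFER"),
    (0x101B, "LDA", "THREE"), (0x101E, "STA", "LENGTH"), (0x1021, "JSUB", "WRREC"),
    (0x1024, "LDL", "RETADR"), (0x1027, "RSUB", ""), (0x102A, "BYTE", "C'EOF'"),
    (0x102D, "WORD", "3"), (0x1030, "WORD", "0"), (0x1033, "RESW", "1"),
]


def object_code(opcode, operand, errors):
    """Hex object code for one statement ('' if it generates none)."""
    if opcode in OPTAB:
        symbol, _, index = operand.partition(",")
        address = 0                               # RSUB has no operand
        if symbol:
            if symbol in SYMTAB:
                address = SYMTAB[symbol]
            else:
                errors.append(f"undefined symbol: {symbol}")
        if index == "X":
            address |= 0x8000                     # set the x bit for indexed addressing
        return f"{OPTAB[opcode]:02X}{address:04X}"
    if opcode == "WORD":
        return f"{int(operand) & 0xFFFFFF:06X}"
    if opcode == "BYTE":
        body = operand[2:-1]
        return body.encode("ascii").hex().upper() if operand[0] == "C" else body.upper()
    return ""                                     # RESB, RESW, and other directives


def pass2(intermediate):
    """Return (list of text records, list of error messages)."""
    records, errors = [], []
    start, codes, size = None, [], 0

    def flush():
        nonlocal start, codes, size
        if codes:
            records.append(f"T^{start:06X}^{size:02X}^" + "^".join(codes))
        start, codes, size = None, [], 0

    for address, opcode, operand in intermediate:
        code = object_code(opcode, operand, errors)
        if not code:
            flush()                               # reserved storage ends the record
            continue
        if size + len(code) // 2 > 30:
            flush()                               # record full: start a new one
        if start is None:
            start = address
        codes.append(code)
        size += len(code) // 2
    flush()
    return records, errors


records, errors = pass2(INTERMEDIATE)
print("\n".join(records))
```

The two records printed can be compared with the first two text records of the object program on the Object Program Format slide.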

  39. Assembler Design • Machine Dependent Assembler Features (Sec. 2.2) • instruction formats and addressing modes • program relocation • Machine Independent Assembler Features (Sec. 2.3) • literals • symbol-defining statements • expressions • program blocks • control sections and program linking

  40. Machine-dependent Assembler Features Section 2.2

  41. SIC/XE Assemblers • What’s new for SIC/XE? • more addressing modes • program relocation

  42. Differences Between the SIC and SIC/XE Programs • Register-to-register instructions are used to improve execution speed • Fetching a value stored in a register is much faster than fetching it from memory; register-to-register instructions are also shorter and do not require another memory reference • On line 150, COMP ZERO is changed to COMPR A,S • Similarly, on line 165, TIX MAXLEN is changed to TIXR T • Immediate addressing mode is used whenever possible • The operand is already included in the fetched instruction, so there is no need to fetch it from memory • Denoted by the prefix #

  43. Differences Between the SIC and SIC/XE Programs (contd.) • Indirect addressing mode is used whenever possible • One instruction is enough, rather than two • Denoted by the prefix @ • Instructions that refer to memory are normally assembled using PC-relative or base-relative mode • If the displacements for both PC-relative and base-relative mode are too large to fit into a 3-byte instruction, the 4-byte extended format is used • Denoted by the prefix +

  44. Differences Between the SIC and SIC/XE Programs (contd.) • The larger main memory of SIC/XE means there is room to load and run several programs at the same time • This kind of sharing of the machine between programs is called multiprogramming • It results in more productive use of the hardware • To take full advantage of this feature, we must be able to load programs into memory wherever there is room, rather than specifying a fixed address • This introduces the idea of relocation

  45. Instruction Formats and Addressing Modes • SIC/XE • PC-relative or Base-relative addressing: op m • Indirect addressing: op @m • Immediate addressing: op #c • Extended format (4 Bytes): +op m • Index addressing: op m,x • register-to-register instructions • larger memory -> multi-programming (program allocation)

  46. Relative Addressing Modes • PC-relative or base-relative addressing mode is preferred over direct addressing mode • Saves one byte by using format 3 rather than format 4 • Reduces program storage space • Reduces program instruction fetch time • Makes relocation easier
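A hedged Python sketch of how a format 3 displacement might be chosen. It assumes the standard SIC/XE limits (a 12-bit displacement, signed from -2048 to 2047 for PC-relative and unsigned from 0 to 4095 for base-relative) and the usual bit layout of 6 opcode bits, the n, i, x, b, p, e flags, and the displacement. The function names and the addresses in the usage lines are illustrative assumptions, not values taken from this transcript.

```python
def choose_addressing(target, pc, base=None):
    """Pick a mode for a format 3 instruction.

    pc is the address of the *next* instruction; base is the value the
    programmer has promised will be in register B (None if unknown).
    """
    disp = target - pc
    if -2048 <= disp <= 2047:
        return "pc", disp & 0xFFF                 # 12-bit two's complement
    if base is not None and 0 <= target - base <= 4095:
        return "base", target - base
    return "format4", target                      # does not fit in 12 bits


def encode_format3(opcode, disp, n=1, i=1, x=0, b=0, p=0, e=0):
    """Build the 6 hex digits: opcode(6 bits) n i | x b p e | disp(12 bits)."""
    first_byte = (opcode & 0xFC) | (n << 1) | i
    flags = (x << 3) | (b << 2) | (p << 1) | e
    return f"{first_byte:02X}{flags:X}{disp & 0xFFF:03X}"


# Illustrative case: STL at address 0000 referring to a label at 0030.
mode, disp = choose_addressing(target=0x0030, pc=0x0003)
print(mode, encode_format3(0x14, disp, p=1))      # pc 17202D

# Backward jump: J at 0017 to a label at 0006 gives a negative displacement.
mode, disp = choose_addressing(target=0x0006, pc=0x001A)
print(mode, encode_format3(0x3C, disp, p=1))      # pc 3F2FEC
```

Both results stay at 3 bytes, which is the saving the slide refers to. When `choose_addressing` returns "format4", the displacement does not fit and the 4-byte extended format (prefix +) is needed.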

  47. An SIC/XE Program (Fig. 2.5) [Source listing figure not reproduced. Callouts: for relocation; format 4; immediate addressing; indirect addressing.]

  48. An SIC/XE Program (cont.)

  49. An SIC/XE Program (cont.)

  50. SIC/XE Program with Object Code (fig 2.6)
