Elements of Assembly Language

To allow us to type in ADD A,15h rather than 11000110 00010101, we need another program to do the conversion. This program is called an assembler.

The assembler allows us to type in the assembly code, called the source code, and converts it to machine code, referred to as object code.
• Syntax help
• Syntax is the structure of statements in a language, whether it be English or a computer language. In English, most people would recognize that something is incorrect about saying ‘He are going’ rather than ‘He is going’. This is an example of a syntax error.

As an example, if we mistyped the instruction LD A,C as LD A,V, the assembler would be unable to convert it to object code, since it would not recognize ‘V’ as a valid register. A very basic assembler may just stop or skip the instruction. A somewhat better one may put a message on the screen saying ‘syntax error: code not recognized’, and a very helpful one may suggest a likely cause of the trouble. It may say, ‘syntax error. Invalid register. Register name may only be A, B, C, D, E, H or L.’
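The error reporting described above can be sketched in C. This is only an illustration, not a real assembler: the function name assemble_ld_a and its return convention are invented for this example, and it handles just the Z80 instruction LD A,r, whose opcode is 78h plus the 3-bit code of the source register.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: assembles "LD A,r" for a Z80-style CPU.
 * Returns 0 and writes the one-byte opcode to *opcode on success,
 * or returns -1 and prints a diagnostic if r is not a valid register. */
int assemble_ld_a(char reg, unsigned char *opcode)
{
    /* Registers in encoding order. '-' fills code 6, which the Z80
     * uses for (HL); that form is not handled in this sketch. */
    static const char codes[] = "BCDEHL-A";
    const char *hit = (reg == '-' || reg == '\0') ? NULL : strchr(codes, reg);

    if (hit == NULL) {
        fprintf(stderr, "syntax error. Invalid register '%c'. "
                        "Register name may only be A, B, C, D, E, H or L.\n", reg);
        return -1;
    }
    *opcode = (unsigned char)(0x78 + (hit - codes));
    return 0;
}
```

Calling assemble_ld_a('C', &op) stores 79h in op, while assemble_ld_a('V', &op) prints the diagnostic and returns -1, which is the behaviour of the "very helpful" assembler in the text.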

• Machine language is very difficult to program in directly. Deciphering the meanings of the numerically coded instructions is tedious for humans.
• For example, the instruction that says to add the EAX and EBX registers together and store the result back into EAX is encoded by the following hex codes:
    03 C3

An assembly language program is stored as text (just as a higher-level language program is). Each assembly instruction represents exactly one machine instruction. For example, the addition instruction described above would be represented in assembly language as:
    add eax, ebx
• Here the meaning of the instruction is much clearer than in machine code.
• The word add is a mnemonic for the addition instruction. The general form of an assembly instruction is:
    mnemonic operand(s)

Machine code instructions have varying numbers and types of operands; however, in general, each instruction itself will have a fixed number of operands (0 to 3). Operands can have the following types:
• register: These operands refer directly to the contents of the CPU’s registers.
• memory: These refer to data in memory. The address of the data may be a constant hardcoded into the instruction or may be computed using values of registers. Addresses are always offsets from the beginning of a segment.
• immediate: These are fixed values that are listed in the instruction itself. They are stored in the instruction itself (in the code segment), not in the data segment.
• implied: These operands are not explicitly shown. For example, the increment instruction adds one to a register or memory. The one is implied.

Basic Instructions
• The most basic instruction is the MOV instruction. It moves data from one location to another (like the assignment operator in a high-level language).
• It takes two operands:
    mov dest, src
• The data specified by src is copied to dest. One restriction is that both operands may not be memory operands.
• Here is an example:
    mov eax, 3      ; store 3 into EAX register (3 is immediate operand)
    mov bx, ax      ; store the value of AX into the BX register

• The ADD instruction is used to add integers.
    add eax, 4      ; eax = eax + 4
    add al, ah      ; al = al + ah
• The SUB instruction subtracts integers.
    sub bx, 10      ; bx = bx - 10
    sub ebx, edi    ; ebx = ebx - edi
• The INC and DEC instructions increment or decrement values by one. Since the one is an implicit operand, the machine code for INC and DEC is smaller than for the equivalent ADD and SUB instructions.
    inc ecx         ; ecx++
    dec dl          ; dl--

The %define directive
• This directive is similar to C’s #define directive. It is most commonly used to define constant macros just as in C.
    %define SIZE 100
    mov eax, SIZE
• The above code defines a macro named SIZE and shows its use in a MOV instruction. Macros are more flexible than symbols in two ways: macros can be redefined, and they can be more than simple constant numbers.

Data directives
• Data directives are used in data segments to reserve room in memory. There are two ways memory can be reserved. The first way only defines room for data; the second way defines room and an initial value. The first method uses one of the RESX directives. The X is replaced with a letter that determines the size of the object (or objects) that will be stored (for example, B for a byte, W for a word, or D for a double word).

• The second method (that defines an initial value, too) uses one of the DX directives. The X letters are the same as those in the RESX directives.
• It is very common to mark memory locations with labels. Labels allow one to easily refer to memory locations in code. Below are several examples:
    L1 db 0           ; byte labeled L1 with initial value 0
    L2 dw 1000        ; word labeled L2 with initial value 1000
    L3 db 110101b     ; byte initialized to binary 110101 (53 in decimal)
    L4 db 12h         ; byte initialized to hex 12 (18 in decimal)
    L5 db 17o         ; byte initialized to octal 17 (15 in decimal)
    L6 dd 1A92h       ; double word initialized to hex 1A92
    L7 resb 1         ; 1 uninitialized byte
    L8 db "A"         ; byte initialized to ASCII code for A (65)
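The decimal values quoted in the comments for L3, L4 and L5 can be checked with a few lines of C. The sketch below is only an illustration of how the b, h and o suffixes select the base; the function name parse_suffixed_literal is invented here, and it covers just the suffixes used on this slide, not everything an assembler accepts.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts a literal such as "110101b", "12h",
 * "17o" or "1000" to its value, choosing the base from the suffix. */
long parse_suffixed_literal(const char *text)
{
    size_t len = strlen(text);
    int base = 10;                       /* no suffix: decimal */

    if (len > 0) {
        switch (text[len - 1]) {
        case 'b': base = 2;  break;      /* binary */
        case 'o': base = 8;  break;      /* octal */
        case 'h': base = 16; break;      /* hexadecimal */
        }
    }
    /* strtol stops at the first character that is not a digit in the
     * chosen base, so the suffix letter itself is ignored. */
    return strtol(text, NULL, base);
}
```

With this sketch, parse_suffixed_literal("110101b") yields 53, "12h" yields 18 and "17o" yields 15, matching the comments in the examples.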

• Consecutive data definitions are stored sequentially in memory. That is, the word L2 is stored immediately after L1 in memory. Sequences of memory may also be defined.
    L9 db 0, 1, 2, 3                ; defines 4 bytes
    L10 db "w", "o", "r", 'd', 0    ; defines a C string = "word"
    L11 db 'word', 0                ; same as L10
• For large sequences, NASM’s TIMES directive is often useful. This directive repeats its operand a specified number of times. For example,
    L12 times 100 db 0              ; equivalent to 100 (db 0)'s
    L13 resw 100                    ; reserves room for 100 words

• There are two ways that a label can be used. If a plain label is used, it is interpreted as the address (or offset) of the data. If the label is placed inside square brackets ([]), it is interpreted as the data at the address. In other words, one should think of a label as a pointer to the data, and the square brackets dereference the pointer just as the asterisk does in C.
    mov al, [L1]     ; copy byte at L1 into AL
    mov eax, L1      ; EAX = address of byte at L1
    mov [L1], ah     ; copy AH into byte at L1
    mov eax, [L6]    ; copy double word at L6 into EAX
    add eax, [L6]    ; EAX = EAX + double word at L6
    add [L6], eax    ; double word at L6 += EAX
    mov al, [L6]     ; copy first byte of double word at L6 into AL
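The pointer analogy can be written out in C. This is a sketch under assumptions: the variables reuse the names L1 and L6 from the data examples, the function names are invented for illustration, and the last function returns 92h only on a little-endian machine such as the x86.

```c
#include <stdint.h>
#include <string.h>

static uint8_t  L1 = 0;        /* like: L1 db 0     */
static uint32_t L6 = 0x1A92;   /* like: L6 dd 1A92h */

/* mov eax, L1 : the plain label is the address of the data */
uint8_t *address_of_L1(void) { return &L1; }

/* mov al, [L1] : brackets fetch the byte stored at that address */
uint8_t load_L1(void) { return *address_of_L1(); }

/* mov [L1], ah : brackets on the destination store through the address */
void store_L1(uint8_t value) { *address_of_L1() = value; }

/* mov al, [L6] : reads only the first byte in memory of the double word */
uint8_t first_byte_of_L6(void)
{
    uint8_t byte;
    memcpy(&byte, &L6, 1);
    return byte;
}
```

The plain name corresponds to &L1 and the bracketed name to *(&L1), which is why mov eax, L1 and mov eax, [L1] load very different things.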

• A statement that is more than just a comment almost always contains a mnemonic that identifies the purpose of the statement, and may have three other fields: name, operand, and comment.
• These components must be in the following order:
    name mnemonic operand(s) ;comment
• For example, a program might contain the statement
    ZeroCount: mov ecx, 0 ; initialize count to zero

• One use for the name field is to label an instruction. After the program has been assembled and linked, the name stands symbolically for the address of that instruction in memory, so other instructions can easily refer to the labeled instruction. If an add instruction needs to be executed repeatedly in a program loop, it could be coded
    addLoop: add eax, 158
• The instruction can then be the destination of a jmp (jump) instruction, the assembly language version of a goto:
    jmp addLoop ; repeat addition

• It is sometimes useful to have a line of source code consisting of just a name, for example
    EndIfBlank:
• Such a label might be used as the last line of code implementing an if-then-else-endif structure.


Data Movement
    mov eax,num1    ; load eax with the contents of num1
    mov num2,eax    ; store the contents of eax in num2
• These instructions work well with numbers, but what if one wanted to move a character from one location to another?
• For example, how would the following C code segment be implemented in assembly language?
    char letter1,letter2;
    letter1 = 'A';
    letter2 = letter1;

• The following assembly language code segment implements the above C code:
    .data
    letter1 byte ?
    letter2 byte ?
    .code
    ; letter1 = 'A'
    mov letter1,'A'    ; store 'A' in letter1
    ; letter2 = letter1
    mov al,letter1     ; load al with letter1
    mov letter2,al     ; store al in letter2

Complete Program: Implementing Inline Assembly in C
• In order to include assembly language instructions in a C program, one must begin the assembly language code segment with __asm {, which is a double underscore, followed by the word asm and an opening brace, and end it with a closing brace. This is the syntax used by Microsoft Visual C++. Consider first the following program written entirely in C:
    #include <stdio.h>
    int main(){
        int num1,num2;
        num1 = 5;
        num2 = num1;
        printf("%s%d\n","The answer is: ",num2);
        return 0;
    }

• Here is the same program with the assignment num2 = num1 written in inline assembly:
    #include <stdio.h>
    int main(){
        int num1,num2;
        num1 = 5;
        __asm {
            mov eax,num1
            mov num2,eax
        }
        printf("%s%d\n","The answer is: ",num2);
        return 0;
    }

Arithmetic Instructions
Addition and Subtraction
• After learning how to load a register, transfer data between memory locations, and perform I/O, the next step is to learn how to perform various arithmetic operations.
• One of the simplest ways to learn how to perform arithmetic in assembly language is to first write the equation as a high-level statement.
    sum = num1 + num2;

• The following assembly language code segment implements the C statement from above:
    ; sum = num1 + num2
    mov eax,num1    ; load eax with the contents of num1
    add eax,num2    ; add the contents of num2 to eax
    mov sum,eax     ; store eax in sum

• Although it is possible to use any of the other three general-purpose registers (ebx, ecx, or edx) and accomplish the same task, it is usually better to use the accumulator, the eax register, because some instructions have shorter encodings when they use eax and, on some processors, can also be a little faster.
• Also, just as there are often many ways to solve a problem in high-level languages, the same is true in low-level languages. Further, just as some solutions are better than others in high-level languages, the same is also true in low-level languages. For example, the previous assembly code segment could have been written as follows:
    mov sum,0        ; initialize sum to zero
    mov eax, num1    ; load eax with the contents of num1
    add sum, eax     ; add the contents of eax to sum
    mov eax, num2    ; load eax with the contents of num2
    add sum, eax     ; add eax to sum

• Although the above code segment works, in that sum contains the sum of both num1 and num2, it is not necessarily implementing the original C statement sum = num1 + num2; but rather it is implementing the following C code segment:
    sum = 0;
    sum = sum + num1;
    sum = sum + num2;

• Note again that there is no memory-to-memory form of these instructions, so one operand must pass through a register. As before, a simple high-level subtraction statement such as
    difference = num2 - num1;
• would be implemented in assembly language as follows:
    ; difference = num2 - num1
    mov eax,num2          ; load num2 into eax
    sub eax,num1          ; subtract num1 from eax
    mov difference,eax    ; store answer in variable difference

Multiplication and Division
• While addition and subtraction seem to be fairly straightforward, multiplication and division can be just a little more complicated. When adding two numbers together, it is possible that the answer will be too large for the register or memory location that must hold it, which would cause an overflow error. For example, adding the numbers 999 and 999 in base 10 results in the number 1,998, which is one digit longer than the two original numbers. The same applies to base 2, where adding the numbers 111 and 111 results in the number 1110.
• However, with multiplication, the situation is worse. For example, when multiplying the numbers 999 and 999 in base 10, the answer is 998,001, which has not just one extra digit but twice as many digits as the original numbers. In general, a product can need up to twice as many digits as its operands. The same holds true for binary, where multiplying the numbers 111 and 111 results in the answer 110001, where again there are twice as many digits.
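The doubling of digits can be checked in C by widening both operands before multiplying. This is a small sketch; the function names are invented for illustration and are not part of any library.

```c
#include <stdint.h>

/* Full product of two 32-bit signed values; 64 bits always suffice. */
int64_t wide_product(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Returns 1 if the product would still fit in a 32-bit signed register. */
int product_fits_in_32_bits(int32_t a, int32_t b)
{
    int64_t p = wide_product(a, b);
    return p >= INT32_MIN && p <= INT32_MAX;
}
```

Here wide_product(999, 999) gives 998001, which still fits in 32 bits, whereas 65536 * 65536 is 4294967296 and does not, so a 32-bit multiply needs a 64-bit place to put its result.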

• The one-operand versions of the signed multiplication instruction, imul with a register operand and imul with a memory operand, work as follows. The eax register must first be loaded with the number that needs to be multiplied (the multiplicand).
• Then, the number to be multiplied by (the multiplier) is either placed into a register or located in a memory location. Note that with the one-operand imul instruction, there is no provision for an immediate operand, and that the use of the eax register for the multiplicand is implied. With a 32-bit operand, the 64-bit product is left in the edx:eax register pair, so the previous contents of edx are overwritten.

• Given the above, one can implement the following C instruction
    product = num1 * num2;
• as follows in assembly language:
    ; product = num1 * num2
    mov eax,num1       ; load eax with the contents of num1
    imul num2          ; multiply eax by num2
    mov product,eax    ; store eax in product

• Given the above description of the imul instruction, how would one implement the following C statement?
    product = num1 * 2;
• Because the one-operand imul does not accept an immediate operand, the constant must first be loaded into a register:
    ; product = num1 * 2
    mov eax, num1       ; load eax with the contents of num1
    mov ebx,2           ; load ebx with the value 2
    imul ebx            ; multiply eax by ebx
    mov product, eax    ; store eax in product

• The following instructions extend the sign of a value, which is needed to prepare the dividend of a signed division:
    Opcode    Meaning                   Description
    cbw       Convert byte to word      Extends the sign from al to ax
    cwd       Convert word to double    Extends the sign from ax into the dx:ax pair
    cdq       Convert double to quad    Extends the sign from eax into the edx:eax pair
• For example, what if one wanted to implement the following C statement?
    answer = number / amount;
• The solution to the previous C code is as follows:
    ; answer = number / amount
    mov eax,number    ; load eax with number
    cdq               ; propagate sign bit into the edx register
    idiv amount       ; divide edx:eax by amount
    mov answer,eax    ; store eax in answer
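What cdq contributes can be modelled in C. This is a sketch with invented function names: the first function returns the value cdq leaves in edx, and the second rebuilds the 64-bit dividend that idiv reads from the edx:eax pair.

```c
#include <stdint.h>
#include <string.h>

/* What cdq leaves in edx: all ones if eax is negative, otherwise zero. */
uint32_t cdq_high_half(int32_t eax)
{
    return eax < 0 ? 0xFFFFFFFFu : 0u;
}

/* Rebuilds the signed 64-bit dividend held in the edx:eax pair. */
int64_t edx_eax_pair(uint32_t edx, uint32_t eax)
{
    uint64_t bits = ((uint64_t)edx << 32) | eax;
    int64_t value;
    memcpy(&value, &bits, sizeof value);   /* reinterpret the same 64 bits */
    return value;
}
```

For number = -6, the pair built with cdq_high_half(-6) is -6 as expected, but if edx were left at zero the same eax bits would be read as 4294967290, which is why the cdq must come before the idiv.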

Implementing Unary Operators: Increment, Decrement, and Negation
• In high-level languages, the arithmetic operations presented in the previous two sections are known as binary operators, not because they perform arithmetic on binary numbers but rather because they have two operands, as in x + y.
• Although it is possible to implement all of the arithmetic needed for unary operators with the instructions presented previously, there are some extra arithmetic instructions that tend to take up a little less memory, might be a little faster, and also make life a little easier for the assembly language programmer.
• For example, one might need to increment a variable x by 1 and decrement a variable y by 1, such as
    x = x + 1;
    y = y - 1;
• or, alternatively, using the increment and decrement operators,
    x++; or ++x;
    y--; or --y;

• The above can be implemented by merely using the add and sub instructions:
    add x,1
    sub y,1
• Alternatively, the inc and dec instructions can be used. Each takes either a register or a memory operand:
    Instruction    Instruction
    inc reg        dec reg
    inc mem        dec mem
• In fact, on older 16-bit processors, if one needed to add or subtract the number 2 to or from a register, it was faster to use two inc or dec instructions than it was to use a single add or sub instruction. Although this is not true with newer 32-bit processors, a single inc or dec instruction is still more memory efficient than using an add or a sub instruction to increment or decrement by 1.

• To help understand the order of operation and sharpen one’s skills using assembly language arithmetic instructions, this section examines how slightly more complicated arithmetic statements might be implemented. Again, it helps to first write the statement out as a high-level instruction:
    answer = num1 + 3 - num2;
    mov eax,num1      ; load eax with the contents of num1
    add eax,3         ; add 3 to eax
    sub eax,num2      ; subtract num2 from eax
    mov answer,eax    ; store the result in answer
• As before, there is usually more than one way to solve a problem in assembly language, as the following code segment suggests. Note that, unlike the first version, it also changes the value stored in num1:
    add num1,3        ; add 3 to num1
    mov eax,num1      ; load num1 into eax
    sub eax,num2      ; subtract num2 from eax
    mov answer,eax    ; store the result in answer
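The difference between the two code segments can be made visible in C by modelling num1 and num2 as memory locations. This is an illustrative sketch; the function names are invented, and the local variable eax merely stands in for the register.

```c
/* First version: num1 is only read. */
int answer_v1(const int *num1, const int *num2)
{
    int eax = *num1;   /* mov eax,num1   */
    eax += 3;          /* add eax,3      */
    eax -= *num2;      /* sub eax,num2   */
    return eax;        /* mov answer,eax */
}

/* Second version: the add writes its result back to num1 in memory. */
int answer_v2(int *num1, const int *num2)
{
    *num1 += 3;        /* add num1,3     */
    int eax = *num1;   /* mov eax,num1   */
    eax -= *num2;      /* sub eax,num2   */
    return eax;        /* mov answer,eax */
}
```

Both functions return the same answer, but after answer_v2 the variable behind num1 is 3 larger than before, so the second segment is only equivalent to the C statement when num1 is not needed afterwards.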

• To illustrate further the rules concerning order of operation, consider the following C statement:
    answer = num1 + 3 * num2;
• Because multiplication has higher precedence than addition, 3 * num2 must be computed first and num1 added afterwards:
    ; answer = num1 + 3 * num2
    mov eax,3         ; load eax with the number 3
    imul num2         ; multiply eax by num2
    add eax,num1      ; add the contents of num1 to eax
    mov answer,eax    ; store the contents of eax in answer

• Assuming that Value1 is stored in the EAX register and Value2 is stored in the EBX register, a swap of the two values can be coded as
    xchg eax, ebx    ; swap Value1 and Value2
• Instead of using the xchg instruction, one could code the swap with three mov instructions, using ECX as a temporary register:
    mov ecx, eax     ; swap Value1 and Value2
    mov eax, ebx
    mov ebx, ecx
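The three-instruction version is the familiar swap through a temporary. A C sketch of the same steps follows; the function name swap_values is invented for illustration, and the pointer parameters stand in for the EAX and EBX registers.

```c
#include <stdint.h>

/* Swaps two 32-bit values through a temporary, as the three mov
 * instructions do with ecx. */
void swap_values(uint32_t *value1, uint32_t *value2)
{
    uint32_t temp = *value1;   /* mov ecx, eax */
    *value1 = *value2;         /* mov eax, ebx */
    *value2 = temp;            /* mov ebx, ecx */
}
```

The xchg instruction does the same job in one step and without needing the extra register.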
