Computer Organization

X86 Assembly Language


Handouts + IBM PC Assembly Language & Programming, Peter Abel, Prentice Hall, 5th edition.

Chapters: 1, 4, 6, 7, 8


Evolution of Microprocessor


Evolution of Microprocessor cont.


Basic Concepts


What Are Registers?

  • You can think of registers as variables inside the CPU chip

  • On the 8086, they are all 16 bits wide


General Purpose Registers

  • AX, BX, CX, and DX: They can be assigned any value you want

    • AX (Accumulator Register): Most arithmetic operations are done with AX

    • BX (Base Register): Used for array operations. BX usually holds a base address and is combined with other registers such as SI or DI to address memory

    • CX (Counter Register): Used as a counter

    • DX (Data Register): Used for storing data values


Index Registers

  • SI and DI: Usually used to process arrays or strings:

    • SI (Source Index): usually points to the source array

    • DI (Destination Index): usually points to the destination array


Segment Registers

  • CS, DS, ES, and SS:

    • CS (Code Segment Register): Points to the segment of the running program. We may NOT modify CS directly

    • DS (Data Segment Register): Points to the segment of the data used by the running program. You can point it anywhere you want, as long as that segment contains the desired data

    • ES (Extra Segment Register): Usually used with DI for pointer operations. The pairs DS:SI and ES:DI are commonly used for string operations

    • SS (Stack Segment Register): Points to the stack segment


Pointer Registers

  • BP, SP, and IP:

    • BP (Base Pointer): used to address parameters and local variables on the stack

    • SP (Stack Pointer): points to the top of the current stack

    • IP (Instruction Pointer): holds the offset of the current instruction of the running program. It is always coupled with CS, so the pair CS:IP is a pointer to the current instruction. You can NOT assign to IP directly; CS and IP change only through jumps, calls, returns, and interrupts


16-bit Register

  • The general registers AX, BX, CX, and DX are 16-bit

  • However, each is composed of two smaller registers. For example, in AX:

    The high 8 bits are called AH, and the low 8 bits are called AL. Both AH and AL can be accessed directly

  • However, since together they make up AX:

    • Modifying AH modifies the high 8 bits of AX

    • Modifying AL modifies the low 8 bits of AX

  • AL occupies bits 0 to 7 of AX, AH occupies bits 8 to 15 of AX
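
The relationship between AX, AH, and AL can be sketched in a few lines. This is a Python model of the bit layout, not assembly, and the function names (split_ax, set_al, set_ah) are made up for illustration:

```python
# Illustrative model of the 16-bit AX register and its 8-bit halves.

def split_ax(ax):
    """Return (AH, AL) for a 16-bit AX value."""
    return (ax >> 8) & 0xFF, ax & 0xFF

def set_al(ax, value):
    """Writing AL replaces bits 0-7 of AX and leaves AH untouched."""
    return (ax & 0xFF00) | (value & 0xFF)

def set_ah(ax, value):
    """Writing AH replaces bits 8-15 of AX and leaves AL untouched."""
    return (ax & 0x00FF) | ((value & 0xFF) << 8)

ax = 0x1234
ah, al = split_ax(ax)   # AH = 12h, AL = 34h
ax = set_al(ax, 0xFF)   # AX is now 12FFh
ax = set_ah(ax, 0x00)   # AX is now 00FFh
```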


Extended Register

  • The 80386 processor introduced extended registers

  • Most of the registers, except the segment registers, were widened to 32 bits

  • So, we have the extended registers EAX, EBX, ECX, and so on

  • AX is only the low 16 bits (bits 0 to 15) of EAX

  • There is NO direct access to the upper 16 bits (bits 16 to 31) of an extended register


Flag Register

  • The flag register is a 16-bit register that contains the CPU status

  • It holds values that programmers may need to check, such as whether the last arithmetic operation produced a zero result or an overflow

  • It cannot be read or written as a whole with mov; rather, it is accessed via the stack (with pushf and popf)

  • You can test each flag with a bitwise AND operation, since each status is represented by just 1 bit


Flag Register cont.

  • C carry flag: set to 1 whenever the last arithmetic operation, such as an addition or subtraction, produced a carry or borrow, otherwise 0

  • P parity flag: set to 1 if the low byte of the last arithmetic or logical result contains an even number of 1 bits

  • A auxiliary flag: set when there is a carry out of bit 3. It is used in Binary Coded Decimal (BCD) operations

  • Z zero flag: set to 1 if the last arithmetic or logical operation produced a zero result

  • S sign flag: used to detect whether the last operation produced a negative result. It is set to 1 if the highest bit (bit 7 in bytes or bit 15 in words) of the result is 1


Flag Register cont.

  • T trap flag: used by debuggers to turn on the step-by-step feature

  • I interrupt flag: used to enable or disable interrupts. If the bit is set (= 1), interrupts are enabled, otherwise they are disabled. The default is on

  • D direction flag: used for the direction of string operations. If the bit is set, all string operations are done backward, otherwise forward. The default is forward (0)

  • O overflow flag: used to detect whether the last arithmetic operation overflowed. If the bit is set, there has been a signed overflow
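
To make the flag definitions concrete, here is a small Python model (not assembly) of how an 8-bit addition would set the arithmetic flags described above. The function name add8_flags and the dictionary layout are assumptions made for this sketch:

```python
def add8_flags(a, b):
    """Add two 8-bit values; return (result, flags) as an 8-bit ADD would leave them."""
    a &= 0xFF
    b &= 0xFF
    full = a + b
    result = full & 0xFF
    flags = {
        "C": int(full > 0xFF),                        # carry out of bit 7
        "P": int(bin(result).count("1") % 2 == 0),    # even number of 1 bits
        "A": int((a & 0x0F) + (b & 0x0F) > 0x0F),     # carry out of bit 3
        "Z": int(result == 0),                        # result is zero
        "S": (result >> 7) & 1,                       # copy of the highest bit
        # signed overflow: both operands differ in sign from the result
        "O": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
    return result, flags

# 7Fh + 01h = 80h: no carry, but the signed result overflows (127 + 1)
result, flags = add8_flags(0x7F, 0x01)
```

Adding FFh and 01h instead gives a zero result with the carry and zero flags set and the overflow flag clear, which shows that carry and overflow are independent.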


Memory

  • The 8086 CPU only has 16-bit registers, so the maximum amount of memory that a single register can address is:

    2^16 = 65536 bytes (64 KB)

  • However, the 8086 has a 20-bit address bus, so the addressable memory is 1 MB. That is 16 times bigger than what one register can reach

  • Segmentation: means the memory is divided virtually into several areas called segments

  • The segment registers are 16 bits wide

  • The idea of segmentation is NOT to divide 1 MB into 16 separate parts


Memory cont.

  • Interleaved (overlapping): segment number 0 gives access to memory addresses 0 to 65535. Segment number 1 gives access to addresses 16 to 65551, segment 2 to addresses 32 to 65567, and so on in increments of 16

Seg 2: 32 to 65567

Seg 1: 16 to 65551

Seg 0: 0 to 65535


Memory Interleaved

  • Why did they do that?

    It makes memory management easier for the operating system (OS)

    The OS therefore aligns loaded code on a 16-byte boundary


Memory cont.

  • Memory access must be done through a pair of registers

  • The first is a segment register and the second is an offset register, usually BX, BP, SI, or DI

  • The register pair is usually written like this: ES:DI, with a colon between them

  • The pair is called the Segment:Offset pair

    So, ES:DI means that the segment part is given by ES, and the offset part is given by DI


Memory cont.

A logical address (segment:offset) maps to an absolute, or physical, address.

Example:

  • If ES contains 1 and DI is 5, it does not mean that we access memory address 5

  • If ES:DI = 0001:0005, then it actually accesses the physical address 21

    (1 * 16 + 5 = 21)

  • So, 0000:0015 and 0001:0005 (both written in hexadecimal; 15h = 21) are actually the same address
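
The segment:offset arithmetic can be checked with a short Python sketch. The function name physical_address is hypothetical; the wrap-around at 1 MB reflects the 20-bit address bus of the 8086 and would differ on later processors:

```python
def physical_address(segment, offset):
    """Translate a real-mode segment:offset pair into a 20-bit physical address."""
    return ((segment << 4) + offset) & 0xFFFFF   # segment * 16 + offset

# Two different logical addresses that name the same byte
a = physical_address(0x0001, 0x0005)   # 1 * 16 + 5 = 21
b = physical_address(0x0000, 0x0015)   # 0 * 16 + 21 = 21
```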


Stacks

  • The stack (LIFO) is a temporary area for storing temporary values

  • It is mainly used to pass parameter values to procedures or functions

  • Sometimes, it also acts as temporary space for local variables. Therefore, the role of the stack is very important


Interrupts

  • Upon an interrupt request, the CPU stores the context of the running program, then jumps to the interrupt routine

  • After processing the interrupt, the processor restores the stored state and resumes the program

  • There are 3 kinds of interrupts:

  • Hardware interrupts occur if a hardware device inside your computer needs immediate processing

  • Software interrupts occur if the running program itself requests to be interrupted to do something else

  • CPU-generated interrupts occur if the processor detects that something is wrong with the running code (for example, dividing a number by 0)


Why Assembly?

  • It's difficult

  • Error prone

  • Hard to debug

  • Takes a lot of time to develop


Why Assembly?

However:

  • Assembly can be fast. Carefully hand-written assembly can run faster than the code a compiler produces

  • Assembly is a lot closer to machine level than any other language, because assembly instructions map almost one-to-one to machine instructions

  • Assembly code can be a lot smaller than the code a compiler produces

  • In assembly, we can do a lot of things that we can't do in a higher-level language


Notes

  • The assembly language is NOT case-sensitive

  • A comment in assembly begins with a semicolon (;). Everything after a semicolon until the end of the line is ignored


COM Structure

ideal

p286n

model tiny

codeseg

org 100h

jmp start

; your data and subroutine here

start:

mov ax, 4c00h

int 21h

end


COM Program Explanation

  • ideal says that we're using the ideal syntax of TASM

  • p286n or .286 says that we're using 80286 processor instructions

  • model tiny or .model tiny says that we're using the COM format

  • codeseg or .code says that this is the beginning of our code

  • org 100h says that the program starts at offset 100h, which is where COM programs are loaded

  • COM programs almost always begin with a jump, i.e. a jump to the beginning of the code. You place your variables between the jump and the beginning of your code. The jump is written with the word jmp followed by a label (here we call it start)

  • After the label start, the next two lines are just the code to terminate your program

  • end or .end entry specifies the end point of your program


Making Labels

  • Pick any name and follow it with a colon (:)

  • A label usually serves as a tag for a place you'd like to jump to

  • You have to pick a unique name for each label, otherwise the assembler will fail

  • There is a way to make a label local: prefix the label name with @@ and still end it with a colon


Variables in Assembly


Variables Declaration

  • Our ideal syntax (TASM based) looks like this:

    Ideal

    p286n

    model tiny

    codeseg

    org 100h

    jmp start

    ; your data and subroutine here (this is a comment)

    start:

    mov ax, 4c00h

    int 21h

    end

  • Put variable declarations after the jmp start statement.


Variables Declaration

bits db 101001b

var2 dw 4567h

var3 dw 0BABEh

  • There are 3 main types of variable declarations in assembly:

    • db declares a byte (1 byte)

    • dw declares a word (2 bytes)

    • dd declares a double-word (4 bytes)

  • The declaration syntax is as follows:

    var_name db value

Ideal

P286n

model tiny

Codeseg

org 100h

jmp start

score db 100

year dw 2001

money dd 1000000

start:

mov ax, 4c00h

int 21h

end


Variables Declaration cont.

  • Variable Limits and Negative Values

  • You can assign negative values to variables, too. The assembler converts them to the corresponding 2's complement value. For example: if you assign -1 to a db variable, the assembler stores it as 255 (FFh)

2’s Complement
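
As a rough illustration of the 2's complement conversion, the following Python sketch shows how a negative value maps to the byte or word that is actually stored. The helper names are assumptions for this example only:

```python
def to_twos_complement(value, bits=8):
    """Return the unsigned bit pattern that stores `value` in `bits` bits."""
    return value & ((1 << bits) - 1)

def from_twos_complement(raw, bits=8):
    """Interpret an unsigned bit pattern as a signed 2's complement number."""
    sign_bit = 1 << (bits - 1)
    return raw - (1 << bits) if raw & sign_bit else raw

stored_byte = to_twos_complement(-1)        # what a db variable holding -1 contains
stored_word = to_twos_complement(-1, 16)    # what a dw variable holding -1 contains
```

The same bit pattern (FFh) therefore reads as 255 when treated as unsigned and as -1 when treated as signed.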


Moving Around Values

  • If you need to do calculations or run commands involving variables, you have to load the variable values into registers first

  • The syntax of the mov command is: mov a, b

    which means assign b to a

  • For example, to copy var2 to var1 in main memory (MM) through a register:

    mov ax, [var2]

    mov [var1], ax


Moving Around Values: example

jmp start

our_var dw 10

start:

mov bx, [our_var]

mov cx, bx

mov [our_var], cx

mov ax, 4c00h

int 21h

end

The square brackets [ ] refer to the value stored in the variable, as opposed to the variable's address


Moving Around Values cont.

  • When we deal with byte variables (i.e. db), we need to use byte registers (e.g. AL, AH, BL, BH, and so on)

  • AX, BX, CX, DX, and so on are word registers

  • You can use double-word registers, which are available on 80386 processors or better (use p386n instead of p286n to enable them)

  • The double-word registers include EAX, EBX, ECX, EDX, and so on


Moving Around Values cont.

  • We can assign constants to variables directly with the mov instruction:

    mov [word ptr our_var], 1

    Notice that the word ptr modifier must be used when you assign constants to variables. Since our_var is a word variable, we need the word ptr modifier

    Likewise, a byte variable uses the byte ptr modifier and a double-word variable uses dword ptr


Moving Around Values example

Notice the way Intel processors store a word value

The least significant byte is stored first, followed by the most significant byte


Big-endian & Little-endian

  • These terms describe the order in which a sequence of bytes is stored in a computer's memory

  • In a big-endian system, the most significant byte in the sequence is stored at the lowest storage address (i.e., first)

  • In a little-endian system, the least significant byte in the sequence is stored first
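
The two byte orders can be compared with Python's built-in int.to_bytes and int.from_bytes. This is only a sketch of the memory layout; the sample values 4567h and 1234BABEh are arbitrary:

```python
word = 0x4567

little = word.to_bytes(2, "little")   # order used by Intel x86: 67h, 45h
big = word.to_bytes(2, "big")         # most significant byte first: 45h, 67h

# A double word is stored the same way, least significant byte first
dword_little = (0x1234BABE).to_bytes(4, "little")   # BEh, BAh, 34h, 12h

# Reading two bytes 02h, 05h from memory as a little-endian word
loaded = int.from_bytes(bytes([0x02, 0x05]), "little")   # 0502h
```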


Moving Around Values cont.

  • Recall that variables in assembly are treated as addresses

AX <- 0502h


Moving Around Values cont.

  • Double-word variables are also stored similarly, least significant byte first

    my_var dd 1234BABEh

    is stored in memory as the bytes BEh, BAh, 34h, 12h


Impacts on Registers

  • Recall that the word register AX consists of AH and AL

  • Modifying either AH or AL will modify the contents of AX

  • Likewise, modifying AX will likely modify AH, AL, or both


Question Marks on Variables

  • If you do not care about the initial value of a variable, you can give a question mark ("?") instead. For example:

    another_var dw ?

    String Variables

  • You can define string variables in assembly, as follows:

    message db "Hello World!$"

    String variables must be stored as db variables. The string is surrounded by quotes, either single or double


String Variables

  • Why do we have to end our string with a dollar sign ("$")? Because the DOS service that prints strings (interrupt 21h, service 09h) uses "$" as the end-of-string marker

  • Each character of the string is converted to its corresponding ASCII code

message db "Hello World!$"


Multi-Valued Variables

  • A variable defined with db has each of its values stored as a byte

  • However, there is no restriction on how many values we can define under one variable name

multivar db 12h, 34h, 56h, 78h, 00h, 11h, 22h, 00h


Multi-Valued Variables

  • So multi valued variables are stored contiguously

    multivar2 dw 1234h, 5678h, 0011h, 2200h


Using dup

  • Another way to declare a multi-valued variable is to use the dup command:

    my_array db 5 dup (00h)

    The example above is equivalent to:

    my_array db 00h, 00h, 00h, 00h, 00h

    dup is a shortcut for defining many copies of the same value

  • Of course you can also define something like this:

    bar_array db 10 dup (?)


Arithmetic Instructions


Addition & Subtraction


Addition & Subtraction

  • The instructions are add x, y (x = x + y) and sub x, y (x = x - y)

  • You may add constants to variables or subtract constants from them. But don't forget to add word ptr or dword ptr as appropriate

  • If an addition produces a carry out of the highest bit (an unsigned overflow), the carry flag is set to 1, otherwise it is 0

  • Similarly, if a subtraction requires a borrow, the carry flag is also set to 1, otherwise it is 0


Addition & Subtraction

  • Suppose you'd like to add 32-bit integers using 16-bit registers

  • Intel processors have a special instruction called adc (add with carry)

  • For subtraction, there is a similar instruction called sbb (subtract with borrow)
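
The idea behind adc can be sketched as follows: add the low words first, then add the high words together with the carry left by the first addition. This is a Python model of the arithmetic rather than assembly, and the function name is hypothetical:

```python
def add32_with_16bit_halves(a, b):
    """Add two 32-bit values using only 16-bit additions, as add + adc would."""
    a_lo, a_hi = a & 0xFFFF, (a >> 16) & 0xFFFF
    b_lo, b_hi = b & 0xFFFF, (b >> 16) & 0xFFFF

    lo_sum = a_lo + b_lo             # add: low words
    carry = lo_sum >> 16             # carry flag left by the add
    lo = lo_sum & 0xFFFF

    hi_sum = a_hi + b_hi + carry     # adc: high words plus the carry flag
    carry_out = hi_sum >> 16         # carry flag left by the adc
    hi = hi_sum & 0xFFFF

    return (hi << 16) | lo, carry_out

total, carry_out = add32_with_16bit_halves(0x0001FFFF, 0x00000001)   # 00020000h
```

Subtraction with sub and sbb follows the same pattern, with a borrow in place of the carry.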


Multiplication & Division

  • Multiplication and division always use AX (AL for byte operands, with DX holding the upper half for word operands) as an implicit operand

  • If there is an overflow in multiplication, the overflow flag will be set

  • Note: mul and div treat every number as positive (unsigned). If you have negative values, you'll need to use imul and idiv instead


Increment & Decrement

  • Often, we'd like to increment something by 1 or decrement something by 1

  • You can use add x, 1 or sub x, 1 if you'd like to, but Intel x86 assembly has special instructions for this

  • Instead of add x, 1 we use inc x. The result is the same

  • Likewise, for subtraction you can use dec x

  • Beware that neither inc nor dec changes the carry flag, whereas add and sub do


Tips

  • Arithmetic operations can have special properties

  • For example: add x, x is the same as multiplying x by 2

  • Similarly, sub x, x sets x to 0

  • On the 8086 processor, these operations are faster than doing mul or mov x, 0. Moreover, their code size is smaller


Bitwise Operations


And, Or, Xor

  • and, or, and xor each take two operands

  • Both operands can be registers, or one of them can be a variable, etc.

    The syntax is as follows: and x, y (likewise or x, y and xor x, y), where the result is stored back in x


And, Or, Xor: example

AH = 76 and AL = 45

AH = 01001100 and AL = 00101101
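
For the sample values above, the bitwise results can be worked out with Python's operators, which behave like the and, or, xor, and not instructions on 8-bit values. This is only a model of the arithmetic, not assembly:

```python
ah = 76   # 01001100
al = 45   # 00101101

and_result = ah & al      # 00001100
or_result = ah | al       # 01101101
xor_result = ah ^ al      # 01100001
not_result = ~ah & 0xFF   # 10110011 (masked to 8 bits, like not ah)

for name, value in [("and", and_result), ("or", or_result),
                    ("xor", xor_result), ("not", not_result)]:
    print(f"{name:>3}: {value:08b} = {value}")
```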


Not

  • The not operation takes a single operand and inverts each of its bits


Bit Masking & Flipping

  • Sometimes, one byte contains several pieces of information encoded in bits (like the flag register)

  • Example: Suppose AL = 00101100, but you only need the lower four bits (i.e. 1100)

  • This can be done by creating a mask, based on the behavior of and

  • Since we need only the lower four bits, the mask would be: 00001111


Bit Masking example

  • Suppose you have AL = 00101100. Now, you'd like to store the lower 4 bits of your data in CL = 00000011 into the lower 4 bits of AL
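
One way to do this, sketched in Python rather than assembly: clear the lower four bits of AL with one mask, keep only the lower four bits of CL with the opposite mask, then combine the two with or. The variable names are illustrative:

```python
al = 0b00101100
cl = 0b00000011

low_nibble = al & 0b00001111        # extract the lower four bits of AL: 1100

al_cleared = al & 0b11110000        # and al, 11110000b  -> 00100000
cl_masked = cl & 0b00001111         # and cl, 00001111b  -> 00000011
al_merged = al_cleared | cl_masked  # or  al, cl         -> 00100011
```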


Bit Masking & Flipping

  • There are times when we only want to flip some bits

  • We can use xor for this. You can observe that any bit xored with 1 is flipped, while any bit xored with 0 is unchanged

  • Suppose we'd like to flip the middle four bits of AL:
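
A sketch of the flip in Python, assuming "the middle four bits" means bits 2 to 5, so that the mask is 00111100:

```python
al = 0b00101100
mask = 0b00111100          # 1 marks each bit to flip

flipped = al ^ mask        # xor al, 00111100b -> 00010000
restored = flipped ^ mask  # flipping twice gives back the original value
```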


Bit Shifting

  • Shifting left by one position (shl x, 1) means dropping the leftmost bit, moving the remaining bits one place to the left, then adding one 0 at the right end

  • Shifting right (shr) is analogous

  • In shl x, y and shr x, y, x is a register or a variable and cannot be a constant; y is a constant or the CL register (on the 8086, the only constant allowed is 1)

  • What happens to the bits that get shifted out?

    The carry flag holds the last shifted-out bit
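
The behavior of the shift instructions on a byte, including the carry flag, can be modeled like this. It is a Python sketch with hypothetical function names; for a count of 0 the real instructions leave the carry flag unchanged, whereas this model simply reports 0:

```python
def shl8(value, count):
    """Shift an 8-bit value left; return (result, carry flag)."""
    carry = 0
    for _ in range(count):
        carry = (value >> 7) & 1       # the leftmost bit falls into the carry flag
        value = (value << 1) & 0xFF    # a 0 enters on the right
    return value, carry

def shr8(value, count):
    """Shift an 8-bit value right; return (result, carry flag)."""
    carry = 0
    for _ in range(count):
        carry = value & 1              # the rightmost bit falls into the carry flag
        value >>= 1                    # a 0 enters on the left
    return value, carry

left, left_carry = shl8(0b10110001, 1)     # 01100010, carry = 1
right, right_carry = shr8(0b10110001, 1)   # 01011000, carry = 1
```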


Shift and Rotate


Bit Rolling

  • Bit rolling (rol) is similar to bit shifting. Instead of being shifted out, the bits get rolled back in at the other end

  • Rolling to the right (ror) is similar

  • There is another variant of rolling bits that goes through the carry flag. Rolling bits through the carry flag is done by rcl and rcr
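
The difference between a plain roll and a roll through the carry flag can be sketched in Python. The function names mirror the instruction names but are otherwise made up for this example, and rcl8 models a rotation by one position only:

```python
def rol8(value, count):
    """Roll an 8-bit value left: bits leaving on the left re-enter on the right."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF

def ror8(value, count):
    """Roll an 8-bit value right: bits leaving on the right re-enter on the left."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF

def rcl8(value, carry):
    """Roll left by one through the carry flag; return (result, new carry)."""
    new_carry = (value >> 7) & 1                    # leftmost bit goes to the carry
    return ((value << 1) | carry) & 0xFF, new_carry # old carry enters on the right

rolled_left = rol8(0b10000001, 1)    # 00000011
rolled_right = ror8(0b10000001, 1)   # 11000000
```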


Shift and Rotate cont.


Branching & Loop Instructions


Unconditional & Conditional Jumps

  • Conditional jumps always consider some condition

  • If the condition is satisfied, then the jump is taken, otherwise it is not

  • The conditions are usually reflected in the processor flags

  • On the other hand, unconditional jumps do not depend on any condition

  • So, they are more like goto in a sense


Making Labels

  • Labels are essential to jump instructions

  • A label marks the destination of a jump. Making labels in assembly is easy

  • Labels can be made like this:

    example:

  • So, we can pick any name and put a colon (:) after it

  • You must make sure that all label names throughout your program are unique, with no duplicates


Unconditional Jumps

  • For an unconditional jump, the instruction is jmp

  • Unconditional jumps pay no regard to conditions. So, whenever the processor arrives at the instruction jmp somewhere, it skips all the instructions in between and continues at the instruction marked by the label somewhere


Conditional Jumps

  • Before the jump instruction, we (usually) have to put a comparison or testing instruction

  • The comparison instruction is cmp


Conditional Jumps cont.


Conditional Jumps cont.

  • Note that jg, jge, jl, and jle work for signed variables only

  • For unsigned variables, use ja ("jump if above"), jae, jb ("jump if below"), and jbe as the respective substitutes

  • The rest (i.e. je, jne, and jc) work with both signed and unsigned variables
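
Why two families of jumps are needed can be seen by comparing the same bit patterns under both interpretations. The Python sketch below models the outcome that ja and jg would act on after cmp x, y for byte operands; the function names are hypothetical:

```python
def as_signed8(raw):
    """Interpret an 8-bit pattern as a signed 2's complement number."""
    return raw - 256 if raw & 0x80 else raw

def is_above(x, y):
    """Condition tested by ja after cmp x, y: unsigned x > y."""
    return (x & 0xFF) > (y & 0xFF)

def is_greater(x, y):
    """Condition tested by jg after cmp x, y: signed x > y."""
    return as_signed8(x & 0xFF) > as_signed8(y & 0xFF)

# FFh is 255 when unsigned but -1 when signed
unsigned_view = is_above(0xFF, 0x01)    # 255 > 1
signed_view = is_greater(0xFF, 0x01)    # -1 > 1
```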


Testing Instruction

  • The syntax of the test instruction:

    test x, y

  • It behaves like and, but it does not store the result back in x

  • So it only computes x and y in order to set the flags

  • After this instruction, we usually check whether the result of the and-ing is zero or not, using jz ("jump if zero") or jnz ("jump if not zero")


Testing Instruction example 1

Add 1+2+3+...+10


Testing Instruction example 2

8! Factorial


Loop Construct

  • The loop instruction (loop mylabel) builds a structure just like the do..while construct in C/Java

  • When the processor executes the loop instruction, it first decreases the register CX by one

  • After that, CX is tested for zero. If it is not zero, the processor jumps to mylabel

  • It is a kind of countdown counter


Loop Construct example

Let's take 1+2+...+10 example
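
The countdown behavior of the loop instruction for this sum can be modeled in Python. This is a sketch of the control flow only; the comments show which assembly instruction each step stands for, and the function name is made up:

```python
def sum_with_loop(n):
    """Model of a loop-instruction countdown that computes n + (n-1) + ... + 1."""
    ax = 0                        # mov ax, 0
    cx = n                        # mov cx, n
    while True:                   # mylabel:
        ax += cx                  #   add ax, cx
        cx = (cx - 1) & 0xFFFF    #   loop mylabel: decrement CX first...
        if cx == 0:               #   ...and fall through once CX reaches zero
            break
    return ax

total = sum_with_loop(10)   # 10 + 9 + ... + 1
```

Because the body runs before CX is tested, starting with CX = 0 does not skip the loop: CX wraps around to FFFFh and the body runs 65536 times.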


Interrupt Essentials


Introduction to Interrupt

  • An interrupt is just like a procedure provided by the system, and you can invoke it

  • The two lines mov ax, 4c00h and int 21h that end our programs actually request the operating system to terminate the program

  • An interrupt is called using the int instruction with a number after it

  • This number is referred to as the Interrupt Number


Introduction to Interrupt cont.

  • The interrupt number alone is not enough

  • An interrupt behaves differently depending on which Service Number is called

  • Service numbers are usually placed in AH

  • The Sub-Service number is usually placed in AL

  • This interrupt mechanism is pretty much like a phone number


Output to Screen


Output to Screen

  • After the start label we invoke interrupt number 21h, service 09h

  • Interrupt 21h is reserved for Operating System calls

  • When you look up what service 09h does on interrupt 21h in an interrupt list, you find that it prints the "$"-terminated string pointed to by DS:DX

  • To insert a new line, simply append the carriage return and line feed characters (13, 10) to the message declaration, before the "$"


Input from Keyboard

  • Interrupt 21h service 0Ah offers a means to read input from the keyboard. The interrupt lists say:


Input from keyboard example

Buffer


Output: A Better Version

  • One way to cope with the "$" issue is to output the characters one by one using a loop

  • The loop terminates if the character being read is 0

  • ASCII code 0 is the null character and is commonly used as a terminator

  • Interrupt 21h, service 06h is used to print one character on screen


Input one Character


Number to String

  • The output routines we discussed so far are intended only for outputting strings

  • How can we output numbers?

  • We have to convert the numbers to string first
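
A common way to do the conversion is to divide repeatedly by 10: each remainder is one decimal digit, produced from least significant to most significant, so the digits must be reversed at the end. The Python sketch below shows the algorithm only, under the assumption of a non-negative number; in assembly, the division would be done with div and the function name here is hypothetical:

```python
def number_to_string(value):
    """Convert a non-negative integer to its decimal string, one digit per division."""
    digits = []
    while True:
        value, remainder = divmod(value, 10)       # div by 10: quotient and remainder
        digits.append(chr(ord("0") + remainder))   # remainder 0-9 -> ASCII "0"-"9"
        if value == 0:
            break
    return "".join(reversed(digits))               # digits came out in reverse order

text = number_to_string(2001)
```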


Stacks


Why Stack?

There are several reasons why we need stacks:

  • To save register values if we run out of registers

  • To pass parameters to subroutines

  • To make space for local variables in subroutines

  • To preserve original register values if we change them in a subroutine

  • To fetch the processor flag status


Stack Operations

  • Last in, first out (LIFO)

  • Stack operations are mainly done by two instructions: push and pop

  • The instruction push pushes a value onto the stack, while pop pops it off

  • The syntax is like this: push X and pop X

  • The operand X is a 16-bit register or variable

  • An 8-bit register cannot be pushed on its own; the processor always pushes a 16-bit value


Memory Layout

  • You should know that register CS by default points to the segment where the code resides. DS points to the data segment. ES usually points to the data segment too. SS points to the stack segment. In a COM program (tiny model), CS, DS, ES, and SS all point to the same segment, which means code, data, and stack reside in the same region

[Figure: main memory (MM) in two layouts. In one, the code, data, extra, and stack segments share a single region that CS, DS, ES, and SS all point to. In the other, the code, data, extra, and stack segments are separate regions pointed to by CS, DS, ES, and SS respectively.]


How can we manage this?

  • The stack is pointed to not only by the SS register, but also by the SP register

  • So, the pair SS:SP points to the top of the stack. Initially, in "tiny" mode, SP is set to the very end of the segment, at address FFFEh

  • Each time we push something onto the stack, SP is decremented by 2. If we pop something, SP is incremented by 2

  • Our code and our data, on the other hand, start at offset 100h
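
The way SP moves can be modeled with a small Python class. This is a sketch under the assumptions stated above (one 64 KB segment, SP starting at FFFEh); the class name Stack16 is made up for illustration:

```python
class Stack16:
    """Toy model of a 16-bit stack living at the end of one 64 KB segment."""

    def __init__(self):
        self.memory = bytearray(0x10000)   # the whole segment
        self.sp = 0xFFFE                   # initial stack pointer

    def push(self, value):
        self.sp = (self.sp - 2) & 0xFFFF   # SP goes down by 2 first
        self.memory[self.sp:self.sp + 2] = (value & 0xFFFF).to_bytes(2, "little")

    def pop(self):
        value = int.from_bytes(self.memory[self.sp:self.sp + 2], "little")
        self.sp = (self.sp + 2) & 0xFFFF   # SP goes back up by 2
        return value

stack = Stack16()
stack.push(0x1234)      # SP = FFFCh
stack.push(0xBABE)      # SP = FFFAh
first = stack.pop()     # BABEh comes off first (last in, first out)
second = stack.pop()    # then 1234h; SP is back at FFFEh
```

Since the stack grows downward from FFFEh while code and data grow upward from 100h, the two share the segment from opposite ends.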


So, the layout looks something like this:


Application


Other Uses

  • Can we push a constant? On the 8086, NO. On the 80286 or above, YES. So, push 1 is treated as pushing a 16-bit value. There is no need to specify word ptr

  • A more useful application of push and pop is to push the flags and then pop them into a register. That way, we can examine the flag contents directly. Look at the following code:

    pushf ; stack top <- flag register

    pop AX ; AX <- stack top

  • Then we can examine the flag values in register AX. The net effect is the same as assigning the flags to AX

  • Likewise, you can set the flag values using push AX then popf


Subroutines & Macros


Subroutine Syntax


More on Parameters & Local Variables

  • Note that we cannot initialize local variables when declaring them

  • Of course you can use mov to assign a value later on

  • The parameters are passed through the stack using push and pop


A Word of Caution

  • Since procedures are built with the help of the stack, you have to remember not to modify SP and BP at any time in a subroutine

  • That is because SP stores the current stack position and BP stores the stack position from before entering the subroutine

  • Moreover, when you modify certain registers in a subroutine, you are likely to interfere with the main program


How to cope with this situation then?

  • pusha ("push all"):

    which basically stores (almost) all registers

  • popa ("pop all"):

    which pops the saved values back into the appropriate registers


How About Functions?

  • Subroutines can return values too

  • Usually, we designate registers to hold the output or result of our subroutine

  • Many programmers choose AX for this purpose. If you have more than one output from the subroutine, you can select multiple registers to hold the results

  • Because of this, the output registers need not be saved or restored: the caller itself expects those designated registers to change


Functions example

  • Let's make a subroutine to calculate 1+2+...+n


Document a Subroutine

  • It is a good habit to document a subroutine. At least give a comment above it


Routine Placement


Macros

  • Notice:

  • We use the macro and endm keywords instead

  • We do not specify parameter types

  • There is no ret instruction at the end

  • There is no call keyword


Recap

The main differences (behavior-wise) are:

  • Macros use string replacement at each invocation, whereas subroutines use calls

  • Because of this replacement, a macro can exist in multiple copies in the program, whereas a subroutine exists in only one copy

  • Because multiple copies are possible, you cannot obtain a macro's address, whereas you can obtain a subroutine's address

  • Macros can be faster, since they have no call and return time penalty

  • Macros can be harder to debug


Arrays


Array Revisited

  • To refresh our memory, declaring a ten-byte array is like this:

  • Loading the 1st element of the array into register AL is like this:

  • Accessing the 2nd, the 3rd, and the 4th elements is like this:


Access Array through a loop


Reverse array example

Note:

BX is nicknamed the 'Base register'

SI the 'Source Index'

DI the 'Destination Index'


String Instructions


  • There are five basic string instructions:

    • LES, LDS

    • MOVS

    • CMPS

    • SCAS

    • STOS, LODS

  • These instructions can be "emulated" with mov, cmp, loop, and jmp. However, the string instructions are a lot faster since they are "built-in" instructions


LES DI and LDS SI

  • String instructions typically use the DS:SI pair to denote the source string and the ES:DI pair to denote the destination string

  • When both strings are in the current segment, all we need to do is set SI and DI to the source and destination offsets respectively

    LES DI, [SomeStringVar]

    LDS SI, [OtherStringVar]

  • These instructions set both ES and DI, or both DS and SI, respectively, from a far pointer (offset and segment) stored in memory


Direction Flag

  • After setting the source and/or destination register pairs, you may want to specify how the string instruction is performed: should it go backwards or forwards?

  • Assembly can perform these instructions in both directions

  • Choosing the direction involves setting the direction flag. Intel x86 assembly has two instructions for this:

    CLD ; Clear Direction Flag

    STD ; Set Direction Flag

  • Clearing the direction flag causes string instructions to run forward. Setting it makes them run in the reverse direction


MOVS

  • The instruction movs is used to copy the source string into the destination. This instruction comes in two variants: movsb and movsw

  • Since we'd like to move several bytes at a time, the movs instructions are done in batches using the rep prefix. The number of movements is specified by the CX register
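
What rep movsb does can be modeled in Python as a loop over a byte array. This sketch simplifies the real instruction by assuming that the source and destination live in the same segment (DS = ES); the function name and the returned tuple are illustrative only:

```python
def rep_movsb(memory, si, di, cx, direction_flag=0):
    """Copy CX bytes from memory[SI] to memory[DI]; return the final (SI, DI, CX)."""
    step = -1 if direction_flag else 1   # cld -> forward, std -> backward
    while cx != 0:                       # rep: repeat until CX reaches zero
        memory[di] = memory[si]          # movsb: copy one byte
        si = (si + step) & 0xFFFF        # both index registers advance
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx

memory = bytearray(32)
memory[0:5] = b"Hello"
si, di, cx = rep_movsb(memory, si=0, di=16, cx=5)   # copies "Hello" to offset 16
```

With the direction flag set, SI and DI would have to start at the last byte of each string instead of the first.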


CMPS

  • The instruction cmps is used to compare two strings. It also has two variants: cmpsb and cmpsw

  • After the rep cmpsb (also written repe cmpsb), the zero flag is set if the two strings are equal


SCAS

  • The instruction scas is used to scan a string pointed to by ES:DI

  • It is typically used to search for a particular character in a string

  • scas has two variants: scasb and scasw. In scasb, the string at ES:DI is searched for an occurrence of the element in register AL, whereas in scasw, the element to be searched for is in AX


STOS

  • The stos instruction fills the string pointed to by the ES:DI pair with the value in AL or AX. So, it is great when you'd like to initialize arrays (usually with zeroes)

  • It has two variants: stosb and stosw. In stosb, the bytes in the string at ES:DI are replaced with whatever AL contains. In stosw, the words are replaced with whatever AX contains


LODS

  • The lods instruction loads a chunk (either a byte or a word) from the string pointed to by DS:SI into AL or AX

  • It has two variants: lodsb and lodsw

