
Why Assembly?

Why Assembly? Speed; not affected by compiler optimization. Registers that can be used without saving: r0, r18-r25, r26-r27 (X), r30-r31 (Z), and r1 (must be cleared before returning). Assembler function arguments: allocated left to right (r25 to r18), with each argument aligned to an even register.


Presentation Transcript


  1. Why Assembly? • Speed • Not affected by compiler optimization

  2. Registers that can be used without saving • r0 • r18-r25 • r26-r27 (X) • r30-r31 (Z) • r1 (must be cleared before returning)

  3. Assembler function arguments • Arguments allocated left to right (r25 to r18) • Even register aligned

  4. Single argument examples

  5. Data returned by a function

  6. Simple assembler example

     uint32_t subit(uint32_t ul, uint8_t b){ return(ul-b);}

     #include <avr/io.h>
     .text
     .global subit
     subit:
         sub r22, r20   ; subtract b (r20) from ul (r25-r22)
         sbc r23, r1    ; .. NOTE: gcc makes sure r1 is always 0
         sbc r24, r1    ; ..
         sbc r25, r1    ; ..
         ret
     .end

  7. More complex example:

     #include <avr/io.h>
     ; defines the # of cpu cycles of overhead
     ; (includes the ldi r16,byte0; ldi r17,byte1; ldi r18, byte2,
     ; ldi r19, byte3, and the call _delay_cycles)
     OVERHEAD = 24
     ; some register aliases
     cycles0 = 22
     cycles1 = 23
     cycles2 = 24
     cycles3 = 25
     temp = 19
     .text
     .global delay_cycles
     delay_cycles:
     ;; subtract the overhead
         subi cycles0,OVERHEAD        ; subtract the overhead
         sbc cycles1,r1               ; ..
         sbc cycles2,r1               ; ..
         sbc cycles3,r1               ; ..
         brcs dcx                     ; return if req’d delay too short
     ;; delay the lsb
         mov r30,cycles0              ; Z = jtable offset to delay 0-7 cycles
         com r30                      ; ..
         andi r30,7                   ; ..
         clr r31                      ; ..
         subi r30,lo8(-(gs(jtable)))  ; add the table offset
         sbci r31,hi8(-(gs(jtable)))  ; ..
         ijmp                         ; vector into table for partial delay
     jtable:
         nop
         nop
         nop
         nop
         nop
         nop
         nop
     ;; delay the remaining delay
     loop:
         subi cycles0,8               ; decrement the count (8 cycles per loop)
         sbc cycles1,r1               ; ..
         sbc cycles2,r1               ; ..
         sbc cycles3,r1               ; ..
         brcs dcx                     ; exit if done
         nop                          ; .. add delay to make 8 cycles per loop
         rjmp loop                    ; ..
     dcx:
         ret
     .end

     void delay_cycles(uint32_t cpucycles);
