ISAMAP: Instruction Mapping Driven by Dynamic Binary Translation

Maxwell Souza, Daniel Nicácio and Guido Araujo. Motivation: architecture diversity is increasing, legacy code needs to use new architecture features, and code portability between architectures is also desirable.


Presentation Transcript


  1. ISAMAP: Instruction Mapping Driven by Dynamic Binary Translation Maxwell Souza, Daniel Nicácio and Guido Araujo

  2. Motivation • Architecture diversity is increasing • There is a need for legacy code to use new architecture features • Code portability between architectures is also desirable • Dynamic Binary Translation (DBT) enables it

  3. ArchC • Processor description language • SystemC compatible • 8 researchers for the last 5 years • Features: • Fast interpreted/compiling simulation • Linux OS syscall emulation • Runs code directly from GCC (allows gdb support) • Processors: • MIPS, SPARC, PPC, 8051, ARM, OR10K, etc. • Runs Mediabench, Mibench and SPEC CInt • Simulation speed: from 100 KIPS to 570 MIPS

  4. Instruction Set Architecture (AC_ISA)

         AC_ISA(mips1){
           ac_format Type_R = "%op:6 %rs:5 %rt:5 %rd:5 0x00:5 %func:6";
           ac_format Type_I = "%op:6 %rs:5 %rt:5 %imm:16";
           ac_instr<Type_R> add;
           ac_instr<Type_I> load;
           ISA_CTOR(mips1) {
             add.set_asm("add %reg, %reg, %reg", rd, rs, rt);
             add.set_decoder(op=0x00, func=0x20);
             load.set_asm("lw %reg, %imm(%reg)", rt, imm, rs);
             load.set_decoder(op=0x23);
           };
         };

     Slide callouts: binary field, instruction declaration, decoding order.
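The format strings above describe how a 32-bit instruction word splits into fields. A minimal Python sketch of that idea follows; it is not ArchC's generated decoder. The helper names `parse_format` and `decode_fields` are hypothetical, and the sketch assumes fields are packed from the most significant bit down, in the order they are listed.

```python
def parse_format(fmt):
    """Turn '%op:6 %rs:5 ...' into a list of (name, width) pairs."""
    fields = []
    for token in fmt.split():
        name, width = token.lstrip("%").split(":")
        fields.append((name, int(width)))
    return fields


def decode_fields(word, fmt):
    """Extract every field from an instruction word, first field in the top bits."""
    fields = parse_format(fmt)
    shift = sum(width for _, width in fields)
    values = {}
    for name, width in fields:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    return values


TYPE_R = "%op:6 %rs:5 %rt:5 %rd:5 0x00:5 %func:6"

# 0x00221820 is the standard MIPS encoding of "add $3, $1, $2".
decoded = decode_fields(0x00221820, TYPE_R)

# Mirrors add.set_decoder(op=0x00, func=0x20).
is_add = decoded["op"] == 0x00 and decoded["func"] == 0x20
```

The constant field `0x00:5` is kept under its literal name here, which is a simplification made only for this sketch.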

  5. Architecture Resources (AC_ARCH)

         AC_ARCH(mips1){
           ac_mem MEM:256K;
           ac_regbank RB:32;
           ac_reg lo, hi;
           ac_pipe PIPE = {IF, ID, EX, MEM, WB};
           ac_format Fmt_EX_MEM = "%alures:32 %wdata:32 %rdest:5 %regwrite:1 %memread:1 %memwrite:1";
           ac_reg<Fmt_EX_MEM> EX_MEM;
           ac_wordsize 32;
           ARCH_CTOR(mips1) {
             ac_isa("mips1_isa.ac");
             . . . .
           };
         };

  6. Instruction Behavior (ac_behavior)

         void ac_behavior( Type_R, int stage ){
           switch(stage){
             case IF:
             case ID:
               /* Checking forwarding for the rs register */
               if ( (EX_MEM.regwrite == 1) && (EX_MEM.rdest != 0) && (EX_MEM.rdest == ID_EX.rs) )
                 operand1 = EX_MEM.alures.read();
               else if ( (MEM_WB.regwrite == 1) && (MEM_WB.rdest != 0) && (MEM_WB.rdest == ID_EX.rs) )
                 operand1 = MEM_WB.wbdata.read();
               else
                 operand1 = RB.read(rs);
               ...
             default:
               break;
           }
         }

  7. Jump and Branches Semantics • Additional information • jump(): target computation • delay(): conditional

         call.set_decoder(op=0x01);
         call.jump(ac_pc+(disp30<<2));
         call.delay(1, true);
         call.behavior(writeReg(15, ac_pc));

         be.set_decoder(op=0x00, cond=0x01, op2=0x02);
         be.branch(ac_pc+(disp22<<2));
         be.cond(PSR_icc_z);
         be.delay(1, PSR_icc_z || !an);
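The two target expressions above can be worked through numerically. The Python sketch below is illustrative only: the function names are hypothetical, and treating `disp22` as a signed (sign-extended) displacement is an assumption based on the usual SPARC branch encoding, not something the slide states.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def call_target(pc, disp30):
    # Mirrors call.jump(ac_pc + (disp30 << 2)); the sum wraps to 32 bits.
    return (pc + (disp30 << 2)) & MASK32


def branch_target(pc, disp22):
    # Mirrors be.branch(ac_pc + (disp22 << 2)); sign extension is assumed.
    return (pc + (sign_extend(disp22, 22) << 2)) & MASK32
```

With these definitions, `call_target(0x1000, 4)` yields `0x1010`, and a `disp22` of `0x3FFFFF` (all ones, read as -1) moves a branch one word backwards.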

  8. ArchC Overview [Diagram: ArchC Description → ArchC Pre-processor (acpp) → ArchC IR → Assembler Generator, Simulator Generator, Linker Generator, Back-end Generator, ISAMAP]

  9. ISAMAP • Instruction Mapping Description Driven by DBT • Descriptions use ArchC language ISA models • Source architecture ISA • Target architecture ISA • Mapping between source and target • Low-level ISA mapping

  10. Instruction Set Architecture (AC_ISA)

         ISA(powerpc) {
           isa_format XO1 = "%opcd:6 %rt:5 %ra:5 %rb:5 %oe:1 %xos:9 %rc:1";
           isa_instr <XO1> add, subf;
           isa_regbank r:32 = [0..31];
           ISA_CTOR(powerpc) {
             add.set_asm("add %reg %reg %reg", rt, ra, rb);
             add.set_decoder(opcd=31, oe=0, xos=266, rc=0);
             subf.set_asm("subf %reg %reg %reg", rt, ra, rb);
             subf.set_decoder(opcd=31, oe=0, xos=40, rc=0);
           };
         };

  11. Instruction Set Architecture (AC_ISA)

         ISA(x86) {
           isa_format op1b_r32 = "%op1b:8 %mod:2 %regop:3 %rm:3";
           isa_instr <op1b_r32> add_r32_r32, mov_r32_r32;
           isa_reg eax = 0;
           isa_reg ecx = 1;
           ...
           isa_reg edi = 7;
           ISA_CTOR(x86) {
             add_r32_r32.set_operands("add %reg %reg", rm, regop);
             add_r32_r32.set_encoder(op1b=0x01, mod=0x3);
             mov_r32_r32.set_operands("mov %reg %reg", rm, regop);
             mov_r32_r32.set_encoder(op1b=0x89, mod=0x3);
           };
         };
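To see what the `op1b_r32` format and the `set_encoder` values produce, here is a small Python sketch that packs the four fields into two bytes. It is not the ISAMAP encoder: `REGS`, `ENCODERS` and `encode` are hypothetical names, and the register numbers the slide elides (edx through esi) are filled in from the standard x86 numbering.

```python
# eax, ecx and edi come from the slide; the rest follow standard x86 numbering.
REGS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
        "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

# Fixed fields taken from the set_encoder lines.
ENCODERS = {
    "add_r32_r32": {"op1b": 0x01, "mod": 0x3},
    "mov_r32_r32": {"op1b": 0x89, "mod": 0x3},
}


def encode(instr, rm, regop):
    """Pack op1b:8 mod:2 regop:3 rm:3 into an opcode byte and a ModRM byte."""
    fixed = ENCODERS[instr]
    modrm = (fixed["mod"] << 6) | (REGS[regop] << 3) | REGS[rm]
    return bytes([fixed["op1b"], modrm])
```

For example, `encode("add_r32_r32", "eax", "ecx")` gives the bytes `01 C8`, the usual encoding of `add eax, ecx`.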

  12. ISA Mapping (source operands are numbered $0, $1, $2 from left to right)

         isamap_instrs {
           add %reg %reg %reg;
         } = {
           mov_r32_r32 edi $1;
           add_r32_r32 edi $2;
           mov_r32_r32 $0 edi;
         };    (add)

         isamap_instrs {
           subf %reg %reg %reg;
         } = {
           mov_r32_r32 edi $2;
           sub_r32_r32 edi $1;
           mov_r32_r32 $0 edi;
         };    (subf)
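The two mappings above behave like templates in which `$0`, `$1` and `$2` are replaced by the operands of the source instruction. A minimal Python sketch of that substitution follows. `MAPPINGS` and `expand` are hypothetical names, and the operand strings (`r3`, `r4`, `r5`) are placeholders; how ISAMAP actually binds source registers to target operands is not shown on the slide.

```python
# Target templates copied from the add and subf mappings.
MAPPINGS = {
    "add": ["mov_r32_r32 edi $1", "add_r32_r32 edi $2", "mov_r32_r32 $0 edi"],
    "subf": ["mov_r32_r32 edi $2", "sub_r32_r32 edi $1", "mov_r32_r32 $0 edi"],
}


def expand(mnemonic, operands):
    """Replace each $n in the templates with the n-th source operand."""
    out = []
    for template in MAPPINGS[mnemonic]:
        tokens = template.split()
        out.append(" ".join(
            operands[int(tok[1:])] if tok.startswith("$") else tok
            for tok in tokens
        ))
    return out


target = expand("subf", ["r3", "r4", "r5"])
```

Note that the `subf` template loads `$2` first and subtracts `$1`, which matches the PowerPC definition of `subf` (rt = rb - ra).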

  13. ISAMAP Flow [Diagram with boxes: Target ISA, ISA Mapping, Source ISA, acpp, ArchC, Host Code, DBT Source, DBT Libraries, Compiler, ISAMAP]

  14. Overall ISAMAP Structure • Standard DBT implementation • 16MB Code Cache • Block linkage (at first touch) • No traces • Syscall mapping • In addition it provides mapping support • Instruction semantics (load, store, branch, fp) • Register read/write status • Conditional mapping
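The code cache and first-touch block linkage listed above can be illustrated with a toy model. The Python sketch below is an assumption-laden illustration, not ISAMAP's implementation: `Block`, `CodeCache` and the `translate` callback are hypothetical, each block has a single exit, and the 16MB capacity limit is not modeled.

```python
class Block:
    def __init__(self, source_pc, successor_pc):
        self.source_pc = source_pc
        self.successor_pc = successor_pc  # source address this block exits to
        self.linked = None                # direct link, filled in lazily


class CodeCache:
    def __init__(self, translate):
        self.translate = translate        # callback: source pc -> Block
        self.blocks = {}
        self.translations = 0

    def lookup(self, pc):
        """Return the translated block for pc, translating it on a miss."""
        block = self.blocks.get(pc)
        if block is None:
            block = self.translate(pc)
            self.blocks[pc] = block
            self.translations += 1
        return block

    def successor(self, block):
        """Follow the block's exit, linking it the first time it is taken."""
        if block.linked is None:
            block.linked = self.lookup(block.successor_pc)
        return block.linked
```

Once an exit has been linked, later executions follow `block.linked` directly and no longer go through the cache lookup, which is the point of linking.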

  15. Register Read Semantics • Avoids unnecessary register reads/writes

         add_r32_r32.set_asm("add %reg, %reg", rm, regop);
         add_r32_r32.set_encoder(op1b=0x01, mod=0x3);
         add_r32_r32.set_read(regop);

         mov_r32_r32.set_asm("mov %reg %reg", rm, regop);
         mov_r32_r32.set_encoder(op1b=0x89, mod=0x3);
         mov_r32_r32.set_write(rm);

  16. Conditional Mappings

         isamap_instrs {
           or %reg %reg %reg;
         } = {
           if ($1 = $2) {
             mov_r32_m32disp edi $1;
             mov_m32disp_r32 $0 edi;
           }
           else {
             mov_r32_m32disp edi $1;
             or_r32_m32disp edi $2;
             mov_m32disp_r32 $0 edi;
           }
         }
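The conditional mapping above picks a shorter target sequence when both source operands are the same register (the case where `or` acts as a register move). A small Python sketch of that selection follows; `map_or` is a hypothetical name and the operand strings are placeholders for whatever memory operands ISAMAP would emit.

```python
def map_or(dst, src1, src2):
    """Choose the target sequence for 'or dst, src1, src2' at translation time."""
    if src1 == src2:
        # x | x == x, so the or instruction itself can be dropped.
        return [
            f"mov_r32_m32disp edi {src1}",
            f"mov_m32disp_r32 {dst} edi",
        ]
    return [
        f"mov_r32_m32disp edi {src1}",
        f"or_r32_m32disp edi {src2}",
        f"mov_m32disp_r32 {dst} edi",
    ]
```

Because the test is on operand names rather than run-time values, the decision is made once per translated instruction and costs nothing at execution time.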

  17. Conditional Mapping (cont.)

         isamap_instrs {
           rlwinm %reg %reg %imm %imm %imm;
         } = {
           if ($2 = 0) {
             mov_r32_m32disp edi $1;
             and_r32_imm32 edi mask32($3, $4);
             mov_m32disp_r32 $0 edi;
           }
           else {
             mov_r32_m32disp edi $1;
             rol_r32_imm8 edi $2;
             and_r32_imm32 edi mask32($3, $4);
             mov_m32disp_r32 $0 edi;
           }
         };
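The mapping above relies on a `mask32` helper whose definition is not shown. The Python sketch below assumes it builds the standard PowerPC `rlwinm` mask: ones from bit MB through bit ME, with bit 0 as the most significant bit and wrap-around when MB is greater than ME. The function bodies are an illustration under that assumption.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit value left by shift bits."""
    value &= MASK32
    shift &= 31
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def mask32(mb, me):
    """Ones from bit mb through bit me, PowerPC numbering (bit 0 is the MSB)."""
    head = MASK32 >> mb                  # ones from bit mb to bit 31
    tail = (MASK32 << (31 - me)) & MASK32  # ones from bit 0 to bit me
    return head & tail if mb <= me else head | tail


def rlwinm(rs, sh, mb, me):
    """Reference semantics: rotate left, then AND with the mask."""
    return rotl32(rs, sh) & mask32(mb, me)
```

When the shift amount is 0 the rotate is the identity, which is why the first branch of the mapping can omit `rol_r32_imm8`.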

  18. Mapping PPC Instruction cmp [Figure: the CR drawn as 8 groups, numbered 1 to 8, of 4 bits each: <, >, =, ov] • Which group out of 8?

  19. Mapping PPC Instruction cmp (cont.) • Careful analysis pays off….

  20. At the end: Optimization Steps • Local register allocation • Copy-propagation • Dead-code elimination

  21. Optimization Results

  22. ISAMAP vs. QEMU (Int) • Speed-ups ranging from 1.12 to 3.01

  23. ISAMAP vs. QEMU (FP) • Not fair, as QEMU was not using SSE

  24. ISAMAP Good Side • Allows for a fast implementation • Isolates the translator issues from mapping • Lets the focus be on the mapping • Can reuse simulator descriptions

  25. ISAMAP Bad Side • Does not allow high-level C descriptions • Still needs to go through asm details • But on the other hand…. • 1 PhD in one year for the tool • 4-6 months for both descriptions and the mapping (no previous experience)

  26. Related Work • Dynamo • ADORE • Aries • Digital FX!32 • UQDBT • Yirr-Ma • DAISY • QEMU • IA-32 EL

  27. Future Work • Additional issues • Self-modifying code • Cover more SPEC programs • Measure mapping vs. tool speedup contribution • Evaluate the translation overhead • From C to x86 • From C to PPC to x86 • Mappings to embedded engines

  28. The End • Work supported by FAPESP and CNPq • Thanks for the feedback !!
