
Retarget Open64 to an Embedded CPU: A practice for automatic approach

This paper discusses the motivation behind using Open64 for automatic retargeting and explores a solution for retargeting to PowerPC. The current design and prototype system are presented along with the principles and flowchart of the code generator.

Presentation Transcript


  1. Retarget Open64 to an Embedded CPU: A practice for automatic approach
     SS&SE Group (System Software & Software Engineering)
     Department of Computer Science and Technology, Tsinghua University

  2. Outline
     • Background and Motivation
     • Overview of the Current Design
     • Prototype System
     • Perspective
     • Acknowledgment

  3. Background and Motivation
     • Why Open64
     • Not difficult to retarget manually
     • Working from a short procedure guideline and some guidance, one of our students made a preliminary retarget of Open64 to PowerPC within 6 weeks
     • The work is still laborious and tedious: changes are scattered in many places, and wrong results caused by erroneous changes are hard for a new developer to track down

  4. Background and Motivation
     • Why Open64
     • Good research platform for automatic retargeting and computer architecture
     • The high-level IR (WHIRL) is machine independent
     • The generated code is already of high quality right after retargeting
     • No open-source contribution so far explores automatic retargeting
     • It provides a "clean" platform for research on computer architecture and ISA enhancement

  5. Background and Motivation
     • Our Current Practice
     • Objective
     • Explore a reasonable solution to automatic retargeting for Open64 without changing the current CG framework
     • Gain experience with a realistic new target CPU (we chose PowerPC)
     • Seek more research opportunities in automatic retargeting (software engineering, machine description, etc.)

  6. Background and Motivation
     • Our Current Practice
     • Status
     • A preliminary solution to automatic retargeting (exercised with PowerPC); see "Overview of the Current Design"
     • A prototype system; see "Prototype System"

  7. Overview of the Current Design
     • Principle of Current Design
     • Keep the basic structure unchanged
     • Determine the automatable part incrementally
     • Make the machine description as abstract as possible

  8. Overview of the Current Design
     • Flowchart of Code Generator
     • From "Tutorial on the SGI Pro64 Compiler Infrastructure" by Gao et al., PACT 2000

  9. Overview of the Current Design
     • Targeting Pro64 to a New Processor
     • From "Tutorial on the SGI Pro64 Compiler Infrastructure" by Gao et al., PACT 2000

  10. Overview of the Current Design
      • Automatic retargeting approach
      • Generate target information, including ISA information and some ABI information, automatically from the machine description
      • Produce the expanding code automatically, using the Olive tool (Steve Tjiang) as the code-generator generator

  11. Overview of the Current Design
      • Machine Description
      • Regular Target Information (ISA, ABI, etc., used to generate TARG_INFO)
      • Tree Patterns for WHIRL Operators (used to generate Olive rules)
      • Others
      • Information for other retargetable parts
      • Abstract model for processor properties (to be developed)

  12. Overview of the Current Design
      • Design of Prototype System (diagram)
      • Machine Description → Regular target information → ParserB → C source programs to collect target information
      • Machine Description → Tree patterns → ParserA → Framework for Olive rules → (complete manually) → Olive rules → Code Generator Generator → C source programs to perform code generation

  13. Overview of the Current Design
      • Regular Target Information Description
      • ISA information
      • Registers, Operators, Operands, …
      • ABI information
      • Calling convention, …

  14. Overview of the Current Design
      • Regular Target Information Description
      • Example
          {SECTION "architecture" ARCH = "PPC32"; END}
          {SECTION "registers" …… END}
          {SECTION "operands" …… END}
          ……
          {SECTION "abi_properties" …… END}
          ……

  15. Overview of the Current Design
      • Files Produced from Machine Description
      • By Regular Target Information (ParserB)
      • isa_registers.cxx, isa_operands.cxx, isa_subset.cxx, isa_bundle.cxx, isa_decode.cxx, isa_enums.cxx, isa_print.cxx, isa_properties.cxx, isa_pseudo.cxx, isa_hazards.cxx, isa_lits.cxx, isa_pack.cxx, isa.cxx (under ../common/targ_info/isa/)
      • abi_properties.cxx (under ../common/targ_info/abi/)
      • proc_properties.cxx, proc.cxx, (PPC specific) ppc_si.cxx (under ../common/targ_info/proc/) /* To do */

  16. Overview of the Current Design
      • Produce expanding code automatically
      • Olive tool
      • Code-generator generator
      • A follow-up to Aho, Ganapathi & Tjiang's TWIG [TOPLAS89]
      • Generates a C source program that performs optimal instruction selection (the program implements a dynamic programming algorithm with a cost function, performing tree pattern matching and graph covering)
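The dynamic-programming selection mentioned on this slide can be sketched on a toy IR. The block below is only an illustration under stated assumptions: the node type, the nonterminals, the rule numbers, and the unit costs are all hypothetical and are not Olive's generated code or Open64's WHIRL. It labels a tree bottom-up and records, for each nonterminal, the cheapest rule that derives it.

```c
#include <limits.h>
#include <stddef.h>

/* Toy IR; every name here is hypothetical (not WHIRL, not Olive output). */
enum op { OP_REG, OP_CONST, OP_ADD };
enum nt { NT_REG, NT_IMM, NT_COUNT };      /* nonterminals */

struct node {
    enum op op;
    long value;                 /* used by OP_CONST */
    struct node *kid[2];        /* used by OP_ADD */
    int cost[NT_COUNT];         /* cheapest cost found per nonterminal */
    int rule[NT_COUNT];         /* rule number that achieved that cost */
};

#define COST_INF (INT_MAX / 4)  /* "no match"; small enough to sum safely */

/* Keep a rule only if it is strictly cheaper than what is recorded. */
static void record(struct node *n, enum nt goal, int cost, int rule)
{
    if (cost < n->cost[goal]) {
        n->cost[goal] = cost;
        n->rule[goal] = rule;
    }
}

/* Bottom-up labelling: children first, then every rule at this node. */
static void label(struct node *n)
{
    for (int i = 0; i < NT_COUNT; i++) {
        n->cost[i] = COST_INF;
        n->rule[i] = 0;
    }
    switch (n->op) {
    case OP_REG:
        record(n, NT_REG, 0, 1);        /* rule 1: reg : REG */
        break;
    case OP_CONST:
        if (n->value >= -32768 && n->value <= 32767)
            record(n, NT_IMM, 0, 2);    /* rule 2: imm : CONST (16-bit) */
        record(n, NT_REG, 1, 3);        /* rule 3: reg : CONST (assumed cost 1) */
        break;
    case OP_ADD:
        label(n->kid[0]);
        label(n->kid[1]);
        /* rule 4: reg : ADD(reg, reg) */
        record(n, NT_REG,
               1 + n->kid[0]->cost[NT_REG] + n->kid[1]->cost[NT_REG], 4);
        /* rule 5: reg : ADD(reg, imm) */
        record(n, NT_REG,
               1 + n->kid[0]->cost[NT_REG] + n->kid[1]->cost[NT_IMM], 5);
        break;
    }
}
```

For ADD(REG, CONST 5) the labeller settles on rule 5 at cost 1; with a constant of 100000 the immediate form no longer matches and rule 4 wins at cost 2. A second top-down pass would then run the actions of the chosen rules.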

  17. Overview of the Current Design
      • Produce expanding code automatically
      • Grammar for Olive Rules
          rule      → nonterm : tree [cost] action
          tree      → term ( tree_list ) | term | nonterm
          tree_list → tree_list , child | child
          child     → tree | _
          cost      → C-code | C-expr
          action    → C-code

  18. Overview of the Current Design
      • Produce expanding code automatically
      • Expand WHIRL to TOP
      • Produce the expander with Olive
      • Input a VL-WHIRL tree to the expander (Very Low WHIRL, in which some registers are exposed)
      • The expander produces a TOP instruction sequence that is semantically equivalent to the input WHIRL tree (TOP: CGIR-level abstraction)

  19. Overview of the Current Design
      • Produce expanding code automatically
      • Expand WHIRL to TOP
      • Only expressions are expanded in the current design
      • Why not expand the whole tree? It is a tradeoff among the benefit, the proportion of the original CG structure that would change, and how easy the Olive rules are to write
      • To be investigated further in the future

  20. Overview of the Current Design
      • Produce expanding code automatically
      • 2-Stage Editing for Olive rules
      • Stage 1: Write an abstract description of the Olive rules (tree patterns), which produces the framework used in the next stage
      • Stage 2: Fill in the incomplete Olive rules in the framework description for the specific target

  21. Overview of the Current Design
      • 2-Stage Editing for Olive rules
      • Stage 1
      • Ex. 1: Abstract description of a special Olive rule
          #reg : I4ADD(reg, reg)
          (I4ADD
            res(0, reg, int32);
            src(0, reg, int32);
            src(1, reg, int32);
            => "add res(0) src(0) src(1)"
            1 )
      • The trailing 1 is the cost (count of cycles)

  22. Overview of the Current Design
      • 2-Stage Editing for Olive rules
      • Framework Description Produced by ParserA
      • Ex. 1: An Olive rule automatically produced from the special Olive rule above
          reg : I4ADD(reg, reg)
          {
            $cost[0].cost = 1 + $cost[2].cost + $cost[3].cost;
          }
          = {
            $action[2](ops);
            $action[3](ops);
            $0->result = Build_TN_Of_Mtype(WN_rtype($0->wn));
            Build_OP(TOP_add, $0->result, $2->result, $3->result, ops);
          }

  23. Overview of the Current Design
      • 2-Stage Editing for Olive rules
      • Stage 1
      • Ex. 2: Abstract description of a general Olive rule (for PowerPC)
          # reg : I4F8TRUNC(f8reg)
          ( => )

  24. Overview of the Current Design
      • 2-Stage Editing for Olive rules
      • Framework Description Produced by ParserA
      • Ex. 2: An Olive rule automatically produced from the general Olive rule above (the result is an incomplete Olive rule)
          reg : I4F8TRUNC(f8reg)
          { }
          = { }

  25. Overview of the Current Design
      • 2-Stage Editing for Olive rules
      • Stage 2
      • Complete the incomplete Olive rules

  26. Overview of the Current Design
      • Files Produced from Machine Description
      • By Olive Rules
      • Update Expand_Expr() (under ../be/cg/whirl2ops.cxx)
      • Replace expand.cxx, exp_loadstore.cxx, exp_divrem.cxx, exp_branch.cxx, etc. (under ../be/cg/ppc32/, where ppc32 is target specific)

  27. Prototype System
      • Prototype for Retargeting to PowerPC
      • Connect the Machine Description: get regular target information from the machine description and distribute it into the source trees (in proper form)
      • Expand WHIRL to TOP: the expander is produced automatically by the Olive tool, to which the target-specific Olive rules are input

  28. Prototype System
      • Description for Regular Target Information
      • ISA and ABI Information: the syntax definition directly reflects the data organization in the source code where this information is processed (to be further improved in the future)
      • Connecting to the Compiler: the parser, produced by YACC, translates this information into C programs, which are then connected to the compiler through the Makefile

  29. Prototype System
      • Examples: Target Information Description
      • ISA and ABI Information
          {SECTION "architecture" ARCH = "PPC32"; END}
          {SECTION "isa_list" isa = add, add_i, adds, addl, … END}

  30. Prototype System
      • Examples: Target Information Description
      • ISA and ABI Information
          {SECTION "operand"
          #name=size,type,lit_class
          Literal_Type={
            simm16=16,SIGNED,LC_simm16;
            uimm16=16,UNSIGNED,LC_uimm16;
            uimm5 =5, UNSIGNED,LC_uimm5;
          }
          Register_Type={ …… }
          Enum_Type={ …… }
          Use_Type={ …… }
          Instruction_Group={ …… }
          END}
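Each Literal_Type entry pairs a field width with a signedness, which is enough to decide whether a constant fits the operand. The sketch below shows that range check in plain C. The struct and function names are hypothetical stand-ins chosen for illustration; they are not the tables or the ISA_LC_Value_In_Class routine that Open64 generates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of one Literal_Type entry: name = size, type. */
struct lit_class {
    const char *name;
    int bits;          /* field width in bits, 1..32 in this sketch */
    bool is_signed;
};

static const struct lit_class LC_SIMM16 = { "simm16", 16, true };
static const struct lit_class LC_UIMM16 = { "uimm16", 16, false };
static const struct lit_class LC_UIMM5  = { "uimm5",   5, false };

/* True when val is representable in the given literal class. */
static bool lit_in_class(int64_t val, const struct lit_class *lc)
{
    int64_t lo, hi;
    if (lc->is_signed) {
        lo = -((int64_t)1 << (lc->bits - 1));
        hi = ((int64_t)1 << (lc->bits - 1)) - 1;
    } else {
        lo = 0;
        hi = ((int64_t)1 << lc->bits) - 1;
    }
    return val >= lo && val <= hi;
}
```

With these definitions, simm16 accepts -32768 through 32767, uimm16 accepts 0 through 65535, and uimm5 accepts 0 through 31.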

  31. Prototype System
      • Examples: Target Information Description
      • ISA and ABI Information
          {SECTION "registers"
          # registers definition
          # isa_register_class definition
          NAME = "integer", BIT_SIZE = 32, CAN_STORE = true, MULTIPLE_SAVE = false;
          ……
          # isa_register_set definition
          RCLASS = rc_integer, MIN_REGNUM = 0, MAX_REGNUM = 31, ……
          END}

  32. Prototype System
      • Examples: Target Information Description
      • ISA and ABI Information
          {SECTION "abi_properties"
          #ABI properties definition
          (integer, ABI_PROPERTY) = {
            {……}; # list of integer registers
            (REG_LOW_BOUND, REG_UPPER_BOUND) = (0, 31);
            ALLOCATABLE(0, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 31, -1)
            CALLEE(1, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, -1)
            CALLER(0, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, -1)
            FUNC_ARG(3, 4, 5, 6, 7, 8, 9, 10, -1)
            FUNC_VAL(3, 4, -1)
            STACK_PTR(1, -1)
            FRAME_PTR(1, -1)
            GLOBAL_PTR(13, -1)
          }
          (float, ABI_PROPERTY) = { … }
          …
          END}
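The ABI property lists above are sequences of register numbers terminated by -1. One simple way a generated table could consume such a list is to fold it into a 32-bit set, sketched below. The helper names and the bit-set representation are assumptions made for illustration; the slides do not say how ParserB or abi_properties.cxx actually store these sets.

```c
#include <stdint.h>

/* Register lists copied from the slide; -1 terminates each list. */
static const int FUNC_ARG[]  = { 3, 4, 5, 6, 7, 8, 9, 10, -1 };
static const int FUNC_VAL[]  = { 3, 4, -1 };
static const int STACK_PTR[] = { 1, -1 };

/* Fold a -1-terminated list of register numbers (0..31) into a bit set. */
static uint32_t reg_mask(const int *regs)
{
    uint32_t mask = 0;
    for (; *regs != -1; regs++)
        mask |= (uint32_t)1 << *regs;
    return mask;
}

/* Nonzero when register reg (0..31) is a member of the set. */
static int reg_in_set(uint32_t mask, int reg)
{
    return (int)((mask >> reg) & 1u);
}
```

A property query such as "is r5 an argument register" then becomes `reg_in_set(reg_mask(FUNC_ARG), 5)`.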

  33. Prototype System
      • Expand WHIRL to TOP
      • Interface to Olive
      • Example Rules Specific to PowerPC

  34. Prototype System
      • Interface to Olive
      • Costs
          typedef struct COST { int cost; } COST;
          static COST COST_INFINITY = { MAX_INT16 };
          static COST COST_ZERO = { 0 };
          #define COST_LESS(x,y) ((x).cost < (y).cost)

  35. Prototype System
      • Interfacing to Olive
      • Trees
          typedef struct burm_state * STATE;
          typedef struct olive_node * NODEPTR;
          typedef struct olive_node * TREE;
          #define GET_KIDS(r) ((r)->get_kids())
          #define OP_LABEL(r) ((r)->op_label())
          #define STATE_LABEL(r) ((r)->state_label())
          #define SET_STATE(r,s) (r)->set_state(s)

  36. Prototype System
      • Interfacing to Olive
      • Tree Nodes
          struct olive_node {
            OPCODE opcode;
            OPERATOR opr;
            TOP top;
            int num_opnds;
            WN * wn;
            WN * parent;
            INTRINSIC intrn_id;
            TN * result;
            TN * opnd_tn[OP_MAX_FIXED_OPNDS];
            NODEPTR kids[OP_MAX_FIXED_OPNDS];
            STATE state;
            int opc;
            olive_node(WN * w, WN * p, TN * res, INTRINSIC iid);
            virtual ~olive_node();
            void set_state(STATE s) { state = s; }
            STATE state_label() { return state; }
            NODEPTR* get_kids() { return kids; }
            int op_label() { return opc; }
            void Print() { /* printf("WN\n%s\n", dump_wn(wn)); */ }
          };

  37. Prototype System
      • Example Rules Specific to PowerPC
      • Classification of PowerPC Operators
      • Integer (arithmetic/compare/logical/rotate/shift)
      • Floating-point (arithmetic/multiply-add/rounding and conversion/compare/status and control register/move)
      • Load/Store (integer/floating-point/integer byte-reverse/integer multiple/string)
      • Branch (unconditional/conditional/conditional to LR/conditional to CTR)
      • Misc (system call/trap/condition register logical)

  38. Prototype System
      • Example Rules Specific to PowerPC
      • Load/Store
          reg : I4I4LDID // integer load
          {
            $cost[0].cost = 3; // cycles
          }
          = {
            $0->result = Build_TN_Of_Mtype(WN_rtype($0->wn));
            Handle_Load($0->wn, $0->result, TOP_lwz, ops);
          }
          static TN * Handle_Load(WN *, TN *, TOP, OPS *);

  39. Prototype System
      • Example Rules Specific to PowerPC
      • Load/Store
          null : I4STID(reg) // integer store
          {
            $cost[0].cost = 3 + $cost[2].cost;
          }
          = {
            $action[2](ops);
            $0->result = $2->result;
            Handle_Store($0->wn, $0->result, TOP_stw, ops);
          }
          static void Handle_Store(WN *, TN *, TOP, OPS *);

  40. Prototype System
      • Example Rules Specific to PowerPC
      • Load/Store
          f4reg : F4F4LDID // floating-point load
          {
            $cost[0].cost = 4;
          }
          = {
            $0->result = Build_TN_Of_Mtype(WN_rtype($0->wn));
            Handle_Float_Load($0->wn, $0->result, TOP_lfs, ops);
          }
          static TN * Handle_Float_Load(WN *, TN *, TOP, OPS *);

  41. Prototype System
      • Example Rules Specific to PowerPC
      • Load/Store
          null : F4STID(f4reg) // floating-point store
          {
            $cost[0].cost = 4 + $cost[2].cost; // include the operand's cost, as the integer store rule does
          }
          = {
            $action[2](ops);
            $0->result = $2->result;
            Handle_Float_Store($0->wn, $0->result, TOP_stfs, ops);
          }
          static void Handle_Float_Store(WN *, TN *, TOP, OPS *);

  42. Prototype System
      • Example Rules Specific to PowerPC
      • Call
          null : I4CALL
          {
            $cost[0].cost = 2;
          }
          = {
            Handle_Call_Site($0->wn, $0->opr);
          };
          static void Handle_Call_Site(WN *, OPERATOR);

  43. Prototype System
      • Example Rules Specific to PowerPC
      • Addition
          reg : I4ADD(reg, reg)
          {
            $cost[0].cost = 1 + $cost[2].cost + $cost[3].cost;
          }
          = {
            $action[2](ops);
            $action[3](ops);
            $0->result = Build_TN_Of_Mtype(WN_rtype($0->wn));
            Build_OP(TOP_add, $0->result, $2->result, $3->result, ops);
          }

  44. Prototype System
      • Example Rules Specific to PowerPC
      • Addition of Immediate
          const : I4INTCONST
          {
            $cost[0].cost = 0;
          }
          = {
            $0 = $1;
          };
          reg : I4ADD(reg, const) // small immediate
          {
            // reject this rule unless the constant fits a signed 16-bit field
            if (!(ISA_LC_Value_In_Class(WN_const_val($3->wn), LC_simm16))) return 0;
            $cost[0].cost = 1 + $cost[2].cost;
          }
          = {
            $action[2](ops);
            $action[3](ops);
            $0->result = Build_TN_Of_Mtype(WN_rtype($0->wn));
            Build_OP(TOP_addi, $0->result, $2->result, Gen_Literal_TN(WN_const_val($3->wn), 4), ops);
          };

  45. Prototype System
      • Example Rules Specific to PowerPC
      • Addition of Immediate (continued)
          reg : I4ADD(reg, const) // big immediate
          {
            $cost[0].cost = 2 + $cost[2].cost;
          }
          = {
            $action[2](ops);
            $action[3](ops);
            $0->result = Build_TN_Of_Mtype(WN_rtype($0->wn));
            INT64 val = WN_const_val($3->wn);
            // addis adds the high half; the +0x8000 compensates for addi sign-extending the low half
            Build_OP(TOP_addis, $0->result, $2->result, Gen_Literal_TN((short)((val + 0x8000) >> 16), 4), ops);
            // addi adds the low half to the intermediate result, not to the original source
            Build_OP(TOP_addi, $0->result, $0->result, Gen_Literal_TN((short)(val & 0xffff), 4), ops);
          };
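The "big immediate" rule splits a 32-bit constant into two 16-bit fields for addis and addi. On PowerPC, addi sign-extends its 16-bit immediate and addis shifts its immediate left by 16 bits, so the high half has to be bumped by one whenever the low half has its top bit set. The standalone sketch below works through that arithmetic; the type and function names are hypothetical and are not part of Open64.

```c
#include <stdint.h>

/* Both fields hold sign-extended 16-bit values, as the instructions see them. */
struct imm_split {
    int32_t hi;   /* immediate for addis */
    int32_t lo;   /* immediate for addi */
};

/* Sign-extend the low 16 bits of x. */
static int32_t sext16(uint32_t x)
{
    x &= 0xffffu;
    return x >= 0x8000u ? (int32_t)x - 0x10000 : (int32_t)x;
}

/* Split val so that (hi << 16) + lo equals val modulo 2^32. */
static struct imm_split split_imm32(int32_t val)
{
    uint32_t u = (uint32_t)val;
    struct imm_split s;
    s.lo = sext16(u);
    s.hi = sext16((u + 0x8000u) >> 16);  /* +0x8000 undoes lo's sign extension */
    return s;
}

/* Recombine as the addis/addi pair would, returning the 32-bit register image. */
static uint32_t combine_imm32(struct imm_split s)
{
    return ((uint32_t)s.hi << 16) + (uint32_t)s.lo;
}
```

For 0x12348765 the split is hi = 0x1235 and lo = -30875, whereas a plain `val >> 16` would give 0x1234 and leave the sum short by 0x10000.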

  46. Prototype System
      • Example Rules Specific to PowerPC
      • Floating-Point Arithmetic (multiply-add)
          f4reg : F4MADD(f4reg, f4reg, f4reg)
          {
            $cost[0].cost = 5 + $cost[2].cost + $cost[3].cost + $cost[4].cost;
          }
          = {
            $action[2](ops);
            $action[3](ops);
            $action[4](ops);
            $0->result = Build_TN_Of_Mtype(WN_rtype($0->wn));
            // The slide shows TOP_fdivs (divide) with two sources; a multiply-add would need a fused multiply-add TOP that also takes $4->result
            Build_OP(TOP_fdivs, $0->result, $2->result, $3->result, ops);
          }

  47. Prototype System
      • Example Rules Specific to PowerPC
      • Floating-Point Rounding and Conversion
          reg : I4F8TRUNC(f8reg)
          {
            $cost[0].cost = 11 + $cost[2].cost;
          }
          = {
            $action[2](ops);
            // convert to a 32-bit integer held in a floating-point register
            TN* tmp_tn = Build_TN_Of_Mtype(MTYPE_F8);
            Build_OP(TOP_fctiwz, tmp_tn, $2->result, ops);
            // spill that register to the stack
            ST * tmp_sym = CGSPILL_Get_TN_Spill_Location(tmp_tn, CGSPILL_LRA);
            INT64 ofst = TN_offset(tmp_tn);
            ST* base_sym;
            INT64 base_ofst;
            Base_Symbol_And_Offset_For_Addressing(tmp_sym, 0, &base_sym, &base_ofst);
            Build_OP(TOP_stfd, tmp_tn, FP_TN, Gen_Literal_TN(base_ofst, 4), ops);
            // reload the low word (offset + 4) into an integer register
            $0->result = Build_TN_Of_Mtype(WN_rtype($0->wn));
            Build_OP(TOP_lwz, $0->result, FP_TN, Gen_Literal_TN(base_ofst + 4, 4), ops);
          }

  48. Prototype System
      • Example Rules Specific to PowerPC
      • Conditional Branch
          reg : I4F4GT(f4reg, f4reg)
          {
            $cost[0].cost = 7 + $cost[2].cost + $cost[3].cost;
          }
          = {
            $action[2](ops);
            $action[3](ops);
            $0->result = Build_TN_Of_Mtype(WN_rtype($0->wn));
            Handle_Cond_Branch(TOP_bgt, TOP_fcmpu, $0->result, $2->result, $3->result, ops);
          }
          static void Handle_Cond_Branch(TOP, TOP, TN *, TN *, TN *, OPS *);

  49. Prototype System
      • Example Rules Specific to PowerPC
      • Conditional Branch
          static void Expand_Cond(TOP top_branch, TOP top_cmp, TN *dest, TN *src1, TN *src2, OPS *ops)
          /* Expand_Cond is an auxiliary function shared by compare operators */
          /* For example */
          reg : I4F4NE(f4reg, f4reg) vs Expand_Cond(TOP_bne, …)
          reg : I4F4GT(f4reg, f4reg) vs Expand_Cond(TOP_bgt, …)
          reg : I4F4EQ(f4reg, f4reg) vs Expand_Cond(TOP_beq, …)
          reg : I4F4GE(f4reg, f4reg) vs Expand_Cond(TOP_bge, …)
          reg : I4F4LE(f4reg, f4reg) vs Expand_Cond(TOP_ble, …)
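Sharing one helper across the compare operators amounts to a table lookup from the comparison kind to the branch mnemonic. The sketch below shows that idea with plain strings; the enum, the table, and `branch_for` are hypothetical names, and the real Expand_Cond works on Open64 TOP values and TNs rather than strings.

```c
#include <stddef.h>

/* Hypothetical comparison kinds, one per compare rule listed on the slide. */
enum cmp_kind { CMP_NE, CMP_GT, CMP_EQ, CMP_GE, CMP_LE };

struct cond_entry {
    enum cmp_kind kind;
    const char *branch;    /* PowerPC branch mnemonic */
};

/* One row per compare operator; the pairing follows the slide. */
static const struct cond_entry cond_table[] = {
    { CMP_NE, "bne" },
    { CMP_GT, "bgt" },
    { CMP_EQ, "beq" },
    { CMP_GE, "bge" },
    { CMP_LE, "ble" },
};

/* Look up the branch for a comparison; NULL when the kind is not listed. */
static const char *branch_for(enum cmp_kind kind)
{
    size_t n = sizeof cond_table / sizeof cond_table[0];
    for (size_t i = 0; i < n; i++)
        if (cond_table[i].kind == kind)
            return cond_table[i].branch;
    return NULL;
}
```

Each compare rule then only has to pass its own kind, and the compare-then-branch sequence lives in a single place.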

  50. Prototype System
      • Example Rules Specific to PowerPC
      • Conditional Move
          reg : I4I4GT(reg, reg)
          {
            $cost[0].cost = 3 + $cost[2].cost + $cost[3].cost;
          }
          = {
            $action[2](ops);
            $action[3](ops);
            $0->result = Build_TN_Of_Mtype(WN_rtype($0->wn));
            Handle_Cond_Move(OPR_GT, TOP_cmpw, $0->result, $2->result, $3->result, ops);
          }
          static void Handle_Cond_Move(OPERATOR, TOP, TN *, TN *, TN *, OPS *);
