Generating a software loop with memory accesses

TigerSHARC assembly syntax. Concepts: learning just enough TigerSHARC assembly code to make a software loop “work”; comparing the timings for rectification of integer and floating-point arrays using Debug C++ code and Release C++ code.

Presentation Transcript


  1. Generating a software loop with memory accesses: TigerSHARC assembly syntax

  2. Concepts
     • Learning just enough TigerSHARC assembly code to make a software loop “work”
     • Comparing the timings for rectification of integer and floating-point arrays, using
       - Debug C++ code
       - Release C++ code
       - Our FIRST_ASM code
     • Looking in “MIXED mode” at the code generated by the compiler
     TigerSHARC assemble code 2, M. Smith, ECE, University of Calgary, Canada

  3. Test Driven Development. Work with the customer to check that the tests properly express what the customer wants done. This is an iterative process between CUSTOMER and DEVELOPER, with the customer “heavily involved” (“Agile” methodology).

  4. Note the special marker. Compiler optimization:
     • FLOATS 927 → 304 (THREE FOLD)
     • INTS 960 → 150 (SIX FOLD)
     Why the difference, can we do better, and do we want to? Note the failures: what are they?

  5. Write tests about passing values back from an assembly code routine

  6. More detailed look at the code
     • As with 68K and Blackfin, the code needs a .section, but the name and format are different
     • As with 68K, it needs a .align statement. Is the “4” in bytes (8 bits) or words (32 bits)?
     • As with 68K, it needs .global to tell other code that this function exists
     • Single semi-colons and double semi-colons
     • Start function label; end function label, used for “profiling code”
     • Label format is similar to 68K: needs a leading underscore and a final colon

  7. Return registers
     • There are many, depending on what you need to return
     • Here we need to use J8
     • Many registers are available, so we need the ability to control usage
     • J0 to J31: registers (integers and pointers) (SISD mode)
     • XR0 to XR31: registers (integers) (SISD mode)
     • XFR0 to XFR31: registers (floats) (SISD mode)
     • Did I also mention:
     • I0 to I31: registers (integers and pointers) (SISD mode)
     • YR0 to YR31, YFR0 to YFR31 (SIMD mode)
     • XYR, YXR and R registers (SIMD mode)
     • And also the MIMD modes
     • And the double registers and the quad registers ...
     #define return_pt_J8 J8   // J8 is a VOLATILE, NON-PRESERVED register

  8. Parameter passing
     • Spaces for the first four parameters ARE ALWAYS present on the stack (as with 68K)
     • But the first four parameters are passed in registers (J4, J5, J6 and J7 most of the time) (as with MIPS)
     • The parameters passed in registers are often stored into the spaces on the stack (as with MIPS) when assembly code functions call assembly code functions
     • J4, J5, J6 and J7 are volatile, non-preserved registers

  9. Can we pass back the start of the final array? Annotations on the code: return value; still passing tests by accident; this needs to be conditional.

  10. What we need to know, based on experiences from other processors
      • Can we return from an assembly language routine without crashing the processor?
      • Return a parameter from an assembly language routine (is it the same for ints and floats?)
      • Pass parameters into assembly language (is it the same for ints and floats?)
      • Do IF THEN ELSE statements
      • Read and write values to memory
      • Read and write values in a loop
      • Do some mathematics on the values fetched from memory
      All this stuff is demonstrated by coding HalfWaveRectifyASM( )

  11. Why is ELSE a keyword? A FOUR PART ELSE INSTRUCTION IS LEGAL
      IF JLT; ELSE, J1 = J2 + J3;     // Conditional execution -- if true
              ELSE, XR1 = XR2 + XR3;  // Conditional -- if true
              YFR1 = YFR2 + YFR3;;    // Unconditional -- always
      IF JLT; DO, J1 = J2 + J3;       // Conditional execution -- if true
              DO, XR1 = XR2 + XR3;    // Conditional -- if true
              YFR1 = YFR2 + YFR3;;    // Unconditional -- always
      Having this sort of format means that the instruction pipeline is not disrupted when we do IF statements.

  12. The label name is not the problem. NOTE: this is “C-like” syntax, but it is not “C”. A statement must end in ;; and not ;

  13. Add dual semi-colons everywhere; worry about “multiple issues” later. This dual semi-colon is so important that you MUST code review for it all the time, or else you waste so much time in the lab. Key in exams / quizzes. At last, an error I know how to fix!

  14. Well, I thought I understood it!!!
      • Speed issue: JUMPS can’t be too close together
      • Not normally a problem when the “if” is larger

  15. Add a single instruction of 4 NOPs: nop; nop; nop; nop;;
      • Fix the last remaining error, in handling the IF THEN ELSE, as part of Assignment 1
      • Worry about code efficiency later (refactor), when all the code is working

  16. What we need to know, based on experiences from other processors
      • Can we return from an assembly language routine without crashing the processor?
      • Return a parameter from an assembly language routine (is it the same for ints and floats?)
      • Pass parameters into assembly language (is it the same for ints and floats?)
      • Do IF THEN ELSE statements
      • Read and write values to memory
      • Read and write values in a loop
      • Do some mathematics on the values fetched from memory
      All this stuff is demonstrated by coding HalfWaveRectifyASM( )

  17. Target for this week: changing this code into assembly (more speed)
      • The code we generated yesterday was similar to parts of this, but not equivalent. Refactor to make it equivalent.
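As a point of reference for the translation target, here is a minimal C++ sketch of a half-wave rectifier over an integer array. The function name `HalfWaveRectifyCPP`, its parameter list, and the choice to return the start of the output array are assumptions made for illustration; they are not taken from the original slides.

```cpp
// Hypothetical reference routine: name and signature are assumptions,
// not the code shown in the original presentation.
// Copies in[] to out[], replacing every negative value with zero,
// and returns the start of the output array.
int* HalfWaveRectifyCPP(const int in[], int out[], int size) {
    for (int count = 0; count < size; count++) {
        if (in[count] < 0) {
            out[count] = 0;          // negative input: clamp to zero
        } else {
            out[count] = in[count];  // zero or positive input: pass through
        }
    }
    return out;
}
```

Each C++ statement here corresponds to one of the skills listed earlier in the deck: passing parameters in, an IF THEN ELSE, memory reads and writes in a loop, and a return value.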

  18. The code was not exactly what we designed (C++ equivalent). NEXT STEP: refactor, and retest after the refactoring.

  19. Refactored C++ code. I THINK I UNDERSTAND ENOUGH TO CHANGE THE FORMAT OF THE IF-THEN-ELSE IN THIS CASE. Avoiding JUMPS in the main flow of the code will speed the flow of the code. Almost right: look in the manual to find the correct syntax for IF NJLE; DO, J8 = 0
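The change of IF-THEN-ELSE format can be pictured in C++ as follows. This is only a sketch with hypothetical function names: the first version has two branches (which suggests jumps in the main flow), while the second assigns unconditionally and then conditionally overwrites, which is the shape that fits a single conditionally executed instruction.

```cpp
// Hypothetical helper names; both functions rectify a single value.

// Original shape: a full IF-THEN-ELSE with two separate paths.
int RectifyWithIfElse(int value) {
    int result;
    if (value < 0) {
        result = 0;
    } else {
        result = value;
    }
    return result;
}

// Refactored shape: assign first, then conditionally overwrite.
// There is no ELSE path, so no jump around it is needed.
int RectifyWithConditionalOverwrite(int value) {
    int result = value;
    if (value < 0) result = 0;
    return result;
}
```

Both functions return the same result for every input, so the existing tests can be rerun unchanged after the refactoring.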

  20. No syntax errors (No ERRORS). Code does not work (DEFECTS)

  21. Run “forensic tests” to find out where the DEFECT is being introduced

  22. Add another line to the code. Can now spot the error: the new format of IF-THEN-ELSE is doing exactly the opposite of what we want. Need JLE, not NJLE.

  23. Assignment 1: code the following as a software loop, following the MIPS approach
      int CalculateSum(void) {
          int sum = 0;
          for (int count = 0; count < 6; count++) {
              sum = sum + count;
          }
          return sum;
      }

  24. Reminder: a software for-loop becomes a “while loop” with an initial test
      int CalculateSum(void) {
          int sum = 0;
          int count = 0;
          while (count < 6) {
              sum = sum + count;
              count++;
          }
          return sum;
      }
      Do a line-by-line translation
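One way to rehearse the line-by-line translation before touching assembly is to rewrite the while loop in C++ using only labels, a conditional jump, and an unconditional jump. The sketch below is an illustration, not the assignment solution; the function name and the label names are assumptions, and the register-style variable names simply mirror the deck's habit of tagging names with the register they would live in.

```cpp
// Sketch only: CalculateSumGotoStyle, LOOP_TEST and LOOP_END are
// hypothetical names chosen for illustration.
int CalculateSumGotoStyle(void) {
    int sum_J8 = 0;                        // int sum = 0;
    int count_J0 = 0;                      // int count = 0;
LOOP_TEST:
    if (!(count_J0 < 6)) goto LOOP_END;    // while (count < 6): leave when the test fails
    sum_J8 = sum_J8 + count_J0;            // sum = sum + count;
    count_J0 = count_J0 + 1;               // count++;
    goto LOOP_TEST;                        // jump back to the initial test
LOOP_END:
    return sum_J8;                         // return sum;
}
```

Each line now maps to roughly one assembly instruction: two initializations, a compare with a conditional jump, an add, an increment, and an unconditional jump. Note that the counter is initialized explicitly before the first test.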

  25. USE SOFTWARE LOOP HERE. Do loop control first
      • Have some jumps too close together

  26. Run the tests with 4 nop padding to check that we get out of the loop as expected

  27. Accessing memory
      • Basic mode
      • Special register J31: acts as zero when used in additions
      • Pt_J5 is a pointer register into an array
      • Value_J1 is being used as a data register
      • J registers are like MIPS registers (used as both pointer and data), NOT like 68K or Blackfin registers (either data or address, but not both)
      • Value_J1 = [Pt_J5];; reads the value from the memory location pointed to by J5. Compare to Blackfin: Value_R0 = [Pt_P0];;
      • Value_J1 = [Pt_J5 + J31];; reads the value from the memory location pointed to by J5. I read somewhere that this CAN be faster than just Value_J1 = [Pt_J5];; (NEED TO CONFIRM)

  28. Accessing memory, step 2
      • Basic mode
      • Pt_J5 is a pointer register into an array
      • Offset_J4 is used as an offset
      • Value_J1 is being used as a data register
      • Read_J1 = [Pt_J5 + Offset_J4];; reads the value from the memory location pointed to by (J5 + J4). PRE-MODIFY: the address used is J5 + J4, with no change in J5
      • Read_J1 = [Pt_J5 += Offset_J4];; reads the value from the memory location pointed to by J5, and then performs the add. POST-MODIFY: the address used is J5, then J5 = J5 + J4 is performed
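The difference between the two addressing forms can be modelled in C++ with ordinary pointers. This is a hedged model, not TigerSHARC code: the struct and function names are hypothetical, and the offset is counted in array elements, which is an assumption of the model.

```cpp
// Hypothetical model of the two addressing forms described above.
struct ReadResult {
    int value;           // value fetched from memory
    const int* pointer;  // contents of the pointer register after the access
};

// Models  Read_J1 = [Pt_J5 + Offset_J4];;
// PRE-MODIFY: the address used is pointer + offset; the pointer is unchanged.
ReadResult ReadPreModify(const int* pt, int offset) {
    return ReadResult{ *(pt + offset), pt };
}

// Models  Read_J1 = [Pt_J5 += Offset_J4];;
// POST-MODIFY: the address used is the pointer; the pointer is then advanced.
ReadResult ReadPostModify(const int* pt, int offset) {
    int value = *pt;
    return ReadResult{ value, pt + offset };
}
```

With an array {10, 20, 30, 40} and an offset of 2, the pre-modify read returns 30 and leaves the pointer at the start of the array, while the post-modify read returns 10 and leaves the pointer two elements further on. Post-modify with an offset of 1 is the natural form for stepping through an array inside a loop.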

  29. Add in the memory accesses. FORGET: TigerSHARC = RISC PROCESSOR, LOAD/STORE ONLY, like MIPS. Must place the value into a register, and then copy the register to memory.

  30. Understand the error message: “Too many J resource usage” = missing ;;

  31. Note: Missing label is not an assembler error, it’s a linker error

  32. Now that the assembler knows where “CONTINUE” is, it can tell you that you have two JUMPs too close together
      • Fix with the magic 4 nops, and lose one cycle

  33. Not getting the expected test results. Something is logically wrong (DEFECT)

  34. Obvious question: are we even getting into the loop? Add a BREAKPOINT to test (not to follow the code). NEVER GOT TO BREAKPOINT means we never entered the loop. Forgot to do count = 0, so we are not even getting into the loop, as there is a garbage value in Count_J0.

  35. Not bad for a first effort. Faster than the compiler in debug mode.

  36. Where did the float ASM code suddenly appear from?
      • Integer 0 has bit pattern 0x0000 0000
      • Float 0.0 has bit pattern 0x0000 0000
      • Integer +6 has format b 0??? ???? ???? ???? ???? ???? ???? ????
      • Float +6.0 has format b 0??? ???? ???? ???? ???? ???? ???? ????
      • Integer -6 has format b 1??? ???? ???? ???? ???? ???? ???? ????
      • Float -6.0 has format b 1??? ???? ???? ???? ???? ???? ???? ????
      • The formats are very different, but the sign bit is in the same place
      • Float algorithm: if S == 1 (negative), set to zero; otherwise leave unchanged. This is the same as the integer algorithm
      • Just re-use the integer algorithm with a change of name
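The sign-bit argument can be checked on a desktop compiler with a short C++ sketch. It assumes the host stores `float` as a 32-bit IEEE 754 value, which the slides do not state; the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Returns the raw 32-bit pattern of a float.
std::uint32_t FloatBits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Hypothetical name. Rectifies a float using only the integer view of its
// bits: if the sign bit is set the result is all zeros (+0.0f), otherwise
// the bit pattern is left unchanged.
float HalfWaveRectifyFloatViaIntBits(float value) {
    std::int32_t asInt;
    std::memcpy(&asInt, &value, sizeof asInt);
    if (asInt < 0) asInt = 0;   // the same "negative?" test the integer routine uses
    float result;
    std::memcpy(&result, &asInt, sizeof result);
    return result;
}
```

The integer comparison never interprets the exponent or fraction bits; it only reacts to the top bit, which is why the integer routine can be reused. One side effect worth knowing: -0.0f has its sign bit set, so this approach maps it to +0.0f.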

  37. Final code: the float rectify code just has a different name

  38. What we NOW KNOW
      • Can we return from an assembly language routine without crashing the processor?
      • Return a parameter from an assembly language routine (is it the same for ints and floats?)
      • Pass parameters into assembly language (is it the same for ints and floats?)
      • Do IF THEN ELSE statements
      • Read and write values to memory
      • Read and write values in a loop
      • Do some mathematics on the values fetched from memory
      All this stuff is demonstrated by coding HalfWaveRectifyASM( )
