
Reducing Average CPE Time On A Y86 Pipelined Processor



Presentation Transcript


  1. Reducing Average CPE Time On A Y86 Pipelined Processor Darren Stikes DS58062

  2. Y86 Processor • Has a pipelined architecture. • Has five stages: fetch, decode, execute, memory, and write-back (F, D, E, M, W). • Its pipeline can hold up to five instructions at a time, one in each stage.

  3. The Five Stages (Fetch) • Reads the instruction at the address held in the program counter. • Computes the address of the next instruction.

  4. Decode Stage • Reads the registers used by the instruction. • Places their values in pipeline registers so the execute stage can use them.

  5. Execute Stage • Computes the memory address for instructions that access memory. • Uses the ALU to perform the instruction's computation on register values.

  6. Memory Stage • Reads or writes the memory data needed by the current instruction.

  7. Write-Back Stage • Saves the values produced by the earlier stages into the register file. (Writes to memory happen in the memory stage.)

  8. Problems? • Because more than one instruction runs through the pipeline at a time, problems occur when an instruction needs data that the previous instruction has not computed yet. • Some of these problems can be fixed by stalling or by inserting bubbles. [Pipeline diagram: overlapping F D E M W sequences, with repeated F and D stages marking stalled instructions]

  9. Common Problems • Most problems in a pipelined architecture can be handled by forwarding: a value is taken from a pipeline register before it has been written back in the normal way.

  10. Unsolved Problems • Load/use data hazards. • Branch misprediction.

  11. Load/Use Data Hazards: Solutions • A load/use hazard happens when an instruction needs a value that the previous instruction is still reading from memory. • Forwarding alone doesn't work here, because the loaded value is not available until the load reaches its memory stage; the two instructions simply need more cycles between them. • Solution: put instructions between the two that use registers independent of the one being loaded. This gives the load time to complete.
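The load/use rule above can be sketched as a small checker. This is only an illustrative model, not part of the original project: the `(opcode, sources, destinations)` tuple format and the function name `count_load_use_stalls` are hypothetical, and only `mrmovl` is treated as a load to keep the sketch short.

```python
# Hypothetical model: each instruction is (opcode, source_regs, dest_regs).
# Assumption: only mrmovl is treated as a load (popl is left out for brevity).
LOADS = {"mrmovl"}


def count_load_use_stalls(program):
    """Count adjacent pairs where a load's destination is read by the next instruction."""
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_op, _, prev_dests = prev
        _, curr_sources, _ = curr
        if prev_op in LOADS and set(prev_dests) & set(curr_sources):
            stalls += 1
    return stalls


# The load of %eax is used immediately by the store: one hazard.
before = [
    ("mrmovl", ["%ebx"], ["%eax"]),
    ("rmmovl", ["%eax", "%ecx"], []),
    ("iaddl", ["%esi"], ["%esi"]),
]

# An independent instruction now sits between the load and its use.
after = [
    ("mrmovl", ["%ebx"], ["%eax"]),
    ("iaddl", ["%esi"], ["%esi"]),
    ("rmmovl", ["%eax", "%ecx"], []),
]

print(count_load_use_stalls(before))  # 1
print(count_load_use_stalls(after))   # 0
```

The `iaddl` on `%esi` is a safe filler because it neither reads nor writes `%eax`.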

  12. Branch Misprediction: Solution • In the Y86 pipeline architecture, when a conditional branch is in the pipeline, the processor assumes that the branch will be taken. • The problem arises when the branch ends up not taken and falls through. The instructions that have already started down the pipeline are then the wrong instructions and must be removed. • Solution: if you have an idea of which way the condition codes will go at the time of the branch, you can rearrange your code so that the branch is taken most of the time.
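A rough way to see why the branch direction matters is to estimate the average penalty per branch. The sketch below assumes a two-cycle misprediction penalty, as in the textbook Y86 pipeline design; the taken fractions are made-up numbers for illustration, not measurements from this project, and `expected_branch_penalty` is a hypothetical helper name.

```python
def expected_branch_penalty(taken_fraction, mispredict_cycles=2):
    """Average wasted cycles per branch under an always-taken prediction.

    Assumption: each not-taken branch wastes `mispredict_cycles` cycles.
    """
    if not 0.0 <= taken_fraction <= 1.0:
        raise ValueError("taken_fraction must be between 0 and 1")
    return (1.0 - taken_fraction) * mispredict_cycles


# Hypothetical fractions: a branch taken 25% of the time versus 75% of the time.
print(expected_branch_penalty(0.25))  # 1.5
print(expected_branch_penalty(0.75))  # 0.5
```

Under these assumed numbers, rearranging the code so the same branch is usually taken saves about one cycle per loop iteration.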

  13. Enhancements! • Hardware: Added the instruction iaddl, which adds a constant to a register. This saves the line of code otherwise needed to place a number in a register just to add it once. Added the instruction leave, which replaces the two instructions rrmovl %ebp, %esp and popl %ebp. • Software: Replaced the old instruction sequences with the new instructions. Rearranged the conditional branch inside the loop so that it is taken most of the time. Placed an instruction between the two instructions of a load/use hazard. Moved the subtraction of the length to just before the loop branch, then removed the andl instruction that existed only to set the condition codes.
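The equivalence claimed for the two new instructions can be checked with a toy register model. This is a minimal sketch under stated assumptions: registers are a plain dictionary, memory is a dictionary keyed by address, 32-bit wraparound and condition codes are ignored, and all function names and starting values are hypothetical.

```python
def irmovl(regs, value, dst):
    regs[dst] = value


def addl(regs, src, dst):
    regs[dst] += regs[src]


def rrmovl(regs, src, dst):
    regs[dst] = regs[src]


def popl(regs, mem, dst):
    value = mem[regs["%esp"]]  # read the word at the top of the stack
    regs["%esp"] += 4          # pop it
    regs[dst] = value


def iaddl(regs, value, dst):
    regs[dst] += value         # new instruction: add a constant directly


def leave(regs, mem):
    old_ebp = regs["%ebp"]
    regs["%esp"] = old_ebp + 4    # same as rrmovl %ebp, %esp followed by the pop
    regs["%ebp"] = mem[old_ebp]   # restore the saved frame pointer


# iaddl versus irmovl + addl (the old form also needs the scratch register %edi).
old_regs = {"%esi": 5, "%edi": 0}
irmovl(old_regs, 1, "%edi")
addl(old_regs, "%edi", "%esi")
new_regs = {"%esi": 5, "%edi": 0}
iaddl(new_regs, 1, "%esi")

# leave versus rrmovl + popl, starting from made-up stack contents.
stack = {0x100: 0x200}
two_step = {"%esp": 0x0F0, "%ebp": 0x100}
rrmovl(two_step, "%ebp", "%esp")
popl(two_step, stack, "%ebp")
one_step = {"%esp": 0x0F0, "%ebp": 0x100}
leave(one_step, stack)

print(old_regs["%esi"] == new_regs["%esi"])  # True
print(two_step == one_step)                  # True
```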

  14. Changes #1 • Replaced the irmovl/addl and irmovl/subl pairs with iaddl. • Also added the leave instruction at the bottom of the code.

      BEFORE:
      Loop:   mrmovl (%ebx), %eax
              rmmovl %eax, (%ecx)
              andl %eax, %eax
              jle Npos
              irmovl $1, %edi
              addl %edi, %esi
      Npos:   irmovl $1, %edi
              subl %edi, %edx
              irmovl $4, %edi
              addl %edi, %ebx
              addl %edi, %ecx
              andl %edx, %edx
              jg Loop

      AFTER:
      Loop:   mrmovl (%ebx), %eax
              rmmovl %eax, (%ecx)
              andl %eax, %eax
              jle Npos
              iaddl $1, %esi
      Npos:   iaddl $-1, %edx
              iaddl $4, %ebx
              iaddl $4, %ecx
              andl %edx, %edx
              jg Loop

  15. Changes #2 • Rearranged the conditional branch inside the loop so that it is taken most of the time.

      BEFORE:
      Loop:   mrmovl (%ebx), %eax
              rmmovl %eax, (%ecx)
              andl %eax, %eax
              jle Npos
              iaddl $1, %esi
      Npos:   iaddl $-1, %edx
              iaddl $4, %ebx
              iaddl $4, %ecx
              andl %edx, %edx
              jg Loop

      AFTER:
      Loop:   mrmovl (%ebx), %eax
              rmmovl %eax, (%ecx)
              iaddl $1, %esi
              andl %eax, %eax
              jg pos
              iaddl $-1, %esi
      pos:    iaddl $-1, %edx
              iaddl $4, %ebx
              iaddl $4, %ecx
              andl %edx, %edx
              jg Loop

  16. Changes #3

      BEFORE:
      Loop:   mrmovl (%ebx), %eax
              rmmovl %eax, (%ecx)
              iaddl $1, %esi
              andl %eax, %eax
              jg pos
              iaddl $-1, %esi
      pos:    iaddl $-1, %edx
              iaddl $4, %ebx
              iaddl $4, %ecx
              andl %edx, %edx
              jg Loop

      AFTER:
      Loop:   mrmovl (%ebx), %eax
              iaddl $1, %esi
              rmmovl %eax, (%ecx)
              andl %eax, %eax
              jg pos
              iaddl $-1, %esi
      pos:    iaddl $-1, %edx
              iaddl $4, %ebx
              iaddl $4, %ecx
              andl %edx, %edx
              jg Loop

  17. Changes #4

      BEFORE:
      Loop:   mrmovl (%ebx), %eax
              iaddl $1, %esi
              rmmovl %eax, (%ecx)
              andl %eax, %eax
              jg pos
              iaddl $-1, %esi
      pos:    iaddl $-1, %edx
              iaddl $4, %ebx
              iaddl $4, %ecx
              andl %edx, %edx
              jg Loop

      AFTER:
      Loop:   mrmovl (%ebx), %eax
              iaddl $1, %esi
              rmmovl %eax, (%ecx)
              andl %eax, %eax
              jg pos
              iaddl $-1, %esi
      pos:    iaddl $4, %ebx
              iaddl $4, %ecx
              iaddl $-1, %edx
              jg Loop
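Dropping the final andl is safe only if the decrement of %edx, placed directly before jg, leaves the condition codes in a state that gives the same branch decision. The sketch below checks this for a range of loop counts. It is a simplified model with hypothetical helper names, and it assumes iaddl sets ZF, SF, and OF the same way addl does.

```python
def to_signed32(x):
    """Wrap a Python int to a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def flags_after_add(a, b):
    """Result and (ZF, SF, OF) after a 32-bit add, as assumed for iaddl."""
    result = to_signed32(a + b)
    overflow = (a < 0) == (b < 0) and (result < 0) != (a < 0)
    return result, (result == 0, result < 0, overflow)


def flags_after_and(a, b):
    """Result and (ZF, SF, OF) after a 32-bit and; OF is always cleared."""
    result = to_signed32(a & b)
    return result, (result == 0, result < 0, False)


def jg_taken(flags):
    zf, sf, of = flags
    return (sf == of) and not zf


def old_tail(edx):
    """iaddl $-1, %edx ... andl %edx, %edx ; jg Loop"""
    edx, _ = flags_after_add(edx, -1)
    _, flags = flags_after_and(edx, edx)
    return jg_taken(flags)


def new_tail(edx):
    """iaddl $-1, %edx ; jg Loop"""
    _, flags = flags_after_add(edx, -1)
    return jg_taken(flags)


same = all(old_tail(n) == new_tail(n) for n in range(-100, 100))
print(same)  # True
```

In this model the two versions differ only when the decrement overflows, which would require %edx to start at the most negative 32-bit value; that is not a realistic length.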

  18. Results • CPE (cycles per element) was measured after each change. • Together, the changes lowered my CPE by 36%, from 18.15 to 11.59.
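The 36% figure follows directly from the two CPE values on the slide. A short check, with the helper names `cpe` and `percent_reduction` chosen here only for illustration and CPE taken in its usual sense of total cycles divided by elements processed:

```python
def cpe(total_cycles, elements):
    """Cycles per element: total cycles divided by the number of elements processed."""
    return total_cycles / elements


def percent_reduction(before, after):
    """Relative improvement from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# CPE values reported on the slide.
reduction = percent_reduction(18.15, 11.59)
print(f"{reduction:.1f}%")  # 36.1%
```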

  19. Results • [Chart: results in the order the changes were made]
