TigerSHARC CLU Closer look at the XCORRS

TigerSHARC CLU: Closer look at the XCORRS. M. Smith, University of Calgary, Canada, smithmr@ucalgary.ca. Overview: recap GPS correlation; look at the XCORRS instruction in detail; this was part of the take-home quiz for 5005; additional information on the web.

Presentation Transcript


  1. TigerSHARC CLU: Closer look at the XCORRS. M. Smith, University of Calgary, Canada, smithmr@ucalgary.ca

  2. Overview
     • Recap GPS correlation
     • Look at the XCORRS instruction in detail
     • This was part of the take-home quiz for 5005
     • Additional information on the web
       • Xcorrs.asm – assembly code discussed in class
       • Xmain.cpp – demonstrates the use of the xcorrs.asm code
       • XcorrsTest.cpp – demonstrates testing of all the functions being used
       • Additional correlation presentations (not XCORRS) from Analog Devices developers
     • In 2005, we pointed out many errors in the TigerSHARC XCORRS explanation. If my figures are not the same as in the manual, then they have fixed the manual errors

  3. GPS Positioning Concepts
     • For now make 2 assumptions:
       • We know the distance to each satellite
       • We know where each satellite is
     • With this information from 2 satellites, you know you are on a “plane of intersection”
     • Requires 3 satellites for a 3-D position in this “ideal” scenario
     • Requires 4 satellites to account for local receiver clock drift. (1)

  4. Determining Time
     Figure labels: signal sent by satellite; signal received by you.
     You know the signal that was sent. Perform correlations until you get a match.
     • Use the PRN code to determine time
     • Use time to determine distance to the satellite: distance = speed of light * time (1)
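The distance formula on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the 0.07 s travel time are hypothetical and do not come from the lecture; the speed of light is the exact SI value.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact by SI definition


def distance_m(travel_time_s):
    """distance = speed of light * time, as stated on the slide."""
    return SPEED_OF_LIGHT_M_PER_S * travel_time_s


def distance_per_sample_m(sample_rate_hz):
    """Distance that corresponds to a one-sample error in the measured delay."""
    return SPEED_OF_LIGHT_M_PER_S / sample_rate_hz


# Hypothetical travel time of 0.07 s, chosen only to show the scale involved.
example_distance_km = distance_m(0.07) / 1000.0
```

The second helper shows why the delay has to be found precisely: at a hypothetical sample rate of 1 MHz, being off by a single sample shifts the distance by roughly 300 m.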

  5. The practice
     • Suppose we have a vector of in-phase and out-of-phase data, gathered over an antenna from a satellite, for example. Gain issues scale it by 16:
       -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, etc.
     • Question: if the original data from the satellite had the form
       -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, ...
       how much is the satellite data delayed?
     • For this example: 0, 3, 6, 9, 12, etc.
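The answer "0, 3, 6, 9, 12" can be checked with a small brute-force correlation in Python. This is a sketch under assumptions: it uses a circular correlation over 15 samples and conjugates the reference, which is the usual definition for complex data but is not stated on the slide; the function name is made up for the illustration.

```python
def circular_correlation(received, reference):
    """Correlate at every shift, wrapping around at the end of the data."""
    n = len(reference)
    return [
        sum(received[(i + shift) % n] * reference[i].conjugate() for i in range(n))
        for shift in range(n)
    ]


pattern = [-1 - 1j, 1 + 1j, 1 + 1j]
reference = pattern * 5                  # what the satellite sent
received = [16 * v for v in reference]   # what the antenna saw (gain of 16)

correlation = circular_correlation(received, reference)
peak = max(v.real for v in correlation)
matching_shifts = [s for s, v in enumerate(correlation) if v.real == peak]
```

Because the example pattern repeats every 3 samples, every third shift gives the same maximum, so the delay is only known modulo 3 here. A real PRN code is much longer and does not repeat this quickly.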

  6. Tackle the issue with FIR
     • First, modify the correlation function to handle complex values
     • Ignore that issue for the moment: each tap goes from 1 add + 1 multiplication + 2 memory fetches to 3 adds + 4 multiplications + 4 memory fetches
     • Imagine 1024 data points + 1024 PRN values
     • Need to do 1024 FIRs, each of 1024 taps
     • We know how to optimize to do 2 taps every cycle (one in X and one in Y)
     • Cycle count is 1024 * 512 cycles, about 1 ms at 500 MHz
     • XCORRS can do 8 * 16 taps each cycle in each compute block – 148 times faster
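The baseline cycle count quoted on this slide is simple arithmetic, written out below in Python. The variable names are illustrative; the numbers are the ones on the slide.

```python
DATA_POINTS = 1024     # one FIR per data point
TAPS_PER_FIR = 1024    # one tap per PRN value
TAPS_PER_CYCLE = 2     # one tap in X and one in Y, as optimized in the labs
CLOCK_HZ = 500e6       # 500 MHz

cycles = DATA_POINTS * TAPS_PER_FIR // TAPS_PER_CYCLE
time_ms = cycles / CLOCK_HZ * 1e3
```

This gives 524288 cycles, or about 1.05 ms, which the slide rounds to 1 ms. It counts real taps only and ignores the extra cost of complex arithmetic mentioned in the second bullet.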

  7. Where does the CLU fit in?

  8. XCORRS definition

  9. THEORY: Mathematical definition
     Uses registers
     • TR -- accumulate
     • D -- 8 data?
     • C -- 1 coefficient?
     And something called CUT, essentially a window operation
     • fcut = 0 -- don’t use

  10. 2005 Lab. 4: Satellite data
      • Quad fetch brings in 8 complex values, 8 bits each
      • Pattern here is -1 + 0j, 1 + 0j, 1 + 0j, -1 + 0j, 1 + 0j, 1 + 0j, ……….

  11. PRN code: 2-bit complex number
      • Seems strange to have two dummy bits, but it actually makes sense
      • PRN: -1 - j, 1 + j, 1 + j, -1 - j, 1 + j, 1 + j, ……….
      • +1 and -1 are associated with the PSK (more in another lecture)
      • Problem: BINARY means 1 and 0, so how do we represent 1 and -1?
      • -1 is stored as 1, +1 is stored as 0 (DAMY)

  12. PRN

  13. PRN: the value 0x3 goes in as C15 and C16
      0011 -- C15 = -1 - j, C16 = +1 + j
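The 0x3 example can be reproduced with a short Python decoder. Treat it as a sketch with two assumptions that the slides do not spell out: the coefficient with the lower index sits in the lower two bits, and within each 2-bit field the low bit is the real part and the high bit the imaginary part. The second assumption does not affect this example, since both bits of each field are equal. The function name is hypothetical.

```python
def decode_prn(word, count, first_index=0):
    """Unpack `count` 2-bit PRN coefficients from an integer.

    A stored 1 bit means -1 and a stored 0 bit means +1, as on the PRN slide.
    """
    coefficients = {}
    for k in range(count):
        field = (word >> (2 * k)) & 0b11
        real = -1 if field & 0b01 else 1   # assumed: low bit is the real part
        imag = -1 if field & 0b10 else 1   # assumed: high bit is the imaginary part
        coefficients[first_index + k] = complex(real, imag)
    return coefficients


decoded = decode_prn(0x3, count=2, first_index=15)
```

With these assumptions `decoded[15]` is -1 - j and `decoded[16]` is +1 + j, matching the slide.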

  14. Loading the THR registers

  15. Standard XCORRS instruction
      Figure labels: lower 46 bits of THR1:0; R7:3; TR0, TR1, TR2, ……., TR15

  16. TR15:0 = XCORRS(R7:4, THR3:0)
      Doing 8 complex taps of 16 correlations each cycle
      TR0 += D7 * C22 + D6 * C21 + … 8 taps
      TR1 += D7 * C21 + D6 * C20 + … 8 taps
      ………..
      TR15 += D7 * C7 + D6 * C6 + … 8 taps
      64 taps each cycle, on both X and Y compute blocks if set up properly
      128 taps each cycle. These are “complex taps”, compared to 2 real taps / cycle after Lab. 3
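The index pattern in these sums can be written as TR[k] += sum over i of D[i] * C[i + 15 - k], for k = 0..15 and i = 0..7. The Python model below is a sketch of that pattern only, read off the slide; it is not taken from the processor manual, it uses plain complex multiplication as written on the slide, and the function name is made up.

```python
def xcorrs_step(tr, d, c):
    """Software model of one un-cut XCORRS step, as listed on the slide.

    tr: 16 accumulators TR0..TR15
    d:  8 data values D0..D7
    c:  23 coefficients C0..C22
    Returns the 16 updated accumulators.
    """
    return [
        tr[k] + sum(d[i] * c[i + 15 - k] for i in range(8))
        for k in range(16)
    ]


# Make coefficient Cj equal to j so each sum shows which coefficients it used.
coefficients = list(range(23))
only_d7 = [0] * 7 + [1]                      # D7 = 1, all other data zero
picked = xcorrs_step([0] * 16, only_d7, coefficients)
```

With only D7 set, `picked[0]` is 22 and `picked[15]` is 7, which are the C22 and C7 terms that multiply D7 in the first and last lines of the slide.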

  17. TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)
      Because of offsets, sometimes we must only use “some of the taps”
      TR0 += D7 * C22 + D6 * C21 + … 8 taps
      TR1 += D7 * C21 + D6 * C20 + … 8 taps
      ………..
      TR14 += D7 * C8 + D6 * C7 2 taps
      TR15 += D7 * C7 1 tap

  18. TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)
      TR0 += D7 * C22 + D6 * C21 + … 8 taps
      TR1 += D7 * C21 + D6 * C20 + … 7 taps
      ………..
      TR7 += D7 * C15 1 tap
      TR8 += 0 0 taps
      ………..
      TR15 += 0 0 taps
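The tap counts listed for CUT -7 and CUT -15 are consistent with a simple rule: a negative cut of -n keeps only the terms whose coefficient index is n or higher. The Python sketch below encodes that rule. It is inferred from the tap listings on these slides, not from the TigerSHARC manual, so treat it as an assumption about the windowing rather than a definition; the function name is hypothetical.

```python
def kept_taps_negative_cut(k, cut):
    """(data index, coefficient index) pairs that TR[k] keeps for a negative CUT.

    Assumed rule, inferred from the slides: CUT -n drops every term whose
    coefficient index is below n.
    """
    lowest_kept = -cut
    return [(i, i + 15 - k) for i in range(8) if i + 15 - k >= lowest_kept]


counts_cut_7 = [len(kept_taps_negative_cut(k, -7)) for k in range(16)]
counts_cut_15 = [len(kept_taps_negative_cut(k, -15)) for k in range(16)]
```

Under this rule CUT -7 leaves TR0 to TR8 with all 8 taps and tapers down to 1 tap for TR15 (D7 * C7), while CUT -15 tapers from 8 taps at TR0 to 1 tap at TR7 (D7 * C15) and leaves TR8 to TR15 with nothing to add.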

  19. TR15:0 = XCORRS(R7:4, THR3:0) (CUT +7?)
      TR0 += 0 0 taps
      TR1 += D0 * C14 1 tap
      ………..
      TR7 += D6 * C14 + D5 * C13 + … 7 taps
      TR8 += D7 * C14 + D6 * C13 + … 8 taps
      ………..
      TR15 += D7 * C7 + D6 * C6 + … 8 taps

  20. TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)
      TR0 += D7 * C22 + D6 * C21 + … 8 taps
      TR1 += D7 * C21 + D6 * C20 + … 7 taps
      ………..
      TR7 += D7 * C15 1 tap
      TR8 += 0 0 taps
      ………..
      TR15 += 0 0 taps

  21. TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)
      TR0 += D7 * C22 + D6 * C21 + … 8 taps
      TR1 += D7 * C21 + D6 * C20 + … 8 taps
      ………..
      TR14 += D7 * C8 + D6 * C7 2 taps
      TR15 += D7 * C7 1 tap

  22. TR15:0 = XCORRS(R7:4, THR3:0)
      TR0 += D7 * C22 + D6 * C21 + … 8 taps
      TR1 += D7 * C21 + D6 * C20 + … 8 taps
      ………..
      TR15 += D7 * C7 + D6 * C6 + … 8 taps
      64 taps each cycle, on both X and Y compute blocks if set up properly
      128 taps each cycle. These are “complex taps”, compared to 2 real taps / cycle after Lab. 3

  23. Problem at this point -- THR3:2 empty. Need to bring in more PRN values

  24. TR15:0 = XCORRS(R7:4, THR3:0) (CUT +15)
      TR0 += 0 0 taps
      TR1 += D0 * C14 1 tap
      ………..
      TR7 += D6 * C14 + D5 * C13 + … 7 taps
      TR8 += D7 * C14 + D6 * C13 + … 8 taps
      ………..
      TR15 += D7 * C7 + D6 * C6 + … 8 taps
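The CUT +15 listing fits the mirror image of the negative-cut pattern: a positive cut of +n keeps only the terms whose coefficient index is below n. The Python sketch below checks the tap counts under that rule. As before, the rule is inferred from the listing on this slide and is an assumption, not the manual's definition; the function name is hypothetical.

```python
def kept_taps_positive_cut(k, cut):
    """(data index, coefficient index) pairs that TR[k] keeps for a positive CUT.

    Assumed rule, inferred from the slide: CUT +n drops every term whose
    coefficient index is n or higher.
    """
    return [(i, i + 15 - k) for i in range(8) if i + 15 - k < cut]


counts_cut_plus_15 = [len(kept_taps_positive_cut(k, 15)) for k in range(16)]
```

Under this rule TR0 gets no taps, TR1 gets the single term D0 * C14, TR7 gets 7 taps ending at D6 * C14, and TR8 to TR15 get all 8. For each accumulator, the taps kept here are exactly the ones dropped by CUT -15 under the matching negative-cut rule, so the two windows together cover all 8 taps.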

  25. Final Result
      Maximum correlation occurs every 3 shifts, which is what we expect. Is it the correct result?

  26. Correlation: result expected
      In step: -1 + 0j, 1 + 0j, 1 + 0j, … (16 times) with -1 - j, 1 + j, 1 + j, … (16 times)
        -1 * -1 + 1 * 1 + 1 * 1 + … = 48 = 0x30 -- real component
      Out of step: -1 + 0j, 1 + 0j, 1 + 0j, … (16 times) with 1 + j, 1 + j, -1 - j, … (16 times)
        -1 * 1 + 1 * 1 + 1 * -1 + … = -16 = -0x10 = 0xFFF0
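The two expected values can be checked directly in Python. This sketch sums the real parts of the plain products, as written on the slide. The 16-bit mask is an assumption made only to reproduce the 0xFFF0 shown there, and the function name is illustrative.

```python
def real_correlation(data, prn):
    """Real component of the sum of data * PRN products."""
    return int(sum((d * p).real for d, p in zip(data, prn)))


data = [-1 + 0j, 1 + 0j, 1 + 0j] * 16
prn_in_step = [-1 - 1j, 1 + 1j, 1 + 1j] * 16
prn_out_of_step = [1 + 1j, 1 + 1j, -1 - 1j] * 16

in_step = real_correlation(data, prn_in_step)
out_of_step = real_correlation(data, prn_out_of_step)

# Show the negative value as 16-bit two's complement, as on the slide.
out_of_step_hex = hex(out_of_step & 0xFFFF)
```

Each group of three samples contributes +3 when in step and -1 when out of step, so 16 repetitions give 48 (0x30) and -16 (0xFFF0).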

  27. Final Result
      1) Now have correlation values for 16 shifts in the TR registers. Store them to external memory, repeat for all other necessary shifts, and find the maximum
      2) Now make parallel in SISD mode
      3) Now make parallel in SIMD

  28. Overview
      • Recap GPS correlation
      • Look at the XCORRS instruction in detail
      • This was part of the take-home quiz for 5005
      • Additional information on the web
        • Xcorrs.asm – assembly code discussed in class
        • Xmain.cpp – demonstrates the use of the xcorrs.asm code
        • XcorrsTest.cpp – demonstrates testing of all the functions being used
        • Additional correlation presentations (not XCORRS) from Analog Devices developers
      • In 2005, we pointed out many errors in the TigerSHARC XCORRS explanation. If my figures are not the same as in the manual, then they have fixed the manual errors
