PI: Behzad Razavi Co-PIs: Danijela Cabric, Dejan Markovic, Ali Sayed, and Jason Woo - PowerPoint PPT Presentation

Presentation Transcript

  1. MTO HEALICs Program, July 9, 2010 • Cogno: A Self-Healing Mixed-Signal Baseband Processor for Cognitive Radios • PI: Behzad Razavi • Co-PIs: Danijela Cabric, Dejan Markovic, Ali Sayed, and Jason Woo • Electrical Engineering Dept., University of California, Los Angeles • COTRs: Jay Rockway (SPAWAR), Cynthia Hanson (SPAWAR)

  2. Outline • System and Schedule Review • MTO Questions • Technical Description - Measured ADC Slice Results - ADC Design and Layout - Jitter Reduction - RF Spectrum Sensing • Backup Slides - Teaming and Personnel - Financials

  3. System and Schedule Review • Razavi: PLL Design • Markovic: Logic Synthesis Spectrum Sensing Implementation • Sayed: Jitter Healing Algorithms • Cabric: Spectrum Sensing Algorithms • Razavi: ADC Design • Markovic: Logic Synthesis Mismatch Modeling • Sayed: Self-Healing Algorithms • Woo: Mismatch Modeling

  4. MTO Questions • When presenting the metrics chart, please provide updates to the table with measured and/or simulated values. Response: Updated. • Please provide an update on testing of the single ADC comparator “slice”. Response: To be presented. • There is still concern from the government team that a design of the full self-healing ADC should be finalized and prototyped as soon as possible to mitigate risk going into the latter half of Phase I. Response: July tapeout to mitigate risks. • Provide an update regarding the expected performance from both the interleaved and single-slice ADC architectures. Are you on track to tape out the different design variants (for risk mitigation) in July? Response: To be discussed on slide 6. • Document co-PI contributions to the ADC design process. Response: To be presented. • Please present the justification of the effectiveness of your proposed jitter reduction technique. Response: To be presented.

  5. Summary of Accomplishments • ADC: - Finalized and froze ADC architecture. - Successfully measured one ADC slice. - Incorporated comparator noise averaging during self-healing. - Laid out, extracted, and verified coarse stage and one slice of fine stage (including self-healing machinery). • 8-dB reduction in ADC clock jitter by self-healing with 6 mW of power • Taped out spectrum sensing chip

  6. Present Challenges • ADC: - Higher power consumption than expected because of rush to tape out. (High risk) - Complex self-healing due to comparator noise averaging. (Moderate risk) → Tape out coarse ADC and one slice of fine ADC in July. - Difficult to interleave two ADCs on one chip because of the large number of high-speed digital outputs (~40). → Interleave two ADCs on PC board.

  7. Phase I Program Metrics ADC Program Metrics

  8. Tapeout Schedule

  9. Coarse Healing Slice • To test the coarse healing cancellation scheme: • One coarse comparator • A chain of FFs • Ladder • Healing scheme (sense/correct offset) • Begin/end of the chain • Timing: use the negative output of the comparator to clock the chain

  10. Test Assembly • Die photograph • Chip is bonded to a copper board

  11. Measurement Modes • There are two operation modes: • ADC Operation mode • Calibration mode

  12. Measurement Results • Step 1: test comparator in ADC operation mode • Low decision • Metastable

  13. Measurement Results • Step 2: reset self-healing register; start reference sliding • Comparator output during healing • Comparator output after healing

  14. Measurement Summary • VDD: supply voltage • Vslide-reset: reference voltage at the beginning of the chain • Vslide-settled: reference voltage after calibration is done → Slice operates as expected, demonstrating the self-healing operation.

  15. Architecture IX • Coarse decision slides one input of preamp to a close estimate of the analog input voltage. • Maximum output voltage generated by preamp is half of the value in Arch. VIII → linearity is improved. • But timing budget is still tight.

  16. Coarse ADC Comparators • Coarse ADC comparators incorporate: • Digital healing of comparator offset by sliding the reference input of the comparator along the ladder. • Distributed sample-and-hold • Avoids corruption of comparator decision by kick-back from other comparators. • Self resetting of the sampling capacitors • Preventing memory effects. • Kick-back cancellation circuit • Resulting in a maximum of 3 mV systematic offset of a comparator as opposed to 15 mV in the absence of the circuit.

  17. Coarse Comparator • [Schematic labels: self-resetting switches, kick-back cancellation circuit]

  18. Coarse ADC Layout • [Layout labels: self-healing logic, resistor ladder and bypass capacitors, 127 comparators] • Total height: 1.60 mm • Total length: 0.36 mm

  19. Fine ADC • 16 fine blocks (FBs), each consisting of • one preamp • 25 comparators with built-in offsets • Every two FBs share one preamp • Fine healing corrects the offset of the preamp and the offset of each comparator

  20. Fine ADC • To remove kick-back noise of the fine comps: • Distributed sampling to isolate the sampling nodes of comparators • Pipelining: to improve speed • Share one cap between two comps to reduce loading on preamp

  21. Fine ADC • Each preamp slides over 8 sub-ranges • The amplified sub-range is finely digitized by fine comps • An overlap of 14-LSB is chosen to correct: • Coarse comp uncorrected offset (4-LSB) • Coarse comp thermal noise (4-LSB) • Timing mismatch between Fine stage and Coarse stage (±3-LSB)
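One way to read the 14-LSB overlap on this slide is as a simple sum of the three listed error terms, with the ±3-LSB timing mismatch counted on both sides. That reading is an assumption (the slide does not spell out the arithmetic), and the variable names below are illustrative only.

```python
# Error terms listed on the slide, in LSBs.
coarse_offset_lsb = 4        # coarse comparator uncorrected offset
coarse_noise_lsb = 4         # coarse comparator thermal noise
timing_mismatch_lsb = 3      # quoted as +/-3 LSB on the slide

# Assumption: the +/-3 LSB term contributes its full span (2 * 3 LSB).
overlap_lsb = coarse_offset_lsb + coarse_noise_lsb + 2 * timing_mismatch_lsb
print(overlap_lsb)
```

Under that reading the three terms account for the whole overlap, with no extra margin.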

  22. ADC Timing and Clock Phases • Two clock phases • PHS: sample input signal • PHH: hold; clock of the coarse ADC • Self-timed operation: coarse ADC generates the clock for the fine ADC • En_Fine: output of the coarse ADC that enables the fine ADC • PHS,early: early sampling phase to clock the fine comparators

  23. Fine ADC: Offset Healing • Fine comparator: • VDD = 1.2 V • I_avg = 80 µA • σn,in = 630 µV rms • Offset tuning: • Tune the gate voltage (Vcal) of a transistor that is in parallel with the input transistors

  24. Fine Healing Logic (I) • Fine healing logic heals the offset of the fine comparators by sliding one of the inputs of the fine comparator along a ladder. • The key blocks in the fine healing logic are: • Shift register with a moving “1” that tells which comparator is being healed. • Input scan chain to load the control signals for the switches that tap onto the resistor ladder. • Self-healing logic slice incorporating: • Registers storing the comparator state and which comparator is being healed • Tri-states to enable reading out of each state • Decoders that control the switches • Output scan chain to read out the state of each comparator. • Each comparator is healed many times and the results are compiled to determine the “favorable” tap voltage.

  25. Fine Healing Logic (II) • Shift register with a shifting “1” that tells which comparator is being healed. • Input scan chain. • Output scan chain.

  26. Fine Calibration Logic (III) • The healing sequence is as follows: • Toggle “RESETB” from “0” to “1”. This resets all the registers in the shift register along the slice to “0” and the bottom-most register to “1”. • Toggle “ShiftClk”, shifting the “1” to the register of “Slice 0”. • Scan data into the input scan chain using “DinScan” and “ScanClk”. • Toggle “LoadClk”. This loads data in the input scan chain to the corresponding registers of the slice that is enabled by the “1” in the shift register. • Clock the comparator a few times and average the comparator output. • The register value that yields 50% “1s” and 50% “0s” is the value that removes the offset of the comparator. • The register value can be read out using the output scan chain. • Toggle “ShiftClk” again to shift the “1” and move on to the next comparator.
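The healing sequence on this slide can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not the chip's logic: the names `averaged_decision` and `heal_slice`, the 17-tap ladder with 1 mV steps, the example offsets, and the 200 clocks per setting are all hypothetical. The 630 µV rms noise figure is borrowed from the fine comparator slide.

```python
import random

def averaged_decision(v_tap, offset, noise_rms, n_clocks, rng):
    """Clock the comparator n_clocks times with zero differential input
    and return the fraction of '1' decisions."""
    ones = 0
    for _ in range(n_clocks):
        noise = rng.gauss(0.0, noise_rms)
        # Decision is 1 when offset plus noise exceeds the selected tap voltage.
        if offset + noise - v_tap > 0:
            ones += 1
    return ones / n_clocks

def heal_slice(offsets, taps, noise_rms=0.63e-3, n_clocks=200, seed=1):
    """For each comparator, try every tap code and keep the code whose
    averaged output is closest to 50% ones."""
    rng = random.Random(seed)
    healed_codes = []
    for offset in offsets:  # one "ShiftClk" step per comparator
        # Each tap code models one scan-in plus "LoadClk" of a register value.
        fractions = [averaged_decision(v, offset, noise_rms, n_clocks, rng)
                     for v in taps]
        best = min(range(len(taps)), key=lambda code: abs(fractions[code] - 0.5))
        healed_codes.append(best)
    return healed_codes

# Hypothetical ladder: 17 taps from -8 mV to +8 mV in 1 mV steps.
taps = [(k - 8) * 1e-3 for k in range(17)]
offsets = [3.2e-3, -5.1e-3, 0.4e-3]   # hypothetical comparator offsets
codes = heal_slice(offsets, taps)
residuals = [off - taps[code] for off, code in zip(offsets, codes)]
print(codes, residuals)
```

Averaging over many clocks is what makes the 50% criterion usable in the presence of comparator noise. With too few clocks, neighboring tap codes become hard to tell apart.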

  27. Fine ADC Layout • Fine ladder: poly resistors with bypass caps • Requires: • 1023 resistors • 128 bypass caps at sliding ref taps • Preamp layout: 32 µm × 11 µm • Fine comp & SR latch layout: 15 µm × 3.9 µm

  28. Fine ADC Layout • One half of slice (comp and its calibration logic): fine decoder, fine comp, bypass caps, 5-1 MUX, switches • Fine slice: 2 x (comp and its calibration logic) • [Layout dimension labels: 7.8 µm, 133 µm, 250 µm]

  29. Jitter Reduction & Spectrum Sensing • Jitter reduction – major updates • Quantization noise is factored into simulation • Fixed-point implementation of the recovery algorithm • Hardware performance metrics are computed (Area/Power) • RF spectrum sensing – major updates • RF sensing with (PD, PFA) = (0.9, 0.1) detection performance • Experimentally verified signal detection with SNR = −5 dB • Taped out spectrum sensing chip (9.53 mW, 250 MHz band)

  30. Jitter Reduction Summary • Assumptions: • The SNR is computed by inputting a tone at 250 MHz and computing the SNR at the output of the system. • The area is computed assuming a direct ratio of 10,000 LUTs = 0.5 mm². • The power is computed assuming 5 GOPS/mW efficiency.

  31. Jitter Reduction: Proposed Architecture • A high frequency tone is modulated with a low frequency tone and injected into the incoming signal. These tones will serve as training for the estimation method. • The training tones are scaled by the coefficient α and the quantization noise q(n) is incorporated:

  32. Jitter Estimation and Compensation • The jitter estimation and jitter compensation blocks are based on the Q5 report. [Block diagram] • Analysis of the jitter estimation performance gives expressions for the estimates and for the variance of the estimation error. [Equations not preserved in the transcript] • α is a scale factor that sets the power of the training tones relative to the incoming signal.

  33. Matlab/Simulink Hardware Model • The fixed-point design was implemented in Simulink using the Synplify DSP fixed-point blockset. • Linear interpolation based on the derivative of the incoming signal
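Derivative-based linear interpolation can be illustrated with a first-order correction: a sample taken at time nT + δ[n] is approximately x(nT) + δ[n]·x'(nT), so subtracting δ[n] times the estimated slope recovers x(nT). The sketch below is a floating-point illustration under several assumptions that are not in the slides: a two-point central difference stands in for the 9- or 11-tap derivative filters, the jitter estimate is taken to be perfect, and the sample rate, tone frequency, and jitter level are hypothetical.

```python
import math
import random

def compensate_jitter(y, jitter_est, fs):
    """First-order correction: x_hat[n] = y[n] - jitter_est[n] * dy/dt[n].
    End samples are left uncorrected because the central difference
    needs both neighbors."""
    out = list(y)
    for n in range(1, len(y) - 1):
        slope = (y[n + 1] - y[n - 1]) * fs / 2.0  # central-difference derivative
        out[n] = y[n] - jitter_est[n] * slope
    return out

def rms_error(a, b):
    """RMS difference over the interior samples."""
    diffs = [(u - v) ** 2 for u, v in zip(a[1:-1], b[1:-1])]
    return math.sqrt(sum(diffs) / len(diffs))

fs = 1.0e9          # hypothetical sample rate (Hz)
f0 = 50.0e6         # hypothetical test tone (Hz)
n_samples = 512
rng = random.Random(0)
jitter = [rng.gauss(0.0, 5e-12) for _ in range(n_samples)]  # hypothetical 5 ps rms

ideal = [math.sin(2 * math.pi * f0 * n / fs) for n in range(n_samples)]
sampled = [math.sin(2 * math.pi * f0 * (n / fs + jitter[n]))
           for n in range(n_samples)]
corrected = compensate_jitter(sampled, jitter, fs)

err_before = rms_error(sampled, ideal)
err_after = rms_error(corrected, ideal)
print(err_before, err_after)
```

The residual error in this sketch is dominated by the crude derivative estimate, which is one reason a longer derivative filter is attractive in hardware.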

  34. Architecture Design • Architecture #1: • 128-tap low-pass filter • 11-tap derivative filter • Area: 9610 LUTs (meets 100 MHz timing) • Architecture #2: • Accumulator low-pass filter • Direct-form 9-tap derivative filter • Area: 2942 LUTs (meets 89 MHz timing) • Architecture #3: • Accumulator low-pass filter • Anti-symmetric 9-tap derivative filter • 10-16 bit wordlengths • Area: 1339 LUTs (meets 89 MHz timing)
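The LUT counts above can be converted to an estimated silicon area with the ratio stated on the Jitter Reduction Summary slide (10,000 LUTs = 0.5 mm²). The short sketch below only applies that ratio. The function and variable names are illustrative.

```python
# Ratio from the Jitter Reduction Summary slide: 10,000 LUTs = 0.5 mm^2.
MM2_PER_LUT = 0.5 / 10_000

# LUT counts quoted for the three architectures.
lut_counts = {
    "Architecture #1": 9610,
    "Architecture #2": 2942,
    "Architecture #3": 1339,
}

def lut_area_mm2(luts):
    """Estimated area in mm^2 for a given LUT count."""
    return luts * MM2_PER_LUT

areas = {name: lut_area_mm2(luts) for name, luts in lut_counts.items()}
for name, area in areas.items():
    print(f"{name}: {area:.4f} mm^2")
```

Under this ratio, Architecture #3 needs roughly one seventh of the area of Architecture #1.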

  35. Wideband Spectrum Sensing: Issues and Specifications • Interfering power issues • Introduced by strong adjacent-band signals • Makes weak signals difficult to detect • Specifications

  36. Design Challenges in Sensing Algorithms • Interfering power increases estimation error • Design goal can’t be met even with max sensing time • Ignoring interfering power leads to an underestimated threshold • False-alarm rate increases • [Figure: PD versus PFA, marking the desired operating region and the effect of increased sensing time]

  37. Proposed Wideband Spectrum Sensing Processor • Multitap windowed power detector • Reduces Tsensing by interfering power suppression • Number-of-averages and threshold adaptation schemes • Meet PD and PFA by adapting to interfering power

  38. Threshold Adaptation • Adapts decision thresholds to in-band interfering power • Includes the in-band interferer in the binary hypothesis test

  39. Threshold Adaptation Analysis • Tmw(c) as a function of noise power and interfering power • Allows dynamic threshold adaptation • The number of averages and the decision threshold depend on SNR, PD, PFA, noise power, and interfering power • a: fitting factor

  40. Sensing Performance Analysis • Multitap windowed FPD • Modeled by an M-dependent sequence • When M is large enough, Tmw(c) can be modeled by a Gaussian distribution • Used to derive the number of averages and the decision threshold
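To show how a Gaussian model leads to a number of averages and a threshold, the sketch below uses the textbook power detector without any interferer. It is not the multitap windowed detector or the Tmw(c) expression from these slides, which the transcript does not preserve. It assumes the averaged bin power has mean σ² and standard deviation σ²/√M under noise only, and mean σ²(1 + SNR) and standard deviation σ²(1 + SNR)/√M when a signal is present. Function names are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 1.0 - _STD_NORMAL.cdf(x)

def q_inv(p):
    """Inverse of the Gaussian tail probability."""
    return _STD_NORMAL.inv_cdf(1.0 - p)

def num_averages(snr_db, pd, pfa):
    """Smallest M meeting (PD, PFA) under the Gaussian approximation."""
    snr = 10.0 ** (snr_db / 10.0)
    root_m = (q_inv(pfa) - q_inv(pd) * (1.0 + snr)) / snr
    return math.ceil(root_m ** 2)

def threshold(noise_power, m, pfa):
    """Decision threshold that gives the target false-alarm rate."""
    return noise_power * (1.0 + q_inv(pfa) / math.sqrt(m))

def detection_prob(noise_power, snr_db, m, thr):
    """Detection probability achieved with threshold thr and M averages."""
    snr = 10.0 ** (snr_db / 10.0)
    mean = noise_power * (1.0 + snr)
    std = mean / math.sqrt(m)
    return q_func((thr - mean) / std)

# Operating point quoted earlier in the deck: (PD, PFA) = (0.9, 0.1), SNR = -5 dB.
m = num_averages(-5.0, 0.9, 0.1)
thr = threshold(1.0, m, 0.1)          # noise power normalized to 1
pd_achieved = detection_prob(1.0, -5.0, m, thr)
print(m, thr, pd_achieved)
```

Interfering power would raise both the mean and the variance of the test statistic, which is why the slides adapt the threshold and the number of averages to it.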

  41. Improvement in Detection Rate • Only multitap windowed FPD meets the detection rate constraint within 100 FFT averages • More than 1.3x increase in detection rate • [Figure: PD versus PFA]

  42. Improvement in False-Alarm Rate More than 2x decrease in false-alarm rate @ INR > 10 dB

  43. Improvement in Sensing Time • Only multitap windowed FPD meets the sensing time constraint for a 1-bin adjacent-band interferer • More than an order of magnitude reduction in sensing time

  44. Hardware Emulation Setup • Cognitive radio test bed • BEE2 platform: 5 high-performance Xilinx FPGAs with PowerPC processors, Gigabit serial channels, Ethernet, support for up to 20 GB of RAM • Mixed-signal front end • Baseband processor (two 12-bit ADCs at 64 MS/s and two 14-bit DACs at 128 MS/s) • Analog front-end radio

  45. Experimental Settings • Wideband radio • BW is limited by the analog front end • Scales the target 500 MHz spectrum to 64 MHz • Scales the frequency resolution to 64.5 kHz (to keep the FFT size)

  46. Experimental Results • Proposed threshold adaptation algorithm • Meets PFA constraints • Proposed number-of-accumulations adaptation scheme • Meets PD constraints • Meets sensing time constraints (M < 104)

  47. Design Challenges in Processor Architecture • Low-power design • FFT consumes more than 50% of the total power • Parallelism and voltage scaling for power reduction • Factorize the FFT to find the optimal PU combination that minimizes the power-area product • Small-area (low-cost) design • The large dynamic range of the spectrum means the data path after the FFT processor requires a large word length (WL) • Memory for storing FFT power, noise power, and interfering power occupies > 50% of the total area • Changing the data format to reduce WL for area saving

  48. Parallel-Pipelined FFT for Power Reduction • Parallel architecture allows VDD scaling to reduce power • 8x parallelism achieves minimum power-area product • FFT factorization determines optimal PU combination • 128-pt FFT with radix-4/radix-4/radix-8 architecture achieves 4x power-area-product reduction

  49. Changing Data Format for Area Reduction • Reduces memory size by changing the data format • 60% area reduction is achieved by changing from 2’s complement to floating-point representation

  50. Power-Area Summary • Synthesis estimates • Total chip power: 19.06 mW (500 MHz band), 9.53 mW (250 MHz band)