
Intro. To Audio Post-Processing and Optimization Strategy

Intro. To Audio Post-Processing and Optimization Strategy. Ivan Lee, Jan. 2005. Outline: Post-Processing Overview; Dolby ProLogic II Principle & Features; Float to Fixed-point Translation; Code Optimization on Lx5280; Q&A. Post-Processing Overview (1): BBE (HD Sound, Mach3Bass, ViVA(+))


Presentation Transcript


  1. Intro. To Audio Post-Processing and Optimization Strategy. Ivan Lee, Jan. 2005

  2. Outline • Post-Processing Overview • Dolby ProLogic II Principle & Features • Float to Fixed-point Translation • Code Optimization on Lx5280 • Q&A

  3. Post-Processing Overview (1)
     • BBE (HD Sound, Mach3Bass, ViVA(+)): Natural Musical Realism, Full Frequency Operation, Speech Intelligibility
     • SRS (WOW, TruSurround(XT), Circle Surround II): Clarity Improvement, 3D-Audio (SRS), Bass Enhancement, Down-mix, Matrix-decode

  4. Post-Processing Overview (2)
     • Dolby (ProLogic II, Headphone, Virtual Speaker): Matrix-decode, Down-mix to Headphone, Down-mix to a 2-speaker system
     • Realtek Sound Effect: EAX (Reverb), Chorus, Equalizer, Pitch Shift, Voice Canceller

  5. ProLogic II Principle & Features (1)
     • The ProLogic II matrix decoder receives 2 channels (Lt and Rt) and delivers feeds for three, four or more loudspeakers.
     • Lt and Rt are expressed in terms of the intended direction of a source, independent of its magnitude.
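The sum/difference idea underneath an Lt/Rt matrix decoder can be sketched in a few lines. This is only the classic passive 2-to-4 decode, not the ProLogic II steering logic shown in the block diagrams; the `MatrixOut` struct and the `passive_decode` name are made up for illustration, and the -3 dB gain is the conventional choice rather than something the slides state.

```c
/* Hypothetical container for the four decoded loudspeaker feeds. */
typedef struct {
    float left, right, center, surround;
} MatrixOut;

/* Passive (non-steered) decode of one Lt/Rt sample pair.
 * Assumption: fixed 1/sqrt(2) gains; a real ProLogic II decoder
 * adapts its matrix to the detected source direction. */
static MatrixOut passive_decode(float lt, float rt)
{
    const float k = 0.70710678f;      /* 1/sqrt(2), i.e. -3 dB */
    MatrixOut out;

    out.left     = lt;                /* left feed passes through */
    out.right    = rt;                /* right feed passes through */
    out.center   = k * (lt + rt);     /* in-phase content goes to center */
    out.surround = k * (lt - rt);     /* out-of-phase content goes to surround */
    return out;
}
```

A source panned dead center (Lt equal to Rt) lands in the center feed and cancels in the surround feed, whatever its level, which is the sense in which direction is carried by the relationship between Lt and Rt rather than by magnitude.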

  6. ProLogic II Principle & Features (2) See ProLogic II Block Diagrams

  7. ProLogic II Principle & Features (3) See ProLogic II Block Diagrams

  8. ProLogic II Principle & Features (4) See ProLogic II Block Diagrams

  9. Float to Fixed-point Translation - Translation Process • Maintaining acceptable accuracy is the only criterion for deciding when the translation process is finished.

  10. Float to Fixed-point Translation - Float vs. Fixed
      • IEEE 754 single-precision floating-point range: -3.4 × 10^38 ~ 3.4 × 10^38
      • 16-bit two's-complement integer range: -32,768 ~ 32,767
      • 32-bit two's-complement integer range: -2,147,483,648 ~ 2,147,483,647

  11. Float to Fixed-point Translation - Numeral Representation
      • Interpretation of bits in a 4-bit integer
          ┌─────────── sign, weight -2^3 = -8
          │  ┌──────── 2^2 = 4
          │  │  ┌───── 2^1 = 2
          │  │  │  ┌── 2^0 = 1
          ↓  ↓  ↓  ↓
         ┌──┬──┬──┬──┐
         │S │b2│b1│b0│
         └──┴──┴──┴──┘
         0110 → 0*(-8) + 1*4 + 1*2 + 0*1 = 6
         1111 → 1*(-8) + 1*4 + 1*2 + 1*1 = -1
      • Interpretation of fractional numbers (binary point ● after the sign bit)
          ┌─────────── sign, weight -2^0 = -1
          │  ┌──────── 2^-1 = 0.5
          │  │  ┌───── 2^-2 = 0.25
          │  │  │  ┌── 2^-3 = 0.125
          ↓  ↓  ↓  ↓
         ┌──┬──┬──┬──┐
         │S │b2│b1│b0│
         └──●──┴──┴──┘
         0101 → 0*(-1) + 1*0.5 + 0*0.25 + 1*0.125 = 0.625
         1010 → 1*(-1) + 0*0.5 + 1*0.25 + 0*0.125 = -0.75
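The two readings of the same bit pattern can be checked with a small sketch. The helper names (`nibble_to_int`, `nibble_to_frac`, `float_to_q15`, `q15_to_float`) are hypothetical, and the 16-bit Q15 format is an assumed extension of the slide's 4-bit example to a word size commonly used for audio samples.

```c
#include <stdint.h>

/* Read the low 4 bits as a two's-complement integer. */
static int nibble_to_int(unsigned bits)
{
    int v = (int)(bits & 0x7u);      /* b2..b0 carry weights 4, 2, 1 */
    if (bits & 0x8u)
        v -= 8;                      /* the sign bit carries weight -8 */
    return v;
}

/* Read the same 4 bits as a fraction: binary point after the sign bit. */
static double nibble_to_frac(unsigned bits)
{
    return nibble_to_int(bits) / 8.0;    /* same integer, scaled by 2^-3 */
}

/* Q15: the 16-bit version of the fractional format, scale factor 2^-15. */
static int16_t float_to_q15(double x)
{
    double scaled = x * 32768.0;

    if (scaled >= 32767.0)
        return INT16_MAX;            /* saturate just below +1.0 */
    if (scaled <= -32768.0)
        return INT16_MIN;            /* saturate at -1.0 */
    /* round to nearest, halves away from zero */
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

static double q15_to_float(int16_t q)
{
    return q / 32768.0;
}
```

The fractional interpretation is just the integer interpretation with a fixed scale factor, so the hardware adder does not need to know which one is in use; only multiplication and conversion have to account for the position of the binary point.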

  12. Float to Fixed-point Translation - Arithmetic Functions (1)
      • Replace arithmetic operators with arithmetic functions (ETSI std. and Lx5280 extension)
        Ex. Y = a + b;   →  Y = L_add(a,b);
            M = x * y;   →  M = L_mult(x,y);
            S += x * y;  →  S = L_mac(S,x,y);
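What the replacement functions do beyond the plain operators can be sketched as follows. The names follow the ETSI basic operators, but this is a simplified illustration rather than the library source: `int16_t`/`int32_t` stand in for the library's own word types, the global overflow flag is omitted, and saturation is done through a 64-bit intermediate, which is an assumption about the host compiler.

```c
#include <stdint.h>

/* Q15 x Q15 -> Q31 fractional multiply with saturation. */
static int32_t L_mult(int16_t x, int16_t y)
{
    int32_t p = (int32_t)x * (int32_t)y;   /* Q30 product */

    if (p == 0x40000000)
        return INT32_MAX;                  /* only -1.0 * -1.0 overflows */
    return p * 2;                          /* shift to Q31; cannot overflow here */
}

/* Multiply-accumulate: acc + x*y, saturated to 32 bits. */
static int32_t L_mac(int32_t acc, int16_t x, int16_t y)
{
    int64_t s = (int64_t)acc + L_mult(x, y);

    if (s > INT32_MAX)
        return INT32_MAX;
    if (s < INT32_MIN)
        return INT32_MIN;
    return (int32_t)s;
}
```

With 0.5 represented as 16384 in Q15, `L_mult(16384, 16384)` gives 0x20000000, which is 0.25 in Q31. The plain `*` operator would instead return the Q30 value and silently wrap on overflow, which is why the translation replaces operators with function calls.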

  13. Float to Fixed-point Translation - Arithmetic Functions (2)
      • Addition: add( ), sub( ), L_add( ), L_sub( )
      • Multiplication: mult( ), mult_r( ), L_mult( ), EL_mult( ), EL_mult_r( )
      • Division: divide_s( )
      • Arithmetic shifts: shr( ), shl( ), L_shr( ), L_shl( ), shift_r( ), L_shift_r( )
      • Absolute value: abs_s( ), L_abs( )

  14. Float to Fixed-point Translation - Arithmetic Functions (3)
      • Multiply accumulate: msu_r( ), mac_r( ), L_mac( ), L_msu( ), EL_mac( ), EL_mac_72( )
      • Negation: negate( ), L_negate( )
      • Accumulator manipulation: L_deposit_l( ), L_deposit_h( ), extract_l( ), extract_h( ), extract_h32( )
      • Round: round( ), round32( )
      • Normalization: norm_l( ), norm_s( )
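A few of the accumulator, rounding and normalization helpers listed above can be sketched like this. The behavior follows the usual ETSI definitions as far as I know them; `round_q31` is a hypothetical name for what the library calls round( ), chosen to avoid clashing with the C math library, and the code assumes that right-shifting a negative value is an arithmetic shift, as it is on mainstream compilers.

```c
#include <stdint.h>

/* Place a 16-bit value in the high half of a 32-bit word. */
static int32_t L_deposit_h(int16_t v)
{
    return (int32_t)v * 65536;
}

/* Take the high 16 bits of a 32-bit word (truncating). */
static int16_t extract_h(int32_t v)
{
    return (int16_t)(v >> 16);           /* assumes arithmetic shift */
}

/* Round a 32-bit value to its high 16 bits, with saturation. */
static int16_t round_q31(int32_t v)
{
    int64_t r = (int64_t)v + 0x8000;     /* add half an LSB of the high word */

    if (r > INT32_MAX)
        r = INT32_MAX;                   /* saturate the way L_add would */
    return (int16_t)(r >> 16);
}

/* Count the left shifts needed to normalize a 32-bit value. */
static int16_t norm_l(int32_t v)
{
    int16_t n = 0;

    if (v == 0)
        return 0;
    if (v == -1)
        return 31;
    if (v < 0)
        v = -(v + 1);                    /* same as ~v: fold negatives onto positives */
    while (v < 0x40000000) {             /* until bit 30 is set */
        v *= 2;
        n++;
    }
    return n;
}
```

Normalization followed by a shift is the usual way to keep as many significant bits as possible before a division or a 32-to-16-bit rounding step.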

  15. Float to Fixed-point Translation - Example Code
      • See the fixed-point math library source code, e.g. L_add( ), L_shl( ), EL_mult( )
      • See the float and fixed-point source code, e.g. polezero( ) vs. fix_polezero( )

  16. Float to Fixed-point Translation - Example Code: L_add( )
      INT32 L_add(INT32 L_var1, INT32 L_var2)
      {
          INT32 L_Sum, L_SumLow, L_SumHigh;

          L_Sum = L_var1 + L_var2;
          if ((L_var1 > 0 && L_var2 > 0) || (L_var1 < 0 && L_var2 < 0)) {
              /* an overflow is possible */
              L_SumLow = (L_var1 & 0xffff) + (L_var2 & 0xffff);
              L_SumHigh = ((L_var1 >> 16) & 0xffff) + ((L_var2 >> 16) & 0xffff);
              if (L_SumLow & 0x10000)
                  L_SumHigh += 1;    /* carry into high word is set */
              /* update sum only if there is an overflow or underflow */
              if ((0x10000 & L_SumHigh) && !(0x8000 & L_SumHigh))
                  L_Sum = LW_MIN;    /* underflow */
              else if (!(0x10000 & L_SumHigh) && (0x8000 & L_SumHigh))
                  L_Sum = LW_MAX;    /* overflow */
          }
          return (L_Sum);
      }
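For comparison, the same saturating behavior can be written by testing the operands before adding, which avoids relying on what the plain `+` does when it overflows. This is an alternative sketch, not the library's code; the name `L_add_checked` is hypothetical and `int32_t` stands in for INT32.

```c
#include <stdint.h>

/* Saturating 32-bit add that never executes an overflowing '+'. */
static int32_t L_add_checked(int32_t a, int32_t b)
{
    if (b > 0 && a > INT32_MAX - b)
        return INT32_MAX;            /* sum would exceed the largest value */
    if (b < 0 && a < INT32_MIN - b)
        return INT32_MIN;            /* sum would fall below the smallest value */
    return a + b;                    /* safe: the result fits in 32 bits */
}
```

Both forms clamp to the same limits; the pre-check version trades the half-word carry analysis for two comparisons, which may or may not be cheaper on a given target.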

  17. Float to Fixed-point Translation - Example Code: polezero( )
      void polezero(DSPfract *inptr, DSPshort inoff,
                    DSPfract *outptr, DSPshort outoff,
                    POLEZERO_CFS *filtcfs, POLEZERO_VARS *filtvars,
                    DSPshort sampcount)
      {
          DSPfract accum;
          int samp;

          for (samp = 0; samp < sampcount; samp++) {
              accum = -filtvars->y1 * filtcfs->a1;
              accum += *inptr * filtcfs->b0;
              accum += filtvars->x1 * filtcfs->b1;
              filtvars->x1 = *inptr;
              *outptr = DSPrnd(PCMBITS, PCMRND, accum);
              filtvars->y1 = *outptr;
              inptr += inoff;
              outptr += outoff;
          }
      }

  18. Float to Fixed-point Translation - Example Code: fix_polezero( )
      void fix_polezero(INT32 *inptr, INT16 inoff,
                        INT32 *outptr, INT16 outoff,
                        FIX_POLEZERO_CFS *filtcfs, FIX_POLEZERO_VARS *filtvars,
                        INT16 sampcount)
      {
          INT32 accum;
          INT16 samp;

          for (samp = 0; samp < sampcount; samp++) {
              accum = EL_mult(filtvars->y1, filtcfs->a1);
              accum = L_sub(EL_mult(*inptr, filtcfs->b0), accum);
              accum = L_add(accum, EL_mult(filtvars->x1, filtcfs->b1));
              filtvars->x1 = *inptr;
              *outptr = accum;
              filtvars->y1 = *outptr;
              inptr += inoff;
              outptr += outoff;
          }
      }
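To judge whether a translation like the one above keeps acceptable accuracy, the fixed-point filter can be run next to a floating-point reference and the worst-case difference measured. The sketch below makes several assumptions that the slides do not state: samples and coefficients are Q31, `q31_mult` is a hypothetical stand-in for EL_mult( ) (taken here as a truncating 32x32 fractional multiply), and the coefficient values and test signal are invented for the demonstration.

```c
#include <stdint.h>

typedef struct { int32_t a1, b0, b1; } Coefs;   /* Q31 coefficients */
typedef struct { int32_t y1, x1; } State;       /* previous output and input */

static int32_t sat32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Assumed EL_mult behavior: Q31 x Q31 -> Q31, truncating.
 * Relies on arithmetic right shift of negative values. */
static int32_t q31_mult(int32_t a, int32_t b)
{
    return sat32(((int64_t)a * b) >> 31);
}

/* One sample of y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1], in fixed point. */
static int32_t polezero_step(int32_t in, const Coefs *c, State *s)
{
    int32_t acc = q31_mult(s->y1, c->a1);
    acc = sat32((int64_t)q31_mult(in, c->b0) - acc);
    acc = sat32((int64_t)acc + q31_mult(s->x1, c->b1));
    s->x1 = in;
    s->y1 = acc;
    return acc;
}

/* Run fixed and float versions side by side; return the largest error. */
static double max_abs_error(int nsamples)
{
    const Coefs c = { -(1 << 30), 1 << 29, 1 << 29 };  /* a1=-0.5, b0=b1=0.25 */
    State s = { 0, 0 };
    double fy1 = 0.0, fx1 = 0.0, worst = 0.0;

    for (int n = 0; n < nsamples; n++) {
        int32_t k = (n * 37) % 64 - 32;        /* test signal in [-0.5, 0.5) */
        int32_t xin = k * (1 << 25);           /* same value in Q31 */
        double fin = k / 64.0;

        int32_t y = polezero_step(xin, &c, &s);
        double fy = 0.25 * fin + 0.25 * fx1 + 0.5 * fy1;   /* float reference */
        fx1 = fin;
        fy1 = fy;

        double err = fy - y / 2147483648.0;
        if (err < 0.0) err = -err;
        if (err > worst) worst = err;
    }
    return worst;
}
```

A threshold on `max_abs_error` (or on a signal-to-noise figure derived the same way) is one concrete form the "acceptable accuracy" criterion can take; the actual threshold depends on the product's quality target.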

  19. Float to Fixed-point Translation - Substitute by Module
      Floating-point level   program flow ─┬─→┬─→┬─→┬─→┬─→┬─→
                                           │  │  │  │  │  │
                                           ↓  ↑  ↑  ↑  ↑  ↑
      Fixed-point level                    └─→┴─→┴─→┴─→┴─→┘

  20. Code Optimization on Lx5280 - Iterative Optimization Process • Different performance metrics: execution speed, memory use, power consumption, quality

  21. Code Optimization on Lx5280 - Optimization Strategy
      • Processor-independent: small interfaces, inline functions, recycling memory buffers, flattening the function call hierarchy
      • Processor-specific: algorithmic modifications and transformations, assembly language programming
      • Memory access issues

  22. Code Optimization on Lx5280 - Radiax DSP Instructions
      • MAC: 40/72-bit Accumulator Reg., Saturation Detect, SIMD
      • Data Addressing: Twinword data movement, Post-modified Pointer, Circular Buffer
      • ALU: Saturation Detect, SIMD, Absolute, Normalization
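To see what circular-buffer addressing with a post-modified pointer saves, it helps to look at the same delay line written in portable C, where the wrap has to be done explicitly on every access. This is a generic software sketch, not Radiax code; `DelayLine`, `delay_push`, `delay_tap` and the length of 8 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DELAY_LEN 8u    /* power of two, so wrapping is a single AND */

typedef struct {
    int16_t buf[DELAY_LEN];
    unsigned head;       /* index of the next slot to write */
} DelayLine;

/* Store a sample, then advance the index with wrap-around
 * (the software equivalent of a post-modified circular pointer). */
static void delay_push(DelayLine *d, int16_t x)
{
    d->buf[d->head] = x;
    d->head = (d->head + 1u) & (DELAY_LEN - 1u);
}

/* Read the sample written `age` pushes ago (0 = most recent). */
static int16_t delay_tap(const DelayLine *d, unsigned age)
{
    assert(age < DELAY_LEN);
    return d->buf[(d->head - 1u - age) & (DELAY_LEN - 1u)];
}
```

In software every access pays for the mask (or a compare and branch when the length is not a power of two); with hardware circular addressing the pointer update and wrap happen as part of the load or store, which is the kind of saving a processor-specific rewrite is after.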

  23. Code Optimization on Lx5280 - Example Code: fix_polezero( )
      fix_polezero:
          /* setup ZOH loop */
          la      t3, fix_polezero_start      # get fix_polezero_start
          la      t4, fix_polezero_end-4      # get fix_polezero_end
          ori     t5, zero, 8-1               # loop_count = sampcount - 1 (sampcount hard-coded as 8)
          mtru    t3, lps0                    # set ZOH loop start address
          mtru    t4, lpe0                    # set ZOH loop end address
          mtru    t5, lpc0                    # set ZOH count
          /************************************************************
           * state variables & coefficients mapping to register files
           * (a1,b0,b1,y1,x1) = (t3,t4,t5,t6,t7)
           ************************************************************/
          lw      t3, 0x00(t0)                # t3 = filtcfs->a1
          lw      t4, 0x04(t0)                # t4 = filtcfs->b0
          lw      t5, 0x08(t0)                # t5 = filtcfs->b1
          lw      t6, 0x00(t1)                # t6 = filtvars->y1
          lw      t7, 0x04(t1)                # t7 = filtvars->x1
          sll     a1, a1, 0x02                # inptr offset length in bytes
          sll     a3, a3, 0x02                # outptr offset length in bytes

  24. Code Optimization on Lx5280 - Example Code: fix_polezero( ) [cont'd]
      fix_polezero_start:
          lw      t8, 0x00(a0)                # t8 = *inptr
          multa   m0, t6, t3                  # MAC0 = y1 * a1 (subtracted below)
          multa   m1, t8, t4                  # MAC1 = (*inptr) * b0
          multa   m2, t7, t5                  # MAC2 = x1 * b1
          addu    a0, a0, a1                  # inptr += inoff
          mfa     v0, m0h
          mfa     v1, m1h
          mfa     t9, m2h
          subr.s  v1, v1, v0
          addr.s  v1, v1, t9                  # v1 = accum
          or      t7, zero, t8                # x1 = *inptr
          sw      v1, 0x00(a2)                # *outptr = accum
          or      t6, zero, v1                # y1 = *outptr
          addu    a2, a2, a3                  # outptr += outoff
      fix_polezero_end:
          nop
          sw      t6, 0x00(t1)                # filtvars->y1 = t6
          sw      t7, 0x04(t1)                # filtvars->x1 = t7
      fix_polezero_exit:
          jr      ra                          # return to caller
          nop

  25. Code Optimization on Lx5280 - Comparison Table

  26. References
      • "The Scientist and Engineer's Guide to Digital Signal Processing", Steven W. Smith, ISBN 0-9660176-6-8
      • "Using The Low Cost, High Performance ADSP-21065L Digital Signal Processor For Digital Audio Applications", Dan Ledger and John Tomarakos, Analog Devices Inc.
      • "Converting floating-point applications to fixed-point", Randy Allen, Embedded Systems Programming, Sep. 24, 2004
      • "Developing software for audio/visual devices", Bjorn Hori and Jeff Bier, Embedded Systems Programming, Nov. 24, 2004
      • ETSI ANSI-C code for the GSM half rate speech codec (GSM 06.06)

  27. Q&A
