Sphinx on Handhelds

David Huggins-Daines (dhuggins@cs.cmu.edu)

Sphinx on Handhelds?
  • Handheld/embedded devices are pretty speedy these days
  • Large-vocabulary continuous speech recognition (LVCSR) on them is not unreasonable
  • An open-source LVCSR system for them does not exist yet
    • CALO’s new focus on mobility
    • Speech-to-speech (S2S) translation projects could use it
    • Sublime, smartphone applications, etc.
Handheld challenges
  • CPU speed
    • Typically 200-400MHz ARM/XScale
    • Faster than the workstations Sphinx started out on
  • No hardware floating-point instructions
    • ARM has very fast and sophisticated integer ISA
  • Memory and storage capacity/speed
    • DRAM is very limited (32 or 64MB)
    • Storage is very slow (typically CF cards)
  • Inefficient and clumsy operating systems
    • WinCE has no stdio, a broken malloc, and a 32MB per-process memory limit
    • PalmOS is much, much worse!
Plan for Sphinx on Handhelds
  • Start out with Sphinx2
    • It’s fast
    • People use it already
  • Convert “hot spots” to integer math
  • Precompute model files
    • Avoid parsing (no stdio, remember)
    • Allow memory-mapped I/O (subvert the 32MB limit on WinCE)
  • Disable non-useful features in libraries
    • e.g. flat lexicon search, CDHMM
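The "precompute model files" step can be sketched as follows. This is a minimal illustration, not Sphinx2's actual file format: the header layout, the magic number, and the names model_hdr_t, model_map_t, model_write(), model_map() and model_unmap() are all hypothetical. It also assumes a POSIX system (such as the Linux on the Zaurus); WinCE would need its own file-mapping calls. The point is that the device side only maps the file and uses the values in place, with no parsing and no stdio.

```c
#include <assert.h>   /* only used by the usage example below */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>    /* only needed by the offline writer */
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical on-disk layout: a fixed header followed by raw int32 values. */
#define MODEL_MAGIC 0x53504858u

typedef struct {
    uint32_t magic;     /* identifies the file (and its byte order) */
    uint32_t n_values;  /* number of int32 values that follow */
} model_hdr_t;

typedef struct {
    void *base;             /* start of the mapping */
    size_t len;             /* length of the mapping in bytes */
    const int32_t *values;  /* points into the mapping, no copy made */
    uint32_t n_values;
} model_map_t;

/* Offline step (run on a workstation): dump already-converted values. */
static int model_write(const char *path, const int32_t *values, uint32_t n)
{
    model_hdr_t hdr = { MODEL_MAGIC, n };
    FILE *fp = fopen(path, "wb");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fwrite(&hdr, sizeof hdr, 1, fp) == 1
        && fwrite(values, sizeof *values, n, fp) == n;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Device step: map the file read-only and point straight at the data. */
static int model_map(const char *path, model_map_t *m)
{
    struct stat st;
    const model_hdr_t *hdr;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || (size_t)st.st_size < sizeof *hdr) {
        close(fd);
        return -1;
    }
    m->len = (size_t)st.st_size;
    m->base = mmap(NULL, m->len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (m->base == MAP_FAILED)
        return -1;

    hdr = (const model_hdr_t *)m->base;
    if (hdr->magic != MODEL_MAGIC
        || m->len < sizeof *hdr + (size_t)hdr->n_values * sizeof(int32_t)) {
        munmap(m->base, m->len);
        return -1;
    }
    m->n_values = hdr->n_values;
    m->values = (const int32_t *)(hdr + 1);
    return 0;
}

static void model_unmap(model_map_t *m)
{
    munmap(m->base, m->len);
}
```

A caller would run model_write() once offline, then on the device call model_map(), read m.values[i] directly, and call model_unmap() when done. Because the pages are backed by the file rather than the heap, the operating system can load them on demand and discard them under memory pressure.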
Current Status
  • Sphinx2 on Sharp Zaurus
    • Linux, 40MB system RAM, 206MHz ARM
    • Performance on RM1: 1.7x realtime
    • No degradation in accuracy
  • Integer front-end and GMM code complete
    • Front end also has a “faster” mode
    • 10% faster, 10% degradation in accuracy
  • Memory consumption is too high
    • WSJ5k can just barely run
    • Sphinx2 consumes about 16MB of heap space
    • Requires quantized mixture weights (-8bsen)
    • Sphinx3.x is much smaller … and slower
Implementation details
  • FFT is done with 16:16 fixed point
    • Bits 31:16 are whole part and sign
    • Bits 15:0 are fractional part
    • I.e. all numbers scaled by 65536
    • Lossless multiplication done using 4 integer shift-multiply-accumulates (ARM is really good at this)
  • Mel-spectrum calculated in log scale
    • Using base 1.0001 in order to exploit existing add-table implementation
    • “Faster” mode uses 28:4 fixed point instead
      • Overflows saturated to INT_MAX
      • Zeroes floored to log(2^-4) - very important!
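The 16:16 format and the four-step multiplication described above can be sketched in portable C. This is an illustration under assumptions, not the actual Sphinx code: the names FLOAT2FIX, FIX2FLOAT and fixmul are hypothetical, and the sketch assumes two's-complement integers with an arithmetic right shift on negative values (true of common ARM compilers). Each operand is split into a signed upper half and an unsigned lower half, and the four 16x16 partial products are accumulated, so no 64-bit intermediate is needed.

```c
#include <assert.h>   /* only used by the usage example below */
#include <stdint.h>

/* 16:16 fixed point: bits 31:16 hold the whole part and sign,
 * bits 15:0 hold the fraction, i.e. every value is scaled by 65536. */
typedef int32_t fixed32;

/* Hypothetical conversion helpers (round to nearest). */
#define FLOAT2FIX(x) ((fixed32)((x) * 65536.0 + ((x) < 0 ? -0.5 : 0.5)))
#define FIX2FLOAT(x) ((double)(x) / 65536.0)

/* Multiply two 16:16 values using four 16x16 partial products.
 * With a = ah*2^16 + al and b = bh*2^16 + bl:
 *   a*b / 2^16 = ah*bh*2^16 + ah*bl + al*bh + (al*bl) / 2^16
 * Only the low bits of al*bl are discarded, so the result equals
 * floor(a*b / 65536) whenever that value fits in 32 bits. */
static fixed32 fixmul(fixed32 a, fixed32 b)
{
    int32_t ah = a >> 16;      /* signed upper half (arithmetic shift assumed) */
    int32_t al = a & 0xffff;   /* unsigned lower half, 0..65535 */
    int32_t bh = b >> 16;
    int32_t bl = b & 0xffff;
    uint32_t acc;              /* unsigned so that wraparound is well defined */

    acc  = ((uint32_t)al * (uint32_t)bl) >> 16;  /* low x low: keep top 16 bits */
    acc += (uint32_t)(ah * bl);                  /* high x low */
    acc += (uint32_t)(al * bh);                  /* low x high */
    acc += (uint32_t)(ah * bh) << 16;            /* high x high */
    return (fixed32)acc;
}
```

For example, `assert(fixmul(FLOAT2FIX(1.5), FLOAT2FIX(2.0)) == FLOAT2FIX(3.0));` holds, as does the same check with -1.5 and -3.0. On ARM, a compiler or hand-written assembly can map the accumulation steps onto multiply-accumulate instructions with shifted operands.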
Implementation details
  • Abstract types for intermediate values
    • mfcc_t, powspec_t, mean_t, var_t
    • #define FIXED_POINT to make them ints
  • Arithmetic macros (fixpoint.h)
    • fixed32 type analogous to float32
    • addition and subtraction work as expected
    • MFCCMUL(), MFCC2FLOAT(), FLOAT2MFCC() macros become no-ops in floating-point build
    • GMMADD(), GMMSUB() do saturating addition and subtraction
      • ARM has special instructions for this too! Wow!
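A rough sketch of how such a header might look is given below. The type and macro names (mfcc_t, fixed32, float32, FIXED_POINT, MFCCMUL, MFCC2FLOAT, FLOAT2MFCC, GMMADD, GMMSUB) come from the slide, but every definition here is an assumption for illustration and may differ from the real fixpoint.h; the helper functions sat_add32() and sat_sub32() are hypothetical. The sketch uses a 64-bit intermediate for clarity rather than ARM-specific instructions.

```c
#include <assert.h>   /* only used by the usage example below */
#include <stdint.h>

#define FIXED_POINT   /* remove this line for the floating-point build */

typedef int32_t fixed32;  /* 16:16 fixed point */
typedef float float32;

/* Hypothetical saturating helpers: clamp instead of wrapping around. */
static int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t s = (int64_t)a + b;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

static int32_t sat_sub32(int32_t a, int32_t b)
{
    int64_t s = (int64_t)a - b;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

#ifdef FIXED_POINT
typedef fixed32 mfcc_t;
/* Conversions scale by 65536; multiplication rescales the product. */
#define FLOAT2MFCC(x) ((mfcc_t)((x) * 65536.0f + ((x) < 0 ? -0.5f : 0.5f)))
#define MFCC2FLOAT(x) ((float32)(x) / 65536.0f)
#define MFCCMUL(a, b) ((mfcc_t)(((int64_t)(a) * (b)) >> 16))
#define GMMADD(a, b)  sat_add32((a), (b))
#define GMMSUB(a, b)  sat_sub32((a), (b))
#else
typedef float32 mfcc_t;
/* In the floating-point build the macros reduce to plain arithmetic. */
#define FLOAT2MFCC(x) (x)
#define MFCC2FLOAT(x) (x)
#define MFCCMUL(a, b) ((a) * (b))
#define GMMADD(a, b)  ((a) + (b))
#define GMMSUB(a, b)  ((a) - (b))
#endif
```

Code written against these macros, such as `mfcc_t y = MFCCMUL(FLOAT2MFCC(1.5f), x);`, compiles unchanged in either build. With FIXED_POINT defined, `GMMADD(INT32_MAX, 1)` stays at INT32_MAX instead of wrapping to a negative value.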
Future Work
  • Rationalize the file formats
  • General WinCE porting (Mohit)
  • Front-end optimization
    • Implement fixed-point FHT
  • Investigate Sphinx 3.x for embedded
    • SubVQ (sub-vector quantization) and GS (Gaussian selection) can make it fast and cut memory consumption even more
    • Much nicer architecture
    • But not widely used, API not as stable