Sphinx on Handhelds David Huggins-Daines

Sphinx on Handhelds David Huggins-Daines dhuggins@cs.cmu.edu

Sphinx on Handhelds?
- Handheld/embedded devices are pretty speedy these days
  - LVCSR on them is not unreasonable
  - An open-source one does not exist yet
- CALO’s new focus on mobility
- S2S translation projects could use it
- Sublime, smartphone applications, etc.
- ISL has it, so should we!

Handheld challenges
- CPU speed
  - Typically 200-400MHz ARM/XScale
  - Faster than the workstations Sphinx started out on
  - No hardware floating-point instructions
  - ARM has very fast and sophisticated integer ISA
- Memory and storage capacity/speed
  - DRAM is very limited (32 or 64MB)
  - Storage is very slow (typically CF cards)
- Inefficient and clumsy operating systems
  - WinCE has no stdio, broken malloc, 32MB limit
  - PalmOS is much, much worse!

Plan for Sphinx on Handhelds
- Start out with Sphinx2
  - It’s fast
  - People use it already
- Convert “hot spots” to integer math
- Precompute model files
  - Avoid parsing (no stdio, remember)
  - Allow memory-mapped I/O (subvert the 32MB limit on WinCE)
- Disable non-useful features in libraries
  - e.g. flat lexicon search, CDHMM
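The "precompute model files, avoid parsing, allow memory-mapped I/O" idea can be illustrated with a minimal sketch. It assumes a POSIX system such as the Linux-based Zaurus (WinCE would need its own file-mapping API) and a made-up file layout: one native-endian 32-bit count followed by that many 32-bit values. The names model_map_t, model_map_open(), model_count(), model_values() and model_map_close() are hypothetical and are not taken from Sphinx.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations (open, fstat, mmap) */
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical precomputed model file: int32 count, then count int32 values,
 * all in the byte order of the target device, so nothing has to be parsed. */
typedef struct {
    const int32_t *base;  /* start of the read-only mapping */
    size_t len;           /* length of the mapping in bytes */
} model_map_t;

/* Map the file read-only. Returns 0 on success, -1 on any failure. */
static int model_map_open(const char *path, model_map_t *m)
{
    struct stat st;
    void *p;
    int32_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size < (off_t)sizeof(int32_t)) {
        close(fd);
        return -1;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    m->base = (const int32_t *)p;
    m->len = (size_t)st.st_size;

    /* Reject files whose header claims more values than the file holds. */
    n = m->base[0];
    if (n < 0 || (size_t)n > m->len / sizeof(int32_t) - 1) {
        munmap(p, m->len);
        return -1;
    }
    return 0;
}

/* Number of values stored in the file. */
static int32_t model_count(const model_map_t *m)
{
    return m->base[0];
}

/* Pointer straight into the mapped file; no copy into heap memory. */
static const int32_t *model_values(const model_map_t *m)
{
    return m->base + 1;
}

static void model_map_close(model_map_t *m)
{
    munmap((void *)m->base, m->len);
    m->base = NULL;
    m->len = 0;
}
```

Because the values are read in place, pages are only brought in when touched and can be dropped by the OS under memory pressure, which is the property the plan relies on; the price is that the file must be written in exactly the layout and byte order the device expects.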

Current Status
- Sphinx2 on Sharp Zaurus
  - Linux, 40MB system RAM, 206MHz ARM
  - Performance on RM1: 1.7x realtime
  - No degradation in accuracy
- Integer front-end and GMM code complete
  - Front end also has a “faster” mode
  - 10% faster, 10% degradation in accuracy
- Memory consumption is too high
  - WSJ5k can just barely run
  - Sphinx2 consumes about 16MB of heap space
  - Requires quantized mixture weights (-8bsen)
  - Sphinx3.x is much smaller … and slower

Implementation details
- FFT is done with 16:16 fixed point
  - Bits 31:16 are whole part and sign
  - Bits 15:0 are fractional part
  - I.e. all numbers scaled by 65536
  - Lossless multiplication done using 4 integer shift-multiply-accumulates (ARM is really good at this)
- Mel-spectrum calculated in log scale
  - Using base 1.0001 in order to exploit existing add-table implementation
- “Faster” mode uses 28:4 fixed point instead
  - Overflows saturated to INT_MAX
  - Zeroes floored to log(2^-4) - very important!
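The 16:16 multiplication described above can be sketched as follows. This is an illustration of the general technique, not the code from Sphinx: each operand is split into a signed high half and an unsigned low half, and the four partial products are shifted and accumulated so that no 64-bit intermediate is needed. The names fixmul() and INT2FIX() are hypothetical; fixed32 is the type name mentioned elsewhere in the slides. The sketch assumes that right-shifting a negative int32_t is an arithmetic shift and that converting an out-of-range unsigned value to int32_t wraps, which holds on common compilers.

```c
#include <stdint.h>

typedef int32_t fixed32;  /* 16:16 fixed point: value = raw / 65536 */

/* Convert a small integer to 16:16 fixed point (hypothetical helper). */
#define INT2FIX(i) ((fixed32)((i) * 65536))

/*
 * Multiply two 16:16 numbers using only 32-bit arithmetic.
 * With a = ah*65536 + al and b = bh*65536 + bl (al, bl in 0..65535):
 *   (a*b) >> 16 = ah*bh*65536 + ah*bl + al*bh + ((al*bl) >> 16)
 * The result is exact as long as the true product fits in 16:16.
 */
static fixed32 fixmul(fixed32 a, fixed32 b)
{
    int32_t  ah = a >> 16;                /* signed whole part */
    uint32_t al = (uint32_t)a & 0xffffu;  /* unsigned fraction */
    int32_t  bh = b >> 16;
    uint32_t bl = (uint32_t)b & 0xffffu;
    uint32_t acc;

    /* Accumulate in unsigned arithmetic so intermediate wrap-around is
     * well defined; the final sum is reinterpreted as signed. */
    acc  = (uint32_t)(ah * bh) << 16;      /* high * high, shifted up   */
    acc += (uint32_t)(ah * (int32_t)bl);   /* high * low                */
    acc += (uint32_t)((int32_t)al * bh);   /* low  * high               */
    acc += (al * bl) >> 16;                /* low  * low, shifted down  */

    return (fixed32)acc;
}
```

For example, 1.5 is stored as 98304 and 2.5 as 163840; fixmul(98304, 163840) returns 245760, which is 3.75 scaled by 65536.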

Implementation details
- Abstract types for intermediate values
  - mfcc_t, powspec_t, mean_t, var_t
  - #define FIXED_POINT to make them ints
- Arithmetic macros (fixpoint.h)
  - fixed32 type analogous to float32
  - addition and subtraction work as expected
  - MFCCMUL(), MFCC2FLOAT(), FLOAT2MFCC() macros become no-ops in floating-point build
  - GMMADD(), GMMSUB() do saturating addition and subtraction
  - ARM has special instructions for this too! Wow!
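A rough sketch of how such a type-and-macro layer might look is given below. The macro and type names (mfcc_t, FIXED_POINT, MFCCMUL, MFCC2FLOAT, FLOAT2MFCC, GMMADD, GMMSUB) come from the slide, but every definition here is an assumption made for illustration and may differ from the real fixpoint.h. The helpers sat_add32() and sat_sub32() are hypothetical, the fixed-point branch assumes the 16:16 format, and it uses a 64-bit intermediate for brevity rather than the shift-multiply-accumulate form.

```c
#include <stdint.h>

/* Saturating 32-bit add/subtract (hypothetical helpers). On ARM these
 * could map to dedicated saturating instructions; here they are portable C. */
static int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t s = (int64_t)a + b;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

static int32_t sat_sub32(int32_t a, int32_t b)
{
    int64_t s = (int64_t)a - b;
    if (s > INT32_MAX) return INT32_MAX;
    if (s < INT32_MIN) return INT32_MIN;
    return (int32_t)s;
}

#ifdef FIXED_POINT
/* Fixed-point build: intermediate values are 16:16 integers (assumed). */
typedef int32_t mfcc_t;
#define FLOAT2MFCC(x) ((mfcc_t)((x) * 65536.0f))
#define MFCC2FLOAT(x) ((float)(x) / 65536.0f)
#define MFCCMUL(a, b) ((mfcc_t)(((int64_t)(a) * (b)) >> 16))
#define GMMADD(a, b)  sat_add32((a), (b))
#define GMMSUB(a, b)  sat_sub32((a), (b))
#else
/* Floating-point build: conversions are no-ops, arithmetic is plain. */
typedef float mfcc_t;
#define FLOAT2MFCC(x) (x)
#define MFCC2FLOAT(x) (x)
#define MFCCMUL(a, b) ((a) * (b))
#define GMMADD(a, b)  ((a) + (b))
#define GMMSUB(a, b)  ((a) - (b))
#endif

/* The same source line compiles in both builds. */
static mfcc_t scale_coeff(mfcc_t c, mfcc_t gain)
{
    return MFCCMUL(c, gain);
}
```

With this arrangement, front-end code is written once against mfcc_t and the macros; compiling with -DFIXED_POINT switches the whole pipeline to integer arithmetic without touching the call sites.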

Future Work
- Rationalize the file formats
- General WinCE porting (Mohit)
- Front-end optimization
  - Implement fixed-point FHT
- Investigate Sphinx 3.x for embedded
  - SubVQ and GS can make it fast and cut memory consumption even more
  - Much nicer architecture
  - But not widely used, API not as stable
