SPEECH DESCRIPTORS GENERATION SOFTWARE UTILIZED FOR CLASSIFICATION AND RECOGNITION PURPOSES
Lukasz Laszko
Department of Biomedical Engineering, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology
General architecture
Service Oriented Architecture
Components (diagram labels): SDR client, SDR server, ASR server, SDR DB, external systems
Spoken content retrieval: implementation
ASR Server architecture
Status: implemented
Components (diagram labels): ASR Engine, Metadata database, ORM mapper, Multithread execution pool, Web service
Technologies: Java Concurrency Framework, JAX-WS 2.1 with WSIT on Apache Jetty 6, CMU Sphinx-4
Network: SOAP + MTOM over HTTPS
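The multithread execution pool listed above can be illustrated with a minimal sketch. The slide names the Java Concurrency Framework; the Python below is only an analogue of the same pattern, and `recognize` and `run_batch` are hypothetical names standing in for the real call into the ASR engine.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize(audio_id: str) -> dict:
    """Hypothetical stand-in for a call into the ASR engine."""
    return {"audio_id": audio_id, "transcript": f"<transcript of {audio_id}>"}


def run_batch(audio_ids: list[str], workers: int = 4) -> list[dict]:
    """Run one recognition task per recording on a fixed-size thread pool.

    pool.map returns results in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(recognize, audio_ids))


results = run_batch(["a.wav", "b.wav", "c.wav"])
```

A fixed pool size bounds how many recognitions run at once, which matters when each task is CPU and memory heavy.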
Spoken content retrieval: implementation
SDR Server architecture
Status: under development
Components (diagram labels): External services agents, Network, ASR Connector, Services FrontEnd, Diagnostic portal, Data Access Logic, Workflow Runtime, Data Connector, Indexing Service, Search Service, Indexing Workflow, Search Workflow, SCD Database
SDR database model
- Holds ASR temporary results
- Holds extraction metadata
- Supports task queuing
- Supports performance measurement
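A rough sketch of how one table could cover task queuing, temporary ASR results, and timing data. The actual schema is not given on the slide, so the table name, columns, and helper functions here are all hypothetical, and an in-memory SQLite database stands in for the real database.

```python
import sqlite3

# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE task (
           id INTEGER PRIMARY KEY,
           document TEXT NOT NULL,
           status TEXT NOT NULL DEFAULT 'queued',
           asr_result TEXT,        -- temporary ASR output
           elapsed_ms INTEGER      -- performance measurement
       )"""
)


def enqueue(document: str) -> int:
    """Add a document to the queue and return its task id."""
    cur = conn.execute("INSERT INTO task (document) VALUES (?)", (document,))
    return cur.lastrowid


def claim_next():
    """Mark the oldest queued task as running and return (id, document), or None."""
    row = conn.execute(
        "SELECT id, document FROM task WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE task SET status = 'running' WHERE id = ?", (row[0],))
    return row


def complete(task_id: int, asr_result: str, elapsed_ms: int) -> None:
    """Store the ASR result and the measured processing time."""
    conn.execute(
        "UPDATE task SET status = 'done', asr_result = ?, elapsed_ms = ? WHERE id = ?",
        (asr_result, elapsed_ms, task_id),
    )
```

This single-connection version ignores concurrent workers; a real queue would need locking or transactional claims.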
SDR architecture: Microsoft P&P architecture approach
Diagram labels: retail model with mappings, SDR database, ASR web service, IoC with policy injection, Web portal + SOAP interfaces
SDR Component: technology
5-layer architecture (an extension of the 3-layer architecture):
- Data management layer: PL/SQL stored procedures hosted on Oracle Database 10g
- Data access layer: data accessors for stored procedures and ORM mapping (Apache iBatis + Oracle Data Provider for .NET)
- Business logic layer: business rules encapsulation
- Presentation layer: ASP.NET web application + Flash communication server
- Client presentation layer: client JavaScript code and Adobe Flex Flash forms communicating with the presentation layer via the Flash Remoting gateway interface
Additionally, an IoC container is used to load model views: Spring.NET with the Policy Injection aspect-oriented programming interface for validation and error-handling routines at the Business Logic / Presentation boundary.
Security:
- Windows Integrated Security (WIS): integrated authentication in MS Windows domains, background authorization in components supporting WIS
- Standalone, built-in security: custom membership and role providers for ASP.NET providing authentication and authorization according to credentials stored in the database
Indexing and retrieval methods
Indexing methods:
- Word indexing
- Sub-word indexing
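The difference between the two indexing methods can be sketched with a small inverted index. This is an illustration only: the slide does not say which sub-word units are used, so character trigrams stand in here for units such as phonemes or syllables, and all function names are hypothetical.

```python
from collections import defaultdict


def word_units(transcript: str) -> list[str]:
    """Word indexing: each whitespace-separated word is an index unit."""
    return transcript.lower().split()


def subword_units(transcript: str, n: int = 3) -> list[str]:
    """Sub-word indexing: character n-grams within each word (assumed unit type)."""
    units = []
    for word in transcript.lower().split():
        if len(word) <= n:
            units.append(word)
        else:
            units.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return units


def build_index(docs: dict[str, str], tokenize) -> dict[str, set[str]]:
    """Map every index unit to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, transcript in docs.items():
        for unit in tokenize(transcript):
            index[unit].add(doc_id)
    return dict(index)


docs = {"d1": "speech recognition", "d2": "speaker recognition"}
word_index = build_index(docs, word_units)
subword_index = build_index(docs, subword_units)
```

In this toy collection the word index keeps "speech" and "speaker" apart, while the sub-word index lets them share the unit "spe", which is what makes sub-word indexing more tolerant of misrecognized or out-of-vocabulary words.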
Indexing and retrieval methods
Spoken document indexing and retrieval methods are an extension of the full-text indexing methods used in textual databases.
Retrieval Status Value (RSV): a relevance score calculated for each document stored in the database with respect to a given Information Retrieval (IR) query. This value reflects how well a given document satisfies the requirements defined in the query.
IR models adapted for SDR purposes:
- Similarity-based models
- Probabilistic models
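As a minimal sketch of how an RSV is used, the code below scores every document against a query and sorts the collection by that score. The scoring function is a deliberately simple placeholder (the fraction of query terms found in the document), not one of the models used in the system; `overlap_rsv` and `rank` are hypothetical names.

```python
def overlap_rsv(query_terms: list[str], doc_terms: list[str]) -> float:
    """Placeholder RSV: fraction of distinct query terms present in the document."""
    query = set(query_terms)
    if not query:
        return 0.0
    return len(query & set(doc_terms)) / len(query)


def rank(query_terms, collection, rsv=overlap_rsv):
    """Compute an RSV for each document and return (doc_id, score) pairs, best first."""
    scored = [(doc_id, rsv(query_terms, terms)) for doc_id, terms in collection.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


collection = {
    "d1": ["speech", "recognition", "server"],
    "d2": ["database", "model"],
    "d3": ["speech", "database"],
}
ranking = rank(["speech", "recognition"], collection)
```

Passing the scoring function as a parameter means a similarity-based or probabilistic model can replace the placeholder without changing the ranking step.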
Similarity-based models
In these models, RSV is defined as a measure of similarity reflecting the degree of resemblance between the query and the document descriptions.
The most popular similarity-based models are based on the vector space model (VSM):
1. Boolean-matching search
2. Best-matching search (Salton and Buckley method)
Weighting methods
- fd(t) is the frequency of term t in document description D
- fq(t) is the frequency of term t in query Q
- Nc is the total number of documents in the collection
- nct is the number of documents containing term t
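The weighting formulas themselves are not preserved in this text, so the sketch below assumes one common tf-idf variant, weight = frequency × log(Nc / nct), combined with cosine similarity for best-matching search. It reuses the symbols fd(t), fq(t), Nc and nct defined above; the function names are hypothetical and the exact scheme used in the system may differ.

```python
import math
from collections import Counter


def idf(term: str, docs: dict[str, list[str]]) -> float:
    """log(Nc / nct); 0.0 when no document contains the term."""
    n_c = len(docs)
    n_ct = sum(1 for terms in docs.values() if term in terms)
    return math.log(n_c / n_ct) if n_ct else 0.0


def weights(terms: list[str], docs: dict[str, list[str]]) -> dict[str, float]:
    """Assumed tf-idf weight: term frequency (fd or fq) times idf."""
    return {t: f * idf(t, docs) for t, f in Counter(terms).items()}


def cosine_rsv(query: list[str], doc_terms: list[str], docs) -> float:
    """Best-matching search: cosine of the weighted query and document vectors."""
    wq, wd = weights(query, docs), weights(doc_terms, docs)
    dot = sum(w * wd.get(t, 0.0) for t, w in wq.items())
    norm = math.sqrt(sum(w * w for w in wq.values())) * math.sqrt(
        sum(w * w for w in wd.values())
    )
    return dot / norm if norm else 0.0


def boolean_match(query: list[str], doc_terms: list[str]) -> bool:
    """Boolean-matching search (AND semantics): every query term must occur."""
    return set(query) <= set(doc_terms)


docs = {
    "d1": ["speech", "recognition"],
    "d2": ["speech", "database"],
    "d3": ["database", "model"],
}
scores = {doc_id: cosine_rsv(["recognition"], terms, docs) for doc_id, terms in docs.items()}
```

Boolean matching only splits the collection into matching and non-matching documents, whereas the cosine score yields a graded RSV that can be used for ranking.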
Current status
Implementation:
- ASR components: implemented and validated
- ASR connector for SDR system: partially implemented
- SDR client portal: partially implemented
- SDR document indexing: implemented, with partial user interface
- SDR document search: implemented, with unit tests
Documentation:
- Requirements specification: compliant with the IEEE 830 standard and the Volere template
- Functional specification: compliant with IEEE standard
- Design specification for both ASR and SDR systems
- Test cases and validation scenarios
- Solution descriptions