1
SPEEch on the griD (SPEED)
Virtual Team Project: SPEEch on the griD (SPEED)
Ladislav Hluchý, Milan Rusko
Institute of Informatics, Slovak Academy of Sciences, Bratislava, Slovakia
06/08/2018 EGI Community Forum 27 March 2012 Leibniz
2
SPEEch on the griD (SPEED)
Almost every step in automatic speech recognition (ASR) and text-to-speech (TTS) training and testing could benefit from data parallelism, that is, from splitting a computing task among many execution nodes of a computer cluster. Feature extraction, initialization and forward-backward training of the acoustic models, re-alignment of the speech data and alignment for clustering, tree-based clustering, and estimation of linear transforms for acoustic model adaptation are all tasks in which a huge amount of data quickly makes ASR development extremely time-consuming. Data parallelism can shorten this development and is applicable at several levels: per-subset training, Hidden Markov Model state, stream, speaker, and even sentence level.

The training process has many input parameters, e.g. the type of features, the number of forward-backward iterations, where to apply re-alignment of the speech data, when to apply tree clustering, what type of linear transform to choose, and so on. The set of input parameters depends on the speech data available and on the domain (the type of data, such as broadband speech or telephone speech). Some steps can be language dependent as well. Finally, validation and evaluation criteria increase the number of variable input parameters, which increases the computing power needed in the optimization process.
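The per-subset data parallelism described above can be sketched as follows. This is a minimal illustration, not the project's implementation: the data is split into shards, each worker accumulates sufficient statistics for its shard, and the partial results are merged. A thread pool stands in for the cluster nodes, the "utterances" are toy lists of one-dimensional feature values, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(items, num_shards):
    """Distribute items round-robin over num_shards shards."""
    return [items[i::num_shards] for i in range(num_shards)]


def accumulate(shard):
    """Per-shard sufficient statistics: frame count, sum, sum of squares."""
    count, total, total_sq = 0, 0.0, 0.0
    for utterance in shard:
        for x in utterance:
            count += 1
            total += x
            total_sq += x * x
    return count, total, total_sq


def merge(stats):
    """Combine per-shard statistics into a global mean and variance."""
    count = sum(s[0] for s in stats)
    if count == 0:
        raise ValueError("no frames to estimate from")
    total = sum(s[1] for s in stats)
    total_sq = sum(s[2] for s in stats)
    mean = total / count
    variance = total_sq / count - mean * mean
    return mean, variance


def parallel_mean_variance(utterances, num_workers=4):
    """Map accumulate() over the shards in parallel, then merge."""
    shards = split_into_shards(utterances, num_workers)
    # Assumption: threads stand in for cluster nodes; on a real grid
    # each shard would be submitted as a separate job.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        stats = list(pool.map(accumulate, shards))
    return merge(stats)


# Toy data: three "utterances" of scalar feature values.
utterances = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
mean, variance = parallel_mean_variance(utterances, num_workers=2)
```

Because only counts and sums cross the shard boundary, the merged result does not depend on how the utterances are distributed. The same accumulate-then-merge pattern is what makes forward-backward training amenable to splitting across nodes.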
3
SPEEch on the griD (SPEED)
Motivation
Automatic speech processing is computationally demanding in the training, optimization and testing phases, so optimization is usually done "part by part". But the optimization of one part of the recognizer is not independent of the settings of the other parts. The optimization process should therefore be holistic, taking into account the influence of as many parameters as possible.

Output
Making the power of grid computing available to a wider community of researchers who deal with speech processing in their everyday work
Developing methods for holistic optimization and diagnostics in speech processing, and tools implementing these methods on the grid platform
Others to be added by the consortium
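A toy example may help show why part-by-part optimization can miss the best overall configuration when settings interact. The score table below is invented purely for illustration (two hypothetical settings `a` and `b`, three values each); it is not measured data.

```python
from itertools import product

VALUES_A = (0, 1, 2)
VALUES_B = (0, 1, 2)

# Hypothetical accuracy for each (a, b) pair. The settings interact:
# a=2 is only good when b=2 as well.
SCORES = {
    (0, 0): 0.70, (1, 0): 0.72, (2, 0): 0.68,
    (0, 1): 0.71, (1, 1): 0.73, (2, 1): 0.69,
    (0, 2): 0.65, (1, 2): 0.66, (2, 2): 0.80,
}


def part_by_part(start_b=0):
    """Tune a with b fixed at its default, then tune b with a fixed."""
    best_a = max(VALUES_A, key=lambda a: SCORES[(a, start_b)])
    best_b = max(VALUES_B, key=lambda b: SCORES[(best_a, b)])
    return best_a, best_b


def joint():
    """Search all (a, b) combinations at once."""
    return max(product(VALUES_A, VALUES_B), key=lambda ab: SCORES[ab])


sequential_choice = part_by_part()
holistic_choice = joint()
```

With this table the part-by-part search settles on `(1, 1)` with a score of 0.73, while the joint search finds `(2, 2)` with 0.80. The joint search evaluates nine configurations instead of six, and that gap grows multiplicatively with every additional setting.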
4
SPEEch on the griD (SPEED)
The holistic optimization
An ASR system consists of many modules. The architectural features of a module, together with the values and settings of its control parameters, can be taken as the components of a "settings vector", i.e. the input vector of that particular module. Ideally, the grid should make it possible to optimize all input vector values of the system together, yielding a globally optimized setting of an ASR system for a given purpose.
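One way to picture the settings vector is as the concatenation of the per-module settings, with the global search space being the Cartesian product of all candidate values. The sketch below assumes a hypothetical three-module system; the module names, parameter names and candidate values (MFCC, PLP, MLLR and so on) are illustrative choices, not taken from the project.

```python
from itertools import product
from math import prod

# Hypothetical per-module settings and their candidate values.
MODULE_SETTINGS = {
    "features": {"type": ["MFCC", "PLP"], "num_coefficients": [13, 20]},
    "training": {"fb_iterations": [4, 8, 12]},
    "adaptation": {"linear_transform": ["none", "MLLR"]},
}


def flatten(module_settings):
    """Return (names, value_lists) for the concatenated settings vector."""
    names, value_lists = [], []
    for module, settings in module_settings.items():
        for name, values in settings.items():
            names.append(f"{module}.{name}")
            value_lists.append(values)
    return names, value_lists


def settings_vectors(module_settings):
    """Yield every global settings vector as a name -> value dict."""
    names, value_lists = flatten(module_settings)
    for combo in product(*value_lists):
        yield dict(zip(names, combo))


def count_vectors(module_settings):
    """Size of the search space: product of the candidate counts."""
    _, value_lists = flatten(module_settings)
    return prod(len(values) for values in value_lists)


all_vectors = list(settings_vectors(MODULE_SETTINGS))
total = count_vectors(MODULE_SETTINGS)  # 2 * 2 * 3 * 2 = 24
```

Each vector corresponds to one full training and testing run, and the runs are independent of each other, so they can be distributed across grid nodes as separate jobs.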
5
Holistic optimization
The aim is to find clusters in the input-parameter vector space that increase the system performance for the different training/testing tasks.
6
SPEEch on the griD (SPEED)
The aim of the Virtual Team Project is to optimize the training and testing processes of speech technologies by finding clusters in the input-parameter vector space that increase the system performance for the different training/testing tasks. The multi-language virtual team could test the impact of language on the optimization processes. A computer cluster can help to identify the impact of run-time diagnosis on the optimization process, and can facilitate the search for subsets of input parameters that increase the diagnosis performance.
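The slides do not define how such clusters would be found. As one possible reading, the sketch below keeps the settings vectors that beat a baseline score and groups them into connected regions, where two vectors are neighbours if they differ in exactly one setting. The scores, the baseline and the two-component vectors (feature type, number of forward-backward iterations) are invented for illustration.

```python
def hamming(u, v):
    """Number of positions in which two settings vectors differ."""
    return sum(1 for a, b in zip(u, v) if a != b)


def good_clusters(results, baseline):
    """Group above-baseline vectors into regions linked by single changes."""
    good = [vec for vec, score in results.items() if score > baseline]
    unvisited = set(good)
    clusters = []
    for seed in good:
        if seed not in unvisited:
            continue
        unvisited.remove(seed)
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            # Collect neighbours first, then remove them from the set.
            neighbours = [v for v in unvisited if hamming(current, v) == 1]
            for v in neighbours:
                unvisited.remove(v)
                cluster.append(v)
                frontier.append(v)
        clusters.append(sorted(cluster))
    return clusters


# Hypothetical evaluation results: settings vector -> accuracy.
results = {
    ("MFCC", 4): 0.70,
    ("MFCC", 8): 0.76,
    ("MFCC", 12): 0.77,
    ("PLP", 4): 0.78,
    ("PLP", 8): 0.71,
    ("PLP", 12): 0.69,
}
clusters = good_clusters(results, baseline=0.75)
```

With these numbers two regions emerge: `("MFCC", 8)` together with `("MFCC", 12)`, and `("PLP", 4)` on its own. A real study would need a distance measure suited to mixed categorical and numeric settings; the one-change neighbourhood is only the simplest choice.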
7
SPEEch on the griD (SPEED)
Tasks:
1. Establishment of contacts, investigation of the state of the art, formation of a consortium
2. Methodology development for
   a) holistic optimization
      i. ASR (may include speaker identification, speaker recognition and language recognition)
      ii. Text-to-speech (TTS) systems
   b) holistic diagnostics
      i. ASR
      ii. TTS
3. Implementation aspects
   a) porting the computations in the automatic speech processing domain to the grid platform
   b) solving particular domain-dependent problems of using grid computing in automatic speech processing
      i. The large data transfers required and their influence on grid computing speed
      ii. Data security and program security
4. Storage possibilities for large databases in the grid
5. Porting commercial applications to the grid
8
SPEEch on the griD (SPEED)
The consortium is looking for partners with expertise in automatic speech processing, natural language processing, speech and language resources, speech and language modelling, optimization, and high-performance computing, preferably with their own available computational capacity.
9
SPEEch on the griD (SPEED)
Members:
NGIs:
Slovakia:
Milan Rusko (IISAS - Institute of Informatics of the Slovak Academy of Sciences (Leader)), speech processing group
Ladislav Hluchý (IISAS - Institute of Informatics of the Slovak Academy of Sciences (NIL)), grid computing group
Jozef Juhar (Tech. Univ. Košice), speech processing
Ireland:
David O'Callaghan (Trinity College Dublin)
Switzerland:
Milos Cernak (Idiap Research Institute)
UK:
Martin Wynne (University of Oxford)
John Coleman (Phonetics Laboratory at Oxford University)
Claire Devereux (STFC)
EGI.eu:
Nuno Ferreira
Gergely Sipos
We are looking for other partners
Karolis Eigelis
Mailing List:
10
Virtual Team Project SPEEch on the griD (SPEED)
Thank you for your attention. Any questions?