Yes, I'm able to index audio files within Alfresco

#SummitNow 2013, Fernando González (@fegorama)

Hi everyone! I'm Fernando González, and this lightning talk is about indexing audio files within Alfresco.

Why?

There are many reasons to index audio files:
- Many companies hold a lot of audio and video files.
- People need to search audio files for spoken words.
- Many important conversations have to be transcribed.
- Audio indexing makes DAM (Digital Asset Management) more efficient.

AAT (Alfresco Audio Transcriber): what is it?

AAT (Alfresco Audio Transcriber) is an Alfresco module, written in Java, that provides an Alfresco Action for audio transcription using Sphinx-4 from Carnegie Mellon University. The resulting transcription is used to index the spoken words in Alfresco.

What is Sphinx-4?

Sphinx-4 is the recognizer used here. It belongs to CMU Sphinx, a group of speech recognition systems developed at Carnegie Mellon University. The group includes a series of speech recognizers (Sphinx 2 - 4) and an acoustic model trainer (called SphinxTrain).

Elements of Sphinx-4

Sphinx-4 works with two kinds of model:
- A language model, which includes grammars and dictionaries.
- An acoustic model, which describes statistically how speech sounds, for human voice recognition. This software uses the Hidden Markov Model (HMM).
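To give a feel for what decoding with a Hidden Markov Model means, here is a minimal sketch of the Viterbi algorithm, which finds the most likely sequence of hidden states for a sequence of observations. It is a toy illustration only and not how Sphinx-4 is implemented: the class name `ViterbiSketch`, the two states, the three observation symbols and all probabilities are made-up assumptions, and a real recognizer would work with log probabilities and continuous acoustic features.

```java
import java.util.Arrays;

/** Toy Viterbi decoder for a discrete HMM (illustrative only, not Sphinx-4 code). */
final class ViterbiSketch {

    /**
     * @param start start[s]    = P(first state is s)
     * @param trans trans[a][b] = P(next state is b | current state is a)
     * @param emit  emit[s][o]  = P(observing symbol o | state s)
     * @param obs   observed symbol indices
     * @return the most likely state index for each observation
     */
    static int[] decode(double[] start, double[][] trans, double[][] emit, int[] obs) {
        int states = start.length;
        int steps = obs.length;
        if (steps == 0) {
            return new int[0];
        }
        double[][] best = new double[steps][states]; // best path probability ending in each state
        int[][] back = new int[steps][states];       // predecessor state on that best path

        for (int s = 0; s < states; s++) {
            best[0][s] = start[s] * emit[s][obs[0]];
        }
        for (int i = 1; i < steps; i++) {
            for (int s = 0; s < states; s++) {
                int argmax = 0;
                double max = -1.0;
                for (int p = 0; p < states; p++) {
                    double candidate = best[i - 1][p] * trans[p][s];
                    if (candidate > max) {
                        max = candidate;
                        argmax = p;
                    }
                }
                best[i][s] = max * emit[s][obs[i]];
                back[i][s] = argmax;
            }
        }

        // Pick the most probable final state, then walk the back-pointers.
        int last = 0;
        for (int s = 1; s < states; s++) {
            if (best[steps - 1][s] > best[steps - 1][last]) {
                last = s;
            }
        }
        int[] path = new int[steps];
        path[steps - 1] = last;
        for (int i = steps - 1; i > 0; i--) {
            path[i - 1] = back[i][path[i]];
        }
        return path;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 2 hidden states, 3 observation symbols.
        double[] start = {0.6, 0.4};
        double[][] trans = {{0.7, 0.3}, {0.4, 0.6}};
        double[][] emit = {{0.5, 0.4, 0.1}, {0.1, 0.3, 0.6}};
        System.out.println(Arrays.toString(decode(start, trans, emit, new int[] {0, 1, 2})));
    }
}
```

In speech recognition the hidden states would stand for units of speech and the observations for acoustic measurements, but the search idea is the same.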

How does the action work?

The Alfresco Java Action can transcribe audio in four ways:
- By direct execution of the Java Action.
- Through content rules.
- Through UI-Actions in Alfresco Share.
- Through the Alfresco Scheduler, by setting up a scheduler-actions-context.xml file.

Features

The supported features are:
- Use of Sphinx-4 and JSAPI2 for human voice recognition.
- Use of Alfresco events ("policies") to transcribe uploaded content.
- Use of the "scheduler" to transcribe spaces or folders programmatically.
- Use of the Alfresco Java Action "Audio Transcriber" in the user interfaces (Alfresco Share and Alfresco Explorer).
- Maintenance of a list of available audio files.
- Assignment of "aspects" from a custom content model to control transcriptions.

Architecture

AAT uses four main elements:
- The Alfresco API, for developing Java Actions that extend ActionExecuterAbstractBase and scripts in JavaScript.
- The Alfresco Share API, for developing web scripts and UI-Actions.
- JSAPI2 (Java Speech API 2.0), as middleware providing the JSGF and JSML specifications, support for audio redirection, and more.
- The Sphinx-4 API, as the main element for audio recognition and transcription.

Transcriber Action

1. Upload the file (WAV,…).
2. Run the Action.
3. Call the transcriber and recognizer.
4. Capture words and other properties.
5. Index.

When an audio file is uploaded, the Java "Transcriber Action" is called and runs voice recognition using a language model (grammar and dictionaries) and an acoustic model. The captured words are then stored in properties and indexed.
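The flow of the Transcriber Action can be sketched roughly as follows. This is only an illustration under stated assumptions: `WordRecognizer`, `TimedWord`, `TranscriberActionSketch` and the property keys `words` and `frames` are hypothetical names, the recognizer is a stub standing in for the Sphinx-4/JSAPI2 call, and a plain `Map` stands in for the node properties that the real action would write through the Alfresco API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the speech engine; NOT the Sphinx-4 or JSAPI2 API. */
interface WordRecognizer {
    List<TimedWord> recognize(byte[] audio);
}

/** One recognized word plus the time (in milliseconds) at which it was detected. */
final class TimedWord {
    final String text;
    final long startMillis;

    TimedWord(String text, long startMillis) {
        this.text = text;
        this.startMillis = startMillis;
    }
}

final class TranscriberActionSketch {
    // Assumed property names; the real content model may name them differently.
    static final String PROP_WORDS = "words";
    static final String PROP_FRAMES = "frames";

    /** Runs recognition and returns the multi-valued properties to store on the node. */
    static Map<String, Object> transcribe(byte[] audio, WordRecognizer recognizer) {
        List<String> words = new ArrayList<>();
        List<Long> frames = new ArrayList<>();
        for (TimedWord word : recognizer.recognize(audio)) {
            words.add(word.text);
            frames.add(word.startMillis);
        }
        Map<String, Object> properties = new HashMap<>();
        properties.put(PROP_WORDS, words);   // would be indexed by the repository
        properties.put(PROP_FRAMES, frames); // stored but not indexed
        return properties;
    }

    public static void main(String[] args) {
        // A fake recognizer that ignores the audio and returns fixed words.
        WordRecognizer fake = audio -> Arrays.asList(
                new TimedWord("hello", 0L),
                new TimedWord("alfresco", 480L));
        System.out.println(transcribe(new byte[0], fake));
    }
}
```

Keeping the recognizer behind a small interface means the same flow can be exercised with a fake engine, without audio files or acoustic models.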

Model for audio indexing

The custom content model is very simple. It uses a Transcriber aspect to assign two properties, both multi-valued:
- Words: the transcribed text words. Index: atomic and tokenized.
- Frames: the frames/time recorded during detection. Index: none.
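One possible use of two multi-valued properties is to look up where in the recording a word was spoken. The sketch below assumes, and this is only an assumption, that the Words and Frames values are stored in parallel, so that the value at position i of Frames belongs to the word at position i of Words. The class `TranscriptLookupSketch` and the method `framesFor` are hypothetical names, not part of AAT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Illustrative lookup over parallel Words/Frames values (assumed layout). */
final class TranscriptLookupSketch {

    /** Returns the frame/time value of every occurrence of the query word, ignoring case. */
    static List<Long> framesFor(String query, List<String> words, List<Long> frames) {
        if (words.size() != frames.size()) {
            throw new IllegalArgumentException("words and frames must have the same length");
        }
        String needle = query.toLowerCase(Locale.ROOT);
        List<Long> hits = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            if (words.get(i).toLowerCase(Locale.ROOT).equals(needle)) {
                hits.add(frames.get(i));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up transcript values for demonstration.
        List<String> words = Arrays.asList("hello", "alfresco", "share", "alfresco");
        List<Long> frames = Arrays.asList(0L, 480L, 900L, 1500L);
        System.out.println(framesFor("Alfresco", words, frames));
    }
}
```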

Ways to transcribe

- Automatic transcription: uploading, creating or loading audio documents acts as the event that triggers actions and rules.
- Programmed transcription: scheduled actions.
- Interactive transcription: running repository actions and UI-Actions within Alfresco Share and Alfresco Explorer.

Fields of application

There are many fields of application:
- DAM (Digital Asset Management)
- Trial recordings in courts
- Movies and songs in media companies
- Radio and TV
- Education
- And more

To do

The to-do list includes:
- New audio file formats for transcription.
- Internationalization of grammars, dictionaries and acoustic models.
- Specialized dictionaries and thesauri.
- More refactoring.