IPSOM Indexing, Integration and Sound Retrieval in Multimedia Documents.

IPSOM 2 Outline
Objectives 3
Problems to be addressed 4
Research team 5
Background 6
Work plan 7
Dissemination of results 8

IPSOM 3 Objectives
Improved access to spoken information
– spoken interface (accessible by the visually impaired)
– detection and indexing of units in spoken books: words, sentences, topics
Development of multimedia spoken books
– broaden the usage of spoken books (didactic applications, etc.)
– multimedia interfaces for access and retrieval

IPSOM 4 Problems to be Addressed
Spoken books offer the visually impaired community a powerful source of information and leisure; however:
– information is sequentially stored in analogue form (30,000 hours, 2,000 books)
– information retrieval is extremely slow and difficult: error prone, trial-and-error basis, not structured

IPSOM 5 Research Team
Speech Processing Group of INESC Lisboa
– António Serralheiro (PhD)
– Isabel Trancoso (PhD)
– Carlos Teixeira (PhD)
– Diamantino Caseiro (MSc)
– Rui Amaral (MSc)
– Hugo Amorim (UG)*
Large Scale Informatics Laboratory of Faculdade de Ciências de Lisboa
– Nuno Guimarães (PhD)
– Teresa Chambel (MSc)
National Library
– José Borbinha (MSc)

IPSOM 6 Background
Previous work on speech recognition and synthesis, development of spoken corpora and alignment tools
Current work on topic detection for broadcast news recognition (ALERT project)
Previous work on video segmentation and indexing
Current work on techniques and methods for integrating digital video with text (UNIBASE project)
Collaboration with the NISO efforts for the “Talking Book” standard
Collaboration in the DAISY (world-wide initiative) project for digital talking books

IPSOM 7 Work Plan
Duration: 36 months
Manpower: 193 person-months

IPSOM 8 Dissemination of Results
Conferences & Workshops
– National (e.g. RECPAD, PROPOR)
– International (e.g. AACE, ICASSP, ASRU, EUROSPEECH, ICSLP, ECDL)
Final Workshop
– specially aimed at the visually impaired community
Web Site
– dissemination of didactic or other multimedia applications
Browsing tools for spoken books
Digitally stored and indexed spoken books (distributed through BN)
– invaluable resource for data-driven prosody modelling and unit selection for text-to-speech synthesis

IPSOM 9 Budget Overall funding

IPSOM 10 Budget Requested Funding by Institution

IPSOM 11 Spoken Corpora Alignment
Generation of an N-gram framework for topic detection
Generation of phonetic transcriptions for the spoken book texts
– To be done automatically using the grapheme-to-phone module of the DIXI+ project
– Pronunciation of specific proper names or technical terms may eventually need to be manually corrected
Generation of speaker-dependent acoustic models, adapted to the reader of the spoken book
– Context-dependent models are the state of the art for LVCSR tasks
– Especially important for modelling the intra- and inter-word vowel reduction phenomena that characterise European Portuguese
Labelling of the spoken corpora
– Initial segmentation stage
– Knowledge sources derived from the above subtasks
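
As an illustration of the N-gram item above, the following is a minimal sketch of topic detection with per-topic bigram models. It is not the project's method: the add-one smoothing, the toy English training sentences, the topic labels and all names (`BigramTopicModel`, `detect_topic`) are assumptions made for the example. Each topic gets its own bigram model, and a text segment is assigned to the topic whose model gives it the highest log-probability.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return adjacent token pairs, with a start marker before the first token."""
    padded = ["<s>"] + tokens
    return list(zip(padded, padded[1:]))


class BigramTopicModel:
    """Add-one smoothed bigram model trained on the sentences of one topic."""

    def __init__(self, sentences, vocab):
        self.vocab_size = len(vocab) + 1  # one extra slot for unseen words
        self.pair_counts = Counter()
        self.context_counts = Counter()
        for sentence in sentences:
            for prev, word in bigrams(sentence.lower().split()):
                self.pair_counts[(prev, word)] += 1
                self.context_counts[prev] += 1

    def log_prob(self, text):
        """Log-probability of the text under this topic's bigram model."""
        total = 0.0
        for prev, word in bigrams(text.lower().split()):
            numerator = self.pair_counts[(prev, word)] + 1
            denominator = self.context_counts[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def detect_topic(text, models):
    """Pick the topic whose model assigns the text the highest log-probability."""
    return max(models, key=lambda topic: models[topic].log_prob(text))


# Hypothetical toy training data, standing in for topic-labelled book text.
training = {
    "sea": ["the ship left the harbour", "the sailors raised the sail"],
    "farm": ["the farmer ploughed the field", "the cows grazed in the field"],
}
vocab = {w for sents in training.values() for s in sents for w in s.lower().split()}
models = {topic: BigramTopicModel(sents, vocab) for topic, sents in training.items()}

print(detect_topic("the ship raised the sail", models))
```

A real system would train on far larger topic-labelled corpora and would probably use a better smoothing scheme; the sketch only shows the scoring principle.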

IPSOM 12 Multimedia Applications for Indexing and Retrieval
Search and retrieval models
– keyword, combined indexes, topics, and standardised metadata
Access interface, visualisation and navigation on the "spoken books" base
– Search
– Query
– Retrieval
– Navigation
Integration of the tools/applications for "browsing" and "retrieval" with the phonographic base
– Access interfaces have to be designed and performance issues addressed
Usability testing and evaluation
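
The keyword search model listed above can be sketched as an inverted index over time-aligned words, so that a query returns playback positions rather than page numbers. This is only an assumed design for illustration: the `Hit` record, the function names, the book title and the timings are hypothetical, and the input is presumed to come from an alignment step that gives each word a start and end time in the recording.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One occurrence of a keyword: which book, and where in the audio (seconds)."""
    book: str
    start_s: float
    end_s: float


def build_index(aligned_books):
    """Map each lower-cased word to the list of places where it is spoken.

    aligned_books: {book title: [(word, start_s, end_s), ...]}
    """
    index = defaultdict(list)
    for book, words in aligned_books.items():
        for word, start_s, end_s in words:
            index[word.lower()].append(Hit(book, start_s, end_s))
    return index


def search(index, keyword):
    """Return all occurrences of a single keyword, ordered by book and time."""
    hits = index.get(keyword.lower(), [])
    return sorted(hits, key=lambda h: (h.book, h.start_s))


# Hypothetical alignment output for a short stretch of one spoken book.
aligned_books = {
    "Example Book": [
        ("Lisboa", 12.4, 12.9),
        ("is", 12.9, 13.0),
        ("old", 13.0, 13.4),
        ("Lisboa", 95.1, 95.7),
    ],
}
index = build_index(aligned_books)

for hit in search(index, "lisboa"):
    print(f"{hit.book}: jump to {hit.start_s:.1f} s")
```

Storing start times alongside each word is what lets a browsing tool jump straight into the recording instead of scanning it sequentially; sentence and topic indexes could reuse the same structure with coarser units.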