Texmex – November 15th, 2005

Strategy for the future

Global goal
  - "Understand" (= structure…) TV and other multimedia (MM) documents
  - Prepare these documents for applications (repurposing, archiving…)
Technically speaking
  - Structuring and segmenting the documents
  - Associating usable semantics with the documents
  - Organizing, querying, navigating, classifying large collections
Strategy
  - Studying various pairings of media and techniques on practical problems
  - Extending them to the global MM problem
Speech transcription and NLP

From specific objectives…
  - Improving transcription with NLP techniques (tagging, topic detection)
  - NLP with degraded text (errors, no structure)
…To our global MM goal
  - Both needed to grasp semantics in video
Text and Image

From specific objectives…
  - Associating textual descriptions with images (not only keywords)
  - Using similar images to find related texts
…To our global MM goal
  - Add automatic textual annotation to MM documents
  - A first step towards semantics of MM…
Multimedia Models

From specific objectives…
  - Stochastic modeling with several data streams / several temporal rates / weakly synchronized data
…To our global MM goal
  - Document structuring using all the media of a document
Media data analysis

From specific objectives…
  - Classification, sampling, clustering, distribution modeling of high-dimensional symbolic and numeric data
  - Models and algorithms for handling large collections of data
…To our global MM goal
  - Basic toolkit for each medium
  - Can be used for MM documents
  - All media together???
NLP, Databases and Search Engines

From specific objectives…
  - More realistic text representations for search engines (SE) and databases (DB)
  - More semantic querying and retrieval
…To our global MM goal
  - Semantic querying of MM collections using transcribed text
Numerical Descriptors Indexing

From specific objectives…
  - Efficient similarity search for high-dimensional numeric data
  - Evaluation of descriptor recognition power on large collections
…To our global MM goal
  - Efficient indexing of audio, image… sequences
The Big Picture

[Diagram centered on MULTIMEDIA; the grouping of names by area is not recoverable from the extracted text.]
Areas: Image and video; Sound and speech; NLP; Data analysis; Databases; Machine Learning; Platforms
Names shown, in extracted order: VISTA, QGAR, LAGADIC, TEMICS, IMEDIA, MRIM, METISS, IRIT, DREAM, SYMBIOSE, AXIS, R2D2, SYMBIOSE, PARIS, ATLAS, GEMO, Thomson, Canon, Reykjavík, Geneva, Nagoya – NII, Montreal, Geneva, Dublin, Geneva, Croatia, Slovenia, ERSS, Thomson, FT, INA, FT
"I want my PC to understand TV"…

Coupling the media
Speech and NLP for semantic understanding
  - Improving speech transcription with NLP techniques
  - NLP on degraded text (spoken language, no punctuation, errors…)
Finding similar audio or video sequences
  - Image indexing applied to TV sequences
  - Indexing algorithms for temporal descriptors
Image, Video and Sound for document structuring
  - Program segmentation and identification
Structuring models
  - Segment models with speech and text
  - Bayesian models for sparse event detection
"I want my PC to use TV"…

From well-defined contexts to open worlds
  - Sport videos, collections of programs
  - TV streams
From techniques to applications
  - Archives management
  - Repurposing
  - Home media management
Close to real and future problems
  - Relations with suppliers (Thomson)
  - Relations with service providers (FT)
  - Relations with end-users (INA)
Challenges

Corpora are still difficult to get
  - Copyright issues; annotation is extremely expensive
We need more expertise
  - In information retrieval and search engines
  - In human-computer interaction
A unique environment
  - In terms of fields concerned
  - In experimental platforms
  - In links with partners and industry
It remains complex to make everything work together
It should allow very nice and original results!