Genoa – May 23, 2006 LREC workshop From Media Crossing to Media Mining Franciska de Jong University of Twente/TNO ICT

Overview
Introduction
Cluster-oriented browsing
Reasoning
Audio indexing
Conclusion

Semantic Gap etc. (1)
Need for access to information at the conceptual level is as old as the idea of Information Retrieval
manual annotation by documentalists
  – limitation: future user needs hard to predict
text as second best:
  – full-text indexing (all words/phrases)
  – information extraction (some words/phrases)
  – …
  – but how about other modalities?

Semantic Gap etc. (2)
Approach for modalities other than text:
exploit collateral linguistic elements
  – text: captions, teleprompter text, subtitles
  – transcribed speech
enable automated semantic annotation
  – identification of top-n relevant concepts
  – train detectors (= automated learning of concepts based on low-level features)
  – ontological frameworks: translation of feature patterns into a concept hierarchy

Semantic Gap etc. (3)
W.V.O. Quine, ‘Word and Object’ (1960)
Motto: Ontology recapitulates philology (James Miller)
“Quine argues that the notion of a language-transcendent ‘sentence meaning’ must be rejected; meaningful studies in the semantics of reference can only be directed toward substantially the same language in which they are conducted”.

Media Crossing
Dominant search tasks:
  – text-to-image
  – speech-to-video
  – concept-to-image/video

Media Crossing as promising concept
limitations of monomedia analysis overcome
fills the semantic gap between content features and user needs
full range of available data can be exploited, including manual annotation records
mature idea; early projects already in the ’90s (e.g., THISL)
TRECVID demonstrates that it works
…

Media Crossing as poor concept
Little progress in 10 years
Many projects, but few implementations of fully automated media crossing applications for real-life data sets/use cases
Strong bias to text-to-image
Workflow in archives often prohibits adoption of possibilities
Why wait for the breakthrough of an old idea?

Other X-ing fields
Important parallels in Language Crossing
Machine Translation
  – idea is even older (’50s)
  – successes are rare
  – number of languages covered is an indication of sophistication
  – interesting concepts: language-specific vs. interlingual vs. language-independent representations of meaning
CLIR (Cross-lingual Information Retrieval)
  – on the agenda since the early ’90s (TREC, CLEF)
  – focus now on tasks for which heuristics play a huge role (QA, image retrieval)
  – few people (if any) got rich

Mining has a tradition
Important parallels
  – Data Mining
  – Text Mining
  – Audio Mining
  – Media Mining
  – Reality Mining
  – Virtual Reality Mining
  – …

Media Mining
Ill-defined concept, but combines at least some of these features:
  – Mining: finding patterns that haven’t been put in
  – Content-oriented rather than query-based
  – Format integration vs. format crossing
  – Not limited to combinations of 2 modalities
  – Not limited to text as starting point
  – Not limited to uni-directional approaches
  – Emphasis on automated analysis
  – …

Initial steps
Three illustrations of initial steps in the right direction:
  – content reduction via clustering: Novalist
  – content merging via reasoning: MUMIS
  – content enhancement via audio analysis: MultimediaN

case 1 - Novalist
layered browser for a news corpus
heterogeneous in type, format, source:
  – news-related broadcast programmes
  – newspapers
  – webpages
  – corporate documents
20+ titles, covering 2 years
topic clustering (currently based on text only)
multifaceted metadata extraction
no explicit semantics

Content reduction
Automatically generated cluster metadata
  – keywords
  – thesaurus terms (via automatic classification)
  – lists of named entities
  – network presentation for named entities
  – headlines
  – summarization (via extraction technique)
  – timeline
All metadata types can be queried. In addition: full-text search
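
The slides do not say how Novalist clusters documents or selects cluster keywords. As a rough illustration of text-only topic clustering with automatically generated keywords, the sketch below uses TF-IDF vectors, cosine similarity and a greedy single-pass clustering step. The function names, the similarity threshold and the weighting scheme are all assumptions made for this example, not the project's method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            # Smoothed IDF so that every term keeps a positive weight.
            term: tf * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, tf in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_documents(docs, threshold=0.2):
    """Greedy single-pass clustering (assumed technique, hypothetical threshold).

    Each document joins the most similar existing cluster if the similarity
    to that cluster's summed vector reaches the threshold; otherwise it
    starts a new cluster. Returns (clusters of document indices, centroids).
    """
    clusters, centroids = [], []
    for i, vec in enumerate(tfidf_vectors(docs)):
        sims = [cosine(vec, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            clusters[best].append(i)
            for term, weight in vec.items():
                centroids[best][term] = centroids[best].get(term, 0.0) + weight
        else:
            clusters.append([i])
            centroids.append(dict(vec))
    return clusters, centroids


def cluster_keywords(centroid, k=3):
    """Pick the k highest-weighted terms of a cluster as its keywords."""
    ranked = sorted(centroid.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]
```

In a browser like the one described, the keyword list would be only one of several metadata facets per cluster; named entities, headlines and timelines would need their own extraction steps.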

query: orkaan (‘hurricane’); overview of clusters per period


case 2 - MUMIS
completed IST project ( )
reports on EC soccer matches
heterogeneous content base:
  – multiple sources (speech transcripts, webpages, ticker text, newspapers, tables)
  – multiple languages
target:
  – searchable knowledge base
  – time links to video content base
  – reduced redundancy (each event covered only once)
  – error correction
approach: merging results from Information Extraction
  – time-alignment, unification, re-ordering

Merging IE results
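
The merging step (time-alignment, unification, re-ordering) can be pictured with a small sketch. It assumes that each Information Extraction result is a dict with hypothetical fields `minute`, `type`, `player`, `team` and `source`, and that two results describe the same event when their types match and their times differ by at most a tolerance. The field names, the tolerance and the majority-vote rule are illustrative assumptions; the slides do not specify the actual MUMIS procedure.

```python
from collections import Counter


def compatible(a, b, tolerance):
    """Two extracted events may be the same event: same type, close in time."""
    return a["type"] == b["type"] and abs(a["minute"] - b["minute"]) <= tolerance


def unify(group):
    """Merge a group of partial event descriptions into one record.

    Missing fields are filled from other sources; conflicting values are
    resolved by majority vote (ties go to the smallest value), which gives
    a simple form of error correction.
    """
    merged = {"type": group[0]["type"]}
    for field in ("minute", "player", "team"):
        values = [e[field] for e in group if e.get(field) is not None]
        if values:
            counts = Counter(values)
            top = max(counts.values())
            merged[field] = min(v for v, c in counts.items() if c == top)
    merged["sources"] = sorted({e["source"] for e in group})
    return merged


def merge_events(events, tolerance=1):
    """Time-align, unify and re-order events extracted from several sources."""
    groups = []
    for event in sorted(events, key=lambda e: e["minute"]):
        for group in groups:
            # Compare against the most recent member of each group.
            if compatible(group[-1], event, tolerance):
                group.append(event)
                break
        else:
            groups.append([event])
    return sorted((unify(g) for g in groups), key=lambda e: e["minute"])


# Made-up sample data: three sources report the same goal with gaps.
SAMPLE = [
    {"minute": 23, "type": "goal", "player": "Player A", "team": None, "source": "ticker"},
    {"minute": 24, "type": "goal", "player": None, "team": "NL", "source": "speech"},
    {"minute": 23, "type": "goal", "player": "Player A", "team": "NL", "source": "newspaper"},
    {"minute": 60, "type": "yellow_card", "player": "Player B", "team": "NL", "source": "ticker"},
]
```

Calling `merge_events(SAMPLE)` yields one record per event, so each event is covered only once and the record lists the sources that contributed to it.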

case 3 – MultimediaN
Content enhancement via audio analysis
  – segmentation (topic, speaker)
  – speech recognition
  – time-alignment (enrichment of text with time-stamps)
  – linking of audio to newspaper archive
  – language model improvement via cross-media linking
Emotion detection
Application domains: news, meeting recordings, oral history

feature extraction from audio
extract features
  – speech (words)
  – speaker (who, when)
  – structure (silence, music, speaker change)
  – emotion

Time-alignment
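
One simple way to picture the enrichment of text with time-stamps is to align an existing text with speech recognition output whose words already carry start times, and to copy the times onto the matching words. The sketch below does this with Python's standard `difflib.SequenceMatcher`. The input format (a list of `(word, start_seconds)` pairs) and the linear interpolation for unmatched words are assumptions for illustration; the slides do not describe the alignment method used in MultimediaN.

```python
from difflib import SequenceMatcher


def align_text_to_asr(text_words, asr_words):
    """Copy ASR start times onto the words of a clean text.

    text_words: list of words from the text to be enriched.
    asr_words:  list of (word, start_seconds) pairs from a recognizer
                (hypothetical input format).
    Returns a list of (word, start_seconds or None).
    """
    reference = [w.lower() for w in text_words]
    hypothesis = [w.lower() for w, _ in asr_words]
    matcher = SequenceMatcher(a=reference, b=hypothesis, autojunk=False)
    stamps = [None] * len(text_words)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            stamps[block.a + offset] = asr_words[block.b + offset][1]
    return list(zip(text_words, stamps))


def interpolate_gaps(aligned):
    """Estimate times for unmatched words lying between two matched words.

    Words before the first or after the last matched word keep None.
    """
    words = [w for w, _ in aligned]
    times = [t for _, t in aligned]
    known = [i for i, t in enumerate(times) if t is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            fraction = (i - left) / (right - left)
            times[i] = times[left] + fraction * (times[right] - times[left])
    return list(zip(words, times))
```

Words the recognizer got wrong (for example a misrecognized verb form) stay unmatched after the first step and receive an estimated time in the second, so the whole text becomes usable as an index into the audio.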

Next steps
Attention for:
  – Heterogeneous content integration should get more attention
  – Other modalities than text, speech and image
  – Integration of search results via more abstract (medium-neutral) content models (e.g., probabilistic approaches to image search models, ‘visual words’ approaches)
  – Exploitation of manually created annotations and surface features for video (context models) can help
  – Mining is data-oriented; user interaction can offer additional information
  – Explore the concept of a parameterized search environment

Conclusion
Why wait for the breakthrough of an old idea?
The idea of Media Crossing has offered a useful playground for 10 years, and some applications based on it have added value, but it should not be seen as a concept rich enough to be the basis of a long-term research programme.

Thanks!
PS. Has this been recorded... ?