Published by Edwin Dorsey. Modified over 8 years ago.
1
Multimedia Semantic Analysis in the PrestoSpace Project
Valentin Tablan, Hamish Cunningham, Cristian Ursu
NLP Research Group
University of Sheffield
Regent Court, 211 Portobello Street, Sheffield, S1 4DP, UK
http://nlp.shef.ac.uk, http://gate.ac.uk
2
LREC 2006, Genoa, Italy – Crossing Media Workshop
Project Mission
The 20th Century was the first with an audiovisual record.
Audiovisual media became the new form of cultural expression.
These historical, cultural and commercial assets are now entirely at risk from deterioration.
PrestoSpace aims to provide technical devices and systems for digital preservation of all types of audio-visual collections.
3
The Partners
IP, 34 partners
Steering Board:
  Institut National de l’Audiovisuel, INA (France)
  British Broadcasting Corporation, BBC (UK)
  Radiotelevisione Italiana, RAI (Italy)
  Joanneum Research, JRS (Austria)
  Netherlands Institute for Sound and Vision - Beeld en Geluid, B&G (The Netherlands)
  Oesterreichischer Rundfunk, ORF (Austria)
  University of Sheffield, USFD (UK)
4
Project Organisation
5
Semantic Analysis – Motivation
Sizeable archives plus new material produced daily (BBC has 8 TV and 11 radio national channels).
Some of this material can be reused in new productions.
Access to archive material can be provided by some form of semantic annotation and indexing, but manual annotation is time consuming (up to 10x real time) and expensive.
Archive budgets alone cannot support digitisation effort.
6
English SA - RichNews
A prototype addressing the automation of semantic annotation for multimedia material.
Not aiming at reaching performance comparable to that of human annotators.
Fully automatic.
Aimed at news material; further extensions envisaged.
TV and radio news broadcasts from the BBC were used during development and testing.
7
Overview
Input: multimedia file
Output: OWL/RDF descriptions of content
  Headline (short summary)
  List of entities (Person/Location/Organization/…)
  Related web pages
  Segmentation
Multi-source Information Extraction system
  Automatic speech transcript
  Subtitles/closed captions
  Related web pages
  Legacy metadata
8
Using ASR Transcripts
ASR is performed by the THISL system.
Based on the ABBOT connectionist speech recognizer.
Optimized specifically for use on BBC news broadcasts.
Average word error rate of 29%.
Error rate of up to 90% for out-of-studio recordings.
No capitalisation, hence limited IE capability.
9
ASR error examples
ASR output: he was suspended after his arrest [SIL] but the process were set never to have lost confidence in him
Correct transcript: he was suspended after his arrest [SIL] but the Princess was said never to have lost confidence in him
ASR output: and other measures weapons inspectors have the first time entered one of saddam hussein's presidential palaces
Correct transcript: United Nations weapons inspectors have for the first time entered one of saddam hussein's presidential palaces
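The word error rate quoted for the ASR transcripts can be illustrated on the two sentence pairs above. This is a minimal sketch, not part of THISL: it assumes that the garbled sentence in each pair is the recogniser output and the other one the correct transcript, that [SIL] markers are ignored, that case is ignored, and that WER is the usual word-level edit distance divided by the reference length. The function names are illustrative.

```python
def tokenize(text):
    """Lowercase, split on whitespace, and drop [SIL] silence markers."""
    return [w for w in text.lower().split() if w != "[sil]"]


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # word missing from hypothesis
                            curr[j - 1] + 1,          # extra word in hypothesis
                            prev[j - 1] + (r != h)))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by reference length."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    return edit_distance(ref, hyp) / len(ref)


REF_1 = ("he was suspended after his arrest [SIL] but the Princess was said "
         "never to have lost confidence in him")
HYP_1 = ("he was suspended after his arrest [SIL] but the process were set "
         "never to have lost confidence in him")
REF_2 = ("United Nations weapons inspectors have for the first time entered "
         "one of saddam hussein's presidential palaces")
HYP_2 = ("and other measures weapons inspectors have the first time entered "
         "one of saddam hussein's presidential palaces")

if __name__ == "__main__":
    print(f"pair 1: {wer(REF_1, HYP_1):.2%}")
    print(f"pair 2: {wer(REF_2, HYP_2):.2%}")
```

Under these assumptions the first pair has 3 word errors over 18 reference words (about 17%) and the second has 4 over 16 (25%). In both cases the errors fall on the named entities, which is what hurts information extraction most.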
10
Architecture
[Diagram] Components shown:
  Media File
  THISL Speech Recogniser
  C99 Topical Segmenter
  TF.IDF Key Phrase Extraction
  Web-Search and Document Matching
  KIM Information Extraction
  Degraded Text Information Extraction
  Entity Validation
  Manual Annotation (Optional)
  Semantic Index
11
Search for Related Pages
ASR transcript segmented using C99.
Key-phrases found for each segment using TF.IDF.
Any sequence of up to three words can be a phrase; up to four phrases extracted per story.
Key-phrases used to search the BBC, Times, Guardian and Telegraph newspaper websites.
Searches are restricted to the day of broadcast, or the day after.
The text of the returned web pages is compared to the text of the transcript to find matching stories.
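The key-phrase step can be sketched as follows. This is only an illustration of TF.IDF scoring over word sequences of up to three words, keeping the top four per story; it is not the RichNews implementation. In particular, computing document frequency over the segments of the same broadcast, the plain log(N/df) weighting, the alphabetical tie-break and the function names are all assumptions.

```python
import math
from collections import Counter


def ngrams(words, max_n=3):
    """Yield every sequence of 1..max_n consecutive words as a phrase."""
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def key_phrases(segments, max_n=3, top_k=4):
    """Return up to top_k highest-scoring TF.IDF phrases for each segment."""
    counts = [Counter(ngrams(seg.lower().split(), max_n)) for seg in segments]

    # Document frequency: number of segments in which each phrase occurs.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    n_docs = len(segments)
    result = []
    for c in counts:
        scores = {p: tf * math.log(n_docs / df[p]) for p, tf in c.items()}
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda p: (-scores[p], p))
        result.append(ranked[:top_k])
    return result


if __name__ == "__main__":
    # Invented toy segments, standing in for C99 output.
    stories = [
        "weapons inspectors entered the palace weapons inspectors",
        "the princess was said to trust him",
        "the election results were announced",
    ]
    for phrases in key_phrases(stories):
        print(phrases)
```

Words that occur in every segment (here "the") score zero and drop out, while phrases repeated within a single story rise to the top. A real system would more likely take document frequencies from a large reference corpus.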
12
Using the Web Pages
The web pages contain:
  A headline, summary and section for each story.
  Good quality text that is readable, and contains correctly spelt proper names.
They give more in-depth coverage of the stories.
13
Semantic Annotation
The KIM knowledge management system can semantically annotate the text derived from the web pages:
  KIM will identify people, organizations, locations etc.
  KIM performs well on the web page text, but very poorly when run on the transcripts directly.
This allows for semantic ontology-aided searches for stories about particular people, locations, etc.
So we could search for people called Sydney, which would be difficult with a text-based search.
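A toy illustration of the "Sydney" example: once each mention carries an entity type, a query can ask for people called Sydney and skip the city. The Annotation record, the type labels, the story identifiers and the sample data are all invented for this sketch; it does not use KIM's API or ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    story_id: str     # hypothetical story identifier
    text: str         # surface form of the mention
    entity_type: str  # e.g. "Person", "Location" (illustrative labels)


def find_stories(annotations, name, entity_type):
    """Return sorted story ids with a mention of the given type containing name."""
    wanted = name.lower()
    return sorted({
        a.story_id
        for a in annotations
        if a.entity_type == entity_type and wanted in a.text.lower().split()
    })


def text_search(annotations, name):
    """Type-blind baseline: any mention containing the word matches."""
    wanted = name.lower()
    return sorted({a.story_id for a in annotations
                   if wanted in a.text.lower().split()})


# Invented sample annotations.
ANNOTATIONS = [
    Annotation("story-1", "Sydney", "Location"),
    Annotation("story-2", "Sydney Jones", "Person"),
    Annotation("story-3", "Sydney Harbour Bridge", "Location"),
]

if __name__ == "__main__":
    print(text_search(ANNOTATIONS, "Sydney"))             # all three stories
    print(find_stories(ANNOTATIONS, "Sydney", "Person"))  # only the person
```

The text-only search returns every story mentioning the word, while the typed search narrows the result to the single story about a person.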
14
Entity Matching
15
Evaluation
Evaluation based on 66 news stories from 9 half-hour news broadcasts.
Web pages were found for 40% of stories.
7% of pages reported a closely related story, instead of that in the broadcast.
Lenient recall: 47%, precision: 100%.
Results are based on an earlier version of the system, only using BBC web pages.
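How precision and recall figures of this kind are computed can be sketched as below. The slide gives no raw counts, so the numbers in the example are assumptions: 31 stories matched out of 66 is chosen only because it rounds to the reported 47%, and zero wrong matches is what a precision of 100% implies. Treating recall as matched stories over all stories is also an assumption about the scoring.

```python
def precision_recall(correct, incorrect, total):
    """Precision over returned matches, recall over all stories."""
    returned = correct + incorrect
    precision = correct / returned if returned else 0.0
    recall = correct / total if total else 0.0
    return precision, recall


if __name__ == "__main__":
    # Assumed counts (not given on the slide): 31 of 66 stories matched
    # acceptably under lenient scoring, and no wrong pages returned.
    p, r = precision_recall(correct=31, incorrect=0, total=66)
    print(f"precision={p:.0%} recall={r:.0%}")
```

A system tuned this way returns a page only when the match is strong, which keeps precision high at the cost of leaving roughly half of the stories without a related page.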
16
Future Improvements
Use teletext subtitles (closed captions) when they are available.
Better story segmentation through visual cues.
Use for different domains and languages.
17
Thank you!
More information:
http://www.prestospace.org
http://gate.ac.uk