Story Segmentation of Broadcast News
Mehrbod Sharifi (thanks to Andrew Rosenberg)
~mehrbod/presentations/SSegDec06.pdf

2/19 GALE (Global Autonomous Language Exploitation)
"… to absorb, analyze and interpret huge volumes of speech and text in multiple languages, eliminating the need for linguists and analysts and automatically providing relevant, distilled actionable information …"
- Transcription Engines (ASR)
- Translation Engines (MT)
- Distillation Engines (QA+IR)

3/19

4/19 Task: Story Segmentation
Input:
- .sph: audio files from the TDT-4 corpus, distributed by the LDC
- .rttmx: output from other collaborators on the GALE project (all automated, one word per row)
  - Speaker boundaries (Chuck at ICSI)
  - ASR: words, start & end times, confidences, phone durations (Andreas at SRI/ICSI)
  - Sentence boundary probabilities (Dustin at UW)
- Gold standard: annotated story boundaries
Output:
- .rttmx files with story boundaries, generated by a method that performs well on unseen data
/n/squid/proj/gale1/AA/eng-tdt4/tdt4-eng-rttmx /README
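
A minimal sketch of reading such a one-word-per-row file is given below. The actual .rttmx column layout is not specified on this slide, so the field names and their order here are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class WordToken:
    word: str
    start: float       # start time in seconds (assumed column)
    end: float         # end time in seconds (assumed column)
    confidence: float  # ASR confidence score (assumed column)
    speaker: str       # speaker label (assumed column)

def load_word_rows(path: str) -> List[WordToken]:
    """Read a whitespace-delimited, one-word-per-row file, skipping blank/comment lines."""
    tokens = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith(";;"):
                continue
            fields = line.split()
            tokens.append(WordToken(
                word=fields[0],
                start=float(fields[1]),
                end=float(fields[2]),
                confidence=float(fields[3]),
                speaker=fields[4],
            ))
    return tokens
```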

5/19 Task: Story Segmentation
Event: a specific thing that happens at a specific time and place, along with all necessary preconditions and unavoidable consequences.
  Example: when a U.S. Marine jet sliced a funicular cable in Italy in February 1998, the cable car's crash to earth and the subsequent injuries were all unavoidable consequences and thus part of the same event.
Topic: an event or activity, along with all directly related events and activities.
Story: news stories may be of any length, even fewer than two independent clauses, as long as they constitute a complete, cohesive news report on a particular topic. Note that a single news story may discuss more than one related topic.

6/19 Task: Story Segmentation
Example: 3898 words / 263 sentences / 26 stories (?: reject or low-confidence word)
1. ?
2. ? ? [headlines] ? ?
3. good evening everyone ... [report on war] ... gillian findlay a. b. c. news ?
4. turning to politics ... [election - Gore] ... a. b. c. news ? ?
5. this is ron claiborne ... [election - Bush] ... a. b. c. news ? ?
6. ? as for the two other candidates ... said the same
7. still ahead ... [teaser] ... camera man
8. this is world news ... [commercials] ... was a woman
9. turning to news overseas ... [election] ... no matter what
10. its just days after a deadly ferry sinking in greece ... safety tests
~mehrbod/rttmx/eng/ _1830_1900_ABC_WNT.rttmx
~mehrbod/out/eng.ANC_WNT.txt

7/19 Task: Story Segmentation
How difficult is it?
- Topic vs. Story
- Segment classes: new story, teaser, misc., under-transcribed
- Error accumulated from previous processes

8/19 Current Approach - Summary
- Align story boundaries with sentence boundaries
- Extract sentence-level features: lexical, acoustic, speaker-dependent
- Train and evaluate a classifier (J48 decision tree or JRip rule learner)
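
To make the train-and-evaluate step concrete, here is a minimal sketch. The slides use Weka's J48 and JRip; scikit-learn's DecisionTreeClassifier stands in for J48 purely to illustrate the workflow, and the feature matrix X and boundary labels y (one row per sentence) are random placeholders.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_recall_fscore_support

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))     # placeholder sentence-level feature vectors
y = rng.integers(0, 2, size=200)   # placeholder story-boundary labels (1 = boundary)

clf = DecisionTreeClassifier(min_samples_leaf=5, random_state=0)
pred = cross_val_predict(clf, X, y, cv=10)   # 10-fold cross-validation over sentences
p, r, f1, _ = precision_recall_fscore_support(y, pred, average="binary")
print(f"P={p:.2f}  R={r:.2f}  F1={f1:.2f}")
```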

9/19 Current Approach - Features
Lexical (* = computed over various windows):
- TextTiling*, LCSeg, keywords*, sentence position and length
Acoustic:
- Pitch and intensity: min, max, median, mean, std. dev., mean absolute slope
- Pause, speaking rate (voiced frames / total frames)
- Vowel duration: mean vowel length, sentence-final vowel length, sentence-final rhyme length
- Second-order versions of the above
Speaker:
- Speaker distribution, speaker turn, first in the show
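
As a sketch of the acoustic summary statistics listed above, the function below computes min, max, median, mean, standard deviation and mean absolute slope over a per-sentence pitch track. Extracting the pitch track itself (e.g., with Praat) is outside this sketch; pitch_hz is an assumed array of voiced F0 values for one sentence, sampled every frame_step seconds.

```python
import numpy as np

def pitch_stats(pitch_hz: np.ndarray, frame_step: float = 0.01) -> dict:
    """Summary statistics over one sentence's pitch track (voiced frames only)."""
    slope = np.diff(pitch_hz) / frame_step   # Hz per second between adjacent frames
    return {
        "f0_min": float(np.min(pitch_hz)),
        "f0_max": float(np.max(pitch_hz)),
        "f0_median": float(np.median(pitch_hz)),
        "f0_mean": float(np.mean(pitch_hz)),
        "f0_std": float(np.std(pitch_hz)),
        "f0_mean_abs_slope": float(np.mean(np.abs(slope))),
    }
```

The same statistics can be computed over an intensity track, and the "second order" features would then be these statistics applied to the frame-to-frame differences.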

10/19 Current Approach - Results
Reported in the HLT paper for the full feature set at the sentence level:

Language     F1 (p, r)          Pk    WinDiff    Cseg
English      .421 (.67, .32)
Mandarin     .592 (.73, .50)
Arabic       .300 (.65, .19)

Pk (Beeferman et al., 1999), WindowDiff (Pevzner and Hearst, 2002), Cseg (Doddington, 1998)
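
The Pk and WindowDiff metrics cited above have reference implementations in NLTK; a small sketch of computing them is shown below. Segmentations are encoded as strings with '1' marking a story boundary after a sentence and '0' otherwise, and the example strings are made up for illustration (lower is better for both metrics).

```python
from nltk.metrics.segmentation import pk, windowdiff

ref = "0001000010000100"   # gold story boundaries, one digit per sentence
hyp = "0000100010001000"   # hypothesized boundaries

# Pk's convention: k is half the average reference segment length.
# Use the same k for WindowDiff so the two scores are comparable.
k = max(1, round(len(ref) / (ref.count("1") + 1) / 2))
print("Pk        :", pk(ref, hyp, k=k))
print("WindowDiff:", windowdiff(ref, hyp, k))
```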

11/19 Improvements In Progress
- Looking for ways to reduce the negative effect of errors inherited from upstream processes (ASR, SU and speaker detection)
- Adding/modifying features to make them more robust to error
- Analyzing the current features and discarding those that are not discriminative or descriptive enough
- Improving the framework for the package

12/19 Word Level vs. Sentence Level
Pros:
- Eliminates the error from sentence boundary detection (it becomes a feature)
- No need for story boundary alignment
Cons:
- More chance for error and a lower baseline
- Higher risk of overfitting

13/19 Word Level vs. Sentence Level

14/19 Word Level - Features
Provide information about a window preceding, surrounding or following the current word:
- Acoustic features were computed over windows of five words
- Similar idea for other features, e.g., speaker_boundary, same_speaker_5, same_speaker_10, same_speaker_20 (each in {TRUE, FALSE})
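
A sketch of the windowed speaker features named above (same_speaker_5, same_speaker_10, same_speaker_20): for each word, has the speaker label stayed the same over the previous N words? The per-word speakers list is assumed to come from the speaker-boundary input.

```python
from typing import Dict, List

def same_speaker_features(speakers: List[str], i: int,
                          widths=(5, 10, 20)) -> Dict[str, bool]:
    """For word i, report whether the last n words all share one speaker label."""
    feats = {}
    for n in widths:
        window = speakers[max(0, i - n):i + 1]
        feats[f"same_speaker_{n}"] = len(set(window)) == 1
    return feats

# Toy example: a speaker change occurs at word 5.
speakers = ["spkA"] * 5 + ["spkB"] * 10
print(same_speaker_features(speakers, 7))    # all False: the change falls inside every window
print(same_speaker_features(speakers, 12))   # same_speaker_5 is True, wider windows are False
```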

15/19 Word Level - Features
Feature analysis of the sentence-level features, e.g., for the ABC show, using Weka (top features ranked by each criterion):

Chi Square          Information Gain
sent_position       speaker_distribution
pauselen            keywords_after_10
start_time          end_time
keywords_after_5    keywords_after_5
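
A sketch of this kind of feature ranking, using scikit-learn's chi2 and mutual_info_classif as analogues of Weka's Chi Square and Information Gain attribute evaluators. The feature matrix, labels and feature names below are placeholders, not the project's data.

```python
import numpy as np
from sklearn.feature_selection import chi2, mutual_info_classif

rng = np.random.default_rng(0)
names = ["sent_position", "pauselen", "start_time", "keywords_after_5",
         "speaker_distribution", "keywords_after_10", "end_time"]
X = rng.random(size=(500, len(names)))   # chi2 requires non-negative feature values
y = rng.integers(0, 2, size=500)         # placeholder story-boundary labels

chi_scores, _ = chi2(X, y)
ig_scores = mutual_info_classif(X, y, random_state=0)

for title, scores in [("Chi Square", chi_scores), ("Information Gain", ig_scores)]:
    ranked = sorted(zip(names, scores), key=lambda t: -t[1])
    print(title + ":", [name for name, _ in ranked])
```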

16/19 Word Level - Features
- Word ASR confidence (raw score, or score < 0.8): Boolean and count in various window widths
- Word introduction
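
A short sketch of the confidence features above, assuming a per-word list of ASR confidence scores: a Boolean "is this word low-confidence (score < 0.8)?" plus counts of low-confidence words in surrounding windows of several widths.

```python
from typing import Dict, List

def confidence_features(confidences: List[float], i: int,
                        threshold: float = 0.8,
                        widths=(5, 10, 20)) -> Dict[str, float]:
    """Low-confidence indicator for word i, plus counts in surrounding windows."""
    feats = {"low_conf": confidences[i] < threshold}
    for n in widths:
        window = confidences[max(0, i - n):i + n + 1]
        feats[f"low_conf_count_{n}"] = sum(c < threshold for c in window)
    return feats
```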

17/19 Word Level - Results

18/19 Future Directions
- Finding a reasonable segmentation strategy, followed by clustering on features extracted from the segments:
  - Sentences => A+L+S
  - Pause => L
  - Acoustic tiling => L+S
- Sequential modeling
- Performing more morphological analysis, particularly for Arabic
- Using the rest of the story and topic labels
- Using other parts of TDT and/or external information for training: WordNet, WSJ, etc.
- Experimenting with other classifiers: JRip, SVM, Bayesian, GMM, etc.

19/19 Thank you. Questions?