Tetsuya Nasukawa, IBM Tokyo Research Lab

Presentation transcript:

Adding Sentence Boundaries to Conversational Speech Transcriptions using Noisily Labelled Examples
Tetsuya Nasukawa, IBM Tokyo Research Lab
Diwakar Punjani, IBM India Research Lab
Shourya Roy, IBM India Research Lab
L V Subramaniam, IBM India Research Lab
Hironori Takeuchi, IBM Tokyo Research Lab
Presented by: Shourya Roy
December 9, 2018 IBM Research

What are We Trying to do?
Automatically identify sentence boundaries in noisy transcriptions of conversational data.
Transcriptions can be manual or automatic (ASR).
The method can work without any manual supervision.
Accuracy improves with manual supervision.
It detects only periods, not commas or semicolons.

Importance: One Motivating Example from Real Life
A huge amount of telephonic conversational data is produced in domains such as CRM and BPO.
Analyzing it is important for improving customer satisfaction, agent productivity, and market reputation.
Applying NLP techniques to the transcriptions is an obvious approach.
Transcriptions are noisy and do not contain any punctuation marks.
POS taggers and syntactic parsers perform poorly in the absence of sentence boundaries.
Importance of analysis of transcriptions
Importance of sentence boundary detection for transcription analysis

Why Non-Trivial
Noise in the dataset
Spontaneous nature of conversation
Variation in style of speaking
Boundary density varies from call to call
Removing the calls with very low boundary density improves the scores by approx. 10%
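The slide does not say how boundary density is measured or what counts as "very low". As a rough sketch only, assuming density means sentence-final periods per token and using a hypothetical cut-off `min_density`, filtering the calls could look like this:

```python
def boundary_density(transcript: str) -> float:
    """Fraction of tokens that end with a period (assumed definition)."""
    tokens = transcript.split()
    if not tokens:
        return 0.0
    boundaries = sum(1 for tok in tokens if tok.endswith("."))
    return boundaries / len(tokens)


def drop_sparse_calls(calls, min_density=0.05):
    """Keep only calls whose boundary density reaches min_density.

    min_density is a hypothetical threshold; the slides give no value.
    """
    return [call for call in calls if boundary_density(call) >= min_density]


calls = [
    "yeah. how about yourself. where are you.",
    "then go to properties ok now once when you go to properties",
]
kept = drop_sparse_calls(calls)
```

Here the second call has no marked boundaries at all, so it is dropped before training.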

Existing Solutions
SBD on conversational data: not much prior work
Based on pause (silence) information

Example: Manual Transcription
64.88 67.59 A: i've i've barely been out of the country. i wouldn't {breath}
65.10 67.16 B: {lipsmack} {breath}
67.64 71.26 A: i think my most memorable trip was when i was in high school.
70.57 71.81 B: {breath} uh-huh.
71.69 74.29 A: i went to %uh ^London and ^Paris.
74.29 75.01 B: %oh that's cool.
74.82 76.80 A: and that's about as exotic as it ever got.
76.75 77.76 B: {breath} was it fun?
77.49 79.95 A: %uh other than that, i haven't been west of ^Texas
80.04 80.44 B: %hm.
81.31 83.63 B: {breath} it looks like you are a east *coaster born and raised.
84.02 86.14 A: yeah. how about yourself? where are you?
86.74 87.38 B: {breath} i'm in ^Philly
87.72 90.78 A: you're in ^Philly, i guess? i wonder if everybody here is in ^Philly? probably. {breath}
88.57 89.01 B: yeah.
90.82 94.68 B: yeah, i think so because it's a ~U ^Penn thing. they probably just did it locally. plus
94.80 96.69 B: %uh are you using an ^Omnipoint phone?
96.82 97.23 A: uh-huh
Labelled elements: Timing, Meta Info, Names of Places, Speaker

Example: Automatic Transcription
then go to properties ok now once when you go to properties up if you scroll down there that he's having internet protocol ok you have to no i'm sorry just any scroll down that you're having a net firewall so that's no we have to check if there's a check next to it ok if it's not checked you have to get a check that ok and if if you do not so if you are calling you having a check all you have to do is i can check the net firewalls so this ok and you have to go ahead and reboot the system

Summary of Proposed Technique
From (possibly imprecisely) marked sentence boundaries in conversational data, identify n-grams that are more likely to occur at sentence boundaries than inside a sentence.
Mark sentence boundaries before head n-grams and after tail n-grams in the test data.

Technique
Preprocessing of data: pause-filling words, repetitions, and unclear words are removed.
Identify frequent head and tail n-grams from the training data, i.e. n-grams that occur at the beginning and end of sentences.
Filter out n-grams that also occur a significant number of times in the middle of sentences, using a threshold on the head/tail : middle-of-sentence ratio.
Handle interruption and continuation across turns separately, using words that indicate an incomplete turn, e.g. get, and.
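The steps on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the filler list, the `min_count` and `min_ratio` thresholds, and the add-one smoothing in the ratio are all hypothetical choices, and "middle" is taken to mean any position that is neither first nor last in a sentence.

```python
from collections import Counter

FILLERS = frozenset({"uh", "um", "hm"})  # hypothetical filler list


def preprocess(tokens):
    """Drop pause-filling words and immediate word repetitions."""
    cleaned = []
    for tok in tokens:
        if tok in FILLERS:
            continue
        if cleaned and cleaned[-1] == tok:
            continue
        cleaned.append(tok)
    return cleaned


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def mine_boundary_ngrams(sentences, n=1, min_count=2, min_ratio=2.0):
    """Return (head_ngrams, tail_ngrams) mined from noisily marked sentences."""
    head, tail, middle = Counter(), Counter(), Counter()
    for sentence in sentences:
        grams = ngrams(preprocess(sentence.lower().split()), n)
        if not grams:
            continue
        head[grams[0]] += 1
        tail[grams[-1]] += 1
        for gram in grams[1:-1]:
            middle[gram] += 1

    def select(edge_counts):
        # Keep frequent edge n-grams whose edge:middle ratio is high enough.
        # The +1 avoids division by zero (an assumption, not from the slides).
        return {
            gram
            for gram, count in edge_counts.items()
            if count >= min_count and count / (middle[gram] + 1) >= min_ratio
        }

    return select(head), select(tail)


training = [
    "ok go to properties",
    "ok scroll down",
    "ok you have to reboot the system",
    "you have to check the firewall",
    "so you go to properties",
]
heads, tails = mine_boundary_ngrams(training)
```

On this toy training set, "ok" starts three sentences and never appears mid-sentence, so it survives as a head unigram; "you" starts one sentence but also occurs mid-sentence, so it is filtered out.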

Technique (Contd.)
In the test set, mark a boundary before every head n-gram and after every tail n-gram.
Where ASR data already has boundaries marked from silence information, add the new sentence boundaries to them.
If a turn does not end with a word from the set of words indicating an incomplete turn, mark a boundary at the end of the turn.
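A minimal sketch of this marking step, restricted to unigrams for brevity. The function name `punctuate_turn`, the `silence_after` parameter (token indices after which a silence-based boundary already exists), and the example head and tail sets are assumptions made for illustration; only the incomplete-turn words "get" and "and" come from the slides.

```python
def punctuate_turn(turn, heads, tails,
                   incomplete=frozenset({"get", "and"}),
                   silence_after=frozenset()):
    """Insert " ." boundaries into one speaker turn (unigram version)."""
    tokens = turn.lower().split()
    out = []
    last = len(tokens) - 1
    for i, tok in enumerate(tokens):
        # Boundary before a head word, unless one was just placed.
        if tok in heads and out and out[-1] != ".":
            out.append(".")
        out.append(tok)
        # Boundary after a tail word or an existing silence-based boundary.
        if i < last and (tok in tails or i in silence_after):
            out.append(".")
    # Close the turn unless its last word signals an incomplete turn.
    if tokens and tokens[-1] not in incomplete:
        out.append(".")
    return " ".join(out)


print(punctuate_turn("then go to properties ok now scroll down and",
                     heads={"ok"}, tails={"properties"}))
# prints "then go to properties . ok now scroll down and"
```

The turn above ends in "and", so no final boundary is added and the sentence is left open for continuation in the speaker's next turn.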

Nature of Data
Manual Transcriptions
Switchboard corpus and the Call-home corpus of transcribed phone conversations from LDC
Automatic Transcriptions
Manually put punctuations
Automatically put punctuations based on silence
ASR transcribed calls from IBM helpdesk
Data Statistics

Results
Result of punctuation insertion for helpdesk data
Method: Precision, Recall, F1, Word Error Rate (WER)
Only Silence: 0.54, 0.28, 0.37, 0.96
Only Head/Tail: 0.78, 0.55, 0.65, 0.60
Head/Tail + Silence: 0.66, 0.72, 0.68
Head/Tail + Silence – FalseBoundaries: 0.69, 0.70, 0.58
Trend labels on the slide: Increasing, Decreasing
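For reference, precision, recall, and F1 for boundary detection can be computed from predicted and gold boundary positions as sketched below. The representation of boundaries as sets of token indices is an assumption for illustration; the slides do not describe the evaluation script.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(predicted: set, gold: set):
    """Return (precision, recall, F1) for sets of boundary positions."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f1_score(precision, recall)


# Hypothetical boundary positions (token indices), for illustration only.
p, r, f = evaluate(predicted={3, 7, 12}, gold={3, 12, 20, 25})
```

The same formula reproduces the reported F1 values for the first two rows: precision 0.54 with recall 0.28 gives about 0.37, and 0.78 with 0.55 gives about 0.65.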

Improvement in PoS Tagging
PoS Tagging Accuracy on Helpdesk Data
An example of PoS tagging improving with sentence boundary detection
Ideally ‘i’ should be a pronoun, and ‘yeah’ and ‘oh’ should be interjections

Improvement in PoS Tagging (Contd.)
Extracted top 10 Noun Phrases from Switchboard Data Set

Summary
Sentence boundary detection is a fundamental operation that must be performed before state-of-the-art NLP techniques can be applied to (automatic) transcriptions of conversations.
We proposed a technique to train a sentence boundary detector with minimal manual supervision.
It would be interesting to see how much this improves an actual extraction task.

Questions?