Evaluating NLP Features for Automatic Prediction of Language Impairment Using Child Speech Transcripts
Khairun-nisa Hassanali 1, Yang Liu 1 and Thamar Solorio 2
1 The University of Texas at Dallas   2 University of Alabama at Birmingham
INTERSPEECH

Presentation transcript:

1. Summary
- Evaluate various features for automatic prediction of LI from child language transcripts
- General word and text features and syntactic features perform better for spontaneous narratives
- Referential and semantic features and entity grid features perform better for story telling narratives
- Narrative structure and quality features led to an increase of 8.7% over the baseline for story telling narratives

2. The Larger Problem
- Detecting language impairment (LI) in children
- Traditional methods of detecting LI rely on cutoff scores on norm-referenced tests, are time consuming, and may result in over- and under-identification of LI
- Automatic detection of LI is faster and allows exploring features beyond norm-referenced tests
- Given a child language transcript, answer the following questions: Does the transcript belong to a typically developing (TD) child or a child with LI? What features are useful in detecting LI?

3. Data
- Transcripts of adolescents aged 14 years, for two tasks: a story telling task and a spontaneous personal narrative
- 118 speakers (99 TD children, 18 LI children); 118 transcripts for each task
- Story telling transcripts were annotated for coherence and narrative structure features

4. Features (feature category: example features)
  1. Gabani et al. (Gb) (baseline): language model probabilities, morphosyntactic features, language productivity features
  2. Readability (RM): Flesch Reading Ease score
  3. Situational Model (SM): repetition score for tense and aspect, number of causal verbs in the text, ratio of causal verbs to causal particles
  4. General Word and Text (GWT): number of utterances, mean frequency of content words, mean concreteness of content words, mean hypernym value of nouns in the text
  5. Syntactic (Syn): number of noun phrases, syntactic similarity between utterances, number of personal pronouns per 1,000 words, ratio of pronouns to noun phrases
  6. Referential and Semantic (RS): number of anaphora references, proportion of adjacent utterances that share a content word
  7. Entity Grid (EG): fractions of each entity transition type (a simplified sketch follows this list)
  8. Narrative Structure and Quality (NSQ): coherence, number of cognitive references, number of social engagement devices, instantiation of the story, resolution of the story, presence of search episodes in the story, maintenance of the search theme, number of affective devices and intensifiers
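To make the entity grid (EG) features concrete, here is a minimal, hypothetical Python sketch. It assumes the grid itself (each entity's grammatical role in each utterance) has already been extracted; in the experiments that step is handled by the Brown coherence toolkit, so the function, role labels, and toy transcript below are illustrative assumptions, not the toolkit's actual interface.

```python
from collections import Counter
from itertools import product

# Simplified entity-grid transition features (Barzilay & Lapata style).
# Roles: 'S' = subject, 'O' = object, 'X' = other mention, '-' = not mentioned.

def entity_grid_features(grid):
    """grid: dict mapping entity -> list of roles, one entry per utterance.
    Returns the fraction of each length-2 role transition across all entities."""
    counts = Counter()
    total = 0
    for roles in grid.values():
        for prev, curr in zip(roles, roles[1:]):
            counts[(prev, curr)] += 1
            total += 1
    labels = "SOX-"
    return {a + b: (counts[(a, b)] / total if total else 0.0)
            for a, b in product(labels, repeat=2)}

# Toy grid for four utterances of a hypothetical story telling transcript.
toy_grid = {
    "boy":  ["S", "S", "S", "-"],
    "frog": ["S", "O", "-", "S"],
    "jar":  ["-", "X", "O", "-"],
}
print(entity_grid_features(toy_grid)["SS"])  # fraction of subject-to-subject transitions
```

The resulting transition fractions give one feature vector per transcript, which the classifier consumes alongside the other feature categories.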
5. Experiment Setup
- Prediction of LI is treated as a binary classification task
- Features from prior work by Gabani et al. serve as the baseline
- Feature categories 2-6 are computed with the Coh-Metrix tool
- Entity grid model features are computed with the Brown coherence toolkit
- Narrative structure and quality features are based on manual annotation of the story telling narratives
- Leave-one-out cross validation (a sketch of this evaluation loop appears at the end of the transcript)
- Classification experiments are run in WEKA

6. Results
- The Bayesian network classifier performed the best
- Feature selection was performed; instantiation of the story, number of cognitive references, and number of social engagement devices were the top-scoring NSQ features
- Feature sets compared (each scored by precision, recall, and F-1):
  Spontaneous narrative: Gb (baseline); GWT; Entity grid; RS + GWT; Gb + All; Gb + All – RS
  Story telling: Gb (baseline); Narrative; GWT; Entity grid; RS + GWT; Gb + All; Gb + All – RS; Gb + Narrative

7. Conclusion
- Deeper NLP features are explored for automatic prediction of language impairment (LI)
- Narrative structure and narrative quality features are also explored for automatic prediction of LI on story telling tasks
- Narrative structure and quality features, along with a combination of the other features, are helpful in predicting LI on story telling narratives
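The evaluation described in sections 5 and 6 (binary LI vs. TD classification with leave-one-out cross validation, scored by precision, recall, and F-1) can be sketched as follows. The actual experiments use WEKA and a Bayesian network classifier; scikit-learn's GaussianNB is used here purely as a stand-in, and the feature matrix X and labels y are random placeholders rather than real transcript features.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import precision_recall_fscore_support

# Placeholder data: 118 transcripts, 20 features each, labels 1 = LI, 0 = TD.
rng = np.random.default_rng(0)
X = rng.normal(size=(118, 20))
y = rng.integers(0, 2, size=118)

# Leave-one-out cross validation: train on all transcripts but one, predict the held-out one.
predictions = np.empty_like(y)
for train_idx, test_idx in LeaveOneOut().split(X):
    clf = GaussianNB().fit(X[train_idx], y[train_idx])
    predictions[test_idx] = clf.predict(X[test_idx])

# Precision, recall, and F-1 for the LI class, as in the results reported above.
p, r, f1, _ = precision_recall_fscore_support(y, predictions,
                                              average="binary", pos_label=1)
print(f"Precision {p:.3f}  Recall {r:.3f}  F-1 {f1:.3f}")
```

Swapping different columns of X in and out of training would mirror the feature-set comparisons listed in the results (Gb, GWT, Entity grid, RS + GWT, and so on).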