Readability Assessment for Text Simplification
Sandra Aluísio¹, Lucia Specia², Caroline Gasperin¹, Carolina Scarton¹
¹ University of São Paulo, Brazil  ² University of Wolverhampton, UK


Readability Assessment for Text Simplification
Sandra Aluísio¹, Lucia Specia², Caroline Gasperin¹, Carolina Scarton¹
¹ University of São Paulo, Brazil  ² University of Wolverhampton, UK
The 5th Workshop on Innovative Use of NLP for Building Educational Applications

Motivation
Develop technology to benefit low-literacy readers.
INAF literacy levels (68%):
- Rudimentary: studied up to 4 years; can find explicit information in short and familiar texts
- Basic: studied between 4 and 8 years; can read and understand texts of average length, and find information even when some inference is necessary

Readability Assessment
- To assess the readability level of a text
  - Three levels of readability: the INAF levels (Rudimentary, Basic, Advanced)
- To supplement our text simplification technology
  - Two levels of simplification, by degree of application of simplification operations:
    - STRONG: operations are applied to all complex syntactic phenomena present → RUDIMENTARY
    - NATURAL: operations are applied selectively, only when the resulting text remains "natural" → BASIC

Text Simplification Scenario: SIMPLIFICA
An authoring tool for creating simplified texts:
1. The author inputs a text.
2. The author receives suggestions of possible simplifications (lexical substitutions, syntactic simplification) and may accept them or not.
3. The author does not know whether the text is simple enough for the target audience → feedback via readability assessment.

Readability Assessment System
Machine learning:
- Classes = the 3 INAF levels
- Trained on a corpus of manually simplified texts: original texts plus their natural and strong simplifications
- Extensive set of features:
  - Cognitively motivated: Coh-Metrix [Graesser et al., 2004]
  - Syntactic: occurrence of complex phenomena
  - Language model: up to trigrams
- 3 paradigms: classification, ordinal classification, regression [Heilman et al., 2007]
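The setup above can be sketched as a three-class learning problem over per-text feature vectors. The paper uses Weka; this is a minimal Python stand-in with scikit-learn, and the data is synthetic (the real corpus and its 59 features are not reproduced here).

```python
# Sketch of the three-class readability classifier. The authors used Weka's SVM;
# scikit-learn's SVC is a stand-in. Feature vectors and labels are synthetic.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
N_FEATURES = 59                      # size of the paper's feature set
LEVELS = {0: "rudimentary", 1: "basic", 2: "advanced"}

# Synthetic stand-in for the ZH/CC corpus: one 59-dim vector per text version.
X = rng.normal(size=(90, N_FEATURES))
y = np.repeat([0, 1, 2], 30)         # strong, natural, original versions
X += y[:, None] * 0.5                # make the synthetic classes separable

clf = SVC(kernel="linear")
mean_acc = cross_val_score(clf, X, y, cv=5).mean()
print("mean CV accuracy:", round(mean_acc, 3))
```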

Corpora
Training and testing corpora:
- General news: Zero Hora (ZH) newspaper
- Popular science news: Caderno Ciência (CC)
- 3 versions of each text: original, natural, strong

Features
1. Number of words
2. Number of sentences
3. Number of paragraphs
4. Number of verbs
5. Number of nouns
6. Number of adjectives
7. Number of adverbs
8. Number of pronouns
9. Average number of words per sentence
10. Average number of sentences per paragraph
11. Average number of syllables per word
12. Flesch index for Portuguese
13. Incidence of content words
14. Incidence of functional words
15. Raw frequency of content words
16. Minimal frequency of content words
17. Average number of verb hypernyms
18. Incidence of NPs
19. Number of NP modifiers
20. Number of words before the main verb
21. Number of high-level constituents
22. Number of personal pronouns
23. Type-token ratio
24. Pronoun-NP ratio
25. Number of "e" (and)
26. Number of "ou" (or)
27. Number of "se" (if)
28. Number of negations
29. Number of logic operators
30. Number of connectives
31. Number of positive additive connectives
32. Number of negative additive connectives
33. Number of positive temporal connectives
34. Number of negative temporal connectives
35. Number of positive causal connectives
36. Number of negative causal connectives
37. Number of positive logic connectives
38. Number of negative logic connectives
39. Verb ambiguity ratio
40. Noun ambiguity ratio
41. Adverb ambiguity ratio
42. Adjective ambiguity ratio
43. Incidence of clauses
44. Incidence of adverbial phrases
45. Incidence of apposition
46. Incidence of passive voice
47. Incidence of relative clauses
48. Incidence of coordination
49. Incidence of subordination
50. Out-of-vocabulary words
51. LM probability of unigrams
52. LM perplexity of unigrams
53. LM perplexity of unigrams, no line break
54. LM probability of bigrams
55. LM perplexity of bigrams
56. LM perplexity of bigrams, no line break
57. LM probability of trigrams
58. LM perplexity of trigrams
59. LM perplexity of trigrams, no line break
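A few of the surface features in this table (numbers of words/sentences, words per sentence, syllables per word, the Flesch index) can be sketched in a few lines. The syllable count below is a rough vowel-group approximation, and the Portuguese Flesch adaptation used (the English formula shifted by +42, as commonly attributed to Martins et al., 1996) is an assumption, not taken from the slides.

```python
# Minimal sketch of a few surface features from the table (features 1, 9, 11, 12).
# Syllables are approximated by vowel groups; the Portuguese Flesch constants
# (248.835 - 1.015*ASL - 84.6*ASW) are an assumed adaptation of the English formula.
import re

def surface_features(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    syllables = sum(len(re.findall(r"[aeiouáéíóúâêôãõà]+", w.lower())) or 1
                    for w in words)
    asl = len(words) / max(len(sentences), 1)   # avg words per sentence
    asw = syllables / max(len(words), 1)        # avg syllables per word
    flesch_pt = 248.835 - 1.015 * asl - 84.6 * asw
    return {"n_words": len(words), "words_per_sentence": asl,
            "syllables_per_word": asw, "flesch_pt": flesch_pt}

print(surface_features("O menino leu o livro. Ele gostou muito da história."))
```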

Feature Analysis
Pearson correlation between individual features and literacy levels. The ten most correlated features, in order:
1. Words per sentence
2. Incidence of apposition
3. Incidence of clauses
4. Flesch index
5. Words before main verb
6. Sentences per paragraph
7. Incidence of relative clauses
8. Syllables per word
9. Number of positive additive connectives
10. Number of negative causal connectives (correlation 0.388)

Predicting Readability Levels
- Classification: Weka SVM
- Ordinal classification: Weka pairwise SVM

Predicting Readability Levels (cont.)
- Regression: Weka SVM-reg, RBF kernel
Findings:
- Best correlation: regression
- Lowest MAE: ordinal classification
- Combining all features consistently yields better results: more robust
- Syntactic features achieve the best correlation scores
- Language model features performed the poorest
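One common way to realize ordinal classification with binary learners is the cumulative scheme of Frank & Hall (2001), which is one plausible reading of the "pairwise SVM" in Weka used above: binary model k predicts P(level > k), and the three level probabilities are decoded from those. A sketch on synthetic ordered data (logistic regression stands in for the SVMs):

```python
# Ordinal classification over 3 ordered levels via cumulative binary models
# (Frank & Hall, 2001). Data is synthetic; logistic regression stands in
# for the SVMs used in the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 10))
y = np.repeat([0, 1, 2], 50)         # 0=rudimentary, 1=basic, 2=advanced
X += y[:, None] * 0.8                # make the ordered classes separable

# Model k learns P(y > k) for k = 0, 1.
models = [LogisticRegression(max_iter=1000).fit(X, (y > k).astype(int))
          for k in (0, 1)]

def predict_ordinal(Xnew):
    p_gt = np.column_stack([m.predict_proba(Xnew)[:, 1] for m in models])
    probs = np.column_stack([1 - p_gt[:, 0],            # P(y = 0)
                             p_gt[:, 0] - p_gt[:, 1],   # P(y = 1)
                             p_gt[:, 1]])               # P(y = 2)
    return probs.argmax(axis=1)

acc = (predict_ordinal(X) == y).mean()
print("train accuracy:", round(acc, 2))
```

The decoding step is what makes the model ordinal: errors between adjacent levels are structurally more likely than jumps from rudimentary to advanced, which is consistent with ordinal classification achieving the lowest MAE above.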

Conclusions
- It is possible to predict the readability level of texts according to our three classes of interest with satisfactory performance.
- Ordinal classification seems to be the most appropriate model to use: high correlation, lowest error rate (MAE).
- The combination of all features is best.

SIMPLIFICA Tool
Integration of the classification model: the simplest model, with the highest F-measure and comparable correlation scores.

Future Work
- Add deeper cognitive features, e.g. semantic, coreference, and latent-semantics metrics
- User evaluation with authors

Thanks!