Predicting Text Quality for Scientific Articles
AAAI/SIGART-11 Doctoral Consortium
Annie Louis

References
Louis, A. and Nenkova, A. (2009). Automatically evaluating content selection in summarization without human models. In Proceedings of EMNLP.
Pitler, E., Louis, A. and Nenkova, A. (2010). Automatic evaluation of linguistic quality in multi-document summarization. In Proceedings of ACL.
Louis, A. and Nenkova, A. (2011). Automatic identification of general and specific sentences by leveraging discourse annotations. In Proceedings of IJCNLP.

Summary
Text quality is a property of both the content of an article and its well-written nature. Measures of text quality are useful in a variety of settings, for example to further filter relevant results during search.

Problem definition
My thesis differentiates generic and domain-specific factors of text quality: the goal is to define and quantify text quality factors that are general and those that are specific to articles about science, and to test their use in applications. I aim to develop domain-specific text quality measures for the genre of writing about science, and I explore summary evaluation, authoring support and article recommendation as applications for these measures.

Thesis contributions
1. Factors that matter for most texts: generic methods to score content and linguistic quality, developed in the context of automatic summaries.
2. Domain-specific factors: factors for the genre of writing about science, which spans two sub-genres:
   1. Conference and journal publications: written by academics for a sophisticated audience.
   2. Science journalism: written by journalists for a lay audience.
3. Applications:
   1. Automatic summary evaluation
   2. Authoring support for academic writing
   3. Article recommendation for science news

Generic text quality factors for summary evaluation
Some factors matter for all texts: correct syntax, easy vocabulary, good content and good organization. I developed two systems that score the content and linguistic quality of summaries using such generic factors. Both perform well: accuracy above 80% at ranking systems, and correlations of 0.88 with human rankings.

Content quality (Louis & Nenkova, 2009): input-summary similarity is used as a measure of summary quality. Similarity is measured with Jensen-Shannon (JS) divergence, the difference between two probability distributions, one estimated from the input (I) and one from the summary (S). A sketch of this computation appears at the end of this section.

Linguistic quality (Pitler, Louis & Nenkova, 2010): several aspects are scored, as listed below; a sketch of the word-familiarity aspect also follows this section.
- Word familiarity: language models (prior word probabilities computed from a large corpus)
- Syntactic complexity: parse tree depth, length of phrases
- Sentence flow: word similarity and compatibility between adjacent sentences
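To make the content-quality measure concrete, here is a minimal sketch of JS divergence over unigram word distributions. The whitespace tokenization and the final 1 - divergence scoring are simplifications of mine; only the JS divergence itself is the measure used in Louis & Nenkova (2009).

```python
import math
from collections import Counter

def distribution(tokens):
    """Unigram probability distribution over a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions:
    JS(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2.
    Finite even when a word occurs in only one distribution, because
    M is positive wherever P or Q is."""
    js = 0.0
    for word in set(p) | set(q):
        pw, qw = p.get(word, 0.0), q.get(word, 0.0)
        mw = 0.5 * (pw + qw)
        if pw > 0.0:
            js += 0.5 * pw * math.log2(pw / mw)
        if qw > 0.0:
            js += 0.5 * qw * math.log2(qw / mw)
    return js

# Lower divergence between input and summary suggests better content.
input_dist = distribution("fireflies use flash patterns to signal mates".split())
summary_dist = distribution("fireflies signal mates with flash patterns".split())
content_score = 1.0 - js_divergence(input_dist, summary_dist)
```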
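Similarly, a minimal sketch of the word-familiarity aspect of linguistic quality: score a text by the average log-probability of its words under a unigram model of a large background corpus. The unigram choice and Laplace smoothing are my simplifications of the language-model idea, and familiarity is an illustrative name, not the published system's.

```python
import math
from collections import Counter

def familiarity(text_tokens, background_tokens):
    """Average log-probability of the text's words under a unigram
    model of a large background corpus; higher means more familiar
    vocabulary. Laplace smoothing keeps unseen words finite."""
    counts = Counter(background_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words
    return sum(
        math.log((counts[w] + 1) / (total + vocab)) for w in text_tokens
    ) / len(text_tokens)

# Toy background corpus; in practice this would be a large collection.
bg = ("the results show that the proposed method improves performance " * 50).split()
easy = familiarity("the method improves results".split(), bg)
hard = familiarity("the stochastic variational posterior".split(), bg)
# easy > hard: corpus-frequent words get a higher average log-probability.
```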
Factors related to the quality of articles about science

Data collection
No corpus of text quality ratings exists for the science genre, in contrast to summarization, where a huge number of summaries with previously created human judgements is available. I therefore collect judgements for both sub-genres:

1. Academic writing: citation counts, complemented by annotations, which are more reflective of text quality than citations and not highly correlated with them. Annotation is focused using questions for each section; for the introduction, for example: Why is the problem important? Has the problem been addressed before? Is the proposed solution motivated? Only the abstract, introduction, related work and conclusion are annotated.

2. Science journalism: three quality classes. Well-written: New York Times (NYT) articles republished in the Best American Science Writing books. Average: articles from the NYT written by the authors of the well-written articles. Negative: other NYT articles published during the same period on similar topics.

Factors for academic writing

Factor 1: Prevalence of subjective sentences.
Hypothesis: opinions make an article interesting.
Task to solve: automatically identify subjective expressions. I annotate a corpus of subjective expressions, learn a positive/negative word dictionary using unsupervised methods, and develop a subjectivity classifier that combines the dictionary with context clues.

Factor 2: Distribution of rhetorical zones (aim, motivation, example, prior work, comparison).
Hypothesis: the placement of sentences with different functions varies between well-written and poorly written articles; good articles show characteristic zone sequences.
Task to solve: predict zones automatically. This has already been explored in work by others, and good accuracies are obtainable (the motivation in prior work is information extraction). I explore features related to the sizes of zones and the transitions between them.

Factors for science journalism

Factor 1: Presence of visual words.
Hypothesis: visual descriptions make an article easy to read; compare image-invoking words such as "lake, cloud, tree" with an abstract sentence like "Conventional methods to solve this problem are difficult and time-consuming."
Task to solve: automatically identify image-invoking words in the article. I use a large corpus of images and the tags people have given them to identify visual words, and compute features related to the density of visual words, their position in the article, and the variety in their use.

Factor 2: Surprise-invoking sentences, e.g., "Sara Lewis is fluent in firefly."
Hypothesis: surprise elements make the article exciting.
Task to solve: predict lexical, syntactic and topic correlates of surprise: likelihood under a language model, parse probability, verb-argument compatibility, order of verbs, and rare topics in news.

Applications

Authoring support for academic writing: highlighting-based feedback, driven by zone sizes, transitions between zones and subjectivity levels.

Article recommendation for science news: choose topics in science and compare two rankings. Ranking 1 orders articles by keyword relevance alone; Ranking 2 also incorporates visual and surprisal scores. A user study notes which ranking readers prefer. Sketches of the visual-word features and of one way to combine the scores follow below.
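First, a sketch of the visual-word features (density, position, variety). The word list here is a toy stand-in; in the proposal, visual words are induced from a large corpus of images and their human-assigned tags.

```python
def visual_word_features(article_tokens, visual_words):
    """Density, average position and variety of image-invoking words.
    Position is normalized to [0, 1]; where visual description occurs
    in the article is kept as its own feature."""
    n = len(article_tokens)
    hits = [i for i, w in enumerate(article_tokens) if w in visual_words]
    return {
        "density": len(hits) / n,
        "avg_position": (sum(hits) / len(hits)) / n if hits else 1.0,
        "variety": len({article_tokens[i] for i in hits}),
    }

# Toy stand-in; the real list would be induced from image tags.
VISUAL_WORDS = {"lake", "cloud", "tree", "firefly", "flash"}
features = visual_word_features(
    "a firefly rose over the lake beneath a low cloud".split(), VISUAL_WORDS)
```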
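Second, one way Ranking 2 could combine the scores. The proposal does not fix a combination method, so the weighted sum and the weights below are purely illustrative, and the per-article relevance, visual and surprisal scores are assumed to be pre-computed and normalized to [0, 1].

```python
def rerank(articles, alpha=0.6, beta=0.25, gamma=0.15):
    """Ranking 2: weighted sum of keyword relevance, visual score and
    surprisal score. Weights are illustrative, not tuned."""
    def combined(a):
        return (alpha * a["relevance"]
                + beta * a["visual"]
                + gamma * a["surprisal"])
    return sorted(articles, key=combined, reverse=True)

articles = [
    {"id": 1, "relevance": 0.9, "visual": 0.2, "surprisal": 0.1},
    {"id": 2, "relevance": 0.7, "visual": 0.8, "surprisal": 0.9},
]
ranking2 = rerank(articles)
# Article 2 overtakes the more keyword-relevant article 1
# (0.755 vs. 0.605), which is the effect the user study probes.
```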