Predicting Text Quality for Scientific Articles. AAAI/SIGART-11 Doctoral Consortium. Annie Louis.

Predicting Text Quality for Scientific Articles Annie Louis University of Pennsylvania Advisor: Ani Nenkova.
Presentation transcript:

Problem definition
Define and quantify text quality factors that are general and those that are domain-specific to articles about science, and test their use in applications.

Text quality
"Property of both content in the article and well-written nature"
- Useful to measure text quality in a variety of settings
- To further filter relevant results during search

Thesis contributions
1. Factors that matter for most texts: generic methods to score content and linguistic quality in the context of automatic summaries
2. Domain-specific factors: for the genre of writing about science
   1. Conference and journal publications: written by academics, for a sophisticated audience
   2. Science journalism: written by journalists, for a lay audience
3. Applications
   1. Automatic summary evaluation
   2. Authoring support for academic writing
   3. Article recommendation for science news

Generic text quality factors for summary evaluation
- Some factors matter for all texts: syntax, easy vocabulary, good content and organization
- I developed two systems for scoring the content and linguistic quality of summaries using generic factors
- Highly accurate: more than 80% for ranking systems
- High correlations: 0.88 with human rankings

Content quality: Louis & Nenkova (2009)
- Input-summary similarity = measure of summary quality
- How to measure similarity? Jensen-Shannon divergence
- JS divergence: difference between two probability distributions, input (I) and summary (S)

Linguistic quality: Pitler, Louis & Nenkova (2010)
Different aspects:
- Word familiarity: language models (prior idea of word probabilities computed from a large corpus)
- Syntactic complexity: parse tree depth, length of phrases
- Sentence flow: word similarity and compatibility between adjacent sentences

Factors related to quality of articles about science

Data collection
- No existing corpus of text quality ratings for the science genre
- In contrast to summarization, where a huge number of summaries with previously created human judgements are available

1. Academic writing
- Citations
- Annotations: more reflective of text quality than citations and not highly correlated with citations
- Focus using questions for each section
  - Introduction: Why is the problem important? Has the problem been addressed before? Is the proposed solution motivated?
- Only abstract, introduction, related work, conclusion

2. Science journalism
- Well-written: New York Times (NYT) articles published in Best American Science Writing books
- Average: articles from NYT written by authors of the well-written articles
- Negative: other articles from NYT published during the same period on similar topics

Academic writing

Factor 1: Prevalence of subjective sentences
- Hypothesis: Opinions make an article interesting
- Example: "Conventional methods to solve this problem are difficult and time-consuming"
- Task to solve: Automatically identify subjective expressions
  - Annotate a corpus of subjective expressions
  - Learn positive/negative word dictionary using unsupervised methods
  - Develop a subjectivity classifier using dictionary and context clues

Factor 2: Distribution of rhetorical zones
- Hypothesis: Placement of sentences with different functions varies between good and poorly written articles
- Task to solve: Predict zones automatically
  - Already explored in work by others and good accuracies are obtainable (motivation in prior work is information extraction)
  - Explore features related to sizes and transitions between zones
- [Figure: sequences of zones in good articles; zone labels: aim, motivation, example, prior work, comparison]

Science journalism

Factor 1: Presence of visual words
- Hypothesis: Visual descriptions make an article easy to read
- Task to solve: Automatically identify image-invoking words in the article
  - I use a large corpus of images and associated tags given by people to derive visual words (e.g. lake, cloud, tree)
  - Features related to density of visual words, position in article and variety in use

Factor 2: Surprise-invoking sentences
- Hypothesis: Surprise elements make the article exciting
- Example: "Sara Lewis is fluent in firefly"
- Task to solve: Predict lexical, syntactic, topic correlates of surprise
  - Likelihood under language model
  - Parse probability
  - Verb-argument compatibility
  - Order of verbs
  - Rare topics in news

Applications

Authoring support for academic writing
- Highlighting-based feedback
- Based on zone sizes, transitions between zones and subjectivity levels

Article recommendation for science news
- Choose topics in science
- Ranking 1: Articles in order of keyword relevance
- Ranking 2: Including visual and surprisal scores
- User study to note preference between the two rankings

Summary
- My thesis differentiates generic and domain-specific factors of text quality
- I aim to develop domain-specific text quality measures for the genre of writing about science
- I explore summary evaluation, authoring support and article recommendation as applications for these measures of text quality

References
- Louis A. and Nenkova A. 2009. Automatically evaluating content selection in summarization without human models. In Proceedings of EMNLP.
- Pitler E., Louis A. and Nenkova A. 2010. Automatic evaluation of linguistic quality in multi-document summarization. In Proceedings of ACL.
- Louis A. and Nenkova A. 2011. Automatic identification of general and specific sentences by leveraging discourse annotations. In Proceedings of IJCNLP.