Using Artificial Intelligence to Support Peer Review of Writing
Diane Litman
Department of Computer Science, Intelligent Systems Program, & Learning Research and Development Center
Context: Speech and Language Processing for Education
- Learning language (reading, writing, speaking): tutors, scoring, readability
- Processing language
- Using language (to teach everything else): tutorial dialogue systems / peers, CSCL discourse coding, lecture retrieval, questioning & answering, peer review
Outline
- SWoRD
- Improving Review Quality
- Identifying Helpful Reviews
- Summary and Current Directions
SWoRD [Cho & Schunn, 2007]
1. Authors submit papers
2. Reviewers submit (anonymous) feedback
3. Authors revise and resubmit papers
4. Authors provide back-ratings to reviewers regarding feedback helpfulness
Some Weaknesses
1. Feedback is often not stated in effective ways
2. Feedback and papers often do not focus on core aspects
Our Approach: Detect and Scaffold
1. Detect and direct reviewer attention to key feedback features such as solutions
2. Detect and direct reviewer and author attention to thesis statements in papers and feedback
Improving Learning from Peer Review with NLP and ITS Techniques (with Ashley, Schunn), LRDC internal grant
Feedback Features and Positive Writing Performance [Nelson & Schunn, 2008]
- Solutions
- Summarization
- Localization
- Understanding of the Problem
- Implementation
I. Detecting Key Feedback Features
Natural Language Processing (NLP) to extract attributes from text, e.g.:
- Regular expressions (e.g., "the section about")
- Domain lexicons (e.g., "federal", "American")
- Syntax (e.g., demonstrative determiners)
- Overlapping lexical windows (quotation identification)
Machine Learning (ML) to predict whether feedback contains localization and solutions, and whether papers contain a thesis statement
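As a rough illustration, rule-based cues like those above can be turned into numeric input features for a classifier. This is a minimal Python sketch, not the authors' implementation: the regular expression and both word lists are invented for the example.

```python
import re

# Illustrative cues only; the real system's patterns and lexicons differ.
REGION_RE = re.compile(r"the (section|paragraph|part) (about|on|where)")
DOMAIN_LEXICON = {"federal", "american", "constitution"}   # toy history lexicon
DEMONSTRATIVES = {"this", "that", "these", "those"}        # demonstrative determiners

def feedback_features(comment: str) -> dict:
    """Map one feedback comment to features an ML model could consume."""
    text = comment.lower()
    tokens = re.findall(r"[a-z']+", text)
    return {
        "region_phrase": bool(REGION_RE.search(text)),
        "domain_words": sum(t in DOMAIN_LEXICON for t in tokens),
        "demonstratives": sum(t in DEMONSTRATIVES for t in tokens),
    }

print(feedback_features("The section about federal power needs this claim clarified."))
```

Feature vectors like this would then be fed to a standard classifier to predict whether a comment is localized or contains a solution.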
Learned Localization Model [Xiong, Litman & Schunn, 2010]
Quantitative Model Evaluation

Feedback Feature  Classroom Corpus  N     Baseline Accuracy  Model Accuracy  Model Kappa  Human Kappa
Localization      History           875   53%                78%
                  Psychology        3111  75%                85%
Solution          History           1405  61%                79%
                  CogSci            5831  67%                85%             .65          .86
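The table reports Cohen's kappa, a chance-corrected agreement statistic, for model-human and human-human agreement. A compact sketch of the statistic on binary labels (the label sequences are illustrative, not data from the study):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    # Chance agreement: probability both raters pick the same label at random.
    expected = sum(ca[k] * cb[k] for k in ca) / (n * n)
    return (observed - expected) / (1 - expected)

model = [1, 1, 0, 1, 0, 0, 1, 0]   # e.g., predicted "localized" labels
human = [1, 1, 0, 0, 0, 0, 1, 1]   # e.g., gold-standard labels
print(cohens_kappa(model, human))
```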
II. Predicting Feedback Helpfulness Can expert helpfulness ratings be predicted from text? [Xiong & Litman, 2011a] Impact of predicting student versus expert helpfulness ratings [Xiong & Litman, 2011b]
Results: Predicting Expert Ratings (average of writing and domain experts)
Techniques used in ranking product-review helpfulness can be effectively adapted to peer reviews (R = .6):
- Structural attributes (e.g., review length, number of questions)
- Lexical statistics
- Meta-data (e.g., paper ratings)
However, the relative utility of such features varies. Peer-review-specific features improve performance (R = .7):
- Theory-motivated features (e.g., localization)
- Abstraction (e.g., lexical categories) works better for small corpora
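The R values quoted above are correlations between predicted and gold helpfulness ratings. A self-contained sketch of that evaluation metric (Pearson correlation), with invented toy ratings rather than numbers from the study:

```python
def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Toy example: model predictions vs. expert helpfulness ratings.
predicted = [3.1, 4.0, 2.2, 4.8, 3.5]
expert    = [3.0, 4.5, 2.0, 5.0, 3.0]
print(round(pearson(predicted, expert), 2))
```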
Changing the Meaning of "Helpfulness"
Helpfulness may be perceived differently by different types of people:
- Average of two experts (prior experiment)
- Writing expert
- Content expert
- Student peers
Content versus Writing Experts

Review 1 (argumentation issue): writing-expert rating = 2, content-expert rating = 5
"Your over all arguements were organized in some order but was unclear due to the lack of thesis in the paper. Inside each arguement, there was no order to the ideas presented, they went back and forth between ideas. There was good support to the arguements but yet some of it didnt not fit your arguement."

Review 2 (transition issue): writing-expert rating = 5, content-expert rating = 2
"First off, it seems that you have difficulty writing transitions between paragraphs. It seems that you end your paragraphs with the main idea of each paragraph. That being said, … (173 words omitted) As a final comment, try to continually move your paper, that is, have in your mind a logical flow with every paragraph having a purpose."
Results: Other Helpfulness Ratings
Generic features are more predictive for student ratings:
- Lexical features: transition cues, negation, suggestion words
- Meta features: paper rating
Theory-supported features are more useful for experts:
- Both experts: solution
- Writing expert: praise
- Content expert: critiques, localization
Summary
Artificial Intelligence (NLP and ML) can be used to automatically detect desirable feedback features:
- localization, solution
- at the feedback and reviewer levels
Techniques used to predict product-review helpfulness can be effectively adapted to peer review:
- Knowledge of peer reviews increases performance
- Helpfulness type influences feature utility
Current and Future Work
- Extrinsic evaluation in SWoRD: Intelligent Scaffolding for Peer Reviews of Writing (with Ashley, Godley, Schunn), IES (recommended for funding)
- Extend to reviews of argument diagrams: Teaching Writing and Argumentation with AI-Supported Diagramming and Peer Review (with Ashley, Schunn), NSF
- Teacher dashboard: Keeping Instructors Well-informed in Computer-Supported Peer Review (with Ashley, Schunn, Wang), LRDC internal grant
Thank you! Questions?
Peer versus Product Reviews
- Helpfulness is directly rated on a scale (rather than a function of binary votes)
- Peer reviews frequently refer to the related papers
- Helpfulness has a writing-specific semantics
- Classroom corpora are typically small
Generic Linguistic Features

Type                Label      Features (#)
Structural          STR        revLength, sentNum, question%, exclamationNum
Lexical             UGR, BGR   tf-idf statistics of review unigrams (# = 2992) and bigrams (# = 23209)
Syntactic           SYN        Noun%, Verb%, Adj/Adv%, 1stPVerb%, openClass%
Semantic (adapted)  TOP        counts of topic words (# = 288)
                    posW, negW counts of positive (# = 1319) and negative (# = 1752) sentiment words
Meta-data (adapted) META       paperRating, paperRatingDiff
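A sketch of how the STR row might be computed. The exact definitions are assumptions: question% and exclamations are shown here as raw counts, and sentence splitting is deliberately naive.

```python
def structural_features(review: str) -> dict:
    """Toy structural attributes of one review (assumed definitions)."""
    # Naive sentence split on ., ?, and ! for illustration only.
    marked = review.replace(".", ".|").replace("?", "?|").replace("!", "!|")
    sentences = [s for s in marked.split("|") if s.strip()]
    return {
        "revLength": len(review.split()),        # word count
        "sentNum": len(sentences),               # sentence count
        "questionNum": review.count("?"),        # count, not question%
        "exclamationNum": review.count("!"),
    }

r = "Your thesis is unclear. Could you state it earlier? Nice examples!"
print(structural_features(r))
```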
Specialized Features

Type                Label  Features (#)
Cognitive Science   cogS   praise%, summary%, criticism%, problem localization%, solution%
Lexical Categories  LEX2   counts of 10 categories of words
Localization        LOC    features developed for identifying problem localization
Lexical Categories
Extracted from:
1. Coding manuals
2. Decision trees trained with bag-of-words

Tag  Meaning        Word list
SUG  suggestion     should, must, might, could, need, needs, maybe, try, revision, want
LOC  location       page, paragraph, sentence
ERR  problem        error, mistakes, typo, problem, difficulties, conclusion
IDE  idea verb      consider, mention
LNK  transition     however, but
NEG  negative       fail, hard, difficult, bad, short, little, bit, poor, few, unclear, only, more
POS  positive       great, good, well, clearly, easily, effective, effectively, helpful, very
SUM  summarization  main, overall, also, how, job
NOT  negation       not, doesn't, don't
SOL  solution       revision, specify, correction
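A minimal sketch of turning such word lists into per-category counts for one review comment. The tag names follow the table, but the lists are abridged and the tokenization is simplified:

```python
# Abridged category word lists (see the full table for the complete lists).
CATEGORIES = {
    "SUG": {"should", "must", "might", "could", "need", "maybe", "try"},
    "LOC": {"page", "paragraph", "sentence"},
    "ERR": {"error", "mistakes", "typo", "problem"},
    "NOT": {"not", "doesn't", "don't"},
}

def category_counts(comment: str) -> dict:
    """Count occurrences of each lexical category's cue words."""
    words = [w.strip(".,!?") for w in comment.lower().split()]
    return {tag: sum(w in vocab for w in words)
            for tag, vocab in CATEGORIES.items()}

print(category_counts("You should fix the typo in paragraph two, it's not clear."))
```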
Discussion
- Effectiveness of generic features across domains
- Same best generic feature combination (STR+UGR+MET)
- But…
Results: Specialized Features
Introducing high-level features does enhance the model's performance.
[Table: Pearson correlation r and Spearman correlation r_s for the feature sets cogS; LEX2; LOC; STR+MET+UGR (baseline); STR+MET+LEX2; STR+MET+LEX2+TOP; STR+MET+LEX2+TOP+cogS; STR+MET+LEX2+TOP+cogS+LOC]
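The evaluation uses both Pearson r and the rank-based Spearman correlation r_s. A small sketch of Spearman's coefficient, using the no-ties shortcut formula and illustrative inputs (not values from the study):

```python
def ranks(xs):
    """Rank positions 1..n of each value (assumes no ties)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order):
        r[i] = rank + 1.0
    return r

def spearman(x, y):
    """Spearman rank correlation via the no-ties shortcut formula."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

print(spearman([1, 2, 3, 4, 5], [1, 3, 2, 4, 5]))
```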
Students versus Experts

Review 1 (praise): student rating = 7, expert-average rating = 2
"The author also has great logic in this paper. How can we consider the United States a great democracy when everyone is not treated equal. All of the main points were indeed supported in this piece."

Review 2 (critique): student rating = 3, expert-average rating = 5
"I thought there were some good opportunities to provide further data to strengthen your argument. For example the statement 'These methods of intimidation, and the lack of military force offered by the government to stop the KKK, led to the rescinding of African American democracy.' Maybe here include data about how … (126 words omitted)"
Sample Result: All Features
Feature selection over all features:
- Students are more influenced by meta features, demonstrative determiners, number of sentences, and negation words
- Experts are more influenced by review length and critiques
- The content expert values solutions, domain words, and problem localization
- The writing expert values praise and summary