Natural Language Processing for Writing Research: From Peer Review to Automated Assessment Diane Litman Senior Scientist, Learning Research & Development.

Slides:



Advertisements
Similar presentations
Rules of Peer Editing It must be ABSOLUTELY SILENT! No talking, whispering, etc. Little distractions are annoyances while you are trying to give great.
Advertisements

Academic Quality How do you measure up? Rubrics. Levels Basic Effective Exemplary.
Literacy Unit Standards AN ALTERNATIVE PATHWAY TO ACHIEVING LEVEL 1 LITERACY.
Using blogs to heighten the learning experience about the Roman Empire By Robert Tuma.
Family Literacy Night. Why revise? Align with Common Core State Standards Reflect the current Reading and Writing Workshop philosophy as well as the Mathematics.
Writing Good Software Engineering Research Papers A Paper by Mary Shaw In Proceedings of the 25th International Conference on Software Engineering (ICSE),
How to Write the Five Paragraph Essay
KEEP CALM AND TRY AGAIN The Evolution of a Library Research Assignment 2013 Missouri Library Association Annual Conference.
PERFORMANCE TASKS. INTRODUCTION & PURPOSE Students create products or perform tasks to show their mastery of a particular skill Students select a response.
Dr. MaLinda Hill Advanced English C1-A Designing Essays, Research Papers, Business Reports and Reflective Statements.
The Use of Student Work as a Context for Promoting Student Understanding and Reasoning Yvonne Grant Portland MI Public Schools Michigan State University.
Writing a Persuasive Essay
Using Computational Linguistics to Support Students and Teachers during Peer Review of Writing Diane Litman Professor, Computer Science Department Senior.
Writing a Persuasive Essay
GOOD MORNING! “Editing is the same as quarrelling with writers - same thing exactly. “ ~Harold Ross 14 Oct Please reclaim your English notebooks,
CTE Literacy Support Session 3 Using Templates for Scaffolding Student Writing Feedback and evaluation of student writing.
Improving Learning from Peer Review with NLP and ITS Techniques (July 2009 – June 2011) Kevin Ashley Diane Litman Chris Schunn.
What Makes an Essay an Essay. Essay is defined as a short piece of composition written from a writer’s point of view that is most commonly linked to an.
BEA 2015 University of Pittsburgh
Understand About Essays What exactly is an essay? Why do we write them? What is the basic essay structure?
Keys to success on the Gateway: A checklist  Demonstrate that you understand the writing task  Address and develop all parts of the writing task  Organize.
{ The writing process Welcome. In the prewriting stage the follow must be considered:   factual information pertaining to topic   clear definition.
DBQs What do I do?. Understand the Question Read the historical context carefully to understand what it’s all about. Read the DBQ question. In almost.
Freshmen Career Fair. » Consider the type of paper you’re writing: analytical ˃An analytical paper breaks down an issue or an idea into its component.
EDITING DAY #2! “The road to hell is paved with works-in-progress.” ~Philip Roth 3/11/14 Please take out your (updated?) rough draft with word count and.
Peer review systems, e.g. SWoRD [1], need intelligence for detecting and responding to problems with students’ reviewing performance E.g. problem localization.
Making and Supporting a Claim in Common Core, AP Language, and the Writing Assessment.
The Four P’s of an Effective Writing Tool: Personalized Practice with Proven Progress April 30, 2014.
Highfields School Thursday 8 October Welcome, thank you for coming Our Core Purpose To be an inclusive, happy community that values every individual.
Close Reading of Complex Texts in the 3-8 Modules
Using Artificial Intelligence to Support Peer Review of Writing Diane Litman Department of Computer Science, Intelligent Systems Program, & Learning Research.
Interactive Read-aloud. Reading is about mind journeys and teaching reading is about outfitting the traveler: modeling how to use the map, demonstrating.
Constructed Response Developing this writing practice as part of ongoing classroom assessment The value of constructed response is that it is teaching.
CTAP 295: PORTFOLIO PRESENTATION Junior Honors Literature: The Scarlet Letter A.
Using APES to Write Short Answer Responses
Reading Text in the Social Studies Classroom District Learning Day Bolton High School August 5, 2015.
Natural Language Processing for Enhancing Teaching and Learning at Scale: Three Case Studies Diane Litman Professor, Computer Science Department Co-Director,
Essay Prompt WHAT is a major theme developed in your novel, and HOW is that theme developed throughout the piece of writing? (in discussing the HOW, you.
Guided Reading How can we make this really effective for our students?
Written Assignment NOTES AND TIPS FOR STUDENTS.  MarksLevel descriptor 0The work does not reach a standard described by the descriptors below. 1–2The.
Decompressing Teachers’ Mathematical Knowledge: The Case of Division Presented by: DeAnn Huinker Melissa Hedges Kevin McLeod Jennifer Bay-Williams Association.
Writing Exercise Try to write a short humor piece. It can be fictional or non-fictional. Essay by David Sedaris.
Portfolios A number of years ago the portfolio became part of the requirements to attain the two highest levels of graduation status. Though one.
National 5 AVU Learning Intentions: To gain knowledge on how to present information, form a conclusion and make a research sheet.
The College Board (best known for the SAT) has these eight tips for writing a solid college essay: t-in/essays/8-tips-for-crafting-your-
INTERNAL ASSESSMENT ADVICE Or…how to get a 7 on your Internal Assessment.
College of Education Meeting with the Professor: The Field Assignment Project (Case Study)
Test Writing as Genre: How to Apply What the Students Already Know Presented by: Tara Falasco and Kathleen Masone.
Warm-up 8/23 No time for warm-up today! Keep your warm- up sheet for next week.
Rubrics: Using Performance Criteria to Evaluate Student Learning PERFORMANCE RATING PERFORMANCE CRITERIABeginning 1 Developing 2 Accomplished 3 Content.
Writing To Be Awesome. First things first… Our focus: expository. What is expository writing? Expository writing is the key to all other types of writing.
9.3.3 Writing a Research Paper. Do Now: Get out your Chromebooks and open the L1 Research Frame document from your Drive Agenda: ●Do Now ●Research Check-In.
GWE Testing Tips GWE Testing Tips Student Engagement and Academic Success Student Engagement and Academic Success.
TagHelper Track Overview Carolyn Penstein Rosé Carnegie Mellon University Language Technologies Institute & Human-Computer Interaction Institute School.
Revising Your Paper Paul Lewis With thanks to Mark Weal.
Design Question 4 – Element 22
Natural Language Processing for Enhancing Teaching and Learning
AVID Ms. Richardson.
Natural Language Processing for Enhancing Teaching and Learning
Writing Paper Three Monday, November 2.
Professor Diane Litman
High Expectations for a School Community
Portfolios 2018 Wednesday, February 28th.
Writing and Feedback.
Peer Reviews Tips for the author.
Mrs. Bly Eng 4 – Outliers by Malcolm Gladwell
A Brief Introduction to Peer Conferencing
iSTART: Reading Strategy Training
Presentation transcript:

Natural Language Processing for Writing Research: From Peer Review to Automated Assessment Diane Litman Senior Scientist, Learning Research & Development Center Professor, Computer Science Department Director, Intelligent Systems Program 1

Writing Research is a Goldmine for NLP New Educational Technology! Learning Science at Scale! Can we automate human coding?

Two Case Studies SWoRD and Argument Peer – w/ Kevin Ashley, Amanda Godley, Chris Schunn Response to Text Assessment – w/ Rip Correnti, Lindsay Clare Matsumara 3

SWoRD: A web-based peer review system [Cho & Schunn, 2007] Authors submit papers (or diagrams) Peers submit reviews – Problem: reviews are often not stated effectively – Example: no localization Justification is sufficient but unclear in some parts. – Our Approach: detect and scaffold Justification is sufficient but unclear in the section on African Americans

Localization Scaffolding Make sure that for every comment below, you explain where in the diagram it applies. For example, you can indicate where your comments apply by: (1) Specifying node(s) and/or arc(s) in the author's diagram to which your comment refers Your conflicting/supporting [node-type] is really solid! (2) Quoting the excerpt from the author's textual content of node and/or arc to which your comment refers For your [node-type] that talks about body chemistry and cortisol levels, you should clarify how that is related to politeness specifically. (3) Referring explicitly to the specific line of argumentation that your comment addresses Why does claim [node-ID] support the idea that people will be more polite in the evening? I’ve revised my comments. Please check again. I don’t know how to specify where in the diagram my comments apply. Could you show me some examples? My comments don’t have the issue that you describe. Please submit comments.

A First Classroom Evaluation [Nguyen, Xiong & Litman, 2014] NLP extracts attributes from reviews in real-time Prediction models use attributes to detect localization Scaffolding if < 50% of comments predicted as localized Deployment in undergraduate Research Methods

Results: Can we Automate? Diagram reviewPaper review AccuracyKappaAccuracyKappa Majority baseline61.5% (not localized) 050.8% (localized) 0 Our models81.7% %0.46 Comment Level Review Level Diagram reviewPaper review Total scaffoldings17351 Incorrectly triggered10

Results: New Educational Technology Reviewer responseREVISEDISAGREE Diagram review54 (48%)59 (52%) Paper review13 (30%)30 (70%) Response to Scaffolding Why are reviewers disagreeing? No correlation with true localization ratio (diagrams )

A Deeper Look: Revision Performance # and % of comments (diagram reviews) NOT Localized → Localized2630.2% Localized → Localized2630.2% NOT Localized → NOT Localized3338.4% Localized → NOT Localized11.2% Comment localization is either improved or remains the same after scaffolding Localization revision continues after scaffolding is removed (see poster!)

A Deeper Look: Revision Performance # and % of comments (diagram reviews) NOT Localized → Localized2630.2% Localized → Localized2630.2% NOT Localized → NOT Localized3338.4% Localized → NOT Localized11.2% Open questions Are reviewers improving localization quality? Interface issues, or rubric non-applicability?

Automatic Scoring of an Analytical Response-To-Text Assessment (RTA) [Rahimi, Litman, Correnti, Matsumura, Wang & Kisa, 2014] Long-term goal – informative feedback for students and teachers Current work – interpretable, NLP-based features that operationalize the Evidence rubric of RTA 11

Scoring Essays for Evidence 12

Rubric-Derived Features Number of Pieces of Evidence (NPE) – Topics and words defined based on the text and by experts – Window-based algorithm Concentration (CON) – High concentration: fewer than 3 sentences with topic words Specificity (SPC) – Specific examples from different parts of the text – Window-based algorithm Word Count (WOC) – Temporary fallback feature 13

Essay with score of 4 on Evidence I was convinced that winning the fight of poverty is achievable in our lifetime. Many people couldn't afford medicine or bed nets to be treated for malaria. Many children had died from this dieseuse even though it could be treated easily. But now, bed nets are used in every sleeping site. And the medicine is free of charge. Another example is that the farmers' crops are dying because they could not afford the nessacary fertilizer and irrigation. But they are now, making progess. Farmers now have fertilizer and water to give to the crops. Also with seeds and the proper tools. Third, kids in Sauri were not well educated. Many families couldn't afford school. Even at school there was no lunch. Students were exhausted from each day of school. Now, school is free. Children excited to learn now can and they do have midday meals. Finally, Sauri is making great progress. If they keep it up that city will no longer be in poverty. Then the Millennium Village project can move on to help other countries in need. NPECONWOCSPC

Results: Can we Automate? Proposed features outperform both baselines

Results: Can we Automate? Absolute performance improves on less noisy data Complete: Complete dataset (n = 1569) Subset: Doubly-coded essays where raters agree (n=353) less training data, and only for our features

Other Results See poster – Feature analysis – Spelling correction Predictive utility generalizes to a second dataset 17

New NLP-Supported Directions Teacher dashboard for high school science writing – LRDC grant -> (expected) NSF DRK-12 – w/ Amanda Godley & Chris Schunn Peer review search and analytics in MOOCS – Google award Student reflections in undergraduate STEM – LRDC grant – w/ Muhsin Menekse & Jingtao Wang

Thank You! Questions? Further Information –

Paper Review Localization Model [Xiong, Litman & Schunn, 2010]

Diagram Review Localization Model [Nguyen & Litman, 2013] Localization again correlates with feedback implementation [Lippmann et al., 2012] Pattern-based detection algorithm – Numbered ontology type, e.g. citation 15 – Textual component content, e.g. time of day hypothesis – Unique component, e.g. the con-argument – Connected component, e.g. support of second hypothesis – Numerical regular expression, e.g. H1, #10 21

Results: Revision Performance Number (pct.) of comments of diagram reviews Scope=InScope=OutScope=No NOT Loc. → Loc %787.5%312.5% Loc. → Loc %112.5%1666.7% NOT Loc. → NOT Loc %00%520.8% Loc. → NOT Loc.11.2%00%0 Comment localization is either improved or remains the same after scaffolding] Localization revision continues after scaffolding is removed Are reviewers improving localization quality, or performing other types of revisions? Interface issues, or rubric non-applicability?

Rubric for the Evidence dimension of RTA 1234 Features one or no pieces of evidence Features at least 2 pieces of evidence Features at least 3 pieces of evidence Features at least 3 pieces of evidence Selects inappropriate or little evidence from the text; may have serious factual errors and omissions Selects some appropriate but general evidence from the text; may contain a factual error or omission Selects appropriate and concrete, specific evidence from the text Selects detailed, precise, and significant evidence from the text Demonstrates little or no development or use of selected evidence Demonstrates limited development or use of selected evidence Demonstrates use of selected details from the text to support key idea Demonstrates integral use of selected details from the text to support and extend key idea Summarize entire text or copies heavily from text Evidence provided may be listed in a sentence, not expanded upon Attempts to elaborate upon Evidence Evidence must be used to support key idea / inference(s)

Essay with score of 1 on Evidence Yes, because even though proverty is still going on now it does not mean that it can not be stop. Hannah thinks that proverty will end by 2015 but you never know. The world is going to increase more stores and schools. But if everyone really tries to end proverty I believe it can be done. Maybe starting with recycling and taking shorter showers, but no really short that you don't get clean. Then maybe if we make more money or earn it we can donate it to any charity in the world. Proverty is not on in Africa, it's practiclly every where! Even though Africa got better it didn't end proverty. Maybe they should make a law or something that says and declare that proverty needs to need. There's no specic date when it will end but it will. When it does I am going to be so proud, wheather I'm alive or not. NPECONWOCSPC

25

26

Future RTA Directions New features and other scoring dimensions Full automation – extraction of topics and words – spelling correction Downstream applications for teachers and students 27

New NLP-Supported Directions Additional measures of peer review quality – Solutions to problems – Helpfulness – Impact on writing quality Teacher dashboard (internal grant -> likely NSF DRK-12) – Reviews Quality metrics (localization, solution, helpfulness) Topic-word analytics Review summarization – Papers Revision behavior

Summing Up: Common Themes NLP for supporting writing research at scale – Learning science – Educational technology Many opportunities and challenges – Characteristics of student writing Prior NLP software often trained on newspaper texts – Model desiderata Beyond accuracy – Interactions between NLP and Educational Technologies Robustness to noisy predictions Implicit feedback for lifelong computer learning 29