Lexical simplification
• A subtask of text simplification
• Replaces words or short phrases with simpler variants in a context-aware fashion
• Motivation
  • To reach a wider range of readers with limited vocabulary
    ▪ Children
    ▪ People with low literacy or a cognitive disability
    ▪ Second-language learners

• Identification of complex words or phrases
• Substitute lookup
  • Synonyms from a thesaurus
  • Distributional similarity
• Context-based ranking
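The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular paper: the frequency table, the thesaurus, the co-occurrence counts, the threshold, and all function names are hypothetical toy values invented for the example. Word frequency stands in for complexity, and summed co-occurrence with the other words of the sentence stands in for context-based ranking.

```python
from collections import Counter

# Hypothetical toy resources; a real system would derive these from corpora.
WORD_FREQ = Counter({
    "hitler": 200, "committed": 300, "terrible": 250, "atrocities": 15,
    "cruelties": 40, "barbarities": 3, "crimes": 400, "during": 900,
    "the": 5000, "second": 800, "world": 900, "war": 700,
})
THESAURUS = {"atrocities": ["cruelties", "barbarities", "crimes"]}
COOCCUR = Counter({
    ("cruelties", "terrible"): 8, ("cruelties", "war"): 5,
    ("crimes", "committed"): 4, ("crimes", "war"): 6,
})
FREQ_THRESHOLD = 100  # assumed cutoff: rarer words count as complex


def is_complex(word):
    """Step 1: flag a word as complex if it is rare in the toy table."""
    return WORD_FREQ[word] < FREQ_THRESHOLD


def lookup_substitutes(word):
    """Step 2: thesaurus synonyms that are more frequent than the word."""
    return [c for c in THESAURUS.get(word, []) if WORD_FREQ[c] > WORD_FREQ[word]]


def rank_by_context(candidates, context):
    """Step 3: order candidates by co-occurrence with the context words."""
    def score(candidate):
        return sum(COOCCUR[(candidate, w)] for w in context)
    return sorted(candidates, key=score, reverse=True)


def simplify(sentence):
    tokens = sentence.split()
    lowered = [t.lower() for t in tokens]
    output = []
    for i, word in enumerate(lowered):
        replacement = tokens[i]  # keep the original token by default
        if is_complex(word):
            context = lowered[:i] + lowered[i + 1:]
            ranked = rank_by_context(lookup_substitutes(word), context)
            if ranked:
                replacement = ranked[0]
        output.append(replacement)
    return " ".join(output)


print(simplify("Hitler committed terrible atrocities during the second World War"))
```

With these toy tables only "atrocities" falls below the threshold, "barbarities" is discarded because it is even rarer, and "cruelties" outranks "crimes" in this context.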

• Technical medical language
  • Hypertension risk factors include obesity, ...
  • High blood pressure risk factors include excessive weight, ...
• Legal language
  • The Products transacted through the Service are ...
  • The Products managed through the Service are ...
• Low-literacy readers
  • Hitler committed terrible atrocities during the Second World War
  • Hitler committed terrible cruelties during the Second World War

• Knowledge-based approach
  • Uses a thesaurus or WordNet
  • Hard to capture all simplification contexts
• Lexical simplification as paraphrasing
  • Paraphrasing does not deal specifically with complexity reduction
• Lexical simplification as machine translation
  • Requires a complex-simple parallel corpus
  • Wikipedia-Simple Wikipedia corpora
    ▪ Not comparable

• Simple English Wikipedia (SEW)
  • An edition of the normal, or Complex, English Wikipedia (CEW) written in simpler constructs with a restricted vocabulary
  • A Wikipedia for children, low-literacy readers, second-language readers, etc.
  • 121,095 content pages
  • Semi-parallel to its complex counterpart
Resource: For the Sake of Simplicity: Unsupervised Extraction of Lexical Simplifications from Wikipedia, Yatskar et al.

[Figure: revision history of an article: Version 1, edits, Version 2, ..., edits, Version n]

• Edits in SEW revisions are a mix of different edit types
• The task
  • Separate the simple (simplification) edits from all other edits
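Before edits can be separated by type, candidate word-level edits have to be pulled out of consecutive revisions. The sketch below shows one possible way to do that with Python's standard `difflib`; it is an assumption made for illustration, not the extraction procedure of Yatskar et al., and the function name `extract_lexical_edits` is hypothetical. It keeps only one-word-for-one-word replacements and leaves the classification of those edits to a later step.

```python
import difflib


def extract_lexical_edits(old_text, new_text):
    """Return (old_word, new_word) pairs for single-word replacements."""
    old_tokens = old_text.split()
    new_tokens = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Keep only spans where exactly one token was swapped for another.
        if tag == "replace" and i2 - i1 == 1 and j2 - j1 == 1:
            edits.append((old_tokens[i1], new_tokens[j1]))
    return edits


# Two hypothetical consecutive revisions of the same sentence.
version_1 = "Hitler committed terrible atrocities during the second World War"
version_2 = "Hitler committed terrible cruelties during the second World War"
print(extract_lexical_edits(version_1, version_2))
```

The pairs produced this way are still a mix of edit types, so a separate model is needed to decide which of them are simplifications.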

Probability estimation of fix edit

fix + simple edit

Self study
Resource: Putting it Simply: A Context-Aware Approach to Lexical Simplification, Biran et al.