David L. Chen Fast Online Lexicon Learning for Grounded Language Acquisition The 50th Annual Meeting of the Association for Computational Linguistics (ACL)

Presentation transcript:

David L. Chen
Fast Online Lexicon Learning for Grounded Language Acquisition
The 50th Annual Meeting of the Association for Computational Linguistics (ACL)
July 9, 2012
The University of Texas at Austin, Google

Navigation Task
- Learn to interpret and follow free-form navigation instructions
  - e.g. "Go down this hall and make a right when you see an elevator to your left"
- Learn by observing how humans follow instructions
- Assume no prior linguistic knowledge
- Use virtual worlds and instructor/follower data from MacMahon et al. (2006)

Sample Instructions
[Figure: map with the Start and End positions marked; map labels 3, H, 4]
- Take your first left. Go all the way down until you hit a dead end.
- Go towards the coat hanger and turn left at it. Go straight down the hallway and the dead end is position 4.
- Walk to the hat rack. Turn left. The carpet should have green octagons. Go to the end of this alley. This is p-4.
- Walk forward once. Turn left. Walk forward twice.
Observed primitive actions: Forward, Left, Forward, Forward

Overall System (Chen and Mooney 2011)
Learning system for parsing navigation instructions
[Figure: system diagram with Training and Testing phases. Components: Observation (Instruction, World State, Action Trace), Navigation Plan Constructor, Plan Refinement, Semantic Parser Learner, Semantic Parser, Execution Module (MARCO).]

Potential Navigation Plans
- Instruction: Turn and walk to the couch
- Action Trace: Left, Forward, Forward
- Background knowledge: Layout of the map
[Figure: candidate plan graph: Turn ( LEFT ), Verify ( front: BLUE HALL, front: SOFA ), Travel ( steps: 2 ), Verify ( at: SOFA )]
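A minimal sketch of how the action trace alone can yield the Turn and Travel steps of such a plan. The function name `basic_plan` and the tuple encoding are illustrative assumptions, not the author's code. The Verify steps, which depend on the layout of the map, are left out.

```python
from itertools import groupby


def basic_plan(action_trace):
    """Collapse a primitive action trace into Turn and Travel steps.

    Consecutive Forward actions are merged into one Travel step with a
    step count; every other action is treated as a turn in that direction.
    """
    plan = []
    for action, run in groupby(action_trace):
        count = len(list(run))
        if action == "Forward":
            plan.append(("Travel", {"steps": count}))
        else:
            # One Turn step per primitive turn action.
            plan.extend(("Turn", {"direction": action.upper()}) for _ in range(count))
    return plan


print(basic_plan(["Left", "Forward", "Forward"]))
# [('Turn', {'direction': 'LEFT'}), ('Travel', {'steps': 2})]
```

This matches the LEFT and "2 steps" labels in the slide's example for the trace Left, Forward, Forward.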

Plan Refinement
Example instructions shown against the same plan graph:
- Turn and walk to the couch
- Face the blue hall and walk 2 steps
- Turn left. Walk forward twice.
[Figure: the plan graph from the Potential Navigation Plans slide]

Plan Refinement
- Find the correct subplan that corresponds to the instruction
- First learn the meaning of words and short phrases
- Use the learned lexicon to remove parts of the plans unrelated to the instructions
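The refinement idea can be sketched roughly as follows. This is a simplified illustration, not the system's actual procedure: the plan encoding, the toy `lexicon`, and the function `refine_plan` are all hypothetical, and the attribute placement in `plan` is one reading of the slide's graph. A plan step survives only if some lexicon entry triggered by an n-gram of the instruction names it.

```python
def refine_plan(instruction, plan, lexicon, max_n=3):
    """Keep only plan steps and attributes supported by the lexicon."""
    words = instruction.lower().replace(".", "").split()
    grams = {" ".join(words[i:i + n])
             for n in range(1, max_n + 1)
             for i in range(len(words) - n + 1)}
    # Meanings of every n-gram in the instruction that the lexicon knows.
    meanings = [lexicon[g] for g in grams if g in lexicon]

    refined = []
    for action, attributes in plan:
        matches = [attr for act, attr in meanings
                   if act == action and (attr is None or attr in attributes)]
        if matches:
            # Keep the step, but only the attributes a lexicon entry names.
            refined.append((action, [a for a in attributes if a in matches]))
    return refined


# Assumed encoding of the example plan: (action, list of attributes).
plan = [
    ("Turn", ["LEFT"]),
    ("Verify", ["front: BLUE HALL", "front: SOFA"]),
    ("Travel", ["steps: 2"]),
    ("Verify", ["at: SOFA"]),
]
# Hypothetical lexicon: n-gram -> (action, attribute or None).
lexicon = {
    "turn": ("Turn", None),
    "walk": ("Travel", None),
    "couch": ("Verify", "at: SOFA"),
}

print(refine_plan("Turn and walk to the couch", plan, lexicon))
# [('Turn', []), ('Travel', []), ('Verify', ['at: SOFA'])]
```

With this toy lexicon, the direction, the step count, and the first Verify step are dropped because nothing in the instruction mentions them.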

Subgraph Generation Online Lexicon Learning (SGOLL)
1. As an example comes in, break down the sentence and the graph into n-grams and connected subgraphs
Example sentence: Turn and walk to the couch
- 1-grams: turn, and, walk, to, the, couch
- 2-grams: turn and, and walk, walk to, to the, the couch
- 3-grams: turn and walk, and walk to, walk to the, to the couch
- …
- Connected subgraphs of size 1: Turn, LEFT, Verify, …
- Connected subgraphs of size 2: Turn-LEFT, Turn-Verify, …
[Figure: the plan graph from the Potential Navigation Plans slide]
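Step 1 can be sketched as below. The helper names, the brute-force enumeration, and the edge list for the toy graph are assumptions for illustration. Subgraphs are represented as sets of node names, which treats them as induced subgraphs, and the real system may enumerate them differently.

```python
from itertools import combinations


def ngrams(sentence, max_n=3):
    """All word n-grams of the sentence up to length max_n."""
    words = sentence.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)]


def connected_subgraphs(nodes, edges, max_size=2):
    """Node sets of size <= max_size that form a connected subgraph."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    def is_connected(subset):
        subset = set(subset)
        start = next(iter(subset))
        seen, stack = {start}, [start]
        while stack:
            for neighbour in adjacency[stack.pop()] & subset:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        return seen == subset

    return [frozenset(candidate)
            for size in range(1, max_size + 1)
            for candidate in combinations(nodes, size)
            if is_connected(candidate)]


grams = ngrams("Turn and walk to the couch")
print(len(grams))  # 15 (6 unigrams + 5 bigrams + 4 trigrams)

# Toy fragment of the plan graph: Turn has an attribute LEFT and is
# followed by a Verify step (edge list assumed for illustration).
subgraphs = connected_subgraphs(["Turn", "LEFT", "Verify"],
                                [("Turn", "LEFT"), ("Turn", "Verify")])
print(len(subgraphs))  # 5: three single nodes, Turn-LEFT, Turn-Verify
```

Brute-force enumeration is only practical for small size limits. It is used here to keep the sketch short.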

Subgraph Generation Online Lexicon Learning (SGOLL)
2. Increase the count of each n-gram and each connected subgraph, and the co-occurrence count of each (n-gram, connected subgraph) pair. Hash the connected subgraphs for efficient update.
3. Rank the entries by the scoring function
[Figure: the n-gram "turn" with candidate subgraphs, including Turn ( LEFT ) and Turn ( RIGHT )]
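Steps 2 and 3 might look like the sketch below. The class and method names are hypothetical. Subgraphs are assumed to be given as canonical strings so that they can serve as hash keys. The slides do not spell out the scoring function, so the sketch assumes a score of the form p(g | w) - p(g | not w), estimated from the running counts.

```python
from collections import Counter


class LexiconCounts:
    """Running counts for an online lexicon learner (illustrative only)."""

    def __init__(self):
        self.num_examples = 0
        self.ngram_count = Counter()
        self.subgraph_count = Counter()
        self.pair_count = Counter()

    def update(self, ngrams, subgraphs):
        """Step 2: process one training example, then discard it."""
        ngrams, subgraphs = set(ngrams), set(subgraphs)
        self.num_examples += 1
        self.ngram_count.update(ngrams)
        self.subgraph_count.update(subgraphs)
        self.pair_count.update((w, g) for w in ngrams for g in subgraphs)

    def score(self, w, g):
        """Assumed score: p(g | w) - p(g | not w)."""
        with_w = self.ngram_count[w]
        if with_w == 0:
            return 0.0
        without_w = self.num_examples - with_w
        p_g_given_w = self.pair_count[(w, g)] / with_w
        g_without_w = self.subgraph_count[g] - self.pair_count[(w, g)]
        p_g_given_not_w = g_without_w / without_w if without_w else 0.0
        return p_g_given_w - p_g_given_not_w

    def ranked(self, w):
        """Step 3: candidate meanings of n-gram w, best score first."""
        candidates = [g for (word, g) in self.pair_count if word == w]
        return sorted(candidates, key=lambda g: self.score(w, g), reverse=True)


counts = LexiconCounts()
counts.update(["turn", "turn left"], ["Turn", "Turn(LEFT)"])
counts.update(["turn", "turn right"], ["Turn", "Turn(RIGHT)"])
counts.update(["walk"], ["Travel"])

print(counts.score("turn", "Turn"))        # 1.0
print(counts.score("turn", "Turn(LEFT)"))  # 0.5
print(counts.ranked("turn")[0])            # Turn
```

Because each example only increments counters, the lexicon can be updated one example at a time without revisiting earlier data.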

Evaluation Data Statistics
- 3 maps, 6 instructors, 1-15 followers/instruction
- Hand-segmented into single sentence steps
[Table: columns Paragraph and Single-Sentence; rows # Instructions, Avg. # sentences, Avg. # words, Avg. # actions. The cell values are missing from this transcript.]

Lexicon Building Time
Time in seconds:
- Chen and Mooney (2011): [value missing from this transcript]
- SGOLL: 157.3

End-to-end Execution
- Test how well the system can perform the overall navigation task
- Leave-one-map-out approach
- Strict metric: Only successful if the final position matches exactly
- Upper baselines
  - Training with human annotated gold plans
  - Complete MARCO system [MacMahon, 2006]
  - Humans
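The evaluation protocol can be sketched as follows. The function names, the map identifiers, and the use of (x, y) tuples for positions are assumptions made for illustration.

```python
def leave_one_map_out(maps):
    """Yield (training maps, held-out test map) for every map."""
    for held_out in maps:
        yield [m for m in maps if m != held_out], held_out


def strict_success_rate(predicted_positions, gold_positions):
    """Fraction of runs whose final position matches the gold one exactly."""
    hits = sum(p == g for p, g in zip(predicted_positions, gold_positions))
    return hits / len(gold_positions)


for train_maps, test_map in leave_one_map_out(["map_a", "map_b", "map_c"]):
    print(train_maps, "->", test_map)
# ['map_b', 'map_c'] -> map_a
# ['map_a', 'map_c'] -> map_b
# ['map_a', 'map_b'] -> map_c

print(strict_success_rate([(1, 2), (3, 4)], [(1, 2), (0, 0)]))  # 0.5
```

Under the strict metric, a run that ends one step away from the goal counts as a failure.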

End-to-end Execution
System                   Single Sentences   Paragraphs
Chen and Mooney (2011)   54.40%             16.18%
Chen (2012)              57.28%             19.18%
Gold Standard Plans      62.67%             29.59%
MARCO                    77.87%             55.69%
Humans                   N/A                69.64%

Example Parse
Instruction: “Place your back against the wall of the ‘T’ intersection. Turn left. Go forward along the pink-flowered carpet hall two segments to the intersection with the brick hall. This intersection contains a hatrack. Turn left. Go forward three segments to an intersection with a bare concrete hall, passing a lamp. This is Position 5.”
Parse: Turn ( ), Verify ( back: WALL ), Turn ( LEFT ), Travel ( ), Verify ( side: BRICK HALLWAY ), Turn ( LEFT ), Travel ( steps: 3 ), Verify ( side: CONCRETE HALLWAY )

Mandarin Chinese Experiment
- Translated all the instructions from English to Chinese
- Train and test in the same way
- Chinese does not include word boundaries (spaces), so two segmentations were tried:
  - Naively segment each character
  - Use a trained Chinese word segmenter [Chang, Galley & Manning, 2008]
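The naive option amounts to treating every character as a token. A minimal sketch, with a hypothetical function name and an illustrative phrase that is not taken from the dataset:

```python
def segment_by_character(text):
    """Treat each non-whitespace character as its own token."""
    return [ch for ch in text if not ch.isspace()]


tokens = segment_by_character("向左转")  # illustrative phrase meaning "turn left"
print(tokens)  # ['向', '左', '转']
```

Running n-gram extraction over such tokens produces character n-grams, so multi-character words can still be captured as 2-grams or 3-grams.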

Mandarin Chinese Experiment
Segmentation                      Single Sentences   Paragraphs
Segmented by character            58.54%             16.11%
Segmented by Stanford segmenter   58.70%             20.13%

Conclusion
- Presented a system that learns to interpret free-form navigation instructions by observing how humans follow instructions
- Assumes no prior linguistic knowledge
- Able to learn from multiple languages
- Fast online learning makes the system more scalable

Thanks to my collaborators: Raymond J. Mooney and Lu Guo More details and data/code: Questions?