Multilinguality to the Rescue Manaal Faruqui & Chris Dyer Language Technologies Institute SCS, CMU.

Presentation transcript:

Multilinguality to the Rescue Manaal Faruqui & Chris Dyer Language Technologies Institute SCS, CMU

Multilinguality: using more than one language at a time.

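Multilinguality: Why? The English word "bank" is ambiguous, but Hindi uses different words for its senses: बैंक (financial institution) and तट (river bank). Cross-lingual word sense disambiguation (Diab and Resnik, 2002).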

Multilinguality: Why? Bilingual word clustering (Faruqui & Dyer, 2013).

Multilinguality: using data from other languages. Direct: assume foreign = original language. Indirect: extract information from the foreign language.

Direct Information Transfer (diagram): Language 1 data and Language 2 data are both fed to the NLP system, which produces the output.

Direct Information Transfer: Why would it work? It works for specific tasks like NER. Many NEs retain their orthographic form across languages that use the same alphabet (English, German, French, Spanish; Hindi, Marathi, Bihari). This holds especially for proper nouns: names of locations (USA, London, New York, Pittsburgh) and names of people (Obama, William, Roger).

Direct Information Transfer: example sentences in which a named entity keeps its form across German, English, and French.
"Obama" (German): Barack Obama hat 2012 mit dieser Strategie die Präsidentschaftswahlen gewonnen.
"Obama" (English): The Obama administration has poured billions of dollars into expanding the reach of the Internet.
"Obama" (French): Pour finir, en défendant les bonus et en tentant de faire dérailler les nouvelles règles prudentielles, ce démocrate s'est mis à dos Barack Obama.
"Hongkong" (German): ... sagte Jimmy Wales dem Wall Street Journal in einem Interview in Hongkong.
"Hongkong" (English): Mads Refslund, executive chef at Acme, forages in the overgrown spaces and hidden markets of Hongkong for regional delicacies.
"Hongkong" (French): Les sacs de luxe, nouvelle monnaie d'échange à Hongkong.

Direct Information Transfer: Semantic Generalization. Deutschland (100), Ostdeutschland (5), Westdeutschland (0): all LOC.

Direct Information Transfer: How? Language 1 training data and Language 2 word clusters are given to the NER system, which turns input text into NE-tagged text.
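One way to picture how word clusters let an NER system generalize is the sketch below. It is only an illustration, not the Stanford NER feature set: the function `token_features`, the feature names, and the cluster id 17 are all hypothetical. A rare or unseen word shares a cluster feature with a frequent word, so whatever the tagger learned for that feature carries over.

```python
def token_features(token, clusters):
    """Return a set of string features for one token.

    `clusters` is a hypothetical word -> cluster-id mapping, e.g. induced
    from unlabelled text in another language.
    """
    feats = {"word=" + token.lower()}
    cluster = clusters.get(token)
    if cluster is not None:
        # Back-off feature shared by every word in the same cluster.
        feats.add("cluster=" + str(cluster))
    return feats


# Invented cluster assignment for the three words on the slide.
clusters = {"Deutschland": 17, "Ostdeutschland": 17, "Westdeutschland": 17}

frequent = token_features("Deutschland", clusters)
unseen = token_features("Westdeutschland", clusters)

# The only feature the two tokens have in common is the cluster feature.
shared = frequent & unseen
```

Because `shared` contains the cluster feature, a weight learned for that feature from the frequent word "Deutschland" would also fire on "Westdeutschland".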

Evaluation. Tools: Stanford NER for training (Finkel and Manning, 2009), which has built-in functionality to use word clusters for generalization; word clustering software, distributional + morphological (Clark, 2003). Data: NER training data for German and English (CoNLL 2003) and for Dutch and Spanish (CoNLL 2002); generalization data from WMT-2012 news commentary, 200 million tokens, in English, German, French, Spanish, Czech.

Results

Improvement in F1 scores by NE type

Quick Takeaways: Multilingual data can be put to use for monolingual benefit. The amount of help depends on how orthographically similar the two languages are.

Indirect Information Transfer (diagram): Language 1 data + Language 2 data, fed to the NLP system, which produces the output.

Vector Space Word Models (image)

Vector Space Models (image)

Vector Space Models: can monolingual word vectors for language 1, combined with monolingual word vectors for another language, yield better monolingual word vectors for language 1?

Indirect Information Transfer: Canonical Correlation Analysis (diagram). An n × d1 matrix and an n × d2 matrix are each projected to n × k.

Canonical Correlation Analysis (diagram): x (n × d1) * w_x (d1 × k) and y (n × d2) * w_y (d2 × k) each give an n × k projection.
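The shapes in the CCA diagram can be made concrete with a small NumPy sketch. This is not the Matlab toolkit used in the talk; it is a textbook formulation (whiten both views, then take an SVD of the cross-covariance), and the function names, the `reg` ridge term, and the random toy data are assumptions made for illustration.

```python
import numpy as np


def inv_sqrt(c):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(c)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T


def cca(x, y, k, reg=1e-8):
    """Return w_x (d1 x k), w_y (d2 x k) and the top-k canonical correlations."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    # Covariances; the small ridge term keeps them invertible (an assumption).
    cxx = x.T @ x / (n - 1) + reg * np.eye(x.shape[1])
    cyy = y.T @ y / (n - 1) + reg * np.eye(y.shape[1])
    cxy = x.T @ y / (n - 1)
    cxx_is, cyy_is = inv_sqrt(cxx), inv_sqrt(cyy)
    # Singular values of the whitened cross-covariance are the correlations,
    # already sorted from most to least correlated.
    u, s, vt = np.linalg.svd(cxx_is @ cxy @ cyy_is)
    w_x = cxx_is @ u[:, :k]
    w_y = cyy_is @ vt.T[:, :k]
    return w_x, w_y, s[:k]


# Toy data: two views (d1 = 5, d2 = 4) of the same 2-dimensional signal.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
x = z @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))
y = z @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(500, 4))

w_x, w_y, corr = cca(x, y, k=2)
x_proj = (x - x.mean(axis=0)) @ w_x  # n x k
y_proj = (y - y.mean(axis=0)) @ w_y  # n x k
```

Here `x_proj` and `y_proj` play the role of the two n × k matrices in the diagram, and `corr` lists how strongly each pair of projected dimensions is correlated.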

Indirect Information Transfer: start from word vectors in Language 1 and word vectors in Language 2, then obtain a 1-to-1 mapping between the two vocabularies using word alignments.
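The slide does not say how the 1-to-1 mapping is chosen. One plausible rule, assumed here purely for illustration, is to map each word to the word it is aligned to most often. The helper `one_to_one` and the toy alignment pairs are hypothetical; in practice the pairs would come from an aligner's output over a parallel corpus.

```python
from collections import Counter, defaultdict


def one_to_one(aligned_pairs):
    """Map each source word to its most frequently aligned target word."""
    counts = defaultdict(Counter)
    for src, tgt in aligned_pairs:
        counts[src][tgt] += 1
    # most_common(1) returns the single (target, count) pair with the top count.
    return {src: c.most_common(1)[0][0] for src, c in counts.items()}


# Invented alignment links (English word, German word).
pairs = [
    ("house", "Haus"),
    ("house", "Haus"),
    ("house", "Gebäude"),
    ("bank", "Bank"),
]
mapping = one_to_one(pairs)
```

Each (source, target) entry in `mapping` then selects one row from each language's vector matrix, which gives the two row-aligned matrices that CCA expects.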

Experiments. Task: word pair reranking, that is, rank a list of word pairs according to semantic similarity. Datasets: WS-353 (353 word pairs) and RG-65 (65 noun pairs). Truncation: the correlation may introduce noise, so keep only the top k% of correlated dimensions.
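A minimal sketch of this evaluation, under stated assumptions: similarity is cosine between word vectors, agreement with the human scores is a rank correlation (Spearman, computed here without tie handling for brevity), and truncation keeps the leading columns because CCA returns dimensions sorted by correlation. The toy vectors and gold scores are invented, not taken from WS-353 or RG-65.

```python
import numpy as np


def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def ranks(values):
    """Rank positions 0..n-1 (ties are not averaged in this sketch)."""
    order = np.argsort(values)
    r = np.empty(len(values))
    r[order] = np.arange(len(values))
    return r


def spearman(a, b):
    """Spearman correlation as Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(a), ranks(b))[0, 1])


def truncate(vectors, ratio):
    """Keep the first `ratio` fraction of columns (most correlated first)."""
    keep = max(1, int(np.ceil(ratio * vectors.shape[1])))
    return vectors[:, :keep]


# Invented 3-dimensional vectors and human similarity judgements.
vectors = {
    "car": np.array([1.0, 0.1, 0.0]),
    "automobile": np.array([0.9, 0.2, 0.1]),
    "tiger": np.array([0.0, 1.0, 0.2]),
    "cat": np.array([0.3, 0.8, 0.4]),
}
gold = [("car", "automobile", 9.0), ("tiger", "cat", 7.0), ("car", "tiger", 1.0)]

predicted = [cosine(vectors[a], vectors[b]) for a, b, _ in gold]
human = [score for _, _, score in gold]
rho = spearman(predicted, human)
```

Running the same evaluation on `truncate(projected_vectors, ratio)` for several values of `ratio` is one way to choose the k% mentioned on the slide; `projected_vectors` is a hypothetical name for the CCA output.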

Evaluation. Tools: word vectors from the RNNLM Toolkit (Mikolov, 2009); word alignments from cdec (Dyer et al., 2013); CCA from a Matlab toolkit. Data: word vector monolingual training data from WMT news commentary 2011, 2012 (English, French, Spanish, German); word alignment data from WMT news commentary 2010, 09, , 06 ({French, Spanish, German} - English).

Results

Original English Vectors

German Projected on English

Conclusion: Word vector quality can be improved using multilingual data, at least for lexical semantic tasks. The amount of help these languages provide depends on how similar they are to each other. A task like NER can use data from multiple languages in a simple framework.

Thank You!