Giuseppe Attardi Dipartimento di Informatica Università di Pisa

Slides:

Advertisements

Similar presentations

Linking Entities in #Microposts ROMIL BANSAL, SANDEEP PANEM, PRIYA RADHAKRISHNAN, MANISH GUPTA, VASUDEVA VARMA INTERNATIONAL INSTITUTE OF INFORMATION TECHNOLOGY,

Advertisements

Deep Learning in NLP Word representation and how to use it for Parsing

Chapter 11 Beyond Bag of Words. Question Answering n Providing answers instead of ranked lists of documents n Older QA systems generated answers n Current.

Introduction to Machine Learning Approach Lecture 5.

ELN – Natural Language Processing Giuseppe Attardi

Richard Socher Cliff Chiung-Yu Lin Andrew Y. Ng Christopher D. Manning

L’età della parola Giuseppe Attardi Dipartimento di Informatica Università di Pisa ESA SoBigDataPisa, 24 febbraio 2015.

Seminar Topics and Projects Giuseppe Attardi Dipartimento di Informatica Università di Pisa.

Semantic Compositionality through Recursive Matrix-Vector Spaces

Shallow Parsing for South Asian Languages -Himanshu Agrawal.

Overview of Statistical NLP IR Group Meeting March 7, 2006.

Graph-based Dependency Parsing with Bidirectional LSTM Wenhui Wang and Baobao Chang Institute of Computational Linguistics, Peking University.

Deep Learning for Text Analysis Where do we stand?

Fill-in-The-Blank Using Sum Product Network

Language Identification and Part-of-Speech Tagging

S.Bengio, O.Vinyals, N.Jaitly, N.Shazeer

Course Outline (6 Weeks) for Professor K.H Wong

Ensembling Diverse Approaches to Question Answering

R-NET: Machine Reading Comprehension With Self-Matching Networks

Sentiment analysis using deep learning methods

Like It or Not: A Survey of Twitter Sentiment Analysis Methods

CS 388: Natural Language Processing: LSTM Recurrent Neural Networks

Sentiment analysis algorithms and applications: A survey

Deep Learning Amin Sobhani.

Natural Language and Text Processing Laboratory

Sentence Modeling Representation of sentences is the heart of Natural Language Processing A sentence model is a representation and analysis of semantic.

Tools for Natural Language Processing Applications

Relation Extraction CSCI-GA.2591

Wei Wei, PhD, Zhanglong Ji, PhD, Lucila Ohno-Machado, MD, PhD

Factual Claim Validation Models Extraction of Evidence

Improving a Pipeline Architecture for Shallow Discourse Parsing

Giuseppe Attardi Dipartimento di Informatica Università di Pisa

MID-SEM REVIEW.

Are End-to-end Systems the Ultimate Solutions for NLP?

Vector-Space (Distributional) Lexical Semantics

AI in Cyber-security: Examples of Algorithms & Techniques

Text Analytics Giuseppe Attardi Università di Pisa

convolutional neural networkS

Distributed Representation of Words, Sentences and Paragraphs

convolutional neural networkS

Recurrent Neural Networks

Learning Emoji Embeddings Using Emoji Co-Occurrence Network Graph

Giuseppe Attardi Dipartimento di Informatica Università di Pisa

Seminar Topics and Projects

Ontology-Driven Sentiment Analysis of Product and Service Aspects

Word embeddings based mapping

Word embeddings based mapping

Text Mining & Natural Language Processing

Introduction to Natural Language Processing

Text Mining & Natural Language Processing

Presentation By: Eryk Helenowski PURE Mentor: Vincent Bindschaedler

Word embeddings (continued)

Deep Learning Authors: Yann LeCun, Yoshua Bengio, Geoffrey Hinton

Preposition error correction using Graph Convolutional Networks

Natural Language Processing (NLP) Systems Joseph E. Gonzalez

CS565: Intelligent Systems and Interfaces

Attention for translation

Learn to Comment Mentor: Mahdi M. Kalayeh

Lecture 21: Machine Learning Overview AP Computer Science Principles

CSE 291G : Deep Learning for Sequences

Sequence to Sequence Video to Text

Automatic Handwriting Generation

Presented by: Anurag Paul

Factual Claim Validation Models

Sentiment Classification

Tokenizing Search/regex Statistics

The experiments based on Recurrent Neural Networks

Bidirectional LSTM-CRF Models for Sequence Tagging

Lecture 9: Machine Learning Overview AP Computer Science Principles

Presentation transcript:

Giuseppe Attardi Dipartimento di Informatica Università di Pisa Project Topics Giuseppe Attardi Dipartimento di Informatica Università di Pisa

Deep Learning Tokenizer Depling 2016 challenge requires a tokenizer for any of the Universal Dependency TreeBanks at: http://universaldependencies.org Choose one language Build a DL tokenizer using Keras based on the approach of: Basile, Valerio and Bos, Johan and Evang, Kilian A General-Purpose Machine Learning Method for Tokenization and Sentence Boundary Detection (2013), http://gmb.let.rug.nl/elephant/

Deep Learning POS for UD Depling 2016 challenge requires a POS tagger for any of the Universal Dependency TreeBanks at: http://universaldependencies.org Choose one language Build a DL POS using CNN, for example a LSTM that uses word embeddings and possible charcater embeddings.

Convolutional Networks for Sentiment Analysis SemEval Annotated Data: http://alt.qcri.org/semeval2017/task4/index.php?id=results Unannotated Data: 50 million tweets (ask Attardi) Code: DeepNL, https://github.com/attardi/deepnl Article: A. Severyn, A. Moschitti.UNITN: Training Deep Convolutional Neural Network for Twitter Sentiment Classification

POS tagging using Word Embeddings Data: https://github.com/UniversalDependencies/UD_Italian-PoSTWITA Embeddings: http://tanl.di.unipi.it/embeddings/ Article: Stratos, M. Collins. Simple Semi-Supervised POS Tagging. http://www.cs.columbia.edu/~stratos/research/naacl15semipos.pdf

Contextual Embeddings See paper: G. Attardi. Representation of Word Sentiment, Idioms and Senses. Proc. of 6th Italian Information Retrieval Workshop (IIR 2015). Cagliari, 2015. Use text from Wikipedia Extractor: https://github.com/attardi/wikiextractor

Entity Recognition and Linking See task description and data at: http://neel-it.github.io

Question Answering from FAQ See task description at: http://qa4faq.github.io

Context aware Spell Checker Words do not appear in isolation, they always have a context. You have to implement a spell checker that given a sentence with one misspelled word identifies such word and suggests the correction. Bonus: use of POS tagging; correct more than one error.

Wikipedia Related Pages Download the Wikipedia dump of a language of preference Use extractor: https://github.com/attardi/wikiextractor Implement and compare a "related pages" function based on: traditional bag-of-word representations doc2vec representations.

Negation/Speculation Extraction Determine the scope of negative or speculative statements: The lyso-platelet had no effect MnlI-AluI could suppress the basal-level activity Approach: Classifier for identifying cues Classifier to determine scope Data BioScope dataset: http://rgai.inf.u-szeged.hu/index.php?lang=en&page=bioscope

Corpus of Product Reviews Download reviews from online shops Classify as positive/negative according to number of stars Train classifier to assign score Bonus: use of a neural network for the classifier; use of pretrained embeddings.

Relation Extraction Exploit word embeddings as features + extra hand-coded features Use the Factor Based Compositional Embedding Model (FCM) http://www.cs.jhu.edu/~mrg/publications/finere-naacl-2015.pdf SemEval 2014 Relation Extraction data

Entity Linking with Embeddings Experiment with technique: R. Blanco, G. Ottaviano, E. Meiji. 2014. Fast and Space-Efficient Entity Linking in Queries. labs.yahoo.com/_c/uploads/WSDM-2015-blanco.pdf

Extraction of Semantic Hierarchies Use word embeddings as measure of semantic distance Use Wikipedia as source of text http://ir.hit.edu.cn/~jguo/papers/acl2014-hypernym.pdf Organism Plant Ranuncolacee Aconitum

Evalita 2018 Tasks from Evalita 2018: Also check previous editions Aspect-based sentiment analysis Emoji prediction Hate speech detection Irony Detection in Twitter Automatic Misogyny Identification … Also check previous editions

Deep Learning Applications Entity Hierarchy Embeddings http://www.cs.cmu.edu/~zhitingh/data/acl15entity.pdf Character RNNs for text generation http://karpathy.github.io/2015/05/21/rnn-effec8veness/ Morphology Better Word Representations with Recursive Neural Networks for Morphology – Luong et al. Polysemous words Improving Word Representa8ons Via Global Context And Multiple Word Prototypes by Huang et al. 2012 Natural language Inference (Logic) Question Answering Automatic Image captioning