Statistical Learning Methods for Natural Language Processing on the Internet 徐丹云.


1 Statistical Learning Methods for Natural Language Processing on the Internet
徐丹云

2 CCF ADL 52 ~200

3 Wikification and Beyond: The Challenges of Entity and Concept Grounding
Heng Ji, Rensselaer Polytechnic Institute
Motivation and Task Definition, A Skeletal View of Wikification Systems
Key Challenges and Recent Advances, New Tasks, Trends and Applications

4 Wikification and Beyond: The Challenges of Entity and Concept Grounding
Heng Ji, Rensselaer Polytechnic Institute
Input: a text document d. Output: a set of pairs (mi, ti).
Mention identification: identify the mentions mi in d.
Local inference, for each mi in d: (1) identify a set of relevant titles T(mi); (2) rank the titles ti.
Global inference, for each document d: (1) consider all mi and all ti; (2) re-rank the titles ti.
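The three stages on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system described in the tutorial: the candidate table `CANDIDATES`, the relatedness table `RELATED`, the scores in them, and the function names are all hypothetical, and global inference is done here by brute-force search over joint assignments.

```python
from itertools import product

# Hypothetical toy data: candidate titles T(m) with a prior score per mention.
CANDIDATES = {
    "Jordan": {"Michael Jordan": 0.6, "Michael I. Jordan": 0.3, "Jordan (country)": 0.1},
    "Berkeley": {"University of California, Berkeley": 0.7, "Berkeley, California": 0.3},
}
# Hypothetical pairwise relatedness between titles (symmetric, default 0).
RELATED = {
    frozenset(["Michael I. Jordan", "University of California, Berkeley"]): 1.0,
}

def identify_mentions(doc):
    """Mention identification: find known surface forms in the document."""
    return [m for m in CANDIDATES if m in doc]

def local_inference(mention):
    """Local inference: rank the candidate titles of one mention by prior score."""
    return sorted(CANDIDATES[mention].items(), key=lambda kv: -kv[1])

def global_inference(mentions, weight=1.0):
    """Global inference: choose the joint assignment that maximizes the sum of
    local scores plus pairwise relatedness between the chosen titles."""
    best, best_score = (), float("-inf")
    for titles in product(*(CANDIDATES[m] for m in mentions)):
        score = sum(CANDIDATES[m][t] for m, t in zip(mentions, titles))
        for i in range(len(titles)):
            for j in range(i + 1, len(titles)):
                score += weight * RELATED.get(frozenset([titles[i], titles[j]]), 0.0)
        if score > best_score:
            best, best_score = titles, score
    return list(zip(mentions, best))

mentions = identify_mentions("Jordan teaches at Berkeley")
local_top = local_inference("Jordan")[0][0]
pairs = global_inference(mentions)
```

In this toy setup local inference alone prefers the more popular title for "Jordan", while global inference switches to the title that is coherent with the other mention in the document.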

5 Wikification and Beyond: The Challenges of Entity and Concept Grounding
Heng Ji, Rensselaer Polytechnic Institute
Improving Wikification by Acquiring Rich Knowledge
Better Meaning Representation
Collaborative Title Collection
Global Inference Using the Additional Knowledge
Joint Mention Extraction and Linking
Collective Inference
Matches of Knowledge Graphs

6 Wikification and Beyond: The Challenges of Entity and Concept Grounding
Heng Ji, Rensselaer Polytechnic Institute
Abstract Meaning Representation example: "I don’t think Republican candidates like Romney, Newt, and Johnson have a real chance for the election."
[Figure: AMR graph with nodes candidate, and, Person:Romney, Person:Newt, Person:Johnson, Political-party:Republican]

7 Big Learning with Bayesian Methods
Jun Zhu, Tsinghua University
Basics, Big Learning Challenges, and Regularized Bayesian Inference
Online Learning, Large-scale Topic Graph Learning and Visualization

8 Big Learning with Bayesian Methods
Jun Zhu, Tsinghua University
Concept of Bayes' Rule
Approximate Bayesian Inference
Markov chain Monte Carlo (MCMC) methods
Examples: A Bayesian Ranking Model, Latent Dirichlet Allocation
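As a reminder of what Bayes' rule computes, here is a minimal sketch over a discrete set of hypotheses: the posterior is the prior times the likelihood, divided by the evidence. The coin example and its numbers are hypothetical and are not taken from the lecture.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Bayes' rule over discrete hypotheses.

    prior[h] is P(h) and likelihood[h] is P(data | h);
    the result maps each h to P(h | data).
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(joint.values())  # P(data), the normalizing constant
    return {h: p / evidence for h, p in joint.items()}

# Hypothetical example: a coin is either fair or biased towards heads,
# and we observe a single head.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood = {"fair": Fraction(1, 2), "biased": Fraction(9, 10)}
post = posterior(prior, likelihood)
```

Exact fractions are used so the normalization is easy to check by hand. When the hypothesis space is too large to sum over, the evidence term becomes intractable, which is where approximate inference such as MCMC comes in.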

9 Big Learning with Bayesian Methods
Jun Zhu, Tsinghua University
Computationally efficient Bayesian models are becoming increasingly relevant in the big data era.
RegBayes: bridges Bayesian methods, learning and optimization; offers extra freedom to incorporate rich side information.
Many scalable algorithms have been developed: online/stochastic algorithms (e.g., online BayesPA); distributed inference algorithms (e.g., scalable CTM).

10 Sentiment Analysis: Mining Opinions, Sentiments and Emotions
Bing Liu, University of Illinois at Chicago
Sentiment Analysis Essentials
Advanced Topics

11 Sentiment Analysis: Mining Opinions, Sentiments and Emotions
Bing Liu, University of Illinois at Chicago
Definition: Opinion(entity, aspect, sentiment, holder, time)
Analysis levels: document, sentence, entity, aspect
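The opinion quintuple maps naturally onto a small record type. The sketch below is only an illustration: the integer sentiment encoding, the `summarize` helper, and the sample opinions are assumptions, not part of the lecture.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Opinion:
    entity: str
    aspect: str
    sentiment: int  # assumed encoding: +1 positive, -1 negative, 0 neutral
    holder: str
    time: str

def summarize(opinions):
    """Aggregate the sentiment per (entity, aspect) pair."""
    totals = defaultdict(int)
    for o in opinions:
        totals[(o.entity, o.aspect)] += o.sentiment
    return dict(totals)

# Hypothetical opinions extracted from three reviews.
opinions = [
    Opinion("phone", "picture quality", +1, "alice", "2014-10-01"),
    Opinion("phone", "picture quality", +1, "bob", "2014-10-02"),
    Opinion("phone", "price", -1, "bob", "2014-10-02"),
]
summary = summarize(opinions)
```

Keeping holder and time in the record is what allows later aggregation by person or by period, in addition to the per-aspect summary shown here.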

12 Sentiment Analysis: Mining Opinions, Sentiments and Emotions
Bing Liu, University of Illinois at Chicago
Aspect extraction: finding frequent nouns and noun phrases; exploiting opinion and target relations; supervised learning; topic modeling
Aspect sentiment classification: lexicon-based approach
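A lexicon-based aspect sentiment classifier can be sketched as follows. This is a simplified illustration under several assumptions: the tiny lexicon and negation list are hypothetical, aspects are single tokens, and each sentiment word is weighted by the inverse of its token distance to the aspect, which is one common aggregation choice rather than the exact method from the lecture.

```python
import re

# Hypothetical toy lexicon; a real system would use a much larger one.
LEXICON = {"great": 1, "good": 1, "cheap": 1, "poor": -1, "bad": -1, "expensive": -1}
NEGATIONS = {"not", "never", "no"}

def aspect_sentiment(sentence, aspect):
    """Score a single-token aspect by summing sentiment words in the sentence,
    each weighted by 1 / (token distance to the aspect)."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if aspect not in tokens:
        return 0.0
    pos = tokens.index(aspect)
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok in LEXICON and i != pos:
            polarity = LEXICON[tok]
            if i > 0 and tokens[i - 1] in NEGATIONS:
                polarity = -polarity  # simple negation flip
            score += polarity / abs(i - pos)
    return score

sentence = "The battery is great but the screen is bad"
battery = aspect_sentiment(sentence, "battery")
screen = aspect_sentiment(sentence, "screen")
```

A positive score is read as positive sentiment towards the aspect and a negative score as negative; the distance weighting lets one sentence carry different sentiments for different aspects.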

13 Sentiment Analysis: Mining Opinions, Sentiments and Emotions
Bing Liu, University of Illinois at Chicago
Advanced Topics
Explicit and implicit aspects: “The picture quality of this phone is great” vs. “This car is so expensive”
Resource usage aspect and sentiment: “This washer uses a lot of water”
Coreference resolution: “This phone’s sound is great. It is cheap too.”

14 Semantic Matching in Search
Jun Xu, Chinese Academy of Sciences
Semantic Matching between Query and Document
Approaches to Semantic Matching in Search

15 Semantic Matching in Search
Jun Xu, Chinese Academy of Sciences
Semantic Matching between Query and Document
Query | Document | Term match | Semantic match
Seattle best hotel | Seattle best hotels | Partial | Yes
Pool schedule | Swimming pool schedule | Partial | Yes
Natural logarithm transform | Logarithm transform | Partial | Yes
China kong | China hong kong | Partial | No
Why are windows so expensive | Why are macs so expensive | Partial | No

16 Semantic Matching in Search
Jun Xu, Chinese Academy of Sciences
Aspects of Semantic Matching
Term: NY -> NY
Phrase: hot dog -> hot dog
Sense: utube -> youtube
Topic: Microsoft Office -> Microsoft, PowerPoint, Word, Excel…
Structure: how far is sun from earth -> distance between sun and earth

17 Semantic Matching in Search
Jun Xu, Chinese Academy of Sciences
Query: michael jordan berkele
Term: michael jordan berkeley
Phrase: Michael Jordan berkeley
Sense: michael i. jordan
Topic: machine learning, berkeley
Structure: michael jordan

18 Semantic Matching in Search
Jun Xu, Chinese Academy of Sciences
Document: Homepage of Michael Jordan
Phrase: Michael Jordan, Berkeley, professor
Topic: machine learning, berkeley
Structure: michael jordan

19 Semantic Matching in Search
Jun Xu, Chinese Academy of Sciences
Approaches to Semantic Matching in Search
Matching by Query Reformulation
Matching with Term Dependency Model
Matching with Translation Model
Matching with Topic Model
Matching with Latent Space Model

20 From Simple Search to Search Intelligence: The Evolution of Search Engines
Jianyun Nie, University of Montreal
Traditional IR Models, Query and Document Expansion
Advanced Methods of Intelligent IR, Mining Relations in Documents and Query logs, Mining Search Intents

21 From Simple Search to Search Intelligence: The Evolution of Search Engines
Jianyun Nie, University of Montreal
Traditional IR Models, Query and Document Expansion
Indexing: stopwords, stemming
Retrieval: Boolean model, vector space model, probabilistic model
[Figure: the document and the query are each indexed into a keyword representation; retrieval matches the two representations]
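The indexing and retrieval pipeline on this slide can be illustrated with a small vector space model. The sketch below is an assumption-laden toy: the stopword list is hypothetical, stemming is omitted, and the weighting is a plain TF-IDF with cosine similarity, which is one standard instance of the vector space model rather than the specific variant covered in the lecture.

```python
import math
from collections import Counter

STOPWORDS = {"the", "a", "of", "in", "is"}  # hypothetical, tiny stopword list

def index(text):
    """Indexing: lowercase, drop stopwords, count the remaining keywords."""
    return Counter(t for t in text.lower().split() if t not in STOPWORDS)

def rank(query, docs):
    """Retrieval: return document indices ordered by TF-IDF cosine similarity."""
    doc_tfs = [index(d) for d in docs]
    df = Counter(t for tf in doc_tfs for t in tf)  # document frequency per term
    n = len(docs)

    def weigh(tf):
        # Terms that occur in no document carry no weight.
        return {t: c * (math.log(n / df[t]) + 1.0) for t, c in tf.items() if df[t] > 0}

    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
            sum(w * w for w in v.values())
        )
        return dot / norm if norm else 0.0

    q = weigh(index(query))
    scores = [(cosine(q, weigh(tf)), i) for i, tf in enumerate(doc_tfs)]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]

docs = ["swimming pool schedule", "seattle best hotels", "the history of seattle"]
ranking = rank("seattle hotels", docs)
```

Because matching is purely on shared keywords, a query such as "seattle hotel" would miss "hotels" without stemming, which is the kind of term mismatch that query and document expansion are meant to address.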

22 From Simple Search to Search Intelligence: The Evolution of Search Engines
Jianyun Nie, University of Montreal
Advanced Methods of Intelligent IR, Mining Relations in Documents and Query logs, Mining Search Intents
Using Language Model in IR
Query Expansion
Inference
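One common way to use a language model in IR is query likelihood scoring: each document is ranked by the probability that its language model generates the query. The sketch below assumes Jelinek-Mercer smoothing with a hypothetical mixing weight `lam`; the slide does not say which smoothing method the lecture used.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, collection, lam=0.5):
    """log P(query | doc) under a unigram model, smoothed against the
    collection model: p = (1 - lam) * P(t | doc) + lam * P(t | collection)."""
    d, c = Counter(doc), Counter(collection)
    d_len, c_len = len(doc), len(collection)
    score = 0.0
    for term in query:
        p_doc = d[term] / d_len if d_len else 0.0
        p_col = c[term] / c_len if c_len else 0.0
        p = (1 - lam) * p_doc + lam * p_col
        if p == 0.0:
            return float("-inf")  # term unseen in the whole collection
        score += math.log(p)
    return score

# Hypothetical two-document collection, already tokenized.
d1 = ["seattle", "best", "hotels"]
d2 = ["swimming", "pool", "schedule"]
collection = d1 + d2
query = ["pool", "schedule"]
s1 = query_log_likelihood(query, d1, collection)
s2 = query_log_likelihood(query, d2, collection)
```

Smoothing keeps a document from scoring zero just because it lacks one query term, so `d1` still receives a finite, lower score than `d2`.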

23 Machine Learning for Search Ranking and Ad Auction
Tieyan Liu, Microsoft Research
Machine learning for Web Search
Machine learning for computational advertising

24 Machine Learning for Search Ranking and Ad Auction
Tieyan Liu, Microsoft Research
Machine learning for Web Search
Learning to rank methods: regression, classification, pairwise regression, listwise ranking
Generalization theory
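To make the pairwise idea concrete, here is a minimal sketch that learns a linear scoring function from preference pairs. It assumes a logistic loss on score differences and plain gradient descent; the feature vectors, learning rate, and function names are hypothetical, and this is not presented as the specific pairwise method from the lecture.

```python
import math

def train_pairwise(pairs, n_features, lr=0.1, epochs=200):
    """Fit weights w so that w.x_better > w.x_worse for each pair, by gradient
    descent on the logistic loss log(1 + exp(-w.(x_better - x_worse)))."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            grad_scale = -1.0 / (1.0 + math.exp(margin))  # d loss / d margin
            w = [wi - lr * grad_scale * di for wi, di in zip(w, diff)]
    return w

def score(w, x):
    """Linear ranking score of one document's feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

# Hypothetical pairs: (features of the preferred document, features of the other).
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.8, 0.2], [0.3, 0.9]),
]
w = train_pairwise(pairs, n_features=2)
```

The model never sees absolute relevance labels, only which of two documents should rank higher, which is what separates pairwise methods from the regression and classification formulations listed on the slide.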

25 Machine Learning for Search Ranking and Ad Auction
Tieyan Liu, Microsoft Research
Machine learning for computational advertising
User Click Behavior Modeling
Advertiser Bidding Behavior Modeling
Auction Mechanism Optimization

26 Thank You!

