Published by Gwendoline Johnson. Modified over 9 years ago.
1
NEVER-ENDING LANGUAGE LEARNER
Students: Nguyễn Hữu Thành, Phạm Xuân Khoái, Vũ Mạnh Cầm
Instructor: Lê Hồng Phương, PhD
Hà Nội, January 11, 2014
2
Idea: build a structured knowledge base (KB).
What is a KB?
- Categories: cities, companies, sports teams, ...
- Relations: hasOfficeIn(organisation, location)
- Noun phrases
What is a structured KB?
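To make the idea concrete, here is a minimal sketch of how categories and relations over noun phrases could be stored. The class name `KnowledgeBase` and its methods are assumptions made for illustration; they are not NELL's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class KnowledgeBase:
    """Hypothetical container: categories hold noun phrases, relations hold pairs."""
    # category name -> noun phrases believed to belong to it
    categories: Dict[str, Set[str]] = field(default_factory=dict)
    # relation name -> (subject, object) noun-phrase pairs
    relations: Dict[str, Set[Tuple[str, str]]] = field(default_factory=dict)

    def add_category_instance(self, category: str, noun_phrase: str) -> None:
        self.categories.setdefault(category, set()).add(noun_phrase)

    def add_relation_instance(self, relation: str, subj: str, obj: str) -> None:
        self.relations.setdefault(relation, set()).add((subj, obj))

    def is_a(self, noun_phrase: str, category: str) -> bool:
        return noun_phrase in self.categories.get(category, set())


kb = KnowledgeBase()
kb.add_category_instance("city", "Ha Noi")
kb.add_category_instance("company", "Samsung")
kb.add_relation_instance("hasOfficeIn", "Samsung", "Ha Noi")  # illustrative fact only
```

Storing instances in sets keeps membership checks cheap, which matters because every component repeatedly asks whether a noun phrase is already believed.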
3
Idea: Structured Knowledge Base
[Figure: a fragment of a knowledge base drawn as a graph. Entities such as Toronto, Maple Leafs, NHL, Stanley Cup, Red Wings, Detroit, Toyota and Prius are linked by relations such as plays in league, home town, team stadium, competes with, acquired, created and uses equipment.]
4
Ideas: using Machine Learning
Machine learning: a branch of artificial intelligence concerned with the construction and study of systems that can learn from data.
5
Ideas
[Diagram: NELL draws on the initial ontology, seed examples, the Web, and human trainers to build the Knowledge Base (KB).]
6
Ideas: the task
Run 24x7, forever. Each day:
1. Reading task: extract more facts from the Web to populate the initial ontology.
2. Learning task: learn to read (perform #1) better than yesterday.
7
NELL Architecture
[Diagram: the Knowledge Base (beliefs and candidate facts), the Knowledge Integrator, the data resources, and the subsystem components CPL, CSEAL, CMC and RL.]
8
Coupled Pattern Learner (CPL)
- Learns to extract category and relation instances and patterns from unstructured text.
- Learns contextual patterns that act as high-precision extractors for each predicate.
- E.g.:
+ "Trang An is the name of a girl."
+ "Trang An is the name of a company."
Use such context to improve precision.
9
Input/Output
- Input:
+ A large text corpus
+ An initial ontology defining the predicates, with seed examples
- Output:
+ Proposed instances and contextual patterns for each predicate.
10
Input: An ontology O, and a text corpus C
Output: Trusted instances/patterns for each predicate
for i = 1, 2, ..., ∞ do
    foreach predicate p in O do
        EXTRACT candidate instances/contextual patterns using recently promoted patterns/instances;
        FILTER candidates that violate coupling;
        RANK candidate instances/patterns;
        PROMOTE top candidates;
    end
end
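The loop above can be sketched in Python on a toy corpus. This is a simplified illustration under stated assumptions, not the published CPL implementation: the corpus is already reduced to (noun phrase, context) pairs, the context strings such as "mayor of _" are made up, the infinite loop is bounded by `iterations`, and the coupling filter is a simple majority rule (a candidate survives only if it has strictly more evidence for the predicate than for its mutually exclusive rivals).

```python
from collections import Counter

# Toy corpus of (noun phrase, context) pairs; "_" marks the noun-phrase slot.
# The contexts are hypothetical and only serve the illustration.
CORPUS = [
    ("Ha Noi", "mayor of _"), ("Da Nang", "mayor of _"),
    ("Ha Noi", "flights to _"), ("Da Nang", "flights to _"),
    ("Ho Chi Minh", "flights to _"),
    ("Son Ha", "shares of _"), ("Kinh Do", "shares of _"),
    ("Samsung", "shares of _"),
    ("Kinh Do", "CEO of _"), ("Samsung", "CEO of _"),
    ("Samsung", "flights to _"),  # noisy co-occurrence
]
MUTEX = {"city": {"company"}, "company": {"city"}}


def promote(candidates, rival_counts, already, top_k):
    """FILTER by coupling, RANK by count, and return the top_k survivors."""
    kept = [(c, n) for c, n in candidates.items()
            if c not in already and n > rival_counts[c]]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return {c for c, _ in kept[:top_k]}


def cpl(seeds, corpus, mutex, iterations=4, top_k=2):
    instances = {p: set(s) for p, s in seeds.items()}
    patterns = {p: set() for p in seeds}
    for _ in range(iterations):  # bounded stand-in for "forever"
        for p in instances:
            rivals = mutex.get(p, set())
            rival_inst = set().union(*(instances[q] for q in rivals))
            rival_pat = set().union(*(patterns[q] for q in rivals))
            # EXTRACT: patterns seen with trusted instances, and vice versa
            cand_pat = Counter(ctx for np, ctx in corpus if np in instances[p])
            cand_inst = Counter(np for np, ctx in corpus if ctx in patterns[p])
            # Evidence that the same candidates belong to a rival predicate
            rival_pat_counts = Counter(ctx for np, ctx in corpus if np in rival_inst)
            rival_inst_counts = Counter(np for np, ctx in corpus if ctx in rival_pat)
            # FILTER, RANK, PROMOTE
            new_pat = promote(cand_pat, rival_pat_counts, patterns[p], top_k)
            new_inst = promote(cand_inst, rival_inst_counts, instances[p], top_k)
            patterns[p] |= new_pat
            instances[p] |= new_inst
    return instances, patterns


instances, patterns = cpl({"city": {"Ha Noi"}, "company": {"Son Ha"}}, CORPUS, MUTEX)
```

In this toy run "Samsung" appears once in a city-like context, but it is only promoted to company once the company patterns give it more evidence than the city patterns do.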
11
Example: "Samsung has just released a clip mocking Nokia's new product."
City: Ha Noi, Ho Chi Minh, Da Nang, ...
Company: Son Ha, Kinh Do, ...
competesWith: (AMD, Intel), (Google, Microsoft), (Samsung, Nokia), ...
12
Coupled SEAL
[Diagram: CSEAL takes beliefs, queries the Internet, and produces new candidate facts.]
13
Coupled SEAL
- SEAL (Set Expander for Any Language): expands sets of entities automatically by using resources from the Web.
- CSEAL adds mutual-exclusion and type-checking constraints.
14
Coupled SEAL: a semi-structured extractor
- Queries the Internet with sets of beliefs from each category or relation; mines lists and tables for instances.
- Uses mutual-exclusion relationships to provide negative examples for filtering overly general lists and tables.
- Issues 5 queries per category and 10 queries per relation, and fetches 50 web pages per query.
- Probabilities are assigned as in CPL.
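The filtering step can be illustrated without any network access. The sketch below assumes the lists have already been mined from web pages; the function name `expand_from_lists`, the `min_overlap` threshold and the sample lists are hypothetical and not part of SEAL or CSEAL.

```python
def expand_from_lists(mined_lists, positives, negatives, min_overlap=2):
    """Propose new instances from mined lists, using negatives to reject bad lists.

    positives: current beliefs for the target category.
    negatives: beliefs for mutually exclusive categories.
    """
    candidates = set()
    for items in mined_lists:
        items = set(items)
        if len(items & positives) < min_overlap:
            continue  # too little evidence that the list is about this category
        if items & negatives:
            continue  # overly general list: it mixes in a rival category
        candidates |= items - positives
    return candidates


city_beliefs = {"Ha Noi", "Da Nang"}
company_beliefs = {"Son Ha", "Kinh Do"}
mined = [
    ["Ha Noi", "Da Nang", "Ho Chi Minh"],         # clean list of cities
    ["Ha Noi", "Da Nang", "Kinh Do", "Samsung"],  # overly general list
    ["Ha Noi", "Nokia"],                          # only one known city
]
new_cities = expand_from_lists(mined, city_beliefs, company_beliefs)
```

Here only "Ho Chi Minh" is proposed: the second list is rejected because it contains a known company, and the third because it overlaps with just one known city.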
15
Coupled SEAL Example:
16
Coupled Morphological Classifier
[Diagram: CMC reads the KB and data resources and produces new candidate facts.]
CMC classifies noun phrases (NPs) based on various morphological features (words, capitalization, affixes).
17
Coupled Morphological Classifier
Ex1: Bach Mai hotel → hotel(Bach Mai)
Ex2: Mai → person(Mai)
Ex3: tradition → noun(tradition)
18
Coupled Morphological Classifier
- Beliefs from the KB are used as training instances.
- CMC examines candidate facts proposed by other components and classifies up to 30 new beliefs per predicate.
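A rough sketch of classifying a noun phrase from words, capitalization and affixes, trained on beliefs, might look as follows. The scoring here is a plain feature-count vote chosen for brevity; it is a stand-in and not the classifier CMC actually uses. The feature names and the noun phrases "Thang Long hotel" and "Kim Lien hotel" are assumptions for illustration.

```python
from collections import Counter, defaultdict


def features(np_text):
    """Morphological features of a noun phrase: words, affixes, capitalization."""
    words = np_text.split()
    feats = set()
    for w in words:
        feats.add("word=" + w.lower())
        feats.add("prefix=" + w[:3].lower())
        feats.add("suffix=" + w[-3:].lower())
    feats.add("capitalized=" + str(all(w[0].isupper() for w in words)))
    feats.add("num_words=" + str(len(words)))
    return feats


def train(beliefs):
    """Count how often each feature occurs among the beliefs of each category."""
    model = defaultdict(Counter)
    for category, noun_phrases in beliefs.items():
        for np_text in noun_phrases:
            model[category].update(features(np_text))
    return model


def classify(model, np_text):
    """Return the category whose training features overlap most with np_text."""
    feats = features(np_text)
    scores = {cat: sum(counts[f] for f in feats) for cat, counts in model.items()}
    return max(scores, key=scores.get)


beliefs = {
    "hotel": {"Bach Mai hotel", "Thang Long hotel"},
    "person": {"Mai", "Thanh"},
}
model = train(beliefs)
label_1 = classify(model, "Kim Lien hotel")
label_2 = classify(model, "Khoai")
```

The first phrase is labelled as a hotel because of the shared word and affix features, the second as a person because it is a single capitalized word like the training names.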
19
Rule Learner
[Diagram: RL reads candidate facts and beliefs and produces new candidate facts.]
RL takes the categories and relations in the KB as input and infers new relation instances for the KB.
20
Rule Learner
Example 1: playSport(Rooney, football) → athlete(Rooney), sport(football)
Example 2: isCapital(Hanoi, Vietnam), liveIn(Thanh, Hanoi), roommate(Thanh, Khoai), roommate(Khoai, Cam) → liveIn(Thanh, Vietnam), roommate(Thanh, Cam), liveIn(Khoai, Hanoi), ...
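The inferences in these examples can be reproduced with a small forward-chaining sketch. Note the limits of the illustration: the rules below are written by hand to match the examples, whereas RL learns its rules from the KB, and the helper names `match`, `solutions` and `forward_chain` are hypothetical.

```python
def match(atom, fact, binding):
    """Unify one rule atom with one fact; return the extended binding or None."""
    if len(atom) != len(fact) or atom[0] != fact[0]:
        return None
    binding = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):  # variable
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:       # constant
            return None
    return binding


def solutions(body, facts, binding=None):
    """Yield every variable binding that satisfies all atoms in the rule body."""
    binding = binding or {}
    if not body:
        yield binding
        return
    for fact in facts:
        extended = match(body[0], fact, binding)
        if extended is not None:
            yield from solutions(body[1:], facts, extended)


def forward_chain(facts, rules):
    """Apply the rules until no new fact can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, heads in rules:
            for b in list(solutions(body, facts)):
                for head in heads:  # head terms are all variables here
                    new_fact = (head[0],) + tuple(b[t] for t in head[1:])
                    if new_fact not in facts:
                        facts.add(new_fact)
                        changed = True
    return facts


RULES = [
    ([("playSport", "?x", "?s")], [("athlete", "?x"), ("sport", "?s")]),
    ([("isCapital", "?c", "?n"), ("liveIn", "?p", "?c")], [("liveIn", "?p", "?n")]),
    ([("roommate", "?a", "?b"), ("roommate", "?b", "?c")], [("roommate", "?a", "?c")]),
    ([("roommate", "?a", "?b"), ("liveIn", "?a", "?c")], [("liveIn", "?b", "?c")]),
]
FACTS = {
    ("playSport", "Rooney", "football"),
    ("isCapital", "Hanoi", "Vietnam"),
    ("liveIn", "Thanh", "Hanoi"),
    ("roommate", "Thanh", "Khoai"),
    ("roommate", "Khoai", "Cam"),
}
derived = forward_chain(FACTS, RULES)
```

Running to a fixed point also derives facts beyond the ones listed on the slide, for example that Cam lives in Hanoi, which is the kind of chained conclusion the trailing "..." hints at.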
21
Rule Learner
Some rule-learning systems: OneR, Ridor, PART, JRip, ConjunctiveRule.
Clip: https://www.youtube.com/watch?v=5On-tDeu2ic
22
Initial result
Running 24x7 since January 12, 2010.
Inputs:
- an ontology defining >600 categories and relations
- 10-20 seed examples of each
- 100,000 web search queries per day
- ~5 minutes/day of human guidance
Result:
- a KB with >15 million candidate beliefs, growing daily
- learning to reason, as well as read
- automatically extending its ontology
23
Initial result
Demo: http://rtw.ml.cmu.edu/rtw/kbbrowser/beverage:beer
24
References
- NELL article: http://www.cs.cmu.edu/~acarlson/papers/carlson-aaai10.pdf
- http://rtw.ml.cmu.edu/rtw/kbbrowser/beverage:beer
- http://videolectures.net/akbcwekex2012_mitchell_language_learning/
- Tom Mitchell's seminar: http://www.youtube.com/watch?v=51q2IajH94A
- RL: http://mydatamining.wordpress.com/2008/04/14/rule-learner-or-rule-induction/