Commonsense Knowledge Acquisition and Applications


1 Commonsense Knowledge Acquisition and Applications
Towards Commonsense Enriched Machines
Niket Tandon
Ph.D. Supervisor: Gerhard Weikum
Max Planck Institute for Informatics

2 Humans understand commonsense of the environment
Scene: climbing a rock (adventurous activity)
Rock: property brown, hard
Hand, leg: part of Person
Climber: is a Person

3 Human-Machine Knowledge Gap
Humans: Rock has property brown, hard; hand, leg are part of Person; Climber is a Person; scene: climbing a rock (adventurous activity)
Machines: 1 Rock, 2 Hands, 2 Legs, 1 Person

4 Human-Machine Knowledge Gap
Humans:
Commonsense of objects: Rock has property brown, hard
Commonsense of relationships: hand, leg are part of Person; Climber is a Person
Commonsense of interactions: climbing a rock scene (adventurous activity)
Machines: objects only: 1 Rock, 2 Hands, 2 Legs, 1 Person

5 How will machines be smarter if we fill this knowledge gap?
Smarter robots: "Get me a coffee" (where?)
Smarter vision: better classifiers, e.g. monitor or TV? (given mouse, keyboard)
Smarter IR: "adventurous activities"

6 Encyclopedic Knowledge
Can we fill the human-machine knowledge gap using existing encyclopedic KBs like Freebase?
Encyclopedic knowledge: facts about instances/events
Facts about instances: A. Honnold, married, Lisa Honnold
Their events: A. Honnold, married on,
Commonsense knowledge: facts about classes/activities

7 Encyclopedic Knowledge vs. Commonsense Knowledge
Encyclopedic knowledge: facts about instances
1. EKB acquisition: unimodal
2. EKB curation: textual verification
3. EKB completion: negative training assumptions hold. If (ei, rk, ej) holds, then (ei, rk, ej') with ej' != ej is negative, e.g. (A. Honnold, bornIn, US) makes (A. Honnold, bornIn, UK) a negative.
Commonsense knowledge: facts about classes
1. CKB acquisition: multimodal
2. CKB curation: textual + visual
3. CKB completion: negative training assumptions fail, e.g. climber, at location, {mountain, university}

8 Encyclopedic Knowledge vs. Commonsense Knowledge
Encyclopedic knowledge: facts about instances
1. EKB acquisition: unimodal
2. EKB curation: textual verification
3. EKB completion: negative training assumptions hold. If (ei, rk, ej) holds, then (ei, rk, ej') with ej' != ej is negative, e.g. (A. Honnold, bornIn, US) makes (A. Honnold, bornIn, UK) a negative.
Commonsense knowledge: facts about classes
1. CKB acquisition: multimodal
2. CKB curation: textual + visual
3. CKB completion: negative training assumptions fail
EKBs have several functional relations, hence the assumption holds. Classes generalize properties of instances.
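A minimal sketch of the negative training assumption described above, assuming a hypothetical helper `negatives_for_functional` and a toy candidate list. It shows why the assumption is safe for a functional relation such as bornIn and why it misfires for a class-level relation such as "at location".

```python
def negatives_for_functional(fact, candidate_objects):
    """Treat every other candidate object as a negative example.

    This is only valid if the relation is functional (one object per subject).
    """
    subject, relation, obj = fact
    return [(subject, relation, other) for other in candidate_objects if other != obj]


# Encyclopedic fact: bornIn is functional, so the generated negative is correct.
ekb_negatives = negatives_for_functional(("A. Honnold", "bornIn", "US"), ["US", "UK"])

# Commonsense fact: a climber can be at a mountain AND at a university,
# so the same procedure produces a wrong negative.
ckb_negatives = negatives_for_functional(
    ("climber", "atLocation", "mountain"), ["mountain", "university"]
)
```

Here `ckb_negatives` contains `("climber", "atLocation", "university")`, which the slide lists as a valid location, so it should not be used as a negative training example.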

9 Commonsense knowledge acquisition is different and harder
Humans hardly express the obvious: scarce and implicit
Spread across multiple modalities: multimodal
The unusual is reported more than the usual: reporting bias
Culture-specific, location-specific: contextual

10 KBs possessing commonsense knowledge
Cyc: supervision: manually curated; pros: accuracy; cons: cost, coverage
ConceptNet: supervision: semi-automated; pros: coverage; cons: less organized
Tandon et al. AAAI’11: supervision: bootstrapped using ConceptNet; cons: noise
Desiderata: minimal supervision; organized, high accuracy (> 80%), high coverage (> 10M)
Need: automatically constructed, semantically organized commonsense KB

11 Need: robust techniques to automatically construct semantically organized Commonsense KB

12 Three research questions: Investigate robust techniques to acquire:
RQ 1. Commonsense of objects in the environment: fine-grained, semantically refined properties.

13 Three research questions: Investigate robust techniques to acquire:
RQ 2. Commonsense of relationships between objects: part-whole relation, comparative relation, …

14 Three research questions: Investigate robust techniques to acquire:
RQ 3. Commonsense of interactions between objects: activities and their semantic attributes.

15 Three research questions: Investigate robust techniques to acquire:

16 Three research questions: Investigate robust techniques to acquire:
RQ.1 RQ.2 RQ.3

17 Research question 1
RQ 1. Commonsense of objects in the environment: fine-grained, semantically refined properties.
Previous work:
- lumps these properties together
- does not distinguish the meanings of the words
- has low coverage

18 Input: large text corpus containing e.g. "summit is crisp"
Output: triples $\langle w1_n^s, r, w2_a^s \rangle$, e.g. $\langle \mathit{summit}_n^2, \mathit{hasTemperature}, \mathit{crisp}_a^3 \rangle$

19 Input: large text corpus containing e.g. "summit is crisp"
Output: triples $\langle w1_n^s, r, w2_a^s \rangle$, e.g. $\langle \mathit{summit}_n^2, \mathit{hasTemperature}, \mathit{crisp}_a^3 \rangle$
1.) disambiguated noun (subscript n, superscript = sense number)
2.) fine-grained relations $r \in R$: hasAppearance, hasSound, hasTaste, hasTemperature, evokesEmotion
3.) disambiguated adjective (subscript a, superscript = sense number)

20 Our approach
Step 1: Extract generic hasProperty triples over input
Patterns: <noun> verb [adv] <adj>, and <adj> <noun>
e.g. "summit is crisp .." yields (summit, crisp); further examples: (mountain, cold), (chili, hot)
Step 2: Disambiguate args and classify triple
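The two surface patterns on this slide can be sketched as follows. This is an illustration only: it assumes the input is already POS-tagged with a hypothetical coarse tag set (NOUN, VERB, ADV, ADJ), and the function name `extract_has_property` is made up here rather than taken from the WebChild system.

```python
def extract_has_property(tagged_tokens):
    """Return (noun, adjective) pairs matched by two surface patterns.

    tagged_tokens: list of (word, tag) pairs; tags are assumed to be
    NOUN, VERB, ADV or ADJ (a hypothetical coarse tag set).
    """
    pairs = []
    n = len(tagged_tokens)
    for i, (word, tag) in enumerate(tagged_tokens):
        # Pattern: <adj> <noun>
        if tag == "ADJ" and i + 1 < n and tagged_tokens[i + 1][1] == "NOUN":
            pairs.append((tagged_tokens[i + 1][0], word))
        # Pattern: <noun> verb [adv] <adj>
        if tag == "NOUN" and i + 1 < n and tagged_tokens[i + 1][1] == "VERB":
            j = i + 2
            if j < n and tagged_tokens[j][1] == "ADV":
                j += 1  # skip the optional adverb
            if j < n and tagged_tokens[j][1] == "ADJ":
                pairs.append((word, tagged_tokens[j][0]))
    return pairs


predicative = extract_has_property(
    [("summit", "NOUN"), ("is", "VERB"), ("very", "ADV"), ("crisp", "ADJ")]
)
attributive = extract_has_property([("hot", "ADJ"), ("chili", "NOUN")])
```

The extracted pairs are still surface forms without senses or a fine-grained relation; resolving those is the job of the second step.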

21 Extract generic hasProperty triples over input
Typically requires training data Disambiguate args and classify triple

22 Extract generic hasProperty triples over input: $\langle w1_n, w2_a \rangle$, e.g. (summit, crisp)
Suppose $r = \mathit{hasTemperature}$.
Range(r) inference: $\langle *, r, w2_a^s \rangle$, e.g. $\mathit{crisp}_a^3$, $\mathit{hot}_a^1$, $\mathit{cold}_a^1$, $\mathit{icy}_a^2$, …
Domain(r) inference: $\langle w1_n^s, r, * \rangle$, e.g. $\mathit{beach}_n^3$, $\mathit{summit}_n^2$, $\mathit{metal}_n^1$, $\mathit{metal}_n^2$, …
Assertion(r) inference: $\langle w1_n^s, r, w2_a^s \rangle$, e.g. $\langle \mathit{summit}_n^2, \mathit{crisp}_a^3 \rangle$, $\langle \mathit{beach}_n^1, \mathit{hot}_a^1 \rangle$, …
Disambiguate args and classify triple
This is a new transductive setting, because previous transductive settings only have relationships between triples. By having graphs for parts of triples, we can generalize by first going to an abstract level (domain and range) in order to prune the otherwise hopelessly large graph.

23 domain(r), range(r), assertion(r) inference
Noisy, surface-form candidates for r → Graph construction → Graph inference

24 An instance of the problem: range(r)
Columns (nouns): summit, mountain, dancer
cold: 20, 50, 3
hot: 30, 40, 10
crisp: 15, 1
Only Hirst similarity generalizes to both nouns and adjectives.

25 An instance of the problem: range(r)
$\mathit{crisp}_a^1$: clearly defined
$\mathit{crisp}_a^3$: cold and invigorating temperature
$\mathit{cold}_a^1$: low or inadequate temperature

26 An instance of the problem: range(r)
sense #1: 1/2
sense #2: 1/3
sense #3: 1/4

27 Label propagation for graph inference, given few seeds
- Label per node = in / not in range of hasTemperature
- Examples: (summit, crisp), (mountain, cold), (salsa, hot)
- Similar nodes get similar labels
- But: limited training data

28 Label propagation for graph inference, given few seeds
- Label per node = in / not in range of hasTemperature
- Similar nodes get similar labels
- But: limited training data

29 Label Propagation: Loss function (Talukdar et al. 2009)
- Seed label loss
- Similar node, different label loss
- Label prior loss (high-degree nodes are noise)

30 Label propagation for graph inference, given few seeds
- Label per node = in / not in range of hasTemperature
- Seed label loss
- Similar node, different label loss
- Label prior loss
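The three loss terms can be illustrated with a small quadratic objective and a coordinate-descent solver. This is a simplified sketch, not the exact algorithm from the cited work: it assumes one scalar score per node (1 = in range of hasTemperature, 0 = not in range), and all node names, edge weights and the hyperparameters `mu1`, `mu2`, `mu3` are hypothetical.

```python
def loss(scores, weights, seeds, prior, mu1=1.0, mu2=1.0, mu3=0.01):
    """Seed label loss + similar-node/different-label loss + label prior loss."""
    seed_term = sum((seeds[v] - scores[v]) ** 2 for v in seeds)
    smooth_term = sum(w * (scores[u] - scores[v]) ** 2 for (u, v), w in weights.items())
    prior_term = sum((scores[v] - prior) ** 2 for v in scores)
    return mu1 * seed_term + mu2 * smooth_term + mu3 * prior_term


def propagate(nodes, weights, seeds, prior=0.5, mu1=1.0, mu2=1.0, mu3=0.01, sweeps=50):
    """Minimise the loss by repeatedly solving for one node score at a time."""
    neighbours = {v: {} for v in nodes}
    for (u, v), w in weights.items():  # each undirected edge is listed once
        neighbours[u][v] = w
        neighbours[v][u] = w
    scores = {v: seeds.get(v, prior) for v in nodes}
    for _ in range(sweeps):
        for v in nodes:
            is_seed = 1.0 if v in seeds else 0.0
            numerator = mu1 * is_seed * seeds.get(v, 0.0) + mu3 * prior
            numerator += mu2 * sum(w * scores[u] for u, w in neighbours[v].items())
            denominator = mu1 * is_seed + mu3 + mu2 * sum(neighbours[v].values())
            scores[v] = numerator / denominator  # exact minimiser for node v
    return scores


# Toy graph over adjective senses; weights and the negative seed are made up.
nodes = ["cold-a1", "icy-a2", "crisp-a3", "crisp-a1", "clear-a1"]
weights = {
    ("cold-a1", "crisp-a3"): 0.8,
    ("cold-a1", "icy-a2"): 0.9,
    ("crisp-a1", "clear-a1"): 0.7,
    ("crisp-a3", "crisp-a1"): 0.1,
}
seeds = {"cold-a1": 1.0, "clear-a1": 0.0}
initial = {v: seeds.get(v, 0.5) for v in nodes}
scores = propagate(nodes, weights, seeds)
```

In this toy graph the temperature sense of crisp ends up with a high score because it is strongly tied to the positive seed, while the "clearly defined" sense is pulled towards the negative seed.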

31 WebChild: Model recap
Noisy, surface-form candidates for r → Graph construction → Graph inference → Clean, disambiguated triples in r

32 Resulting KB
WebChild: large (~5 million), semantically organized, accurate (0.82 sampled precision)
Domain (hasShape): mountain-n1, leaf-n1, ...
Range (hasShape): triangular-a1, tapered-a1, ...
Assertions (hasShape): (lens-n1, spherical-a2), (palace-n2, domed-a1), ...

33 Summary of property commonsense
WebChild: the first commonsense KB with fine-grained relations and disambiguated arguments; 4.6 million assertions, including domain and range, for 19 relations.
Take-away message: transductive methods help overcome the sparsity of commonsense in text.
Say it: People usually say commonsense knowledge cannot be found in text. This paper shows that with graph-based methods you can still uncover it and even infer fine-grained, disambiguated knowledge. In general, this matters for deeper text understanding and for AI-complete tasks. Give the sweet house translation example.

34 Research question 3
RQ 3. Commonsense of interactions between objects: activities and their semantic attributes.
Previous work:
- largely discusses events, but activities only at small scale
- does not organize the attributes of the activities
- does not distinguish the meanings of the attribute values

35 An Activity frame
{Climb up a mountain, Hike up a hill}
Participants: climber, boy, rope
Location: camp, forest, sea shore
Time: day, holiday
Visuals

36 Semantic organization of Activity frames
Parent activity: Go up an elevation ..
Previous activity: Get to village ..
Next activity: Reach at the top ..
Frame: {Climb up a mountain, Hike up a hill}
Participants: climber, boy, rope
Location: camp, forest, sea shore
Time: day, holiday
Visuals

37 Contain events but not activity knowledge
May contain activities but no visuals and varying granularity of scene boundaries, transitions.

38 Contain events but not activity knowledge
May contain activities but no visuals and varying granularity of scene boundaries, transitions. Hollywood narratives are good

39 Semantic parsing of scripts Graph construction

40 Semantic parsing of scripts; Graph construction
Input: text in a scene taken from a semi-structured movie script, e.g. "He began to shoot a video on the summit"
Output: disambiguated semantic roles, e.g.
the man: agent
began to shoot: action
a video: patient
summit: location
SRL systems are computationally expensive and domain-specific.

41 State-of-the-art WSD customized for phrases
the man: man.1, man.2
began to shoot: shoot.1, shoot.4
a video: video.1

42 Can we use two different information sources to perform SRL given no training data?
State-of-the-art WSD customized for phrases:
the man: man.1, man.2
began to shoot: shoot.1, shoot.4
a video: video.1
VerbNet contains curated semantic roles for verbs (syntactic frame NP VP NP):
shoot.vn.1: agent.animate, patient.animate
shoot.vn.3: agent.animate, patient.inanimate
Agent and patient roles carry selectional restrictions.

43 Jointly leverage syntactic and semantic role semantics from VerbNet
State-of-the-art WSD customized for phrases: the man: man.1, man.2; began to shoot: shoot.1, shoot.4; a video: video.1
VerbNet frames (NP VP NP): shoot.vn.1: agent.animate, patient.animate; shoot.vn.3: agent.animate, patient.inanimate
WordNet-VerbNet linkage: shoot.4, patient.inanimate
WordNet class hierarchy: video.1 is a Thing / inanimate

44 Same diagram as slide 43, highlighting man.1: binary decision variable

45 Same diagram as slide 43, highlighting: WSD prior, WN prior

46 Same diagram as slide 43, highlighting: sense, VN syntactic match score

47 Same diagram as slide 43, highlighting: sense, VN semantic match score

48 Joint WSD and SRL
Objective terms: WSD prior, WN prior; word, VN match score; selectional restriction score
Variables: $x_{ij}$ = binary decision variable for word i mapped to WN sense j
Constraints:
- one VN sense per verb
- WN, VN sense consistency
- selectional restriction constraints
- binary decision
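To make the joint decision concrete, here is a toy sketch that enumerates all sense assignments for the running example and keeps the best one that satisfies the constraints. It uses brute-force search in place of an ILP solver, and every score, the WordNet-to-VerbNet linkage for shoot.1, and the animacy labels are assumptions made for illustration.

```python
from itertools import product

# Candidate WordNet senses per phrase with prior scores (all numbers hypothetical).
CANDIDATES = {
    "the man": {"man.1": 0.6, "man.2": 0.4},
    "began to shoot": {"shoot.1": 0.55, "shoot.4": 0.45},
    "a video": {"video.1": 1.0},
}
# Assumed WordNet -> VerbNet linkage and patient restriction per VerbNet sense.
WN_TO_VN = {"shoot.1": "shoot.vn.1", "shoot.4": "shoot.vn.3"}
VN_PATIENT = {"shoot.vn.1": "animate", "shoot.vn.3": "inanimate"}
# Assumed animacy, standing in for a lookup in the WordNet class hierarchy.
ANIMACY = {"man.1": "animate", "man.2": "animate", "video.1": "inanimate"}


def best_assignment():
    """Pick one sense per phrase, maximising the summed priors under the constraints."""
    agents = CANDIDATES["the man"]
    verbs = CANDIDATES["began to shoot"]
    patients = CANDIDATES["a video"]
    best, best_score = None, float("-inf")
    for agent, verb, patient in product(agents, verbs, patients):
        vn_sense = WN_TO_VN[verb]  # one VerbNet sense per verb, consistent with the WN sense
        if ANIMACY[agent] != "animate":  # both frames require an animate agent
            continue
        if ANIMACY[patient] != VN_PATIENT[vn_sense]:  # selectional restriction on the patient
            continue
        score = agents[agent] + verbs[verb] + patients[patient]
        if score > best_score:
            best = {"agent": agent, "action": verb, "patient": patient}
            best_score = score
    return best


result = best_assignment()
```

Although shoot.1 has the higher prior in this toy setup, it is ruled out because its frame needs an animate patient, so the constraints rather than the priors decide the verb sense.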

49 Semantic parsing of scripts: output of joint WSD and SRL
Agent: man.1
Action: shoot.4
Patient: video.1

50 Semantic parsing of scripts; Graph construction
Climb up a mountain
Participants: climber, rope
Location: summit, forest
Time: day

51 Construct a graph of activity frames with three edge types:
TypeOf: T(a,b)
Similar: S(a,b)
Previous: P(a,b)
Example frames:
Go up an elevation ..
Climb up a mountain: Participants: climber, rope; Location: summit, forest; Time: day
Hike up a hill: Participants: climber; Location: sea shore; Time: holiday
Reach top ..

52 Similarity: S(climb up a mountain, hike up a hill)
Activity similarity + attribute similarity
Climb up a mountain: Participants: climber, rope; Location: forest; Time: day
Hike up a hill: Participants: climber; Location: woods; Time: holiday
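As a rough illustration of the attribute-similarity part only, the sketch below averages a Jaccard overlap per attribute slot. The slides do not specify the actual measure, so the Jaccard choice, the tiny `SYNONYMS` table and the function names are all assumptions; similarity between the activity phrases themselves is not modelled here.

```python
# Hypothetical synonym table standing in for a lexical resource lookup.
SYNONYMS = {"woods": "forest"}


def normalise(values):
    """Map each attribute value to a canonical form."""
    return {SYNONYMS.get(v, v) for v in values}


def jaccard(a, b):
    a, b = normalise(a), normalise(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def attribute_similarity(frame_a, frame_b):
    """Average Jaccard overlap over all attribute slots of two activity frames."""
    slots = sorted(set(frame_a) | set(frame_b))
    return sum(jaccard(frame_a.get(s, ()), frame_b.get(s, ())) for s in slots) / len(slots)


climb = {"participants": {"climber", "rope"}, "location": {"forest"}, "time": {"day"}}
hike = {"participants": {"climber"}, "location": {"woods"}, "time": {"holiday"}}
similarity = attribute_similarity(climb, hike)
```

With these frames the participants overlap by one half, the locations match after normalisation, and the times do not overlap, so the averaged score is 0.5.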

53 TypeOf: T(climb up a mountain, go up an elevation)
Activity hypernymy + attribute hypernymy
Climb up a mountain: Participants: climber, rope; Location: forest; Time: day
Go up an elevation: Participants: Person; Location: Exterior; Time: day

54 Previous: P(reach the top, climb up a mountain)
Scene: Carrie and Big start out early to head to the village. They climb up the beautiful mountain which felt as if they were in a different world. After several hours they eventually reach the top.
- Allow gaps between activities within one scene.
- PMI-style counting to suppress generic activities.
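The PMI-style counting can be sketched as below, assuming each scene has already been reduced to an ordered list of activity labels. The exact counting scheme is not given on the slide, so this is one plausible reading; the function name `previous_pmi` and the toy scenes are made up.

```python
import math
from collections import Counter
from itertools import combinations


def previous_pmi(scenes):
    """PMI for ordered activity pairs (earlier, later) within a scene."""
    pair_counts = Counter()
    first_counts = Counter()
    second_counts = Counter()
    for activities in scenes:
        # combinations() preserves scene order and includes non-adjacent
        # pairs, which is how gaps between activities are allowed.
        for earlier, later in combinations(activities, 2):
            pair_counts[(earlier, later)] += 1
            first_counts[earlier] += 1
            second_counts[later] += 1
    total = sum(pair_counts.values())
    return {
        (earlier, later): math.log(
            count * total / (first_counts[earlier] * second_counts[later])
        )
        for (earlier, later), count in pair_counts.items()
    }


scenes = [
    ["go", "climb up a mountain", "reach the top"],
    ["go", "open the door", "sit down"],
    ["go", "order a coffee", "sit down"],
]
pmi = previous_pmi(scenes)
```

In this toy data the generic activity "go" precedes almost everything, so its PMI with "reach the top" is lower than that of "climb up a mountain", which is the suppression effect the slide refers to.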

55 Semantic parsing of scripts; Graph construction
Climb up a mountain: Participants: climber, rope; Location: summit, forest; Time: day
parent: Go up an elevation ..
similar: Hike up a hill: Participants: climber; Location: sea shore; Time: holiday
temporal: Reach top ..

56 Semantic parsing of scripts Graph construction

57 Resulting KB: Knowlywood
Statistics:
Scenes: 1,708,782
Activity synsets: 505,788
Accuracy: 0.85 ± 0.01
#Images from scenes: 30,000

58 Summary of activity commonsense
Knowlywood: the first organized commonsense activity KB with activity attributes and disambiguated values, containing nearly a million activities with visuals.
Take-away message: jointly leveraging different annotated resources helps overcome sparsity of training data.

59 The overall KB: WebChild KB > 3M concepts, > 18M triples, >1000 relations

60 Conclusions and take-home messages
Knowledge to make machines smarter can be acquired with robust techniques that jointly leverage global information.
Research Question 1: Properties (WSDM’14)
Research Question 2: Comparatives, part-whole (AAAI’14, AAAI’16)
Research Question 3: Activities (WWW’15, CIKM’15)
WEBCHILD KB
Applications (CVPR’15, ACL’15, ISWC’16..)

61 Conclusions and take-home messages
Knowledge to make machines smarter can be acquired with robust techniques that jointly leverage global information.
RQ1: Range, domain, assertions of fine-grained relations. Properties (WSDM’14)
RQ2: Fine-grained comparative, part-whole relations. Comparatives, part-whole (AAAI’14, AAAI’16)
RQ3: Activity frames with semantic attributes. Activities (WWW’15, CIKM’15)
ML + NLP community: limited training data can be overcome by jointly leveraging multiple cues
Computer Vision community: commonsense helps computer vision; vision helps commonsense acquisition
AI community: semantically organized knowledge is a step towards filling the human-machine gap
WEBCHILD KB
Applications (CVPR’15, ACL’15, ISWC’16..)
Thank you!

