Published by Iris Marian Parsons. Modified over 9 years ago.
1
CONCEPT HIERARCHY INDUCTION
BY PHILIPP CIMIANO
PRESENTED BY JOSEPH PARK
2
CONCEPT HIERARCHIES
Structure information into categories
Provide a level of generalization
Form the backbone of any ontology
3
COMMON APPROACHES
Machine readable dictionaries
Lexico-syntactic patterns
Distributional similarity
Co-occurrence analysis
4
MACHINE READABLE DICTIONARIES
Exploit the regularity of dictionary definitions
Find a hypernym for the defined word
Head of the first NP (genus or kernel term)
  spring: "the season between winter and summer and in which leaves and flowers appear"
  hornbeam: "a type of tree with a hard wood, sometimes used in hedges"
  launch: "a large usu. motor-driven boat used for carrying people on rivers, lakes, harbors, etc."
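The "head of the first NP" idea can be illustrated with a rough sketch. It does not use a real parser: the boundary word list, the determiner list, and the "type of" handling are all assumptions made for this example, and `genus_term` is a hypothetical helper name.

```python
import re

# Assumed words that usually end the first noun phrase of a definition.
NP_BOUNDARY = {"between", "with", "used", "that", "which", "in", "of", "for", "and", "or", "on"}
DETERMINERS = {"a", "an", "the"}
# In "a type of X", the genus term is X rather than "type" (assumed classifier list).
CLASSIFIER = re.compile(r"^(type|kind|sort|variety) of\s+")


def genus_term(definition: str) -> str:
    """Return a guess for the head of the first NP of a dictionary definition."""
    tokens = definition.lower().strip().split()
    if tokens and tokens[0] in DETERMINERS:
        tokens = tokens[1:]
    text = CLASSIFIER.sub("", " ".join(tokens))
    first_np = []
    for token in text.split():
        word = token.rstrip(",;:")
        if word in NP_BOUNDARY:
            break
        first_np.append(word)
        if word != token:  # trailing punctuation closes the phrase
            break
    # English NPs are head-final, so take the last word collected.
    return first_np[-1] if first_np else ""


definitions = {
    "spring": "the season between winter and summer and in which leaves and flowers appear",
    "hornbeam": "a type of tree with a hard wood, sometimes used in hedges",
    "launch": "a large usu. motor-driven boat used for carrying people on rivers, lakes, harbors, etc.",
}
hypernyms = {word: genus_term(text) for word, text in definitions.items()}
```

On the three definitions above this yields season, tree, and boat. A realistic system would rely on a part-of-speech tagger or chunker instead of a stop-word list.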
5
LEXICO-SYNTACTIC PATTERNS
Hearst patterns
  Hearst1: NP such as {NP,}* {(and | or)} NP
  Hearst2: such NP as {NP,}* {(and | or)} NP
  Hearst3: NP {,NP}* {,} or other NP
  Hearst4: NP {,NP}* {,} and other NP
  Hearst5: NP including {NP,}* NP {(and | or)} NP
  Hearst6: NP especially {NP,}* {(and | or)} NP
They should occur frequently and in many text genres
They should accurately indicate the relation of interest
They should be recognizable with little or no pre-encoded knowledge
6
EXAMPLE OF USING A HEARST PATTERN
'Such injuries as bruises, wounds and broken bones...'
  hyponym(bruise, injury)
  hyponym(wound, injury)
  hyponym(broken bone, injury)
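A minimal sketch of how the "such NP as ..." pattern might be matched with a regular expression. It treats a single word as the hypernym NP and uses a naive plural stripper in place of a real chunker and lemmatizer. The names `hearst2` and `singular` are hypothetical helpers for this illustration.

```python
import re

# "such <hypernym> as <list of hyponyms>", with a one-word hypernym (simplifying assumption).
SUCH_AS = re.compile(r"\bsuch (\w+) as ([^.;:!?]+)")
# List separators: ", and" / " or " / ","
SEPARATOR = re.compile(r",?\s+(?:and|or)\s+|,\s*")


def singular(np: str) -> str:
    """Naively singularize the last word of a noun phrase."""
    words = np.split()
    last = words[-1]
    if last.endswith("ies"):
        last = last[:-3] + "y"
    elif last.endswith("s") and not last.endswith("ss"):
        last = last[:-1]
    return " ".join(words[:-1] + [last])


def hearst2(sentence: str) -> list[tuple[str, str]]:
    """Return (hyponym, hypernym) pairs found by the 'such NP as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence.lower()):
        hypernym = singular(match.group(1))
        for item in SEPARATOR.split(match.group(2).strip()):
            if item:
                pairs.append((singular(item), hypernym))
    return pairs


pairs = hearst2("Such injuries as bruises, wounds and broken bones...")
```

For the example sentence, `pairs` holds the three hyponym relations listed on the slide. The other five patterns could be handled the same way with one regular expression each.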
7
DISTRIBUTIONAL SIMILARITY
Distributional hypothesis
  Words are similar to the extent that they share the same contexts
  ‘You shall know a word by the company it keeps’ – Firth
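One common way to make the distributional hypothesis concrete is to represent each word by counts of its neighbouring words and compare words with cosine similarity. The sketch below assumes a window of two tokens on each side and a made-up three-sentence corpus. Neither choice comes from the slides.

```python
from collections import Counter
from math import sqrt


def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus, for illustration only.
corpus = [
    "we booked a hotel in rome",
    "we booked an apartment in rome",
    "she rode a bike to work",
]
vectors = context_vectors(corpus)
sim_hotel_apartment = cosine(vectors["hotel"], vectors["apartment"])
sim_hotel_bike = cosine(vectors["hotel"], vectors["bike"])
```

Here "hotel" and "apartment" share most of their contexts, so they score higher than "hotel" and "bike". Real systems typically use syntactic contexts and weighting schemes rather than raw window counts.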
8
EXAMPLE
9
CO-OCCURRENCE ANALYSIS
10
THREE MORE APPROACHES
Formal Concept Analysis (FCA)
Guided Clustering
Learning from heterogeneous sources of evidence
11
FORMAL CONCEPT ANALYSIS
Set-theoretical approach
Parse corpus (extract dependencies)
  Verb-pp-complement
  Verb-object
  Verb-subject
Extract surface dependencies (section 4.1.4)
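To illustrate the set-theoretical side of FCA, the sketch below takes nouns as objects and the verbs they occur with as attributes, then enumerates all formal concepts (extent, intent) by closing the object attribute sets under intersection. The noun-verb table is a hypothetical toy context, not data from the presentation, and the brute-force enumeration is only meant for small inputs.

```python
def formal_concepts(context):
    """Return all (extent, intent) pairs of a formal context {object: set_of_attributes}."""
    all_attrs = frozenset().union(*context.values())
    # Every concept intent is an intersection of object attribute sets
    # (the full attribute set is the intent of the empty extent).
    intents = {all_attrs}
    for attrs in context.values():
        intents |= {intent & attrs for intent in intents}
    concepts = []
    for intent in intents:
        # The extent is every object that has all attributes of the intent.
        extent = frozenset(obj for obj, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    # Larger extents are more general concepts, so list them first.
    return sorted(concepts, key=lambda concept: -len(concept[0]))


# Hypothetical verb-object dependencies: noun -> verbs it appears as an object of.
CONTEXT = {
    "hotel": {"book"},
    "apartment": {"book", "rent"},
    "car": {"book", "rent", "drive"},
    "bike": {"book", "rent", "drive", "ride"},
    "excursion": {"book", "join"},
    "trip": {"book", "join"},
}
concepts = formal_concepts(CONTEXT)
```

Ordering the concepts by extent inclusion gives the concept lattice, which can then be read as a hierarchy: everything is bookable, a subset is rentable, and so on.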
12
PSEUDOCODE
13
EXAMPLE
14
RESULTS
15
GUIDED CLUSTERING
Uses hypernyms from WordNet and Hearst patterns
16
EXAMPLE
17
RESULTS
18
MORE RESULTS
19
HETEROGENEOUS SOURCES OF EVIDENCE
Naïve threshold classifier
Uses Hearst patterns for corpus patterns
Uses Google API for web patterns
Uses Hearst patterns over downloaded pages
Uses WordNet senses
Uses ‘head’-heuristic (r-match)
Uses corpus based subsumption
Uses document based subsumption
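Of the evidence sources listed, the head heuristic is the simplest to sketch. The version below assumes that "r-match" means a shorter term matches the right-hand end (the head) of a longer multi-word term, so that "international conference" is taken to be a kind of "conference". The function names and the term list are hypothetical.

```python
def head_match(term: str, candidate: str) -> bool:
    """True if `candidate` is a proper right-hand suffix (in whole words) of `term`."""
    term_words = term.lower().split()
    cand_words = candidate.lower().split()
    return len(cand_words) < len(term_words) and term_words[-len(cand_words):] == cand_words


def isa_from_heads(terms):
    """Return (subconcept, superconcept) pairs suggested by the head heuristic."""
    return sorted((t, c) for t in terms for c in terms if head_match(t, c))


# Hypothetical term list, for illustration only.
terms = ["conference", "international conference", "conference hall", "hall"]
relations = isa_from_heads(terms)
```

Matching on whole words keeps "conference hall" from being classified under "conference": its head is "hall". In a combined classifier, a signal like this would be one feature among the others on the slide rather than a decision on its own.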
20
RESULTS
21
MORE RESULTS