1
Applications
Chapter 9, Cimiano Ontology Learning Textbook
Presented by Aaron Stewart
2
Typical Applications of Ontologies
Agent communication
Data integration
Description of service capabilities for matching and composition purposes
Formal verification of process descriptions
Unification of terminology across communities
3
Text Applications of Ontologies
Information Retrieval (IR)
Clustering and Classification of Documents
Semantic Annotation
Natural Language Processing
4
Task-Based Evaluation (Porzel and Malaka 2005)
5
Task-Based Evaluation Requirements
1. Algorithm output can be quantified
2. Task can use background knowledge
3. Ontology is an additional parameter
4. Output can be traced to the ontology
6
Contents
1. Text Clustering and Classification
2. Information Highlighting for Supporting Search
3. Related Work
7
Text Clustering and Classification What is the difference?
8
Text Clustering
9
Text Classification
Arrows, Weather, Flat shapes, 3-D forms, Smile!
10
Dot Kom Project One of many competitions
11
Approaches
Bag of words
Manually engineered MeSH Tree Structures
Automatically constructed ontologies
12
What is a “Bag of Words” anyway? the quick brown fox
13
Bag of Words
the quick brown fox jumps over the lazy dog (2)
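A minimal sketch of the bag-of-words idea, assuming lowercasing and whitespace tokenization (the slides do not specify a tokenizer). The function name `bag_of_words` is illustrative only. Word order is discarded and only the per-word counts are kept.

```python
from collections import Counter


def bag_of_words(text):
    """Map each word to its count, ignoring word order.

    Assumption: tokens are lowercased and split on whitespace.
    """
    tokens = text.lower().split()
    return Counter(tokens)


bag = bag_of_words("the quick brown fox jumps over the lazy dog")
# "the" occurs twice in the sentence; every other word occurs once.
print(bag["the"], bag["fox"])
```

A real pipeline would typically also strip punctuation and stop words before counting.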
14
Building Hierarchies
15
Note on Ontologies
Our ontologies (“micro”)
– Like a database record schema
Their ontologies (“macro”)
– Like WordNet
16
Clustering
Hierarchical Agglomerative Clustering
Bi-Section K-means
“A Comparison of Document Clustering Techniques” – www.cs.sfu.ca/~wangk/894report/chen1.pdf
17
Document Representations
Bag of Words
Certain words + ontology -> extended features
Strategies: add, replace, only
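The three strategies can be sketched as follows. This is one plausible reading, not the presenter's implementation: "add" keeps all terms and appends concept features, "replace" swaps mapped terms for their concepts and keeps unmapped terms, and "only" keeps concept features alone. The `TERM_TO_CONCEPT` table and the `concept:` prefix are hypothetical stand-ins for an ontology lookup.

```python
from collections import Counter

# Hypothetical term -> concept table standing in for an ontology lookup.
TERM_TO_CONCEPT = {"fox": "canine", "dog": "canine"}


def extend_features(term_counts, strategy, term_to_concept=TERM_TO_CONCEPT):
    """Build a feature vector from term counts under one of three strategies."""
    # Aggregate counts of all terms that map to a concept.
    concept_counts = Counter()
    for term, count in term_counts.items():
        concept = term_to_concept.get(term)
        if concept is not None:
            concept_counts["concept:" + concept] += count

    if strategy == "add":
        # Keep every term and append the concept features.
        return Counter(term_counts) + concept_counts
    if strategy == "replace":
        # Drop terms that have a concept; keep the unmapped terms.
        kept = Counter(
            {t: c for t, c in term_counts.items() if t not in term_to_concept}
        )
        return kept + concept_counts
    if strategy == "only":
        # Use concept features alone.
        return concept_counts
    raise ValueError("strategy must be 'add', 'replace', or 'only'")


terms = Counter({"fox": 1, "dog": 1, "lazy": 1})
for name in ("add", "replace", "only"):
    print(name, dict(extend_features(terms, name)))
```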
18
Vectors and Cosine Similarity
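A minimal sketch of cosine similarity between two documents, assuming each document is a sparse vector stored as a dict from feature to weight (raw counts here; the slides do not say which weighting was used). The function name is illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (dict: feature -> weight)."""
    # Only features present in both vectors contribute to the dot product.
    dot = sum(weight * b.get(feature, 0.0) for feature, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)


doc1 = {"quick": 1, "brown": 1, "fox": 1}
doc2 = {"lazy": 1, "dog": 1}
print(cosine_similarity(doc1, doc1))  # identical documents
print(cosine_similarity(doc1, doc2))  # no shared features
```

With non-negative weights the value lies between 0 (no shared features) and 1 (same direction), independent of document length.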
19
Classification Results (Categories)
20
Classification Results (Documents)
21
Cluster Metrics
P : computer-generated clusters
L : human-created clusters
P, L : sets of clusters (partitioning)
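The transcript does not say which metric was computed from P and L. As an assumption, the sketch below uses purity, a common measure for comparing a computed partitioning against a human one: each computed cluster is credited with its largest overlap with any human cluster, and the credits are summed and divided by the number of documents. The function and variable names are illustrative.

```python
def purity(computed, reference):
    """Purity of computed clusters against reference (human-created) clusters.

    Both arguments are lists of sets of document ids.
    """
    total = sum(len(cluster) for cluster in computed)
    if total == 0:
        return 0.0
    # For each computed cluster, count documents in its best-matching reference cluster.
    matched = sum(
        max((len(cluster & ref) for ref in reference), default=0)
        for cluster in computed
    )
    return matched / total


P = [{1, 2, 3}, {4, 5, 6}]      # computer-generated clusters
L = [{1, 2}, {3, 4, 5, 6}]      # human-created clusters
print(purity(P, L))             # 5 of 6 documents fall in a best-matching cluster
```

Swapping the arguments gives the inverse direction, which rewards covering each human cluster with a single computed cluster.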
22
Clustering Results
24
Information Highlighting for Supporting Search
Challenge:
– 10 minute limit
– KMi Planet News web site
– Compile a list of important people and technologies
25
Information Highlighting for Supporting Search
Tools:
– Regular browser
– Magpie
– ESpotter
– C-PANKOW
26
Teams
A : web browser only
B : web browser with AKT information
C : web browser with AKT++ information
27
AKT++ Lexicon
28
Scores
29
Conclusions (for this section)
Generated ontologies can be comparable to hand-crafted ontologies
Humans can trust the computer too much! (Group C drop in score)
30
Related Work
Query Expansion
Information Retrieval
Text Clustering and Classification
Natural Language Processing
31
Ambiguity resolution
– Bank
Compounds
– Headache medicine
Vague words
– With, of, has
– Selectional restrictions
Anaphora
32
More Applications
Word sense disambiguation
Classification of unknown words
Named Entity Recognition (NER)
Anaphora Resolution
Question Answering
– Who wrote the Hobbit?
– Tolkien is the author of the Hobbit.
Information Extraction
– AUTOSLOG, ASIUM
33
Analysis/Conclusion
Pro/con:
– Focused on two systems
– Passing survey of others