Published by Katrina Stone. Modified over 8 years ago.
1
WIKT 2007, Košice, 15-16 November 2007
Creation of Semantic Metadata
Michal Laclavík
Institute of Informatics, Slovak Academy of Sciences (Ústav informatiky SAV)
2
Semantic Metadata
Ontology
  – Model
  – Instances = semantic metadata
Protégé
Automatic forms
  – NAZOU
Database wrapping
  – RDB2Onto, D2R MAP, D2R, R2O
Annotation, document tagging
3
Information Extraction
MUC conferences
  – Named Entity recognition (NE): finds and classifies names, places, etc.
  – Coreference resolution (CO): identifies identity relations between entities.
  – Template Element construction (TE): adds descriptive information to NE results (using CO).
  – Template Relation construction (TR): finds relations between TE entities.
  – Scenario Template production (ST): fits TE and TR results into specified event scenarios.
GATE
  – Information Extraction platform
4
Goal
Identification of instances from the ontology
  – search
Automatic ontology population
  – creation
5
Search
Ambiguity (one name, multiple meanings)
Aliases
Similarity measures (IR, NLP, IE, ...)
  – Cosine measure
  – Levenshtein edit operations
  – ...
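The two similarity measures named on this slide can be sketched as follows. This is a minimal illustration, not code from the presentation. It assumes whitespace tokenization and raw term counts for the cosine measure, and unit cost for every edit operation. The function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between bag-of-words count vectors of two strings."""
    # Assumption: lowercase + whitespace split is a good enough tokenizer here.
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def levenshtein(s: str, t: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]
```

A small edit distance suggests two spellings are aliases of one instance. A high cosine score between a mention's context and an instance's description is one way to choose among ambiguous candidates.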
6
Create
Patterns for creating individuals
  – Structure, regex, IE techniques
Relevance
  – Whether an individual should really be created
The same problems as in Search apply as well
7
Information Retrieval – Evaluation
Precision
Recall
F-measure
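For reference, the three evaluation measures can be computed from a set of produced annotations and a set of gold-standard annotations. The sketch below uses the standard definitions. The function name and the set-based representation are assumptions made for illustration.

```python
def precision_recall_f(predicted: set, gold: set, beta: float = 1.0):
    """Return (precision, recall, F_beta) for predicted vs. gold annotations."""
    tp = len(predicted & gold)  # annotations that are both produced and correct
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f
```

With `beta = 1` the result is the harmonic mean of precision and recall (F1). Values of `beta` above 1 weight recall more heavily.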
8
Manual Annotation & Browsing
9
Wrappers
Similar to IE
The pattern is the structure of the document
Not tied to a knowledge base (KB)
Good results in combination with other techniques
  – Location: San Francisco, New York
  – Job Type: Permanent, Contract
  – Job Type: Full-time
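A wrapper of this kind can be sketched with a single regular expression over the document's `Label: value, value` structure. This is a minimal illustration that assumes one field per line and comma-separated values. The `FIELD` pattern and the `wrap` function are hypothetical and not taken from any tool named in the presentation.

```python
import re

# Assumption: each field sits on its own line as "Label: value[, value...]".
FIELD = re.compile(r"^\s*(?P<label>[A-Za-z ]+?)\s*:\s*(?P<values>.+?)\s*$")


def wrap(text: str) -> dict:
    """Collect the values of every 'Label: ...' line, merging repeated labels."""
    record = {}
    for line in text.splitlines():
        m = FIELD.match(line)
        if not m:
            continue  # line does not follow the expected structure
        values = [v.strip() for v in m.group("values").split(",") if v.strip()]
        record.setdefault(m.group("label"), []).extend(values)
    return record
```

Applied to the three example lines from the slide, `wrap` yields the two locations under `Location` and all three job types under a single `Job Type` key.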
10
C-PANKOW
POS tagging
  – QTag
Google API for relevance
11
KIM
Separation
  – KB
  – Doc
  – Annotation
NE recognition
  – GATE
Lucene
12
SemTag
Only distributed annotation
264 million web pages
434 million annotations
TAP knowledge base
Ambiguity resolution
  – Cosine measure
Stanford
13
Ontea
Pattern-based annotation
  – Regex
Similar methods
  – C-PANKOW, SemTag
Languages other than English
  – Slovak
Faster and more accurate than C-PANKOW
Also supports instance creation; SemTag does not
The architecture is designed so that other pattern-based annotation solutions can be plugged in
  – Wrapper, IE, ...
NAZOU, 21-23 September 2007, Poľana => +
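The general idea of regex pattern-based annotation combined with instance creation can be sketched as below. This is not Ontea's actual API or pattern set. The `PATTERNS` table, the class names, the dictionary used as a knowledge base and the `annotate` function are all assumptions made for illustration.

```python
import re

# Hypothetical pattern set: ontology class -> regex with one capture group.
PATTERNS = {
    "Location": re.compile(r"Location:\s*([A-Z][a-z]+(?: [A-Z][a-z]+)*)"),
    "Email": re.compile(r"\b([\w.+-]+@[\w-]+\.[\w.]+)\b"),
}


def annotate(text, knowledge_base, create_missing=True):
    """Match patterns in text; look each value up in the KB, optionally creating it.

    knowledge_base is a plain dict {class name: set of instance labels},
    a stand-in for a real ontology store.
    """
    annotations = []
    for cls, pattern in PATTERNS.items():
        instances = knowledge_base.setdefault(cls, set())
        for match in pattern.finditer(text):
            value = match.group(1)
            known = value in instances
            if not known and create_missing:
                instances.add(value)  # ontology population ("create")
            if known or create_missing:
                annotations.append((cls, value, "found" if known else "created"))
    return annotations
```

Setting `create_missing=False` restricts the sketch to the search case, so only instances already in the knowledge base are annotated.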
14
Evaluation
15
Evaluation
16
Evaluation
17
Conclusion
A good area for future research
The metadata problem needs to be solved, including
  – Protocols
  – Metadata repositories
  – Upper ontologies
  – Metadata creation algorithms (annotation algorithms)