Presentation is loading. Please wait.

Presentation is loading. Please wait.

Danica Damljanović, Milan Agatonović, Hamish Cunningham contact: Natural Language Interfaces to Ontologies: Combining Syntactic Analysis.

Similar presentations


Presentation on theme: "Danica Damljanović, Milan Agatonović, Hamish Cunningham contact: Natural Language Interfaces to Ontologies: Combining Syntactic Analysis."— Presentation transcript:

1 Danica Damljanović, Milan Agatonović, Hamish Cunningham contact: danica@dcs.shef.ac.uk Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-Based Lookup through the User Interaction

2 W EB OF D ATA  Large datasets such as Linked Open Data available  How can we use these data?  Modigliani test: “tell me the locations of all the original paintings of Modigliani” (Richard MacManus, ReadWriteWeb) 03 J UNE 2010 ESWC 2010 2

3 PREFIX fb: PREFIX dbpedia: PREFIX dbp-prop: PREFIX dbp-ont: PREFIX umbel-sc: PREFIX rdf: PREFIX ot: SELECT DISTINCT ?painting_l ?owner_l ?city_fb_con ?city_db_loc ?city_db_cit WHERE { ?p fb:visual_art.artwork.artist dbpedia:Amedeo_Modigliani ; fb:visual_art.artwork.owners [ fb:visual_art.artwork_owner_relationship.owner ?ow ] ; ot:preferredLabel ?painting_l. ?ow ot:preferredLabel ?owner_l. OPTIONAL { ?ow fb:location.location.containedby [ ot:preferredLabel ?city_fb_con ] }. OPTIONAL { ?ow dbp-prop:location ?loc. ?loc rdf:type umbel-sc:City ; ot:preferredLabel ?city_db_loc } OPTIONAL { ?ow dbp-ont:city [ ot:preferredLabel ?city_db_cit ] } } 03 J UNE 2010 ESWC 2010 3 P ASSING M ODIGLIANI TEST Source: http://blog.l arkc.eu/http://blog.l arkc.eu/: “LDSR Passes the Modigliani Test for Semantic Web”, more than 1h to generate a SPARQL query

4 P ASSING M ODIGLIANI T EST : FUTURE 03 J UNE 2010 ESWC 2010 4 “tell me the locations of all the original paintings of Modigliani”

5 B UT, OTHERS HAVE ALREADY DONE IT ? 03 J UNE 2010 ESWC 2010 5 low precision high recall low precision low recall high precision high recall high precision low recall large datasets (several domains) simple factual questions complex questions small datasets (narrow domain) (Damljanović and Bontcheva, 2009.)

6 FRE Y A (F EEDBACK, R EFINEMENT, E XTENDED V OCABULARY A GGREGATOR )  Increase recall by:  generating the dialog whenever an “unknown” term appears in the question  Increase precision by:  generating the dialog whenever one term refers to more than one concept in the ontology  The dialog is generated by combining the language of the user and the ontology  Learn from the dialog 03 J UNE 2010 ESWC 2010 6

7 pptPlex Section Divider [FREyA: a Natural Language Interface to Ontologies] The slides after this divider will be grouped into a section and given the label you type above. Feel free to move this slide to any position in the deck. 03 J UNE 2010 ESWC 2010 7

8 FRE Y A W ORKFLOW 03 J UNE 2010 ESWC 2010 8 answer NL query POCsOCstriplesSPARQL Potential Ontology Concept (POC) Ontology Concept (OC) learn

9 F INDING POC 03 J UNE 2010 ESWC 2010 9

10 F INDING POC S 03 J UNE 2010 ESWC 2010 10

11 F INDING OC S 03 J UNE 2010 ESWC 2010 11

12 pptPlex Section Divider [Examples] The slides after this divider will be grouped into a section and given the label you type above. Feel free to move this slide to any position in the deck. 03 J UNE 2010 ESWC 2010 12

13 geo:City geo:State new york POC population geo:cityPopulation M APPING POC TO OC S 03 J UNE 2010 ESWC 2010 13 geo:State

14 N EW Y ORK IS A CITY 03 J UNE 2010 ESWC 2010 14

15 N EW Y ORK IS A STATE 03 J UNE 2010 ESWC 2010 15

16 POC state area geo:stateArea geo:State geo:isLowestPointOf point T HE U SER C ONTROLS THE O UTPUT 03 J UNE 2010 ESWC 2010 16 max geo:LoPoint geo:loElevation min

17 W HAT IS THE LOWEST POINT OF THE STATE WITH THE LARGEST AREA ? 03 J UNE 2010 ESWC 2010 17 TRIPLES: ?firstJoker – geo:isLowestPointOf – geo:State geo:State – (max) geo:stateArea - ?lastJoker SPARQL: prefix rdf: prefix xsd: select ?firstJoker ?p0 ?c1 ?p2 ?lastJoker where { { { ?c1 ?p0 ?firstJoker} UNION { ?firstJoker ?p0 ?c1}. filter (?p0= ). } ?c1 rdf:type. ?c1 ?p2 ?lastJoker. filter (?p2= ). } ORDER BY DESC(xsd:double(?lastJoker)) however...

18 W HAT IS THE LOWEST POINT OF THE STATE WITH THE LARGEST AREA ? 03 J UNE 2010 ESWC 2010 18 TRIPLES: ?firstJoker – (min) geo:loElevation – geo:LoPoint geo:LoPoint - ?joker3 – geo:State geo:State – (max) geo:stateArea - ?lastJoker SPARQL: prefix rdf: prefix xsd: select ?firstJoker ?p0 ?c1 ?joker3 ?c2 ?p3 ?lastJoker where { ?c1 ?p0 ?firstJoker. filter (?p0= ). ?c1 rdf:type. {{ ?c2 ?joker3 ?c1 } UNION { ?c1 ?joker3 ?c2 }} ?c2 rdf:type. ?c2 ?p3 ?lastJoker. filter (?p3= ). } ORDER BY ASC(xsd:double(?firstJoker)) DESC(xsd:double(?lastJoker)) the answer for both is Death Valley

19 FRE Y A: A N ATURAL L ANGUAGE I NTERFACE TO O NTOLOGIES 03 J UNE 2010 ESWC 2010 19 http://gate.ac.uk/freya

20 pptPlex Section Divider [Evaluation] The slides after this divider will be grouped into a section and given the label you type above. Feel free to move this slide to any position in the deck. 03 J UNE 2010 ESWC 2010 20

21 EVALUATION correctness ranked suggestions learning 03 J UNE 2010ESWC 2010 21

22 EVALUATION : CORRECTNESS Mooney GeoQuery dataset: 250 questions 03 J UNE 2010 ESWC 2010 22

23 EVALUATION : SUGGESTIONS RANKING  Mooney GeoQuery dataset: 250 questions  Manually labelled correct rankings  Mean Reciprocal Rank (MRR): 0.81 03 J UNE 2010 ESWC 2010 23

24 EVALUATION : LEARNING  103 questions correctly answered by engaging the user into 1 dialog  MRR 0.72 03 J UNE 2010 ESWC 2010 24

25 EVALUATION : LEARNING  MRR improved from 0.72 to 0.78 03 J UNE 2010 ESWC 2010 25

26 N EXT S TEPS  Passing Modigliani test  Exploring unknown data structures with FREyA, especially if they are large  LDSR: DBPedia, Freebase, Geonames, UMBEL, Wordnet, CIA World Factbook, Lingvoj, MusicBrainz  http://ontotext.com/ldsr http://ontotext.com/ldsr  User-centric evaluation 03 J UNE 2010 ESWC 2010 26

27 Contact: danica@dcs.shef.ac.ukdanica@dcs.shef.ac.uk THANK YOU FOR YOUR ATTENTION ! QUESTIONS ? 27 Thanks to Abraham Bernstein and Esther Kaufmann from the University of Zurich, for sharing with us Mooney dataset in OWL format, and J. Mooney from University of Texas for making this dataset publicly available.

28 R EFERENCES  Damljanovic, D., Bontcheva, K.: Towards Enhanced Usability of Natural Language Interfaces to Knowledge Bases. In Devedzic V. and Gasevic D. (Eds.), Special issue on Semantic Web and Web 2.0, Annals of Information systems, Springer-Verlag, 2009. 03 J UNE 2010 ESWC 2010 28


Download ppt "Danica Damljanović, Milan Agatonović, Hamish Cunningham contact: Natural Language Interfaces to Ontologies: Combining Syntactic Analysis."

Similar presentations


Ads by Google