Automatic Extraction and Incorporation of Purpose Data into PurposeNet. P. Kiran Mayee, Rajeev Sangal, Soma Paul. SCONLI3, JNU, New Delhi.


1

2 Automatic Extraction and Incorporation of Purpose Data into PurposeNet. P. Kiran Mayee, Rajeev Sangal, Soma Paul. SCONLI3, JNU, New Delhi

3 INTRODUCTION
- Purpose
- Need for a knowledge base of objects and actions in which the knowledge is organized around purpose.

4 PurposeNet
PurposeNet is an intelligent knowledge-based system dealing with specialized attributes of artifacts – namely, their purpose and the purpose of their types, components and accessories, as well as data about their birth, processes, side-effects, maintenance and result on destruction.

5 PurposeNet

6 Building the PurposeNet
- Template designing
- Revision and refinement of template
- Selection of domain
- Information retrieval from Web
- Ontology population
- Testing

7 Need for Automation
- Acquisition bottleneck
- Massive availability of text
- Availability of purpose cues

8 Purpose data required
- Artifact -- garage
- Purpose
  - Action -- store
  - Upon -- vehicle

9 Purpose Cues
- Word(s)
- Lexical entities in a particular order
- Classification
  - Sentences beginning with artifact name
  - Sentences ending with artifact name
  - Sentences containing artifact name
  - Hidden cues
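The first three cue classes above depend only on where the artifact name sits in the sentence, so they can be told apart with a simple token comparison. The following is a minimal sketch under that assumption, not the authors' implementation: the function name `cue_class`, the article list, and the tokenisation are all hypothetical, and hidden cues are not handled (such sentences simply return "none").

```python
def cue_class(sentence: str, artifact: str) -> str:
    """Return 'beginning', 'ending', 'containing', or 'none' (hypothetical labels)."""
    # Split on whitespace, strip surrounding punctuation, lower-case for comparison.
    tokens = [t.strip(".,;:!?\"'()").lower() for t in sentence.split()]
    tokens = [t for t in tokens if t]
    name = artifact.lower().split()
    n = len(name)
    if n == 0:
        return "none"
    # Assumption: a leading article ("a", "an", "the") does not count as
    # coming before the artifact name.
    articles = {"a", "an", "the"}
    body = tokens[1:] if tokens and tokens[0] in articles else tokens
    if body[:n] == name:
        return "beginning"
    if tokens[-n:] == name:
        return "ending"
    if any(tokens[i:i + n] == name for i in range(len(tokens) - n + 1)):
        return "containing"
    return "none"  # artifact absent, or a hidden cue this sketch cannot see
```

For instance, `cue_class("We cut trees with an axe.", "axe")` falls in the "ending" class, and `cue_class("Use the air+pump to fill the tyre.", "air+pump")` falls in the "containing" class.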

10 Sentences commencing with artifact name

11 Sentences ending with artifact name
Example: "We cut trees with an axe." (cut = action, trees = upon, axe = artifact)

12 Sentences containing artifact name
Example: "Use the air+pump to fill the tyre."
Pattern: "Use the ___ to ___ the ___"

13 Methodology for purpose data extraction

14 Algorithm for Purpose Data Extraction
Algorithm PurpDataExtract(corpus)
Step 1: Read first sentence in corpus.
Step 2: Loop until end-of-corpus:
  2a. if contains(sentence, artifact) and match(sentence, cuetable) then
        extract(sentence, artifact)
        extract(sentence, to_action)
        extract(sentence, to_upon)
        add_to_ontology(artifact, to_action, to_upon)
  2b. else go to Step 3.
Step 3: Read next sentence.
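The loop above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the presenters' code: the cue table here holds just two hypothetical regular expressions, modelled on the two example sentences from the earlier slides, and the names `Purpose`, `CUE_TABLE` and `extract_purposes` are invented for the sketch. A real cue table would be far larger.

```python
import re
from typing import Iterable, List, NamedTuple, Set


class Purpose(NamedTuple):
    artifact: str
    action: str
    upon: str


# Hypothetical cue table: each cue is a regex with named groups for the
# three slots the algorithm extracts.
CUE_TABLE = [
    # e.g. "We cut trees with an axe."  (action, upon, then artifact)
    re.compile(
        r"^(?:we|you|people)\s+(?P<action>\w+)\s+(?P<upon>[\w+ ]+?)"
        r"\s+with\s+(?:an?|the)\s+(?P<artifact>[\w+]+)\.?$",
        re.IGNORECASE,
    ),
    # e.g. "Use the air+pump to fill the tyre."  (artifact, action, then upon)
    re.compile(
        r"^use\s+(?:an?|the)\s+(?P<artifact>[\w+]+)\s+to\s+(?P<action>\w+)"
        r"\s+(?:an?\s+|the\s+)?(?P<upon>[\w+ ]+?)\.?$",
        re.IGNORECASE,
    ),
]


def extract_purposes(corpus: Iterable[str], artifacts: Set[str]) -> List[Purpose]:
    """Scan sentences one by one and collect (artifact, action, upon) triples."""
    ontology: List[Purpose] = []
    for sentence in corpus:                      # Steps 1 and 3: read each sentence
        sentence = sentence.strip()
        lowered = sentence.lower()
        # Step 2a, first test: the sentence must mention a known artifact.
        if not any(name in lowered for name in artifacts):
            continue
        # Step 2a, second test: the sentence must match a cue in the table.
        for cue in CUE_TABLE:
            m = cue.match(sentence)
            if m and m.group("artifact").lower() in artifacts:
                ontology.append(
                    Purpose(
                        m.group("artifact").lower(),
                        m.group("action").lower(),
                        m.group("upon").lower(),
                    )
                )
                break                            # one triple per sentence
    return ontology
```

Called with the sentences "We cut trees with an axe." and "Use the air+pump to fill the tyre." and the artifact set `{"axe", "air+pump"}`, the sketch yields the triples (axe, cut, trees) and (air+pump, fill, tyre). Sentences that mention no listed artifact, or match no cue, are skipped.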

15 Data
- Wikipedia – 249 files
- WordNet – 81,837 descriptions
- Princeton noun-artifact corpus – 82,115 sentences

16 Observations – summary results

17 Purpose Data Extraction Misses

18 IE Metrics for Extraction

19 Result BreakUp per Cue Class

20 Comparison with manually built Ontology
- Exponential increase in speed
- High error rate

21 Issues
- Redundancy
- Primary purpose not always obtained
- Pronouns and brand names
- Correctness and consistency not guaranteed
- One-to-one mapping assumed
- Other sentence manifestations

22 Further Enhancements
- Parsed input
- Cues for hidden case
- Better artifact lookup list
- Multipage lookup for consistency
- Cloud computing
- Automating other attributes of PurposeNet

23 Conclusions
- A methodology was proposed for automated ontology population of PurposeNet.
- The methodology was implemented on three corpora.
- The time taken for PurposeNet 'purpose' ontology population was a fraction of that taken by manual methods.
- The error rate was found to be high.

24 Thank You

