Intelligent Database Systems Lab
An open-source toolkit for mining Wikipedia
Authors: David Milne*, Ian H. Witten (2012, AI)
Presenter: Chang, Chun-Chih
Outline: Motivation, Objectives, Methodology, Experiments, Conclusions, Comments
Motivation
The online encyclopedia Wikipedia is a vast, constantly evolving tapestry of interlinked articles. For developers and researchers it represents a giant multilingual database of concepts and semantic relations, a potential resource for natural language processing.
Objectives
The paper introduces the Wikipedia Miner toolkit, an open-source software system that allows researchers and developers to integrate Wikipedia's rich semantics into their own applications. Wikipedia Miner is also intended to be a platform for sharing data mining techniques.
Methodology - Architecture of the Wikipedia Miner toolkit
Methodology - Measuring relatedness between concepts
Methodology - Features for measuring article relatedness
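The slides name the relatedness measure without giving its formula. As a rough illustration only, the sketch below assumes a link-based measure modeled on the normalized Google distance: two articles are related when many of the same articles link to both. The function name `link_relatedness`, its parameters, and the toy link sets are hypothetical and are not taken from the toolkit's API.

```python
import math


def link_relatedness(inlinks_a, inlinks_b, total_articles):
    """Return a relatedness score in [0, 1] for two articles.

    inlinks_a, inlinks_b: IDs of the articles that link to each article.
    total_articles: number of articles in the whole collection.
    Assumption: a normalized-distance formula over shared inlinks;
    this is an illustrative sketch, not the toolkit's implementation.
    """
    a, b = set(inlinks_a), set(inlinks_b)
    common = a & b
    if not common:
        return 0.0  # no shared inlinks: treat the pair as unrelated

    larger = max(len(a), len(b))
    smaller = min(len(a), len(b))
    denominator = math.log(total_articles) - math.log(smaller)
    if denominator <= 0:
        return 1.0  # every article links to both: maximally related

    distance = (math.log(larger) - math.log(len(common))) / denominator
    # Convert the distance to a similarity and clamp it at zero.
    return max(0.0, 1.0 - distance)


# Toy example: two articles sharing two of their four inlinks each.
score = link_relatedness({1, 2, 3, 4}, {3, 4, 5, 6}, total_articles=1000)
print(round(score, 3))
```

Identical inlink sets give a score of 1.0 and disjoint sets give 0.0, so the score can be used directly as one feature among several when comparing articles.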
Experiments - Impact of thresholds for disambiguation and detection
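A threshold experiment of this kind can be sketched generically: keep only predictions whose confidence reaches the threshold, then score them against a gold standard. The sketch below assumes each candidate (a sense or a link) carries a confidence score. The function name, the candidate labels, and the scores are made up for illustration and are not results from the paper.

```python
def precision_recall_f(scored, gold, threshold):
    """Score the predictions that pass a confidence threshold.

    scored: dict mapping each candidate to a confidence in [0, 1].
    gold: set of candidates that are actually correct.
    Returns (precision, recall, f_measure).
    """
    predicted = {item for item, conf in scored.items() if conf >= threshold}
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical candidates and confidences, for illustration only.
scored = {"a": 0.9, "b": 0.6, "c": 0.4, "d": 0.1}
gold = {"a", "b", "d"}

for threshold in (0.0, 0.5, 0.95):
    p, r, f = precision_recall_f(scored, gold, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f} f={f:.2f}")
```

Raising the threshold typically trades recall for precision, which is the effect a threshold sweep is meant to expose.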
Experiments - Impact of relatedness dependencies
Experiments - Impact of training data
Experiments - Performance of the disambiguator
Experiments - Performance of the detector
Conclusions
Our aim in releasing this work open source is not to provide a complete and polished product, but rather a resource for the research community to collaborate around and continue building together.
Comments
Advantages
Applications
- Wikipedia
- Disambiguation
- Annotation