13/03/'07 upd 11/03/08 CmpE 588 Spring 2008 EMU
Ontology Construction & Tools
Atilla ELÇİ
Dept. of Computer Engineering, Eastern Mediterranean University
Ontology Development
The Domain Expert’s Expressway: "Ontology Development 101: A Guide to Creating Your First Ontology" by Natalya F. Noy and Deborah L. McGuinness.
Tools used: Protégé with the OntoViz API.
Note that (i) extensive domain knowledge and (ii) skill with ontology tools are both required for building useful ontologies.
Example: Brusa et al., A Process for Building a Domain Ontology, AOW 2006.
Ontology Development Through Knowledge Discovery
The (Syntactic) Discovery Approach [Davies et al., Ch. 2]:
  Knowledge discovery
  Ontology definition
  Semi-automatic ontology construction
  Ontology learning scenarios
  Knowledge discovery for ontology learning
Knowledge Discovery
Knowledge discovery: developing techniques that enable automatic discovery of novel and interesting information from (raw) data.
Lately, unstructured and semi-structured domains are of interest, such as:
  Text Mining
  Web Mining
  Link Analysis (graphs/networks)
  Relational Data Mining (relational / first-order form)
  Stream Mining (analysis of data streams)
=> Semi-Automatic Ontology Construction
Knowledge Discovery (continued)
KD relates to research areas such as:
  Computational Learning Theory: theoretical questions about learnability, computability, and learning algorithms.
  Machine Learning: automated learning and knowledge representation.
  Data Mining: using learning techniques on large-scale real-life data, Web Mining.
  Statistics-cum-Statistical Learning: techniques for data analysis.
Conference: 9th International Conference on Data Warehousing and Knowledge Discovery (DaWaK 2007), Sept. 3-7, 2007, Regensburg, Germany. Proceedings in LNCS.
CFP due dates: submission of abstracts, April 2, 2007; submission of full papers, April 13, 2007. Check the KD subjects.
DaWaK 2008
Ontology Definition
An ontology is a graph / network structure consisting of:
  A set of concepts (vertices in a graph)
  A set of relationships connecting concepts (directed edges in a graph)
  A set of instances of a particular concept or relationship (data records).
Formal/theoretical definitions of ontology as an abstract structure:
  Ehrig et al. (2005): based on similarity measure
  Bloehdorn et al. (2005): through integration of MLs.
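The graph view of an ontology can be sketched in a few lines of Python. This is only an illustrative data structure, not a formal definition from the cited papers: the class name Ontology, its methods, and the sample concepts are assumptions made for the example, and for brevity it stores instances of concepts only, not instances of relationships.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology: concepts are vertices, relations are directed edges."""
    concepts: set = field(default_factory=set)
    # Each relation is a (source concept, relation name, target concept) triple.
    relations: set = field(default_factory=set)
    # Maps a concept name to the data records that instantiate it.
    instances: dict = field(default_factory=dict)

    def add_concept(self, name):
        self.concepts.add(name)
        self.instances.setdefault(name, [])

    def add_relation(self, source, name, target):
        # Both endpoints must already be vertices of the graph.
        if source not in self.concepts or target not in self.concepts:
            raise KeyError("both concepts must exist before relating them")
        self.relations.add((source, name, target))

    def add_instance(self, concept, record):
        if concept not in self.concepts:
            raise KeyError(f"unknown concept: {concept}")
        self.instances[concept].append(record)

    def neighbours(self, concept):
        """Concepts reachable from `concept` over one outgoing edge."""
        return sorted(t for s, _, t in self.relations if s == concept)


# Hypothetical sample content, chosen only for illustration.
onto = Ontology()
for c in ("Course", "Lecturer", "Department"):
    onto.add_concept(c)
onto.add_relation("Lecturer", "teaches", "Course")
onto.add_relation("Lecturer", "memberOf", "Department")
onto.add_instance("Course", "CmpE 588")
```

Storing relations as triples keeps the edges directed and named, which matches the slide's "directed edges in a graph" reading.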
Ontology Engineering
Semi-Automatic Ontology Construction
Ontology life cycle of the DILIGENT ontology engineering and construction methodology: building, local adaptation, analysis, revision, and local update.
Semi-automatic ontology construction (à la the CRISP-DM ‘data mining’ methodology):
1. Domain understanding: interest area.
2. Data understanding: data versus semi-automatic ontology construction.
3. Task definition: tasks of interest that are doable with the available data.
4. Ontology learning: semi-automatic process executing the tasks of step 3.
5. Ontology evaluation: estimating the quality of the solutions to the tasks.
6. Refinement (semi-automatic or manual): human-in-the-loop transformation to improve the ontology.
Business Domain Ontology Domain
Ontology Learning Scenarios
Typical ones are as follows:
  Inducing concepts and clustering of instances (given instances)
  Inducing relations (given concepts and instances)
  Ontology population (given an ontology and relevant but not-associated instances)
  Ontology generation (given instances and background info)
  Ontology update (given an ontology and new instances).
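As one possible reading of the ontology population scenario, the sketch below attaches each unassociated instance to the concept whose descriptive terms it overlaps most, using Jaccard similarity. The functions tokenize, jaccard and populate, the threshold value, and the sample concepts are all assumptions for illustration; real systems would use richer text features.

```python
def tokenize(text):
    """Lower-case a text and return its set of word tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def populate(concept_terms, new_instances, threshold=0.1):
    """Attach each new instance to its most similar concept.

    Instances whose best score is below `threshold` are returned separately
    so that a human can review them.
    """
    assigned = {concept: [] for concept in concept_terms}
    unassigned = []
    for text in new_instances:
        tokens = tokenize(text)
        best = max(concept_terms, key=lambda c: jaccard(tokens, concept_terms[c]))
        if jaccard(tokens, concept_terms[best]) >= threshold:
            assigned[best].append(text)
        else:
            unassigned.append(text)
    return assigned, unassigned


# Hypothetical concepts, each described by a handful of characteristic terms.
concept_terms = {
    "Tool": {"editor", "protege", "reasoner", "plugin"},
    "Method": {"clustering", "learning", "mining", "crawling"},
}
assigned, unassigned = populate(
    concept_terms,
    ["protege ontology editor", "document clustering", "conference brochure"],
)
```

Returning the low-scoring instances separately leaves room for the human-in-the-loop refinement step of the construction process.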
Knowledge Discovery for Ontology Learning
KD aims to extract structure from the data, that is, to map unstructured data into an ontological structure.
Scalability must be kept in mind, since the KD process is necessarily applied to real-life data volumes (~terabytes).
Some KD techniques used in addressing the ontology learning scenarios:
  Unsupervised Learning
  Semi-Supervised, Supervised, and Active Learning
  Stream Mining & Web Mining
  Focused Crawling
  Data Visualization
Unsupervised Learning
Works by grouping like instances through comparing them against each other, and by suggesting labels for the groupings that evolve.
Methods used are:
  Document Clustering
  Latent Semantic Indexing
Ref. Section
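A minimal sketch of document clustering with label suggestion follows. It uses a simple single-pass scheme over bag-of-words vectors with cosine similarity, chosen here for brevity; it is not necessarily the method described in the referenced section, and the function names, the threshold, and the sample documents are assumptions.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term-frequency vector for one document."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def cluster(docs, threshold=0.3):
    """Single-pass clustering: join the most similar cluster or open a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc, ...]}
    for doc in docs:
        vec = vectorize(doc)
        best = max(clusters, key=lambda c: cosine(vec, c["centroid"]), default=None)
        if best is not None and cosine(vec, best["centroid"]) >= threshold:
            best["members"].append(doc)
            best["centroid"].update(vec)  # summed term counts act as the centroid
        else:
            clusters.append({"centroid": Counter(vec), "members": [doc]})
    return clusters


def suggest_label(cluster_, n=2):
    """Propose a label from the cluster's most frequent terms."""
    return [term for term, _ in cluster_["centroid"].most_common(n)]


# Hypothetical mini-corpus.
docs = [
    "ontology editor tool",
    "ontology editor plugin",
    "web usage mining",
    "web structure mining",
]
clusters = cluster(docs)
```

Calling suggest_label on each resulting cluster gives candidate concept names that a domain expert can accept or rename.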
Semi-Supervised, Supervised, and Active Learning
Man-in-the-loop, tool-assisted approaches.
Reference Section 2.6.2
Stream Mining & Web Mining
Stream mining: schemes that run continuously over rapidly changing data.
Web mining:
  Web content mining
  Web structure mining
  Web usage mining
Reference Section
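One common ingredient of stream mining is keeping statistics over only the most recent part of a stream, so that memory stays bounded while the data keeps changing. The sketch below shows a sliding-window term counter; the class name SlidingWindowCounter, the window size, and the sample stream are assumptions for illustration only.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Keep term counts for only the most recent `size` items of a stream."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.counts = Counter()

    def push(self, term):
        self.window.append(term)
        self.counts[term] += 1
        if len(self.window) > self.size:
            # Forget the oldest item so memory stays bounded.
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def top(self, n=1):
        """Most frequent terms within the current window."""
        return self.counts.most_common(n)


# Hypothetical stream of page topics arriving one at a time.
stream = ["rdf", "rdf", "owl", "owl", "owl", "sparql"]
counter = SlidingWindowCounter(size=3)
for term in stream:
    counter.push(term)
```

After the last item only the three most recent terms contribute to the counts, so earlier topics fade out as the stream moves on.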
Focused Crawling
Approaches dealing with collecting documents on the Web.
Reference Section
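The idea of focused crawling can be sketched as a best-first traversal that scores each page against a topic and expands links only from relevant pages. The example below runs over a small in-memory link graph instead of the real Web; the functions relevance and focused_crawl, the scoring rule, the min_score value, and the page names are all assumptions made for illustration.

```python
import heapq


def relevance(text, topic_terms):
    """Fraction of topic terms that occur in the page text."""
    words = set(text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def focused_crawl(pages, links, seed, topic_terms, min_score=0.3, limit=10):
    """Best-first crawl that only follows links out of relevant pages.

    `pages` maps a URL to its text and `links` maps a URL to outgoing URLs;
    both stand in for real HTTP fetching and HTML parsing.
    """
    frontier = [(-1.0, seed)]  # max-heap via negated priority
    seen = {seed}
    collected = []
    while frontier and len(collected) < limit:
        _, url = heapq.heappop(frontier)
        score = relevance(pages.get(url, ""), topic_terms)
        if score < min_score:
            continue  # off-topic page: do not collect it or expand its links
        collected.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                # A link inherits the relevance of the page it was found on.
                heapq.heappush(frontier, (-score, target))
    return collected


# Hypothetical four-page web; the names are placeholders, nothing is fetched.
pages = {
    "a": "ontology tools survey",
    "b": "ontology editor protege",
    "c": "football results",
    "d": "ontology learning from text",
}
links = {"a": ["b", "c"], "c": ["d"], "b": []}
result = focused_crawl(pages, links, "a", {"ontology", "editor", "learning"})
```

Page "d" is on topic but is never reached, because it is linked only from the off-topic page "c". This illustrates the trade-off a focused crawler makes between coverage and crawl cost.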
Data Visualization
For obtaining early measures of data quality, content, and distribution.
Reference Section 2.6.5
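A very small example of such an early look at the data is a text histogram of document lengths, which quickly exposes empty or unusually short records. The functions length_histogram and render, the bin width, and the sample documents are assumptions for illustration; a real project would more likely use a plotting library.

```python
from collections import Counter


def length_histogram(docs, bin_width=5):
    """Count documents per word-count bin; keys are the bins' lower bounds."""
    bins = Counter((len(d.split()) // bin_width) * bin_width for d in docs)
    return dict(sorted(bins.items()))


def render(hist, bin_width=5):
    """Render the histogram as text bars, one line per bin."""
    return "\n".join(
        f"{low:>3}-{low + bin_width - 1:<3} {'#' * count}" for low, count in hist.items()
    )


# Hypothetical corpus; the empty string stands for a record with missing text.
docs = ["", "ontology", "semantic web tools", "a b c d e f", "x y z w v u t"]
hist = length_histogram(docs)
print(render(hist))
```

Here three of the five documents fall in the shortest bin, one of them empty, which is the kind of data quality signal worth checking before any ontology learning is attempted.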
Further References on Ontology Construction
Reference Section 2.7: especially note the Fernandez (1999) paper, which analyzes ontology development approaches against the IEEE Standard for Developing Software Life Cycle Processes.
Reference Section 2.8: note the hints on research directions.
Ontology Development Tools
Ontology Tools Survey, Revisited, by Michael Denny
W3C Semantic Web Tools Wiki page
Commercial SemWebTech Conferences
Semantic Technology Conference (SemTech 2007), May 2007, San Jose, California, USA. A PDF of the conference brochure is available for download at the conference website.
DAMA Intl Symposium & WILLSHIRE Meta-data Conference, 4-8 March 2007, Boston, MA, USA. The full conference program and brochure are available as a PDF download (1.3 MB). Other Willshire Conference tracks.
References
John Davies, Rudi Studer, Paul Warren (Editors): Semantic Web Technologies: Trends and Research in Ontology-based Systems, John Wiley & Sons (July 11, 2006). ISBN: Ch. 2.: pp
Brusa, G., Caliusco, M.L. and Chiotti, O. (2006). A Process for Building a Domain Ontology: an Experience in Developing a Government Budgetary Ontology. In Proc. Second Australasian Ontology Workshop (AOW 2006), Hobart, Australia. CRPIT, 72. Orgun, M.A. and Meyer, T., Eds., ACS
Ontology Tools Survey, Revisited by Michael Denny (published July 14, 2004 on xml.com), along with Michael's famous Ontology Editor Survey 2004 Table.
W3C Semantic Web Tools Wiki page: check Jena, SemWeb, Protégé, Swoop, etc.