Benchmarking in Knowledge Web

Raúl García-Castro, Asunción Gómez-Pérez, Jérôme Euzenat
September 10th, 2004
Research Benchmarking ≠ Industrial Benchmarking

Industrial benchmarking (WP 1.2, from T.A. page 26):
- Point of view: tool recommendation
- Criteria: utility
- Tools: ontology development tools; annotation tools; querying and reasoning services of ontology development tools; merging and alignment tools

Research benchmarking (WP 2.1, from T.A. page 41):
- Point of view: research progress
- Criteria: scalability, robustness, interoperability
- Tools: ontology development tools; annotation tools; querying and reasoning services of ontology development tools; Semantic Web Service technology
Index

- Benchmarking activities in Knowledge Web
- Benchmarking in WP 2.1
- Benchmarking in WP 2.2
- Benchmarking information repository
- Benchmarking in Knowledge Web
Benchmarking activities in KW

Overview of the benchmarking activities:
- Progress
- What to expect from them
- What are their relationships/dependencies
- What could be shared/reused between them
Benchmarking timeline

Progress per work package (each deliverable is finished, started, or not started):
- WP 1.2 (Roberta Cuel): D1.2.1 Utility of ontology development tools; utility of merging, alignment, and annotation tools; performance of querying and reasoning
- WP 1.3 (Luigi Lancieri): D1.3.1 Best practices and guidelines for industry; best practices and guidelines for business cases
- WP 2.1 (Raúl García): D2.1.1 Benchmarking SoA; D2.1.4 Benchmarking methodology, criteria, test suites; D2.1.6 Benchmarking building tools; benchmarking querying, reasoning, annotation; benchmarking web service technology
- WP 2.2 (Jérôme Euzenat): D2.2.2 Benchmarking methodology for alignment; D2.2.4 Benchmarking alignment results
Benchmarking relationships

Tasks:
- SoA on the technology of the scalability WP
- Definition of a methodology and general criteria for benchmarking
- Utility of ontology-based tools
- Benchmarking of ontology building tools
- Design of a benchmark suite for alignment
- Research on alignment techniques and implementations
- Best practices and guidelines

Artifacts passed between tasks: benchmarking overview; SoA on ontology technology evaluation; benchmarking methodology; benchmark suites; benchmarking methodology for alignment; benchmark suite for alignment; best practices.
Benchmarking in WP 2.1

T2.1.1 State of the Art:
- Overview of benchmarking, experimentation, and measurement
- SoA of ontology technology evaluation

T2.1.4 Definition of a methodology and general criteria for ontology tools benchmarking:
- Benchmarking methodology
- Types of tools to be benchmarked: ontology building tools; annotation tools; querying and reasoning services of ontology development tools; Semantic Web Services technology
- General evaluation criteria: interoperability, scalability, robustness
- Test suites for each type of tool
- Benchmarking supporting tools

T2.1.6 Benchmarking of ontology building tools, and T2.1.x benchmarking of querying, reasoning, annotation, and web service technology:
- Specific evaluation criteria: interoperability, scalability, robustness
- Test suites for ontology building tools
- Benchmarking supporting tools
T2.1.1: Benchmarking Ontology Technology (in D2.1.1, "Survey of Scalability Techniques for Reasoning with Ontologies")

Contents:
- Overview of benchmarking, experimentation, and measurement
- State of the art of ontology-based technology evaluation
- Recommendations

Evaluation vs. benchmarking of ontology technology/methods:
- Evaluation yields desired attributes, weaknesses, comparative analysis...
- Benchmarking additionally targets continuous improvement and best practices
- Both rest on measurement and experimentation
T2.1.4: Benchmarking methodology, criteria, and test suites

Methodology:
- Plan: 1 goals identification; 2 subject identification; 3 management involvement; 4 participant identification; 5 planning and resource allocation; 6 partner selection
- Experiment: 7 experiment definition; 8 experiment execution; 9 experiment results analysis
- Improve: 10 report writing; 11 findings communication; 12 findings implementation; 13 recalibration

General evaluation criteria: interoperability, scalability, robustness.

Benchmark suites for: ontology building tools; annotation tools; querying and reasoning services; Semantic Web Services technology.

Benchmarking supporting tools: workload generators, test generators, statistical packages... (see the sketch below)
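As an illustration of what such a supporting tool might look like, here is a minimal workload-generator sketch. It assumes Apache Jena; the class name, namespace, output file name, and the chain-shaped hierarchy are illustrative choices, not part of the deliverable. It emits a synthetic RDF(S) class hierarchy of a requested size, so the same tool can be benchmarked against workloads of increasing scale.

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;
import org.apache.jena.vocabulary.RDFS;

import java.io.FileOutputStream;

/** Generates a synthetic RDF(S) class hierarchy of a given size. */
public class WorkloadGenerator {
    private static final String NS = "http://example.org/bench#"; // illustrative namespace

    public static void main(String[] args) throws Exception {
        int size = Integer.parseInt(args[0]); // number of classes, e.g. 1000
        Model m = ModelFactory.createDefaultModel();
        Resource parent = m.createResource(NS + "C0").addProperty(RDF.type, RDFS.Class);
        for (int i = 1; i < size; i++) {
            // Each class is a subclass of the previous one; a deep chain
            // stresses subsumption handling as the workload grows.
            Resource child = m.createResource(NS + "C" + i).addProperty(RDF.type, RDFS.Class);
            child.addProperty(RDFS.subClassOf, parent);
            parent = child;
        }
        try (FileOutputStream out = new FileOutputStream("workload-" + size + ".rdf")) {
            m.write(out, "RDF/XML");
        }
    }
}
```

Running it with sizes 100, 1 000, 10 000, ... yields a family of test ontologies for the scalability criterion; the shape of the generated hierarchy (depth vs. breadth) is itself a benchmark design decision.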
T2.1.6: Benchmarking of ontology building tools

Interoperability:
- Do the tools import/export from/to RDF(S)/OWL?
- Are the imported/exported ontologies the same?
- Is there any knowledge loss during import/export?...

Partners/Tools: UPM...

Benchmark suites: interoperability (x tests), scalability (y tests), robustness (z tests); for interoperability:
- RDF(S) import capability
- OWL import capability
- RDF(S) export capability
- OWL export capability

Experiments: import/export RDF(S) ontologies; import/export OWL ontologies; check for knowledge loss... (a knowledge-loss check is sketched below)

Experiment results: per-test outcomes (test 1, test 2, test 3...: OK / NO).

Benchmarking results: comparative analysis, weaknesses, (best) practices.
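A knowledge-loss check of this kind can be phrased as a graph round trip. The sketch below assumes Apache Jena and an illustrative file name; in the real benchmark the serialization step would be replaced by an import into and export from the tool under test.

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;

/** Round-trip check: is the re-imported ontology the same RDF graph? */
public class RoundTripCheck {
    public static void main(String[] args) throws Exception {
        Model original = ModelFactory.createDefaultModel();
        try (FileInputStream in = new FileInputStream("test-ontology.rdf")) { // illustrative name
            original.read(in, null, "RDF/XML");
        }

        // Stand-in for "export from the tool, then import the result":
        // here Jena itself serializes and re-parses the ontology.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        original.write(buf, "RDF/XML");
        Model reimported = ModelFactory.createDefaultModel();
        reimported.read(new ByteArrayInputStream(buf.toByteArray()), null, "RDF/XML");

        // No knowledge loss iff the two RDF graphs are isomorphic.
        System.out.println(original.isIsomorphicWith(reimported) ? "OK" : "NO (knowledge loss)");
    }
}
```

Graph isomorphism is a strict notion of "the same ontology"; a benchmark may also want weaker checks (for example, entailment-preserving equivalence) when tools legitimately rewrite constructs.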
Task: Design of a benchmark suite for alignment

Why evaluate?
- Comparing the possible solutions;
- Detecting the best methods;
- Finding out where we perform badly.

Goals:
- For the developer: improving the solutions;
- For the user: choosing the best tools;
- For both: testing compliance with a norm.

How to evaluate?
- Take a real-life case and set a deadline;
- Take several cases and normalize them;
- Take simple cases, identifying what each one highlights (benchmark suite);
- Build a challenge (as in MUC, TREC).

Results:
- Benchmarking methodology for alignment techniques;
- Benchmark suite for alignment;
- First evaluation campaign;
- Greater benchmarking effort.
What has been done?

Information Interpretation and Integration Conference (I3CON), held at the NIST Performance Metrics for Intelligent Systems (PerMIS) Workshop: focuses on "real-life" test cases and compares the global performance of algorithms.
- Facts: 7 ontology pairs; 5 participants; undisclosed target alignments (independently made); alignments requested in a normalized format; evaluation on the F-measure.
- Results: it is difficult to find pairs in the wild (these had to be created); no dominating algorithm, and no case that was hardest for all; 5 participants was the targeted number, we must have more next time!

The Ontology Alignment Contest at the 3rd Evaluation of Ontology-based Tools (EON) Workshop, to be held at the International Semantic Web Conference (ISWC): aims at defining a proper set of benchmark tests for assessing feature-related behavior.
- Facts: 1 ontology and 20 variations (15 hand-crafted to isolate particular aspects); target alignments (made on purpose) published; participants asked for a paper commenting on the tests and the achieved results (as well as the results in normalized format).
- Results: we are currently benchmarking the tools! See you at the EON Workshop, ISWC 2004, Hiroshima, JP, November …
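For reference, the F-measure used in these campaigns is, in its standard set-based form (the usual definitions; a campaign may weight or generalize them), computed from the alignment A returned by a tool and the reference alignment R:

```latex
% A = set of correspondences returned by the tool,
% R = set of correspondences in the reference (target) alignment.
\[
  \mathrm{Precision} = \frac{|A \cap R|}{|A|}, \qquad
  \mathrm{Recall} = \frac{|A \cap R|}{|R|}, \qquad
  F = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
\]
```

The harmonic mean penalizes trading precision against recall, so a tool must do well on both to obtain a high F score.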
What's next?
- More consensus on what's to be done?
- Learn more: take advantage of the remarks;
- Make a more complete campaign: real-world cases + benchmark suite + challenge?
- Provide automated procedures.
Benchmarking information repository

Web pages inside the Knowledge Web portal with:
- General benchmarking information (methodology, criteria, test suites, references, ...)
- Information about the different benchmarking activities in Knowledge Web
- Benchmarking results and lessons learned...

Objectives: inform, coordinate, share/reuse...

Proposal for a benchmarking working group in the SDK cluster.
What is benchmarking in Knowledge Web?

In Knowledge Web:
- Benchmarking is performed over products/methods (not processes).
- Benchmarking is not a continuous process: it ends with findings communication; there is no findings implementation or recalibration.
- Benchmarking technology involves evaluating technology, but it is NOT just evaluating technology: we must extract practices and best practices.

Benchmarking results are needed, both in industry and research: comparative analyses, weaknesses, (best) practices, recommendations, (continuous) improvement.
How much do we share?

Benchmarking methodology, criteria, and test suites:
- Is the view of benchmarking from industry "similar" to the view from research?
- Is it viable to have a common methodology? Will anyone use it?
- Can the test suites be reused between industry and research?
- Would a common way of presenting test suites be useful?...

Benchmarking results:
- Can research benchmarking results be (re)used by industry, and vice versa?
- Would a common way of presenting results be useful?...
Next steps

Provide the benchmarking methodology to industry:
- First draft after the Manchester research meeting: 1st October.
- Feedback from WP 1.2: end of October.
- (Almost) final version by mid-November.

Set up web pages with benchmarking information in the portal: benchmarking activities, methodology, criteria, test suites.

Discuss on a mailing list and agree on a definition of "best practice".

Next meeting? To be decided (around November) (with O2I).
Benchmarking in Knowledge Web

Raúl García-Castro, Asunción Gómez-Pérez, Jérôme Euzenat
September 10th, 2004