Feza Baskaya – Anne Kakkonen - University of Tampere - 20071 Second International Seminar on Subject Access to Information, Helsinki 30th November 2007.

Slides:



Advertisements
Similar presentations
PowerPoint Tutorial 1 Creating a Presentation
Advertisements

Word Tutorial 2 Editing and Formatting a Document
Excel Tutorial 3 Working with Formulas and Functions
Objectives Create an action query to create a table
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Scaling distributed search for diagnostics and prognostics applications Prof. Jim Austin Computer Science, University of York UK CEO Cybula Ltd.
The Wiki and the Blog NIH Wiki Fair D. Calvin Andrus Chief Technology Officer Center for Mission Innovation Central Intelligence Agency 28 February 2007.
Multilinguality & Semantic Search Eelco Mossel (University of Hamburg) Review Meeting, January 2008, Zürich.
Alexandria Digital Library Project Integration of Knowledge Organization Systems into Digital Library Architectures Linda Hill, Olha Buchel, Greg Janée.
3 october Brown easyBorrow (beta) Brown University Library October 2007.
Word Tutorial 1 Creating a Document
PNS: Personalized Multi-Source News Delivery Georgios Paliouras(1), Mouzakidis Alexandros(1), Christos Ntoutsis(2), Angelos Alexopoulos(3), Christos Skourlas(2)
V3.0 12/04/ For Coral Systems With PRI Services Presented by Education Technology Services.
Linked Lists: Locking, Lock- Free, and Beyond … Based on the companion slides for The Art of Multiprocessor Programming (Maurice Herlihy & Nir Shavit)
Objectives Explore a structured range of data Freeze rows and columns
March 8, Dynamic Fault Tree analysis using Input/Output Interactive Markov Chains Hichem Boudali, Pepijn Crouzen, and Mariëlle Stoelinga. Formal.
Word Tutorial 5 Working with Templates and Outlines
Helping Online Students Do Better Academically with Useful Learning & Study Strategies. Some recommended instructor activities to help students succeed.
AskMe A Web-Based FAQ Management Tool Alex Albu. Background Fast responses to customer inquiries – key factor in customer satisfaction Costs for customer.
SEVENPRO – STREP KEG seminar, Prague, 8/November/2007 © SEVENPRO Consortium SEVENPRO – Semantic Virtual Engineering Environment for Product.
Management Information Systems, Sixth Edition
Leveraging Your Taxonomy to Increase User Productivity MAIQuery and TM Navtree.
Galia Angelova Institute for Parallel Processing, Bulgarian Academy of Sciences Visualisation and Semantic Structuring of Content (some.
NaLIX: A Generic Natural Language Search Environment for XML Data Presented by: Erik Mathisen 02/12/2008.
Presentation Outline  Project Aims  Introduction of Digital Video Library  Introduction of Our Work  Considerations and Approach  Design and Implementation.
Visual Web Information Extraction With Lixto Robert Baumgartner Sergio Flesca Georg Gottlob.
BTW (“By The Way…”) Information Annotation By Rudd Stevens, Jason Endo University of San Francisco.
File Systems and Databases
Presentation Outline  Project Aims  Introduction of Digital Video Library  Introduction of Our Work  Considerations and Approach  Design and Implementation.
ReQuest (Validating Semantic Searches) Norman Piedade de Noronha 16 th July, 2004.
Chapter 2: The Visual Studio.NET Development Environment Visual Basic.NET Programming: From Problem Analysis to Program Design.
Automatic Data Ramon Lawrence University of Manitoba
Editing Description Logic Ontologies with the Protege OWL Plugin.
ACCESS TO QUALITY RESOURCES ON RUSSIA Tanja Pursiainen, University of Helsinki, Aleksanteri institute. EVA 2004 Moscow, 29 November 2004.
8/28/97Organization of Information in Collections Introduction to Description: Dublin Core and History University of California, Berkeley School of Information.
Search Engines and Information Retrieval Chapter 1.
SITools Enhanced Use of Laboratory Services and Data Romain Conseil
Questys Text & Image Management System Records Management for the Information Age.
ICS-FORTH January 11, Thesaurus Mapping Martin Doerr Foundation for Research and Technology - Hellas Institute of Computer Science Bath, UK, January.
© Copyright 2008 STI INNSBRUCK NLP Interchange Format José M. García.
© 2005 Prentice Hall, Decision Support Systems and Intelligent Systems, 7th Edition, Turban, Aronson, and Liang 5-1 Chapter 5 Business Intelligence: Data.
Adaptive Hypermedia Tutorial System Based on AHA Jing Zhai Dublin City University.
Chapter 8 Collecting Data with Forms. Chapter 8 Lessons Introduction 1.Plan and create a form 2.Edit and format a form 3.Work with form objects 4.Test.
Design of a Search Engine for Metadata Search Based on Metalogy Ing-Xiang Chen, Che-Min Chen,and Cheng-Zen Yang Dept. of Computer Engineering and Science.
Project Overview Graduate Selection Process Project Goal Automate the Selection Process.
WIRED Week 3 Syllabus Update (next week) Readings Overview - Quick Review of Last Week’s IR Models (if time) - Evaluating IR Systems - Understanding Queries.
Personalized Interaction With Semantic Information Portals Eric Schwarzkopf DFKI
Copyright © 2006 Pilothouse Consulting Inc. All rights reserved. Search Overview Search Features: WSS and Office Search Architecture Content Sources and.
3 Copyright © 2004, Oracle. All rights reserved. Working in the Forms Developer Environment.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Strategies for subject navigation of linked Web sites using RDF topic maps Carol Jean Godby Devon Smith OCLC Online Computer Library Center Knowledge Technologies.
Comparing Document Segmentation for Passage Retrieval in Question Answering Jorg Tiedemann University of Groningen presented by: Moy’awiah Al-Shannaq
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
This material is based upon work supported by the U.S. Department of Energy Office of Science under Cooperative Agreement DE-SC Michigan State.
Internet Searching the World Wide Web. The Internet and the World Wide Web The Internet is a worldwide collection of networks that allows people to communicate.
5/29/2001Y. D. Wu & M. Liu1 Content Management for Digital Library May 29, 2001.
1 Copyright © 2008, Oracle. All rights reserved. Repository Basics.
Semantic Web. P2 Introduction Information management facilities not keeping pace with the capacity of our information storage. –Information Overload –haphazardly.
Working in the Forms Developer Environment
Designing Cross-Language Information Retrieval System using various Techniques of Query Expansion and Indexing for Improved Performance  Hello everyone,
Kenneth Baclawski et. al. PSB /11/7 Sa-Im Shin
Web Engineering.
File Systems and Databases

Knowledge Based Workflow Building Architecture
Introduction to Information Retrieval
Presentation transcript:

Feza Baskaya – Anne Kakkonen - University of Tampere Second International Seminar on Subject Access to Information, Helsinki 30th November 2007 QUCCOO Query Construction with Ontology Ontology-based Search Interface Feza BASKAYA Anne KAKKONEN University of Tampere Department of Information Studies

Feza Baskaya – Anne Kakkonen - University of Tampere Outline  1. Background  2. Ontologies  3. Quccoo: Searching Unannotated Collections through Ontologies  4. ShOE: Creating ontologies  5. Discussion, Conclusion

Feza Baskaya – Anne Kakkonen - University of Tampere Background  Vast online information environments  billions of digital documents  many different natural languages  distributed document production and publication: no generally agreed rules  general lack of control in the process  much spam and other unwanted material

Feza Baskaya – Anne Kakkonen - University of Tampere Background, 2  Vocabulary mismatch  hard to guess the best search keys; leads to loss of search effectiveness  especially in foreign languages  hard to know word forms, compound treatment  Other problems – depending on one’s search environment  collection dependency, metadata dependency  engine and query language dependency

Feza Baskaya – Anne Kakkonen - University of Tampere Ontologies  Ontologies model semantics  concepts  rich relationships  support inference  application means resource annotation  closely related to thesauri  Belief: ontologies can solve the vocabulary problem  represents the semantics of resources (documents) better than pure natural language  retrieval becomes correct and accurate  desired: a universal world model, and a controlled language for description and reasoning about this model

Feza Baskaya – Anne Kakkonen - University of Tampere Issues in Classification and Indexing  Index languages -  modeling - coverage, viewpoint  maintenance - ageing, cost  Indexing -  specificity, exhaustivity, consistency  cost - where paid, who pays?  The over-specificity the devices created often lead to poor recall and thus they were soon mostly abandoned

Feza Baskaya – Anne Kakkonen - University of Tampere Any Room for Ontologies?  Should one thus discard ontologies?  or other vocabulary control tools?  In practice, realism tells us that  there will never be a comprehensive & up-to-date ontology – cf. UDC, which had large development community support  no one will annotate for free, for ever & consistently  no one can do that exhaustively and from many viewpoints emerging, e.g., in future  in fact, less than 0.3% of web pages had Dublin Core metadata (Rasmussen 2003)  There is no alternative to searching unannotated collections  automatic annotation does not solve the problem - if one aims at the good semantics required in the Semantic Web

Feza Baskaya – Anne Kakkonen - University of Tampere Searching Unannotated Collections through Ontologies  Searching ontologies can  provide conceptual organization  support direct access to textual content translate between concepts and textual variation translate between natural languages hide search engines / query languages may support other media / structures / features  be light-weight, narrow, and no world models personal, group or small community support versions, mutually incoherent, easily modifiable easily disposable, perhaps tradable

Feza Baskaya – Anne Kakkonen - University of Tampere Searching through Ontologies  Need to solve the vocabulary problem from concepts to textual expressions  three layers: Concepts - for user interaction Expressions - for system use Strings to match - for system use  Need to provide a handy concept browser and query constructor

Feza Baskaya – Anne Kakkonen - University of Tampere Three levels  Forest industry  forest industry  paper industry  saw mill ...  pl(saw, mill)  al(industry)  pl(paperi, tehdas) Conceptual level Linguistic level String level Concepts Search keys Character strings Search terms Codes & abbreviations Search words String constantsString patterns

Feza Baskaya – Anne Kakkonen - University of Tampere QUCCOO: Principles  QUCCOO: QUery ConstruCtion with OntOlogies for direct content access  Based on the three levels …  Aims to provide independence of …  expression variability (nutraceutical?)  natural language (French?)  collection (intranet, Web,…)  indexing (lemmatization, compounds?)  availability of metadata & world model  engine & query language (Lemur, Trip, Google, …)  You just select your concepts, targets and go!  Point, click and go

Feza Baskaya – Anne Kakkonen - University of Tampere QUCCOO: status  Web application, uses state-of-the-art Servlet technology  Supports diverse full-text database engines (Trip, InQuery, etc.) as well web search engines (e.g., Google)  Supports diverse collections  Intuitive; simple interface to access information  Supports multilingual search and various index types

Feza Baskaya – Anne Kakkonen - University of Tampere QUCCOO : Architecture search Conceptual model request Ontology server Postgress RDBMS Postgress RDBMS Java servlet Java servlet Document servers Request Concepts Concepts Query Expanded Query Results Client- AppletServer-Servlet KB DDB InQuery DDB InQuery DDB TRIP DDB TRIP DDB Lemur DDB Lemur Web Google Results 1. snippet 2. snippet... n. snippet c c c c c c c c c own keys query

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Ontology Tree Concepts given by user Search box Options button Search button

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Search Engine Selection

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface LiberalitySelection

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Expansion Level Selection

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Ontology Selection

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Database Selection

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Query Lang. Selection

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface User ID

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Ontology Lang. Selection

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Query result page

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Trip Database Engine results Ontology in Finnish

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Inquery Database Engine Results Ontology in Finnish

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Google Results in Finnish Ontology in Finnish

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Google Results in English Ontology in Finnish

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Google Results in Swedish Ontology in Finnish

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Google Results with Logging Facility Ontology in Finnish

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Ontology in English Google Results in English

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Ontology in English Google Results in Finnish

Feza Baskaya – Anne Kakkonen - University of Tampere Quccoo - interface Extra keyword(s) added by user

Feza Baskaya – Anne Kakkonen - University of Tampere ShOE: Creating ontologies  Search ontology editor - for creating ontologies  supports the 3 layer architecture of QUCCOO  intuitive; easy to learn and use  automatic support for the human editor  Multilingual in many aspects  GUI, User Interface language can be changed  Concepts names can be edited/displayed on-the- fly in different languages  Expressions can be edited/displayed on-the-fly in different languages.

Feza Baskaya – Anne Kakkonen - University of Tampere ShOE: implementation  Well-designed modular object-oriented architecture based on MVC paradigm  Platform independent; written in Java  Flexible; e.g. uses XML as file format, configurable tables, with XML configurable menu structure  Robust  Extensible via Plug-ins

Feza Baskaya – Anne Kakkonen - University of Tampere ShOE - Main window Concept properties Concept hierarchy tree Concept description Tabs Expression window Search field

Feza Baskaya – Anne Kakkonen - University of Tampere Conclusion  ShOE and QUCCOO are one answer to problems in semantic information access  light-weight disposable search ontologies for full content access  independencies of: collections (partially), indexing ways, availability of metadata / annotations changes of needs, variability of ”world models” search engines, query languages vocabulary variation and natural languages  a compromise, different from semantic annotation or indexing, with control at the user end

Feza Baskaya – Anne Kakkonen - University of Tampere User testing

Feza Baskaya – Anne Kakkonen - University of Tampere Cross-language Web search  Test persons  40 students from the University of Tampere and Pirkanmaa polytechnic  Ontology  Combination of two ontologies: Food concepts and geographical concepts  2 interfaces  QUCCOO + interface without ontology (basic Google search)  4 simulated search tasks  Two tasks with one interface and two with the other

Feza Baskaya – Anne Kakkonen - University of Tampere Analysis  Log files  Queries  Relevance assessments (scale 0-4)  Questionnaires  Opinions about ontology and Quccoo- interface

Feza Baskaya – Anne Kakkonen - University of Tampere Results: search success  No significant difference between systems  QUCCOO performed better when strong query structure was needed (“alcoholic beverage”)  In most self-formulated queries no phrases were used → QUCCOO helps persons who are not used to formulate structured queries

Feza Baskaya – Anne Kakkonen - University of Tampere Results: opinions  ”Structure of the ontology was logical”  ”Finding search concepts needed in the tasks in ontology was easy”  ”Using the ontology was effortless”  92 % agreed in all

Feza Baskaya – Anne Kakkonen - University of Tampere Results: opinions  32/40 thought that QUCCOO-interface was easier to use  32/40 liked QUCCOO better  Why?  Helped users to clarify task topic and to find related search keys  Made cross-language search easy (in 80% of direct searches some dictionary was used to help query formulation)

Feza Baskaya – Anne Kakkonen - University of Tampere Discussion Thank you! Over to you... questions?