1
Realizing Interoperability of Heterogeneous Repositories
Daniel Olmedilla
L3S Research Center / Hannover University
Programa de Postgrado en Ingeniería Informática y de Telecomunicación (Máster y Doctorado)
Universidad Autónoma de Madrid, 10th April, 2008
2
Daniel Olmedilla, Apr. 10th, 2008, Universidad Autónoma de Madrid
Outline
- Introduction and Motivation
- Interoperability: what is it and why is it needed?
- Common Query Interface
- Common Metadata Schema
- Ranking
- Successful Interoperability Demonstrations
- Conclusions & Open Issues
3
Outline
4
Introduction: Simple Motivation Scenario (I)
Simple scenario: Alice is interested in learning about Windows and would like to attend a lecture about it this year.
5
Introduction: Simple Motivation Scenario (& II)
6
Introduction: Search Engine Limitations
- Unstructured information and lack of semantics
- Size and coverage of the Web
- Hidden Web (also called Deep Web)
- Personalized ranking
7
Introduction: Other Approaches: Coalitions
- Repositories are interconnected
  - Lack of standards, ad-hoc solutions
  - Individual agreement required to join
- Approaches
  - Replication: loss of control over the data, which is sometimes undesirable
  - Federated search: costly because of the lack of standards
8
Introduction: Other Approaches: P2P Networks
- Advantages
  - Scalability
  - No single point of failure
  - Control remains with owners
  - Dynamicity
- Disadvantages
  - Decrease in performance
  - Ad-hoc interfaces: lack of interoperability
9
Introduction: A Bit More Complex Motivation Scenario
Alice is a consultant and she has been asked to lead a project starting in two months. Now she needs to retrieve courses in order to
- refresh and improve her previous knowledge of project management
- get some basic knowledge about accounting and auditing
- practice her advanced level of English
10
Introduction: Problem Statement
The lack of standards and of appropriate integration solutions prevents users from easily and effectively finding resources relevant to their needs.
11
Outline
12
Interoperability: What and Why? Exercise 1: Simple Questions
- What is interoperability?
- What does it mean for two systems to interoperate?
- And at the information level?
13
Interoperability: What and Why? What Is It?
Summary from existing definitions:
- Ability to work together to accomplish a common task
- Work in conjunction
- Exchange information and USE it
- Provided at different levels
- Without increasing the effort of the user
[Concise Oxford Dictionary, NISO, IEEE: Standard Computer Dictionary, DMReview, Whatis.com]
14
Interoperability: What and Why? Interoperability Encompasses …
- Technical interoperability
- Semantic interoperability
- Political interoperability
- Inter-community interoperability
- Legal interoperability
- International interoperability
15
Interoperability: What and Why? Investment in Technology
- ICT globally: $1.45 trillion annually
- Technology in Europe: €6.4 billion in 2004
- Increasing (10% more than the previous year)
[Money for Growth, The European Technology Investment Report 2005. PricewaterhouseCoopers Report, Jun. 2005]
16
Interoperability: What and Why? Key Technological Issues (I)
- Survey of 38 industry associations in 27 different countries
- The most significant technology issues … included
  - Integration (21%)
  - Standards (20%)
[International Survey of E-Commerce. World Information Technology and Services Alliance (WITSA), 2000]
17
Interoperability: What and Why? Key Technological Issues (& II)
[International Survey of E-Commerce. World Information Technology and Services Alliance (WITSA), 2000]
18
Interoperability: What and Why? Interoperability Inhibited by Cost
“Although interoperability is a significant strategic direction, it is often inhibited by cost”
[Survey: Integration costs still hamper agility. Computerworld Today, February 2006]
19
Interoperability: What and Why? User Effectiveness: Some Facts
- Knowledge workers spend from 15% to 35% of their time searching for information
- Searchers are successful in finding what they seek 50% of the time or less
- Total loss from not finding the right information: estimated at $2.5 to $3.5 million per year for an enterprise with 1,000 knowledge workers
- Opportunity cost: potential additional revenue of $15 million annually
[Feldman. The high cost of not finding information. IDC White Paper & KMWorld Magazine, 2004]
20
Interoperability: What and Why? Challenges to Achieve It
21
Interoperability: What and Why? E-Learning Study Analysis: Technical Requirements
- Training life cycle in companies across Europe
- Retrieving learning services from a wide variety of providers
- Search heuristics
- Metadata queries
- Matching skill gaps with learning service selections
- Matching personal development gaps with learning services
[Gunnarsdottir. User Trials – Evaluation Report. EU IST ELENA Deliverable, May 2005]
22
Outline
23
Common Communication Interface: Simple Query Interface (SQI)
- Simple but highly flexible: targets different interoperability scenarios
- Official CEN/ISSS Workshop Agreement since October 2006
- Listed by IMS on Query Services
- Widely adopted in the e-learning community
24
Common Communication Interface: Simple Query Interface: Design Issues
- Independent of query language, result format and vocabularies
- Complex information sources may be queried (e.g., P2P networks)
- Synchronous and asynchronous
- Support for lightweight implementations
- Stateful and stateless
- Separation of access control and search
- Easy extensibility
25
Common Communication Interface: Simple Query Interface: Session Management
- Authentication and authorization are requirements
- They are independent of the search interface
- The separation is managed via sessions
  - session createAnonymousSession()
  - session createSession(user, passwd)
  - destroySession(sessionId)
- Other methods are allowed (e.g., based on credentials or trust negotiations)
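To illustrate how session handling can live apart from the search interface, here is a minimal Python sketch of the three session methods named on the slide. The class name `SessionManager`, the in-memory account table, the `is_valid` helper and the random tokens are assumptions made for this example; they are not taken from the SQI specification.

```python
import secrets


class SessionManager:
    """Toy in-memory session service (hypothetical, not SQI reference code)."""

    def __init__(self, accounts):
        # accounts: dict user name -> password. Illustration only; a real
        # service would never keep plain-text passwords.
        self._accounts = dict(accounts)
        self._sessions = {}  # session id -> user name (None = anonymous)

    def create_anonymous_session(self):
        return self._open(None)

    def create_session(self, user, passwd):
        if user not in self._accounts or self._accounts[user] != passwd:
            raise PermissionError("wrong credentials")
        return self._open(user)

    def destroy_session(self, session_id):
        # Raises KeyError if the session does not exist.
        del self._sessions[session_id]

    def is_valid(self, session_id):
        # Hypothetical helper a search service could call before answering.
        return session_id in self._sessions

    def _open(self, user):
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = user
        return session_id
```

A search service would then only receive the session identifier and ask the session service whether it is still valid, which keeps access control and search separate.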
26
Common Communication Interface: Traditional Access Control in Decentralized Systems
- Assumption: I already know you, you have a local account!
- Not a member?
27
Common Communication Interface: Trust Negotiation: Features
- Trust is based on the parties’ properties
- Every party can define access control policies to control outsiders’ access to its sensitive resources
- Trust is established iteratively and bilaterally by the disclosure of certificates and by requests for certificates
28
Common Communication Interface: Trust Negotiation: Example
- Step 1: Alice requests a service from Bob
- Step 2: Bob discloses his policy for the service
- Step 3: Alice discloses her policy for VISA
- Step 4: Bob discloses his BBB credential
- Step 5: Alice discloses her VISA card credential
- Step 6: Bob grants access to the service
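The exchange between Alice and Bob can be sketched as a loop in which each party discloses a credential as soon as the other party has disclosed everything its policy asks for. This is a simplified eager strategy written for illustration: the function names and the representation of policies as sets of credential names are assumptions, and the explicit policy disclosures (steps 2 and 3) are not modeled.

```python
def _disclose(owner, creds, already, seen_from_other, transcript):
    """Disclose every credential whose policy is now satisfied.

    creds: dict credential -> set of credentials the other party must show first.
    Returns True if at least one new credential was disclosed.
    """
    progress = False
    for cred, policy in creds.items():
        if cred not in already and policy <= seen_from_other:
            already.add(cred)
            transcript.append(f"{owner} discloses {cred}")
            progress = True
    return progress


def negotiate(service_policy, client_creds, server_creds):
    """Return (access_granted, transcript) for a client/server negotiation.

    service_policy: set of client credentials the server requires.
    """
    from_client, from_server = set(), set()
    transcript = []
    progress = True
    while progress and not service_policy <= from_client:
        server_moved = _disclose("Bob", server_creds, from_server, from_client, transcript)
        client_moved = _disclose("Alice", client_creds, from_client, from_server, transcript)
        progress = server_moved or client_moved
    return service_policy <= from_client, transcript
```

For the slide's scenario, Bob's service requires `{"VISA"}`, Alice protects her VISA credential with the policy `{"BBB"}`, and Bob's BBB credential is freely disclosable (empty policy). If Bob had no BBB credential, the loop would stop without progress and access would be denied.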
29
Common Communication Interface: Simple Query Interface: Query (I)
30
Common Communication Interface: Simple Query Interface: Query (& II)
31
Common Communication Interface: P2P Proxying Architecture
[Brunkhorst, Olmedilla. Interoperability for peer-to-peer networks: Opening P2P to the rest of the World. EC-TEL, Oct 2006]
32
Outline
33
Common Metadata Schema: Data Integration
- Local as View (LAV)
- Global as View (GAV)
34
Common Metadata Schema: Data Integration
- Reformulating a given query in terms of the sources
  - is easier in GAV (it just needs unfolding of the query)
  - is harder in LAV
- Adding a new source
  - is supposedly easier in LAV (one just needs to express the new source as a view of the global schema)
  - is harder in GAV (as the global schema needs to be revised)
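As a small illustration of why query reformulation is simple in GAV, the sketch below defines one global relation as a view over two sources and answers a query by unfolding that definition. The source contents, field names and the `answer` function are all invented for this example.

```python
# Two hypothetical sources with different local schemas.
SOURCE_A = [{"name": "Intro to Accounting", "lang": "en"}]
SOURCE_B = [{"titel": "Projektmanagement", "sprache": "de"}]

# GAV: each global relation is defined as a view over the sources.
GAV_MAPPING = {
    "course": lambda: (
        [{"title": r["name"], "language": r["lang"]} for r in SOURCE_A]
        + [{"title": r["titel"], "language": r["sprache"]} for r in SOURCE_B]
    ),
}


def answer(global_relation, predicate):
    """Unfold the global relation into its source-level definition, then filter."""
    rows = GAV_MAPPING[global_relation]()
    return [row for row in rows if predicate(row)]


german_courses = answer("course", lambda row: row["language"] == "de")
```

Adding a third source in this style means editing the definition of `course` itself, which is the GAV drawback noted above; in LAV the new source would instead be described as a view of the global schema, at the price of harder query rewriting.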
35
Common Metadata Schema: Simple Learning Resource Schema
36
Common Metadata Schema: Complex Learning Resource Schema
37
Common Metadata Schema: Competence Requirements
Excerpt extracted from a newspaper:
- Complete Master’s Degree (any faculty)
- Expert knowledge in Java (J2EE, Servlets, JSP)
- Very good IT English and / or Spanish
Drawbacks:
- Does not indicate what is mandatory or optional
- It is not machine-understandable
38
Common Metadata Schema: Competence Definition
“an effective performance within a domain / context at different levels of proficiency”
Example: Competency “English Language”, Level “Advanced”, Context “Computer Science”
39
Common Metadata Schema: Competency
- We use IEEE RCD to represent a competency
- It uniquely identifies an isolated competency
- It is enriched with human-readable titles and descriptions
40
Common Metadata Schema: Proficiency Level
- Reusable scales of totally ordered proficiency levels
- Each level is identified by an ID, a human-readable label and an optional mapping to a numerical domain
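One possible way to represent such a scale is sketched below. The class and field names (`Level`, `ProficiencyScale`, `at_least`) are assumptions for illustration and do not come from the schema shown in the talk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Level:
    level_id: str
    label: str                      # human-readable label
    value: Optional[float] = None   # optional mapping to a numerical domain


class ProficiencyScale:
    """A reusable, totally ordered list of levels, lowest first."""

    def __init__(self, levels):
        self._levels = list(levels)
        self._rank = {lvl.level_id: i for i, lvl in enumerate(self._levels)}
        if len(self._rank) != len(self._levels):
            raise ValueError("level ids must be unique")

    def at_least(self, held_id, required_id):
        """True if the held level is equal to or above the required level."""
        return self._rank[held_id] >= self._rank[required_id]


scale = ProficiencyScale([
    Level("basic", "Basic", 1.0),
    Level("intermediate", "Intermediate"),
    Level("advanced", "Advanced", 3.0),
])
```

Because the order comes from the position in the list, a level without a numerical value (here `intermediate`) can still be compared with the others.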
41
Common Metadata Schema: Context
“... the interlaced conditions in which something exists or occurs”
- Competences might be interpreted differently in a different context
- Contexts are defined in tree-like hierarchies
  - Easier to model and to handle
  - Simpler algorithms, no cycle detection necessary
- May optionally link to additional ontologies
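The benefit of a tree-like hierarchy can be seen in a short sketch: each context has at most one parent, and the parent must exist before a child is added, so walking up to the root always terminates without any cycle check. The `ContextTree` class and the context names are hypothetical.

```python
class ContextTree:
    """Contexts organized as a tree (each context has at most one parent)."""

    def __init__(self):
        self._parent = {}

    def add(self, context, parent=None):
        if context in self._parent:
            raise ValueError(f"{context!r} already defined")
        if parent is not None and parent not in self._parent:
            raise KeyError(f"unknown parent {parent!r}")
        self._parent[context] = parent

    def ancestors(self, context):
        """Return the chain of ancestors, nearest first."""
        chain = []
        node = self._parent[context]
        while node is not None:
            chain.append(node)
            node = self._parent[node]
        return chain

    def is_within(self, context, ancestor):
        """True if context equals ancestor or lies below it in the tree."""
        return context == ancestor or ancestor in self.ancestors(context)


tree = ContextTree()
tree.add("Science")
tree.add("Computer Science", parent="Science")
tree.add("Databases", parent="Computer Science")
```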
42
Common Metadata Schema: Competence
- Links to the dimension objects
  - High degree of reusability
  - Better support for gap analysis
- Competences can be simple or composed of other (arbitrarily nested) competences
  - Aggregation
  - Set
  - Selection
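A rough sketch of how linking a competence to its three dimensions supports gap analysis is given below. It makes several simplifying assumptions that are not in the talk: levels are plain integers (higher is better), contexts must match exactly, and a composite competence is treated as "all parts required", ignoring the distinction between aggregation, set and selection.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Competence:
    competency: str   # identifier of the competency
    level: int        # rank within a proficiency scale, higher = better
    context: str


@dataclass
class CompositeCompetence:
    parts: list = field(default_factory=list)  # Competence or CompositeCompetence


def flatten(item):
    """Return the simple competences contained in an arbitrarily nested item."""
    if isinstance(item, Competence):
        return [item]
    result = []
    for part in item.parts:
        result.extend(flatten(part))
    return result


def gap(required, held):
    """Required simple competences not covered by any held competence."""
    def covered(req):
        return any(
            h.competency == req.competency
            and h.context == req.context
            and h.level >= req.level
            for h in held
        )
    return [req for req in flatten(required) if not covered(req)]


required = CompositeCompetence([
    Competence("project-management", 2, "consulting"),
    CompositeCompetence([
        Competence("accounting", 1, "consulting"),
        Competence("english", 3, "consulting"),
    ]),
])
held = [
    Competence("project-management", 1, "consulting"),
    Competence("english", 3, "consulting"),
]
missing = gap(required, held)
```

With these made-up values, project management (held below the required level) and accounting (not held at all) form the gap, while English is already covered.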
43
Common Metadata Schema: A Bit More Complex Motivation Scenario (Revisited)
Alice is a consultant and she has been asked to lead a project starting in two months. Now she needs to retrieve courses in order to
- refresh and improve her previous knowledge of project management
- get some basic knowledge about accounting and auditing
- practice her advanced level of English
44
Outline
45
Ranking: PageRank
- Page score based on the link structure of the Web
- It measures page popularity
  - Page i pointing to page j means a vote from i to j
  - The more backlinks a page has, the more important it is
- Score: sum of the ranks of the backlinks
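The idea of "votes" from backlinks can be made concrete with a short power-iteration sketch. It uses the commonly published formulation in which each page splits its rank evenly among its outgoing links and a damping factor models random jumps; the damping value 0.85, the iteration count and the three-page graph are assumptions for this example, not figures from the talk.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank by power iteration.

    links: dict page -> list of pages it points to (every page must be a key).
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the random-jump share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, targets in links.items():
            # A page without outgoing links spreads its rank over all pages.
            receivers = targets if targets else pages
            share = damping * rank[page] / len(receivers)
            for target in receivers:
                new_rank[target] += share
        rank = new_rank
    return rank


toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = pagerank(toy_web)
```

In this toy graph page `c` is pointed to by both `a` and `b`, so it ends up with a higher score than `b`, which only receives half of `a`'s vote.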
46
Ranking: PageRank Example
47
Ranking: PageRank Personalization
- PageRank has a personalization vector
- Computationally expensive: it is not possible to run the whole computation for each user
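A minimal sketch of what the personalization vector does: instead of jumping to any page with equal probability, the random surfer jumps back to pages according to a user-specific preference distribution. The function name, the damping value and the toy graph are assumptions for illustration.

```python
def personalized_pagerank(links, preference, damping=0.85, iterations=100):
    """Power iteration with a personalization (preference) vector.

    links: dict page -> list of pages it points to (every page must be a key).
    preference: dict page -> jump probability; the values should sum to 1.
    """
    pages = list(links)
    rank = {p: preference.get(p, 0.0) for p in pages}
    for _ in range(iterations):
        # Random jumps land on the preferred pages only.
        new_rank = {p: (1.0 - damping) * preference.get(p, 0.0) for p in pages}
        for page, targets in links.items():
            # A page without outgoing links spreads its rank over all pages.
            receivers = targets if targets else pages
            share = damping * rank[page] / len(receivers)
            for target in receivers:
                new_rank[target] += share
        rank = new_rank
    return rank


toy_web = {"a": ["b"], "b": ["c"], "c": ["a", "b"]}
likes_a = personalized_pagerank(toy_web, {"a": 1.0})
likes_b = personalized_pagerank(toy_web, {"b": 1.0})
```

Running this once per user over a full Web crawl is exactly the cost the slide warns about, since every distinct preference vector requires its own iteration.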
48
Ranking: Personalized PageRank
- Hubs: pages pointing to many important pages
- Compute one Personalized PageRank Vector (PPV) for each user
- Challenges:
  - Reduce the storage required
  - Reduce the time for computation
- Each PPV corresponding to a Preference Set P can be expressed as a linear combination of Basis Hub Vectors
- Each Basis Hub Vector is decomposed into two parts:
  - Hub skeleton vector (common interrelationships, precomputed)
  - Partial vector (unique values, computed at construction time)
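The linear-combination property can be written down as follows. The notation is an assumption chosen for this note: $A$ is the column-normalized link matrix, $c$ the jump probability, $u$ the preference vector, $x_p$ the unit vector of hub page $p$, and $r_p$ the basis hub vector obtained by personalizing on $p$ alone.

```latex
% Notation assumed for illustration: A link matrix, c jump probability,
% u preference vector, x_p unit vector of page p, r_p basis hub vector of p.
\begin{align}
  v &= (1 - c)\,A\,v + c\,u \\
  u = \sum_{p \in P} \alpha_p\, x_p,\quad \sum_{p \in P} \alpha_p = 1
  \;&\Longrightarrow\;
  v = \sum_{p \in P} \alpha_p\, r_p
\end{align}
```

Since the first equation is linear in $u$, only the basis hub vectors $r_p$ need to be available; the vector for any preference set over the hubs is a weighted sum of them.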
49
Ranking: Personalized PageRank Limitations
- Personalization relies on the user’s ability to choose a good Preference Set
  - High-quality hubs which match his preferences
- This process can be automated:
  - Information collected from the user can be used to derive his Preference Set
  - The user does not even need to know what a hub is
50
Ranking: A Personalized Ranking Platform (I)
51
Ranking: A Personalized Ranking Platform (II)
- The user’s interests are determined by
  - Most surfed pages
  - The user’s bookmarks
- We get a set of pages from the user, but they are not highly ranked hubs
- HubFinder is an algorithm to find related web pages
  - It allows pluggable filtering mechanisms
- We use HubRank to find highly rated hubs related to a given initial set of pages
- User web pages → set of related, highly rated hubs
52
Ranking: A Personalized Ranking Platform (& III)
53
Ranking: Selected Example (I)
- Crawl with 3,000,000 web pages
- 30 bookmarks
  - 15 on architecture
  - 7 on traveling
  - 6 on software
  - 2 on sports
- 78 selected surfed pages
- Computed 1300 pages as hub set
54
Ranking: Selected Example (II)
Rel. = relevant, P.Rel. = partially relevant, Irrel. = irrelevant

Query keyword   PageRank                PPR                     PROS
                Rel.   P.Rel.  Irrel.   Rel.   P.Rel.  Irrel.   Rel.   P.Rel.  Irrel.
architecture    5      3       2        3      7       0        8      2       0
building        3      2       5        2      3       5        4      1       5
Paris           6      0       4        2      3       5        6      2       2
park            6      0       4        8      0       2        10     0       0
surf            3      0       7        4      2       4        7      2       1
Total           23     5       22       19     15      16       35     7       8
55
Ranking: Selected Example (& III)
56
Outline
57
Successful Interoperability Demonstrations: HCD-Online: Advanced Network Search
58
Successful Interoperability Demonstrations: PROLEARN & GLOBE
59
Successful Interoperability Demonstrations: TENCompetence, MACE, MELT, …
60
Outline
61
Conclusions & Further Work: Conclusions
- Interoperability is a key technological issue
- The lack of standards and of reusable integration solutions prevents users from finding the information they need
62
Conclusions & Further Work: Main Contributions
1. Identification of requirements for system interoperability
2. Specification and standardization of the Simple Query Interface
3. SQI-based open-source components for easy adoption by information providers
4. Proxying architecture for distributed environments such as P2P networks
5. Data models and ontologies for the semantic representation of learning objects and competences
6. Semantic integration based on query rewriting mechanisms
7. New personalized ranking algorithms for linked and unlinked corpora
8. Proof-of-concept integrated prototypes
9. Demonstration of interoperability through several networks and projects worldwide
63
Conclusions & Further Work: Further Work
- Interfaces for services other than search (e.g., publishing)
- More research on flexible query languages (e.g., PLQL)
- Development and evolution of schemas
- Adaptation, optimization and improvement of ranking algorithms
64
Questions?
olmedilla@L3S.de - http://www.olmedilla.info/
Thanks!