Università di Ancona, CoopIS01, September 6
Cooperation Strategies for Information Integration
Maurizio Panti, Luca Spalazzi, Loris Penserini
Istituto di Informatica - Università degli studi di Ancona
Talk Overview
- Motivations and Goals
- Local strategies
- Cooperation strategies
  - the choice of partners
  - the choice of queries
  - the choice of answers
- Discussion
Motivation
Information systems are collections of information sources and information consumers:
- Distributed
- Heterogeneous
  - at physical level
  - at logical level
  - at conceptual level (names and schemas)
- Dynamic
  - changes of information sources or their schemas
  - changes of information consumers or their needs
Goal
Rewriting a consumer's query into queries to specific information sources when the information system is distributed, heterogeneous, and strongly dynamic.
Related Work
Query rewriting and information integration systems usually adopt the Mediator Architecture [TSIMMIS, Squirrel, WHIPS, Carnot, SIMS, Information Manifold, Infomaster]:
- dynamic sources: systems are overloaded with expensive updating operations;
- dynamic consumers: systems do not perform user profiling.
Information Source
- wrapper
- a description logic as data modelling and query language [e.g. C-Classic]
- source query processing is based on rewriting queries using views over the local source [adapted from Beeri, Levy, Rousset]
Mediator
- components: mediator, Thesaurus, Mediated Schema
- a description logic as data modelling and query language [e.g. C-Classic]
- query processing is based on rewriting queries using views over the distributed sources [adapted from Beeri, Levy, Rousset]
Rewriting query using views
Retrieval: conjunction of concepts that are maximally contained in the query
Rewriting: composition of the rewritings of the retrieved concepts
query: Q = pub.(ai db) pub.acm
[Figure: mediated schema. Concepts: T, journal, type.{Trans}, acm_trans, acm, acm_journal, ai, agents, db. Views: H = pub.(journal db), J = pub.ai, K = pub.(agents db), I = pub..acm_journal]
retrieved for pub.(ai db): K = pub.(agents db); H J = pub.(ai db journal)
retrieved for pub.acm: I = pub..acm_journal
retrieved concepts: {K, (H J)} { I }
rewrite: (view(K) view(I)) U (view(H) view(J) view(I))
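The composition step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `compose_rewritings` and the representation of a conjunction as a tuple of view names are hypothetical, and the operator that combines views inside a conjunction did not survive in the slide text, so the sketch leaves it abstract.

```python
from itertools import product

def compose_rewritings(retrieved_per_component):
    """Pick one retrieved alternative per query component and merge them.

    Each component contributes a list of alternatives; each alternative is a
    tuple of view names (hypothetical representation). The result is a list
    of conjunctions whose union forms the rewriting.
    """
    rewriting = []
    for choice in product(*retrieved_per_component):
        # Flatten the chosen alternatives into a single conjunction of views.
        conjunction = tuple(name for alternative in choice for name in alternative)
        rewriting.append(conjunction)
    return rewriting

# Retrieved concepts from the slide: {K, (H J)} for the first component,
# {I} for the second one.
retrieved = [
    [("K",), ("H", "J")],  # alternatives for pub.(ai db)
    [("I",)],              # alternatives for pub.acm
]
rewriting = compose_rewritings(retrieved)
print(rewriting)  # [('K', 'I'), ('H', 'J', 'I')]
```

The two resulting conjunctions correspond to the two operands of the union in the slide's rewrite line.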
Query Execution
Mediator M
rewrite: (view(K) view(I)) U (view(H) view(J) view(I))
data: #id pub … …..
Local Failures
- In query rewriting: the mediator is not able to rewrite some or all of the components of the input query.
- In query execution: the mediator is not able to execute some or all of the components of the rewritten query.
Cooperation Strategies
[Figure: mediator with Thesaurus and Mediated Schema]
- partner
  - mediator: all the mediators, new mediators, succeeding mediators, failing mediators
  - source: all the sources, new sources, succeeding sources, failing sources
- query
  - original: whole, single component
  - selected concepts: whole, single component
  - rewriting: whole, single component
- answer
  - rewriting
  - data
Cooperation with Mediators: Asking for Rewriting
[Figure: Mediator M, Mediator N, wrappers W1, W2, W3]
1: request
2: rewrite
3: query
4: data
Cooperation with Mediators: Asking for Data
[Figure: Mediator M, Mediator N, wrappers W1, W2, W3]
1: request
2: query
3: data
4: data
Cooperation with Sources
[Figure: Mediator M, Mediator N, wrappers W1, W2, W3]
1: request
2: rewrite
3: data
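The three message sequences above can be contrasted with a toy Python sketch. Everything here is an assumption made for illustration: the classes `Source` and `Mediator`, the three function names, the prefix-based matching inside `Source.rewrite`, and the sample records are all hypothetical, and the mapping of numbered messages to calls is one reading of the diagrams.

```python
class Source:
    """A wrapped information source (W1, W2, ... in the diagrams)."""
    def __init__(self, views):
        self.views = views  # view name -> list of records

    def query(self, view):
        return self.views.get(view, [])

    def rewrite(self, concept):
        # Toy matching rule (an assumption): a local view answers a concept
        # when its name starts with the concept name.
        return [v for v in self.views if v.startswith(concept)]


class Mediator:
    def __init__(self, schema, sources):
        self.schema = schema    # concept -> list of view names
        self.sources = sources  # wrappers this mediator can query

    def rewrite(self, concept):
        return self.schema.get(concept, [])  # empty list: local failure

    def execute(self, views):
        return [r for s in self.sources for v in views for r in s.query(v)]


def ask_for_rewriting(m, n, concept):
    views = n.rewrite(concept)   # 1: request, 2: rewrite
    return m.execute(views)      # 3: query, 4: data


def ask_for_data(n, concept):
    # 1: request, 2: query, 3: data (to N), 4: data (back to M)
    return n.execute(n.rewrite(concept))


def ask_sources(m, concept):
    data = []
    for s in m.sources:                   # 1: request
        for view in s.rewrite(concept):   # 2: rewrite
            data.extend(s.query(view))    # 3: data
    return data


wrappers = [Source({"acm_tods": ["tods-1"]}), Source({"acm_tocl": ["tocl-1"]})]
m = Mediator({}, wrappers)                               # M cannot rewrite "acm"
n = Mediator({"acm": ["acm_tocl", "acm_tods"]}, wrappers)

by_rewriting = ask_for_rewriting(m, n, "acm")
by_data = ask_for_data(n, "acm")
by_sources = ask_sources(m, "acm")
```

In this toy setting all three strategies return the same records; they differ in who performs the rewriting and in how many messages are exchanged.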
Strategy Comparison
m: number of mediators
s: number of sources
s: number of sources that cooperate with N_i
Strategy Comparison
[Table: Sol_N(view(Q)), Sol_N(Q), Sol_N(Q)]
Conclusion: 1st scenario (s m+s)
- mediators can be used for user profiling,
- mediators can be used to solve name heterogeneity and integrate data,
- to solve schema heterogeneity, the most efficient and effective strategy for a mediator is to cooperate directly with sources,
- to update its schemas, a lazy approach may not be appropriate for a mediator.
Conclusion: 2nd scenario (s >> m+s)
- mediators can be used for user profiling,
- the most efficient strategy is cooperation with other mediators,
- cooperation with wrappers is useful only when mediators are not able to rewrite a given query,
- to update its schemas, a lazy approach is appropriate for a mediator.
M's Mediated Schema
Rewriting query using views
Composition of rewriting of retrieved concepts:
view( pub.(ai db) pub.acm )
= view( pub.(ai db) ) view( pub.acm )
= (view(K) view(I)) U (view(H) view(J) view(I))
= …...
Rewriting query using views
= { ( pub. keyword.{Agents} pub.db pub.acm_trans),
    ( pub. keyword.{Agents} pub.db pub. type.{Trans} pub. publisher.{ACM}),
    ( pub.agents pub.db pub.acm_trans),
    ( pub.agents pub.db pub. type.{Trans} pub. publisher.{ACM}),
    ( pub.journal pub.db pub. keyword.{AI} pub.acm_trans),
    … }
= (view(K) view(I)) U (view(H) view(J) view(I))
Local Failure in Query Rewriting
query: Q = pub.(ai db) affiliation.{Stanford}
[Figure: mediated schema. Concepts: T, journal, type.{Trans}, acm_trans, acm, acm_journal, ai, agents, db. Views: H = pub.(journal db), J = pub.ai, K = pub.(agents db), I = pub..acm_journal]
retrieved for pub.(ai db): K = pub.(agents db); H J = pub.(ai db journal)
retrieved for affiliation.{Stanford}: Ø
retrieved concepts: {K, (H J)} Ø
rewrite: failure
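A small Python sketch shows why an empty retrieved set for one component makes the whole rewriting fail, and how the failing component can be singled out for cooperation. The function name `rewrite_or_fail` and the tuple representation are hypothetical, chosen only for this illustration.

```python
from itertools import product

def rewrite_or_fail(retrieved_per_component):
    """Return the list of conjunctions, or None when the rewriting fails."""
    combos = [
        tuple(name for alternative in choice for name in alternative)
        for choice in product(*retrieved_per_component)
    ]
    # With an empty component the product is empty, so no conjunction exists.
    return combos if combos else None

retrieved = [
    [("K",), ("H", "J")],  # alternatives for pub.(ai db)
    [],                    # nothing retrieved for affiliation.{Stanford}
]
result = rewrite_or_fail(retrieved)

# Indices of the components the mediator could not rewrite; these are the
# candidates for a "single component" request to a partner.
failed_components = [i for i, alts in enumerate(retrieved) if not alts]
```

Here `result` is `None` and `failed_components` identifies the second component, which matches the slide: the first component is rewritable locally, the second is not.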
Local Failure in Query Execution
Mediator M
rewrite: (view(K) view(I)) U (view(H) view(J) view(I))
sources: no answer
data: failure
Cooperation with Mediators: Asking for Rewriting
Mediator M
query: pub.acm
Mediator N
[Figure: N's mediated schema. Concepts: T, acm, acm_tods, acm_tocl. Views: pub.acm_tods, pub.acm_tocl]
retrieved concepts: { pub.acm_tocl, pub.acm_tods }
rewrite: { view( pub.acm_tocl ), view( pub.acm_tods ) }
Cooperation with Mediators: Asking for Rewriting
Mediator M
rewrite: { view( pub.acm_tocl ), view( pub.acm_tods ) }
data: #id pub … …..
Cooperation with Mediators: Asking for Data
Mediator M
query: pub.acm
Mediator N
[Figure: N's mediated schema. Concepts: T, acm, acm_tods, acm_tocl. Views: pub.acm_tods, pub.acm_tocl]
retrieved concepts: { pub.acm_tocl, pub.acm_tods }
rewrite: { view( pub.acm_tocl ), view( pub.acm_tods ) }
Cooperation with Mediators: Asking for Data
Mediator N
[Figure: N's mediated schema. Concepts: T, acm, acm_tods, acm_tocl. Views: pub.acm_tods, pub.acm_tocl]
rewrite: { view( pub.acm_tocl ), view( pub.acm_tods ) }
data: #id pub … …..
Cooperation with Mediators: Asking for Data
Mediator N
[Figure: N's mediated schema. Concepts: T, acm, acm_tods, acm_tocl. Views: pub.acm_tods, pub.acm_tocl]
rewrite: { view( pub.acm_tocl ), view( pub.acm_tods ) }
data: #id pub … …..
Mediator M
query: pub.acm
rewrite: { view( pub.acm_tocl ), view( pub.acm_tods ) }
data: #id pub … …..
Cooperation with Sources
Mediator M
query: pub.acm
retrieved: { pub.acm_ccs, pub.acm_jacm, pub.acm_tods }
rewrite: { view( pub.acm_ccs ), view( pub.acm_jacm ), view( pub.acm_tods ) }
data: #id pub … …..
Redundancy
C_n(M): mediated schema of M after n interactions with N
C(N): mediated schema of N
Recall
C_n(M): mediated schema of M after n interactions with S_1, …, S_n
information need of a consumer (a view of S_1, …, S_n)
Precision
C_n(M): mediated schema of M after n interactions with S_1, …, S_n
information need of a consumer (a view of S_1, …, S_n)
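The formulas for these measures did not survive in the slide text. As a hedged illustration only, the sketch below uses the standard set-based definitions of recall and precision from information retrieval, applied to a mediated schema and an information need that are both modelled as sets of concept names. This modelling, the function names, and the sample concept sets are assumptions, not the authors' definitions.

```python
def recall(schema, need):
    """Fraction of the needed concepts that the mediated schema covers."""
    return len(schema & need) / len(need) if need else 1.0

def precision(schema, need):
    """Fraction of the mediated schema that is relevant to the need."""
    return len(schema & need) / len(schema) if schema else 1.0

# Hypothetical example: C_n(M) after some interactions, and a consumer's need.
mediated_schema = {"acm_tods", "acm_tocl", "ai"}
information_need = {"acm_tods", "acm_tocl", "acm_jacm", "acm_ccs"}

r = recall(mediated_schema, information_need)     # 2 of 4 needed concepts
p = precision(mediated_schema, information_need)  # 2 of 3 schema concepts
```

Under these assumptions, each interaction that adds a needed concept to the schema raises recall, while adding concepts outside the need lowers precision.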
N's Mediated Schema
M's Mediated Schema (updated)
Cooperation with Mediators
Theorem.
C_n(M): mediated schema of M after n interactions with N
C(N): mediated schema of N
redundancy:
Cooperation with Mediators
Theorem.
C_n(M): mediated schema of M after n interactions with N
information need of a consumer (a view of S_1, …, S_n)
recall:
precision:
Cooperation with Sources
Theorem.
C_n(M): mediated schema of M after n interactions with S_1, …, S_n
information need of a consumer (a view of S_1, …, S_n)
recall:
precision: