ReSeTrus Development of a digital library technology based on redundancy elimination and semantic elevation, with special emphasis on trust management.

Slides:



Advertisements
Similar presentations
Slide 1 of 18 Uncertainty Representation and Reasoning with MEBN/PR-OWL Kathryn Blackmond Laskey Paulo C. G. da Costa The Volgenau School of Information.
Advertisements

2 Introduction A central issue in supporting interoperability is achieving type compatibility. Type compatibility allows (a) entities developed by various.
Combining OS and OSM Data - A Case Study for geospatial data integration Name: Du Heshan Supervisor: Dr Suchith Anand.
The 20th International Conference on Software Engineering and Knowledge Engineering (SEKE2008) Department of Electrical and Computer Engineering
A Stepwise Modeling Approach for Individual Media Semantics Annett Mitschick, Klaus Meißner TU Dresden, Department of Computer Science, Multimedia Technology.
Heterogeneous Data Warehouse Analysis and Dimensional Integration Marius Octavian Olaru XXVI Cycle Computer Engineering and Science Advisor: Prof. Maurizio.
Konstanz, Jens Gerken ZuiScat An Overview of data quality problems and data cleaning solution approaches Data Cleaning Seminarvortrag: Digital.
Page 1 Integrating Multiple Data Sources using a Standardized XML Dictionary Ramon Lawrence Integrating Multiple Data Sources using a Standardized XML.
An Approach to Evaluate Data Trustworthiness Based on Data Provenance Department of Computer Science Purdue University.
Planning for Flexible Integration via Service-Oriented Architecture (SOA) APSR Forum – The Well-Integrated Repository Sydney, Australia February 2006 Sandy.
1 Workshop on Novel Technologies for Digital Preservation of Cultural Heritage Collections, Ormylia, 21-22/5/2004 LABORATORIES ON SCIENCE AND TECHNOLOGY.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
Distributed DBMSs A distributed database is a single logical database that is physically distributed to computers on a network. Homogeneous DDBMS has the.
A Review of Ontology Mapping, Merging, and Integration Presenter: Yihong Ding.
Advanced Topics COMP163: Database Management Systems University of the Pacific December 9, 2008.
1 Lecture 13: Database Heterogeneity Debriefing Project Phase 2.
1 Data & Database Development. 2 Data File Bit Byte Field Record File Database Entity Attribute Key field Key file management concepts include:
Semantics For the Semantic Web: The Implicit, the Formal and The Powerful Amit Sheth, Cartic Ramakrishnan, Christopher Thomas CS751 Spring 2005 Presenter:
Intelligent Systems Group Emmanuel Fernandez Larry Mazlack Ali Minai (coordinator) Carla Purdy William Wee.
The information integration wizard (Iwiz) project Report on work in progress Joachim Hammer Presented by Muhammed Al-Muhammed.
Agent-based E-travel Agency Agent Systems Laboratory Oklahoma State University
LÊ QU Ố C HUY ID: QLU OUTLINE  What is data mining ?  Major issues in data mining 2.
1st MODINIS workshop Identity management in eGovernment Frank Robben General manager Crossroads Bank for Social Security Strategic advisor Federal Public.
CONTI’2008, 5-6 June 2008, TIMISOARA 1 Towards a digital content management system Gheorghe Sebestyen-Pal, Tünde Bálint, Bogdan Moscaliuc, Agnes Sebestyen-Pal.
Database Design - Lecture 1
SC32 WG2 Metadata Standards Tutorial Metadata Registries and Big Data WG2 N1945 June 9, 2014 Beijing, China.
An approach to Intelligent Information Fusion in Sensor Saturated Urban Environments Charalampos Doulaverakis Centre for Research and Technology Hellas.
Database System Concepts and Architecture
Development of a Digital Library Technology Based on Redundancy Elimination and Semantic Elevation, With Special Emphasis on Trust Management and Security.
Data Mining Chapter 1 Introduction -- Basic Data Mining Tasks -- Related Concepts -- Data Mining Techniques.
Linked-data and the Internet of Things Payam Barnaghi Centre for Communication Systems Research University of Surrey March 2012.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
RCDL Conference, Petrozavodsk, Russia Context-Based Retrieval in Digital Libraries: Approach and Technological Framework Kurt Sandkuhl, Alexander Smirnov,
Ontology Summit 2015 Track C Report-back Summit Synthesis Session 1, 19 Feb 2015.
INTERACTIVE ANALYSIS OF COMPUTER CRIMES PRESENTED FOR CS-689 ON 10/12/2000 BY NAGAKALYANA ESKALA.
Semantics for Cybersecurity and Privacy Tim Finin, UMBC Joint work with Anupam Joshi, Karuna Joshi, Zareen Syed andmany UMBC graduate students
Page 1 Alliver™ Page 2 Scenario Users Contents Properties Contexts Tags Users Context Listener Set of contents Service Reasoner GPS Navigator.
Dr. Bhavani Thuraisingham August 2006 Building Trustworthy Semantic Webs Unit #1: Introduction to The Semantic Web.
Scenarios for a Learning GRID Online Educa Nov 30 – Dec 2, 2005, Berlin, Germany Nicola Capuano, Agathe Merceron, PierLuigi Ritrovato
A Context Model based on Ontological Languages: a Proposal for Information Visualization School of Informatics Castilla-La Mancha University Ramón Hervás.
Grid Computing & Semantic Web. Grid Computing Proposed with the idea of electric power grid; Aims at integrating large-scale (global scale) computing.
Online Construction of Analytical Prediction Models for Physical Environments: Application to Traffic Scene Modeling Anurag Umbarkar, Shreyas K Rajagopal.
Database Environment Chapter 2. Data Independence Sometimes the way data are physically organized depends on the requirements of the application. Result:
Ontology-Centered Personalized Presentation of Knowledge Extracted from the Web Ralitsa Angelova.
Metadata Common Vocabulary a journey from a glossary to an ontology of statistical metadata, and back Sérgio Bacelar
Panel Session: Dependability and Security in Complex and Critical Information Systems Department of Communications and Information Engineering University.
1 Resolving Schematic Discrepancy in the Integration of Entity-Relationship Schemas Qi He Tok Wang Ling Dept. of Computer Science School of Computing National.
Issues in Ontology-based Information integration By Zhan Cui, Dean Jones and Paul O’Brien.
Digital Libraries1 David Rashty. Digital Libraries2 “A library is an arsenal of liberty” Anonymous.
26/05/2005 Research Infrastructures - 'eInfrastructure: Grid initiatives‘ FP INFRASTRUCTURES-71 DIMMI Project a DI gital M ulti M edia I nfrastructure.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. 1 Mining knowledge from natural language texts using fuzzy associated concept mapping Presenter : Wu,
Information Integration 15 th Meeting Course Name: Business Intelligence Year: 2009.
1. 2 Purpose of This Presentation ◆ To explain how spacecraft can be virtualized by using a standard modeling method; ◆ To introduce the basic concept.
Extracting value from grey literature Processes and technologies for aggregating and analysing the hidden Big Data treasure of the organisations.
Database Environment Chapter 2. The Three-Level ANSI-SPARC Architecture External Level Conceptual Level Internal Level Physical Data.
GoRelations: an Intuitive Query System for DBPedia Lushan Han and Tim Finin 15 November 2011
VERA AULIA ( ).  Oil palm is one of the major edible oil traded in the global market.  Oil palm tree will start to produce fruits within three.
IT 5433 LM3 Relational Data Model. Learning Objectives: List the 5 properties of relations List the properties of a candidate key, primary key and foreign.
What is a database? (a supplement, not a substitute for Chapter 1…) some slides copied/modified from text Collection of Data? Data vs. information Example:
Internet of Things Approach to Cloud-Based Smart Car Parking
The Semantic Web By: Maulik Parikh.
facilitating the Net-enabled Ecosystem
Introduction Multimedia initial focus
Data and Applications Security Developments and Directions
Lecture #11: Ontology Engineering Dr. Bhavani Thuraisingham
Personalized Social Image Recommendation
Big Data Quality the next semantic challenge
Big Data Quality the next semantic challenge
Stepping on Earth: A Roadmap for Data-driven Agent-Based Modelling
Big Data Quality the next semantic challenge
Presentation transcript:

ReSeTrus Development of a digital library technology based on redundancy elimination and semantic elevation, with special emphasis on trust management and security maximization Merging data sources based on semantics, contexts and trust

Outline Problem State-of-the-art Preliminary Proposal Example of use case Future work

Problem Merging of data from heterogeneous sources is becoming a common need Scenarios include: – Analyses of heterogeneous datasets collectivelly – Enrichment of private data source with some (open) on-line source – Reducing redundancy among datasets by merging them into one – etc.

State-of-the-art Several state-of-the-art approaches for matching and merging in the fields of: – (Trust-aware) schema and ontology matching – (Relational) entity resolution – Data integration – Data deduplication – Information retreival – etc.

State-of-the-art – open problem Lack of general and complete solutions for matching and merging Approaches mainly address only selected issues of more general problem, e.g.: – Variability of execution only partially supported – Supporting only homogeneous sources (with predefined level of semantics) – Trustworthiness of data and sources not considered

Preliminary Entity resolution (i.e. matching) Redundancy elimination (i.e. merging) Semantic elevation (i.e. interoperability) Trust management Security assurance Concept modeling

Preliminary – example

Proposal General and complete solution for matching and merging heterogeneous data sources (with trust management) Integration, and/or joint optimization, of: – Entity resolution & trust management (with semantic elevation) – Redundancy elimination & trust management (with semantic elevation) – Semantic elevation & security assurance – Concept modeling & security assurance

Three level architecture: abstract, semantic (ontologies) and data level (networks) Information-based (a) and data-based (b) view Serialization of levels with knowledge chunks (attribute- value representation) Data architecture

General framework

Formal representation of all possible operations in relevant (identified) dimensions Contexts control and characterize each execution Contexts

Trust management Trust modeled as a relationship: one entity’s (trust) attitude towards another Different trust management methodologies to derive trustworhiness of some entity Trust management on three levels (data source, attribute and knowledge chunk)

Entity resolution & trust management (with semantic elevation) Resolution of entities in the data (i.e. matching) Collective (agglomerative) clustering alg.: at each step, merge most similar clusters Joint similarity measure: – Attribute similarity on data level – Relational similarity on data level (collective) – Semantic similarity on semantic level All measures employ trust management

Redundancy elimination & trust management (with semantic elevat.) Elimination of redundancy within and among datasets on semantic level (i.e. merging) Trust in values seen as probability of the correctness Use of probability theory to derive most probable value, given the evidence observed (i.e. most trustworhy value)

Use case Insurance fraud Globaly 100b$ losses Most complex / expensive – organized fraud – Group of fraudsters – More vehicles / insured objects – More claims

Use case - Insurance fraud 1/6 Entity resolution (i.e. matching) – Combine information of more than one insurance company (insurance association) – Combine different insurance types – Match publicly available personal data (Facebook, Twiter…) – Match publicly available knowledge (DBpedia, Wikipedia, WolframAlpha, Carfax, Police registries…) – Match information to multimedia (Flicker, public Webcams…)

Use case - Insurance fraud 2/6 Redundancy elimination (i.e. merging) – Same claims in different insurance companies – Personal data in public data sources – Knowledge in public data sources – Only partial information

Use case - Insurance fraud 3/6 Semantic elevation (i.e. interoperability) – Elevate a set of personal information: private insurance data + other insurances + publicly available personal sources – Elevate information about accident: insurance companies + internet data (maps, weather) + internet sensors (webcams…) – Elevate information about objects: private data + public general (DBpedia, wikipedia…) + public speciffic (Carfax, police registries…)

Use case - Insurance fraud 4/6 Trust management – Select datasource – Select attributes – Resolve inconsistencies – Adaptable/dynamic trust assessment

Use case - Insurance fraud 5/6 Security assurance / maximization – Dynamic authorization – according to persons’ context (organizational role) – Consider local rules Personal data protection – Strict auditing according to the local ligislation – Strict change management

Use case - Insurance fraud 6/6 Concept modeling / data mining – Adaptable trust algorithms – Adaptable semantic elevation – Image recognition – Text mining

Future work Collective (dynamic) trust management (Collective) semantic similarity measures Soft computing, fuzzy logic (trust, contexts) Hypernetworks Semantic elevation & security assurance Concept modeling & security assurance