
Determining the Consistency of Information Between Multiple Subsystems Used in Maritime Domain Awareness
Marie-Odette St-Hilaire (a), Anthony Isenor (b)
(a) OODA Technologies
(b) Defence Research and Development Canada

Objective
Present the concept of information consistency, using information pertinent to maritime domain awareness (MDA) collected from multiple subsystems, and propose a methodology to quantify that consistency.

Outline
MDA, information quality and trust
Information consistency
Prototype to assess consistency of information between multiple sources
Cross comparison of information from multiple data sources
Difficulties in comparing ship-related information
Consistency visualization

Maritime Domain Awareness
MDA is the effective understanding of anything associated with the maritime domain that could impact the security, safety, economy, or environment.

MDA and Information
Heart of MDA: quality information gained from a combination of sources.
Examples of sources:
maritime, land, air and space surveillance systems
other government departments and the commercial sector
public web sites

Quality and Trust
Information accessibility brings issues:
Information overload
Lack of processing capabilities
Trust issues
…
=> Quality of information impacts trust

Consistency as a data quality attribute
A piece of information is consistent if it does not conflict with other information.
Source A   IMO:   Name: Albatros   Flag: Canada
Source B   IMO:   Name: Albatros   Flag: Canada
Source C   IMO:   Name: Albatros   Flag: Spain
Inconsistent information: the flag reported by Source C (Spain) conflicts with the flag reported by Sources A and B (Canada).
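The Albatros example can be checked mechanically. The following is only a minimal sketch, not the project's implementation: the function name `conflicting_items` and the dictionary layout are assumptions made for illustration, and the IMO values, which the slide does not show, are left out.

```python
def conflicting_items(records):
    """Return {item: {source: value}} for every item whose values differ across sources.

    `records` maps a source name to that source's {item: value} description of one ship.
    (Hypothetical layout, chosen for this illustration only.)
    """
    by_item = {}
    for source, record in records.items():
        for item, value in record.items():
            by_item.setdefault(item, {})[source] = value
    # An item is inconsistent as soon as two sources report different values for it.
    return {item: values for item, values in by_item.items()
            if len(set(values.values())) > 1}


records = {
    "Source A": {"Name": "Albatros", "Flag": "Canada"},
    "Source B": {"Name": "Albatros", "Flag": "Canada"},
    "Source C": {"Name": "Albatros", "Flag": "Spain"},
}
conflicts = conflicting_items(records)
print(conflicts)
```

Only the Flag item is reported, since all three sources agree on the name.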

Assumption
Consistency in information helps build the trust we place in the information.
=> To develop trust in information, similar data items from the various sources should be compared.

Compare-MDA Project
Architecture
Multiple information sources
Consistency evaluation
Consistency visualization

High Level Description
A Service-Oriented Architecture (SOA) was developed to compare the information from diverse sources. It allows one to:
Assess the consistency of MDA-related information contained in DRDC databases and on web sites
Compare information from disparate sources
Quantify a source's consistency
Visualize consistency results within a Google Earth environment

Compare-MDA Framework
[Figure: framework diagram showing Source_A to Source_D, each with its own web service (WS), grouped as DRDC applications and web sites, together with the Consistency Application (CA), the CA WS and the CA client.]

Data Sources
Source A and B: DRDC databases
Source C: ITU web site
Source D: ShipSpotting web site
All exposed as web services in the framework
[Figure: Source_A to Source_D, each with its own web service (WS), grouped as DRDC applications and web sites.]

DRDC Data Sources
2 MDA-related databases:
Both contain static and positional information
One contains ship photographs
Both exposed as web services in the framework

ITU Web Site

ShipSpotting Web Site

Consistency Application
Functionalities and information flow
Consistency score
Comparing similar items
Consistency statistics
Difficulties in assessing consistency

CA Functionalities
Communicate with the data sources:
Send queries
Process responses
Align diverse source vocabularies
Identify unique ship objects within different source responses
Compare ship attributes among sources
Persist comparison results and ship attributes
[Figure: the Consistency Application (CA) and the data sources.]

Information Flow
CA Manager: builds queries for the data source (DS) services.
Queries: sent by the CA to Source A … Source D.
Vocabulary Solution (Alignment): applied to the ship descriptions returned by the sources.
Vessel Matcher: creates comparison groups (unique ship entities).
Consistency Check: compares items within a comparison group.
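The alignment and matching steps of this flow can be sketched as follows. Everything here is an illustrative assumption rather than the project's design: the per-source field names in `ALIGNMENT`, the choice of the IMO number as the matching key, and the placeholder identifiers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical source vocabularies mapped onto one common vocabulary.
ALIGNMENT = {
    "Source A": {"vessel_name": "Name", "imo_no": "IMO", "flag_state": "Flag"},
    "Source D": {"ShipName": "Name", "IMO": "IMO", "Flag": "Flag"},
}


def align(source, raw):
    """Rename a source-specific record's fields to the common vocabulary."""
    mapping = ALIGNMENT[source]
    return {mapping[field]: value for field, value in raw.items() if field in mapping}


def build_comparison_groups(responses, key="IMO"):
    """Group aligned ship descriptions by a shared identifier.

    Assumption: one identifier (here the IMO number) is enough to recognise
    the same ship in different sources. The slides do not state the rule used.
    """
    groups = defaultdict(dict)
    for source, ships in responses.items():
        for raw in ships:
            ship = align(source, raw)
            if key in ship:
                groups[ship[key]][source] = ship
    return dict(groups)


# Placeholder identifiers, not real IMO numbers.
responses = {
    "Source A": [{"vessel_name": "Albatros", "imo_no": "IMO-PLACEHOLDER-1", "flag_state": "Canada"}],
    "Source D": [
        {"ShipName": "Albatros", "IMO": "IMO-PLACEHOLDER-1", "Flag": "Spain"},
        {"ShipName": "Espoir", "IMO": "IMO-PLACEHOLDER-2", "Flag": "Canada"},
    ],
}
groups = build_comparison_groups(responses)
for ident, group in groups.items():
    print(ident, sorted(group))
```

Each resulting group holds one description per source for a single ship, which is the unit the consistency check then works on.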

Consistency Check
Compares items within a comparison group.
Identify consistent and inconsistent data among the sources:
Compare a ship's item among sources.
Provide statistics on the inconsistencies found at the various sources. These statistics reflect a general assessment of the source based on the number of inconsistencies found at the source.
Persist comparison results and ship attributes.

Score quantifying a source consistency (1)
The product of the consistency check is a score called Match (%).
It quantifies the ability of a source to provide consistent information, that is, information that can be confirmed by other sources.
Match(Source, Item, Ship) = Unequal (0), Partial (0.5) or Exact (1).
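The three score levels can be written down as constants. The sketch below assumes that a comparison first yields a similarity between 0 and 1, which is then mapped to a level; the slides do not say how Partial is decided, so the `partial_threshold` parameter and its default are purely hypothetical.

```python
# Score levels as given on the slide.
UNEQUAL, PARTIAL, EXACT = 0.0, 0.5, 1.0


def to_match(similarity, partial_threshold=0.8):
    """Map a similarity in [0, 1] to one of the three Match levels.

    Assumption: identical values are Exact, near-identical values
    (similarity >= partial_threshold) are Partial, anything else is Unequal.
    """
    if similarity >= 1.0:
        return EXACT
    if similarity >= partial_threshold:
        return PARTIAL
    return UNEQUAL


print(to_match(1.0), to_match(0.9), to_match(0.3))
```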

Score quantifying a source consistency (2)
Table columns: Item, Source A (DRDC), Source B (DRDC), ITU, ShipSpotting
Table rows: Ship Name, MMSI number, CallSign, IMO number, Ship Flag, Ship Type
Match(Source i, Item j, Ship k) = propensity of Source i to provide information (Item j for Ship k) that can be confirmed by other sources.

Ship Items Comparison (1)
In order to compute the consistency score (match), we need to compare ship items among sources.
Source A   MMSI:   Name: Espoir    Type: Oil Products Tanker
Source B   MMSI:   Name: Espoire   Type: Merchant Oil Tanker
Source C   MMSI:   Name: Espira    Type: Oil Tanker
Source D   MMSI:   Name: Espoir    Type: Tanker

3 Types of Comparison
Hard Comparison: exact comparison (==), usually for numeric items.
Soft Comparison: based on string similarity (Levenshtein distance) among the different expressions to compare, for items where typos are frequent (e.g. ship name).
Pattern Comparison: based on the recurrence of words among the different expressions to be compared, for ship type or other complex self-describing items.
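The three comparison types can be illustrated on the Espoir example. This is a minimal sketch under stated assumptions: the function names are hypothetical, the soft comparison normalises the Levenshtein distance by the longer string's length, and the pattern comparison is approximated by the share of words the two expressions have in common (Jaccard overlap). The slides do not specify either normalisation.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def hard_compare(a, b):
    """Exact comparison, e.g. for numeric identifiers."""
    return 1.0 if a == b else 0.0


def soft_compare(a, b):
    """String similarity in [0, 1], tolerant of typos (assumed normalisation)."""
    a, b = a.casefold(), b.casefold()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def pattern_compare(a, b):
    """Share of words common to both expressions (assumed: Jaccard overlap)."""
    words_a, words_b = set(a.casefold().split()), set(b.casefold().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


print(levenshtein("Espoir", "Espoire"))   # one inserted letter
print(round(soft_compare("Espoir", "Espoire"), 2))
print(pattern_compare("Oil Products Tanker", "Merchant Oil Tanker"))
```

With these definitions, "Espoir" and "Espoire" stay close under the soft comparison, while the two ship-type expressions share two of their four distinct words.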

Source Consistency Statistics
A match is computed for each source, item and ship: Match(source, item, ship) => 3 dimensions: source, item, ship.
Source consistency is assessed by averaging over all items and ships.
Ship XX (columns: Source A, Source B, Source C, Source D)
Name 1 1 1 1
MMSI 1 1 1
CallSign 0 0.67
…
Ship XY (columns: Source A, Source B, Source C, Source D)
Name 0.67
MMSI 0.67 0
CallSign 1 1 1
…

Value for one Ship of one Item of the Source
[Figure: cube of match values with axes Sources (S1, S2, S3), Items (I1 … Ii … IM) and Ships (Ship 1 … Ship i … Ship N). Levels shown: Ship Level; Item Level (averaged over all Ships); Source Level (averaged over all Ships and Items). Example values: Source A 0.77; Name 0.86; MMSI 0.55; Flag N/A; Ship N 1.0.]

Statistics for one Item of the Source
[Figure: same cube, levels and example values.]

Overall Statistics for a Source
[Figure: same cube, levels and example values.]
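The three levels of statistics amount to averaging the match values along different axes of the (source, item, ship) cube. A minimal sketch, assuming the matches are stored in a dictionary keyed by (source, item, ship) and that a missing assessment is stored as `None` and skipped; the sample values are made up and are not the values on the slides.

```python
from statistics import mean


def item_level(matches, source, item):
    """Average match of one item of one source over all ships (None if unassessed)."""
    values = [m for (s, i, _ship), m in matches.items()
              if s == source and i == item and m is not None]
    return mean(values) if values else None


def source_level(matches, source):
    """Average match of one source over all items and all ships (None if unassessed)."""
    values = [m for (s, _item, _ship), m in matches.items()
              if s == source and m is not None]
    return mean(values) if values else None


# Hypothetical match values for illustration only.
matches = {
    ("Source A", "Name", "Ship 1"): 1.0,
    ("Source A", "Name", "Ship 2"): 0.5,
    ("Source A", "MMSI", "Ship 1"): 0.0,
    ("Source A", "Flag", "Ship 1"): None,  # no other source to compare with
}
print(item_level(matches, "Source A", "Name"))
print(item_level(matches, "Source A", "Flag"))
print(source_level(matches, "Source A"))
```

An item with no assessed value at all comes back as `None`, which corresponds to the N/A entry shown for Flag.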

Difficulties in assessing information consistency
Complex… and we just compared strings.
Comparison is static:
no consistency tracking (does a source's consistency evolve over time?)
no comparison of dynamic information such as destination, ETA, cargo...

Consistency Visualization
Google Earth information display
Traffic-light color code
Ship photographs
Consistency statistics (sources, ships and items decomposition)

Query and GE Display

Traffic Light Visualization
Green: Consistency ≥ 80%
Yellow: 20% < Consistency < 80%
Red: Consistency ≤ 20%
Gray: No consistency assessed because only one source provides information for that ship's item.
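The colour code translates directly into a small function. In this sketch the function name is hypothetical, consistency is expressed as a fraction between 0 and 1, and `None` stands for an item that could not be assessed.

```python
def traffic_light(consistency):
    """Map a consistency score (0 to 1, or None if unassessed) to a display colour."""
    if consistency is None:
        return "gray"    # only one source provides the item
    if consistency >= 0.8:
        return "green"
    if consistency <= 0.2:
        return "red"
    return "yellow"


for score in (0.86, 0.55, 0.2, None):
    print(score, traffic_light(score))
```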

Concluding Remarks
There are many information quality attributes: uncertainty, reliability, relevance, utility, expectability...
The difficulty lies in quantifying these quality attributes, that is, in building metrics to assess them.
=> In this project, we showed that consistency can be concretely quantified without any a priori suppositions.
=> The computed consistency can then be used to identify which sources provide more reliable information than others, for use in post-processing (e.g. information fusion).