
Determining the Consistency of Information Between Multiple Subsystems used in Maritime Domain Awareness. Marie-Odette St-Hilaire, Anthony Isenor


1 Determining the Consistency of Information Between Multiple Subsystems used in Maritime Domain Awareness. Marie-Odette St-Hilaire (a), Anthony Isenor (b). (a) OODA Technologies; (b) Defence Research and Development Canada

2 Objective: Present the concept of information consistency, using information pertinent to maritime domain awareness (MDA) collected from multiple subsystems, and propose a methodology to quantify that consistency.

3 Outline: MDA, information quality and trust; Information consistency; Prototype to assess consistency of information between multiple sources; Cross comparison of information from multiple data sources; Difficulties in comparing ship-related information; Consistency visualization

4 Maritime Domain Awareness: MDA is the effective understanding of anything associated with the maritime domain that could impact security, safety, the economy, or the environment.

5 MDA and Information: The heart of MDA is quality information gained from a combination of sources. Examples of sources: maritime, land, air and space surveillance systems; other government departments and the commercial sector; public web sites.

6 Quality and Trust: Information accessibility brings issues: information overload; lack of processing capabilities; trust issues; … → Quality of information impacts trust.

7 Consistency as a data quality attribute: A piece of information is consistent if it does not conflict with other information. Source A: IMO 9208021, Name Albatros, Flag Canada. Source B: IMO 9208021, Name Albatros, Flag Canada. Source C: IMO 9208021, Name Albatros, Flag Spain (inconsistent information).
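
The definition on this slide can be sketched in a few lines of Python. This is only an illustration, not the project's code: the function name `find_conflicts` and the dictionary layout are assumptions, while the values are the ones shown on the slide.

```python
def find_conflicts(records):
    """Return {attribute: {source: value}} for attributes whose values
    differ between sources (hypothetical helper, for illustration only)."""
    conflicts = {}
    attributes = {attr for rec in records.values() for attr in rec}
    for attr in sorted(attributes):
        # Collect the value each source reports for this attribute.
        values = {src: rec[attr] for src, rec in records.items() if attr in rec}
        if len(set(values.values())) > 1:
            conflicts[attr] = values
    return conflicts


records = {
    "Source A": {"IMO": "9208021", "Name": "Albatros", "Flag": "Canada"},
    "Source B": {"IMO": "9208021", "Name": "Albatros", "Flag": "Canada"},
    "Source C": {"IMO": "9208021", "Name": "Albatros", "Flag": "Spain"},
}

conflicts = find_conflicts(records)
print(conflicts)
```

Only the Flag attribute is reported, because IMO and Name agree across all three sources.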

8 Assumption: Consistency in information helps build the trust we place in that information. → To develop trust in information, similar data items from the various sources should be compared.

9 Compare-MDA Project Architecture: Multiple information sources; Consistency evaluation; Consistency visualization

10 High Level Description: A Service-Oriented Architecture (SOA) was developed to compare information from diverse sources. It allows one to: assess the consistency of MDA-related information contained in DRDC databases and on web sites; compare information from disparate sources; quantify a source's consistency; visualize consistency results within a Google Earth environment.

11 Compare-MDA Framework [Diagram: four data sources (Source_A to Source_D, drawn from DRDC applications and web sites), each exposed through its own web service (Source_A WS to Source_D WS); the CA with its web service (CA WS); and the CA Client.]

12 Data Sources: Source A and B: DRDC databases. Source C: ITU web site. Source D: ShipSpotting web site. All are exposed as web services in the framework.

13 DRDC Data Sources: 2 MDA-related databases. Both contain static and positional information; one contains ship photographs. Both are exposed as web services in the framework.

14 ITU Web Site

15 ShipSpotting Web Site

16 Consistency Application: Functionalities and information flow; Consistency score; Comparing similar items; Consistency statistics; Difficulties in assessing consistency

17 Consistency Application (CA) Functionalities: Communicate with the data sources (send queries, process responses); Align diverse source vocabularies; Identify unique ship objects within different source responses; Compare ship attributes among sources; Persist comparison results and ship attributes.

18 Information Flow (within the CA): CA Manager: builds queries (CA Queries) for the data source (DS) services, Source A … Source D. Vocabulary Solution (Alignment): aligns the source responses into ship descriptions. Vessel Matcher: creates comparison groups (unique ship entities). Consistency Check: compares items within a comparison group.
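
The alignment and grouping steps can be sketched as follows. The slides do not describe how the Vocabulary Solution or the Vessel Matcher actually work, so everything here is an assumption made for illustration: the field names in `ALIGNMENT`, the function names, and above all the choice to group records on a single shared identifier (IMO number).

```python
from collections import defaultdict

# Hypothetical mapping from source-specific field names to a shared vocabulary.
ALIGNMENT = {
    "imo_no": "IMO",
    "IMO Number": "IMO",
    "vessel_name": "Name",
    "Ship Name": "Name",
}


def align(record):
    """Rename the fields of one source record to the shared vocabulary."""
    return {ALIGNMENT.get(field, field): value for field, value in record.items()}


def build_comparison_groups(responses, key="IMO"):
    """Group aligned records by a shared identifier.

    responses: {source: [record, ...]}
    returns:   {identifier: {source: aligned_record}}
    """
    groups = defaultdict(dict)
    for source, source_records in responses.items():
        for record in source_records:
            aligned = align(record)
            if key in aligned:  # records without the identifier are skipped
                groups[aligned[key]][source] = aligned
    return dict(groups)


responses = {
    "Source A": [{"imo_no": "9208021", "vessel_name": "Albatros"}],
    "Source C": [{"IMO Number": "9208021", "Ship Name": "Albatros"}],
}
groups = build_comparison_groups(responses)
print(groups)
```

Grouping on one identifier is a simplification: identifiers can themselves disagree between sources, so a real matcher would likely need to combine several attributes.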

19 Consistency Check (compares items within a comparison group): Identify consistent and inconsistent data among the sources: compare each ship's item among sources. Provide statistics on the inconsistencies found at the various sources; these statistics give a general assessment of a source based on the number of inconsistencies found there. Persist comparison results and ship attributes.

20 Score quantifying a source consistency (1): The product of the consistency check is a score called Match (%). It quantifies the ability of a source to provide consistent information, that is, information that can be confirmed by other sources. Match(Source, Item, Ship) = Unequal (0), Partial (0.5) or Exact (1).
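
One way to turn these grades into a per-source score is sketched below. The slide does not say how the grades are combined, so averaging the pairwise grades against every other source is an assumption, as are the names `match` and `exact_grade`. The flag values come from slide 7.

```python
from statistics import mean

# Grades named on the slide.
UNEQUAL, PARTIAL, EXACT = 0.0, 0.5, 1.0


def exact_grade(a, b):
    """Simplest comparator: EXACT or UNEQUAL. A string-similarity
    comparator could also return PARTIAL."""
    return EXACT if a == b else UNEQUAL


def match(source, values, grade=exact_grade):
    """Score one source for one item of one ship.

    values: {source: value}. Assumption: the score is the mean grade of
    the source's value against each other source's value.
    Returns None when no other source reports the item.
    """
    if source not in values:
        return None
    others = [value for src, value in values.items() if src != source]
    if not others:
        return None
    return mean(grade(values[source], other) for other in others)


flags = {"Source A": "Canada", "Source B": "Canada", "Source C": "Spain"}
for src in flags:
    print(src, match(src, flags))
```

Under this assumption Source A and Source B each score 0.5 (confirmed by one of two other sources) and Source C scores 0.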

21 Score quantifying a source consistency (2): [Table: rows are the items Ship Name, MMSI number, CallSign, IMO number, Ship Flag and Ship Type; columns are the sources Source A (DRDC), Source B (DRDC), ITU and ShipSpotting.] Match(Source i, Item j, Ship k) = propensity of Source i to provide information (Item j for Ship k) that can be confirmed by other sources.

22 Ship Items Comparison (1): To compute the consistency score (Match), we need to compare ship items among sources. Source A: MMSI 217766555, Name Espoir, Type Oil Products Tanker. Source B: MMSI 211260540, Name Espoire, Type Merchant Oil Tanker. Source C: MMSI 211260540, Name Espira, Type Oil Tanker. Source D: MMSI 211260540, Name Espoir, Type Tanker.

23 3 Types of Comparison: Hard comparison: exact comparison (==), usually for numeric items. Soft comparison: based on string similarity (Levenshtein distance) among the expressions to compare, for items where typos are frequent (e.g. ship name). Pattern comparison: based on the recurrence of words among the expressions to be compared, for ship type or other complex self-describing items.
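
The three comparison types can be sketched as below. The Levenshtein distance is the standard dynamic-programming algorithm; normalizing it by the longer string's length, and measuring word recurrence as shared words over all distinct words, are assumptions, since the slides give no formulas. The sample values are those of slide 22.

```python
def hard_compare(a, b):
    """Hard comparison: 1.0 if exactly equal, else 0.0."""
    return 1.0 if a == b else 0.0


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def soft_compare(a, b):
    """Soft comparison: similarity in [0, 1] from the Levenshtein distance
    (normalization by the longer length is an assumption)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def pattern_compare(a, b):
    """Pattern comparison: share of words common to both expressions
    (shared words / all distinct words; an assumed measure)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return 1.0 if not union else len(words_a & words_b) / len(union)


print(hard_compare("217766555", "211260540"))
print(levenshtein("Espoir", "Espoire"), levenshtein("Espoir", "Espira"))
print(pattern_compare("Oil Products Tanker", "Merchant Oil Tanker"))
```

Mapping these similarities onto the Unequal, Partial and Exact grades would need thresholds, which the slides do not give.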

24 Source Consistency Statistics: A match is computed for each source, item and ship: Match(source, item, ship) → 3 dimensions: source, item, ship. Source consistency is assessed by averaging over all items and ships. [Example tables for Ship XX and Ship XY, with columns Source A to Source D and rows Name, MMSI, CallSign, …. Extracted values, in order: Ship XX: Name 1, 1, 1, 1; MMSI 1, 1, 1; CallSign 0, 0.67. Ship XY: Name 0.67; MMSI 0.67, 0; CallSign 1, 1, 1.]

25 Value for one Ship of one Item of the Source [Diagram: cube with axes Sources (S1, S2, S3), Items (I1 … Ii … IM) and Ships (Ship 1 … Ship i … Ship N).] Source level (averaged over all ships and items): Source A 0.77. Item level (averaged over all ships): Name 0.86, Flag N/A, MMSI 0.55. Ship level (for Name): Ship 1 0.5, Ship 2 1.0, Ship 3 0.0, …, Ship N 1.0.

26 Statistics for one Item of the Source [Diagram: cube with axes Sources (S1, S2, S3), Items (I1 … Ii … IM) and Ships (Ship 1 … Ship i … Ship N).] Source level (averaged over all ships and items): Source A 0.77. Item level (averaged over all ships): Name 0.86, Flag N/A, MMSI 0.55. Ship level (for Name): Ship 1 0.5, Ship 2 1.0, Ship 3 0.0, …, Ship N 1.0.

27 Overall Statistics for a Source [Diagram: cube with axes Sources (S1, S2, S3), Items (I1 … Ii … IM) and Ships (Ship 1 … Ship i … Ship N).] Source level (averaged over all ships and items): Source A 0.77. Item level (averaged over all ships): Name 0.86, Flag N/A, MMSI 0.55. Ship level (for Name): Ship 1 0.5, Ship 2 1.0, Ship 3 0.0, …, Ship N 1.0.
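
The three levels of statistics can be sketched as averages over a table of Match values keyed by (source, item, ship). The numbers below are made up for illustration and are not the ones on the slides. The function name `average`, and the choice to skip unassessed values (`None`) and to average the source level over all available (item, ship) pairs, are assumptions.

```python
from statistics import mean


def average(scores, source, item=None):
    """Average the Match values of one source.

    scores: {(source, item, ship): value or None}
    item=None -> source level (over all items and ships)
    item="X"  -> item level (over all ships for that item)
    Returns None (N/A) when no value was assessed.
    """
    selected = [
        value
        for (src, itm, _ship), value in scores.items()
        if src == source and (item is None or itm == item) and value is not None
    ]
    return mean(selected) if selected else None


# Illustrative values only.
scores = {
    ("Source A", "Name", "Ship 1"): 0.5,
    ("Source A", "Name", "Ship 2"): 1.0,
    ("Source A", "MMSI", "Ship 1"): 1.0,
    ("Source A", "MMSI", "Ship 2"): None,   # not assessed
    ("Source A", "Flag", "Ship 1"): None,   # not assessed
}

print(average(scores, "Source A", "Name"))   # item level
print(average(scores, "Source A", "Flag"))   # N/A
print(average(scores, "Source A"))           # source level
```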

28 Difficulties in assessing information consistency: Complex… and we only compared strings. Comparison is static: no consistency tracking (does a source's consistency evolve over time?); no comparison of dynamic information such as destination, ETA, cargo...

29 Consistency Visualization: Google Earth information display; Traffic-light color code; Ship photographs; Consistency statistics (decomposed by sources, ships and items)

30 Query and GE Display

31 Traffic Light Visualization: Green: Consistency ≥ 80%. Yellow: 20% < Consistency < 80%. Red: Consistency ≤ 20%. Gray: no consistency assessed because only one source provides information for that ship's item.
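
The color code maps directly onto a small function. The thresholds are the ones on the slide; the function name `traffic_light`, the use of fractions (0 to 1) rather than percentages, and `None` as the marker for "not assessed" are assumptions.

```python
def traffic_light(consistency):
    """Map a consistency score in [0, 1] (or None) to a display color."""
    if consistency is None:
        return "gray"      # only one source provides the item
    if consistency >= 0.8:
        return "green"
    if consistency <= 0.2:
        return "red"
    return "yellow"        # strictly between 20% and 80%


for score in (1.0, 0.8, 0.5, 0.2, None):
    print(score, traffic_light(score))
```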

32 Concluding Remarks: There are many information quality attributes: uncertainty, reliability, relevance, utility, expectability... The difficulty lies in quantifying these attributes (building metrics to assess them). → In this project, we showed that consistency can be concretely quantified without any a priori suppositions. → The computed consistency can then be used to identify sources that provide more reliable information than others, for post processing (e.g. information fusion).

