Efficient AIS Data Processing for Environmentally Safe Shipping
Marios Vodas (1), Nikos Pelekis (1), Yannis Theodoridis (1), Cyril Ray (2), Vangelis Karkaletsis (3), Sergios Petridis (3), Anastasia Miliou (4)
(1) University of Piraeus; (2) Naval Academy, France; (3) NCSR “Demokritos”; (4) Archipelagos – Inst. of Marine Conservation
Outline Part I: Maritime Transportation Part II: Automatic Identification System (AIS) Part III: Objectives Part IV: Methodology Part V: Conclusion
I. Maritime Transportation
Safety (and Environmental) Issues Ships, control centres and marine officers have to face many security and safety problems due to: staff reduction, cognitive overload, human errors; traffic increase (ports, maritime routes), dangerous contents; terrorism, piracy; technical faults (bad design, equipment breakdowns); bad weather; etc. Figures: HELCOM AIS; IRENav (NATO); MarineTraffic.com
The Most Prominent Cause of Accidents About 75-96% of marine casualties are caused, at least in part, by some form of human error * : 88% of tanker accidents 79% of towing vessel groundings 96% of collisions 75% of fires and explosions Solution to such issues requires different levels of responses taking into account : People (activities) Technology Environment Organisational factors *Rothblum A.M. (2006) “Human Error and Marine Safety”, U.S. Coast Guard Research & Development Center
Ways to Minimize Accidents Level of education and practice for mariners. Work safety regulations (behaviour guidelines, standardised onboard equipment). Navigation and decision support systems providing real-time information, predictions, alerts... Proper integration and use of multiple, heterogeneous positioning systems: AIS, ARPA, Long-Range Identification and Tracking (LRIT), Global Maritime Distress and Safety System (GMDSS), synthetic aperture radar, airborne radar, satellite-based sensors. Generalisation of vessel traffic monitoring, port control, search and rescue systems, automatic communications.
Traffic Monitoring Air-based support Human and semi-automatic monitoring On-demand and on a regular basis Remote Sensing support Semi-automatic monitoring Every 2 to 6 hours Sensor-based support Almost automatic analysis and monitoring Real-time
II. Automatic Identification System (AIS)
AIS Device The Automatic Identification System identifies and locates vessels at a distance. It includes an antenna, a transponder, a GPS receiver and additional sensors (e.g., log and gyrocompass). It is a broadcast system based on VHF communications. It is able to operate in autonomous and continuous mode. Ships fitted with AIS send navigation data to surrounding receivers (range is about 50 km). Ships or onshore maritime control centres fitted with AIS receive navigation data sent by surrounding ships. → AIS is mandatory (IMO) for big ships and passenger boats
AIS Transmission Rate and Accuracy
AIS accuracy is defined as the largest distance the ship can cover between two updates. The AIS broadcasts information at different update rates depending on the ship’s current speed and manoeuvre. The IMO assumes that the accuracy of the embedded GPS is 10 m.

| Vessel behaviour | Time between updates | Accuracy (m) |
|---|---|---|
| Anchored | 3 min | = 10 metres |
| Speed between 0-14 knots | 12 s | Between 10 and 95 metres |
| Speed between 0-14 knots and changing course | 4 s | Between 10 and 40 metres |
| Speed between 14-23 knots | 6 s | Between 55 and 80 metres |
| Speed between 14-23 knots and changing course | 2 s | Between 25 and 35 metres |
| Speed over 23 knots | 3 s | > 45 metres |
| Speed over 23 knots and changing course | 2 s | > 35 metres |

General update rules have been compared to reality: it appears that actual update rates are lower.
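The accuracy column appears consistent with a simple worst-case model: the assumed GPS error plus the distance covered at the given speed during one update interval. The sketch below works under that assumption; the function name and the additive model are illustrative and are not taken from the slides.

```python
# One knot is one nautical mile (1852 m) per hour.
KNOT_TO_MPS = 1852.0 / 3600.0


def worst_case_accuracy_m(speed_knots, interval_s, gps_error_m=10.0):
    """Assumed model: GPS error plus distance sailed between two updates."""
    return gps_error_m + speed_knots * KNOT_TO_MPS * interval_s


if __name__ == "__main__":
    # (label, speed in knots, update interval in seconds) from the table
    rows = [
        ("anchored", 0.0, 180.0),
        ("14 knots, steady course", 14.0, 12.0),
        ("23 knots, steady course", 23.0, 6.0),
        ("23 knots, changing course", 23.0, 2.0),
    ]
    for label, speed, interval in rows:
        print(f"{label}: {worst_case_accuracy_m(speed, interval):.1f} m")
```

Under this model a ship at 14 knots reporting every 12 s can be roughly 96 m from its last reported position, which is close to the upper bound quoted for that row.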
AIS Data AIS provides location-based information on 2D routes, thus defining point-based 3D trajectories, that is, an ordered series of locations (X,Y,T) of a given mobile object O, with T indicating the timestamp of the location (X,Y). Transmitted data include the ship’s position and textual meta-information. Static: ID number (MMSI), IMO code, ship name and type, dimensions. Dynamic: position (Long, Lat), speed, heading, course over ground (COG), rate of turn (ROT). Route-based: destination, danger, estimated time of arrival (ETA) and draught. → Time does not exist in AIS frames: it has to be added by receivers.
Example frame: !AIVDM,1,1,,A,1Bwj:v0P1=1f75REQg>rPwv:0000,0*3B
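As a minimal sketch of what a receiver does with such a frame, the code below checks the NMEA checksum, unpacks the 6-bit armoured payload, reads the message type, repeat indicator and MMSI, and attaches a receiver-side timestamp. The function names and the `received_at` parameter are hypothetical; a production decoder would also handle multi-part sentences and all remaining fields.

```python
def nmea_checksum_ok(sentence):
    """XOR every character between the leading '!' and the '*'."""
    body, _, given = sentence.strip().lstrip("!$").partition("*")
    calc = 0
    for ch in body:
        calc ^= ord(ch)
    return f"{calc:02X}" == given.upper()


def payload_bits(payload):
    """Turn the armoured ASCII payload into a string of bits (6 per char)."""
    bits = []
    for ch in payload:
        value = ord(ch) - 48
        if value > 40:
            value -= 8
        bits.append(f"{value:06b}")
    return "".join(bits)


def decode_header(sentence, received_at=None):
    """Decode the first fields of a single-part AIVDM sentence.

    received_at is supplied by the receiver, since the frame itself
    carries no full timestamp.
    """
    fields = sentence.split(",")
    bits = payload_bits(fields[5])
    return {
        "type": int(bits[0:6], 2),
        "repeat": int(bits[6:8], 2),
        "mmsi": int(bits[8:38], 2),
        "received_at": received_at,
    }


if __name__ == "__main__":
    frame = "!AIVDM,1,1,,A,1Bwj:v0P1=1f75REQg>rPwv:0000,0*3B"
    print(nmea_checksum_ok(frame))
    print(decode_header(frame, received_at="2013-01-01T00:00:00Z"))
```

The timestamp passed in the usage example is a placeholder, not a value from the dataset.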
III. Objectives
Big AIS Data Processing for Environmentally Safe Shipping The objectives, based on requests from the Archipelagos Institute of Marine Conservation, were to: investigate the factors which contribute most to the risk of a shipping accident; identify dangerous areas. How: traffic database processing in order to address some requirements / queries set by Archipelagos towards semi-quantitative risk analysis of shipping traffic → Data coming from AIS → Application to the Aegean Sea
Typical Questions From Domain Experts Calculate average and minimum distances from shore or between two ships. Calculate the maximum number of ships in the vicinity of another ship. Find whether (and how many times) a ship goes through specified areas (e.g. narrow passages, biodiversity boxes). Calculate the number of sharp changes in a ship’s direction. Find typical routes vs. outliers, etc.
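Two of these questions can be sketched in a few lines: the distance between two reported positions and the number of sharp course changes along a track. This is an illustrative in-memory sketch, not the database queries used in the project; the 30-degree threshold and the function names are assumptions.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def heading_change_deg(c1, c2):
    """Smallest absolute angle between two courses, handling the 360/0 wrap."""
    return abs((c2 - c1 + 180.0) % 360.0 - 180.0)


def count_sharp_turns(cogs, threshold_deg=30.0):
    """Count consecutive course-over-ground pairs that differ by more than the threshold."""
    return sum(
        1 for a, b in zip(cogs, cogs[1:]) if heading_change_deg(a, b) > threshold_deg
    )


if __name__ == "__main__":
    print(round(haversine_m(37.0, 25.0, 37.0, 25.1)))
    print(count_sharp_turns([350.0, 10.0, 60.0, 65.0, 200.0]))
```

The wrap-around handling matters: a change from 350 to 10 degrees is a 20-degree turn, not a 340-degree one.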
Mediterranean Sea The European Maritime Safety Agency (EMSA) centralizes data from EU states and provides them through a Web service. We worked on a dataset covering the Mediterranean Sea provided by IMIS Hellas (a Greek IT company related to IMIS Global, collecting AIS data, mariweb.gr) → Data volume is 100 million positions per month, that is about 2300 positions per minute. Focus on the Aegean Sea: 3 days, 3 million position records (933 distinct ships). The full dataset is more than 2000 SQL tables for a total of 2 TB covering 2.5 years of vessel activity. Two datasets are available at the Chorochronos.org interface (IMIS 3 days and AIS Brest)
Vessel Statistics

| Country | Number of ships | Flag of convenience |
|---|---|---|
| Greece | 263 | No |
| Panama (Republic of) | 112 | Yes |
| Turkey | 96 | |
| Malta | 76 | |
| Liberia (Republic of) | 32 | |
| Saint Vincent and the Grenadines | 29 | |
IV. Methodology
Populating a Database Relational database (PostgreSQL and PostGIS). Data model based on AIS messages: positions, ships and trips. Parsing, integration, error checking, filtering. Reconstructing trajectories from raw data and feeding a trajectory DB. Apply “simple” queries to answer experts’ needs: “What is the (sub)trajectory of a ship during its presence in an area?”
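The reconstruction step can be sketched outside the database as follows: group raw positions by ship, order them by time, drop exact duplicates, and start a new trip whenever the gap between two reports is too long. The 30-minute gap, the record layout and the bounding-box filter are assumptions made for illustration; the project itself performs this work in PostgreSQL/PostGIS.

```python
from collections import defaultdict


def build_trajectories(records, max_gap_s=1800):
    """Split raw (mmsi, t, lon, lat) records into per-ship trips.

    A new trip starts when two consecutive reports of the same ship
    are more than max_gap_s seconds apart (assumed threshold).
    """
    by_ship = defaultdict(list)
    for mmsi, t, lon, lat in records:
        by_ship[mmsi].append((t, lon, lat))

    trips = {}
    for mmsi, points in by_ship.items():
        points = sorted(set(points))  # remove exact duplicates, order by time
        ship_trips = []
        current = [points[0]]
        for prev, cur in zip(points, points[1:]):
            if cur[0] - prev[0] > max_gap_s:
                ship_trips.append(current)
                current = []
            current.append(cur)
        ship_trips.append(current)
        trips[mmsi] = ship_trips
    return trips


def sub_trajectory_in_box(trip, min_lon, min_lat, max_lon, max_lat):
    """Points of a trip that fall inside an axis-aligned area of interest."""
    return [
        (t, lon, lat)
        for t, lon, lat in trip
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]


if __name__ == "__main__":
    raw = [
        (1, 0, 25.0, 37.0),
        (1, 60, 25.1, 37.0),
        (1, 60, 25.1, 37.0),  # duplicate report
        (1, 5000, 25.5, 37.2),  # long gap: new trip
        (2, 10, 24.0, 38.0),
    ]
    trips = build_trajectories(raw)
    print(trips[1])
    print(sub_trajectory_in_box(trips[1][0], 25.05, 36.9, 25.2, 37.1))
```

A rectangular box stands in for the area here; real areas of interest are polygons, for which a spatial database predicate is the natural tool.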
MOD Engine and Rule-Based Analysis Mixed top-down / bottom-up approach involving an expert monitoring real-time traffic on a touch table. An integrated approach for maritime situation awareness based on an inference engine (Drools). The expert defines rules according to their needs and objectives. The engine executes the rules using the AIS database. Hermes is a MOD engine providing extensible DBMS support for trajectory data: it defines a trajectory data type, SQL extensions at the logical level and efficient indexing techniques at the physical level, and includes trajectory clustering support. http://infolab.cs.unipi.gr/hermes
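The idea of expert-defined rules evaluated against vessel data can be illustrated with a toy evaluator. This is not Drools and not the project's rule base: the `VesselState` fields, the rule names and the thresholds are all hypothetical, chosen only to show the pattern of separating rule definitions from the engine that applies them.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class VesselState:
    """Hypothetical snapshot of a vessel built from AIS data."""
    mmsi: int
    speed_knots: float
    in_restricted_area: bool


@dataclass
class Rule:
    """A named condition an expert wants to be alerted about."""
    name: str
    condition: Callable[[VesselState], bool]


def evaluate(rules: List[Rule], states: List[VesselState]) -> List[Tuple[str, int]]:
    """Return (rule name, mmsi) for every rule that fires on a vessel."""
    return [(r.name, s.mmsi) for s in states for r in rules if r.condition(s)]


if __name__ == "__main__":
    rules = [
        Rule("speeding", lambda s: s.speed_knots > 20.0),
        Rule("restricted-area entry", lambda s: s.in_restricted_area),
    ]
    states = [
        VesselState(111, 12.0, False),
        VesselState(222, 24.5, True),
    ]
    for alert in evaluate(rules, states):
        print(alert)
```

Keeping rules as data means an expert can add or change conditions without touching the evaluation loop, which is the property a full inference engine generalises.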
Methodology Steps
Take the Maritime Environment Into Account The maritime domain is peculiar: there is no underlying network, but maritime rules define paths and anchorage areas (polylines and polygons) that may constrain a given trajectory. S-57 ENC (Electronic Nautical Chart). We added official vector charts and expert-defined areas of interest to the database: coastlines; starting, ending, passing, restricted areas, waiting zones; regulations and dangers (rocks, buoys, seabed) …
Exploring the Data Calculating trajectory aggregations and feeding a trajectory data warehouse. Performing OLAP analysis over aggregations (e.g. O/D analysis). Running KDD techniques: frequent pattern analysis, clustering, outlier detection, etc. Association of points coming from the same source-destination set. Definition of a route and qualification of positions at each time. Qualification of a new trajectory compared to the identified route. Cloud of locations
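A minimal sketch of two of these steps, under strong simplifying assumptions: origin/destination counting over labelled trips, and qualifying a new trajectory against a route defined as the point-wise median of earlier trajectories. It assumes trajectories already share the same source-destination pair, are resampled to the same number of points, and use projected planar coordinates; none of this is stated in the slides.

```python
import math
from collections import Counter
from statistics import median


def od_counts(trips):
    """Count trips per (origin, destination) pair."""
    return Counter(trips)


def median_route(trajectories):
    """Point-wise median of equal-length trajectories of (x, y) points."""
    return [
        (median(p[0] for p in step), median(p[1] for p in step))
        for step in zip(*trajectories)
    ]


def max_deviation(trajectory, route):
    """Largest point-to-point distance between a trajectory and the route."""
    return max(
        math.hypot(x - rx, y - ry) for (x, y), (rx, ry) in zip(trajectory, route)
    )


if __name__ == "__main__":
    print(od_counts([("Piraeus", "Heraklion"), ("Piraeus", "Heraklion"),
                     ("Piraeus", "Rhodes")]).most_common(1))
    history = [
        [(0, 0), (1, 0), (2, 0)],
        [(0, 1), (1, 1), (2, 1)],
        [(0, 2), (1, 2), (2, 2)],
    ]
    route = median_route(history)
    print(max_deviation([(0, 1), (1, 4), (2, 1)], route))
```

A trajectory whose maximum deviation exceeds a chosen tolerance would be a candidate outlier; the port names in the usage example are placeholders.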
Visualizing Trajectories and Patterns → Web-based visualisation using Google Maps / Earth applications and OpenLayers (OSM). Figure panels: frequent patterns; speed behaviour; space-time cube (trajectory too far to the right); space-time cube (ship is late)
V. Conclusion
Some Open Questions Q1. What kind of storage is appropriate for BIG volumes of vessel traffic data? Serial vs. parallel/distributed processing (e.g. Hadoop) (batch vs. streaming). MOD engines? What about indexing BIG mobility data? Q2. What kind of analysis on vessel traffic data makes sense? Analysis on current (location, speed, heading, …) vs. historical information (trajectories). Clusters (+ outliers), frequent patterns, next location prediction, etc. Exploit previous knowledge to improve real-time analysis. Q3. What kind of visualization is appropriate for vessel traffic data / patterns? Current location vs. trajectory-based visual analytics. Trajectory clustering. Frequent pattern mining.
Research Challenges on Data – Just a Few Examples Trajectory compression / simplification: how to compress / simplify trajectories keeping quality as high as possible? Semantic trajectory reconstruction: how to extract semantics from raw (GPS-based) trajectory data? Trajectory sampling: how to find a representative sample among a trajectory dataset? Generating trajectories by example: how to build large synthetic datasets that simulate the ‘behavior’ of a small real one? Etc.
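For the first challenge, one widely used line-simplification technique is the Douglas-Peucker algorithm, sketched below. The slides do not name a specific method, so this is only one possible baseline; it works on planar (x, y) points and ignores the time dimension, which a trajectory-aware method would also have to preserve.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b in the plane."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p on the segment, clamped to [0, 1]
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Keep only points that deviate more than tolerance from the simplified line."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Recurse on both halves; the split point appears in both, so drop one copy
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


if __name__ == "__main__":
    track = [(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0)]
    print(douglas_peucker(track, 0.5))
```

The tolerance directly controls the quality/size trade-off raised in the question: a larger value removes more points at the cost of a larger maximum positional error.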
Questions