GSN Related Developments
Jan Beutel, Tonio Gsell, Roman Lim, Mustapha Yuecel, Christoph Walser, Matthias Keller, Bernhard Buchli, Josua Hunziker, Felix Sutton, Lothar Thiele
ETH Zurich

Computer Engineering and Networks (Technische Informatik und Kommunikationsnetze)
GSN Data Integration

Data Management – Online Semantic Data
Global Sensor Network (GSN)
–Data streaming framework from EPFL
–Organized in “virtual sensors”, i.e. data types/semantics
–Hierarchies and concatenation of virtual sensors enable on-line processing
–Translates data from machine representation to SI values
–Adds metadata
[Figure labels: Import from field; Private GSN; Public GSN; Web export. Metadata: position, coordinates, sensor type, validity period, …]
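
The chaining of virtual sensors and the raw-to-SI translation can be illustrated with a minimal Python sketch. This is not GSN code (GSN defines virtual sensors through its own configuration); the `VirtualSensor` class, the 12-bit ADC with a 3.0 V reference, and the linear temperature formula are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator

Reading = Dict[str, float]


@dataclass
class VirtualSensor:
    """A named processing stage that maps one reading to another (hypothetical)."""
    name: str
    process: Callable[[Reading], Reading]


def chain(sensors: Iterable[VirtualSensor], stream: Iterable[Reading]) -> Iterator[Reading]:
    """Pass every reading through all stages in order, one reading at a time."""
    stages = list(sensors)
    for reading in stream:
        for vs in stages:
            reading = vs.process(reading)
        yield reading


def adc_to_volts(r: Reading) -> Reading:
    # Assumed hardware: 12-bit ADC on a 3.0 V reference.
    return {**r, "volts": r["raw"] * 3.0 / 4095}


def volts_to_celsius(r: Reading) -> Reading:
    # Assumed sensor characteristic: 10 mV per degree, 0.5 V offset.
    return {**r, "temperature_c": r["volts"] * 100.0 - 50.0}


pipeline = [
    VirtualSensor("raw_to_volts", adc_to_volts),
    VirtualSensor("volts_to_si", volts_to_celsius),
]
converted = list(chain(pipeline, [{"raw": 0}, {"raw": 4095}]))
```

Each stage only adds fields, so the raw value stays available next to the converted one, which mirrors the idea of keeping machine representation and SI values side by side.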
Multi-Site, Multi-Station, Multi-Revision Data…

Metadata Mapping Architecture
Based on 2 GSN instances
–Separation of load/concern across two machines
–“Private” GSN instance: raw data, protected behind firewall, high availability
–“Public” GSN instance: mapped and converted data, world readable, non-critical
Metadata stored in version control system (CSV text files, SVN)
Mapping of
–Positions, coordinates, sensor types, conversion functions, sensor calibration…
Conversion of
–Time formats, raw to SI values…
Replay of metadata/mapping possible, e.g. on bugs/errors
Change management
Transparency, scalability, traceability, load balancing
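
A mapping step of this kind might look like the following Python sketch, which resolves a device to a position for a given timestamp from a CSV file with validity periods. The column names, the device IDs, and the whole-day granularity are assumptions; the slides do not show the actual metadata layout.

```python
import csv
import io
from datetime import datetime

# Hypothetical CSV layout and values, for illustration only.
POSITION_CSV = """device_id,valid_from,valid_to,position
2001,2008-07-01,2009-08-15,3
2001,2009-08-16,2010-06-30,7
2002,2008-07-01,2010-06-30,4
"""


def load_mapping(text):
    """Parse the CSV text into rows with typed fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "device_id": int(row["device_id"]),
            "valid_from": datetime.strptime(row["valid_from"], "%Y-%m-%d").date(),
            "valid_to": datetime.strptime(row["valid_to"], "%Y-%m-%d").date(),
            "position": int(row["position"]),
        })
    return rows


def lookup_position(mapping, device_id, timestamp):
    """Return the position valid for device_id on the day of timestamp, or None."""
    day = timestamp.date()
    for row in mapping:
        if row["device_id"] == device_id and row["valid_from"] <= day <= row["valid_to"]:
            return row["position"]
    return None


mapping = load_mapping(POSITION_CSV)
position = lookup_position(mapping, 2001, datetime(2009, 8, 16, 0, 0))
```

Because the mapping is a pure function of the CSV content and the timestamp, re-running it over stored raw data after a metadata fix reproduces the converted data, which is what makes a replay possible.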

Metadata Change Management
Allows simple exchange of sensor hardware/software at runtime
Post-deployment annotation
–Stop public GSN → deployment change → annotate metadata → restart public GSN
Automatic synchronization with 1-day change boundaries

Issue – Data Quality and Integrity
Since 07/2008: ~150’000’000 data points
Inconsistencies
–Between timestamps and sequence numbers
Duplicates
Data gaps
–Sporadic
–Systematic
[Figure labels: Installation & Service 2008; Service 2009; Revision / Extension June 2010; new sensor node installation]
[Keller, SenSys 2009, IPSN 2011]
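
The three problem classes above can be detected with a simple pass over the packet stream. The following Python sketch is an illustration only, not the method from the cited papers; it assumes packets arrive in reception order, that sequence numbers do not wrap around, and that the `Packet` fields shown here are available.

```python
from typing import List, NamedTuple


class Packet(NamedTuple):
    seq: int        # sequence number assigned by the sensor node
    timestamp: int  # generation time in seconds


def check_stream(packets: List[Packet]):
    """Return (duplicates, gaps, inconsistencies) found in one pass."""
    duplicates, gaps, inconsistencies = [], [], []
    seen = set()
    prev = None
    for p in packets:
        if p in seen:
            # Exact repeat of an earlier packet.
            duplicates.append(p)
            continue
        seen.add(p)
        if prev is not None:
            if p.seq > prev.seq + 1:
                # Missing sequence numbers, reported as an inclusive range.
                gaps.append((prev.seq + 1, p.seq - 1))
            if (p.seq > prev.seq) != (p.timestamp > prev.timestamp):
                # Sequence number and timestamp disagree on the ordering.
                inconsistencies.append((prev, p))
        prev = p
    return duplicates, gaps, inconsistencies


stream = [Packet(1, 100), Packet(2, 110), Packet(2, 110), Packet(5, 140), Packet(6, 130)]
duplicates, gaps, inconsistencies = check_stream(stream)
```

Telling sporadic gaps from systematic ones would need an extra step, for example comparing gap positions across nodes or over time, which this sketch leaves out.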

Mitigating Data Loss – BackLog Architecture
BackLog = Auxiliary data aggregation layer at device level
–Remote storage and synchronization layer for Linux systems
–Python based, designed for PermaSense CoreStation
–Plugin architecture for extension to custom data sources
–Data multiplexed from plugins to the GSN wrapper over one socket
Reliable (flow controlled) synchronization to GSN
Schedulable plugin/script execution, remote controlled by GSN
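
Multiplexing several plugins over one socket needs some framing so that the receiver can tell the messages apart. The Python sketch below shows one common way to do this with a length-prefixed header. The header layout (1-byte plugin ID, 4-byte payload length) and the class name are assumptions; the actual BackLog wire format is not described in the slides, and flow control and acknowledgements are not covered here.

```python
import struct

# Hypothetical frame header: plugin id (1 byte) and payload length (4 bytes),
# both in network byte order.
HEADER = struct.Struct("!BI")


def encode_frame(plugin_id: int, payload: bytes) -> bytes:
    """Prefix the payload with its plugin id and length."""
    return HEADER.pack(plugin_id, len(payload)) + payload


class Demultiplexer:
    """Reassembles frames from a byte stream that may arrive in arbitrary chunks."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes):
        """Add received bytes and return all complete (plugin_id, payload) frames."""
        self._buffer.extend(chunk)
        frames = []
        while len(self._buffer) >= HEADER.size:
            plugin_id, length = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # wait for the rest of this frame
            frames.append((plugin_id, bytes(self._buffer[HEADER.size:end])))
            del self._buffer[:end]
        return frames


stream = encode_frame(1, b"gps") + encode_frame(2, b"wxt520")
demux = Demultiplexer()
first = demux.feed(stream[:6])   # header plus one payload byte: nothing complete yet
second = demux.feed(stream[6:])  # remainder completes both frames
```

The receiver buffers partial frames because a TCP socket delivers a byte stream without message boundaries, so a single read can end in the middle of a frame.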

GSN New Functionalities
Data wrappers and virtual sensors (based on BackLog)
–GPS, various formats
–Vaisala WXT520 weather station
–JPEG & RAW image manipulation
–Binary file grabber/storage on file system for large files (image data)
–OpenSense air quality sensors
–syslog-ng based log file grabber (aka remote tail -f /var/log/syslog)
–Dozer beacon generation (command push)
–Schedule BackLog plugin
–SI value conversion
–CamZilla robot control
–…
GSN/MySQL/SensorViz performance statistics
–Custom virtual sensors to measure DB access timing, processing quantities…

GSN New Functionalities
Frontend enhancements
–Network topology graphs, table views
–Log file viewer
–Virtual sensor search
–GSN uptime counter on front page
–Automatic device/type detection per deployment for automating web page generation
Lots of enhancements and bug fixes
Cacti-based system monitoring (MySQL)

SensorViz Plotting Frontend
Time series plotting of large data sets
Backend caching server for different data aggregates
JavaScript plotting tool for web integration
Customizable views, selection, pan & zoom
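
A data aggregate of the kind a caching backend could precompute is a per-bucket summary of the time series. The Python sketch below reduces samples to minimum, mean, and maximum per fixed time bucket. The function name and the choice of statistics are assumptions; the slides do not say which aggregates SensorViz stores.

```python
def aggregate(samples, bucket_seconds):
    """Reduce (timestamp, value) pairs to one (start, min, mean, max) row per bucket.

    Timestamps are assumed to be integer seconds.
    """
    buckets = {}
    for ts, value in sorted(samples):
        start = ts - ts % bucket_seconds  # start of the bucket containing ts
        buckets.setdefault(start, []).append(value)
    return [(start, min(v), sum(v) / len(v), max(v)) for start, v in buckets.items()]


samples = [(0, 1.0), (30, 3.0), (60, 2.0), (119, 4.0), (120, 5.0)]
per_minute = aggregate(samples, 60)
```

Keeping minimum and maximum alongside the mean lets a zoomed-out plot still show spikes that a mean-only aggregate would smooth away.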

Ideas to be discussed with EPFL team
MySQL – GSN interface optimizations
–What can be improved?
–Partitioning of large tables?
–MySQL version/parameterization?
Alerting dashboard
–Better control and overview for alert messaging: the messaging itself is good, but there is no overview of configs and status (see tools like Zabbix, Cacti)
SensorViz integration, improvements
–Migration to other caching technique, in-DB views?
–Other plotting formats
–User interface enhancements
“Standard” way to monitor system/component performance
–Performance metrics for every VS? Packets, timing, memory, bytes, rates…
–Dependency graph of VS/wrappers
–“traceroute” aka data provenance
Performance statistics for MySQL DB
–Per VS or per DB: table size, records, Mbytes, total # entries…
–SHOW STATUS from table permasense_pvt
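
As a starting point for the “traceroute” discussion, data provenance can be expressed as a walk over the dependency graph of virtual sensors and wrappers. The Python sketch below is one possible formulation, not an existing GSN feature; the graph representation and all node names are hypothetical, and the graph is assumed to be acyclic.

```python
def trace(graph, node):
    """Return every upstream path from node back to a source without inputs.

    graph maps each virtual sensor to the list of its direct inputs.
    """
    inputs = graph.get(node, [])
    if not inputs:
        return [[node]]
    return [[node] + path for src in inputs for path in trace(graph, src)]


# Hypothetical dependency graph; names are made up for the example.
graph = {
    "temperature_public": ["temperature_si"],
    "temperature_si": ["temperature_raw", "position_mapping"],
    "temperature_raw": ["backlog_wrapper"],
}
paths = trace(graph, "temperature_public")
```

Attaching per-stage metrics (packets, timing, bytes) to the nodes of each path would connect this idea to the performance monitoring item above.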