Scaling distributed search for diagnostics and prognostics applications
Prof. Jim Austin
Computer Science, University of York, UK
CEO, Cybula Ltd.
OGF
Outline
  Challenge to be addressed
  Background to Signal Data Explorer
  The industry problem: Aerospace
  The science problem: Neuroscience
  Scaling to other industries
The challenge
Challenge
  Massive data collection happening everywhere
  Need to get knowledge from the data
  [Figure labels: Asset or System; Data; Data Store; Knowledge]
Challenge
  [Figure labels: Europe; China; US; Japan; One Data Warehouse]
Challenge
  Data too large for a single data store
  Data changes too fast to keep up to date
  Networks too slow to transfer the data
  The system has a central point of failure
  Large processing needed on centralised store
Signal Data Explorer
  [Figure: SDE data architecture. Four local nodes, each labelled PME and PMC]
Challenge
  Do not move the data - move the processing
  That's much lower effort
  More reliable
  Scalable
  Networks need not be high bandwidth
  Leverages local computation for free (the Grid vision)
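The idea of moving the processing rather than the data can be illustrated with a minimal sketch. It is not the SDE implementation: `LocalNode`, `distributed_search`, `exceeds_limit`, the site names and the sample values are all hypothetical, chosen only to show a query running at each site while only small result lists travel back.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class LocalNode:
    """Hypothetical stand-in for one site that keeps its own signal data."""
    name: str
    signals: Dict[str, List[float]]  # signal id -> samples stored at this site

    def run(self, query: Callable[[List[float]], bool]) -> List[str]:
        # The query executes where the data lives; only matching ids leave the node.
        return [sid for sid, samples in self.signals.items() if query(samples)]


def distributed_search(nodes: List[LocalNode],
                       query: Callable[[List[float]], bool]) -> List[Tuple[str, str]]:
    """Send the same query to every node and merge the (small) result lists."""
    hits = []
    for node in nodes:
        for sid in node.run(query):
            hits.append((node.name, sid))
    return hits


def exceeds_limit(samples: List[float], limit: float = 100.0) -> bool:
    """Example query: does any sample exceed a made-up limit?"""
    return any(s > limit for s in samples)


# Made-up data held at three sites; the raw samples never leave their node.
nodes = [
    LocalNode("Europe", {"e1": [10.0, 20.0, 30.0], "e2": [90.0, 120.0, 80.0]}),
    LocalNode("US", {"u1": [50.0, 60.0, 70.0]}),
    LocalNode("Japan", {"j1": [101.0, 5.0, 5.0]}),
]
results = distributed_search(nodes, exceeds_limit)
print(results)
```

In a real deployment each `run` call would be a remote request to the site. The point of the sketch is only that the traffic is proportional to the number of hits, not to the size of the stored data.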
The Signal Data Explorer
SDE
  [Figure: Basic SDE data process. Labels: Asset; Data feed; Event detector; Data silos]
SDE
  Allows a user to:
  Set up triggers on multiple, complex, real-time data feeds
  Find examples of events that are seen but unknown
  Manage distributed data
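A trigger on a data feed could look something like the sketch below. The slides do not say how SDE defines its triggers, so this swaps in a plain threshold rule as an assumption: `threshold_trigger`, `limit`, `min_run` and the feed values are hypothetical names and data used only for illustration.

```python
from typing import Iterable, Iterator, Tuple


def threshold_trigger(feed: Iterable[float], limit: float,
                      min_run: int = 2) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) sample indices where the feed stays above `limit`
    for at least `min_run` consecutive samples."""
    start = None
    i = -1
    for i, value in enumerate(feed):
        if value > limit:
            if start is None:
                start = i  # a candidate event begins here
        else:
            if start is not None and i - start >= min_run:
                yield (start, i - 1)
            start = None
    # Close an event that is still open when the feed ends.
    if start is not None and (i + 1) - start >= min_run:
        yield (start, i)


# Two made-up feeds watched with the same trigger.
feeds = {
    "vibration": [1, 5, 6, 1, 7, 1, 8, 9, 9],
    "temperature": [1, 1, 1, 1],
}
events = {name: list(threshold_trigger(samples, limit=4))
          for name, samples in feeds.items()}
print(events)
```

Because the function consumes the feed one sample at a time, the same code works on a live stream as well as on a stored list.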
SDE
  Signal search and analysis
  Finds examples similar to a previously seen one
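Searching a signal for examples like a previous one can be sketched as follows. The slides do not describe the matching method SDE uses, so this example substitutes a plain sliding-window Euclidean distance as a stand-in. `find_similar`, `top_k` and the sample data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def find_similar(signal: Sequence[float], example: Sequence[float],
                 top_k: int = 3) -> List[Tuple[float, int]]:
    """Return up to `top_k` (distance, offset) pairs, closest match first."""
    m = len(example)
    scored = []
    # Slide the example along the signal and score every window of equal length.
    for offset in range(len(signal) - m + 1):
        window = signal[offset:offset + m]
        dist = math.sqrt(sum((w - e) ** 2 for w, e in zip(window, example)))
        scored.append((dist, offset))
    scored.sort()
    return scored[:top_k]


# Made-up signal containing one exact and one approximate copy of the example.
signal = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 1.1, 2.9, 1.0, 0.0]
example = [1.0, 3.0, 1.0]
matches = find_similar(signal, example, top_k=2)
print(matches)
```

This brute-force scan is fine for a short illustration; over large distributed archives the same search would be run locally on each node and only the best matches returned.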
Enterprise SDE system
Industry Aerospace: DAME, BROADEN
Industry
  Industry demonstrator for the SDE technology
  Demonstrates the impact SDE can have on the business process
  Allows asset management: full life support processes
Rolls-Royce
  [Figure labels: Engine flight data; Airline office; Maintenance Centre; London Airport; New York Airport; GRID Diagnostics Centre; European data centre; US data centre]
DAME and BROADEN
  DAME developed the basic idea in the lab ( )
  BROADEN has now demonstrated this in Rolls-Royce ( )
  Engineers can now use SDE to visualise and analyse data from Trent engines and test rigs
DAME and BROADEN
  The example has shown how this can be applied in many industries.
Science Neuroscience: CARMEN
Science
  Data is often not shared from experiments
  Individuals duplicate many expensive experiments
  Should share both data and methods
  Distributed data and service repositories needed to support this
CARMEN
  Tackles the scientific application of SDE
  4-year project to build a neuroscience repository and experimental platform
  Project started Oct 2006
  £4.5M over 4 years
CARMEN
  Resolving the neural code from the timing of action potential activity
  Determining ion channel contribution to the timing of action potentials
  Examining integration within networks of differing dimensions
  Understanding the brain may be the greatest informatics challenge of the 21st century
Neuroscience
  Neuroscience Gain
  Recording from brain tissue removed from epileptic patients (scarce tissue and data rates up to 20 Gb/h)
  Online analysis by distributed collaborators will enable the experiment to be defined
  Repository will enable integration of rare case types from different laboratories
  New knowledge will lead to advances in treatment
Scaling to other sectors
Other sectors
  Concept is to provide full asset monitoring
  Provision of the complete maintenance package, not just the asset
  Better value add
  Manufacturers of the asset are the best people to diagnose faults and manage the maintenance
  The SDE system allows this
Other sectors
  Rail industry: track and carriage monitoring
  Oil and gas: pipeline monitoring for leaks
  Power generation: monitoring of generation equipment
  Transport: road traffic management
Thanks
  DAME, BROADEN and CARMEN teams
  Support of EPSRC, DTI and industrial collaborators