CARMEN: Code Analysis, Repository and Modelling for e-Neuroscience.

Research Challenge
- Understanding the brain may be the greatest informatics challenge of the 21st century
- Worldwide, more than 100,000 neuroscientists (about 5,000 in the UK) are generating vast amounts of data
- Principal experimental data formats:
  - molecular (genomic/proteomic)
  - neurophysiological (time-series electrical measures of activity)
  - anatomical (spatial)
  - behavioural
- Neuroinformatics concerns how these data are handled and integrated, including the application of computational modelling

Neuroinformatics
- In recent years new technological opportunities for data sharing have emerged with faster networks, improved database technologies, and affordable massive data storage capabilities
- Neuroinformatics is increasingly exploiting these opportunities to enable data sharing, re-use of data and novel analysis based on new combinations of data that can be performed via database systems

Need for Cooperation
- The OECD identified a need to work cooperatively in order to achieve major advances and has established the International Neuroinformatics Coordinating Facility
- Cooperation will permit:
  - development of common processes
  - best value from data: long-term curation
  - mega-analysis of large data sets
  - integration of data sets across different scales and different approaches
  - interdisciplinary research

Potential Barriers to Cooperation
Technical
- Multiple proprietary data formats
- Need for detailed, standardised and evolvable metadata
- Volume of the data to be analysed
Cultural
- Multiple communities each acting independently
- Concerns about the consequences of sharing data
- Difficulty in appreciating how the science could be moved forwards by e-Science

CARMEN – Focus on Neural Activity
- Resolving the neural code from the timing of action potential activity
- [Figure labels: neurone 1, neurone 2, neurone 3]
- Raw voltage signal data is collected using:
  - single or multi-electrode array recording
  - novel optical recording, particularly the activity dynamics of large networks

Electrophysiological Data
- Much current knowledge about brain function is based on analysis of firing patterns of individual neurones.
- New computer-based data acquisition systems and techniques for recording simultaneously from many neurones mean data are amassing rapidly.
- Neural modelling generates massive simulated data sets that need to be processed, analysed and compared with experimental data.
- Neuronal recordings can be intra- or extra-cellular recordings of single spikes, ensembles of neurones, or field potentials.
- All of these are types of time-series data, which require a specialised information handling system.

CARMEN Objectives
- To demonstrate and sustain advances in neuroscience enabled by e-Science technology
- To create a grid-enabled, real-time virtual laboratory environment for neurophysiological data
- To develop an extensible, client-defined toolkit for data extraction, analysis and modelling
- To provide a repository for archiving, sharing, integration and discovery of data
- To achieve wide community and commercial engagement in developing and using CARMEN

Project Exemplar
- Recording from brain tissue removed from epileptic patients (scarce tissue and data rates up to 20 GB/h)
- Online analysis by distributed collaborators will enable the experiment to be defined during data collection
- The repository will enable integration of rare case types from different laboratories
- New knowledge will lead to advances in treatment

CARMEN Consortium
- Newcastle: Colin Ingram, Paul Watson, Stuart Baker, Marcus Kaiser, Phil Lord, Evelyne Sernagor, Tom Smulders, Miles Whittington
- York: Jim Austin, Tom Jackson
- Stirling: Leslie Smith
- Plymouth: Roman Borisyuk
- Cambridge: Stephen Eglen
- Warwick: Jianfeng Feng
- Sheffield: Kevin Gurney, Paul Overton
- Manchester: Stefano Panzeri
- Leicester: Rodrigo Quian Quiroga
- Imperial: Simon Schultz
- St. Andrews: Anne Smith

CARMEN Consortium
Commercial Partners
- applications in the pharmaceutical sector
- interfacing of data acquisition software
- application of database infrastructure
- commercialisation of analysis tools

Work Packages
- Data Storage & Analysis
- WP1 Spike Detection & Sorting
- WP2 Information Theoretic Analysis of Derived Signals
- WP3 Data-Driven Parameter Determination in Conductance-Based Models
- WP4 Intelligent Database Querying
- WP5 Measurement and Visualisation of Spike Synchronisation
- WP6 Multilevel Analysis and Modelling in Networks

CARMEN Structure
Hub and Spoke Project
- Hub: a CAIRN repository for the storage and analysis of neuroscience data
- Spokes: a set of neuroscience projects that will produce data and analysis services for the hub, and use it to address key neuroscience questions

e-Science Challenges
- Managing vast amounts of data
  - > 50 TB primary data
- Extracting value from the data
  - discovery & interpretation
  - analysis – harnessing compute resources
  - curation of services as well as data
- Controlling access to the data & services

CARMEN Active Information Repository Node
- OMII: Grimoire
- DAME: Signal Data Explorer
- OMII/myGrid: Taverna/BPEL
- OGSA-DAI & SRB
- Gold: Role & Task based Security
- myGrid & Gold: Feta, Provenance
- Dynasoar
- White Rose Grid
- Newcastle Grid

A Typical Scenario We Want to Support
1. Data Collection from Electrode Array
2. Spike Detection with User Defined Threshold
3. Spike Sorting
4. Analysis
5. Visualisation
Currently, this is a semi-manual process. We have an initial prototype for automating this.
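To make the "Spike Detection with User Defined Threshold" step of this scenario concrete, here is a minimal sketch in Python. It is not the CARMEN or Signal Data Explorer implementation; the function name `detect_spikes`, the `refractory` parameter and the sample trace are all hypothetical and chosen only for illustration. It assumes a spike is an upward crossing of the threshold in a sampled voltage trace.

```python
from typing import List


def detect_spikes(signal: List[float], threshold: float, refractory: int = 1) -> List[int]:
    """Return sample indices at which the signal crosses the threshold upwards.

    Hypothetical helper: `refractory` is the minimum number of samples
    that must separate two reported spikes.
    """
    spikes: List[int] = []
    last = -refractory  # allows a spike at the earliest possible index
    for i in range(1, len(signal)):
        crossed_up = signal[i - 1] < threshold <= signal[i]
        if crossed_up and i - last >= refractory:
            spikes.append(i)
            last = i
    return spikes


# Invented example trace (arbitrary units), not real recording data.
trace = [0.0, 0.1, 0.9, 0.2, 0.0, 1.2, 1.1, 0.1]
spike_indices = detect_spikes(trace, threshold=0.8)
print(spike_indices)
```

The user-defined threshold is the only tuning knob that the slide names; the detected indices would then be passed on to the spike sorting step.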

Signal Data Explorer

Example Workflow

Example Workflow Enactment
[Diagram labels: SRB; FileSystem; RDBMS; External Client; Spike Sorting Service; Reporting; Dynamically Deployed Services in Dynasoar; BPEL / Taverna; Registry; INPUT Data; OUTPUT; Metadata; Available Services; Repository; Security; Workflow Engine; Query]

Example Graph Output

Example Movie Output

Some Remaining Challenges
- Extensible, standardised metadata for neuroscience
  - data formats (timing, data channels, etc.)
  - experimental design (e.g. stimuli or drug treatments)
  - concurrent data (e.g. behaviour, physiological measures)
  - experimental idiosyncrasies (e.g. artifacts)
  - experimental conditions (animals, temperature, treatments, etc.)
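As an illustration of what an extensible metadata record covering these categories might look like, here is a small Python sketch. It is not the CARMEN metadata schema: the class `RecordingMetadata`, every field name and every value below are hypothetical. The `extensions` dictionary is one simple way to let a record evolve without changing the fixed fields.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class RecordingMetadata:
    """Hypothetical record; one field per category listed on the slide."""
    sampling_rate_hz: float                                    # data format: timing
    channels: List[str]                                        # data format: data channels
    stimuli: List[str] = field(default_factory=list)           # experimental design
    concurrent_data: List[str] = field(default_factory=list)   # e.g. behaviour
    artifacts: List[str] = field(default_factory=list)         # idiosyncrasies
    conditions: Dict[str, str] = field(default_factory=dict)   # animals, temperature
    extensions: Dict[str, str] = field(default_factory=dict)   # lab-specific additions


# Invented example values, for illustration only.
record = RecordingMetadata(
    sampling_rate_hz=25000.0,
    channels=["ch1", "ch2"],
    stimuli=["drug treatment A"],
    conditions={"temperature_c": "34"},
    extensions={"lab_notebook_ref": "example-001"},
)

# Serialise to JSON for storage, then rebuild the record from it.
record_json = json.dumps(asdict(record), sort_keys=True)
restored = RecordingMetadata(**json.loads(record_json))
print(record_json)
```

Keeping the fixed fields small and pushing laboratory-specific detail into `extensions` is only one possible design; a shared, agreed vocabulary for the keys would still be needed for records from different laboratories to be comparable.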

Some Remaining Challenges (cont.)
- Locating patterns in time-series data across multiple levels of abstraction
- Reproducible e-Science
  - curating services as well as data
  - public repositories of deployable services
  - dynamic service deployment
- Real-time expert collaboration

CARMEN
- CARMEN is delivering an e-Science infrastructure that can be applied across a range of diverse and challenging applications (not only neuroscience)
- CARMEN enables cooperation and interdisciplinary working in ways currently not possible
- CARMEN will deliver new results in neuroscience, computer science and medicine