Metadata For CARMEN Phillip Lord and Frank Gibson.


Problems “In the standard model, one collects data, publishes a paper or papers and then gradually loses the original dataset.” The New Knowledge Economy and Science and Technology Policy, Geoffrey Bowker, University of California, San Diego

The need for clear metadata Most neuroscience data is relatively simple in structure, but often contextually complex. Sometimes associated with behavioural features.

Neuroscience spike data The raw data is just a waveform. But what is the experiment for? What stimulus is the organism/tissue receiving? Even, which channel is which? The data sets being produced are (reasonably) large (tens of GB, or 1 TB in three months).
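
To put the data volume on the slide above into everyday terms, the short sketch below turns "a terabyte in three months" into a daily rate and a yearly projection. The decimal units, the 91-day reading of "three months", and the constant acquisition rate are all assumptions made for illustration; none of them comes from the slides.

```python
# Rough sizing of the data rate quoted on the slide.
# Assumptions (not from the slides): decimal units (1 TB = 1000 GB),
# "three months" read as 91 days, and a constant acquisition rate.
GB_PER_TB = 1000
DAYS_IN_THREE_MONTHS = 91


def daily_rate_gb(total_tb: float, days: int) -> float:
    """Average volume per day, in GB, for total_tb collected over days."""
    return total_tb * GB_PER_TB / days


def projected_tb(total_tb: float, days: int, horizon_days: int) -> float:
    """Linear projection of the same rate over horizon_days, in TB."""
    return total_tb * horizon_days / days


rate = daily_rate_gb(1.0, DAYS_IN_THREE_MONTHS)
yearly = projected_tb(1.0, DAYS_IN_THREE_MONTHS, 365)
print(f"{rate:.1f} GB/day, {yearly:.1f} TB/year")
```

Under these assumptions a single source produces about 11 GB a day and about 4 TB a year, which is the scale the metadata has to keep searchable.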

Information Extraction How do we extract the information?

Multi-Author data (from Katherine James, NCL)

    Author             PMID   Type                  Size
1   Davierwala et al          Synthetic_Lethality   627
2   Krogan et al              Affinity_Capture-MS   164
3   Hazbun et al              Affinity_Capture-MS   3210
4   Gavin et al               Affinity_Capture-MS   3596
5   Ho et al                  Affinity_Capture-MS   733
6   Ito et al                 Two-hybrid            275

How do we represent… Laboratory Experiments In silico Analysis Derived data

Joseph Whitworth

Metadata Description of results Sample How it was generated Equipment Processing steps Expensive to capture Important to validate result Lab-book

The need for standards! “established by consensus and approved by a recognized body, that provides, […] rules, […] for […] the optimum degree of order in a given context” BSI -

View from microarrays Content Standard: Minimal Information. MAGE: Structure. MO: Terminology. From the MGED society

Life science communities

Society                                                Domain          Website
The Genomics Standards Consortium (GSC)                Genomics        http://darwin.nox.ac.uk/gsc/
Microarray and Gene Expression Data Society (MGED)     Genomics        www.mged.org
Proteomics Standards Initiative (PSI)                  Proteomics      http://psidev.info
Metabolomics Standards Initiative (MSI)                Metabolomics    www.metabolomicssociety.org
Flow Cytometry experiment Community                    Flow Cytometry

MINI – electrophysiology General Features Study Subject Recording Location Task Stimulus Recording Time Series Data

Recording Location Structure Brain Area Slice Thickness Slice Orientation Cell Type –Cell Type co-ordinates –Location conformation
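
A minimal-information checklist like the one on the two slides above lends itself to a simple completeness check. The sketch below is only an illustration: the section names are read off the MINI slide (the split into seven sections is an assumption, since the slide text is flattened), the Recording Location fields are read off the slide that follows it, and treating every field as mandatory is a simplification. The function `missing_fields` and the dictionary layout are hypothetical and are not part of MINI or any CARMEN software.

```python
from typing import Dict, List

# Section names as read from the MINI slide (assumed split of flattened text).
MINI_SECTIONS: List[str] = [
    "General Features",
    "Study Subject",
    "Recording Location",
    "Task",
    "Stimulus",
    "Recording",
    "Time Series Data",
]

# Fields per section. Only Recording Location is detailed on the slides;
# requiring all of them for every experiment is an assumption.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "Recording Location": [
        "Structure",
        "Brain Area",
        "Slice Thickness",
        "Slice Orientation",
        "Cell Type",
    ],
}


def missing_fields(record: Dict[str, Dict[str, str]]) -> List[str]:
    """List the sections and fields a metadata record has not filled in."""
    missing: List[str] = []
    for section in MINI_SECTIONS:
        if section not in record:
            missing.append(section)
            continue
        for name in REQUIRED_FIELDS.get(section, []):
            # A field counts as missing when absent or left empty.
            if not record[section].get(name):
                missing.append(f"{section}/{name}")
    return missing


# Hypothetical, partly completed record.
example = {
    "Recording Location": {"Structure": "brain slice", "Brain Area": "hippocampus"},
    "Stimulus": {},
}
for item in missing_fields(example):
    print("missing:", item)
```

A check of this kind could run at data entry, so that gaps are reported while the experimenter still remembers the answers.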

View from microarrays Content Standard: Minimal Information. MAGE: Structure. MO: Terminology. From the MGED society

Functional Genomics Experiment (FuGE) Model of common components in science investigations, such as materials, data, protocols, equipment and software. Provides a framework for capturing complete laboratory workflows, enabling the integration of pre-existing data formats.
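
The idea of modelling a workflow from common components can be sketched in a few lines. The classes below are loosely inspired by the components named on the slide (materials, data, protocols, equipment, software); they are hypothetical and are not the FuGE object model or its XML schema. All names and example values are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Protocol:
    """A procedure, with the equipment and software it relies on."""
    name: str
    equipment: List[str] = field(default_factory=list)
    software: List[str] = field(default_factory=list)


@dataclass
class Step:
    """One application of a protocol: named inputs become named outputs.

    Inputs and outputs are plain identifiers for materials or data files.
    """
    protocol: Protocol
    inputs: List[str]
    outputs: List[str]


def provenance(item: str, steps: List[Step]) -> List[str]:
    """Walk backwards from item and collect the protocols that produced it."""
    trail: List[str] = []
    frontier = [item]
    seen = set()
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        for step in steps:
            if current in step.outputs:
                trail.append(step.protocol.name)
                frontier.extend(step.inputs)
    return trail


# Hypothetical two-step workflow: a lab recording, then an in silico analysis.
recording = Protocol("recording", equipment=["multi-electrode array"])
sorting = Protocol("spike sorting", software=["sorter"])
workflow = [
    Step(recording, inputs=["slice-1"], outputs=["raw.dat"]),
    Step(sorting, inputs=["raw.dat"], outputs=["spikes.csv"]),
]
print(provenance("spikes.csv", workflow))
```

Because laboratory and in silico steps share one shape here, derived data can be traced back to the sample it came from, which is the kind of integration the slide describes.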

Part of CISBAN in a nutshell (diagram): a robot screens a reference set of 5,000 mutant strains for sensitivity to damage/nutrition (‘Folate’, ‘MMS’), followed by data curation, functional analysis, and interactions with the in silico programme.

CISBAN dataflow Neil Wipat, Newcastle University

Data Entry with SyMBA Allyson Lister, Newcastle University

Data Entry with SyMBA

Summary We are generating metadata “standards” for neurosciences. We are following a well-trodden path from bioinformatics. We adopted FuGE and have built MINI.

Future Work More neurosciences experimental datatypes. Minimal Information about a Service –Describe analysis software as well as lab experiments. Outreach!

Acknowledgements MINI: Frank Gibson, Paul G Overton, Tom V Smulders, Simon R Schultz, Stephen J Eglen, Colin D Ingram, Stefano Panzeri, Phil Bream, Evelyne Sernagor, Mark Cunningham, Christopher Adams, Christoph Echtermeyer, Jennifer Simonotto, Marcus Kaiser, Daniel C Swan, Martyn Fletcher, Phillip Lord. CISBAN: Anil Wipat (PI), Allyson Lister (Research Associate),