Reporting structures for Image Cytometry: Context and Challenges Chris Taylor, EMBL-EBI & NEBC MIBBI [www.mibbi.org] HUPO Proteomics.

Presentation transcript:

Reporting structures for Image Cytometry: Context and Challenges
Chris Taylor, EMBL-EBI & NEBC
MIBBI [www.mibbi.org]
HUPO Proteomics Standards Initiative [psidev.sf.net]
Research Information Network

Mechanisms of scientific advance

Well-oiled cogs meshing perfectly (would be nice)
How well are things working?
  - Cue the Tower of Babel analogy…
  - Situation is improving with respect to standards
  - But few tools, fewer carrots (though some sticks)
Why do we care about that..?
  - Data exchange
  - Comprehensibility (/quality) of work
  - Scope for reuse (parallel or orthogonal)
“Publicly-funded research data are a public good, produced in the public interest”
“Publicly-funded research data should be openly available to the maximum extent possible.”

1. Increased efficiency
• Methods remain properly associated with the results generated
  - Data sets generated with specific techniques/materials can be retrieved from repositories (or excluded from results sets)
• No need to repeatedly construct sets of contextualizing information
  - Facilitates the sharing of data with collaborators
  - Avoids the risk of loss of information through staff turnover
  - Enables time-efficient handover of projects
• For industry specifically (in the light of 21 CFR Part 11)
  - The relevance of data can be assessed through summaries without wasting time wading through full data sets in diverse proprietary formats (‘business intelligence’)
  - Public data can be leveraged as ‘commercial intelligence’

2. Enhanced confidence in data
• Enables fully-informed assessment of results (methods used etc.)
• Supports the assessment of results that may have been generated months or even years ago (e.g. for referees or regulators)
• Facilitates better-informed comparisons of data sets
  - Increased likelihood of discovering the factors (controlled and uncontrolled) that might differentiate those data sets
• Supports the discovery of sources of systematic or random error by correlating errors with metadata features such as the date or the operator concerned
• Requires sufficient information to support the design of appropriate parallel or orthogonal studies to confirm or refute a given result
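As a rough illustration of correlating errors with metadata features, the sketch below groups per-run error values by operator and by date. The records, field names and the `mean_error_by` helper are all hypothetical and invented for this example; a real analysis would use proper statistics on real run metadata.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical run records: each measurement error is stored with its metadata.
runs = [
    {"operator": "A", "date": "2008-11-03", "error": 0.02},
    {"operator": "A", "date": "2008-11-04", "error": 0.03},
    {"operator": "B", "date": "2008-11-03", "error": 0.11},
    {"operator": "B", "date": "2008-11-05", "error": 0.09},
]

def mean_error_by(records, field):
    """Group records by one metadata field and average the error in each group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record["error"])
    return {key: mean(values) for key, values in groups.items()}

# A clearly higher mean for one operator (or one date) hints at a systematic error source.
by_operator = mean_error_by(runs, "operator")
by_date = mean_error_by(runs, "date")
```

The same grouping only works if the operator and date were captured alongside the data in the first place, which is the point of the slide.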

3. Added value, tool development
• Re-using existing data sets for a purpose significantly different to that for which the data were generated
• Building aggregate data sets containing (similar) data from different sources (including standards-compliant public repositories)
• Integrating data from different domains
  - For example, correlating changes in mRNA abundances, protein turnover and metabolic fluxes in response to a stimulus
• Design requirements become both explicit and stable
  - MIAPE modules as driving use cases (tools, formats, CV, DBs)
  - Promotes the development of sophisticated analysis algorithms
  - Presentation of information can be ‘tuned’ appropriately
  - Makes for a more uniform experience

Credit where credit’s due
• Data sharing is more or less a given now, and tools are emerging
  - Lots of sticks, but they only get the bare minimum
  - How to get the best out of data generators?
  - Need standards- and user-friendly tools, and meaningful credit
• Central registries of data sets that can record reuse
  - Well-presented, detailed papers get cited more frequently
  - The same principle should apply to data sets
  - ISNIs for people, DOIs for data
• Side-benefits, challenges
  - Would also clear up problems around paper authorship
  - Would enable other kinds of credit (training, curation, etc.)
  - Community policing: researchers ‘own’ their credit portfolio (enforcement body useful, more likely through review)
  - Problem of ‘micro data sets’ and legacy data

ProteoRED’s MIAPE satisfaction survey
• Spanish multi-site collaboration: provision of proteomics services
• MIAPE customer satisfaction survey (compiled November 2008)
  - Responses from 31 proteomics experts representing 17 labs
Yes: 95% No: 5%

Is the generation of MIAPE compliant reports useful?
  - “Is useful as a results report for my customer to publish proteomics data and for my lab to have all the conditions employed in each analysis.”
  - “For the customers is an easy and very fast way of having a general view of the experiment and is useful for comparison of data. For my lab is a good way of data compilation and is necessary for data publication”
  - “The MIAPE compliant reports is useful for the customers because they have the complete information about their experiment and for my own purposes in my lab for the same reason.”
Are MIAPE compliant reports useful as a quality label for your customers?
  - “A MIAPE report is right now, by itself the best quality label possible.”
  - “Yes, sure. The user has all the information necessary to reproduce their analysis of a robust manner.”
  - “As for today, none of our customers had the need for MIAPE documentation.”

So what (/why) is a standards body again..?
Consider the three main ‘omics standards bodies’
  - What defines a (candidate) standards-generating body?
  - “A beer and an airline” (Zappa)
  - Requirements, formats, vocabulary
  - Regular full-blown open attendance meetings, lists, etc.
  - PSI (proteomics), GSC (genomics), MGED (transcriptomics)
Hugely dependent on their respective communities
  - Requirements (What are we doing and why are we doing it?)
  - Development (By the people, for the people. Mostly.)
  - Testing (No it isn’t finished, but yes I’d like you to use it…)
  - Uptake, by all of the various kinds of stakeholder:
  - Publishers, funders, vendors, tool/database developers
  - The user community (capture, store, search, analyse)

Ingredients for MI pie
Domain specialists & IT types (initial drafts, evolution)
Journals
  - The real issue for any MI project is getting enough people to comment on what you have (community buy-in distinguishes a toy project from something to be taken seriously)
  - Having journals help in garnering reviews is great (editorials, web site links, mail shots even). Their motive, of course, is that fuller reporting = better content = higher citation index.
Funders
  - MI projects can claim to be slightly outside of 'normal' science; may form funding policy components (arguments about maximum value)
  - Funders therefore have a motive (similar to journals) to ensure that MI guidelines, which they may endorse down the line, are representative and mature
  - They can help by allocating slots at (appropriate) meetings of their award holders for you to show your stuff. Things like that.

Ingredients for MI pie
Vendors
  - The cost of MIs in person-hours will be the major objection
  - Vendors can implement parameter export to an appropriate file format, ideally using some helpful CV (somebody else's problems)
  - Vendors also have engineers (and some sales staff) who really know their kit and make for great contributors/reviewers.
  - For some standards bodies (like PSI, MGED) their sponsorship has been very helpful also (believe it or not it would seem possible to monetise a standards body).
Food / pharma
  - Already used to better, if rarely perfect, data capture and management; for example, 21 CFR Part 11 (MI = exec summary…)
Trainers
  - There is a small army of individuals training scientists, especially in relation to IT (EBI does a lot of this but I mean commercial training providers) → ‘Resource packs’

Modelling the biosciences
Technologically-delineated views of the world
  A: transcriptomics  B: proteomics  C: metabolomics  …and…
Biologically-delineated views of the world
  A: plant biology  B: epidemiology  C: microbiology  …and…
Generic features (‘common core’)
  - Description of source biomaterial
  - Experimental design components
[Diagram labels: Arrays, Scanning, Arrays & Scanning, Columns, Gels, MS, MS, FTIR, NMR, Columns]

Modelling the biosciences (slightly differently)
Assay: Omics and miscellaneous techniques
Investigation: Medical syndrome, environmental effect, etc.
Study: Toxicology, environmental science, etc.

Reporting guidelines: a case in point
• MIAME, MIAPE, MIAPA, MIACA, MIARE, MIFACE, MISFISHIE, MIGS, MIMIx, MIQAS, MIRIAM, (MIAFGE, MIAO), My Goodness…
• ‘MI’ checklists usually developed independently, by groups working within particular biological or technological domains
  - Difficult to obtain an overview of the full range of checklists
  - Tracking the evolution of single checklists is non-trivial
  - Checklists are inevitably partially redundant one against another
  - Where they overlap, arbitrary decisions on wording and sub-structuring make integration difficult
• Significant difficulties for those who routinely combine information from multiple biological domains and technology platforms
  - Example: An investigation looking at the impact of toxins on a sentinel species using proteomics (‘eco-toxico-proteomics’)
  - What reporting standard(s) should they be using?

The MIBBI Project (mibbi.org)
• International collaboration between communities developing ‘Minimum Information’ (MI) checklists
• Two distinct goals (Portal and Foundry)
  - Raise awareness of various minimum reporting specifications
  - Promote gradual integration of checklists
• Lots of enthusiasm (drafters, users, funders, journals)
• 31 projects committed (to the portal) to date, including:
  - MIGS, MINSEQE & MINIMESS (genomics, sequencing)
  - MIAME (μarrays), MIAPE (proteomics), CIMR (metabolomics)
  - MIGen & MIQAS (genotyping), MIARE (RNAi), MISFISHIE (in situ)

Nature Biotechnol 26(8), 889–896 (2008)

The MIBBI Project (mibbi.org)

Interaction graph for projects (line thickness & colour saturation show similarity)
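The slide does not say how similarity between projects was measured. One simple possibility, shown here purely as an assumption, is Jaccard similarity over the sets of reporting items in each checklist. The checklist names and items below are invented placeholders, not the content of any real MI guideline.

```python
from itertools import combinations

# Hypothetical, heavily simplified checklists: each maps to a set of reporting items.
checklists = {
    "checklist_a": {"sample source", "experimental design", "array design", "normalisation"},
    "checklist_b": {"sample source", "experimental design", "instrument settings"},
    "checklist_c": {"sample source", "instrument settings", "gating"},
}

def jaccard(a, b):
    """Shared items divided by all items in either checklist (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# One score per pair of checklists; in a graph this could drive edge thickness.
similarity = {
    (x, y): jaccard(checklists[x], checklists[y])
    for x, y in combinations(sorted(checklists), 2)
}
```

A score like this only works once item wording has been harmonised, which is exactly the integration problem the checklists currently have.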

The MIBBI Project (mibbi.org)

‘Pedro’ tool → XML → (via XSLT) Wiki code (etc.)
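To make the XML-to-wiki step concrete, here is a minimal sketch that swaps in Python's standard-library ElementTree for the XSLT stage named on the slide. The XML element and attribute names, and the `checklist_to_wiki` function, are hypothetical and do not reflect the real Pedro output schema or the project's actual stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical checklist XML; the element and attribute names are invented
# for illustration and do not reflect the real Pedro output schema.
CHECKLIST_XML = """
<checklist name="Example MI checklist">
  <section title="Sample">
    <item>Source organism</item>
    <item>Preparation protocol</item>
  </section>
  <section title="Assay">
    <item>Instrument model</item>
  </section>
</checklist>
"""

def checklist_to_wiki(xml_text):
    """Render the checklist as MediaWiki-style headings and bullet points."""
    root = ET.fromstring(xml_text)
    lines = ["= " + root.get("name") + " ="]
    for section in root.findall("section"):
        lines.append("== " + section.get("title") + " ==")
        for item in section.findall("item"):
            lines.append("* " + item.text.strip())
    return "\n".join(lines)

wiki_text = checklist_to_wiki(CHECKLIST_XML)
```

Keeping the checklist in one XML source and generating each presentation from it means the wiki page, and any other output format, cannot drift away from the master copy.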

MICheckout: Supporting Users

Minimum Information guidelines: Progress on uptake
• MIAME is the earliest of the ‘new generation’ of guidelines
  - Supported by ArrayExpress/GEO
  - Required by many journals
• CONSORT ( & network.org)
  - Required by many journals (N.B., no databases per se)
• Other guidelines recommended for consideration
  - Individually (e.g., MIMIx, MIFlowCyt [NPG])
  - Via MIBBI (BMC, Science [soon], OMICS, others coming too)
• Many funders recommend use of ‘accepted’ community standards
• But… Uptake is closer to nil for projects lacking supporting resources
  - Case in point: MIAPE (no usage until a web tool appeared)

ICS: overlapping guidelines registered at the MIBBI Portal
• The study sample (potentially described in header metadata)
  - CIMR (human samples, cell culture)
  - MIFlowCyt (cell counting/sorting)
  - MIACA, MIATA (cell-based assays)
• The assay
  - MIACA (cell-based assays)
  - Some general overlap (software, processing)
• Image analysis
  - Some general overlap (image data [MIAPE] & statistics)

Tools?

From theory to practice: tools for the community
Java standalone components for local installation, which can work independently or as a unified system
[Diagram labels: Download similar studies; Experiments; EXPERIMENTALIST]

The International Conference on Systems Biology (ICSB), August 2008, Susanna-Assunta Sansone
Ontologies, accessed in real time via the Ontology Lookup Service and BioPortal
Example of guiding the experimentalist to search and select a term from the EnvO ontology, to describe the ‘habitat’ of a sample

Spreadsheet functionalities, including: move, add, copy, paste, undo, redo and right click options

Groups of samples are colour coded

Public instance (EBI)