Pathway Interaction Database (PID) Market Research BioPortals Tiger Team Meeting Mervi Heiskanen January 31, 2013.

Presentation transcript:

Background
Highly structured, curated collection of information about known biomolecular interactions and key cellular processes, assembled into signaling pathways.
Collaboration with Nature Publishing Group, which supported PID by providing manual curation of pathways.
Curation contract ended in 2012; engineering contract ended in […].

PID Site Stats
Averages for 2012:
– Average users per month: 7,000
– Average number of visits per month: 10,000
– Average visit duration: 8 minutes

Next step: market research to determine community needs and to guide the future direction of PID.

“PID retirement revisited” sent to PID power users (35) to schedule phone interviews.
PID questionnaire attached for those preferring to respond by email.
Some asked to keep responses confidential.
Received 11 responses, all supporting continued development of PID. 31% response rate!
– Phone interviews: 4
– Questionnaire responses: 7
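The response rate quoted above can be checked with a few lines of Python. This is only an arithmetic sketch: the figures come from the slide, while the variable names are illustrative and not part of the original presentation.

```python
# Figures from the slide; variable names are illustrative assumptions.
invited = 35                  # PID power users contacted
phone_interviews = 4
questionnaire_responses = 7

responses = phone_interviews + questionnaire_responses
response_rate = responses / invited

print(f"{responses} of {invited} responded ({response_rate:.0%})")
# prints "11 of 35 responded (31%)"
```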

Pathway Interaction Database (PID) questionnaire
Please describe:
– How are you accessing PID data? Data portal. Download: XML, BioPAX, SVG, JPG. API (via caBio): is programmatic access to data useful/critical?
– How critical are data updates? Frequency of updates?
– Data curation: What are your thoughts on the method of PID data curation? Would you propose to retain curation by a dedicated PID analyst and review by 1-2 domain experts (as has been done in the past), or move to a community-based curation model (e.g., similar to WikiPathways)?
– Would it be useful to integrate PID with other pathway resources, e.g. Reactome, WikiPathways? How important is it to keep PID as a separate entity to retain cancer focus?
– Value of the PID data portal: what are the features you find useful?
– Any other comments, ideas?

How are you accessing PID data?
10 responses
– PID portal: 9 users
– BioPAX: 9 users
– XML: 3 users
– API: 3 responses (might be useful in the future)
– SVG: 1 user
Several data portal users do not use it very often; they are mostly downloading BioPAX.
“I mainly access PID by downloading the UniProtKB mapping file and the pathway ontology annotations.”
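To compare the access methods at a glance, the counts above can be turned into shares of the 10 respondents. The following is a minimal sketch using only the numbers on the slide; the dictionary and variable names are illustrative, and the API figure counts respondents who said it might be useful in the future, not current users.

```python
# Counts from the slide; names are illustrative assumptions.
respondents = 10
access_counts = {
    "PID portal": 9,
    "BioPAX": 9,
    "XML": 3,
    "API": 3,   # "might be useful in the future", not current use
    "SVG": 1,
}

# Share of respondents mentioning each access method.
shares = {method: count / respondents for method, count in access_counts.items()}

# List methods from most to least mentioned.
for method, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{method:<10} {share:.0%}")
```

Respondents could name more than one method, so the shares are not expected to sum to 100%.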

How critical are data updates? Frequency of updates?
10 responses: maintenance and updates are important.
– Very important to at least maintain PID: UniProt and other accession numbers need to be kept up to date. It would be nice to have pathway updates also, e.g. every 2-6 months.
– Currently about 1/3 of genes in the genome are represented in PID pathways; it would be great to increase coverage.
– Given my previous experience, I consider that two or three updates per year are sufficient to provide a high-quality service to the community.
– Technical error fixes, e.g. to the BioPAX exporter, are critical and should not be too difficult to release often. Resource updates can be once in several months.

What are your thoughts on the method of PID data curation?
11 responses
– You need someone to coordinate curation. One option is to fund existing pathway resources to curate data.
– Annotation by in-house analysts helped by outside domain experts is essential. Community annotation isn’t reliable to keep existing annotations complete and current, and fails as a general model for generating new content.
– The WikiPathways effort is great, but the quality and consistency of curation cannot be compared to PID.
– Hesitant to recommend crowdsourcing for PID; editorial review preferred.
– I suggest you adopt a hybrid approach. Crowdsourcing of these pathways is useful once a pretty good model is already in place to seed the community-based editing. So the challenge for you should be to get initial pathways, reviewed by say 1 expert, up on the web. Then turn it over for community editing in a WikiPathways framework.

Importance of integration with other pathway resources and cancer focus?
10 responses
– Keeping the cancer focus is very useful, but this can be accomplished in an integrated PID.
– If an independent PID curation effort continues focused on cancer pathways, other pathway resources like WikiPathways and Reactome could take over the task of managing the database.
– PID's strongest point is its focus on cancer and I’d keep it separate. However, coordinating curation efforts with other pathway databases such as Reactome would be beneficial for both resources, since it will enable higher curation coverage.
– Unless you're really planning to quickly merge into Reactome or another pathway DB, you must have at least 1-2 curators and a technical support person.

Value of the PID data portal
10 responses
– Portal is nice, but if cuts are needed, focus on data quality and BioPAX.
– Pathway diagrams, view distance between molecules, pathway enrichment.
– I find the “research highlights” and the “bioinformatics primers” interesting and useful, especially for training purposes.
– Being able to easily upload/paste a file of identified proteins/genes and finding the pathways they belong to immediately is what really sets PID apart.

Other comments, ideas?
– We have used PID data for about six publications to date.
– I recommend you work with WikiPathways. Crowdsourcing would perhaps cut down the number of domain experts you need from 2 to 1 (or a “half” of a person).
– We use PID heavily as part of our GDAC efforts for the TCGA project. We've used the pathways in many major publications, including our work on the ovarian, colorectal, breast, kidney, and most recently endometrial datasets, all published in Nature.

Conclusions
All responders agreed PID is a valuable and unique resource and should be continued.
They would prefer to keep PID the way it is: high-quality data curation and review process by a trusted neutral party.
Cancer focus is valuable.
Data portal is useful and user friendly; BioPAX download is important for other pathway data resources using PID data.

Recommendation
Restart PID data curation process.
Continue to support PID data portal at CBIIT.
Explore crowdsourcing to increase visibility and community involvement.