Why Create a PGDB? Perform pathway analyses as part of a genome project Analyze omics data Create a central public information resource for the organism,


1 Creating a …
Community Database
Organism-Specific Database
Model-Organism Database

2 Why Create a PGDB?
Perform pathway analyses as part of a genome project
Analyze omics data
Create a central public information resource for the organism, update on ongoing basis
Create a metabolic model
Perform comparative analyses

3 Model Organism Databases
DBs that describe the genome and other information about an organism
Curated by experts for that organism
No one group can curate all the world’s genomes
Distribute workload across a community of experts to create a community resource
Every sequenced organism with an active experimental community requires a MOD
Integrate genome data with information about the biochemical and genetic network of the organism
Integrate literature-based information with computational predictions

4 Rationale for MODs
Each “complete” genome is incomplete in several respects:
  40%-60% of genes have no assigned function
  Roughly 7% of those assigned functions are incorrect
  Many assigned functions are non-specific
MODs are platforms for global analyses of an organism
  Interpret omics data in a pathway context
  In silico prediction of essential genes
  Characterize systems properties of metabolic and genetic networks
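To make the percentages above concrete, here is a small worked sketch. The genome size of 4,000 genes is a hypothetical figure chosen for illustration, and the function name is not part of Pathway Tools; only the 40%-60% and 7% rates come from the slide.

```python
def annotation_breakdown(total_genes, unassigned_fraction, error_rate=0.07):
    """Split a gene count into unassigned, incorrectly assigned,
    and correctly assigned genes, using the rates quoted on the slide."""
    unassigned = round(total_genes * unassigned_fraction)
    assigned = total_genes - unassigned
    incorrect = round(assigned * error_rate)
    return {
        "unassigned": unassigned,
        "assigned": assigned,
        "incorrect": incorrect,
        "correct": assigned - incorrect,
    }


GENOME_SIZE = 4000  # hypothetical genome size, not from the slides

# Lower and upper ends of the 40%-60% "no assigned function" range.
low = annotation_breakdown(GENOME_SIZE, 0.40)
high = annotation_breakdown(GENOME_SIZE, 0.60)

print("40% unassigned:", low)
print("60% unassigned:", high)
```

Under these assumptions, only about 1,490 to 2,230 of the 4,000 genes would carry a correct function assignment, which is the gap that curation in a MOD is meant to close.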

5 What is Curation?
Ongoing updating and refinement of a PGDB
Correct false-positive and false-negative predictions
Incorporate information from experimental literature
Update genome sequence
Update gene functions, gene positions, gene names
Author comments and citations
Add new pathways, modify existing pathways
Enter information about regulatory networks

6 Issues in Creating Public MODs
Scope/prioritize the project
Identify user community
Obtain buy-in and help from scientific community
Obtain funding
IT: Set up database server, Web server
Hire and train curators

7 Administering Pathway Tools

8 New Pathway Tools Releases
Major releases = External software releases
  Twice per year
  Announced on ptools-users mailing list
Minor releases twice per year affect only our BioCyc.org Web site and flatfile distributions
We support one prior release only
Releases announced on
Read release notes at
Install process:
  Upgrade schema of your DB (software assisted)

9 PGDB Storage: File or Relational Database
File storage:
  Advantages:
    No RDBMS installation and configuration
  Disadvantages:
    Must be loaded and saved in its entirety
    No transaction history
    No concurrent access for multiple users
MySQL storage:
  Faster read access, faster saves
  Concurrent update access for multiple users
  Stores transaction history of all PGDB updates
  RDBMS must be installed and configured

10 Multiuser Access to PGDBs
PGDB stored within one MySQL server
Each curator installs PTools on their computer
Curator computers query RDBMS server via internet
For each frame access, PTools queries
  In-memory cache, disk cache, RDBMS server
After curator saves changes, all changes made by other users are loaded into curator’s session
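The frame-access path described above can be sketched as a tiered lookup. This is only an illustration, not Pathway Tools code: the class and method names are hypothetical, the disk cache is stood in for by a dictionary, and it assumes the three tiers are consulted in the order listed and that a frame found in a slower tier is copied into the faster ones.

```python
from typing import Callable, Dict


class TieredFrameStore:
    """Hypothetical three-tier lookup: memory, then disk, then server."""

    def __init__(self, fetch_from_server: Callable[[str], dict]):
        self.memory: Dict[str, dict] = {}  # in-memory cache
        self.disk: Dict[str, dict] = {}    # stand-in for an on-disk cache
        self.fetch_from_server = fetch_from_server

    def get_frame(self, frame_id: str) -> dict:
        # 1. Fastest tier: in-memory cache.
        if frame_id in self.memory:
            return self.memory[frame_id]
        # 2. Next tier: disk cache; promote the frame into memory.
        if frame_id in self.disk:
            frame = self.disk[frame_id]
            self.memory[frame_id] = frame
            return frame
        # 3. Slowest tier: the RDBMS server; fill both caches.
        frame = self.fetch_from_server(frame_id)
        self.disk[frame_id] = frame
        self.memory[frame_id] = frame
        return frame


# Usage with a fake server that records how often it is contacted.
server_calls = []


def fake_server(frame_id: str) -> dict:
    server_calls.append(frame_id)
    return {"id": frame_id}


store = TieredFrameStore(fake_server)
store.get_frame("GENE-1")   # goes to the server
store.get_frame("GENE-1")   # served from memory
print("server round trips:", len(server_calls))
```

The point of the layering is that repeated access to the same frame avoids a network round trip to the shared server.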

11 How to Release a PGDB?
Decide on release frequency and schedule
  Don’t wait until it’s perfect to release it!
Quality assurance
  Run consistency checker
    Tools -> Consistency Checker
    Also updates organism-summary statistics
  Update publications, authors in organism frame
    Update via Organism editor
Create new version of PGDB
  ptools-local/pgdbs/yeastcyc/1.0/kb/yeastbase.ocelot
  Edit against the new version, release the old version
Author release notes
Register PGDB in SRI PGDB registry
  Will allow SRI to include it in BioCyc

12 Pathway Tools Data Import/Export
File->Export
File->Import
Export/import to/from tab-delimited files
Export to GenBank, GFF3 (soon), SBML, BioPAX
Export to attribute-value files
  Attribute-value files can be imported into BioWarehouse
    Relational database system for bioinformatics database integration
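As an illustration of working with an attribute-value export, the sketch below parses a small sample into Python dictionaries. The layout is an assumption, not something the slides specify: one "ATTRIBUTE - value" pair per line, "//" ending each record, and "#" starting a comment line. The sample records and the function name are hypothetical.

```python
from collections import defaultdict


def parse_attribute_value(text):
    """Parse attribute-value text into a list of records.

    Assumed layout (not confirmed by the slides): 'ATTR - value' lines,
    '//' terminates a record, '#' starts a comment line.
    Repeated attributes are collected into a list of values.
    """
    records = []
    current = defaultdict(list)
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        if stripped == "//":
            if current:
                records.append(dict(current))
            current = defaultdict(list)
            continue
        attribute, separator, value = stripped.partition(" - ")
        if separator:  # ignore lines that do not match 'ATTR - value'
            current[attribute].append(value.strip())
    if current:  # tolerate a missing final '//'
        records.append(dict(current))
    return records


# Hypothetical sample data for illustration only.
SAMPLE = """\
# two made-up gene records
UNIQUE-ID - GENE-1
COMMON-NAME - abcA
SYNONYMS - abc1
SYNONYMS - abc2
//
UNIQUE-ID - GENE-2
COMMON-NAME - defB
//
"""

records = parse_attribute_value(SAMPLE)
for record in records:
    print(record["UNIQUE-ID"][0], record.get("SYNONYMS", []))
```

Every value is kept in a list so that single-valued and multi-valued attributes are handled the same way.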

13 Registry: Public PGDB Sharing
PGDB registry maintained by SRI at URL
Registry operations
  List contents of registry
  Download PGDBs listed in the registry
  Register PGDBs you have created

14 Registry Details
Why register your PGDB?
  Facilitate its download by other scientists
  Facilitate its inclusion in BioCyc.org
Why download a PGDB?
  Desktop Navigator provides faster/more functionality than Web
  Comparative operations
  Programmatic querying and processing of PGDB

15 Changes Planned for BioCyc.org
BioCyc will be starting a subscription model July 1

16 Why?
Government funding for databases shrinking
  BioCyc funding cut 27% as number of genomes climbed 5X in 5 years
  No other foreseeable sources of funding for "Big Knowledge" in life sciences
Goals:
  Create high-quality curated EcoCyc-like DBs for many organisms
  Couple with extensive user-friendly bioinformatics tools

17 How?
Subscription access to BioCyc.org by institutions, individuals
  Subscription rates will depend on usage levels from previous year
EcoCyc and MetaCyc will remain free
Pathway Tools will remain free

