Practices of Science Data Sharing Platform for Bioinformation in SCBIT

1 Practices of Science Data Sharing Platform for Bioinformation in SCBIT
Guoqing Zhang Oct 7th 2010

2 Overview of SDSPB
The Shanghai Center for Bioinformation Technology (SCBIT) aims to promote the sharing of bioinformation data in China.
The Scientific Data Sharing Platform for Bioinformation (SDSPB) is SCBIT's operational platform for data sharing.
SDSPB builds and manages databases of biological data that serve the life science research community in China, especially in the Yangtze River Delta.

3 Data Transmission in SDSPB
Retrieve data from third-party data centers, including EBI, NCBI, KEGG, and others.
Accept data submitted by users, including nucleotide sequences, protein sequences, proteomics data, and other data.
Provide a download service.

4 SDSPB has built many databases and online services in Chinese and/or English
SDSPB classifies these resources as basic resources, subject resources, online services, and a metadata system.

5 Basic Resources and Subject Resources
Collect various data from basic resources.
Import related data to meet the needs of subject resources.
Example subject: hepatitis B virus.

6 Full Database Update
Taxonomy database as supporting data for all other resources
NT/NR databases for BLAST
UniProt database for protein identification
KEGG pathway database for pathway analysis
Entrez Gene database for gene names
ID mapping database for database integration

7 Data Update for Basic Resources
Each database is updated in full, weekly or monthly.
Downloads run automatically in the background.
Once a download finishes, a notification is sent to the curators.
When a download runs overtime or a notification arrives, the curators check the data manually.
The download speed is between 2 and 3 kb/s, and never more than 100 kb/s.
About 1/5 of the data has to be downloaded manually.
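The update cycle on this slide (background download, a message to the curators on completion, a manual check when a download runs overtime) can be sketched roughly as follows. This is only an illustration under stated assumptions: `run_update`, `UpdateResult`, the `notify` callback, and the injectable `clock` are hypothetical names, not part of SDSPB's actual scripts, and the data source is modelled as any iterable of byte chunks.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class UpdateResult:
    name: str            # database being refreshed, e.g. "taxonomy"
    status: str          # "finished" or "overtime"
    bytes_received: int  # how much data arrived before the job ended


def run_update(
    name: str,
    chunks: Iterable[bytes],
    deadline_seconds: float,
    notify: Callable[[UpdateResult], None],
    clock: Callable[[], float] = time.monotonic,
) -> UpdateResult:
    """Consume a download stream and tell the curators how it ended.

    `chunks` stands in for the real transfer (hypothetical); `notify`
    stands in for whatever message the curators receive.
    """
    start = clock()
    received = 0
    for chunk in chunks:
        received += len(chunk)
        # Stop as soon as the job has run past its deadline, so that a
        # curator can take over and check the data manually.
        if clock() - start > deadline_seconds:
            result = UpdateResult(name, "overtime", received)
            notify(result)
            return result
    result = UpdateResult(name, "finished", received)
    notify(result)
    return result
```

Passing the clock and the notifier in as arguments keeps the sketch testable without a real network connection or mail server.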

8 Selective Database Update
Genome sequences for virus genomes and pathogen genomes
Other nucleotide sequences of specific species
GEO profiles of specific species
PRIDE experiments of specific species
Structure data of specific species
Literature resources on specific topics

9 Data Update for Subject Resources
SDSPB has built several subject databases and platforms.
The HBV drug resistance platform (in Chinese) includes HBV-related genomes and protein sequences.
The Schistosome database collects protein sequences, ESTs, and other data.
Flu database …

10 How We Get Data Selectively
Subject resources need to integrate the necessary data.
SDSPB defines the data range, sets the collection strategy, and retrieves the data automatically or manually.
The data are imported into basic resources.
Subject resources integrate the different data with local resources.
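One way to picture "defining the data range" is as a small filter applied to candidate records before they are imported. The sketch below is an assumption-laden illustration, not SDSPB's implementation: `CollectionStrategy`, `select_records`, and the record fields `species` and `type` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class CollectionStrategy:
    """Hypothetical description of what a subject resource needs."""
    species: FrozenSet[str]     # species covered by the subject resource
    data_types: FrozenSet[str]  # e.g. "genome", "protein", "EST"
    automatic: bool = True      # False means a curator retrieves it by hand


def select_records(
    records: List[Dict[str, str]], strategy: CollectionStrategy
) -> List[Dict[str, str]]:
    """Keep only the records that fall inside the defined data range."""
    return [
        record
        for record in records
        if record["species"] in strategy.species
        and record["type"] in strategy.data_types
    ]


# Example: a strategy for an HBV-focused subject resource.
hbv_strategy = CollectionStrategy(
    species=frozenset({"Hepatitis B virus"}),
    data_types=frozenset({"genome", "protein"}),
)
candidates = [
    {"id": "rec1", "species": "Hepatitis B virus", "type": "genome"},
    {"id": "rec2", "species": "Hepatitis B virus", "type": "EST"},
    {"id": "rec3", "species": "Schistosoma japonicum", "type": "protein"},
]
selected = select_records(candidates, hbv_strategy)
```

Here only `rec1` matches both the species and the data type, so it is the only record that would be passed on for import.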

11 Data Submission and Download
Genome sequences and related data
NGS data
Proteomics data
Provide a data download service, especially for raw data that users have submitted and allowed to be shared.

12 Current Situation of SDSPB
We wish to accept data over the network.
We currently accept data by express delivery.
We retrieve data from third-party data centers again and again.
We rely on robust scripts to cope with the complicated network conditions.

13 What We Do
Introduce a line from CAS for data download (100M/s) to address the low speed.
Devote more human resources to dealing with the unstable network.
Write more robust scripts to check the network connection status.
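A script that checks the connection status and retries a failed transfer might look roughly like the sketch below. It is a generic pattern (a TCP reachability probe plus retry with exponential backoff), offered as an assumption about what such scripts could do rather than a description of the ones SCBIT wrote; `is_reachable` and `fetch_with_retry` are hypothetical names, and `fetch` is any caller-supplied download function.

```python
import socket
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def is_reachable(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def fetch_with_retry(
    fetch: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            # Network failures (timeouts, resets, refusals) are OSError
            # subclasses; give up only after the final attempt.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

A job could call `is_reachable` before starting a large transfer and wrap the transfer itself in `fetch_with_retry`, so that short outages do not require a curator to step in.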

14 Thanks

