Published by Suzanna Henderson. Modified over 6 years ago.
1
Practices of Science Data Sharing Platform for Bioinformation in SCBIT
Guoqing Zhang Oct 7th 2010
2
Overview of SDSPB
The Shanghai Center for Bioinformation Technology (SCBIT) aims to promote the sharing of bioinformation data in China.
The Scientific Data Sharing Platform for Bioinformation (SDSPB) is SCBIT's operational platform for data sharing.
SDSPB builds and manages databases of biological data that serve the life science research community in China, especially in the Yangtze River Delta.
3
Data Transmission in SDSPB
Retrieve data from third-party data centers, including EBI, NCBI, KEGG, and others.
Accept data submitted by users, including nucleotide sequences, protein sequences, proteomics data, and other data.
Provide a download service.
4
SDSPB has built many databases and online services in Chinese and/or English.
SDSPB classifies these resources as basic resources, subject resources, online services, and a metadata system.
5
Basic Resources and Subject Resources
Collect various data from the basic resources.
Import the related data to meet the needs of the subject resources.
Example subject: hepatitis B virus.
6
Full Database Update
Taxonomy database as support data for all other resources
NT/NR databases for BLAST
UniProt database for protein identification
KEGG pathway database for pathway analysis
Entrez Gene database for gene names
ID mapping database for database integration
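The role of an ID mapping database in integration can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the function name `integrate`, the plain-dictionary layout, and the identifiers in the usage note are hypothetical and are not taken from SDSPB.

```python
from typing import Dict, List, Optional, Tuple


def integrate(
    id_map: Dict[str, str],
    gene_names: Dict[str, str],
    protein_ids: List[str],
) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Attach a gene ID and gene name to each protein ID via an ID map.

    id_map     : protein ID -> gene ID (stands in for the ID mapping database)
    gene_names : gene ID -> gene name (stands in for the gene database)
    Protein IDs without a mapping are kept, with None in the missing fields.
    """
    rows = []
    for protein_id in protein_ids:
        gene_id = id_map.get(protein_id)
        name = gene_names.get(gene_id) if gene_id is not None else None
        rows.append((protein_id, gene_id, name))
    return rows
```

With the hypothetical inputs `{"PROT_1": "GENE_A"}` and `{"GENE_A": "example gene"}`, a call for `["PROT_1", "PROT_2"]` links the first protein to its gene and leaves the second unresolved, so it can be flagged for curation.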
7
Data Update for Basic Resources
Each resource is updated as a full database, weekly or monthly.
Downloads run automatically in the background.
Once a download finishes, a notification is sent to the curators.
If a download times out, or when the curators receive the notification, they check the data manually.
The download speed is usually 2-3 kb/s and never more than 100 kb/s.
About one fifth of the data has to be downloaded manually.
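The update routine described on this slide (automatic download, a message to the curators on completion, manual follow-up on timeout) could be sketched roughly as follows. This is not the SDSPB script: `download_with_retry`, its parameters, and the message texts are hypothetical, and the actual transfer is passed in as a callable so the sketch stays independent of any particular data center.

```python
import time
from typing import Callable, Optional


def download_with_retry(
    fetch: Callable[[], bytes],
    notify: Callable[[str], None],
    max_attempts: int = 3,
    wait_s: float = 0.0,
) -> Optional[bytes]:
    """Call fetch() until it succeeds or max_attempts is reached.

    fetch  : hypothetical callable that performs one download attempt
    notify : hypothetical callable that delivers a message to the curators
    Returns the downloaded bytes, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            data = fetch()
        except OSError:
            # OSError covers timeouts and connection errors.
            if attempt < max_attempts:
                time.sleep(wait_s)  # pause before the next attempt
            continue
        notify(f"finished after {attempt} attempt(s), {len(data)} bytes")
        return data
    notify(f"failed after {max_attempts} attempt(s): manual check needed")
    return None
```

Passing the transfer in as `fetch` keeps the retry logic separate from the protocol in use, and the `notify` callable could wrap whatever channel the curators actually read.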
8
Selective Database Update
Genome sequences for virus genomes and pathogen genomes
Other nucleotide sequences of selected species
GEO profiles of selected species
PRIDE experiments of selected species
Structure data of selected species
Literature resources for special topics
9
Data Update for Subject Resources
SDSPB has built several subject databases and platforms:
HBV drug resistance platform (in Chinese), which includes HBV-related genomes and protein sequences
Schistosome database, which collects protein sequences, ESTs, and other data
Flu database …
10
How We Get Data Selectively
Subject resources need to integrate the necessary data.
SDSPB defines the data range, designs the collection strategy, and retrieves the data automatically or manually.
The data are imported into the basic resources.
The subject resources then integrate these data with local resources.
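Defining a data range and retrieving only what falls inside it might look like the sketch below. The `Record` fields, the `select_records` name, and the idea of expressing the range as a set of species plus a set of data types are assumptions made for illustration; the slides do not say how SDSPB encodes its collection strategy.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Record:
    """Hypothetical minimal description of one entry in a basic resource."""
    accession: str
    species: str
    data_type: str  # e.g. "genome", "protein", "EST"


def select_records(
    records: Iterable[Record],
    wanted_species: Set[str],
    wanted_types: Set[str],
) -> List[Record]:
    """Keep only the records inside the data range of a subject resource."""
    return [
        r for r in records
        if r.species in wanted_species and r.data_type in wanted_types
    ]
```

A subject resource would then call `select_records` with its own range, for example the species `{"hepatitis B virus"}` and the types `{"genome", "protein"}`, and import only the returned entries.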
11
Data Submission and Download
Genome sequences and related data
NGS data
Proteomics data
A data download service is provided, especially for raw data that users have submitted and approved for release.
12
Current Situation of SDSPB
We would like to accept data over the network.
For now, we accept data by express delivery.
We have to retrieve data from third-party data centers again and again.
We need robust scripts to cope with the complicated network conditions.
13
What We Do
Bring in one line from CAS for data download (100M/s) to address the low speed.
Devote more human resources to dealing with the unstable network.
Write more robust scripts to check the network connection status.
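A script that checks the network connection status before a download starts could be as small as the following sketch. It assumes that opening a TCP connection to each data center is an acceptable health check; the function name `check_hosts` and the injectable `connect` parameter are hypothetical, and only `socket.create_connection` from the Python standard library is relied on.

```python
import socket
from typing import Callable, Dict, Iterable, Tuple


def check_hosts(
    hosts: Iterable[Tuple[str, int]],
    timeout_s: float = 5.0,
    connect: Callable = socket.create_connection,
) -> Dict[str, bool]:
    """Report which (host, port) pairs accept a TCP connection in time."""
    status = {}
    for host, port in hosts:
        try:
            conn = connect((host, port), timeout_s)
        except OSError:
            # Refused, unreachable, unresolved, or timed out.
            status[host] = False
        else:
            conn.close()
            status[host] = True
    return status
```

A scheduler could call `check_hosts` with the FTP hosts of the data centers it mirrors and postpone the update, or alert the curators, whenever a host reports `False`.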
14
Thanks