GENBANK, SWISSPROT AND OTHERS As Problem Sources for CSE 549 Andriy Tovkach Genetics.


1 GENBANK, SWISSPROT AND OTHERS As Problem Sources for CSE 549 Andriy Tovkach Genetics

2 GENBANK OVERVIEW
• Part of a three-way data exchange: GenBank (NCBI), EMBL and DDBJ
• Started 10 years ago
• Exponential growth (graph)
• On Saturday the 7th – 20.2 billion bases

3 FILE FORMAT
• Header
• Features
• Sequence (see files)
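The three parts of a flat-file record can be separated with a few lines of Python. This is only a sketch: it assumes the usual layout in which the feature table starts at a line beginning with FEATURES, the sequence starts after a line beginning with ORIGIN, and the record ends with //. The function name and the toy record are invented for illustration; the record is simplified and is not a real database entry.

```python
def split_genbank_record(text):
    """Split one flat-file record into (header, features, sequence)."""
    header, features, sequence = [], [], []
    section = header
    for line in text.splitlines():
        if line.startswith("//"):          # end-of-record marker
            break
        if line.startswith("FEATURES"):    # feature table begins here
            section = features
        elif line.startswith("ORIGIN"):    # sequence lines follow this line
            section = sequence
            continue
        section.append(line)
    # Sequence lines carry position numbers and spaces; keep letters only.
    letters = "".join(ch for line in sequence for ch in line if ch.isalpha())
    return "\n".join(header), "\n".join(features), letters.upper()


# Simplified toy record, invented for illustration only.
toy_record = """\
LOCUS       TOY0001   20 bp   DNA
DEFINITION  Toy record invented for illustration.
FEATURES             Location/Qualifiers
     source          1..20
ORIGIN
        1 atggcgtacg ttagcatgca
//
"""

header, features, sequence = split_genbank_record(toy_record)
print(sequence)
```

Real records have many more header fields and nested feature qualifiers, so an established parser is the safer choice for actual work.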

4 FASTA FORMAT
• Single-line description begins with >
• Followed by sequence data
• Can be either protein or DNA
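The FASTA rules above are simple enough to parse by hand. The following is a minimal sketch, not a full parser: the function name parse_fasta and both toy records are made up for illustration, and it assumes every description line is unique.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each description line (without '>') to its joined sequence."""
    records: Dict[str, str] = {}
    description = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            description = line[1:].strip()
            records[description] = ""
        elif description is not None:
            records[description] += line  # sequence may span several lines
    return records


# Toy records invented for illustration; not real database entries.
example = """>toy_dna hypothetical DNA record
ATGGCGTACG
TTAGC
>toy_protein hypothetical protein record
MKTAYIAK
"""

records = parse_fasta(example.splitlines())
for description, sequence in records.items():
    print(description, len(sequence))
```

Nothing in the format itself says whether a record is DNA or protein, so the caller has to know or guess from the alphabet.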

5 ENTREZ AS RETRIEVAL SYSTEM
• PubMed – 12 million citations from life science journals
• Nucleotide – collection of DNA sequences
• Protein – protein sequences from SwissProt and other sources
• Genome – genomes of over 800 organisms
• Also Structure, PopSet, Taxonomy, OMIM
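Entrez can also be queried from a program through NCBI's E-utilities. The sketch below only builds an efetch URL and does not contact the server. The helper name efetch_fasta_url and the record ID are placeholders, not taken from the slides; the db, id, rettype and retmode parameters are the ones efetch accepts.

```python
from urllib.parse import urlencode

# Base address of the E-utilities efetch service.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(database, record_id):
    """Build a URL asking efetch for one record as plain-text FASTA."""
    query = urlencode({
        "db": database,        # Entrez database, e.g. "nucleotide" or "protein"
        "id": record_id,       # accession or UID of the record
        "rettype": "fasta",
        "retmode": "text",
    })
    return f"{EUTILS_EFETCH}?{query}"


# PLACEHOLDER_ID is a stand-in; substitute a real accession before fetching.
url = efetch_fasta_url("nucleotide", "PLACEHOLDER_ID")
print(url)
```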

6 PROTEIN DATABASES
• SWISS-PROT
• EBI – TrEMBL
• NCBI – GENPEPT (already in history)

7 GENOME DATABASES
• SGD: homepage, example 1.1, example 1.2
• Wormbase
• Ensembl Human Genome Browser

8 CONCLUSIONS
• Sequencing projects produce a lot of data
• At a minimum, these data have to be structured in databases
• Ideally, all sequences need high-quality human annotation
• That’s why computer scientists are welcome in biology

9 LITERATURE
• GenBank presentation by Manpreet Katari (CSE 549, Fall 2000)
• Thomas Lengauer (Ed.), Bioinformatics – From Genomes to Drugs
• Entrez website
• Google

