Published by Osborne Preston. Modified over 9 years ago.
1
Other biological databases
2
Biological systems
- Taxonomic data
- Literature
- Protein folding and 3D structure
- Small molecules
- Pathways and networks
- Protein families and domains
- Whole genome data
- Sequence data
- Ontologies (GO)
3
Other Biological Databases
- Transcription factor binding sites: TRANSFAC
- Protein structure databases: PDB, SCOP, CATH
- Protein family databases: Pfam, PRINTS, PROSITE etc.
- Chemicals and small molecules: ChEBI
- Gene expression databases: GEO, ArrayExpress
- Metabolic pathways: Reactome, KEGG
- Genome databases: Ensembl, FlyBase, WormBase etc.
- Human genetics-related databases: HapMap, dbSNP
4
Transcription factor binding sites
- TRANSFAC - database of eukaryotic transcription factors: http://www.gene-regulation.com/pub/databases.html#transfac
- TESS (Transcription Element Search System) - predicts transcription factor binding sites, uses TRANSFAC: http://www.cbi.upenn.edu/tess
- TFSEARCH - searches for transcription factor binding sites: http://www.cbrc.jp/research/db/TFSEARCH.html
5
Protein structure databases
- Main resource is the Protein Data Bank (PDB): http://www.rcsb.org/pdb/
- Contains the spatial coordinates of the atoms of macromolecules whose 3D structure has been determined by X-ray or NMR studies
- Proteins represent more than 90% of available structures (others are DNA, RNA, sugars, viruses, protein/DNA complexes...)
- Can search by PDB code
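To make "spatial coordinates of macromolecule atoms" concrete, here is a minimal sketch of reading atom coordinates from PDB-format text using the fixed column positions of ATOM/HETATM records. The names `Atom`, `parse_atoms` and `make_atom_line` are hypothetical helpers introduced for illustration, and the two demo atoms are made-up values, not taken from a real PDB entry.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Extract atoms from ATOM/HETATM records using fixed PDB columns."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue  # skip HEADER, REMARK, END and other record types
        atoms.append(Atom(
            name=line[12:16].strip(),      # columns 13-16: atom name
            res_name=line[17:20].strip(),  # columns 18-20: residue name
            chain=line[21],                # column 22: chain identifier
            res_seq=int(line[22:26]),      # columns 23-26: residue number
            x=float(line[30:38]),          # columns 31-38: x in angstroms
            y=float(line[38:46]),          # columns 39-46: y
            z=float(line[46:54]),          # columns 47-54: z
        ))
    return atoms


def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Hypothetical helper: build a column-aligned ATOM line for the demo."""
    return ("ATOM  {:>5d} {:<4s} {:>3s} {:1s}{:>4d}    {:8.3f}{:8.3f}{:8.3f}"
            .format(serial, name, res_name, chain, res_seq, x, y, z))


# Made-up demo records, not a real structure.
demo = "\n".join([
    "HEADER    HYPOTHETICAL DEMO RECORD",
    make_atom_line(1, " N", "MET", "A", 1, 27.340, 24.430, 2.614),
    make_atom_line(2, " CA", "MET", "A", 1, 26.266, 25.413, 2.842),
    "END",
])
atoms = parse_atoms(demo)
for atom in atoms:
    print(atom.name, atom.res_name, atom.chain, atom.res_seq, atom.x, atom.y, atom.z)
```

For real work an established parser (for example the one in Biopython) is likely a better choice than hand-rolled column slicing; the sketch only shows what the records contain.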
6
Searching MSD (Macromolecular Structure Database): http://www.ebi.ac.uk/msd
- Search by PDB code
7
Protein structure-related databases
- Structural family databases based on PDB: SCOP (http://scop.mrc-lmb.cam.ac.uk/scop/) and CATH (http://www.biochem.ucl.ac.uk/bsm/cath/)
- Predicted structures in SWISS-MODEL (http://swissmodel.expasy.org//SWISS-MODEL.html)
8
Protein family databases
- Databases that produce signatures for identifying protein families or domains
- Used for functional classification of proteins
- E.g. Pfam, PROSITE, PRINTS, SMART, TIGRFAMs etc.
- Integrated into a single resource, InterPro (http://www.ebi.ac.uk/interpro)
9
InterProScan sequence search
- Stand-alone version available
10
InterPro text search
- Search by keyword, protein accession or InterPro accession
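The three kinds of search term can be told apart by their shape. The sketch below is one possible way to do that, assuming that "protein acc" means a UniProt accession and that InterPro accessions follow the `IPR` plus six digits pattern; `classify_query` is a hypothetical helper, not part of any InterPro tool.

```python
import re

# InterPro accessions look like IPR000719.
INTERPRO_ACC = re.compile(r"^IPR\d{6}$")

# UniProt accession pattern (assumption: "protein acc" means UniProt).
UNIPROT_ACC = re.compile(
    r"^([OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def classify_query(query: str) -> str:
    """Guess whether a search term is an accession or a free-text keyword."""
    term = query.strip().upper()  # accessions are matched case-insensitively
    if INTERPRO_ACC.match(term):
        return "interpro_accession"
    if UNIPROT_ACC.match(term):
        return "protein_accession"
    return "keyword"


for example in ["IPR000719", "P04637", "kinase"]:
    print(example, "->", classify_query(example))
```

A pattern match only says the term is shaped like an accession; whether such an entry exists still has to be checked against the database.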
11
Results for protein acc
12
Example InterPro entry
13
Chemicals and small molecules
- Chemical Abstracts: http://www.cas.org/
- ChEBI: http://www.ebi.ac.uk/chebi
- KEGG (part of it includes chemicals): http://www.genome.jp/kegg
- ChemIDplus (chemicals cited in NLM databases): http://chem2.sis.nlm.nih.gov/chemidplus/chemidlite.jsp
- MSD-Chem: ligands and chemicals in MSD
14
ChEBI example entry
15
Hierarchy for chemicals
16
Gene expression databases
- NCBI Gene Expression Omnibus (GEO): http://www.ncbi.nlm.nih.gov/geo/
- ArrayExpress: http://www.ebi.ac.uk/arrayexpress
- Stanford Microarray Database: http://genome-www5.stanford.edu/
- Can usually search for experiments or particular expression profiles
17
GEO search page
18
Profiles search results
19
Specific entry and experiment info
20
ArrayExpress search results
21
What does the data look like?
- Info on the experiment, array used, etc.
- Raw or processed tab-delimited file containing spots and their intensities (Cy3/Cy5 ratios) across different samples
- Files with metadata, e.g. sample info, annotation and coordinates of each spot on the array
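As an illustration of working with such a tab-delimited file, the sketch below reads per-spot channel intensities and computes a log2 ratio for each spot. The column names (`spot_id`, `cy3`, `cy5`), the demo values and the function `log2_ratios` are assumptions made for this example; real files from GEO or ArrayExpress use their own headers, and which channel goes in the numerator is a convention that varies between experiments.

```python
import csv
import io
import math


def log2_ratios(tab_text, numerator="cy5", denominator="cy3", id_col="spot_id"):
    """Return {spot id: log2(numerator / denominator)} from tab-delimited text.

    Spots with a non-positive intensity in either channel are skipped,
    because the log ratio is undefined for them.
    """
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    ratios = {}
    for row in reader:
        num = float(row[numerator])
        den = float(row[denominator])
        if num <= 0 or den <= 0:
            continue
        ratios[row[id_col]] = math.log2(num / den)
    return ratios


# Made-up intensities for four spots (hypothetical data).
demo = (
    "spot_id\tcy3\tcy5\n"
    "spot_1\t100\t400\n"
    "spot_2\t250\t250\n"
    "spot_3\t800\t200\n"
    "spot_4\t0\t50\n"
)
ratios = log2_ratios(demo)
for spot, value in ratios.items():
    print(spot, value)
```

Working on the log2 scale makes up- and down-regulation symmetric: a fourfold increase gives +2 and a fourfold decrease gives -2.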
22
Proteomics: SWISS-2DPAGE
23
Enzymes and metabolic pathways
- Contain information describing enzymes, biochemical reactions and metabolic pathways
- ENZYME and BRENDA: nomenclature databases that store information on enzyme names and reactions
- IntEnz: Integrated relational Enzyme database
24
Enzyme nomenclature
- E.C. (Enzyme Commission) numbers are assigned based on the reactions the enzymes catalyze
- Hierarchy, high-level groups:
  - EC 1: Oxidoreductases
  - EC 2: Transferases
  - EC 3: Hydrolases
  - EC 4: Lyases
  - EC 5: Isomerases
  - EC 6: Ligases
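Because EC numbers are hierarchical, the first of the four fields is enough to recover the high-level group. A minimal sketch, assuming only the six groups listed on this slide; `parse_ec` is a hypothetical helper, and a dash in a lower field (as in partially classified enzymes) is read as "unspecified".

```python
# Top-level EC classes as listed on the slide.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def parse_ec(ec: str):
    """Split an EC number into (class name, four-level tuple).

    Accepts forms such as "EC 1.1.1.1" or "3.4.21.-"; a "-" field
    becomes None, meaning that level is not specified.
    """
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated fields: %r" % ec)
    levels = tuple(None if part == "-" else int(part) for part in parts)
    if levels[0] not in EC_CLASSES:
        raise ValueError("unknown top-level EC class in %r" % ec)
    return EC_CLASSES[levels[0]], levels


print(parse_ec("EC 1.1.1.1"))
print(parse_ec("3.4.21.-"))
```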
25
EC example
26
Metabolic pathway databases
- Pathguide: lists more than 200 pathway resources
- KEGG (Kyoto Encyclopedia of Genes and Genomes): http://www.genome.jp/kegg
  - Database of chemicals, genes and networks (metabolic, regulatory etc.)
  - Well curated and quite specific
- EcoCyc (Encyclopedia of E. coli K12 genes and metabolism): http://ecocyc.org
  - Curation of the entire genome
- Reactome (curated biological pathways): http://www.reactome.org/
- GenMAPP: pathways contributed by users
27
KEGG: http://www.genome.ad.jp/kegg
- Pathways differ between species, so they can be compared across species
28
Pathway in Reactome
29
Example of a pathway in BioCyc
30
Protein-protein interaction databases
- Store pairwise interactions or complexes
- Can get 1 to more than 20,000 interactions per publication
- IntAct: http://www.ebi.ac.uk/intact
- DIP (Database of Interacting Proteins): http://dip.doe-mbi.ucla.edu/
- BIND (Biomolecular Interaction Network Database): http://submit.bind.ca:8080/bind/
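Pairwise interactions map naturally onto an undirected graph. The sketch below shows one simple way to hold such data in memory and rank proteins by number of distinct partners; `build_network`, `degree_ranking` and the protein names P1 to P4 are hypothetical and do not reflect the export format of IntAct, DIP or BIND.

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs.

    Duplicate pairs, including the same pair reported in reverse order,
    collapse into a single edge because partners are stored in sets.
    """
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def degree_ranking(network):
    """List (protein, partner count), most connected first, ties by name."""
    return sorted(
        ((protein, len(partners)) for protein, partners in network.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical interactions; ("P2", "P1") repeats ("P1", "P2") in reverse.
pairs = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P1", "P4"), ("P2", "P1")]
network = build_network(pairs)
ranking = degree_ranking(network)
print(ranking)
```

Collapsing duplicates matters in practice, since the same interaction is often reported by several publications.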
31
Protein-protein interactions in IntAct
32
Integrated functional interactions in STRING
33
Genome browsers
- Integrate sequence and functional data for a genome
- Ensembl (genome browser for major eukaryotic genomes, e.g. human, mouse etc.): http://www.ensembl.org
- UCSC browser: http://genome.ucsc.edu/
- FlyBase (Drosophila genome database): http://www.ebi.ac.uk/flybase
- WormBase (C. elegans): http://www.wormbase.org
- PlasmoDB (Plasmodium, malaria): http://plasmodb.org
- Etc.
34
Ensembl genome browser
35
Ensembl gene view 1
36
Ensembl gene view 2
37
Gene within context on chromosome
38
Human genetics databases
- GeneCards: http://www.genecards.org/
- HapMap: http://hapmap.ncbi.nlm.nih.gov/
- OMIM: http://www.ncbi.nlm.nih.gov/omim
- HGDP (Human Genome Diversity Project): http://hagsc.org/hgdp/files.html
39
Mutation/polymorphism databases
- Most of the databases are disease-centric or gene-centric, e.g. p53
40
dbSNP: http://www.ncbi.nlm.nih.gov/SNP/
- Repository of known single nucleotide polymorphisms and other short sequence variations (human and other organisms)
41
Where to find the databases
- Table of addresses for major databases and tools
- Nucleic Acids Research Database issue, published each January
- Nucleic Acids Research Software issue (new)
- Expasy list of tools: http://ca.expasy.org/links.html
42
Large scale data retrieval
- Programmatic access to many databases
- MySQL access to some
- BioMart access: public and private
- FTP sites: large data downloads
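As one illustration of programmatic access, the sketch below builds a request to the Ensembl REST service and decodes the JSON reply. This is an assumption layered on top of the slide, which names MySQL, BioMart and FTP but not REST; `lookup_url` and `fetch_gene` are hypothetical helper names, and the gene symbol TP53 is used only because p53 appears as an example earlier in the deck.

```python
import json
import urllib.parse
import urllib.request

# Public Ensembl REST server (not mentioned on the slides; an assumption).
ENSEMBL_REST = "https://rest.ensembl.org"


def lookup_url(species: str, symbol: str) -> str:
    """Build the URL for looking up a gene by symbol in a given species."""
    path = "/lookup/symbol/{}/{}".format(
        urllib.parse.quote(species), urllib.parse.quote(symbol)
    )
    return ENSEMBL_REST + path + "?content-type=application/json"


def fetch_gene(species: str, symbol: str, timeout: float = 30.0) -> dict:
    """Fetch the gene record as a dict. Requires network access."""
    with urllib.request.urlopen(lookup_url(species, symbol), timeout=timeout) as response:
        return json.load(response)


# Only the URL is built here; call fetch_gene(...) to contact the server.
print(lookup_url("homo_sapiens", "TP53"))
```

For bulk downloads of whole datasets, the FTP sites or BioMart mentioned above are usually a better fit than issuing one request per gene.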
43
Other tutorials
- http://www.ensembl.org/info/website/tutorials/index.html
- http://www.ebi.ac.uk/training/online/
- http://www.ebi.ac.uk/2can/home.html