Published by Merryl Bradley. Modified over 9 years ago.
1
An Introduction to Bioinformatics Molecular Biology Databases
2
Aims
- To introduce the major databases (nucleotide and protein)
- To explain how to search the appropriate databases
- To explain how to retrieve information from databases
Objectives
- Choose appropriate databases for information retrieval
- Use Boolean operators to search databases
- Retrieve nucleotide and protein sequence files
3
Introduction
- Hundreds!
- Databases of databases!
- Acronym rich!
- Subcomponents: organisms, structure, metabolism…
- Searched by text or by sequence
4
Historically
- 1960s: Margaret Dayhoff, protein sequences (Eck, R. V., and M. O. Dayhoff. 1966. Atlas of Protein Sequence and Structure 1966. National Biomedical Research Foundation, Silver Spring, Maryland.)
- 1980s: explosion in DNA sequences
  - EMBL (European Molecular Biology Laboratory)
  - GenBank at the NIH (National Institutes of Health)
  - DDBJ (DNA Data Bank of Japan)
- 1988: agreed on international collaboration
6
Primary Databases
Experimentally determined nucleotide sequence, inferred protein sequence
- Nucleotides: EMBL, GenBank, DDBJ
- Proteins: GenPept, PIR (Protein Information Resource), SWISS-PROT
Which to choose?
7
Composite Databases
- SWISS-PROT + PIR + GenPept +
- SWISS-PROT, Swissnew, TrEMBL, TrEMBLnew, GenBank, PIR, Wormpep and PDB
8
Secondary Databases
Analytical results of primary databases
Searching for related patterns
- PROSITE
- Pfam
More on these later
9
Sub-Databases
- EST: Expressed Sequence Tags
- STS: Sequence Tagged Sites
- SNP: Single Nucleotide Polymorphisms
- OMIM: Online Mendelian Inheritance in Man
10
Searching and Retrieval
- Entrez: National Center for Biotechnology Information
- SRS: European Bioinformatics Institute
- DBGET: Japan’s GenomeNet
Each is capable of retrieving a specific nucleotide or protein sequence and provides links to additional related information.
11
Entrez
12
Entrez Tutorial
- Search for penicillin-binding genes
- Search for Mycobacterium tuberculosis
- Combine the searches
- Scan the output
Q: Are there any genes that code for penicillin binding in the Mycobacterium genome?
An example of a text-based search to identify genes that have already been annotated.
17
#1 AND #2
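The combined search on this slide can also be expressed as a single query string. The following is a minimal sketch, assuming the NCBI E-utilities `esearch` endpoint is used instead of the Entrez web form. The helper names `combine_and` and `build_search_url` and the choice of the `nuccore` database are illustrative assumptions, not part of the original tutorial. The code only builds the URL and sends no request.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def combine_and(*terms):
    """Join search terms with AND, bracketing each one (like '#1 AND #2')."""
    return " AND ".join(f"({term})" for term in terms)


def build_search_url(db, term):
    """Return an esearch URL for a database and query string (hypothetical helper)."""
    return ESEARCH_URL + "?" + urlencode({"db": db, "term": term})


# Search 1 and search 2 from the tutorial, combined into one query.
query = combine_and("penicillin-binding", "Mycobacterium tuberculosis")
# "nuccore" (nucleotide records) is an assumed choice of database.
url = build_search_url("nuccore", query)

print(query)
print(url)
```

Pasting the printed URL into a browser should return an XML list of matching record identifiers, which plays the role of the "scan the output" step.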
21
SRS guide
22
Searching the Databases
- Subject
- Accession numbers, e.g. AF208262
- Author
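Retrieval by accession number can be sketched in the same way. This example assumes the NCBI E-utilities `efetch` endpoint and assumes that AF208262 is a nucleotide accession stored in the `nuccore` database. The helper name `build_fetch_url` is hypothetical. As before, the code only constructs the URL.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities record retrieval service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_fetch_url(accession, db="nuccore", rettype="fasta"):
    """Return an efetch URL asking for one record as plain text (hypothetical helper)."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


# The accession number used as the example on the slide.
fetch_url = build_fetch_url("AF208262")
print(fetch_url)
```

Changing `rettype` to `"gb"` would request the full GenBank flat file rather than the FASTA sequence.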
23
Boolean Operators
- AND will locate all records containing both words, e.g. human AND protease
- OR will locate all records containing either word, not necessarily both, e.g. human OR protease
- NOT will locate records containing one word but not the other, e.g. human NOT protease
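The three operators behave like set intersection, union and difference. The sketch below illustrates this on a made-up set of four record descriptions. The records and the `matching` helper are invented for illustration and do not come from any real database.

```python
# Toy "database": record ID -> description (invented for illustration).
records = {
    1: "human protease inhibitor",
    2: "mouse protease",
    3: "human kinase",
    4: "yeast transporter",
}


def matching(word):
    """Return the IDs of records whose description contains the word."""
    return {rid for rid, text in records.items() if word in text.lower().split()}


human = matching("human")
protease = matching("protease")

both = human & protease        # human AND protease
either = human | protease      # human OR protease
human_only = human - protease  # human NOT protease

print(sorted(both), sorted(either), sorted(human_only))
# prints: [1] [1, 2, 3] [3]
```

AND narrows a search, OR widens it, and NOT excludes records. This is why combining two broad searches with AND, as in the Entrez tutorial, gives a short and specific result list.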