Published by Richard Thompson. Modified over 6 years ago.
1
Access to Sequence Data and Related Information
From Bioinformatics and Functional Genomics, Third Edition, Jonathan Pevsner, 2015
2
Learning objectives
define the types of molecular databases;
define accession numbers and the significance of RefSeq identifiers;
describe the main genome browsers and use them to study features of a genomic region; and
use resources to study information about both individual genes (or proteins) and large sets of genes/proteins.
3
Introduction to Biological Databases
In 1995 the complete genome of a free-living organism, the bacterium Haemophilus influenzae, was sequenced for the first time.
DNA sequence data have been collected from over 300,000 different species of organisms.
1970s: dideoxynucleotide sequencing ("Sanger sequencing")
Since 2005: next-generation sequencing (NGS) technology
4
Centralized Databases Store DNA Sequences
5
NCBI
6
EBI
7
DDBJ
8
Growth of DNA sequence in repositories
9
Scales of DNA base pairs
10
Contents of DNA, RNA, and Protein Databases
11
GenBank data file divisions
12
Types of Data in GenBank/EMBL-Bank/DDBJ
13
Genomic DNA Databases
DNA-Level Data: Sequence-Tagged Sites (STSs)
DNA-Level Data: Genome Survey Sequences (GSSs)
DNA-Level Data: High-Throughput Genomic Sequence (HTGS)
14
Sequence-tagged site
A sequence-tagged site (STS) is a short (200 to 500 base pair) DNA sequence that occurs only once in the genome and whose location and base sequence are known. STSs can be easily detected by the polymerase chain reaction (PCR) using specific primers. For this reason they are useful for constructing genetic and physical maps from sequence data reported by many different laboratories, and they serve as landmarks on the developing physical map of a genome. When STS loci contain genetic polymorphisms (e.g. simple sequence length polymorphisms, SSLPs, or single nucleotide polymorphisms, SNPs), they become valuable genetic markers, i.e. loci that can be used to distinguish individuals. They are also used in shotgun sequencing, specifically to aid sequence assembly. STSs are very helpful for detecting microdeletions in some genes. For example, some STSs can be used in PCR screening to detect microdeletions in azoospermia factor (AZF) genes in infertile men.
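The defining properties of an STS (a single occurrence in the genome, a 200 to 500 bp product, detection by a specific primer pair) can be illustrated with a toy in-silico PCR check. This is only a sketch under simplifying assumptions: exact primer matches, the forward strand only, and a genome held as one plain string. The function names `sts_amplicons` and `is_sts`, the primers, and the toy genome are hypothetical and are not taken from the slides.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_all(text, pattern):
    """Return the start index of every (possibly overlapping) exact match."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def sts_amplicons(genome, forward_primer, reverse_primer, min_len=200, max_len=500):
    """List (start, end) spans that the primer pair would amplify.

    The reverse primer anneals to the opposite strand, so on the forward
    strand we search for its reverse complement.
    """
    genome = genome.upper()
    fwd_sites = find_all(genome, forward_primer.upper())
    rev_sites = find_all(genome, reverse_complement(reverse_primer))
    amplicons = []
    for f in fwd_sites:
        for r in rev_sites:
            end = r + len(reverse_primer)
            if r >= f and min_len <= end - f <= max_len:
                amplicons.append((f, end))
    return amplicons


def is_sts(genome, forward_primer, reverse_primer):
    """True when the primer pair yields exactly one product of STS size."""
    return len(sts_amplicons(genome, forward_primer, reverse_primer)) == 1


# Toy data (hypothetical): one 270 bp target flanked by filler sequence.
fwd = "GATTACAGGC"
rev = reverse_complement("CCGTAGCTAA")
toy_genome = "A" * 50 + fwd + "T" * 250 + "CCGTAGCTAA" + "A" * 50
print(is_sts(toy_genome, fwd, rev))
```

Real primer-design tools also allow mismatches and search both strands; the exact-match search here is only meant to make the "single occurrence" criterion concrete.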
15
RNA data
RNA-Level Data: cDNA Databases Corresponding to Expressed Genes
RNA-Level Data: Expressed Sequence Tags (ESTs)
RNA-Level Data: UniGene
16
Protein Databases
UniProt
17
Central Bioinformatics Resources: NCBI and EBI
18
Accession Numbers to Label and Identify Sequences
19
The Reference Sequence (RefSeq) Project
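RefSeq identifiers carry their molecule type in a two-letter prefix followed by an underscore, optionally with a version suffix after a dot. As a rough illustration, the sketch below decodes some of the common prefixes. The helper `describe_refseq` and the table `REFSEQ_PREFIXES` are hypothetical names, the table is deliberately incomplete, and the version number in the example is only illustrative.

```python
import re

# A subset of RefSeq prefixes; the full list is in the NCBI RefSeq documentation.
REFSEQ_PREFIXES = {
    "NC": "complete genomic molecule",
    "NG": "genomic region",
    "NM": "mRNA",
    "NR": "non-coding RNA",
    "NP": "protein",
    "XM": "predicted mRNA (model)",
    "XR": "predicted non-coding RNA (model)",
    "XP": "predicted protein (model)",
}

# Two capital letters, an underscore, digits, and an optional ".version".
ACCESSION_RE = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")


def describe_refseq(accession):
    """Return a dict describing a RefSeq-style accession, or None if the
    string does not look like one of the prefixes listed above."""
    match = ACCESSION_RE.match(accession.strip())
    if match is None:
        return None
    prefix, _number, version = match.groups()
    if prefix not in REFSEQ_PREFIXES:
        return None
    return {
        "prefix": prefix + "_",
        "molecule": REFSEQ_PREFIXES[prefix],
        "version": int(version) if version else None,
    }


print(describe_refseq("NM_000518.5"))  # human beta globin mRNA record
print(describe_refseq("U12345"))       # not in RefSeq format
```

An accession without an underscore, such as a plain GenBank nucleotide accession, returns `None` here, which mirrors the way the underscore distinguishes RefSeq identifiers from other accessions.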
20
Access to Information via Gene Resource at NCBI
21
Flat-file format and FASTA
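In FASTA format each record is a header line beginning with ">" followed by one or more lines of sequence. A minimal reader, assuming well-formed input already held in a string, might look like the sketch below. The function name `parse_fasta` and the sample records are hypothetical; for real work an established parser such as the one in Biopython would be the safer choice.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples.

    The header is returned without its leading ">"; sequence lines that
    belong to one record are joined into a single string.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Toy input (not real database records).
sample = ">seq1 toy example\nACGT\nTTGA\n>seq2\nGGCC\n"
for name, sequence in parse_fasta(sample):
    print(name, len(sequence))
```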
22
Command-Line Access to Data at NCBI
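The slide does not say which tool it has in mind. One way to reach NCBI records from a script is the E-utilities web service, whose `efetch.fcgi` endpoint returns a record for a given database and identifier. The sketch below only builds the request URL and does not contact NCBI. The helper name `efetch_url` is hypothetical, and the accession is an example.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(db, accession, rettype="fasta", retmode="text"):
    """Build an E-utilities EFetch URL for one record.

    db        -- Entrez database name, e.g. "nuccore" or "protein"
    accession -- accession number or other identifier accepted by EFetch
    rettype   -- record format, e.g. "fasta", or "gb" for the GenBank flat file
    """
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": retmode}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


print(efetch_url("nuccore", "NM_000518"))
print(efetch_url("nuccore", "NM_000518", rettype="gb"))
```

The resulting URL can be passed to any HTTP client. NCBI publishes usage guidelines for E-utilities, including request-rate limits, which should be checked before sending many requests.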
23
Access to Information: Genome Browsers
The University of California, Santa Cruz (UCSC) Genome Browser
The Ensembl Genome Browser
The Map Viewer at NCBI