Published by Talia Giddens. Modified over 9 years ago.
3
For bioinformatics, start with genomics:
READING genome sequences: carry out dideoxy sequencing
ASSEMBLY of the sequence: connect sequences to make whole chromosomes
ANNOTATION of the sequence: find the genes!
4
The Human Genome E. coli Genome
5
Reading: shotgun DNA sequencing of the whole genome (WGS)
Steps shown in the figure: DNA target sample; SHEAR; LIGATE & CLONE into a vector; SEQUENCE from a primer to produce reads.
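The shearing and reading steps can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the function name `shotgun_reads`, its parameters, and the short demo genome are hypothetical, fragments are drawn uniformly at random, and each read is simply the first `read_len` bases of a fragment (no cloning bias, no sequencing errors).

```python
import random

def shotgun_reads(genome, n_fragments, frag_len, read_len, seed=0):
    """Toy shotgun model: pick random fragments, read each from one end."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    reads = []
    for _ in range(n_fragments):
        # "SHEAR": choose a random fragment of the target DNA
        start = rng.randrange(0, len(genome) - frag_len + 1)
        fragment = genome[start:start + frag_len]
        # "SEQUENCE": the read covers only the first read_len bases
        reads.append(fragment[:read_len])
    return reads

# Hypothetical demo sequence, not real data
genome = "ATGGCGTACGTTAGCCGATAGGCTTACGATCGATCGGATCCA"
reads = shotgun_reads(genome, n_fragments=5, frag_len=20, read_len=12)
for r in reads:
    print(r)
```

Each read is a short substring of the genome with no record of where it came from, which is why an assembly step is needed afterwards.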
6
Reading to Assembly:
7
Assembly: the challenge of eukaryotic genomes
E. coli genome: 4 million bp
The human genome: 3 billion bp; 50% of the genome is repeat sequences!
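Why repeats make assembly hard can be illustrated with a small overlap check. This is a minimal sketch, not an assembler: the helper names `overlap` and `successors` and the toy reads are assumptions made for illustration, and "CACACA" merely stands in for a repeat sequence.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest match is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0

def successors(read, reads, min_len=3):
    """All other reads that could follow `read` according to overlap alone."""
    return [r for r in reads if r != read and overlap(read, r, min_len) > 0]

# Toy reads from unique sequence: each read has one possible successor
unique_reads = ["ATGGCT", "GCTTAG", "TAGCCA"]
print(successors("ATGGCT", unique_reads))

# Toy reads ending in a repeat: two different reads overlap equally well
repeat_reads = ["ATGCACACA", "CACACAGGT", "CACACATTC"]
print(successors("ATGCACACA", repeat_reads))
```

With unique sequence the next read is unambiguous; once a read ends inside a repeat, overlap alone cannot say which copy of the repeat it belongs to.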
8
Assembly of the sequence of each chromosome from end to end
END, Jan 14 begin
9
Genomics:
READING genome sequences: robotically do dideoxy-dye data collection
ASSEMBLY of the sequence: whole genome shotgun OR ordered clones
ANNOTATION of the sequence: find the genes!
11
Genomics:
READING genome sequences
ASSEMBLY of the sequence
ANNOTATION of the sequence: find the genes!
Annotation:
1. ab initio
2. by evidence
10/1/5
12
Annotation: for bacterial genomes, ab initio is adequate
ORFs make up MOST of a prokaryotic genome.
ab initio: “from the beginning” (Hebrew: יש מאין, “something from nothing”), from first principles…
13
Annotation: ab initio – finding ORFs
- 85-88% of the nucleotides are associated with coding sequence in the bacterial genomes that have been completely sequenced.
- Example: in Escherichia coli there are 4288 genes that have an average of 950 bp of coding sequence and are separated by an average of just 118 bp.
So first, to find genes in prokaryotic DNA, search for ORFs!
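Searching for ORFs can be sketched as a scan of all six reading frames for an ATG followed in frame by a stop codon. This is a minimal illustration, not the method of any particular gene finder: the function names, the `min_codons` parameter, and the demo sequence are assumptions, and it uses only the standard stop codons TAA, TAG, and TGA.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_orfs(seq, min_codons=60):
    """Return (strand, start, end, orf) for ATG...stop ORFs in six frames.

    start/end are 0-based offsets on the strand that was scanned;
    min_codons counts codons before the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens a candidate ORF
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ATG in this frame
    return orfs

# Hypothetical demo sequence: one short ORF on the forward strand
demo = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
print(find_orfs(demo, min_codons=3))
```

On a real bacterial genome a length cutoff matters, because short ATG...stop stretches occur by chance in non-coding sequence.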
18
Annotation: ab initio – beyond ORFs
- Prokaryotes have short, simple promoters that are easy to recognize.
- Transcriptional terminators often consist of short inverted repeats followed by a run of Ts.
- Therefore, programs that find prokaryotic genes search for:
  ORFs 60 or more codons long, and codon usage
  Promoters at the 5' end
  Terminators at the 3' end
  Homology to known genes from other prokaryotes
  Shine-Dalgarno sequences
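The terminator signal described above (a short inverted repeat followed by a run of Ts) can be searched for directly. The sketch below is a simplified pattern match under stated assumptions: the function name `find_terminators`, the exact-match stem of fixed length, the loop-size range, and the demo sequence are all hypothetical choices, and real terminator predictors score stem stability rather than demanding a perfect repeat.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def find_terminators(seq, stem=5, min_loop=3, max_loop=8, min_ts=4):
    """Return (stem_start, t_run_start) for each inverted repeat + T run.

    Pattern: left arm, loop of min_loop..max_loop bases, right arm that is
    the reverse complement of the left arm, then at least min_ts Ts.
    """
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop - min_ts + 1):
        left = seq[i:i + stem]
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop                 # start of the right arm
            right = seq[j:j + stem]
            tail = seq[j + stem:j + stem + min_ts]
            if (len(tail) == min_ts
                    and right == reverse_complement(left)
                    and tail == "T" * min_ts):
                hits.append((i, j + stem))
    return hits

# Hypothetical demo: GCCGC ... GCGGC can pair, followed by TTTTTT
demo = "AAGC" + "GCCGC" + "TTCG" + "GCGGC" + "TTTTTT" + "GA"
print(find_terminators(demo))
```

The same scanning idea applies to the other signals in the list, such as a Shine-Dalgarno motif a few bases upstream of a start codon.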
19
Annotation: ab initio – automated
Prokaryotic gene finder examples:
Glimmer: Interpolated Markov Model method
GrailII: Neural Network method
(See BioInfo text, Fig 8.8)
20
Annotation: results
22
Multicellular eukaryotes Done too 10/1/5
23
Multicellular eukaryotes Annotation: Done too 10/1/5
25
Annotation: 2 ways to annotate eukaryotic genomes
-ab initio gene finders: work on basic biological principles:
  Open reading frames
  Codon usage
  Consensus splice sites
  Met start codons
  …..
-Genes based on previous knowledge: EVIDENCE
  -cDNA sequence of the gene’s message
  -cDNA of a closely related gene’s message sequence
  -Protein sequence of the known gene: the same gene’s, the same gene’s from another species, a related gene’s protein…….
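The "consensus splice sites" principle can be made concrete with the canonical rule that most introns begin with GT and end with AG. The sketch below only lists candidate sites; the function names, the `min_len` cutoff, and the demo sequence are assumptions for illustration, and real ab initio gene finders use longer consensus patterns with statistical scoring.

```python
def splice_site_candidates(seq):
    """Positions of every GT (possible donor) and AG (possible acceptor)."""
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return donors, acceptors

def candidate_introns(seq, min_len=6):
    """All (start, end) spans that begin with GT and end with AG."""
    donors, acceptors = splice_site_candidates(seq)
    return [(d, a + 2) for d in donors for a in acceptors
            if a + 2 - d >= min_len]

# Hypothetical demo sequence with one intended intron, GTAAGTTTTCAG
demo = "ATGGCCGTAAGTTTTCAGGCCTAA"
for start, end in candidate_introns(demo):
    print(start, end, demo[start:end])
```

Even this short sequence yields more than one candidate, which is one reason evidence such as cDNA sequence is so useful for choosing the real intron boundaries.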
26
Tracks shown in the annotation display:
Homology based exon predictions
Consensus gene structure (both strands)
Start and stop site predictions
Splice site predictions
Computational exon predictions
Tracking information
Unique identifiers
27
Automatically generated annotation
28
A zebrafish hit shows a gene model protein encoded by a 6 exon gene. This gene structure (intron/exon) is seen in other species, as is the protein size. The proteins, if corresponding to MSP in S. gal., must be heavily glycosylated (likely). At least some have a signal peptide.
29
The zebrafish hit can be viewed at higher resolution, and…
30
The zebrafish hit can be viewed down to nucleotide resolution
31
Genomics:
READING genome sequences: carry out dideoxy sequencing, 700 bp each read, MAX
ASSEMBLY of the sequence: connect sequences to make whole chromosomes
ANNOTATION of the sequence
32
Genomics:
READING genome sequences: carry out dideoxy sequencing
ASSEMBLY of the sequence: connect sequences to make whole chromosomes
ANNOTATION of the sequence: find the genes!
34
Annotation: cDNAs & ESTs (Expressed Sequence Tags)
Steps shown in the figure: RNA target sample; cDNA library; SEQUENCE from a primer; end reads (mates).
Each cDNA provides sequence from the two ends: two ESTs.
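The two ESTs from one cDNA can be modeled as one read from each end of the clone. This is a minimal sketch under stated assumptions: the function name `est_pair`, the demo cDNA, and the read length are hypothetical, and the 3' read is modeled as coming off the opposite strand, so it is the reverse complement of the last bases of the cDNA.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def est_pair(cdna, read_len):
    """Return the (5' EST, 3' EST) mate pair for one cDNA clone."""
    five_prime = cdna[:read_len]
    # Assumption: the read from the other end is on the opposite strand
    three_prime = reverse_complement(cdna)[:read_len]
    return five_prime, three_prime

# Hypothetical demo cDNA, not real data
cdna = "ATGGCCAAGTTTGGGCCCTAA"
print(est_pair(cdna, read_len=6))
```

Aligning both mates back to the genome marks the two ends of a transcribed region, which is the kind of evidence an annotator can use.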
37
Who Gets Sequenced? Models, Pathogens, Agriculturals
43
Array analysis: see animation from Griffiths
47
Protein Structure Database See Swiss-pdb viewer
51
RNA for ALL C. elegans genes
54
Projects to systematically knock out (or pseudo-knockout) every gene, in order to establish the phenotype of each gene -> function of each gene
RNAi for every C. elegans gene too!
-results on the web