Published by Stella Atkins. Modified over 8 years ago.
1
Biological Sequence Analysis 140.638.01
2
The materials used in this class are made possible by: Zhiping Weng (http://zlab.bu.edu), Wenyi Wang, Zhijin Wu, Garland Publishing (Alberts's The Cell), and the wealth of internet resources
3
Who are we? Sining Chen Carlo Colantuoni Giovanni Parmigiani
4
Who are you? Field of research. Stats & computing background. Register or audit? Why are you taking this course? Specific topics you are interested in.
5
Administrative Details: http://astor.som.jhmi.edu/~sining/BSA/syllabus.htm
6
The MHS program in Bioinformatics. Jointly offered by the Dept. of Biostatistics and the Dept. of Molecular Microbiology and Immunology. An intensive one-year program that emphasizes biology, statistical methods, and computing.
9
Goal of the class: Learn to look at biological sequences from a probabilistic point of view. Understand the algorithms behind routine operations, e.g. BLAST. Be able to build statistical models to solve problems involving sequences.
10
Carlo Colantuoni Clinical Brain Disorders Branch, NIMH, NIH Dept. Biostatistics, JHSPH ccolantu@jhsph.edu colantuc@intra.nimh.nih.gov Biological Sequence Analysis: Basic Biological Concepts
11
Molecular Cell Biology: Central Dogma. DNA (Replication) -> Transcription -> RNA -> Translation -> Protein. Sequence analysis is important at all 3 levels.
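The first two steps of the central dogma can be sketched as string operations. This is a minimal illustration, not course material: the function names `reverse_complement` and `transcribe` and the toy sequence are assumptions, and transcription is modeled simply as reading the coding strand with T replaced by U.

```python
# Watson-Crick base pairing, used when a DNA strand is copied (replication).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(dna: str) -> str:
    """Return the opposite DNA strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))

def transcribe(coding_strand: str) -> str:
    """Return the mRNA that matches the coding (sense) strand: T becomes U."""
    return coding_strand.upper().replace("T", "U")

dna = "ATGGCC"  # made-up example sequence
print(reverse_complement(dna))  # GGCCAT
print(transcribe(dna))          # AUGGCC
```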
12
The Human Genome. [Figure: DAD, MOM, YOU.] 2 copies in every cell (46 chr). One copy from each parent. Each parent passes on a “mixed copy”. Genomic content: 3.3 billion bases, ~30K genes, 23 chromosomes (22+X/Y), millions of variants.
13
Nucleotides are the chemical building blocks of Nucleic Acids: DNA and RNA
17
From Genomic DNA to mRNA Transcripts. [Figure labels: Promoters, Exons, Introns, Poly-Adenylation, Alternative splicing, ~30K, >30K.] Protein-coding genes are not easy to find: gene density is low, and exons are interrupted by introns.
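Splicing can be pictured as joining the exon segments of a gene and discarding the introns between them. The sketch below assumes exon coordinates are already known (which, as the slide notes, is the hard part in practice); the `splice` helper, the toy gene, and its coordinates are all made up for illustration.

```python
def splice(genomic: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments (0-based, end-exclusive) in order, dropping introns."""
    return "".join(genomic[start:end] for start, end in exons)

# Toy gene: exons in upper case, introns in lower case (hypothetical sequence).
gene = "ATGGCCgtaagtttcagGATTACgtatgacagTGA"
all_exons = [(0, 6), (17, 23), (32, 35)]

# Using every exon gives one transcript ...
print(splice(gene, all_exons))                   # ATGGCCGATTACTGA
# ... and skipping the middle exon (alternative splicing) gives another.
print(splice(gene, [all_exons[0], all_exons[2]]))  # ATGGCCTGA
```

Choosing different exon subsets from the same gene is one way a single gene can yield more than one transcript.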
18
Molecular Cell Biology: Components of the Central Dogma. Genomic DNA (3.3 Gb) -> Transcription -> mRNA -> Translation -> Protein. [Figure: mRNA layout: 5’ UTR, START, protein coding, STOP, 3’ UTR, AAAAA (poly-A tail).]
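The mRNA layout on this slide (5’ UTR, START, coding region, STOP, 3’ UTR) can be recovered from a sequence with a simple scan. This is a simplified sketch: it assumes the first AUG is the START codon and takes the first in-frame stop codon as STOP, which real gene finders do not rely on. The function name `split_mrna` and the example sequence are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_mrna(mrna: str):
    """Split an mRNA into (5' UTR, coding region, 3' UTR).

    Simplification: the first AUG is taken as START and the first
    in-frame stop codon as STOP. Returns None if either is missing.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Walk codon by codon from START until a stop codon is reached.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[:start], mrna[start:i + 3], mrna[i + 3:]
    return None

# Made-up transcript: 5' UTR + coding region + 3' UTR with a poly-A tail.
mrna = "GGCACC" + "AUGGCCGAUUACUGA" + "CCUUAAAAAA"
print(split_mrna(mrna))  # ('GGCACC', 'AUGGCCGAUUACUGA', 'CCUUAAAAAA')
```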
20
Replication, Transcription, Translation. DNA: A T G C. (1:1) RNA: A U G C. (3:1) Protein: 20 amino acids. Translation - Protein Synthesis: every 3 nucleotides (one codon) are translated into one amino acid.
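The 3:1 mapping from nucleotides to amino acids can be written as a table lookup. The sketch below builds the standard genetic code from its conventional compact listing (bases ordered U, C, A, G; `*` marks a stop codon) and reads an mRNA one codon at a time. The names `CODON_TABLE` and `translate` and the example sequence are assumptions for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; '*' is a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(coding_rna: str) -> str:
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(coding_rna) - 2, 3):
        amino_acid = CODON_TABLE[coding_rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGGCCGAUUACUGA"))  # MADY
```

Because the RNA is read 5’ to 3’, the first amino acid in the returned string is the N-terminus of the protein and the last is the C-terminus.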
21
Translation - Protein Synthesis 5’ -> 3’ : N-term -> C-term RNA Protein
22
Nucleotide sequence determines the amino acid sequence
24
The Human Genome. [Figure: DAD, MOM, YOU.] 2 copies in every cell. One copy from each parent. Each parent passes on a “mixed copy”. Genomic content: 3.3 billion bases, ~30K genes, 23 chromosomes (22+X/Y). Evolutionary scale: deletions, insertions, mutations.
25
Biological Sequence Analysis: Primary Concepts. Homolog. Paralog. Ortholog. Identity & Similarity.
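Identity and similarity are usually reported as percentages over an alignment. The sketch below counts both for two already-aligned protein sequences. Several choices here are assumptions rather than anything stated on the slide: the denominator is the full alignment length, gap columns count as neither identical nor similar, and "similar" is decided by a small hand-made grouping of amino acids (real tools typically score similarity with a substitution matrix such as BLOSUM62).

```python
# Hypothetical grouping of chemically similar amino acids (illustration only).
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ"]

def identity_and_similarity(a: str, b: str) -> tuple[float, float]:
    """Return (% identity, % similarity) for two aligned sequences ('-' = gap)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    identical = similar = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # gap columns count as neither identical nor similar
        if x == y:
            identical += 1
            similar += 1  # identical residues are also counted as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    return 100 * identical / len(a), 100 * similar / len(a)

# Made-up alignment: 5 identical columns, 2 similar (K/R, V/I), 1 gap.
print(identity_and_similarity("MKVLAD-E", "MRILADGE"))  # (62.5, 87.5)
```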