Published by Augusta Gallagher. Modified over 8 years ago.
1
STAT115 / STAT215 / BIO512 / BIST298
Introduction to Computational Biology and Bioinformatics
Spring 2016
Xiaole Shirley Liu
2
The Protein Sequence and Structure Wave
1955: Sanger sequenced bovine insulin
1971: PDB
1981: Smith-Waterman algorithm
1990: BLAST
1994: BLOCKS database
1994-: CASP
1997-: Proteomics
3
The Microarray Wave
A microarray contains hundreds to millions of tiny probes
Simultaneously detects how much each gene is expressed
4
ALL vs AML
Golub et al., Science 1999.
5
ALL vs AML
6
“Microarrays” Today
Infer the expression value of all the genes from 1000 probes
High-throughput drug screens
7
The DNA Sequencing Wave
1953: DNA structure
1972: Recombinant DNA
1977: Sanger sequencing
1985: PCR
1988: NCBI
1990: BLAST
8
Sequencing in the 1970s
9
The Human Genome Race
Human Genome Project: 1990-2003
–Originally planned for 1990-2005
–Boosted by technology improvement and automation
–Competition from Celera
10
Human Genome Sequencing
Clone-by-clone and whole-genome shotgun
11
The Human Genome Race
Human Genome Project: 1990-2003
–Originally planned for 1990-2005
–Boosted by technology improvement and automation
–Competition from Celera
Informatics was essential for both the public and private sequencing efforts
–Sequence assembly and gene prediction
–Working drafts finished simultaneously in spring 2000
12
Sequencing in 2001
13
Sequencing in 2007
14
Sequencing Today
Personal genome sequencing
HiSeq X
–900 Gb of data per flow cell in < 3 days: 10 human genomes at 30X coverage, at ~$1500 per sample
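The HiSeq X figures can be sanity-checked with simple coverage arithmetic. The sketch below is a back-of-the-envelope illustration only; it assumes the 900 figure means gigabases of sequence and that a haploid human genome is roughly 3.1 gigabases (neither number is stated on the slide), and the function names are hypothetical.

```python
# Rough check of the throughput figures quoted for one HiSeq X flow cell.
# Assumptions (not from the slide): yield is in gigabases (Gb), and the
# haploid human genome is about 3.1 Gb.

def genomes_per_flow_cell(yield_gb, coverage, genome_size_gb=3.1):
    """How many genomes fit in one flow cell at the given mean coverage."""
    return yield_gb / (coverage * genome_size_gb)


def mean_coverage(yield_gb, n_genomes, genome_size_gb=3.1):
    """Mean coverage per genome when the yield is split across n_genomes."""
    return yield_gb / (n_genomes * genome_size_gb)


if __name__ == "__main__":
    # About 9.7 genomes at 30X, i.e. roughly the "10 * 30X" on the slide.
    print(round(genomes_per_flow_cell(900, 30), 1))
    # About 29X each if exactly 10 genomes share the flow cell.
    print(round(mean_coverage(900, 10), 1))
```

Under these assumptions the quoted "10 * 30X human genomes" per flow cell is consistent with a yield of about 900 gigabases.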
15
Personalized Disease Susceptibility Test and Treatment
Break
16
Big Data Challenges
17
All biology is becoming computational, much the same way it has become molecular …
Otherwise “low input, high throughput and no output science”
--- Sydney Brenner, 2002 Nobel Prize
18
Bioinformatics and Computational Biology
Interdisciplinary
–Statistics, Biology, Computer Science
Applied
–From freshmen to postdocs
–Useful training for many
–The more you practice, the better you get
Moves with technology development
19
Is This Class for Me?
Computer:
–R and Python
Biology:
–Molecular biology, genomics
Statistics:
–Hypothesis testing, distributions, intuition
20
Class Information
Course website:
–https://canvas.harvard.edu/courses/10740
–Video recording, slides, reading online
–Office hours, auditing
–Background: CS, Stats, Biology
Roughly 6 modules (one HW each)
–Transcriptomes (microarrays and RNA-seq)
–Gene regulation (transcriptional & epigenetic regulation)
–Human genetics and disease (GWAS / cancer)
21
Class Information
Teaching Fellows
–Zhirui Hu
–Zack McCaw
Labs:
–Wed 6 – 8pm, Science Center B09
–Thur 6 – 8pm, HSPH Kresge LL6
–Next Wed: Odyssey account and LINUX tutorial!
22
HW and Grading
Discussion on Canvas by HW
Submission on Canvas by HW
HW: 6 * 15 (STAT115) or 6 * 20 (graduate)
Quiz for each module: 6
Final exams: 20
Class participation: 5 (extra)
Algorithm videos: 5 (extra)
Late days
Break
23
Gene Expression Microarrays
24
Expression Microarrays
Grow cells under a given condition, collect the mRNA population, and label it
A microarray has a high density (thousands to millions) of sequence-specific probes, each at a known location, for each gene/RNA
The sample hybridizes to the microarray probes by DNA (A-T, G-C) base pairing; non-specific binding is washed away
Measure each mRNA's abundance by reading the labeled signal at each probe location
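The base-pairing step can be illustrated with a toy example. This is a minimal sketch that only models perfect A-T / G-C pairing between a probe and an antiparallel stretch of a DNA target; it ignores labeling, mismatch tolerance, and hybridization thermodynamics. The sequences and function names are made up for illustration.

```python
# Toy model of probe-target hybridization by Watson-Crick base pairing.
# Hypothetical helper names; real hybridization also depends on temperature,
# salt, probe length, and partial matches, none of which is modeled here.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def binding_sites(probe, target):
    """Start positions in target where the probe pairs perfectly.

    The probe binds antiparallel, so we search the target for the
    probe's reverse complement.
    """
    site = reverse_complement(probe)
    last_start = len(target) - len(site)
    return [i for i in range(last_start + 1) if target[i:i + len(site)] == site]


if __name__ == "__main__":
    target = "ATGGCGTACGTTAGC"                # made-up sample fragment
    probe = reverse_complement("GCGTACG")     # probe designed against target[3:10]
    print(probe)                              # CGTACGC
    print(binding_sites(probe, target))       # [3]
```

A probe with no perfectly complementary stretch in the sample returns an empty list, which corresponds to no specific signal at that probe location in this simplified picture.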
25
Affymetrix GeneChip Arrays
26
Labeled Samples Hybridize to DNA Probes on GeneChip
27
Shining Laser Light Causes Tagged Fragments to Glow
28
Perfect Match (PM) vs MisMatch (MM) (control for cross hybridization)
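The PM/MM idea can be sketched in a few lines. On Affymetrix GeneChip arrays the MM probe is the 25-base PM probe with its middle (13th) base swapped for the complement, so it should capture cross-hybridization but not the specific target. The example below is a minimal sketch: the probe sequence and intensities are invented, the function names are hypothetical, and the floored PM - MM subtraction is a naive illustration rather than the method used by any particular analysis package.

```python
# Sketch of the PM/MM probe pair design and a naive background correction.
# All sequences and intensity values below are invented for illustration.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(pm_probe):
    """Build an MM probe by complementing the central base of the PM probe.

    For a 25-mer, index 12 (0-based) is the 13th base.
    """
    mid = len(pm_probe) // 2
    return pm_probe[:mid] + COMPLEMENT[pm_probe[mid]] + pm_probe[mid + 1:]


def background_adjusted(pm_signal, mm_signal):
    """Naive PM - MM correction, floored at zero (illustrative only)."""
    return max(pm_signal - mm_signal, 0.0)


if __name__ == "__main__":
    pm = "ATCG" * 6 + "A"          # made-up 25-mer
    mm = mismatch_probe(pm)
    differing = [i for i in range(len(pm)) if pm[i] != mm[i]]
    print(differing)                            # [12]
    print(background_adjusted(1200.0, 300.0))   # 900.0
    print(background_adjusted(100.0, 250.0))    # 0.0
```

The flooring step shows one practical wrinkle: MM intensity sometimes exceeds PM intensity, so a plain subtraction can go negative and needs some form of handling.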
29
NimbleGen Arrays
30
Agilent Arrays
31
Microarrays
Array comparison:
–# probes / array, # probes / gene, probe length
–Flexibility vs data reuse
Why do we bother learning about microarrays now?
–RNA-seq is probably more cost effective now
–The amount of useful public data
–The data analysis techniques