1
Welcome to CS374 Algorithms in Biology
2
Overview
   Administrivia
   Molecular Biology and Computation: DNA, proteins, cells, evolution
   Some examples of CS in biology
   Computer Scientists vs Biologists
3
CS374: Algorithms in Biology (cs374.stanford.edu)
1. Attendance
   At most 2 classes missed without affecting grade
2. Lectures
   Most important requirement
   Select available topic & day, send email to Serafim and George
   Read papers, meet with Serafim 1-2 weeks before lecture
   Ask George any questions on papers while preparing presentation
   Schedule long (2 hr) meeting with Serafim the day before lecture
   Slides due at noon before lecture
4
CS374: Algorithms in Biology (cs374.stanford.edu)
3. Scribing
   Please sign up on a first-come, first-served basis
   Due 1 week after lecture; edited & distributed 2 weeks after lecture
   George will help you edit
4. Summaries
   Select 1 lecture among the first 10, 1 lecture among the rest
   Find one relevant paper
   Write a 1-page summary of the paper
      » Paper reference
      » Abstract
      » Discussion
   Ask George for questions/feedback
5. Have fun!
5
Structure of DNA
   Double helix
   DNA: Phosphate Group, Sugar, Nitrogenous Base (A, C, G, T)
   [Photo labels: Physicist, Ornithologist]
6
DNA to RNA, and genes
   DNA: ~3x10^9 nucleotides long in humans; contains ~22,000 genes
   RNA (G A G U C A G C): carries the “message” for “translating”, or “expressing”, one gene
   Steps: transcription, translation, folding
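The transcription and translation steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the function names `transcribe` and `translate` are hypothetical, the input is assumed to be the coding strand containing only A, C, G, T, and the standard genetic code is assumed. Folding is not modeled.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(coding_strand):
    """Return the RNA message for a DNA coding strand (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def translate(rna):
    """Translate from the first AUG up to (not including) the first in-frame stop.

    Assumes the input contains only A, C, G, U; other characters raise KeyError.
    """
    dna = rna.upper().replace("U", "T")
    start = dna.find("ATG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    message = transcribe("CCATGGCCAAATGACC")
    print(message)             # RNA message
    print(translate(message))  # one-letter amino-acid string
```

The one-letter string returned by `translate` is the amino-acid chain whose folding is discussed later in the deck.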
7
Structure of proteins
   Composed of a chain of amino acids.

         R
         |
    H2N--C--COOH
         |
         H

   R: 20 possible groups
   Sequence of amino acids folds to form a complex 3-D structure.
   The structure of a protein is intimately connected to its function.
8
All living organisms are composed of cells
9
Genetics in the 20th Century
10
21st Century
   AGTAGCACAGACTACGACGAGA CGATCGTGCGAGCGACGGCGTA GTGTGCTGTACTGTCGTGTGTG TGTACTCTCCTCTCTCTAGTCT ACGTGCTGTATGCGTTAGTGTC GTCGTCTAGTAGTCGCGATGCT CTGATGTTAGAGGATGCACGAT GCTGCTGCTACTAGCGTGCTGC TGCGATGTAGCTGTCGTACGTG TAGTGTGCTGTAAGTCGAGTGT AGCTGGCGATGTATCGTGGT
   AGTAGGACAGACTACGACGAGACGAT CGTGCGAGCGACGGCGTAGTGTGCTG TACTGTCGTGTGTGTGTACTCTCCTC TCTCTAGTCTACGTGCTGTATGCGTT AGTGTCGTCGTCTAGTAGTCGCGATG CTCTGATGTTAGAGGATGCACGATGC TGCTGCTACTAGCGTGCTGCTGCGAT GTAGCTGTCGTACGTGTAGTGTGCTG TAAGTCGAGTGTAGCTGGCGATGTAT CGTGGT
11
Computational Biology
   Organize & analyze massive amounts of biological data
   Enable biologists to use data
   Form testable hypotheses
   Discover new biology
   AGTAGCACAGACTACGACGAGA CGATCGTGCGAGCGACGGCGTA GTGTGCTGTACTGTCGTGTGTG TGTACTCTCCTCTCTCTAGTCT ACGTGCTGTATGCGTTAGTGTC GTCGTCTAGTAGTCGCGATGCT CTGATGTTAGAGGATGCACGAT GCTGCTGCTACTAGCGTGCTGC TGCGATGTAGCTGTCGTACGTG TAGTGTGCTGTAAGTCGAGTGT AGCTGGCGATGTATCGTGGT
12
DNA to RNA, and genes
   DNA: ~3x10^9 nucleotides long in humans; contains ~22,000 genes
   RNA (G A G U C A G C): carries the “message” for “translating”, or “expressing”, one gene
   Steps: transcription, translation, folding
   [Diagram marker: 1]
13
Some examples of central role of CS
1. Sequencing
   AGTAGCACAGA CTACGACGAGA CGATCGTGCGA GCGACGGCGTA GTGTGCTGTAC TGTCGTGTGTG TGTACTCTCCT
   3x10^9 nucleotides
   ~500 nucleotides
14
Some examples of central role of CS
1. Sequencing
   AGTAGCACAGA CTACGACGAGA CGATCGTGCGA GCGACGGCGTA GTGTGCTGTAC TGTCGTGTGTG TGTACTCTCCT
   3x10^9 nucleotides
   Computational Fragment Assembly
      Introduced ~1980
      1995: assemble up to 1,000,000 long DNA pieces
      2000: assemble whole human genome
   A big puzzle: ~60 million pieces
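To give a feel for the "big puzzle", here is a toy sketch of greedy overlap-based assembly in Python. It is an assumption-laden illustration, not the method used for the human genome: the names `overlap`, `greedy_assemble`, and `coverage` are hypothetical, reads are assumed error-free and from a single strand, and repeats (which make real assembly hard) are ignored. The `coverage` helper only restates the slide's numbers: ~60 million pieces of ~500 nucleotides over a 3x10^9-nucleotide genome.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if < min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap.

    Returns the list of remaining contigs (one contig if everything merged).
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no remaining overlaps; contigs stay separate
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs


def coverage(num_reads, read_length, genome_length):
    """Average number of reads covering each genome position."""
    return num_reads * read_length / genome_length


if __name__ == "__main__":
    reads = ["CACAGACTAC", "ACTACGACGAGA", "AGTAGCACAG"]
    print(greedy_assemble(reads))
    print(coverage(60_000_000, 500, 3_000_000_000))
```

With the slide's figures, `coverage` works out to about 10 reads over each position, which is why overlaps between pieces are plentiful enough to attempt assembly at all.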
15
Complete genomes today More than 300 complete genomes have been sequenced
16
DNA to RNA, and genes
   DNA: ~3x10^9 nucleotides long in humans; contains ~22,000 genes
   RNA (G A G U C A G C): carries the “message” for “translating”, or “expressing”, one gene
   Steps: transcription, translation, folding
   [Diagram markers: 1, 2]
17
Where are the genes?
2. Gene Finding
   In humans: ~22,000 genes, ~1.5% of human DNA
18
atg tga ggtgag caggtg cagatg cagttg caggcc ggtgag
19
2. Gene Finding
   Gene structure, 5’ to 3’: Start codon ATG, Exon 1, Intron 1, Exon 2, Intron 2, Exon 3, Stop codon TAG/TGA/TAA
   Splice sites
   Topics in CS374:
      Finding noncoding RNA genes
      Finding short words that regulate the expression of genes
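The start and stop codon signals above can be illustrated with a simple open-reading-frame scan in Python. This is only a sketch under strong assumptions: `find_orfs` is a hypothetical helper, it scans one strand in three frames, and it ignores introns and splice sites entirely, so it does not handle the exon/intron structure shown on the slide. Real gene finders combine many more signals.

```python
STOP_CODONS = {"TAG", "TGA", "TAA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of ATG...stop stretches on one strand.

    Indices are 0-based, end is exclusive and includes the stop codon.
    min_codons is the minimum number of codons before the stop (ATG included).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None  # look for the next start codon
    return sorted(orfs)


if __name__ == "__main__":
    sequence = "CCATGGCCAAATGACC"
    for begin, end in find_orfs(sequence):
        print(begin, end, sequence[begin:end])
```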
20
DNA to RNA, and genes
   DNA: ~3x10^9 nucleotides long in humans; contains ~22,000 genes
   RNA (G A G U C A G C): carries the “message” for “translating”, or “expressing”, one gene
   Steps: transcription, translation, folding
   [Diagram markers: 1, 2, easy, 3]
21
3. Protein Folding
   The amino-acid sequence of a protein determines the 3D fold
   The 3D fold of a protein determines its function
   Can we predict the 3D fold of a protein given its amino-acid sequence?
   Holy grail of compbio: a 35-year-old problem
   Molecular dynamics, robotics, machine learning, computational geometry
Topics on Proteins in CS374
1. Protein Structure
   Protein Structure Comparison
   Evolution of Protein Domains
   Molecular Dynamics & Drug Targets
   Protein Classification
   Protein Folding Dynamics
   Protein Kinetics
2. Protein Comparison
   Latest multiple alignment tools
   Selecting parameters for alignment
   Phylogenetic trees
22
Complete Genomes More than 200 complete genomes have been sequenced
23
Evolution
24
Evolution at the DNA level
   [Figure labels: OK, X, X, Still OK?, next generation]
25
4. Sequence Comparison
   Sequence conservation implies function
   Sequence comparison is key to:
      Finding genes
      Determining function
      Uncovering the evolutionary processes
26
Sequence Comparison: Alignment
   Input sequences:
      AGGCTATCACCTGACCTCCAGGCCGATGCCC
      TAGCTATCACGACCGCGGTCGATTTGCCCGAC
   Alignment (| match, x mismatch, - gap):
      -AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---
       || |||||||  ||||x|  ||x|||  |||||
      TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC
   Sequence Alignment
      Introduced ~1970
      BLAST: 1990, most cited paper in history
      Still very active area of research
   [Diagram: query, DB, BLAST]
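As a concrete illustration of alignment, the sketch below implements global alignment by dynamic programming in the style of Needleman and Wunsch (1970). It is a minimal example with assumed scoring values (match +1, mismatch -1, gap -1); the slide does not state which scores or algorithm produced its alignment, and BLAST itself uses a different, heuristic approach. The function name is hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    The scoring values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two letters
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    s, x, y = needleman_wunsch("AGGCTATCACCTGACCTCCAGGCCGATGCCC",
                               "TAGCTATCACGACCGCGGTCGATTTGCCCGAC")
    print(s)
    print(x)
    print(y)
```

The table has one cell per pair of prefixes, so time and memory grow with the product of the two lengths; that cost is one reason database search tools rely on heuristics instead.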
27
Comparison of Human, Mouse, and Rat
Topics on Genomics in CS374
   Indexing Large Databases
      Newest BLAST techniques
   Repeat Detection
   Genomic Rearrangements
      Finding the order of shuffles between two genomes
28
5. Clustering of Microarrays
   Clinical prediction of Leukemia type
   2 types: Acute lymphoid (ALL), Acute myeloid (AML)
   Different treatment & outcomes
   Predict type before treatment?
   Bone marrow samples: ALL vs AML
   Measure the amount (expression level) of each gene
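One simple way to picture "predict type before treatment" is a nearest-centroid classifier over expression vectors, sketched below in Python. This is an assumption for illustration only: the slide does not say which method was used, the function names are hypothetical, and the numbers are made-up toy values, not real microarray data.

```python
def centroid(samples):
    """Per-gene mean expression over a list of equal-length samples."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def squared_distance(u, v):
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def nearest_centroid(training, sample):
    """Return the label whose class centroid is closest to the sample.

    training maps a label (e.g. "ALL", "AML") to a list of expression vectors.
    """
    centroids = {label: centroid(rows) for label, rows in training.items()}
    return min(centroids,
               key=lambda label: squared_distance(centroids[label], sample))


if __name__ == "__main__":
    # Toy, invented expression levels for three hypothetical genes.
    training = {
        "ALL": [[9.0, 1.0, 2.0], [8.0, 2.0, 1.0]],
        "AML": [[1.0, 8.0, 9.0], [2.0, 9.0, 8.0]],
    }
    print(nearest_centroid(training, [7.5, 1.5, 2.5]))
    print(nearest_centroid(training, [1.5, 7.0, 9.5]))
```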
29
6. Protein networks
   Newer research area
   Construct networks from multiple data sources
   Navigate networks
   Compare networks across organisms
   Statistics, Machine learning, Graph algorithms, Databases
Topics on Protein Networks in CS374
1. Integration: Build networks from multiple sources
2. Alignment: Compare networks across species
3. Mathematical properties: Modular, scale free
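A first step toward the "mathematical properties" item is the degree distribution of a network: in a scale-free network a few hub proteins have many partners while most have few. The Python sketch below computes it from an edge list. The function names and the protein labels P1..P5 are hypothetical placeholders, and a five-node toy graph cannot demonstrate scale-freeness; it only shows the bookkeeping.

```python
from collections import Counter, defaultdict


def degrees(edges):
    """Map each protein to its number of distinct interaction partners."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions in this sketch
            neighbors[u].add(v)
            neighbors[v].add(u)
    return {node: len(partners) for node, partners in neighbors.items()}


def degree_histogram(edges):
    """Map degree -> number of proteins with that degree, sorted by degree."""
    counts = Counter(degrees(edges).values())
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Invented toy interactions between placeholder proteins.
    interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
    print(degrees(interactions))
    print(degree_histogram(interactions))
```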
30
7. Human evolution
   [Figure: short nucleotide sequences (A, C, G, T); layout not recoverable]
Topics on Human Population Genetics in CS374
1. Evolution: Finding fast-evolving genes in human populations
2. Migration: Tracing the migration of humans out of Africa by genetic studies
31
8. Building circuits from cells
32
The abstract submission deadline is 11:59 pm, Sunday, October 1, 2006.
33
Computer Scientists vs Biologists
34
Computer scientists vs Biologists
   (Almost) nothing is ever true or false in biology
   Everything is true or false in computer science
35
Computer scientists vs Biologists
   Biologists strive to understand the complicated, messy natural world
   Computer scientists seek to build their own clean and organized virtual worlds
36
Computer scientists vs Biologists
   Biologists are obsessed with being the first to discover something
   Computer scientists are obsessed with being the first to invent or prove something
37
Computer scientists vs Biologists
   Biologists are comfortable with the idea that all data have errors
   Computer scientists are not
38
Computer scientists vs Biologists
   Computer scientists get high-paid jobs after graduation
   Biologists typically have to complete one or more 5-year post-docs...
39
Computer Science is to Biology what Mathematics is to Physics
   “Antedisciplinary” Science
   What is computational biology?
   http://compbiol.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pcbi.0010006