
1 Welcome to CS374 Algorithms in Biology

2 Overview
   Administrivia
   Molecular Biology and Computation
     - DNA, proteins, cells, evolution
     - Some examples of CS in biology
   Computer Scientists vs Biologists

3 CS374: Algorithms in Biology (cs374.stanford.edu)
   1. Attendance
      At most 2 classes missed without affecting grade
   2. Lectures
      Most important requirement
      Select available topic & day, send email to Serafim and George
      Read papers, meet with Serafim 1-2 weeks before lecture
      Ask George any questions on papers while preparing presentation
      Schedule long (2 hr) meeting with Serafim the day before lecture
      Slides due at noon before lecture

4 CS374: Algorithms in Biology (cs374.stanford.edu)
   3. Scribing
      Please sign up on a first-come, first-served basis
      Due 1 week after lecture, edited & distributed 2 weeks after lecture
      George will help you edit
   4. Summaries
      Select 1 lecture among first 10, 1 lecture among rest
      Find one relevant paper
      Write a 1-page summary of the paper
        » Paper reference
        » Abstract
        » Discussion
      Ask George for questions/feedback
   5. Have fun!

5 Structure of DNA
   Double helix
   [Figure labels: DNA, Phosphate Group, Sugar, Nitrogenous Base (A, C, G, T); Physicist, Ornithologist]

6 DNA to RNA, and genes
   DNA, ~3x10^9 long in humans
   Contains ~22,000 genes
   RNA: carries the “message” for “translating”, or “expressing” one gene
   [Figure labels: G A G U C A G C; transcription, translation, folding]
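
The transcription and translation steps named on this slide can be illustrated with a minimal Python sketch. It assumes the input is the coding strand read 5' to 3', uses the standard genetic code, and ignores splicing and folding entirely. The function names `transcribe` and `translate` are hypothetical and chosen only for this illustration.

```python
# Standard genetic code, enumerated in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(dna):
    """Coding-strand DNA -> mRNA (assumption: simply replace T with U)."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Read mRNA codon by codon from position 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i + 3].replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


rna = transcribe("GAGTCAGC")          # the RNA letters shown on the slide
protein = translate("AUGGAGUCAGCUUAA")  # made-up message for illustration
```

Here `rna` is `GAGUCAGC`, matching the letters on the slide, and the made-up message translates to the four-residue chain `MESA`.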

7 Structure of proteins
   Composed of a chain of amino acids.
   Amino acid: H2N-CH(R)-COOH, where R is one of 20 possible groups
   Sequence of amino acids folds to form a complex 3-D structure.
   The structure of a protein is intimately connected to its function.

8 All living organisms are composed of cells

9 Genetics in the 20th Century

10 21st Century
   AGTAGCACAGACTACGACGAGA CGATCGTGCGAGCGACGGCGTA GTGTGCTGTACTGTCGTGTGTG TGTACTCTCCTCTCTCTAGTCT ACGTGCTGTATGCGTTAGTGTC GTCGTCTAGTAGTCGCGATGCT CTGATGTTAGAGGATGCACGAT GCTGCTGCTACTAGCGTGCTGC TGCGATGTAGCTGTCGTACGTG TAGTGTGCTGTAAGTCGAGTGT AGCTGGCGATGTATCGTGGT
   AGTAGGACAGACTACGACGAGACGAT CGTGCGAGCGACGGCGTAGTGTGCTG TACTGTCGTGTGTGTGTACTCTCCTC TCTCTAGTCTACGTGCTGTATGCGTT AGTGTCGTCGTCTAGTAGTCGCGATG CTCTGATGTTAGAGGATGCACGATGC TGCTGCTACTAGCGTGCTGCTGCGAT GTAGCTGTCGTACGTGTAGTGTGCTG TAAGTCGAGTGTAGCTGGCGATGTAT CGTGGT

11 Computational Biology
   Organize & analyze massive amounts of biological data
     - Enable biologists to use data
     - Form testable hypotheses
     - Discover new biology
   AGTAGCACAGACTACGACGAGA CGATCGTGCGAGCGACGGCGTA GTGTGCTGTACTGTCGTGTGTG TGTACTCTCCTCTCTCTAGTCT ACGTGCTGTATGCGTTAGTGTC GTCGTCTAGTAGTCGCGATGCT CTGATGTTAGAGGATGCACGAT GCTGCTGCTACTAGCGTGCTGC TGCGATGTAGCTGTCGTACGTG TAGTGTGCTGTAAGTCGAGTGT AGCTGGCGATGTATCGTGGT

12 DNA to RNA, and genes
   DNA, ~3x10^9 long in humans
   Contains ~22,000 genes
   RNA: carries the “message” for “translating”, or “expressing” one gene
   [Figure labels: G A G U C A G C; transcription, translation, folding; marker: 1]

13 Some examples of central role of CS
   1. Sequencing
   AGTAGCACAGA CTACGACGAGA CGATCGTGCGA GCGACGGCGTA GTGTGCTGTAC TGTCGTGTGTG TGTACTCTCCT
   3x10^9 nucleotides
   ~500 nucleotides

14 Some examples of central role of CS
   1. Sequencing
   AGTAGCACAGA CTACGACGAGA CGATCGTGCGA GCGACGGCGTA GTGTGCTGTAC TGTCGTGTGTG TGTACTCTCCT
   3x10^9 nucleotides
   Computational Fragment Assembly
     Introduced ~1980
     1995: assemble up to 1,000,000 long DNA pieces
     2000: assemble whole human genome
   A big puzzle: ~60 million pieces
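
To make the "big puzzle" concrete, here is a toy greedy overlap-merge sketch in Python. It is not the method used by any real assembler: it assumes error-free reads from a single strand, has no handling for repeats, and runs in time quadratic in the number of reads per merge. The names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing overlaps any more; the remaining contigs stay separate
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three overlapping pieces cut from the first 22 letters shown on the slide.
pieces = ["CTACGACGAGA", "AGTAGCACAG", "CACAGACTACG"]
contigs = greedy_assemble(pieces)
```

With these three pieces the sketch returns the single contig `AGTAGCACAGACTACGACGAGA`. Real genomes contain repeats and sequencing errors, which is where a greedy strategy like this one breaks down.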

15 Complete genomes today More than 300 complete genomes have been sequenced

16 DNA to RNA, and genes
   DNA, ~3x10^9 long in humans
   Contains ~22,000 genes
   RNA: carries the “message” for “translating”, or “expressing” one gene
   [Figure labels: G A G U C A G C; transcription, translation, folding; markers: 1, 2]

17 Where are the genes? 2. Gene Finding In humans: ~22,000 genes ~1.5% of human DNA

18 atg tga ggtgag caggtg cagatg cagttg caggcc ggtgag

19 2. Gene Finding
   [Figure labels: 5’ to 3’; Start codon ATG; Exon 1, Intron 1, Exon 2, Intron 2, Exon 3; Splice sites; Stop codon TAG/TGA/TAA]
   Topics in CS374:
     Finding noncoding RNA genes
     Finding short words that regulate the expression of genes
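
The start and stop codons on this slide suggest the simplest possible gene-finding heuristic: scanning for open reading frames. The Python sketch below is only that heuristic, under strong assumptions: forward strand only, no introns or splice sites, and no statistical model. The name `find_orfs` and the `min_codons` threshold are hypothetical.

```python
STOP_CODONS = {"TAG", "TGA", "TAA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each ATG...stop stretch on the forward strand.

    min_codons counts the codons before the stop codon, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon seen in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next start codon
    return sorted(orfs)


orfs = find_orfs("CCATGGCTTGACC")
```

For the made-up input above, `orfs` contains one entry, `(2, 11, "ATGGCTTGA")`. Human genes interrupt the reading frame with introns, so a scan like this is at best a first filter.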

20 DNA to RNA, and genes
   DNA, ~3x10^9 long in humans
   Contains ~22,000 genes
   RNA: carries the “message” for “translating”, or “expressing” one gene
   [Figure labels: G A G U C A G C; transcription, translation, folding; markers: 1, 2, easy, 3]

21 3. Protein Folding
   The amino-acid sequence of a protein determines the 3D fold
   The 3D fold of a protein determines its function
   Can we predict the 3D fold of a protein given its amino-acid sequence?
     - Holy grail of compbio: a 35-year-old problem
     - Molecular dynamics, robotics, machine learning, computational geometry
   Topics on Proteins in CS374
   1. Protein Structure
      Protein Structure Comparison
      Evolution of Protein Domains
      Molecular Dynamics & Drug Targets
      Protein Classification
      Protein Folding Dynamics
      Protein Kinetics
   2. Protein Comparison
      Latest multiple alignment tools
      Selecting parameters for alignment
      Phylogenetic trees

22 Complete Genomes More than 200 complete genomes have been sequenced

23 Evolution

24 Evolution at the DNA level OK X X Still OK? next generation

25 4. Sequence Comparison
   Sequence conservation implies function
   Sequence comparison is key to
     - Finding genes
     - Determining function
     - Uncovering the evolutionary processes

26 Sequence Comparison: Alignment
   Sequences:
   AGGCTATCACCTGACCTCCAGGCCGATGCCC
   TAGCTATCACGACCGCGGTCGATTTGCCCGAC
   Alignment:
   -AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---
    || |||||||  ||||x|  || |||  |||||
   TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC
   Sequence Alignment
     Introduced ~1970
     BLAST: 1990, most cited paper in history
     Still very active area of research
   [Figure labels: query, DB, BLAST]
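
As an illustration of what an alignment algorithm computes, here is a minimal global-alignment sketch in Python using the classic dynamic-programming recurrence (Needleman-Wunsch). The scoring values (match +1, mismatch -1, gap -1) are assumptions picked for the example, not parameters from the slides, and the function name `global_alignment` is hypothetical. BLAST itself is a heuristic local search and works differently.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two letters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


result = global_alignment("ACGT", "AGT")
```

For this small input the result is a score of 2 with `ACGT` aligned against `A-GT`. The table has (n+1) x (m+1) cells, so time and memory grow with the product of the sequence lengths, which is one reason faster heuristics are needed for database search.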

27 Comparison of Human, Mouse, and Rat
   Topics on Genomics in CS374
     Indexing Large Databases
       Newest BLAST techniques
     Repeat Detection
     Genomic Rearrangements
       Finding the order of shuffles between two genomes

28 5. Clustering of Microarrays
   Clinical prediction of Leukemia type
   2 types
     - Acute lymphoid (ALL)
     - Acute myeloid (AML)
   Different treatment & outcomes
   Predict type before treatment?
   Bone marrow samples: ALL vs AML
   Measure amount of each gene
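
One simple way to turn "measure amount of each gene" into a prediction is a nearest-centroid classifier, sketched below in Python. This is a stand-in technique chosen for brevity, not the method behind the slide, and the expression values are invented toy numbers with three hypothetical genes per sample. All function names are hypothetical.

```python
import math


def centroid(samples):
    """Per-gene mean expression across a list of samples."""
    return [sum(values) / len(values) for values in zip(*samples)]


def distance(x, y):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def train(labelled):
    """labelled maps a class label to a list of expression vectors."""
    return {label: centroid(samples) for label, samples in labelled.items()}


def predict(centroids, sample):
    """Return the label whose centroid is closest to the sample."""
    return min(centroids, key=lambda label: distance(centroids[label], sample))


# Invented toy data: each vector holds expression levels for three genes.
training = {
    "ALL": [[9.0, 1.0, 2.0], [8.0, 2.0, 1.0]],
    "AML": [[1.0, 8.0, 9.0], [2.0, 9.0, 8.0]],
}
centroids = train(training)
label = predict(centroids, [8.5, 1.0, 1.5])
```

With the toy data above, the new sample is assigned to `ALL`. Real microarray data has thousands of genes and few samples, so gene selection and validation on held-out samples matter far more than the classifier itself.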

29 6. Protein networks
   Newer research area
   Construct networks from multiple data sources
   Navigate networks
   Compare networks across organisms
     - Statistics
     - Machine learning
     - Graph algorithms
     - Databases
   Topics on Protein Networks in CS374
   1. Integration
      Build networks from multiple sources
   2. Alignment
      Compare networks across species
   3. Mathematical properties
      Modular, scale free

30 7. Human evolution
   [Figure: DNA sequence fragments]
   Topics on Human Population Genetics in CS374
   1. Evolution
      Finding fast-evolving genes in human populations
   2. Migration
      Tracing the migration of humans out of Africa by genetic studies

31 8. Building circuits from cells

32 The abstract submission deadline is 11:59 pm, Sunday, October 1, 2006.

33 Computer Scientists vs Biologists

34 Computer scientists vs Biologists
   (almost) Nothing is ever true or false in Biology
   Everything is true or false in computer science

35 Computer scientists vs Biologists
   Biologists strive to understand the complicated, messy natural world
   Computer scientists seek to build their own clean and organized virtual worlds

36 Computer scientists vs Biologists
   Biologists are obsessed with being the first to discover something
   Computer scientists are obsessed with being the first to invent or prove something

37 Computer scientists vs Biologists
   Biologists are comfortable with the idea that all data have errors
   Computer scientists are not

38 Computer scientists vs Biologists
   Computer scientists get high-paid jobs after graduation
   Biologists typically have to complete one or more 5-year post-docs...

39 Computer Science is to Biology what Mathematics is to Physics
   “Antedisciplinary” Science
   What is computational biology?
   http://compbiol.plosjournals.org/perlserv/?request=get-document&doi=10.1371/journal.pcbi.0010006

