Computational Genomics Lecture 1, Tuesday April 1, 2003.

1 Computational Genomics Lecture 1, Tuesday April 1, 2003

2 Biology in One Slide: 2 Paradigms
  - Molecular Paradigm
  - Evolution Paradigm

3 High Throughput Biology
Biology is becoming an information science
…ACGTGACTGAGGACCGTG CGACTGAGACTGACTGGGT CTAGCTAGACTACGTTTTA TATATATATACGTCGTCGT ACTGATGACTAGATTACAG ACTGATTTAGATACCTGAC TGATTTTAAAAAAATATT…
[Figure panels: Gene Expression, DNA Sequencing]

4 Goals of this course
Introduction to Computational Biology
  - Basic biology for computer scientists
  - Breadth: mention many topics & applications
In-depth coverage of Computational Genomics
  - Algorithms for sequence analysis
  - Current applications, trends, and open problems
Coverage of useful algorithms
  - Hidden Markov models
  - Dynamic Programming
  - String algorithms
  - Applications of AI techniques

5

6 Topics in CS262
Part 1: In-depth coverage of basic computational methods for analysis of biological sequences
  - Sequence Alignment & Dynamic Programming
  - Hidden Markov models
These methods are used heavily in most genomics applications:
  - DNA sequencing
  - Comparison of DNA and proteins across organisms
  - Discovery of genes, promoters, regulatory sites

7 Topics in CS262
Part 2: Topics in computational genomics, more algorithms, and areas of active research
  - DNA sequencing & assembly: reading a complete genome such as the human DNA
  - Gene finding: marking genes on the DNA sequence
  - Large-scale comparative genomics: comparing whole genomes from multiple organisms
  - Microarrays & regulation: understanding the regulatory code, and potential disease-causing genes
  - RNA structure: predicting the folding of RNA
  - Phylogeny and evolution: quantifying the evolution of biological sequences

8 Course responsibilities
Homeworks [72%]
  - 4 challenging problem sets, 4-5 problems per problem set
  - Collaboration allowed; please give credit
  - Homeworks due Thursday, solutions explained Friday
  - The two worst problems across all homeworks do not count
Final [18%]
  - Take-home, 1 day
  - Collaboration not allowed
  - Basic questions, much easier than the homeworks
Scribing [10%]
  - Due one week after the lecture, except with special permission

9 Reading material
Books
  - "Biological sequence analysis" by Durbin, Eddy, Krogh, Mitchison: Chapters 1-4, 6, (7-8), (9-10)
  - "Algorithms on strings, trees, and sequences" by Gusfield: Chapters (5-7), 11-12, (13), 14, (17)
Papers
Lecture notes

10 Topic 1. Sequence Alignment

11 Complete genomes

12 Evolution

13 Evolution at the DNA level
…ACGGTGCAGTCACCA…
…ACGTTGCAGTCCACCA…
[Figure labels: SEQUENCE EDITS, REARRANGEMENTS]

14 Evolutionary Rates
[Figure labels: OK, X, X, Still OK?, next generation]
Changes in non-functional sites are OK, so they will be propagated.
Most changes in functional sites are deleterious and will be rejected.

15 Sequence conservation implies function
[Figure: Interleukin region in human and mouse; axis ticks 100%, 40%]

16 Sequence Alignment
Unaligned sequences:
AGGCTATCACCTGACCTCCAGGCCGATGCCC
TAGCTATCACGACCGCGGTCGATTTGCCCGAC
Aligned:
-AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---
TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC
Definition: Given two strings x = x_1 x_2 … x_M and y = y_1 y_2 … y_N, an alignment is an assignment of gaps to positions 0, …, M in x and 0, …, N in y, so as to line up each letter in one sequence with either a letter or a gap in the other sequence.
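The definition above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function name `is_valid_alignment` and the choice of "-" as the gap character are assumptions made for illustration. It treats two gapped rows as an alignment of x and y when they have equal length, no column pairs a gap with a gap, and removing the gaps recovers the original strings.

```python
GAP = "-"  # assumed gap symbol, as used in the aligned rows on this slide


def is_valid_alignment(row_x: str, row_y: str, x: str, y: str) -> bool:
    """Return True if (row_x, row_y) is an alignment of x and y.

    Hypothetical helper for illustration only.
    """
    # Both rows must have the same number of columns.
    if len(row_x) != len(row_y):
        return False
    # Every column must contain at least one letter.
    if any(a == GAP and b == GAP for a, b in zip(row_x, row_y)):
        return False
    # Deleting the gaps must give back the original sequences.
    return row_x.replace(GAP, "") == x and row_y.replace(GAP, "") == y


if __name__ == "__main__":
    x = "AGGCTATCACCTGACCTCCAGGCCGATGCCC"
    y = "TAGCTATCACGACCGCGGTCGATTTGCCCGAC"
    row_x = "-AGGCTATCACCTGACCTCCAGGCCGA--TGCCC---"
    row_y = "TAG-CTATCAC--GACCGC--GGTCGATTTGCCCGAC"
    print(is_valid_alignment(row_x, row_y, x, y))
```

Run on the slide's own example, the check should accept the two aligned rows as an alignment of the two unaligned sequences.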

17 What is a good alignment?
Alignment: The "best" way to match the letters of one sequence with those of the other
How do we define "best"?
Alignment: A hypothesis that the two sequences come from a common ancestor through sequence edits
Parsimonious explanation: Find the minimum number of edits that transform one sequence into the other
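The "minimum number of edits" idea can be sketched with the standard unit-cost edit distance (Levenshtein distance), computed by dynamic programming. This is an illustrative assumption rather than the algorithm given on the slide: the function name `edit_distance` is hypothetical, and every substitution, insertion, and deletion is assumed to cost exactly 1.

```python
def edit_distance(x: str, y: str) -> int:
    """Minimum number of single-letter substitutions, insertions, and
    deletions that turn x into y (unit costs are an assumption)."""
    # prev[j] holds the distance between the current prefix of x and y[:j].
    prev = list(range(len(y) + 1))  # the empty prefix of x needs j insertions
    for i, a in enumerate(x, start=1):
        curr = [i]  # x[:i] to the empty string needs i deletions
        for j, b in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from x
                curr[j - 1] + 1,         # insert b into x
                prev[j - 1] + (a != b),  # match (cost 0) or substitute (cost 1)
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("AGGCCTC", "AGGACTC"))   # one substitution
    print(edit_distance("AGGCCTC", "AGGGCCTC"))  # one insertion
```

Only two rows of the table are kept at a time, so memory grows with the length of y while time grows with the product of the two lengths.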

18 Scoring Function
Sequence edits: AGGCCTC
  - Mutations: AGGACTC
  - Insertions: AGGGCCTC
  - Deletions: AGG.CTC
Scoring Function:
  Match: +m
  Mismatch: -s
  Gap: -d
Score F = (# matches) × m - (# mismatches) × s - (# gaps) × d
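As a rough sketch of how the score F is evaluated for a given alignment: the function name `alignment_score` is hypothetical, gaps are assumed to be written as "-" (the deletion example on this slide uses "." instead), and the values chosen for m, s, and d in the usage lines are arbitrary, not values from the lecture.

```python
GAP = "-"  # assumed gap symbol


def alignment_score(row_x: str, row_y: str, m: float, s: float, d: float) -> float:
    """Compute F = (# matches) * m - (# mismatches) * s - (# gaps) * d
    for two aligned rows of equal length. Illustrative helper only."""
    if len(row_x) != len(row_y):
        raise ValueError("aligned rows must have the same length")
    matches = mismatches = gaps = 0
    for a, b in zip(row_x, row_y):
        if a == GAP or b == GAP:
            gaps += 1        # a letter lined up with a gap
        elif a == b:
            matches += 1     # identical letters
        else:
            mismatches += 1  # two different letters
    return matches * m - mismatches * s - gaps * d


if __name__ == "__main__":
    # Arbitrary example parameters: m = 1, s = 1, d = 2.
    print(alignment_score("AGGCCTC", "AGGACTC", m=1, s=1, d=2))    # mutation
    print(alignment_score("AGG-CCTC", "AGGGCCTC", m=1, s=1, d=2))  # insertion
    print(alignment_score("AGGCCTC", "AGG-CTC", m=1, s=1, d=2))    # deletion
```

Each gap column is counted once, so a run of k consecutive gaps costs k × d under this scheme.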
