1
Course information To reach me: Barry Cohen bcohen@cis.njit.edu GITC 4301 W 4:00-5:30 F 4:45-5:55 www.cs.njit.edu/~bcohen Web site, chat, web board, schedule my.njit.edu (no ‘www’) Guinea pig’s prerogative
2
Projects Team projects (4 person) One hour presentations Literature review / algorithms / programs Sample applications Open problems Homework for practice
3
Texts Intro to Computational Molecular Biology Setubal/Meidanis Biological sequence analysis Durbin, Eddy, Krogh, Mitchison Recommended: Computational Methods in Molecular Biology – Salzberg/Searls/Kasif
4
Watson & Crick, 1953 http://www.nature.com/genomics/human/watson-crick/
5
Stylized double helix
6
Replication ‘It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.’
7
Sequence to structure
8
The information cycle
9
The triplet code
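As a rough illustration of the triplet code, the sketch below reads a DNA coding strand three bases at a time and maps each codon to an amino acid using the standard genetic code. The names `CODON_TABLE` and `translate` are made up for this example, and the sketch assumes the input is already in frame and contains only A, C, G, T.

```python
from itertools import product

# Standard genetic code in the conventional compact form: codons are
# enumerated with bases in the order T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA coding sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTGGTAA"))  # prints MAW
```

Since 4 x 4 x 4 = 64 codons map onto 20 amino acids plus stop, several codons share one amino acid.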
10
In the beginning … Life began when the earth was young Life arose from simple chemistry (most life still is relatively simple) Universal common ancestor Common molecular machinery (oldest fossils are living fossils)
11
What is life? Information and metabolism RNA world hypothesis DNA as program file (information coding for activity) Replication (information which codes for itself) Variation, evolution (life adapts to its environment)
12
DNA DNA is a polymer (sequence, string) DNA is composed of just four kinds of chemical units (A, C, G, T) DNA is redundant (double helix); A’s pair with T’s, G’s pair with C’s Some DNA codes for RNA, proteins (exons – expressed regions) Some DNA is noncoding (introns – intervening regions) Coherent sets of DNA are genes
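The redundancy of the double helix can be shown with strings: under Watson-Crick pairing (A with T, G with C), one strand fully determines the other. The sketch below is a minimal example; the function name `reverse_complement` is an assumption for illustration, and it handles only the four unambiguous bases.

```python
# Translation table for Watson-Crick pairing: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the partner strand, read in its own 5'-to-3' direction."""
    return strand.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # prints GCAT
```

Applying the function twice returns the original strand, which is the sense in which the second strand carries no new information.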
13
RNA RNA is also a polymer (sequence, string) RNA is composed of just four kinds of chemical units (A, C, G, U) RNA is single-stranded Some RNA codes for proteins, some is functional (e.g., tRNA)
14
Proteins Proteins account for most life activity and structure A protein is a polymer (sequence, string) Proteins are composed of 20 kinds of chemical units (amino acids) Proteins fold into a specific shape, which determines their function Proteins are made from genetic templates (they don’t code)
15
Evolution Darwin – evolution is adaptation Nature has no aim; it is a result of random events Most events are DNA string edits (indels, substitutions) Some events are on ‘higher level’ structures (e.g., chromosomes)
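Treating mutations as string edits suggests a simple measure of how far apart two sequences are: the minimum number of insertions, deletions, and substitutions needed to turn one into the other. The sketch below uses the classic dynamic-programming edit distance with unit costs; the unit costs and the name `edit_distance` are assumptions for illustration, not a scoring scheme taken from the slides.

```python
def edit_distance(a, b):
    """Minimum number of single-character indels and substitutions turning a into b."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                       # delete char_a
                curr[j - 1] + 1,                   # insert char_b
                prev[j - 1] + (char_a != char_b),  # match or substitute
            ))
        prev = curr
    return prev[-1]


print(edit_distance("ACGT", "AGT"))  # prints 1 (one deletion)
```

The table has one cell per pair of prefixes, so the running time grows with the product of the two sequence lengths.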
16
The ‘tree of life’ Some errors in replication divide gene pools into two (speciation). (Or vice versa.) These bifurcations give the history of life a tree-like structure
17
rRNA universal tree of life
18
Algorithms An algorithm is a precise set of instructions for solving a problem (what do we mean by ‘precise’?) An algorithm must terminate Algorithms operate on data (inputs) Algorithms use data structures
19
Data structures A string is a natural mathematical model of a biological sequence A directed acyclic graph may represent familial descent A tree may represent species relations
20
Efficiency Bigger problems take more time and/or space (biology problems are often big) Harder problems take longer or more space (many biology problems are hard) Time (space), as a function of size, measures the complexity of an algorithm Many computable problems are intractable
21
Complexity classes Search a sorted list – log n Sort by comparison – n log n Text search – n Polynomial v. exponential time NP-complete problems
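To make the log n entry concrete, the sketch below searches a sorted list by repeatedly halving the candidate range and counts how many halvings it needs. The function name `binary_search` and the step counter are illustrative assumptions; in practice Python's standard `bisect` module provides the same search.

```python
def binary_search(sorted_items, target):
    """Return (index of target or -1, number of halving steps taken)."""
    lo, hi = 0, len(sorted_items)
    steps = 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be to the right of mid
        else:
            hi = mid       # target is at mid or to its left
    found = lo < len(sorted_items) and sorted_items[lo] == target
    return (lo if found else -1), steps


items = list(range(0, 2048, 2))       # 1024 sorted even numbers
index, steps = binary_search(items, 1000)
print(index, steps)
```

For 1024 items the loop runs at most 11 times, whereas a linear scan (the n entry) may inspect all 1024.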
22
Probability Base molecular events in evolution occur with a certain probability (frequency) Probability models predict what may occur (likelihood of a pair of jacks) Probability models may also infer what most likely has occurred
23
Entropy Entropy is a measure of information content How many y/n questions are needed to get an answer? DNA positions differ in entropy, depending on how ‘conserved’ they are
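The yes/no-question picture corresponds to Shannon entropy measured in bits. As a minimal sketch, the function below computes the entropy of one column of a sequence alignment from its observed base frequencies; the name `column_entropy` is hypothetical, and the columns are made-up toy data rather than real sequences.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy, in bits, of the symbols observed in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


print(column_entropy("AAAA"))  # fully conserved position: zero bits
print(column_entropy("AACC"))  # two equally likely bases: 1.0 bit
print(column_entropy("ACGT"))  # all four bases equally likely: 2.0 bits
```

A fully conserved position needs no questions to predict, while a position where all four bases are equally likely needs two.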