1
IT & Health 2009: Summary. Thomas Nordahl Petersen
2
Teachers: Thomas Nordahl Petersen, Rasmus Wernersson, Lisbeth Nielsen Fink, Anders Gorm Pedersen, Bent Petersen, Ramneek Gupta, Thomas Blicher
3
Outline of the course
Topics will cover a general introduction to bioinformatics:
– Evolution
– DNA / Protein
– Alignment and scoring matrices: how does it work and what are the numbers
– Visualization of multiple alignments: phylogenetic trees and logo plots
– Commonly used databases: UniProt/GenBank and genome browsers
– Protein 3D-structure
– Artificial neural networks and case stories
– Practical use of bioinformatics tools
Preparation for exam
4
Topics covered - (some of them)
5
Information flow in biological systems
6
Amino Acids: amine and carboxyl groups. The side chain ‘R’ is attached to the C-alpha carbon. The amino acids found in living organisms are L-amino acids.
7
Amino Acids - peptide bond. N-terminal, C-terminal
8
1 and 3-letter codes
1. There are 20 standard naturally occurring amino acids
2. Normally the one- or three-letter codes are used
Ala - A, Cys - C, Asp - D, Glu - E, Phe - F
Gly - G, His - H, Ile - I, Lys - K, Leu - L
Met - M, Asn - N, Pro - P, Gln - Q, Arg - R
Ser - S, Thr - T, Val - V, Trp - W, Tyr - Y
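The code table above maps directly onto a lookup dictionary. The following is a minimal sketch; the helper names `to_one_letter` and `to_three_letter` are hypothetical and chosen only for illustration.

```python
# Three-letter to one-letter codes, copied from the slide's table.
THREE_TO_ONE = {
    "Ala": "A", "Cys": "C", "Asp": "D", "Glu": "E", "Phe": "F",
    "Gly": "G", "His": "H", "Ile": "I", "Lys": "K", "Leu": "L",
    "Met": "M", "Asn": "N", "Pro": "P", "Gln": "Q", "Arg": "R",
    "Ser": "S", "Thr": "T", "Val": "V", "Trp": "W", "Tyr": "Y",
}
# Reverse lookup: one-letter to three-letter code.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(residues):
    """Hypothetical helper: ['Met', 'Ala'] -> 'MA' (case-insensitive)."""
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


def to_three_letter(sequence):
    """Hypothetical helper: 'MA' -> 'Met-Ala' (case-insensitive)."""
    return "-".join(ONE_TO_THREE[c] for c in sequence.upper())
```

Both helpers raise `KeyError` on a code that is not in the table, which is usually what you want when checking input sequences.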
9
Theory of evolution: Charles Darwin, 1809-1882 (Center for Biological Sequence Analysis)
10
Phylogenetic tree
11
Global versus local alignments
Global alignment: align the full length of both sequences (the “Needleman-Wunsch” algorithm).
Local alignment: find the best partial alignment of two sequences (the “Smith-Waterman” algorithm).
[Figure: Seq 1 and Seq 2 shown in a global alignment and in a local alignment]
12
Pairwise alignment: the solution is “dynamic programming” (the Needleman-Wunsch algorithm)
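As a rough illustration of the dynamic-programming idea, the sketch below fills a score matrix cell by cell and returns only the best alignment score (no traceback). The scoring values (`match=1`, `mismatch=-1`, `gap=-2`) and the function name `align_score` are assumptions made for this example; real protein alignments would use a substitution matrix such as BLOSUM62 instead.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Best alignment score of a and b by dynamic programming.

    local=False: global alignment (Needleman-Wunsch).
    local=True:  local alignment (Smith-Waterman).
    The scoring parameters are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # Global alignment: leading gaps are penalized.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # Local alignment: a score never drops below zero,
                # and the best cell anywhere in the matrix wins.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell

    # Global alignment reads the score from the bottom-right corner.
    return best if local else score[-1][-1]
```

The only differences between the two variants are the initialization of the first row and column, the floor at zero, and where the final score is read from.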
13
Sequence alignment - BLAST
15
BLOSUM & PAM matrices
BLOSUM matrices are the most commonly used substitution matrices: BLOSUM50, BLOSUM62, BLOSUM80.
PAM: Point Accepted Mutation (often expanded as “Percent Accepted Mutations”).
– PAM-0 is the identity matrix.
– PAM-1: diagonal entries deviate slightly from 1, off-diagonal entries deviate slightly from 0.
– PAM-250 is PAM-1 raised to the 250th power (repeated multiplication of PAM-1 by itself).
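The relation between PAM-0, PAM-1 and PAM-250 can be sketched with plain matrix multiplication. The 2x2 matrix `TOY_PAM1` below is a hypothetical two-letter example invented for illustration; it is not taken from a real PAM matrix, which is 20x20.

```python
def mat_mul(x, y):
    """Multiply two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(m, n):
    """Return m raised to the n-th power; n = 0 gives the identity matrix."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, m)
    return result


# Hypothetical toy matrix over a two-letter alphabet (NOT real PAM data):
# each row gives the probability of keeping or changing a letter in one step.
TOY_PAM1 = [[0.99, 0.01],
            [0.01, 0.99]]

toy_pam0 = mat_power(TOY_PAM1, 0)      # identity matrix
toy_pam250 = mat_power(TOY_PAM1, 250)  # rows still sum to 1, diagonal has shrunk
```

Note that this operates on mutation probabilities; the substitution scores used in alignments are derived from such probabilities afterwards as log-odds values.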
16
Sequence profiles (1J2J.B)
>1J2J.B mol:aa PROTEIN TRANSPORT
NVIFEDEEKSKMLARLLKSSHPEDLRAANKLIKEMVQEDQKRMEK
17
Log-odds scores
BLOSUM is a log-likelihood matrix:
– The likelihood of observing j given that you have i is $P(j|i) = P_{ij} / P_i$
– The prior likelihood of observing j is $Q_j$, which is simply the frequency of j
– The log-likelihood score is $S_{ij} = 2 \log_2\left(P(j|i) / Q_j\right) = 2 \log_2\left(P_{ij} / (Q_i Q_j)\right)$, using $P_i = Q_i$
– where $\log_2(x) = \log_n(x) / \log_n(2)$
– S has been normalized to half bits, hence the factor 2
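A minimal sketch of the half-bit log-odds score, assuming the pair frequency P_ij and the background frequencies Q_i and Q_j are already known. The function name and the example numbers are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
import math


def log_odds_half_bits(p_ij, q_i, q_j):
    """S_ij = 2 * log2(P_ij / (Q_i * Q_j)), i.e. the score in half bits."""
    return 2 * math.log2(p_ij / (q_i * q_j))


# Hypothetical frequencies, not taken from a real BLOSUM data set.
# Pair seen 4x more often than expected by chance -> positive score.
enriched = log_odds_half_bits(0.25, 0.25, 0.25)
# Pair seen exactly as often as expected by chance -> score of zero.
neutral = log_odds_half_bits(0.0625, 0.25, 0.25)
# Pair seen half as often as expected by chance -> negative score.
depleted = log_odds_half_bits(0.03125, 0.25, 0.25)
```

Positive scores mark substitutions observed more often than chance would predict, negative scores mark substitutions observed less often. Published BLOSUM tables round these values to integers.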
18
BLAST Exercise
19
Genome browsers - UCSC. Intron-exon structure. Single nucleotide polymorphism (SNP)
20
SNPs
21
Protein 3D-structure
22
Protein structure
Primary structure: amino acid sequence
Secondary structure: helix / beta sheet
Tertiary structure: fold, 3D coordinates
23
Protein structure: α-helix
– 3₁₀-helix: 3 residues/turn, few, but not uncommon
– α-helix: 3.6 residues/turn, by far the most common helix
– π-helix: 4.1 residues/turn, very rare
24
Protein structure: β-strand/sheet
25
Protein folds
Class: the 4th class is ‘few secondary structures’
Architecture: overall shape of a domain
Topology: shared secondary structure connectivity
26
Protein 3D-structure
27
Neural Networks: from knowledge to information. Protein sequence → biological feature
28
Use of artificial neural networks
A data-driven method to predict a feature, given a set of training data.
In biology, input features could be amino acid or nucleotide sequences.
– Secondary structure prediction
– Signal peptide prediction
– Surface accessibility
– Propeptide prediction
[Figure: protein from N- to C-terminus: signal peptide, propeptide, mature/active protein]
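To feed an amino acid sequence to a neural network, it first has to be turned into numbers. The slides do not say which encoding is used; the sketch below assumes a sparse (one-hot) encoding over a sliding window, which is one common choice. The window size, the `X` padding symbol and the function names are all assumptions for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def one_hot(residue):
    """20-element vector with a single 1; unknown or padding residues give all zeros."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def encode_windows(sequence, window=5):
    """Yield (center_residue, input_vector) for every position in the sequence.

    `window` should be odd so that the window is centered on one residue.
    Positions beyond the sequence ends are padded with 'X' (all zeros).
    """
    half = window // 2
    padded = "X" * half + sequence + "X" * half
    for i, center in enumerate(sequence):
        vector = []
        for residue in padded[i:i + window]:
            vector.extend(one_hot(residue))
        yield center, vector


# Example: the first residues of the 1J2J.B sequence, window of 3 residues.
examples = list(encode_windows("NVIFE", window=3))
```

Each input vector has `window * 20` elements; during training it would be paired with the known feature (for example helix/strand/coil) of the central residue.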
29
Prediction of biological features: surface accessibility. Predict surface accessibility from the amino acid sequence only.
30
Logo plots: information content, how is it calculated and what does it mean?
31
Logo plots - Information Content
A logo plot (sequence logo) is a visualization of a multiple alignment.
Calculate the information content at each position: $I = \sum_a p_a \log_2 p_a + \log_2(4)$. The maximal value is 2 bits (for an alphabet of 4 letters, as in DNA).
The total height at a position is the ‘information content’ measured in bits.
The height of each letter is proportional to the frequency of that letter.
[Figure labels: ~0.5 each; completely conserved]
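The information content formula can be tried out on single alignment columns. This is a minimal sketch for a 4-letter (DNA) alphabet with no gap handling or small-sample correction; the function names are hypothetical.

```python
import math
from collections import Counter


def information_content(column, alphabet_size=4):
    """I = sum_a p_a * log2(p_a) + log2(alphabet_size), in bits.

    `column` is one column of a multiple alignment, e.g. "AACC".
    Letters that do not occur contribute nothing to the sum.
    """
    total = len(column)
    entropy_term = sum((n / total) * math.log2(n / total)
                       for n in Counter(column).values())
    return entropy_term + math.log2(alphabet_size)


def letter_heights(column, alphabet_size=4):
    """Height of each letter in the logo: its frequency times the column's I."""
    info = information_content(column, alphabet_size)
    total = len(column)
    return {letter: (n / total) * info for letter, n in Counter(column).items()}


conserved = information_content("AAAA")  # one letter only: maximal information
uniform = information_content("ACGT")    # all four letters equally frequent
half = letter_heights("AACC")            # two letters sharing the column
```

A completely conserved column reaches the maximum of 2 bits, a column with all four letters equally frequent carries no information, and for protein alignments `alphabet_size=20` would raise the maximum to log2(20) bits.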