1
Fa 05CSE182 CSE182-L6 Protein structure basics Protein sequencing
2
Fa 05CSE182 Announcements Midterm 1: Nov 1, in class. Assignment 2: Online, due October 20.
3
Fa 05CSE182 Distinguishing between families
4
Fa 05CSE182 Distinguishing between families Assignment 2
5
Fa 05CSE182 Profiles Start with an alignment of strings of length m over an alphabet A. Build an |A| × m matrix F = (f_ki), where each entry f_ki is the frequency of symbol k in position i. (Example entries shown on the slide: 0.71, 0.14, 0.28.)
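A minimal sketch of the construction described on this slide, assuming an ungapped alignment given as equal-length strings. The function name `build_profile` and the toy alignment are illustrative and not from the lecture.

```python
from collections import Counter

def build_profile(alignment, alphabet):
    """Return F as a dict where F[k][i] is the frequency of symbol k in column i."""
    m = len(alignment[0])
    if any(len(row) != m for row in alignment):
        raise ValueError("all aligned strings must have the same length m")
    n = len(alignment)
    profile = {k: [0.0] * m for k in alphabet}
    for i in range(m):
        # Count how often each symbol occurs in column i of the alignment.
        counts = Counter(row[i] for row in alignment)
        for k in alphabet:
            profile[k][i] = counts[k] / n
    return profile

# Toy ungapped alignment (made-up data, not from the slides).
alignment = ["ACGT", "ACGA", "ATGT", "ACCT"]
F = build_profile(alignment, "ACGT")
```

Each column of `F` sums to 1, so a column can be read as an empirical distribution over the alphabet at that position.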
6
Fa 05CSE182 Scoring Profiles [Figure: a string s scored against the profile entries f_ki (symbol k, position i) using a scoring matrix]
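The slide's exact formula did not survive extraction. One common way to score a string s against a profile is to take, at each position i, the frequency-weighted average of the scoring-matrix entries: the sum over k of f_ki times score(k, s_i). The sketch below assumes that definition; the function names and the toy match/mismatch scores are illustrative.

```python
def profile_column_score(profile, i, symbol, score):
    """Expected score of `symbol` against column i: sum over k of f_ki * score(k, symbol)."""
    return sum(freqs[i] * score(k, symbol) for k, freqs in profile.items())

def score_against_profile(profile, s, score):
    """Ungapped score of string s against the profile, position by position."""
    return sum(profile_column_score(profile, i, c, score) for i, c in enumerate(s))

def match_mismatch(a, b):
    """Toy scoring matrix (an assumption): +1 for a match, -1 for a mismatch."""
    return 1.0 if a == b else -1.0

# Toy profile over {A, C, G, T} with three columns (made-up frequencies).
profile = {
    "A": [1.0, 0.0, 0.0],
    "C": [0.0, 0.75, 0.25],
    "G": [0.0, 0.0, 0.75],
    "T": [0.0, 0.25, 0.0],
}
total = score_against_profile(profile, "ACG", match_mismatch)
```

For protein profiles, `score` would typically look up a substitution matrix such as BLOSUM or PAM instead of the toy function used here.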
7
Fa 05CSE182 PSI-BLAST idea Multiple alignments are important for capturing remote homology, and profile-based scores are a natural way to use them. Q: What if the query is a single sequence? A: Iterate: –Find homologs using BLAST on the query –Discard very similar homologs –Align, make a profile, search with the profile.
8
Fa 05CSE182 PSI-BLAST speed Two time-consuming steps: 1. Multiple alignment of homologs 2. Searching with profiles. Does the keyword search idea work? Multiple alignment: –Use ungapped multiple alignments only. Pigeonhole principle again: –If a profile of length m must score >= T –Then some sub-profile of length l must score >= lT/m –Generate all l-mers that score at least lT/m –Search using an automaton
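The pigeonhole step can be sketched as follows. This assumes the profile is given as a position-specific score matrix (`pssm[i][k]` is the score of symbol k at position i), that l divides m, and that the profile is cut into disjoint blocks of length l. The function name and the two-letter toy alphabet are illustrative; the automaton search over the resulting seeds is not shown.

```python
from itertools import product

def high_scoring_lmers(pssm, alphabet, l, T):
    """For each disjoint length-l block of the profile, list l-mers scoring >= l*T/m."""
    m = len(pssm)
    if m % l != 0:
        raise ValueError("this sketch assumes l divides m")
    cutoff = l * T / m
    seeds = {}
    for start in range(0, m, l):
        block = pssm[start:start + l]
        hits = []
        # Brute-force enumeration of all |alphabet|^l candidate l-mers.
        for lmer in product(alphabet, repeat=l):
            if sum(col[k] for col, k in zip(block, lmer)) >= cutoff:
                hits.append("".join(lmer))
        seeds[start] = hits
    return seeds

# Toy PSSM over a two-letter alphabet (made-up scores).
pssm = [
    {"A": 2, "C": -1},
    {"A": -1, "C": 2},
    {"A": 2, "C": -1},
    {"A": 2, "C": -1},
]
seeds = high_scoring_lmers(pssm, "AC", l=2, T=6)
```

Any string scoring at least T against the whole profile must contain at least one of these seeds at the matching offset, so the seeds can serve as keywords for a fast first-pass scan. Enumeration is exponential in l, which is why l is kept small in practice.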
9
Fa 05CSE182 Protein Domains An important realization (in the last decade) is that proteins have a modular architecture of domains/folds. Example: The zinc finger domain is a DNA-binding domain. What is a domain? –Part of a sequence that can fold independently, and is present in other sequences as well
10
Fa 05CSE182 Domain review What is a domain? How are domains expressed? –Motifs (regular expressions and others) –Multiple alignments –Profiles –Profile HMMs
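As an illustration of the first representation, a motif can be written as a regular expression and scanned against a sequence. The pattern below is a simplified C2H2-style zinc-finger shape (two cysteines, then two histidines, with bounded spacers). The spacer lengths are an assumption chosen for the example, not a curated database pattern, and the test sequence is synthetic.

```python
import re

# Illustrative C2H2-style motif: C, 2-4 residues, C, 12 residues, H, 3-5 residues, H.
# The exact spacer lengths here are an assumption for demonstration purposes.
ZINC_FINGER_LIKE = re.compile(r"C.{2,4}C.{12}H.{3,5}H")

def find_motif(sequence, pattern=ZINC_FINGER_LIKE):
    """Return (start, matched_text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]

# Synthetic sequence built to contain exactly one hit (not a real protein).
toy = "MK" + "C" + "AA" + "C" + "G" * 12 + "H" + "AAA" + "H" + "LL"
hits = find_motif(toy)
```

Regular expressions are all-or-nothing: a single substitution at a fixed position loses the hit, which is one motivation for the profile and profile-HMM representations listed on the slide.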
11
Fa 05CSE182 Domain databases Can you speed up HMM search?
12
Fa 05CSE182 A structural view of proteins
13
Fa 05CSE182 CS view of a protein
>sp|P00974|BPT1_BOVIN Pancreatic trypsin inhibitor precursor (Basic protease inhibitor) (BPI) (BPTI) (Aprotinin) - Bos taurus (Bovine).
MKMSRLCLSVALLVLLGTLAASTPGCDTSNQAKAQ
RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGG
CRAKRNNFKSAEDCMRTCGGAIGPWENL
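From this point of view a protein is just a string over a 20-letter alphabet with a header line. A minimal FASTA reader, written here as a sketch (the function name `parse_fasta` is illustrative, and the header is shortened), turns the record on this slide into that string:

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

record = """>sp|P00974|BPT1_BOVIN Pancreatic trypsin inhibitor precursor
MKMSRLCLSVALLVLLGTLAASTPGCDTSNQAKAQ
RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGG
CRAKRNNFKSAEDCMRTCGGAIGPWENL
"""
records = list(parse_fasta(record))
header, seq = records[0]
```

After parsing, `seq` is a single string with the line breaks removed, ready for the alignment and profile methods from the earlier slides.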
14
Fa 05CSE182 Protein structure basics
15
Fa 05CSE182 Side chains determine amino-acid type The residues may have different properties. Aspartic acid (D) and glutamic acid (E) are acidic residues.
16
Fa 05CSE182 Bond angles form structural constraints
17
Fa 05CSE182 Various constraints determine 3D structure Constraints: –Structural constraints due to physicochemical properties –Constraints due to bond angles –H-bond formation. Surprisingly, a few conformations are seen over and over again.
18
Fa 05CSE182 Alpha-helix 3.6 residues per turn. H-bonds between residue i and residue i+4 stabilize the structure. First described by Linus Pauling.
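The 3.6 residues-per-turn figure translates into a rotation of 360/3.6 = 100 degrees per residue, which is what a helical-wheel diagram plots. The sketch below works through that arithmetic and lists backbone hydrogen-bond partners. The offset of 4 (residue i to residue i+4) is the usual textbook spacing and is passed as a parameter, since it is an assumption here; the function names are illustrative.

```python
RESIDUES_PER_TURN = 3.6  # from the slide

def helical_wheel_angles(n):
    """Angular position (degrees, modulo 360) of each of n consecutive helix residues."""
    step = 360.0 / RESIDUES_PER_TURN  # 100 degrees per residue
    return [(i * step) % 360.0 for i in range(n)]

def backbone_hbond_pairs(n, offset=4):
    """Index pairs (i, i + offset) that could be H-bonded in an n-residue helix.

    offset=4 is the textbook alpha-helix spacing (an assumption of this sketch).
    """
    return [(i, i + offset) for i in range(n - offset)]

angles = helical_wheel_angles(5)
pairs = backbone_hbond_pairs(8)
```

Residues three to four positions apart land on the same face of the helix, which is why the stabilizing interactions are local in sequence.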
19
Fa 05CSE182 Beta-sheet Each strand by itself has 2 residues per turn, and is not stable. Adjacent strands hydrogen-bond to form stable beta-sheets, parallel or anti-parallel. Beta sheets have long range interactions that stabilize the structure, while alpha-helices have local interactions.
20
Fa 05CSE182 Domains The basic structures (helix, strand, loop) combine to form complex 3D structures. Certain combinations are popular. Many sequences, but only a few folds
21
Fa 05CSE182 3D structure Predicting tertiary structure is an important problem in bioinformatics. Premise: clues to structure can be found in the sequence. While de novo tertiary structure prediction is hard, there are many intermediate and tractable goals. The Protein Data Bank (PDB) is a compendium of structures.
22
Fa 05CSE182 Searching structure databases Threading and other 3D alignment methods can be used to align structures. Database filtering is possible through geometric hashing.
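A toy 2D sketch of geometric hashing, to show the filtering idea only. Real structure search would use 3D coordinates (for example C-alpha atoms) and three-point bases. Every name, the bin size, and the point sets below are assumptions made for illustration. Each ordered pair of model points defines a reference frame; the remaining points are expressed in that frame, quantized, and stored in a hash table. A query votes for (model, basis) entries that share its keys.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import cos, sin, radians, floor

BIN = 0.2  # quantization step for hash keys (arbitrary choice)

def frame_keys(points, a, b):
    """Quantized coordinates of the other points in the frame that maps
    points[a] to (0, 0) and points[b] to (1, 0)."""
    ax, ay = points[a]
    dx, dy = points[b][0] - ax, points[b][1] - ay
    d2 = dx * dx + dy * dy
    keys = []
    for j, (x, y) in enumerate(points):
        if j in (a, b):
            continue
        px, py = x - ax, y - ay
        u = (px * dx + py * dy) / d2  # component along the basis vector
        v = (py * dx - px * dy) / d2  # component perpendicular to it
        keys.append((floor(u / BIN), floor(v / BIN)))
    return keys

def build_table(models):
    """Hash every (model, basis pair) under the keys of its remaining points."""
    table = defaultdict(list)
    for name, pts in models.items():
        for a, b in permutations(range(len(pts)), 2):
            for key in frame_keys(pts, a, b):
                table[key].append((name, (a, b)))
    return table

def vote(table, query, a=0, b=1):
    """Count table entries hit by the query's keys for one query basis."""
    votes = Counter()
    for key in frame_keys(query, a, b):
        votes.update(table.get(key, []))
    return votes

def transform(points, angle_deg, scale, tx, ty):
    """Rotate, scale, and translate a point set (used to make a test query)."""
    c, s = cos(radians(angle_deg)), sin(radians(angle_deg))
    return [(scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty)
            for x, y in points]

models = {"toy": [(0.0, 0.0), (4.0, 0.0), (1.0, 2.0), (3.0, 3.0)]}
table = build_table(models)
query = transform(models["toy"], angle_deg=30, scale=2.0, tx=5.0, ty=-2.0)
votes = vote(table, query)
```

Because the frame coordinates are unchanged by rotation, translation, and uniform scaling, the transformed copy hits the same hash keys as the stored model. Only models with many votes need to be passed on to a full structural alignment.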
23
Fa 05CSE182 Trivia Quiz What research won the Nobel prize in Chemistry in 2004? In 2002?
24
Fa 05CSE182 How are Proteins Sequenced? Mass Spec 101:
25
Fa 05CSE182 Nobel Citation 2002
26
Fa 05CSE182 Nobel Citation, 2002
27
Fa 05CSE182 Mass Spectrometry
28
Fa 05CSE182 Sample Preparation Enzymatic Digestion (Trypsin) + Fractionation
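Trypsin digestion is commonly modeled in silico by cutting after every lysine (K) or arginine (R) unless the next residue is proline (P). The sketch below applies that rule, which is a simplification of the enzyme's real behavior (no missed cleavages are modeled), to a fragment of the BPTI sequence from the earlier slide. The function name is illustrative.

```python
def trypsin_digest(protein):
    """Cleave after K or R unless the next residue is P (common in-silico rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_res = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_res != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    # Keep any trailing peptide that does not end in K or R.
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides

fragment = "RPDFCLEPPYTGPCKARIIRYFYNAK"
peptides = trypsin_digest(fragment)
```

Concatenating the peptides in order reproduces the input, which is a quick sanity check on any digestion routine.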
29
Fa 05CSE182 Single Stage MS Mass Spectrometry LC-MS: 1 MS spectrum / second
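A single-stage spectrum records the mass-to-charge ratio (m/z) of intact peptide ions. As a sketch of the arithmetic, the code below sums standard monoisotopic residue masses (rounded to five decimals), adds one water for the peptide termini, and adds one proton per charge. The function names are illustrative, and modifications such as alkylated cysteine are ignored.

```python
# Monoisotopic residue masses in daltons (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # mass of one charge-carrying proton

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[r] for r in peptide) + WATER

def mz(peptide, charge):
    """m/z of the peptide carrying `charge` protons."""
    return (peptide_mass(peptide) + charge * PROTON) / charge

mass_ar = peptide_mass("AR")
mz_ar_1 = mz("AR", 1)
mz_ar_2 = mz("AR", 2)
```

Note that L and I have identical masses, so a mass spectrum alone cannot distinguish them.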
30
Fa 05CSE182 Tandem MS Secondary Fragmentation Ionized parent peptide
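In tandem MS the selected parent peptide ion is fragmented, typically along the backbone, giving prefix and suffix pieces. The slide does not name the ion types; the sketch below assumes the common b-ion (prefix) and y-ion (suffix) convention with singly charged fragments. The mass constants are standard monoisotopic values and the function name is illustrative.

```python
# Monoisotopic residue masses in daltons (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

def fragment_ions(peptide):
    """Singly charged b-ion (prefix) and y-ion (suffix) m/z values.

    Assumes the b/y convention: b_i = prefix residues + proton,
    y_i = suffix residues + water + proton.
    """
    masses = [MONO[r] for r in peptide]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b, y

b_ions, y_ions = fragment_ions("IIR")
```

Differences between consecutive b-ions (or consecutive y-ions) equal residue masses, which is what makes it possible to read a sequence off a fragmentation spectrum.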