Fa 05CSE182 CSE182-L6 Protein structure basics Protein sequencing.


1 Fa 05CSE182 CSE182-L6 Protein structure basics Protein sequencing

2 Fa 05CSE182 Announcements Midterm 1: Nov 1, in class. Assignment 2: Online, due October 20.

3 Fa 05CSE182 Distinguishing between families

4 Fa 05CSE182 Distinguishing between families Assignment 2

5 Fa 05CSE182 Profiles Start with an alignment of strings of length m over an alphabet A. Build an |A| × m matrix F = (f_ki), where each entry f_ki is the frequency of symbol k in position i. [Figure: example profile with entries such as 0.71, 0.14, 0.28]
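The profile construction described on this slide can be sketched as follows. This is a minimal illustration, not the course's reference code: the function name `build_profile`, the toy DNA alignment, and the choice to store F as a dictionary of rows are all assumptions made for the example.

```python
from collections import Counter

def build_profile(alignment, alphabet):
    """Return F as a dict: F[k][i] = frequency of symbol k in column i."""
    m = len(alignment[0])
    if any(len(s) != m for s in alignment):
        raise ValueError("all aligned strings must have the same length m")
    n = len(alignment)
    profile = {k: [0.0] * m for k in alphabet}
    for i in range(m):
        # Count how often each symbol appears in column i.
        counts = Counter(s[i] for s in alignment)
        for k in alphabet:
            profile[k][i] = counts[k] / n
    return profile

# Hypothetical toy alignment over the alphabet {A, C, G, T}.
toy_alignment = ["ACGT", "ACGA", "TCGT", "ACTT"]
F = build_profile(toy_alignment, "ACGT")
```

Each column of F sums to 1, since every aligned string contributes exactly one symbol per position.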

6 Fa 05CSE182 Scoring Profiles [Figure: scoring a string s against profile column i, using the frequencies f_ki and a scoring matrix]
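The slide's figure is not recoverable from the transcript, so the following is only one plausible reading: score a symbol s against column i as the frequency-weighted average of the scoring-matrix entries, sum over k of f_ki * M[k][s], and sum these over the columns. The function name `score_window`, the toy profile, and the +1/-1 scoring matrix are assumptions for illustration.

```python
def score_window(profile, scoring, window):
    """Score an ungapped window against a profile.

    profile[k][i] is the frequency f_ki; scoring[k][s] is the matrix entry
    for aligning symbol k with symbol s (assumed weighting scheme).
    """
    total = 0.0
    for i, s in enumerate(window):
        total += sum(profile[k][i] * scoring[k][s] for k in profile)
    return total

# Hypothetical 2-column profile: column 0 is always A, column 1 is C or G.
toy_profile = {
    "A": [1.0, 0.0],
    "C": [0.0, 0.5],
    "G": [0.0, 0.5],
    "T": [0.0, 0.0],
}
# Hypothetical scoring matrix: +1 for a match, -1 for a mismatch.
toy_scoring = {a: {b: (1 if a == b else -1) for b in "ACGT"} for a in "ACGT"}

score_ac = score_window(toy_profile, toy_scoring, "AC")
score_tt = score_window(toy_profile, toy_scoring, "TT")
```

With this weighting, a symbol that matches a mixed column scores less than one that matches a fully conserved column, which is the behaviour a profile is meant to capture.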

7 Fa 05CSE182 Psi-BLAST idea Multiple alignments are important for capturing remote homology. Profile-based scores are a natural way to handle this. Q: What if the query is a single sequence? A: Iterate: –Find homologs using BLAST on the query –Discard very similar homologs –Align, make a profile, search with the profile.
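The "discard very similar homologs" step can be illustrated with a small filter. This is a sketch under simplifying assumptions, not what Psi-BLAST itself does: hits are taken to be already aligned to the query without gaps, similarity is plain fractional identity, and the 0.95 cutoff and the function names are hypothetical.

```python
def fraction_identity(a, b):
    """Fraction of positions at which two equal-length strings agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)

def discard_near_duplicates(query, hits, max_identity=0.95):
    """Keep only hits that are not almost identical to the query."""
    return [h for h in hits if fraction_identity(query, h) <= max_identity]

# Hypothetical query and hits.
query = "ACDEFGHIKL"
hits = ["ACDEFGHIKL", "ACDEFGHIKV", "ACDEFAAAAA"]
kept = discard_near_duplicates(query, hits)
```

Near-identical hits add little information to the profile, so dropping them keeps the column frequencies from being dominated by copies of the query.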

8 Fa 05CSE182 Psi-BLAST speed Two time-consuming steps: 1. Multiple alignment of homologs 2. Searching with profiles. Does the keyword search idea work? Multiple alignment: –Use ungapped multiple alignments only. Pigeonhole principle again: –If a profile of length m must score >= T –Then some sub-profile of length l must score >= lT/m –Generate all l-mers that score at least lT/m –Search using an automaton
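The l-mer generation step can be sketched as a pruned enumeration. The representation below (a sub-profile as a list of per-column score dictionaries) and the names `sub_threshold` and `high_scoring_lmers` are assumptions for illustration; the automaton search that follows is not shown.

```python
def sub_threshold(T, l, m):
    """Pigeonhole bound: some length-l sub-profile must score at least lT/m."""
    return l * T / m

def high_scoring_lmers(sub_profile, threshold):
    """Enumerate all l-mers scoring >= threshold against a sub-profile.

    sub_profile is a list of dicts, one per column, mapping symbol -> score.
    """
    l = len(sub_profile)
    # best_suffix[i] = best achievable score from column i to the end.
    best_suffix = [0.0] * (l + 1)
    for i in range(l - 1, -1, -1):
        best_suffix[i] = best_suffix[i + 1] + max(sub_profile[i].values())

    results = []

    def extend(i, prefix, score):
        # Prune: even the best completion cannot reach the threshold.
        if score + best_suffix[i] < threshold:
            return
        if i == l:
            results.append(("".join(prefix), score))
            return
        for symbol, s in sub_profile[i].items():
            prefix.append(symbol)
            extend(i + 1, prefix, score + s)
            prefix.pop()

    extend(0, [], 0.0)
    return results

# Hypothetical sub-profile of length l = 2 from a profile with m = 4, T = 8.
toy_sub_profile = [{"A": 2, "C": 0}, {"A": 1, "C": 3}]
words = high_scoring_lmers(toy_sub_profile, sub_threshold(T=8, l=2, m=4))
```

The bound holds because if every one of the m/l disjoint sub-profiles scored below lT/m, the total would fall below T.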

9 Fa 05CSE182 Protein Domains An important realization (in the last decade) is that proteins have a modular architecture of domains/folds. Example: The zinc finger domain is a DNA-binding domain. What is a domain? –Part of a sequence that can fold independently, and is present in other sequences as well

10 Fa 05CSE182 Domain review What is a domain? How are domains expressed? –Motifs (regular expressions and others) –Multiple alignments –Profiles –Profile HMMs
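As an illustration of the motif representation in the list above, a domain signature can be written as a regular expression. The pattern below is a deliberately simplified, assumed stand-in for a zinc-finger-like signature (two cysteines and two histidines at fixed spacings), not a curated database entry, and the test sequence is made up.

```python
import re

# Simplified, hypothetical zinc-finger-like motif:
# C, 2-4 residues, C, 12 residues, H, 3-5 residues, H.
ZINC_FINGER_LIKE = re.compile(r"C.{2,4}C.{12}H.{3,5}H")

def find_motif(sequence, pattern=ZINC_FINGER_LIKE):
    """Return (start, matched_text) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]

# Hypothetical sequence containing one occurrence of the motif.
example_sequence = "MSCPECGKSFSRSDHLALHQRTHTG"
matches = find_motif(example_sequence)
```

A regular expression either matches or does not, which is why the later entries in the list (profiles and profile HMMs) are used when a graded score is needed.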

11 Fa 05CSE182 Domain databases Can you speed up HMM search?

12 Fa 05CSE182 A structural view of proteins

13 Fa 05CSE182 CS view of a protein
>sp|P00974|BPT1_BOVIN Pancreatic trypsin inhibitor precursor (Basic protease inhibitor) (BPI) (BPTI) (Aprotinin) - Bos taurus (Bovine).
MKMSRLCLSVALLVLLGTLAASTPGCDTSNQAKAQ
RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGG
CRAKRNNFKSAEDCMRTCGGAIGPWENL
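From a computational point of view the record above is just a header plus a string over the 20-letter amino-acid alphabet. A minimal sketch of reading it follows; the parser name `parse_fasta` is an assumption, and real projects would more likely use an established library.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

BPTI_FASTA = """>sp|P00974|BPT1_BOVIN Pancreatic trypsin inhibitor precursor
MKMSRLCLSVALLVLLGTLAASTPGCDTSNQAKAQ
RPDFCLEPPYTGPCKARIIRYFYNAKAGLCQTFVYGG
CRAKRNNFKSAEDCMRTCGGAIGPWENL
"""

header, sequence = parse_fasta(BPTI_FASTA)[0]
accession = header.split("|")[1]
```

Here `sequence` is the concatenation of the three sequence lines and `accession` is the second field of the pipe-delimited header.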

14 Fa 05CSE182 Protein structure basics

15 Fa 05CSE182 Side chains determine amino-acid type The residues may have different properties. Aspartic acid (D), and Glutamic Acid (E) are acidic residues

16 Fa 05CSE182 Bond angles form structural constraints

17 Fa 05CSE182 Various constraints determine 3D structure Constraints –Structural constraints due to physicochemical properties –Constraints due to bond angles –H-bond formation Surprisingly, a few conformations are seen over and over again.

18 Fa 05CSE182 Alpha-helix 3.6 residues per turn. H-bonds between residue i and residue i+4 stabilize the structure. First proposed by Linus Pauling
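The geometry on this slide can be turned into a small calculation: 3.6 residues per turn means each residue is rotated 360/3.6 = 100 degrees from the previous one. The sketch below also lists backbone H-bond partners under the usual convention that the C=O of residue i pairs with the N-H of residue i+4; the function names and the 0-based indexing are assumptions for illustration.

```python
RESIDUES_PER_TURN = 3.6  # from the slide

def helix_angles(n_residues):
    """Angular position (degrees) of each residue around the helix axis."""
    step = 360.0 / RESIDUES_PER_TURN  # about 100 degrees per residue
    return [(i * step) % 360.0 for i in range(n_residues)]

def helix_hbond_pairs(n_residues, offset=4):
    """0-based (i, i+offset) residue pairs joined by backbone H-bonds."""
    return [(i, i + offset) for i in range(n_residues - offset)]

angles = helix_angles(5)
pairs = helix_hbond_pairs(6)
```

Because partners are only four positions apart in the sequence, the stabilizing interactions in a helix are local.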

19 Fa 05CSE182 Beta-sheet Each strand by itself has 2 residues per turn, and is not stable. Adjacent strands hydrogen-bond to form stable beta-sheets, parallel or anti-parallel. Beta sheets have long range interactions that stabilize the structure, while alpha-helices have local interactions.

20 Fa 05CSE182 Domains The basic structures (helix, strand, loop) combine to form complex 3D structures. Certain combinations are popular. Many sequences, but only a few folds

21 Fa 05CSE182 3D structure Predicting tertiary structure is an important problem in bioinformatics. Premise: Clues to structure can be found in the sequence. While de novo tertiary structure prediction is hard, there are many intermediate and tractable goals. The PDB database is a compendium of structures.

22 Fa 05CSE182 Searching structure databases Threading and other 3D alignments can be used to align structures. Database filtering is possible through geometric hashing.

23 Fa 05CSE182 Trivia Quiz What research won the Nobel prize in Chemistry in 2004? In 2002?

24 Fa 05CSE182 How are Proteins Sequenced? Mass Spec 101:

25 Fa 05CSE182 Nobel Citation 2002

26 Fa 05CSE182 Nobel Citation, 2002

27 Fa 05CSE182 Mass Spectrometry

28 Fa 05CSE182 Sample Preparation Enzymatic Digestion (Trypsin) + Fractionation
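The enzymatic digestion step can be simulated in silico. The sketch below uses the commonly stated rule that trypsin cleaves on the C-terminal side of K or R unless the next residue is P; it ignores missed cleavages and other real-world effects, and the function name and example peptide are hypothetical.

```python
def trypsin_digest(sequence):
    """Split a protein sequence into tryptic peptides (no missed cleavages)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        # Cleave after K or R, except when the following residue is P.
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

# Hypothetical input: K followed by R is cleaved, R followed by P is not.
peptides = trypsin_digest("AAKRPDKGG")
```

Concatenating the returned peptides always reproduces the input sequence, which is a convenient sanity check.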

29 Fa 05CSE182 Single Stage MS Mass Spectrometry LC-MS: 1 MS spectrum / second
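A single-stage spectrum records mass-to-charge ratios, so it helps to see how a peptide's m/z follows from its sequence. The sketch below sums monoisotopic residue masses, adds one water for the free termini, and adds one proton per charge; the rounded mass constants and function names are assumptions of this example rather than values from the slides.

```python
# Monoisotopic residue masses in daltons (rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # mass of one charge-carrying proton

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def mass_to_charge(peptide, charge=1):
    """m/z of the peptide carrying `charge` protons."""
    return (peptide_mass(peptide) + charge * PROTON) / charge

mass_ga = peptide_mass("GA")
mz_ga_1 = mass_to_charge("GA", charge=1)
mz_ga_2 = mass_to_charge("GA", charge=2)
```

Note that L and I have identical masses, so a mass measurement alone cannot tell them apart.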

30 Fa 05CSE182 Tandem MS Secondary Fragmentation Ionized parent peptide

