1
DNA Motif and protein domain discovery
Presented by: Deeter Neumann, Peter St. Andre
Images: PDB, human enhancer binding protein; PDB, zinc finger 224
2
Outline
What are DNA motifs and protein domains?
Their importance and function
Motif algorithms
Locating domains/motifs experimentally
Available programs: PFAM & SMART
Image taken from wikimedia.org
3
What are DNA sequence motifs?
“Sequence motifs are short recurring patterns in DNA that are presumed to have biological function.” D’haeseleer, P. Nature Biotechnology 24, (2006). Image taken from bio.miami.edu
4
Why are DNA sequence motifs important to know?
Indicate common structural protein domains
Identify similar function
Other possible biological functions, e.g. transcription factors, mRNA processing
5
What is the function of DNA domains?
Specific and non-specific interactions
Permits binding of a transcription factor to its target gene
Sequence-specific recognition
Precise collection of sequence elements
Source: Human Molecular Genetics 3; Strachan & Read
6
What are protein domains?
Protein sequences and structures that evolve, function, and exist independently of the rest of the protein
They often form functional units, such as metal-binding domains
Image of human zinc finger domain taken from .ionchannels.org
7
Why are Protein Domains Important?
Bind to other molecules in the cell
Signal transduction pathways
Genetically engineering novel proteins
Pharmaceutical importance
8
Algorithmic Approaches for both DNA motifs and protein domain searches
Three general approaches are used:
Enumeration
Deterministic optimization
Probabilistic optimization
9
Enumeration
Employs the broadest approach
Looks at all possible motifs
Few restrictions are placed on the search
10
Enumeration, cont.
Key point: covers all possible sequence motifs with few limitations
Pros: does not get stuck in a local optimum
Cons: may overlook subtle patterns
Programs like WeederWeb and YMF use this type of algorithm
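The enumerative idea can be illustrated with a minimal sketch that counts, for every possible DNA word of length k, how many input sequences contain it. This is not how WeederWeb or YMF are implemented (those tools allow mismatches and apply statistical scoring); the function name `enumerate_motifs` and the toy sequences are made up for illustration, and only exact matches are counted.

```python
from collections import Counter
from itertools import product

def enumerate_motifs(sequences, k):
    """Count, for every possible DNA k-mer, how many sequences contain it."""
    support = Counter()
    for seq in sequences:
        # A set lets each sequence vote at most once per k-mer.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        support.update(seen)
    # Walk the full 4**k search space so absent k-mers are reported as 0.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: support[kmer] for kmer in all_kmers}

# Hypothetical toy input: three short sequences sharing one 6-mer.
toy_sequences = ["TTGACGTCAA", "CCTGACGTAG", "GATGACGTTT"]
counts = enumerate_motifs(toy_sequences, k=6)
best = max(counts, key=counts.get)
print(best, counts[best])
```

Because the whole search space is visited, there is no starting guess and therefore no local optimum to get stuck in; the cost is that the space grows as 4**k.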
11
WeederWeb
12
WeederWeb Results
13
Deterministic optimization
Uses an expectation maximization (EM) model together with a position weight matrix
MEME is one program that uses this approach
What does this mean?
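One way to picture the combination of expectation maximization and a position weight matrix is the sketch below. It is a simplified illustration, not MEME's actual implementation: it assumes exactly one motif occurrence per sequence, a uniform background, and a fixed pseudocount, and the names `pwm_from_sites`, `em_step`, and `consensus` are hypothetical.

```python
import math

ALPHABET = "ACGT"
BACKGROUND = 0.25  # assumption: uniform background base frequency

def pwm_from_sites(sites, pseudocount=1.0):
    """Build a position weight matrix: one dict of base probabilities per column."""
    total = len(sites) + pseudocount * len(ALPHABET)
    return [
        {b: (column.count(b) + pseudocount) / total for b in ALPHABET}
        for column in zip(*sites)
    ]

def em_step(sequences, pwm, pseudocount=1.0):
    """One EM iteration, assuming exactly one motif occurrence per sequence."""
    width = len(pwm)
    counts = [{b: pseudocount for b in ALPHABET} for _ in range(width)]
    for seq in sequences:
        windows = [seq[i:i + width] for i in range(len(seq) - width + 1)]
        # E-step: score each window as motif versus background, then normalise
        # so the scores become posterior probabilities of the site position.
        scores = [
            math.prod(pwm[j][base] / BACKGROUND for j, base in enumerate(w))
            for w in windows
        ]
        norm = sum(scores)
        # M-step: every window contributes fractional counts to the new matrix.
        for w, score in zip(windows, scores):
            for j, base in enumerate(w):
                counts[j][base] += score / norm
    return [{b: col[b] / sum(col.values()) for b in ALPHABET} for col in counts]

def consensus(pwm):
    """Most probable base in each column."""
    return "".join(max(col, key=col.get) for col in pwm)

# Hypothetical toy input and a deliberately imperfect starting guess.
toy_sequences = ["TTGACGTCAA", "CCTGACGTAG", "GATGACGTTT"]
pwm = pwm_from_sites(["TGACGA"])
for _ in range(10):
    pwm = em_step(toy_sequences, pwm)
print(consensus(pwm))
```

The procedure is deterministic: the same starting matrix always leads to the same result, which is why such methods can settle into a local optimum and are usually restarted from several starting points.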
14
Deterministic optimization, cont.
15
Deterministic optimization, cont.
Taken from ws.nbcr.net/app/meme.html
16
Probabilistic optimization
Uses a Gibbs sampling approach
Randomized implementation of the expectation maximization model
How is this applied?
17
Probabilistic optimization, cont.
Selects random sites, each of which is weighted against known motifs
Allows the program to add or remove sequences and continuously update motifs
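A minimal sketch of a Gibbs site sampler is given here to make the "random site, weight, update" loop concrete. It is a textbook-style simplification that assumes one site per sequence and does not add or remove sequences the way AlignACE does; `profile_from`, `gibbs_sample`, the pseudocount, and the toy input are illustrative assumptions.

```python
import random

ALPHABET = "ACGT"

def profile_from(sites, pseudocount=1.0):
    """Column-wise base probabilities for a set of aligned sites."""
    total = len(sites) + pseudocount * len(ALPHABET)
    return [
        {b: (column.count(b) + pseudocount) / total for b in ALPHABET}
        for column in zip(*sites)
    ]

def gibbs_sample(sequences, width, iterations=200, seed=0):
    """Return one sampled motif start position per sequence."""
    rng = random.Random(seed)
    # Start from one random site per sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in sequences]
    for _ in range(iterations):
        held_out = rng.randrange(len(sequences))
        # Rebuild the motif model from every site except the held-out one.
        sites = [
            s[p:p + width]
            for i, (s, p) in enumerate(zip(sequences, starts))
            if i != held_out
        ]
        profile = profile_from(sites)
        seq = sequences[held_out]
        # Weight each candidate window by its probability under the profile.
        weights = []
        for p in range(len(seq) - width + 1):
            w = 1.0
            for j in range(width):
                w *= profile[j][seq[p + j]]
            weights.append(w)
        # Randomized step: sample the new site in proportion to its weight.
        starts[held_out] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts

# Hypothetical toy input.
toy_sequences = ["TTGACGTCAA", "CCTGACGTAG", "GATGACGTTT"]
starts = gibbs_sample(toy_sequences, width=6)
print([s[p:p + 6] for s, p in zip(toy_sequences, starts)])
```

Because sites are sampled rather than chosen greedily, different seeds can give different answers, which is one reason rerunning such programs is worthwhile.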
18
AlignAce 3.0
19
Results
20
Which one to use?
Recent research showed that enumeration approaches worked very well
Generally accepted that no one approach is the best
Programs that incorporate several approaches work the best
Important to rerun programs
21
Examples of programs
WeederWeb is a web-based interface with an enumerative approach
YMF is another enumerative program
MEME is an online program that uses a deterministic optimization approach
MotifSampler is a program that combines Gibbs sampling and a third-order Markov model
22
YMF
23
YMF results
24
Measurements used to score sequence motifs
Three main statistics used:
Information content
Log likelihood
MAP score
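As an illustration of the first statistic, the sketch below computes the information content of a set of aligned sites as the sum over columns of p * log2(p / background). A uniform background of 0.25 per base and the absence of any small-sample correction are assumptions made here for simplicity; the function name is hypothetical.

```python
import math

def information_content(sites, background=0.25):
    """Total information content (bits) of aligned DNA sites of equal length."""
    n = len(sites)
    total = 0.0
    for column in zip(*sites):
        for base in "ACGT":
            p = column.count(base) / n
            if p > 0:  # a base that never occurs contributes nothing
                total += p * math.log2(p / background)
    return total

# Perfectly conserved columns carry 2 bits each; uniform columns carry 0 bits.
print(information_content(["ACGT", "ACGT"]))
print(information_content(["A", "C", "G", "T"]))
```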
25
Other measures of motif quality
Group specificity, or site specificity
  Probability of having a certain number of target sequences with the site in question
Sequence specificity
  Accounts for both the number of sequences with the sites in question and the number of sites per sequence
Positional bias, or uniformity
  Looks at how uniformly the sites in question are distributed with respect to the transcription start sites of the genes
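Group specificity can be sketched as a hypergeometric tail probability: the chance that at least the observed number of target sequences contain the site if sites were scattered over all genes at random. The slide does not give a formula, so the hypergeometric null, the function name, and the example numbers below are all assumptions made for illustration.

```python
from math import comb

def group_specificity(total_genes, target_genes, genes_with_site, overlap):
    """P(X >= overlap) when genes_with_site genes are drawn at random."""
    upper = min(target_genes, genes_with_site)
    # Number of ways to pick the site-containing genes with i of them in the target set.
    favourable = sum(
        comb(target_genes, i) * comb(total_genes - target_genes, genes_with_site - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(total_genes, genes_with_site)

# Hypothetical numbers: 6000 genes, 20 in the target group,
# 30 genes carry the site, 10 of which are in the target group.
print(group_specificity(total_genes=6000, target_genes=20, genes_with_site=30, overlap=10))
```

A smaller value means the site is more specific to the target group.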
26
Identification and preliminary characterization of a protein motif related to the zinc finger
Lovering et al. (1993)
27
What is a zinc finger?
Autonomously folding domain
Structural motif
Zinc required for folding and DNA interactions
Part of a protein that is used to regulate DNA
Image: PDB; single zinc finger in solution
28
Classic zinc finger
Conserved cysteines and histidines bind zinc
Tetrahedral structure
Antiparallel two-stranded β-sheet and an α-helix
1st class, C2H2 (classic): conserved Cys and His form a loop (finger), often tandemly repeated
2nd class, steroid/nuclear receptor: 2 Zn for a single-fold domain, 4 cysteines for each zinc
3rd class, GAL4 DNA-binding domain
Common feature of the 3 classes: α-helix in the major groove for interaction
N-terminal end of the α-helix interacts with DNA
Image from wikipedia
29
Figure 1A
377 amino acids
Glycine-rich region (27%)
Cysteine-rich region near N terminus (residues 15-64, indicated by boxes)
Is there anything else you can tell me about this sequence?
Lovering et al.
30
Actual RING1 sequence
MTTPANAQNASKTWELSLYELHRTPQEAIMDGTEIAVSPRSLHSELMCPICLDMLKNTMTTKECLHRFCSDCIVTALRSGNKECPTCRKKLVSKRSLRPDPNFDALISKIYPSREEYEAHQDRVLIRLSRLHNQQALSSSIEEGLRMQAMHRAQRVRRPIPGSDQTTTMSGGEGEPGEGEGDGEDVSSDSAPDSAPGPAPKRPRGGGAGGSSVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGPPSPPGAPSPPEPGGEIELVFRPHPLLVEKGEYCQTRYVKTTGNATVDHLSKYLALRIALERRQQQEAGEPGGPGGGASDTGGPDGCGGEGGGAGGGDGPEEPALPSLEGVSEKQYTIYIAPGGGAFTTLNGSLTLELVNEKFWKVSRPLELCYAPTKDPK
406 amino acids (compared to 377)
Residues (not 15-64)
31
RING finger
Cys1-Xaa-hydrophobic aa-Cys2-Xaa9-27-Cys3-Xaa1-3-His-Xaa-hydrophobic aa-Cys4-Xaa2-Cys5-hydrophobic aa-Xaa5-47-Cys6-Xaa2-Cys7
Summary of extended motif
Cys and His always conserved in RING finger family
Conservation:
  one residue before Cys 2, 3, 4 (Phe), 7
  one residue after Cys 5, 6 (Pro), 7
  generally hydrophobic
Important for secondary and tertiary structure
Positioning and orientation of metal ligand
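The consensus on this slide can be written as a regular expression and searched against the N-terminal part of the RING1 sequence from the previous slide. This is only a sketch: the set of residues counted as "hydrophobic aa" is an assumption, the names `RING_PATTERN` and `ring1_nterm` are made up, and only the first 90 residues of the sequence are included.

```python
import re

# Assumption: "hydrophobic aa" is taken to mean one of these residues.
HYD = "[AVLIMFWY]"

# Each entry follows one element of the consensus, in order.
parts = [
    "C", ".", HYD, "C",   # Cys1-Xaa-hydrophobic aa-Cys2
    ".{9,27}", "C",       # Xaa9-27-Cys3
    ".{1,3}", "H",        # Xaa1-3-His
    ".", HYD, "C",        # Xaa-hydrophobic aa-Cys4
    ".{2}", "C",          # Xaa2-Cys5
    HYD,                  # hydrophobic aa
    ".{5,47}", "C",       # Xaa5-47-Cys6
    ".{2}", "C",          # Xaa2-Cys7
]
RING_PATTERN = re.compile("".join(parts))

# First 90 residues of the RING1 sequence shown on the previous slide.
ring1_nterm = (
    "MTTPANAQNASKTWELSLYELHRTPQEAIMDGTEIAVSPRSLHSELMCPI"
    "CLDMLKNTMTTKECLHRFCSDCIVTALRSGNKECPTCRKK"
)

match = RING_PATTERN.search(ring1_nterm)
if match:
    # Report 1-based residue positions, as protein coordinates usually are.
    print(match.start() + 1, match.end(), match.group(0))
else:
    print("no RING finger consensus found")
```

A pattern like this is strict about spacing and residue identity; profile methods such as the HMMs used by Pfam and SMART tolerate more variation.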
32
Figure 1B
Gene expression similar in a variety of cell lines
Present as 1.6 kb
In agreement with cDNA without poly-A tail
Fig. 1B Lovering et al.
33
Figure 2
DNA binding, regulation, recombination, repair
Avg length
  1st loop: 11±1 aa (NOT CG30 and PE38)
  2nd loop: 10±3 aa (NOT T18)
Partially symmetric? A common structural arrangement for protein-DNA binding motifs
Regulation: Xenopus (XNF-7); Drosophila (sina, Psc, Su(z)2); Dictyostelium (DG17); mouse (Rpt-1, down-regulation of IL2R); yeast (PAS4)
Recombination: RAG-1
Repair: RAD18 (yeast)
Human oncogenes: MEL18, BMI-1, RET, T18, CBL
Viral early genes: E110, VZ61, CG30, PE-38, EPO
Viral genes: LCMV P11
Ribonucleoprotein complex: SS-A/Ro
Lovering et al.
34
RING1 peptide
55 aa synthetic peptide (residues in RING1 seq)
RING finger metal binding: prefers zinc, cobalt, cadmium, copper
No binding of iron, manganese, or magnesium
Metal binding was tested by titrating the synthetic peptide with different metals in the presence of cobalt
35
Figure 3A
Labels: S-Co(II); Co(II) d-d transitions
Legend: solid line, cobalt; dashed line, zinc
S-Co(II) indicated that cysteine is involved in the binding
Co d-d transitions indicate a tetrahedral configuration
As zinc concentration increases, the characteristic cobalt transitions diminish
Zinc competes with cobalt
Zinc binds 10x more tightly than cobalt
Fig. 3A Lovering et al.
36
Figure 4A
Zinc-dependent binding
Gel mobility-shift assays
Forms a discrete band in excess zinc
DNA binding is Zn dependent!
37
RING1 function
At the time of the paper, RING1 function was unknown (1992; not published until 1993)
2004: found to inhibit transactivation of recombination signal binding protein-J (RBP-J) (Hongyan et al.)
Ubiquitin-protein ligases
38
Pfam database http://pfam.sanger.ac.uk/
Database that contains a large collection of protein domains and families
Represented as sequence alignments and HMMs
Lists key features of a protein
New interface that combines other Pfam versions
New updates have made it more user-friendly
39
Pfam search of RING1
40
Pfam search
41
Pfam search results
42
Pfam search results
43
Pfam link out
44
HMM logo of sequence motif
45
SMART http://smart.embl-heidelberg.de/
Multiple sequence alignment of members
>400 domains in >54,000 different proteins
Searches database using HMMs
Facilitates study of domain evolution and multi-domain architecture by correlation with phyletic distributions
Detects domains based on multiple sequence alignments of representative family members
Intended for eukaryotic signal transduction; DNA, RNA, chromatin, and actin cytoskeleton functions
HMMs (HMMer2) used to improve sensitivity of domain and repeat detection
46
SMART
2 different modes
Normal: Swiss-Prot, SP-TrEMBL, Ensembl
Genomic: proteomes of sequenced genomes
47
SMART SMART home page Normal mode Genome mode
48
SMART
49
SMART
50
SMART
51
SMART
52
SMART
Interactions of proteins that are associated with LMNA
Different colored lines represent different evidence for a relationship:
  experiments: purple
  databases: aqua
  text mining: yellow
  homology: light blue
53
SMART
54
More motif madness
55
PRINTS
56
PRINTS
57
PROSITE
58
PROSITE
59
Questions?
60
How primitive is this RING-finger motif?
The author only discusses genes containing this motif that come from eukaryotes. Is this motif found in prokaryotes as well?