KSU Symposium of Student Scholars 2015
REPSA-Directed Identification of DNA-Binding Specificity for Orphan Transcription Factors
Kamir Hiam
Van Dyke Lab, Dept. of Biochemistry

The Power of Modern Genetics
Genetic Sequencing
  Raw data: a series of bases ≠ cellular function
  Further experimentation & data analysis needed
Central Dogma of Biology
  DNA → RNA → Protein
  Proteins are responsible for the majority of cellular activities
Transcription factors are proteins that bind to DNA to regulate gene expression

The Incomplete Tale of E. coli
Complete genome sequenced in 1997 (4.6+ Mbp)
  4290 open reading frames
  240 potential TFs
  Detailed binding profiles for only 68
Understanding orphan regulatory proteins can improve public health
  Microbial disease
  Human microbiome

REPSA and Combinatorial Methods
All combinatorial methods use large pools of randomized oligonucleotides; they differ in the selection step
(A) CASTing: physical separation of the bound protein-DNA complex
  Requires prior knowledge of the protein
(B) REPSA: bound protein protects the DNA from enzymatic cleavage
  No prior knowledge of the protein needed
REPSA is the optimal technique for transcription factor discovery

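The selection principle in (B) can be illustrated with a toy simulation. This is only a sketch under loudly stated assumptions, not the lab's protocol: the motif `GATC`, the pool size, the 90% cleavage probability, and the rule that any sequence containing the motif is fully protected are all hypothetical values chosen for illustration. The 20-nt random region matches the selection template shown on the next slide.

```python
import random

BASES = "ACGT"

def random_pool(size, length, rng):
    """Return `size` random oligonucleotides of `length` bases each."""
    return ["".join(rng.choice(BASES) for _ in range(length)) for _ in range(size)]

def bound_fraction(pool, motif):
    """Fraction of the pool that contains the (hypothetical) binding motif."""
    return sum(motif in seq for seq in pool) / len(pool)

def repsa_round(pool, motif, cleave_prob, rng):
    """One toy round: protected sequences survive, the rest are mostly cleaved.

    Assumption: a sequence containing the motif is always protected by bound
    protein; an unprotected sequence escapes cleavage with probability
    1 - cleave_prob. Survivors are then resampled back to the original pool
    size, standing in for PCR amplification.
    """
    survivors = [seq for seq in pool if motif in seq or rng.random() >= cleave_prob]
    return [rng.choice(survivors) for _ in range(len(pool))]

def simulate(rounds, motif="GATC", pool_size=5000, length=20, cleave_prob=0.9, seed=0):
    """Return the motif-containing fraction before selection and after each round."""
    rng = random.Random(seed)
    pool = random_pool(pool_size, length, rng)
    fractions = [bound_fraction(pool, motif)]
    for _ in range(rounds):
        pool = repsa_round(pool, motif, cleave_prob, rng)
        fractions.append(bound_fraction(pool, motif))
    return fractions

if __name__ == "__main__":
    for round_number, fraction in enumerate(simulate(rounds=4)):
        print(f"round {round_number}: {fraction:.3f} of pool contains the motif")
```

Because cleavage-resistant sequences are amplified each round, the motif-containing fraction rises round over round. Note that the selection only asks "was this sequence cleaved?", so anything that inhibits cleavage is enriched, whether or not the protein of interest is responsible.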
A Closer Look at REPSA Selection
Type IIS restriction enzymes cleave DNA at a defined distance from their recognition sites
Selection Template (68 bp)
5’ - CTAGGAATTCGTGCAGAGGTGAAT NNNNNNNNNNNNNNNNNNNN TTA CATCC CTCCAG AAGCTTGGAC - 3’
3’ - GATCCTTAAGCACGTCTCCACTTA NNNNNNNNNNNNNNNNNNNN AAT GTAGG GAGGTC TTCGAACCTG - 5’
Type IIS sites labeled on the template: BsgI, HphI, FokI, BpmI

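As a quick check of the template, the sketch below confirms its length and strand complementarity and locates each enzyme's recognition site. The recognition sequences in `SITES` are not given on the slide; they are the commonly published ones (BsgI GTGCAG, HphI GGTGA, FokI GGATG, BpmI CTGGAG) and should be treated as an assumption to verify against a supplier's reference.

```python
# Top strand (5'->3') and bottom strand (3'->5') as written on the slide,
# with the 20-nt randomized cassette represented by N.
TOP = "CTAGGAATTCGTGCAGAGGTGAAT" + "N" * 20 + "TTA" + "CATCC" + "CTCCAG" + "AAGCTTGGAC"
BOTTOM_3TO5 = "GATCCTTAAGCACGTCTCCACTTA" + "N" * 20 + "AAT" + "GTAGG" + "GAGGTC" + "TTCGAACCTG"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def complement(seq):
    """Base-by-base complement, same orientation as the input."""
    return seq.translate(_COMPLEMENT)

def reverse_complement(seq):
    """Complement read in the opposite direction (5'->3' of the other strand)."""
    return complement(seq)[::-1]

# Assumed recognition sequences (5'->3'); not stated on the slide.
SITES = {"BsgI": "GTGCAG", "HphI": "GGTGA", "FokI": "GGATG", "BpmI": "CTGGAG"}

def find_sites(top):
    """Map each enzyme to (0-based index on the top strand, strand carrying the site)."""
    hits = {}
    for name, site in SITES.items():
        forward = top.find(site)
        reverse = top.find(reverse_complement(site))
        if forward != -1:
            hits[name] = (forward, "top")
        elif reverse != -1:
            hits[name] = (reverse, "bottom")
    return hits

if __name__ == "__main__":
    print("length:", len(TOP))
    print("strands complementary:", complement(TOP) == BOTTOM_3TO5)
    for name, (index, strand) in find_sites(TOP).items():
        print(f"{name}: index {index}, site on {strand} strand")
```

Under these assumed recognition sequences, BsgI and HphI sites sit in the left flank on the top strand, while the FokI and BpmI sites sit in the right flank on the bottom strand, so all four enzymes are oriented to cut toward the randomized region.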
Orphan TF Specificity in the Van Dyke Lab
General Experimental Procedure
1. Clone the gene of interest in bacteria (E. coli K12)
2. Induce protein expression
3. Perform combinatorial selection for sequence specificity (REPSA)
4. Sequence the final pool of DNA
5. Analyze the data for a consensus sequence
6. Verify, hypothesize, & publish
Repeat!

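The "analyze data for consensus sequence" step can be sketched as a per-position majority vote. The slides do not name the analysis software, so this is only a minimal illustration that assumes the selected sites have already been aligned and trimmed to equal length; the example sequences and the 50% threshold are hypothetical.

```python
from collections import Counter

def position_counts(sites):
    """Count the bases observed at each position of equal-length, pre-aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    return [Counter(site[i] for site in sites) for i in range(length)]

def consensus(sites, min_fraction=0.5):
    """Most common base per position, or N where no base reaches min_fraction."""
    result = []
    for counts in position_counts(sites):
        base, count = counts.most_common(1)[0]
        result.append(base if count / len(sites) >= min_fraction else "N")
    return "".join(result)

if __name__ == "__main__":
    # Hypothetical aligned sites, for illustration only.
    example_sites = ["CTGTAT", "CTGTAC", "CTGGAT", "ATGTAT"]
    print(consensus(example_sites))
```

Real sequencing pools are unaligned, so in practice a motif-discovery tool would be needed before any such tally; the sketch only shows what a consensus means once the alignment exists.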
LexA as Model Protein for REPSA
Previously characterized repressor protein
  SOS response regulator
Known consensus binding sequence
  DNase I protection
  EMSA
  Weight matrix studies
Combinatorial methods never used
Ideal candidate for REPSA protocol optimization
[Figure: Overall structure of E. coli LexA-DNA complex]

Representative Round of REPSA
[Figure labels: Protein / No Enzyme; No Protein / Enzyme; 6 Rounds PCR; 9 Rounds; 12 Rounds]

Unexpected Consensus Sequences
Round 7: 553/1000 sequences, e = 4.0e-290
  What are the structural features of T-rich sequences?
  Why would this inhibit cleavage?
Round 14: 923/1000 sequences, e = 9.3e-2093
  Doesn’t this sequence look familiar?
  Why would this inhibit cleavage?
According to previous literature, FokI cleaves at its defined distance regardless of the sequence at the cleavage site…
Yet we appear to have found violations of this rule.

Future Direction
LexA Specificity Studies
  Work out issues with protein binding
Examine Orphan Transcription Factors
  In E. coli and other organisms such as extremophiles
Mixtures of Transcription Factors
  Develop REPSA techniques to handle multiple TFs at one time
  Ultimate goal: work with less pure cellular extracts

Acknowledgements
Van Dyke Lab members
NSF STEM Scholarship
LSAMP Scholarship
OVPR
KSU Foundation
NIH