GEB 406 Course Instructor: Sheikh Ahmad Shah Semester: Summer 2016 Lecture - 11 Protein-Protein Interaction
Protein-Protein Interaction Proteins work together by actually binding to form multicomponent complexes that carry out specific functions. These functional units can be as simple as dimeric transcription-factor complexes or as complex as the 30-plus component systems that form ribosomes. Biochemists believe that all proteins bind to or interact with at least one other protein. The discovery that proteins in higher organisms (e.g., human and mouse) contain higher numbers of functional domains suggests that many of these proteins have multiple associations. Understanding how protein complexes work is essential to understanding how cells work as systems.
Identifying Protein-Protein Interaction In the pregenomic era, immunoprecipitation was the primary means of determining protein-protein interaction.
Identifying Protein-Protein Interaction In the postgenomic era, a few methods have proven especially helpful for this purpose. The yeast two-hybrid (Y2H) system is one of them. In short, the Y2H method uses a protein of interest as bait to discover proteins that physically interact with it; those interacting proteins are termed preys. In the Y2H method, a single transcription factor is split into two pieces: the DNA Binding Domain (DBD), which binds the DNA, and the Activation Domain (AD), which stimulates RNA polymerase to begin transcription.
Identifying Protein-Protein Interaction Fused to the DBD is the bait protein of interest (B); this fusion cannot initiate transcription on its own. Fused to the AD is the prey ORF, which can encode any known or unknown protein. The AD + ORF fusion cannot initiate transcription on its own either. When the bait and prey proteins are produced in the same cell, they might interact; if they do, transcription of the His3 gene is initiated. Any ORF can be tested with Y2H, which means a proteome-wide survey can be performed rapidly by transforming a genomic library into cells that contain bait plasmids. In this way, every protein in a proteome can be tested individually for its potential to interact with the bait.
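The Y2H logic described above can be sketched as a small Python model. This is only an illustration: the bait and prey names and the interaction table are hypothetical placeholders, not experimental data, and a real screen reads out growth on selective medium rather than a boolean.

```python
def reporter_on(has_dbd_bait, has_ad_prey, bait_prey_interact):
    """The His3 reporter is transcribed only when the DBD-bait fusion and the
    AD-prey fusion are both present in the cell AND the bait and prey bind."""
    return has_dbd_bait and has_ad_prey and bait_prey_interact


def screen(bait, prey_library, interactions):
    """Test every prey in the library against one bait; return reporter-positive preys."""
    hits = []
    for prey in prey_library:
        interact = frozenset((bait, prey)) in interactions
        if reporter_on(True, True, interact):
            hits.append(prey)
    return hits


# Hypothetical interaction table and library, for illustration only.
interactions = {frozenset(("BaitX", "PreyA")), frozenset(("BaitX", "PreyC"))}
library = ["PreyA", "PreyB", "PreyC"]

hits = screen("BaitX", library, interactions)
print(hits)
```

Neither fusion alone turns the reporter on, which is why `reporter_on` requires all three conditions at once.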
Identifying Protein-Protein Interaction
GEB 406 Course Instructor: Sheikh Ahmad Shah Semester: Summer 2016 Lecture - 11 Genetic Circuitry
Introduction Every cell in an organism contains essentially the same DNA. Although the genes are the same, the body can produce different cell types (such as liver cells, brain cells, and skin cells) by expressing only a subset of genes in each. The subset of genes expressed is tightly controlled during development, and this control is exerted over location, time, and amount.
Genetic Circuits Genetic circuits are functional interactions of proteins and DNA sequences, mediated by inducible transcription factors and cis-regulatory elements such as promoters and enhancers. Endo16 is one of the most studied genes and serves as a model for understanding genomic control over gene expression. This gene is expressed in the developing gut of the sea urchin embryo. The sea urchin embryo is transparent, and the sea urchin has been used as a model organism in developmental biology for over 100 years. (Figure: sea urchin embryo)
Genomic Control over Genes Upstream of the transcription start site, there are DNA sequences that control the transcription process. These are termed “cis-regulatory elements”. On the other hand, there are DNA sequences farther away from the coding sequence, often located on a separate chromosome, that also regulate transcription, typically by encoding diffusible factors such as transcription factors. These are termed “trans-regulatory elements”.
Genomic Control over Genes Cis-regulatory elements are modular in their organization, which means that the DNA can be divided into functional units, each of which performs a particular job. Each cis-regulatory module is composed of a sequence of DNA to which one or more DNA-binding proteins can bind, either to help initiate transcription (transcription factors) or to repress transcription (repressors). The best understood cis-regulatory elements are found in the sea urchin, upstream of the Endo16 gene.
Early Development of Sea Urchin The sea urchin zygote starts to divide mitotically: from one cell to two cells, from two cells to four cells, and so on. From the resulting mass of cells, a hollow sphere of embryonic cells called the blastula is formed. The blastula then undergoes a process called gastrulation, in which a subset of blastula cells begins to invaginate, or move into the cavity of the blastula/gastrula. As the cells begin to invaginate, they repress some genes and activate others. The cells that form the elongating tube inside the gastrula will become the endodermal cells that line the gut of the future larva. This elongated tube of cells is called the archenteron.
Early Development of Sea Urchin Fate Map of Sea Urchin
Molecular Dissection of Development Slightly before gastrulation, cells at the base of the blastula express a gene called Endo16, and later all cells of the invaginating archenteron express Endo16. These cells will become endodermal cells in the larva. During development, the expression of the Endo16 gene is tightly controlled for location, time, and amount of RNA. This expression pattern has made Endo16 an excellent genetic marker for identifying which cells will form the endoderm. Later in gastrulation, the cells of the foregut and hindgut repress Endo16, so that only midgut cells still express it. By late gastrulation, the midgut cells transcribe Endo16 at an even higher rate than before.
Molecular Dissection of Sea Urchin Development The modules of Endo16's cis-regulatory elements are located in the 2,300 bp upstream of the coding DNA. Each module (G through A) has a function when studied individually, because DNA-binding proteins recognize specific DNA sequences within each module, and these protein-DNA interactions regulate transcription.
Approach of Davidson’s Group Several DNA constructs were created to fuse each module individually onto the basal promoter (Bp), the minimal promoter that allows RNA polymerase to bind and begin transcription. The reporter gene CAT (chloramphenicol acetyltransferase) was used in place of the coding portion of Endo16, because CAT output could easily be monitored.
Location of Expression It was found that constructs 1, 2, 7, and 8 promote the production of CAT in endodermal cells. These constructs include modules G, B, and A. Constructs containing modules C, D, E, and F do not promote the production of CAT in endoderm cells but do permit CAT production in mesoderm and ectoderm cells. The roles of modules F through C were unclear and seemingly counterproductive. Therefore, new constructs were created that combined the best endoderm promoter (GBA+Bp) with each of the remaining modules. The idea was to test whether modules F through C had any influence on the transcription promoted by modules G, B, and A.
Location of Expression When the new constructs were tested, they exhibited different capacities to promote the formation of CAT. The level of CAT production in endodermal cells was essentially unchanged, but the level of "inappropriate" CAT expression in mesoderm and ectoderm was altered. Modules DC reduced the capacity of the promoter to function in mesoderm cells, while modules F and E each reduced the expression of CAT in ectoderm cells. This suggested that modules F through C function as cell-type-specific repressors of transcription, which helps explain why Endo16 is not expressed in ectoderm or mesoderm cells.
Timing of Expression After establishing which modules promote and repress transcription and which modules address the location component of Endo16 expression in embryogenesis, scientists turned to the question of timing. They reused the previously made DNA constructs for this research. For each DNA construct, CAT enzyme activity was measured at each time point. First, CAT activity from constructs 1, 2, 7, 8, and 10 was measured, and it was found that the three inducing modules (G, B, and A) exhibited different temporal profiles.
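The qualitative outcome of the combined-construct experiment can be summarized in a small lookup function. This is a sketch under stated assumptions: it only restates the results reported above for GBA+Bp constructs carrying one or more extra modules, uses text labels instead of measured CAT levels, and the function name `cat_pattern` is hypothetical.

```python
def cat_pattern(extra_modules):
    """Qualitative CAT expression for a GBA+Bp construct carrying extra modules.

    extra_modules: iterable drawn from {"F", "E", "DC"}.
    Returns a dict mapping cell type to a qualitative label.
    """
    extra = set(extra_modules)
    return {
        # Endoderm expression was essentially unchanged by any extra module.
        "endoderm": "unchanged",
        # Modules DC reduced expression in mesoderm.
        "mesoderm": "reduced" if "DC" in extra else "inappropriate expression",
        # Modules F and E each reduced expression in ectoderm.
        "ectoderm": "reduced" if extra & {"F", "E"} else "inappropriate expression",
    }


for extras in ([], ["F"], ["E"], ["DC"], ["F", "E", "DC"]):
    print(extras, cat_pattern(extras))
```

With all of F, E, and DC present, both kinds of inappropriate expression are reduced, which matches the idea that these modules act as cell-type-specific repressors.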
Timing of Expression Module A induced CAT production during the first 48 hours, after which its output dropped off. Module B promoted CAT production primarily at the 60- and 72-hour time points. Module G did not promote much CAT production by itself, though there was a marginal increase around 48 hours. When modules G, B, and A were combined, the expression level was almost equal to that of the wild-type cis-regulatory element containing all eight modules.
The Effect of Module G To determine the role of module G in Endo16 transcription, the investigators built a few more DNA constructs. For these constructs, they removed the Endo16 Bp (basal promoter) and replaced it with a weakened viral promoter (SVp). When each inducing module was placed individually upstream of SVp, the output indicated that modules G, B, and A exerted their influence on transcription without any participation by Bp.
The Effect of Module G The amplitude of CAT production by module A increased approximately fourfold when module G was added onto A+SVp, though the shape of the curve was essentially unchanged. Interestingly, module G did not alter the amplitude or shape of module B's CAT production. When module G was added onto BA+SVp, the amplitude increased substantially at the 48-hour time point, which is when module A exerts its maximum effect. Furthermore, the addition of module G (GBA+SVp) increased the output from module B at 60 and 72 hours, when module B becomes active. In short, module G acts as an amplifier for module A and, when combined with A, for module B.
Circuit Diagram of Endo16 Cis-Regulatory Elements A circuit diagram can be drawn to explain how the cis-regulatory elements of Endo16 function in early and late development.
Integrating Single-Gene Circuits The complexity of whole-genome regulation is too overwhelming to diagram as simple circuits. Genomic information is accumulating faster than ever, and new tools are needed to visualize all of it simultaneously. For this reason, many single-gene circuits are being integrated into larger ones.
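The amplifier role of module G can be illustrated with a toy model. Everything numeric here is an assumption except the approximately fourfold gain on module A: the base activities are placeholder units (1.0 = active, 0.0 = inactive), the 24-hour time point is added only for illustration, and the gain on module B in the presence of A is a hypothetical value, since the slides say only that the output increased.

```python
# Placeholder activity windows in arbitrary units (NOT measured data).
BASE_A = {24: 1.0, 48: 1.0, 60: 0.0, 72: 0.0}  # A acts during the first 48 hours
BASE_B = {24: 0.0, 48: 0.0, 60: 1.0, 72: 1.0}  # B acts mainly at 60 and 72 hours

G_GAIN_ON_A = 4.0          # "approximately fourfold" per the slides
G_GAIN_ON_B_WITH_A = 2.0   # hypothetical: the slides give no number


def cat_output(modules, hour):
    """Toy CAT output for a construct made of some of the modules G, B, A (plus SVp)."""
    modules = set(modules)
    a = BASE_A[hour] if "A" in modules else 0.0
    b = BASE_B[hour] if "B" in modules else 0.0
    if "G" in modules:
        a *= G_GAIN_ON_A
        if "A" in modules:  # G boosts B only when A is also present
            b *= G_GAIN_ON_B_WITH_A
    return a + b


for construct in ("A", "GA", "B", "GB", "BA", "GBA"):
    print(construct, [cat_output(construct, h) for h in (24, 48, 60, 72)])
```

In this sketch G contributes nothing on its own and acts purely as a multiplier, which is one simple way to encode "amplifier" behavior.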
Integrated Genomic Circuit Our genes are regulated to be activated in some cells and repressed in others. Genetic expression changes dynamically in response to environmental influences and aging. Cells need a mechanism to switch genes from on to off and vice versa. Genes need to sense their intracellular environment and respond accordingly. But cells should also be tolerant of some cellular variations. Furthermore, cells need to have alternative means for accomplishing vital functions. Our genomes must be prepared for circumstances that might block one circuit from performing its cellular role.
Bistable Toggle Switches A bistable toggle switch is a switch that turns an instrument or device on or off in a stable way. A biological bistable toggle switch will remain in one position (on or off) until the circuit determines that the switch should be toggled to the other position. A biological toggle switch typically consists of three kinds of components: promoters, repressors, and inducers. (Constitutive) promoters drive expression of a gene. Repressors bind to promoters, inhibiting expression of genes. Inducers bind to repressors, preventing them from binding to promoters; thus inducers encourage expression of genes.
Bistable Toggle Switches The bistability of the toggle arises from the mutually inhibitory arrangement of the repressor genes. In the absence of inducers, two stable states are possible: one in which promoter 1 transcribes repressor 2, and one in which promoter 2 transcribes repressor 1. Switching is accomplished by transiently introducing an inducer of the currently active repressor. The inducer inactivates that repressor, which permits the opposing repressor to be maximally transcribed until it stably represses the originally active promoter.
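The mutual-repression arrangement can be explored with a minimal simulation. This is a sketch, assuming a commonly used dimensionless two-repressor model in which each repressor is produced at a rate that falls as the other accumulates and decays at unit rate. The parameter values (`alpha`, `n`), the way the inducer is modeled (dividing the effective repressor concentration), and the simple Euler integration are all illustrative assumptions, not values from the lecture.

```python
def simulate(u, v, inducer_of_u=lambda t: 0.0,
             alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Euler integration of a two-repressor toggle.

    u, v: concentrations of repressor 1 and repressor 2 (arbitrary units).
    inducer_of_u(t): inducer level at time t; it inactivates repressor u.
    """
    for i in range(steps):
        t = i * dt
        u_eff = u / (1.0 + inducer_of_u(t))   # only free u can repress
        du = alpha / (1.0 + v ** n) - u       # v represses production of u
        dv = alpha / (1.0 + u_eff ** n) - v   # free u represses production of v
        u += du * dt
        v += dv * dt
    return u, v


# Two different starting points settle into two different stable states.
state_u_high = simulate(u=2.0, v=1.0)
state_v_high = simulate(u=1.0, v=2.0)

# A transient pulse of inducer (t < 10) flips a u-high cell to v-high,
# and the new state persists after the inducer is gone.
flipped = simulate(u=state_u_high[0], v=state_u_high[1],
                   inducer_of_u=lambda t: 100.0 if t < 10.0 else 0.0)

print(state_u_high, state_v_high, flipped)
```

The pulse only needs to last long enough for the opposing repressor to accumulate; after that, mutual repression holds the new state without further input.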
How Do Toggle Switches Work? Genetic switches have to deal with a degree of uncertainty, which is termed “noise”. Gene activation occurs when transcription factors bind to cis-regulatory elements. When a cell undergoes mitosis and cytokinesis (eukaryotes) or cell division (bacteria), the first source of noise is introduced, because transcription factors may not be split in a 50:50 ratio between the daughter cells. A process subject to this kind of uncertainty is called “stochastic”. For example, if a cell had 50 copies of the Otx transcription factor, 6% of the time a particular daughter cell might get 19 or fewer copies (instead of 25 copies), while 6% of the time it might get at least 31. That could have a profound effect on the subsequent regulation of Endo16 expression.
How Do Toggle Switches Work? Another component of genetic noise is the random binding of proteins (transcription factors) to their target DNA. Because each cis-regulatory element must be found by a small number of DNA-binding proteins, the time at which all the transcription factors are in the right places varies widely for any given gene. Moreover, once the cis-regulatory element is fully occupied and ready to initiate transcription, the first RNA will be produced after a variable amount of time, due to noise in the initiation of the transcription machinery. Because of this stochastic behavior, the time at which a particular gene is transcribed cannot be predicted very precisely.
Effect of Noise and Stochastic Behavior In prokaryotes and eukaryotes, proteins are produced in bursts of translation of varying durations and with varying outputs. Therefore, the total number of proteins produced from any gene is not the same each time; rather, it varies around an average, following a normal distribution (a bell-shaped curve whose highest point is the average). By producing proteins in bursts rather than at a constant rate, the cell gives proteins a higher probability of forming a quaternary structure (e.g., a dimer) that may be required for full function. As a result, protein expression inside the cell takes place in a chaotic and mildly disorganized environment.
Effect of Noise and Stochastic Behavior Protein A can bind to the cis-regulatory elements of genes b and c to initiate transcription of both genes. Protein B has three possible fates: it can be degraded by the cell; it can diffuse away and perform other functions; and, most importantly for us, it can repress the expression of gene c. Conversely, protein C has three fates, one of which is to repress gene b. Here, A can bind upstream of either b or c, resulting in the repression of c or b, respectively. Stochastic factors, such as the amount of A and its ability to find a limited number of binding sites upstream of b and c, determine which protein will be expressed by the cell.
Toggle Switch in λ Phage In bacteriophage λ, there is a naturally evolved toggle switch that controls whether the phage enters the lytic phase or the lysogenic phase. Here, the deciding factor is a single protein called CII, or C two. (CII is the equivalent of protein A on the previous slide.)
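The 6% figure can be checked with a short calculation. This sketch assumes that each of the 50 copies goes independently to either daughter cell with probability 1/2, so the number a given daughter receives is binomial; that partitioning model is an assumption used for illustration.

```python
from math import comb


def prob_at_most(k, n=50):
    """P(X <= k) when X counts how many of n copies land in one daughter cell,
    each copy choosing that daughter independently with probability 1/2."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


p_19_or_fewer = prob_at_most(19)
p_31_or_more = 1.0 - prob_at_most(30)

print(f"P(19 or fewer copies) = {p_19_or_fewer:.4f}")
print(f"P(31 or more copies)  = {p_31_or_more:.4f}")
```

Both tails come out close to 6% and are equal by symmetry, in line with the figures quoted above.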
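Cell-to-cell variation from bursty production can be illustrated with a small simulation. This is a sketch with hypothetical parameters: bursts are assumed to arrive at random times at an average rate `burst_rate`, and each burst is assumed to yield a random number of proteins averaging `mean_burst_size`. It does not attempt to reproduce any particular distribution shape, only the fact that totals differ from cell to cell around an average.

```python
import random
import statistics


def proteins_in_one_cell(rng, burst_rate=5.0, mean_burst_size=20.0, duration=1.0):
    """Total proteins made in one cell over `duration` (arbitrary time units)."""
    total = 0
    t = rng.expovariate(burst_rate)              # waiting time to the first burst
    while t < duration:
        # Burst output varies from burst to burst (hypothetical exponential sizes).
        total += round(rng.expovariate(1.0 / mean_burst_size))
        t += rng.expovariate(burst_rate)         # waiting time to the next burst
    return total


rng = random.Random(42)                          # fixed seed for repeatability
counts = [proteins_in_one_cell(rng) for _ in range(2000)]

mean_count = statistics.mean(counts)
spread = statistics.stdev(counts)
print(f"mean = {mean_count:.1f}, standard deviation = {spread:.1f}")
```

Identical cells with identical parameters still end up with different protein totals, which is the point of calling the process noisy.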
Toggle Switch in λ Phage If CII finds the promoter PRE, transcription proceeds toward the left of PRE and leads to the transcription of CI (C one) further downstream. A dimer of CI can bind to the promoter PL upstream of CIII and lead to the production of CIII. CIII prevents the destruction of CII; thus, CIII indirectly reinforces its own production in a positive feedback loop. (Figure panels: Transcription of CI; Transcription of CIII by CI; Prevention of CII degradation by CIII)
Toggle Switch in λ Phage Dimerized CI (CI2) reinforces its own production indirectly by binding to the sites labeled OR1 and OR2 to repress the production of Cro protein (CI2 acts as a repressor of cro). CI2 binding to OR1 and OR2 also promotes its own production directly, in a positive feedback loop, by acting as a transcription factor for its own gene, CI. (Figure panels: CI2 repressing Cro; Positive feedback loop of CI)
Toggle Switch in λ Phage Once CII initiates this bistable toggle switch, λ is locked into peaceful lysogenic coexistence with its host E. coli unless new environmental forces disturb the system (e.g., UV light, change in nutrient availability). However, the toggle switch could have flipped the other way, depending on the noise and stochastic protein behaviors. CII protein could have been degraded if it took too long to find PRE, because E. coli makes a protease that can destroy CII. If Brownian motion (random motion driven by kinetic energy) causes the protease to find CII before CII finds PRE, the lytic lifestyle is chosen.
Toggle Switch in λ Phage In the absence of CII, the promoter labeled PR is weakly active and begins transcribing to the right, resulting in the production of Cro protein. Cro2 binds to OR3 and OR2, which leads to repression of CI and increased transcription of cro. The positive feedback loop keeps the bistable toggle switch flipped toward cro transcription and a lytic lifestyle that eventually leads to the production of hundreds of fully mature viruses that swell and lyse the E. coli host cell.
Toggle Switch in λ Phage There are several sources of noise in the choice made by λ phage, such as the limited number of proteins and binding sites, the variable amount of time needed for transcription, and the bursts of protein production that allow efficient dimerization. Environmental influences can also skew this decision. For example, if the bacterial host is growing in a nutrient-rich environment, it produces more protease, resulting in faster destruction of CII and the production of many new λ phage (lytic lifestyle). Conversely, if the bacterium is in a nutrient-poor environment, there are fewer protease molecules, so CII has a higher probability of finding its binding site on PRE before being destroyed. A longer half-life for CII leads to peaceful coexistence (lysogenic lifestyle).
Genomic Control of Different Genes In one study, investigators from Tufts University examined the amount of noise generated by different aspects of a genomic circuit. They placed two different promoters, recA and lacZ, upstream of the reporter gene GFP and then measured expression in 200 individual cells under control conditions and under induction. It was found that the recA promoter is constitutively on (always activated) at a low level, with a varying amount of noise. When induced, the recA promoter drives the production of large amounts of mRNA. In contrast, lacZ exhibits a very low background level of transcription with little noise under control conditions, and induction does not produce as large an increase over the basal rate.
Genomic Control of Different Genes The behavior of the recA and lacZ promoters makes sense considering their roles. The recA promoter (recAp) drives a gene that is essential for repairing DNA damage, which is a vital process. Thus, when the cell senses DNA damage, the promoter requires only one step to switch to a higher expression rate with relatively little noise. In contrast, the lacZ promoter (lacZp) drives a gene used to metabolize lactose, and the gene is induced in the absence of glucose and the presence of lactose. Basal expression of lacZ is normally low because alternative sugars are usually available. The toggle switch for lacZ induction requires several other proteins, and each of those proteins has its own level of noise. Therefore, lacZ induction is a much noisier system than recA induction.
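The competition between CII finding PRE and the protease finding CII can be sketched as a race between two random waiting times. This is a toy model under stated assumptions: both waiting times are taken to be exponentially distributed, and the rate values are hypothetical, chosen only to contrast a low-protease and a high-protease environment.

```python
import random


def p_lysogeny(k_find_pre, k_protease):
    """Chance that CII reaches PRE before a protease reaches CII,
    for two competing exponential waiting times with the given rates."""
    return k_find_pre / (k_find_pre + k_protease)


def simulate_race(k_find_pre, k_protease, trials=20000, seed=1):
    """Estimate the same probability by simulating individual infections."""
    rng = random.Random(seed)
    lysogenic = 0
    for _ in range(trials):
        t_bind = rng.expovariate(k_find_pre)      # time for CII to find PRE
        t_degrade = rng.expovariate(k_protease)   # time for protease to find CII
        if t_bind < t_degrade:
            lysogenic += 1
    return lysogenic / trials


# Hypothetical rates: nutrient-poor host (little protease) vs nutrient-rich host.
poor = p_lysogeny(k_find_pre=1.0, k_protease=0.5)
rich = p_lysogeny(k_find_pre=1.0, k_protease=3.0)
rich_simulated = simulate_race(k_find_pre=1.0, k_protease=3.0)

print(f"nutrient-poor: P(lysogeny) = {poor:.2f}")
print(f"nutrient-rich: P(lysogeny) = {rich:.2f} (simulated {rich_simulated:.2f})")
```

Raising the protease rate lowers the chance of lysogeny without ever making the outcome certain, so individual infections in the same environment can still go either way.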