Molecular Biology Techniques – A Primer. The methods depend upon, and were developed from, an understanding of the properties of biological macromolecules themselves. Hybridization: the base-pairing characteristics of DNA and RNA. DNA cloning: DNA polymerase, restriction endonucleases and DNA ligase. PCR: thermophilic DNA polymerase.
Topic 1: Nucleic acids. Electrophoresis; restriction; hybridization; DNA cloning and gene expression; PCR; genome sequence and analysis.
Electrophoresis. 1. Gel electrophoresis separates DNA and RNA molecules according to size, shape and topological properties. The gel matrix is an inert, jello-like porous material that supports macromolecules and allows them to move through it. Agarose and polyacrylamide are two different gel matrices.
Electrophoresis. DNA and RNA molecules are negatively charged, and thus move through the gel matrix toward the positive pole (+). Linear DNA molecules are separated according to size. The mobility of circular DNA molecules is affected by their topological structures. For DNA molecules of the same molecular weight but different shapes, the mobility is: supercoiled > linear > nicked or relaxed.
[Figure: DNA separation by gel electrophoresis, with bands of large, moderate and small fragments]
Electrophoresis: to separate DNA of different size ranges. Narrow size range of DNA: use polyacrylamide. Wide size range of DNA: use agarose gel. Very large DNA (>30-50 kb): use pulsed-field gel electrophoresis.
Pulsed-field gel electrophoresis. The electric field switches between two orientations: the larger the DNA is, the longer it takes to reorient.
Nucleic acid: Restriction digestion. Restriction endonucleases cleave DNA molecules at particular sites. Why use endonucleases? To break large DNA molecules into manageable fragments.
Restriction digestion. Restriction endonucleases: nucleases that cleave DNA at particular sites by recognizing specific sequences. The target site recognized by an endonuclease is usually palindromic, e.g. EcoRI: 5'...GAATTC...3' / 3'...CTTAAG...5'.
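The palindromic property can be checked mechanically: a site is palindromic in the DNA sense when it equals its own reverse complement. The following is a minimal Python sketch; the function names are illustrative and do not come from the slides.

```python
# Watson-Crick complement of each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5' to 3' on both strands."""
    return site.upper() == reverse_complement(site)

print(is_palindromic("GAATTC"))  # EcoRI recognition site
print(is_palindromic("GAATTA"))  # a non-palindromic hexamer for comparison
```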
Restriction digestion. To name a restriction endonuclease, e.g. EcoRI: E = Escherichia (genus); co = coli (species); R = strain RY13; I = the 1st such enzyme found.
Restriction digestion. Frequency of occurrence of a given hexameric sequence in random DNA: 1/4096 (4^-6).
Digestion of a DNA fragment with endonuclease EcoRI. Consider a linear DNA molecule with 6 copies of GAATTC: it will be cut into 7 fragments, which can be separated by size by gel electrophoresis. [Figure labels: the largest fragment; the smallest fragment]
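The fragment count can be reproduced with a small simulation. This is a sketch under stated assumptions: only the top strand is modeled, the cut position (G^AATTC) is passed as an offset, and the test molecule with its spacer lengths is made up for illustration.

```python
def digest_linear(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list[str]:
    """Cut a linear top-strand sequence at every occurrence of `site`.

    cut_offset is the cut position inside the site (1 for EcoRI: G^AATTC).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # the piece after the last cut
    return fragments

# Hypothetical molecule: 6 EcoRI sites separated by spacers of arbitrary length.
SITE = "GAATTC"
spacers = [30, 5, 12, 20, 8, 15, 25]
molecule = "".join("C" * n + SITE for n in spacers[:-1]) + "C" * spacers[-1]

fragments = digest_linear(molecule)
print(len(fragments))  # 6 sites in a linear molecule give 7 fragments
# Fragment sizes from largest to smallest, i.e. from the top of the gel down.
print(sorted((len(f) for f in fragments), reverse=True))
```

A circular molecule would behave differently: n cuts give n fragments rather than n + 1.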
Restriction digestion. Endonucleases are used to make restriction maps, e.g. with the combination EcoRI + HindIII. This allows different regions of one molecule to be isolated and a given molecule to be identified. A given molecule will generate a characteristic series of patterns when digested with a set of different enzymes.
Restriction digestion. Different enzymes recognize their specific target sites with different frequencies. EcoRI recognizes a hexameric sequence: 4^-6. Sau3AI recognizes a tetrameric sequence: 4^-4. Thus Sau3AI cuts the same DNA molecule more frequently.
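The two frequencies can be compared numerically. The sketch below assumes random DNA in which the four bases are equally likely and independent, which real genomes only approximate; the function names and the 50 kb example length are illustrative.

```python
def expected_site_frequency(site_length: int) -> float:
    """Chance that a given position starts one specific site of this length,
    assuming equal, independent base frequencies."""
    return 4.0 ** -site_length

def expected_cut_count(dna_length: int, site_length: int) -> float:
    """Expected number of sites in a random molecule of dna_length bp."""
    positions = max(dna_length - site_length + 1, 0)
    return positions * expected_site_frequency(site_length)

for name, n in [("EcoRI (6 bp site)", 6), ("Sau3AI (4 bp site)", 4)]:
    freq = expected_site_frequency(n)
    print(name, "one site every", round(1 / freq), "bp on average;",
          "expected sites in 50 kb:", round(expected_cut_count(50_000, n), 1))
```

Under this assumption a four-base cutter is expected to cut 4^2 = 16 times more often than a six-base cutter.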
Restriction digestion. Recognition sequences and cut sites of various endonucleases. [Figure labels: blunt ends; sticky ends]
Restriction digestion. The 5' protruding ends are said to be "sticky" because they readily anneal through base-pairing to DNA molecules cut with the same enzyme. Each end can reanneal with its complementary strand or with other strands carrying the same cut.
Nucleic acid: DNA hybridization. DNA hybridization can be used to identify specific DNA molecules. Hybridization: the process of base-pairing between complementary ssDNA or RNA from two different sources.
Probe: a labeled, defined sequence used to search mixtures of nucleic acids for molecules containing a complementary sequence
Labeling of DNA or RNA probes. Radioactive labeling: display and/or magnify the signals by radioactivity. Non-radioactive labeling: display and/or magnify the signals by antigen labeling, antibody binding, enzyme binding and substrate application (signal release). End labeling: put the labels at the ends. Uniform labeling: put the labels internally.
End labeling of single-stranded DNA/RNA. 5'-end labeling: polynucleotide kinase (PNK). 3'-end labeling: terminal transferase.
Labeling at both ends by kinase, then remove one end by restriction digestion:
5'pAATTC---------------------G
       G---------------------CTTAAp5'
J1 Characterization of clones. Uniform labeling of DNA/RNA. Nick translation: DNase I introduces random nicks; DNA pol I removes dNMPs ahead of the nick with its 5' to 3' exonuclease activity and adds new dNMPs, including labeled nucleotides, at the 3' end. Random hexanucleotide-primed labeling: denature the DNA, add random hexanucleotide primers, and let DNA polymerase synthesize a new strand incorporating labeled nucleotides.
Strand-specific DNA probes: e.g. with M13 DNA as template, the missing strand can be re-synthesized, incorporating radioactive nucleotides. Strand-specific RNA probes: labeled by transcription.
J1-5 Southern and Northern blotting.
Southern (DNA on blot): genomic DNA preparation; restriction digestion; denature with alkali; agarose gel electrophoresis; DNA blotting/transfer and fixation.
Northern (RNA on blot): RNA preparation; agarose gel electrophoresis; RNA blotting/transfer and fixation.
Both: probe labeling; hybridization (temperature); signal detection (X-ray film or antibody).
Southern analysis
Southern blot hybridization
Northern analysis: COB RNAs in S. cerevisiae. [Figure labels: bI1, bI2, bI3, bI4, bI5; mRNA; pre-mRNAs]
Blot type: target; probe; applications.
Southern: DNA; DNA or RNA; mapping genomic clones, estimating gene numbers.
Northern: RNA; DNA or RNA; RNA sizes, abundance and expression.
Western: protein; antibodies; protein size, abundance.
Nucleic acid: Sequencing. Two ways of sequencing: 1. Chemical cleavage: DNA molecules (radioactively labeled at their 5' termini) are subjected to 4 regimens that break them preferentially at Gs, Cs, Ts and As, separately. 2. The chain-termination method.
Chain-termination method. ddNTPs are chain-terminating nucleotides: the synthesis of a DNA strand stops when a ddNTP is added to the 3' end.
The absence of a 3'-hydroxyl group prevents the nucleophilic attack on the next incoming substrate molecule, so the chain cannot be extended.
DNA synthesis aborts at a frequency of about 1/100 each time the polymerase has to add a G, namely when a ddGTP is incorporated instead of a dGTP. The position of each G can then be read from the gel.
Fluorescence automated sequencing system: slab gel electrophoresis
Fluorescence automated sequencing system: capillary gel electrophoresis
The method uses non-radioactive fluorescent labelling. Computerized visualization from a single lane of an automated sequencer.
DNA sequencing gel: 4 separate reactions with dNTP + ddGTP, dNTP + ddATP, dNTP + ddCTP and dNTP + ddTTP. "Read" the sequencing gel to get the sequence of the DNA.
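Reading a chain-termination gel can be sketched in code: each lane holds the chain lengths at which one ddNTP ended synthesis, and reading the bands from shortest to longest gives the sequence of the newly synthesized strand. The strand below is a made-up example, and the model ignores the primer and assumes every possible termination product is present.

```python
def terminated_lengths(new_strand: str, dd_base: str) -> list[int]:
    """Lengths of the chains that end with the given ddNTP (1-based positions)."""
    return [i + 1 for i, base in enumerate(new_strand) if base == dd_base]

def read_gel(lanes: dict[str, list[int]]) -> str:
    """Rebuild the sequence by reading bands from shortest to longest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)

# Hypothetical newly synthesized strand, written 5' to 3'.
strand = "ATGCGGTAC"
lanes = {base: terminated_lengths(strand, base) for base in "GATC"}

print(lanes["G"])       # band positions in the ddGTP lane
print(read_gel(lanes))  # sequence recovered from all four lanes
```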
NUCLEIC ACIDS. The shotgun strategy permits a partial assembly of large genome sequences. What can we do if we want to sequence a much larger and more complicated eukaryotic genome using the shotgun strategy? First, libraries at different levels should be constructed.
The DNA fragments can be easily extracted and sequenced automatically. Sophisticated computer programs have been developed to assemble the random DNA fragments, forming contigs. A single contig is about 50,000 to 200,000 bp. This is useful for analyzing the fruit fly genome, which contains an average of one gene every 10 kb. To analyze the human genome, contigs should be assembled into scaffolds.
1-16 The paired-end strategy permits the assembly of large genome sequences. The main limitation to producing large contigs is the occurrence of repetitive sequences. (Why?) To solve this problem, paired-end sequencing was developed. The same genomic DNA is also used to produce recombinant libraries composed of large fragments of 3 to 100 kb.
The ends of each clone can be sequenced easily, and these larger clones can be assembled together first.
If a larger scaffold is needed, use a cloning vector that can carry large DNA fragments (at least 100 kb). A BAC is a good choice.
1-17 Genome-wide analysis. The purpose of this analysis is to predict the coding sequences and other functional sequences in the genome. For the genomes of bacteria and simple eukaryotes, finding ORFs is very simple and effective.
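ORF finding in a compact genome can be illustrated with a very small scanner. This is a simplified sketch, not a real gene finder: it scans only the three forward reading frames, requires an ATG start and a standard stop codon, and uses a hypothetical minimum length; the example sequence is invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq: str, min_codons: int = 3) -> list[tuple[int, int, str]]:
    """Return (start, end, orf) for each ATG...stop ORF in the 3 forward frames.

    Coordinates are 0-based, end-exclusive; the stop codon is included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs

print(find_orfs("CCATGAAATTTGGGTAACC"))
```

A fuller version would also scan the reverse complement to cover the other three frames. In animal genomes introns interrupt the reading frame, which is one reason this simple approach is not sufficient there.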
For animal genomes, a variety of bioinformatics tools are required to identify genes and other functional fragments, but their accuracy is low.
The most important method for validating protein-coding regions, and for identifying those missed by current gene-finder programs, is the use of cDNA sequence data. The mRNAs are first reverse transcribed into cDNA, and these cDNAs, both full-length and partial, are sequenced using the shotgun method. The sequences are used to generate an EST (expressed sequence tag) database, and the ESTs are aligned onto genomic scaffolds to help identify genes.
Part II: Proteins
2-1 Specific proteins can be purified from cell extracts. The purification of individual proteins is critical to understanding their function. (Why?) Although there are thousands of proteins in a single cell, each protein has unique properties that make its purification somewhat different from that of others.
The purification of a protein is designed to exploit its unique characteristics, such as size, charge, shape and, in many instances, function.
2-2 Purification of a protein requires a specific assay. To purify a protein, you need an assay that is unique to that protein. In many instances it is convenient to use a measure of the protein's function, or you may use an antibody against the protein. The assay is used to monitor the purification process.
2-4 Proteins can be separated from one another using column chromatography. In this approach, protein fractions are passed through glass columns filled with appropriately modified small acrylamide or agarose beads. Columns can be used in various ways to separate proteins according to their characteristics.
Ion exchange chromatography. The proteins are separated according to their surface charge. The beads are modified with either negatively charged or positively charged chemical groups. Proteins that bind more strongly require more salt to be eluted.
Gel filtration chromatography. This technique separates proteins on the basis of size and shape. The beads have pores of a variety of different sizes throughout. Small proteins can enter all of the pores and so take longer to elute, whereas large proteins cannot enter the pores and pass through quickly.
2-5 Affinity chromatography can facilitate more rapid protein purification. If we already know that our target protein interacts specifically with something else, we can attach this "something else" to the column so that only our target protein binds to the column. This method is called affinity chromatography.
Immunoaffinity chromatography. An antibody that is specific for the target is attached to the beads, and ideally only the target protein binds to the column. However, the binding is sometimes so tight that the target protein cannot be eluted unless it is denatured, and denatured protein is generally not useful.
Sometimes tags (epitopes) can be added to the N- or C-terminus of the protein using molecular cloning methods. This procedure allows the modified proteins to be purified by immunoaffinity purification with a heterologous antibody to the tag. Importantly, the binding affinity can change with the conditions, e.g. the concentration of Ca2+ in the solution.
Immunoprecipitation. The antibody is attached to beads and used to precipitate a specific protein from a crude cell extract. It is a useful method for detecting which proteins or other molecules are associated with the target protein.
2-6 Separation of proteins on polyacrylamide gels. Native proteins have neither a uniform charge nor a uniform secondary structure. If we treat the protein with the strong detergent SDS, the higher-order structure is usually eliminated, and SDS confers a uniform negative charge on the polypeptide chain.
Sometimes mercaptoethanol is also needed to break disulphide bonds. The protein molecules can then be resolved by electrophoresis in the presence of SDS according to the length of the individual polypeptides. After electrophoresis, the proteins can be visualized with a stain such as Coomassie brilliant blue.
2-7 Antibodies visualize electrophoretically separated proteins. The electrophoretically separated proteins are transferred to a filter, which is then incubated in a solution of an antibody to the protein of interest. Finally, a chromogenic enzyme is used to visualize the filter-bound antibody.
2-8 Protein molecules can be directly sequenced. Two sequencing methods: Edman degradation and tandem mass spectrometry (MS/MS). Because of the vast resource of complete or nearly complete genome sequences, determining even a small stretch of protein sequence is sufficient to identify the gene.
Edman degradation. This is a chemical reaction in which amino acid residues are sequentially released from the N-terminus of a polypeptide chain.
Step 1: modify the N-terminal amino group with PITC (phenyl isothiocyanate), which reacts only with the free α-amino group. Step 2: cleave off the N-terminal residue by acid treatment, leaving the rest of the polypeptide intact. Step 3: identify the released amino acid by high performance liquid chromatography (HPLC). The whole process can be carried out in an automatic protein sequencer.
Tandem mass spectrometry. MS is a method in which the mass of very small samples of a material can be determined.
Step 1: digest the target protein into short peptides. Step 2: subject the mixture of peptides to MS, so that the individual peptides are separated. Step 3: capture each individual peptide and fragment it into all its component peptides. Step 4: determine the mass of each component peptide. Step 5: deconvolute these data to reveal the sequence.
2-9 Proteomics. Proteomics is concerned with identifying the full set of proteins produced by a cell or tissue under a particular set of conditions.
Three principal methods: 1. 2-D gel electrophoresis for protein separation. 2. MS for the precise determination of a protein. 3. Bioinformatics technology.
1-14 Shotgun sequencing of a bacterial genome. The bacterium H. influenzae was the first free-living organism to have its complete genome sequenced and assembled. This organism was chosen because its genome is small (1.8 Mb) and compact.
Its whole genome was sheared into many random fragments with an average length of 1 kb. These pieces were cloned into a plasmid vector, and the clones were sequenced individually. All the sequence information was loaded into a computer. A powerful program then assembled the random DNA fragments on the basis of matching (overlapping) sequences, forming a single continuous assembly called a contig.
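The idea of assembling fragments by their matching sequences can be shown with a toy greedy overlap merger. This is only an illustrative sketch and not the program used for H. influenzae: it assumes error-free reads from a single strand, no repeats, and a hypothetical minimum overlap of 3 bases; the "genome" and reads are invented.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b
    (0 if it is shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for n, c in enumerate(contigs) if n not in (best_i, best_j)]
        contigs.append(merged)

# Made-up 20-base "genome" and three overlapping random fragments of it.
genome = "AGCTTGACCGTAAGGCTTCA"
reads = [genome[10:20], genome[0:9], genome[5:14]]
print(greedy_assemble(reads))
```

Repetitive sequence breaks the no-repeats assumption made here, because identical overlaps then join fragments that are not actually adjacent.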
To ensure that every nucleotide in the genome was captured in the final genome assembly, 30,000 to 40,000 clones are needed, which together contain about ten times as much sequence as the genome. This is called 10× sequence coverage. This method might seem tedious, but it is much faster and cheaper than the digestion-mapping-sequencing method, because the computer assembles sequences much faster than the chromosome can be mapped.
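The coverage figure can be checked with simple arithmetic: coverage = (number of reads × read length) / genome size. In the sketch below the read length of 500 bp and the count of 36,000 reads are assumptions chosen to fall within the range quoted above; they are not stated in the slides. The estimate of the uncovered fraction assumes reads land uniformly at random (a Poisson model).

```python
import math

def sequence_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced."""
    return n_reads * read_length / genome_size

def expected_uncovered_fraction(coverage: float) -> float:
    """Fraction of bases expected to be missed if reads land at random."""
    return math.exp(-coverage)

GENOME_SIZE = 1_800_000   # H. influenzae, 1.8 Mb
N_READS = 36_000          # assumed, within the 30,000 to 40,000 range
READ_LENGTH = 500         # assumed read length in bp

c = sequence_coverage(N_READS, READ_LENGTH, GENOME_SIZE)
print(c)
print(expected_uncovered_fraction(c))
```

With these assumed values the coverage comes out at 10×, and the expected fraction of bases missed by every read is very small, which is why high coverage is used.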
Analysis and uses of cloned DNA. J3 Polymerase chain reaction: J3-1 PCR; J3-2 The PCR cycle; J3-3 Template; J3-4 Primers; J3-5 Enzymes; J3-6 PCR optimization.
J3-1 PCR. The polymerase chain reaction (PCR) is used to amplify a sequence of DNA using a pair of primers, each complementary to one end of the DNA target sequence.
J3-2 The PCR cycle. Denaturation: the target DNA (template) is separated into two strands by heating to 95℃. Primer annealing: the temperature is reduced to around 55℃ to allow the primers to anneal. Polymerization (elongation, extension): the temperature is increased to 72℃ for optimal polymerization, which uses up dNTPs and requires Mg2+.
J2 nucleic acid sequencing. Steps of PCR. [Figure labels: template; primers; enzymes]
J3-3 Template. Any source of DNA that provides one or more target molecules can in principle be used as a template for PCR. Whatever the source of template DNA, PCR can only be applied if some sequence information is known, so that primers can be designed.
J3-4 Primers. PCR primers need to be about 18 to 30 nt long and have similar G+C contents so that they anneal to their complementary sequences at similar temperatures. They are designed to anneal on opposite strands of the target sequence. Tm = 2(A+T) + 4(G+C), in °C, is used to determine the annealing temperature. If the primer is 18-30 nt, the annealing temperature can be within about 5 °C of Tm.
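The Tm formula above (often called the Wallace rule) is easy to apply in code. A minimal sketch follows; the two primer sequences and the 5 °C matching threshold in `primers_compatible` are illustrative assumptions, not values from the slides.

```python
def wallace_tm(primer: str) -> int:
    """Tm = 2(A+T) + 4(G+C), in degrees Celsius."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc

def primers_compatible(fwd: str, rev: str, max_diff: int = 5) -> bool:
    """True if both primers are 18-30 nt and their Tm values are close."""
    lengths_ok = all(18 <= len(p) <= 30 for p in (fwd, rev))
    return lengths_ok and abs(wallace_tm(fwd) - wallace_tm(rev)) <= max_diff

# Hypothetical primer pair, for illustration only.
forward = "ATGCATGCATGCATGCAT"
reverse = "GCTAGCTAGCTAGCTAGC"
print(wallace_tm(forward), wallace_tm(reverse))
print(primers_compatible(forward, reverse))
```

This rule of thumb is a rough estimate intended for short oligonucleotides; more detailed thermodynamic models exist but are outside the scope of these notes.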
Degenerate primers: an oligo pool derived from a protein sequence. E.g. His-Phe-Pro-Phe-Met-Lys can generate the primer 5'-CAY TTY CCN TTY ATG AAR, where Y = pyrimidine, N = any base, R = purine.
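The size of the oligo pool behind a degenerate primer can be worked out by expanding each ambiguity code. The sketch below covers only the codes used in the example (Y, R, N) plus the four plain bases; the function names are illustrative.

```python
from itertools import product

# Ambiguity codes used in the example above: Y = pyrimidine, R = purine, N = any base.
CODES = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG", "N": "ACGT"}

def expand_degenerate(primer: str) -> list[str]:
    """List every concrete oligo in the pool described by a degenerate primer."""
    p = primer.replace(" ", "").upper()
    return ["".join(choice) for choice in product(*(CODES[b] for b in p))]

def degeneracy(primer: str) -> int:
    """Number of distinct oligos in the pool."""
    count = 1
    for b in primer.replace(" ", "").upper():
        count *= len(CODES[b])
    return count

primer = "CAY TTY CCN TTY ATG AAR"
pool = expand_degenerate(primer)
print(degeneracy(primer))  # 2 * 2 * 4 * 2 * 2 = 64
print(pool[:3])
```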
J3-5 and J3-6 Enzymes and PCR optimization. The most commonly used enzyme is Taq polymerase. It has no 3' to 5' proofreading exonuclease activity, so its accuracy is low and it is not ideal for cloning. To optimize PCR, we can change the annealing temperature and the Mg2+ concentration, or carry out nested PCR.
PCR optimization: I. Reverse transcriptase-PCR; II. Nested PCR.
Nested PCR. [Figure labels: gene of interest; first round primers; first round PCR; second round primers; second round PCR]
Reverse transcriptase-PCR (RT-PCR). [Figure: 5'-capped mRNA with poly(A) tail, AAA(A)n; a (dT)12~18 primer anneals to the tail; reverse transcriptase and dNTPs produce a cDNA:mRNA hybrid; regular PCR follows]