Finding Motifs in Restriction Enzyme Sequences


Finding Motifs in Restriction Enzyme Sequences
August Staubus
Bio 131

Restriction Enzymes
Cut DNA
160,337
Seemingly unrelated
Image sources: cengage.com, rcsb.org

Motif!
Discovered by looking at structures (Thielking et al.)

Data
Downloaded all 160,337 sequences

High-Level Code
Choose the first k-mer from the first protein sequence
Compute a profile for this k-mer
Choose the profile-most-probable k-mer from the next protein sequence
Compute a profile for the k-mers chosen so far
Compute the consensus of the selected k-mers
Compute the score of the selected k-mers
Repeat until the profile-most-probable k-mer has been selected for each sequence
(Ritz 2017)
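The greedy procedure on this slide can be sketched in Python roughly as follows. This is a minimal illustration, not the presenter's code: the function names, the pseudocount added to the profile, and the mismatch-count score are all assumptions introduced here, and every sequence is assumed to be at least k residues long.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(kmers, pseudocount=1.0):
    """Per-position residue frequencies (pseudocount is an assumption)."""
    total = len(kmers) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for column in zip(*kmers):
        counts = Counter(column)
        profile.append({aa: (counts[aa] + pseudocount) / total
                        for aa in AMINO_ACIDS})
    return profile

def kmer_probability(kmer, profile):
    """Probability of a k-mer under the profile; unknown residues get 0."""
    prob = 1.0
    for pos, aa in enumerate(kmer):
        prob *= profile[pos].get(aa, 0.0)
    return prob

def most_probable_kmer(sequence, k, profile):
    """Profile-most-probable k-mer; ties go to the leftmost k-mer."""
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    return max(kmers, key=lambda kmer: kmer_probability(kmer, profile))

def consensus(kmers):
    """Most common residue in each column of the selected k-mers."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*kmers))

def mismatch_score(kmers):
    """Assumed score: total mismatches against the consensus (lower is better)."""
    cons = consensus(kmers)
    return sum(a != b for kmer in kmers for a, b in zip(kmer, cons))

def greedy_motif_pass(sequences, k):
    """One greedy pass: seed with the first k-mer of the first sequence."""
    chosen = [sequences[0][:k]]
    for seq in sequences[1:]:
        profile = build_profile(chosen)  # profile for the k-mers chosen so far
        chosen.append(most_probable_kmer(seq, k, profile))
    return chosen, consensus(chosen), mismatch_score(chosen)

chosen, cons, score = greedy_motif_pass(["DLEAK", "GGDLE", "ADLEK"], 3)
print(chosen, cons, score)
```

The pseudocount keeps a profile built from a single k-mer from assigning zero probability to every k-mer that differs from it, which would otherwise make the choice in the second sequence depend only on tie-breaking.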

High-Level Code
Randomize the order of the protein sequences
Choose a random k-mer from the first protein sequence
Compute a profile for this k-mer
Choose the profile-most-probable k-mer from the next protein sequence
Compute a profile for the k-mers chosen so far
Compute the consensus of the selected k-mers
Compute the score of the selected k-mers based on amino acid mutation table
Repeat until the profile-most-probable k-mer has been selected for each sequence
Repeat i times
Repeat n times
(Ritz 2017)
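A rough Python sketch of this randomized variant is shown below. Several parts are assumptions made for illustration only: the slide does not say which amino acid mutation table was used, so a toy grouping of residues (the hypothetical GROUPS list) stands in for it; the score is taken to be a similarity sum where higher is better; and because the slide does not define how the two repeat counts i and n nest, the sketch uses a single restart loop controlled by a hypothetical runs parameter.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical stand-in for an amino acid mutation table: residues are grouped
# by rough physicochemical class. This is NOT the table used in the slides.
GROUPS = ["AVLIMFWC", "STNQYGP", "DE", "KRH"]
GROUP_OF = {aa: idx for idx, group in enumerate(GROUPS) for aa in group}

def similarity(a, b):
    """2 for identical residues, 1 for same toy group, 0 otherwise."""
    if a == b:
        return 2
    if a in GROUP_OF and GROUP_OF.get(a) == GROUP_OF.get(b):
        return 1
    return 0

def build_profile(kmers, pseudocount=1.0):
    """Per-position residue frequencies (pseudocount is an assumption)."""
    total = len(kmers) + pseudocount * len(AMINO_ACIDS)
    return [{aa: (Counter(col)[aa] + pseudocount) / total for aa in AMINO_ACIDS}
            for col in zip(*kmers)]

def most_probable_kmer(sequence, k, profile):
    """Profile-most-probable k-mer; ties go to the leftmost k-mer."""
    def prob(kmer):
        p = 1.0
        for pos, aa in enumerate(kmer):
            p *= profile[pos].get(aa, 0.0)
        return p
    return max((sequence[i:i + k] for i in range(len(sequence) - k + 1)), key=prob)

def consensus(kmers):
    """Most common residue in each column of the selected k-mers."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*kmers))

def table_score(kmers):
    """Sum of toy-table similarities to the consensus (higher is better here)."""
    cons = consensus(kmers)
    return sum(similarity(a, b) for kmer in kmers for a, b in zip(kmer, cons))

def randomized_greedy_search(sequences, k, runs, seed=None):
    """Restart the greedy pass `runs` times; keep the best-scoring result."""
    rng = random.Random(seed)
    best = None
    for _ in range(runs):
        order = list(sequences)
        rng.shuffle(order)                       # randomize sequence order
        start = rng.randrange(len(order[0]) - k + 1)
        chosen = [order[0][start:start + k]]     # random seed k-mer
        for seq in order[1:]:
            chosen.append(most_probable_kmer(seq, k, build_profile(chosen)))
        score = table_score(chosen)
        if best is None or score > best[0]:
            best = (score, consensus(chosen), chosen)
    return best

print(randomized_greedy_search(["AADLEAA", "GGDLEGG", "KKDLEKK"], 3, runs=25, seed=0))
```

Passing a fixed seed makes a run repeatable, which helps when comparing results across different numbers of restarts.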

Results
Best 3-mer   Score   Best 2-mer   Score   Number of runs (i)
DLE          8       DE           5       25
AND          11      FR                   50
RKG          12      HR                   100

[Figure: 3-mer (above), 2-mer (below)]