BME 110L / BIOL 181L Computational Biology Tools www.soe.ucsc.edu/classes/bme110/Winter09 February 19: In-class exercise: a phylogenetic tree for that.

Slides:



Advertisements
Similar presentations
Application to find Eukaryotic Open reading frames. Lab.
Advertisements

An Introduction to Bioinformatics Finding genes in prokaryotes.
Genomics: READING genome sequences ASSEMBLY of the sequence ANNOTATION of the sequence carry out dideoxy sequencing connect seqs. to make whole chromosomes.
Polymerase Chain Reaction (PCR). PCR produces billions of copies of a specific piece of DNA from trace amounts of starting material. (i.e. blood, skin.
Ab initio gene prediction Genome 559, Winter 2011.
V) BIOTECHNOLOGY.
1 Computational Molecular Biology MPI for Molecular Genetics DNA sequence analysis Gene prediction Gene prediction methods Gene indices Mapping cDNA on.
Finding Eukaryotic Open reading frames.
Gene Prediction Methods G P S Raghava. Prokaryotic gene structure ORF (open reading frame) Start codon Stop codon TATA box ATGACAGATTACAGATTACAGATTACAGGATAG.
CISC667, F05, Lec18, Liao1 CISC 467/667 Intro to Bioinformatics (Fall 2005) Gene Prediction and Regulation.
. Class 1: Introduction. The Tree of Life Source: Alberts et al.
BIOS816/VBMS818 Lecture 7 – Gene Prediction Guoqing Lu Office: E115 Beadle Center Tel: (402) Website:
Gene Identification Lab
ECE 501 Introduction to BME
Single DNA Sequence Analysis Tools BME 110: CompBio Tools Todd Lowe May 6, 2008.
Lecture 12 Splicing and gene prediction in eukaryotes
Chapter 9: Biotechnology
Finding prokaryotic genes and non intronic eukaryotic genes
© Wiley Publishing All Rights Reserved. Working with a Single DNA Sequence.
Chapter 6 Gene Prediction: Finding Genes in the Human Genome.
DNA Technology- Cloning, Libraries, and PCR 17 November, 2003 Text Chapter 20.
International Livestock Research Institute, Nairobi, Kenya. Introduction to Bioinformatics: NOV David Lynn (M.Sc., Ph.D.) Trinity College Dublin.
Biotechnology SB2.f – Examine the use of DNA technology in forensics, medicine and agriculture.
DNA Technology Chapter 20.
NCBI Review Concepts Chuong Huynh. NCBI Pairwise Sequence Alignments Purpose: identification of sequences with significant similarity to (a)
BME 110L / BIOL 181L Computational Biology Tools October 29: Quickly that demo: how to align a protein family (10/27)
Doug Raiford Lesson 3.  Have a fully sequenced genome  How identify the genes?  What do we know so far? 10/13/20152Gene Prediction.
1 The Interrupted Gene. Ex Biochem c3-interrupted gene Introduction Figure 3.1.
Genome Organization & Evolution. Chromosomes Genes are always in genomic structures (chromosomes) – never ‘free floating’ Bacterial genomes are circular.
BME 110L / BIOL 181L Computational Biology Tools Introductory Remarks and Overview - who - why - what - how Logistics.
DNA sequencing. Dideoxy analogs of normal nucleotide triphosphates (ddNTP) cause premature termination of a growing chain of nucleotides. ACAGTCGATTG ACAddG.
Genomics: Gene prediction and Annotations Kishor K. Shende Information Officer Bioinformatics Center, Barkatullah University Bhopal.
Chapter 21 Eukaryotic Genome Sequences
Recombinant DNA Technology and Genomics A.Overview: B.Creating a DNA Library C.Recover the clone of interest D.Analyzing/characterizing the DNA - create.
Pattern Matching Rhys Price Jones Anne R. Haake. What is pattern matching? Pattern matching is the procedure of scanning a nucleic acid or protein sequence.
Srr-1 from Streptococcus. i/v nonpolar s serine (polar uncharged) n/s/t polar uncharged s serine (polar uncharged) e glutamic acid (neg. charge) sserine.
Comp. Genomics Recitation 9 11/3/06 Gene finding using HMMs & Conservation.
From Genomes to Genes Rui Alves.
Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell.
Class material and homework for February 9 today’s in-class topic: selected examples of contemporary biotechnology –polymerase chain reaction (PCR) –DNA.
KEY CONCEPT Biotechnology relies on cutting DNA at specific places.
BME 110L / BIOL 181L Computational Biology Tools Introductory Remarks and Overview - who - why - what - how Logistics.
Genome annotation and search for homologs. Genome of the week Discuss the diversity and features of selected microbial genomes. Link to the paper describing.
Basic Overview of Bioinformatics Tools and Biocomputing Applications II Dr Tan Tin Wee Director Bioinformatics Centre.
Genes and Genomes. Genome On Line Database (GOLD) 243 Published complete genomes 536 Prokaryotic ongoing genomes 434 Eukaryotic ongoing genomes December.
How can we find genes? Search for them Look them up.
Annotation of Drosophila virilis Chris Shaffer GEP workshop, 2006.
JIGSAW: a better way to combine predictions J.E. Allen, W.H. Majoros, M. Pertea, and S.L. Salzberg. JIGSAW, GeneZilla, and GlimmerHMM: puzzling out the.
Polymerase Chain Reaction (PCR). PCRPCR PCR produces billions of copies of a specific piece of DNA from trace amounts of starting material. (i.e. blood,
Johnson - The Living World: 3rd Ed. - All Rights Reserved - McGraw Hill Companies Genomics Chapter 10 Copyright © McGraw-Hill Companies Permission required.
DNA Technology Ch. 20. The Human Genome The human genome has over 3 billion base pairs 97% does not code for proteins Called “Junk DNA” or “Noncoding.
Genome Annotation Assessment in Drosophila melanogaster by Reese, M. G., et al. Summary by: Joe Reardon Swathi Appachi Max Masnick Summary of.
A Molecular Toolkit AP Biology Fall The Scissors: Restriction Enzymes  Bacteria possess restriction enzymes whose usual function is to cut apart.
(H)MMs in gene prediction and similarity searches.
Finding genes in the genome
1 Applications of Hidden Markov Models (Lecture for CS498-CXZ Algorithms in Bioinformatics) Nov. 12, 2005 ChengXiang Zhai Department of Computer Science.
BIOINFORMATICS Ayesha M. Khan Spring 2013 Lec-8.
The genetic engineers toolkit A brief overview of some of the techniques commonly used.
1 Gene Finding. 2 “The Central Dogma” TranscriptionTranslation RNA Protein.
bacteria and eukaryotes
Genome Annotation (protein coding genes)
Biotechnology.
What is a Hidden Markov Model?
Lecture 8 A toolbox for mechanistic biologists (continued)
The student is expected to: (6H) describe how techniques such as DNA fingerprinting, genetic modifications, and chromosomal analysis are used to study.
Ab initio gene prediction
Introduction to Bioinformatics II
Reading Frames and ORF’s
Genome Annotation and the Human Genome
Introduction to Alternative Splicing and my research report
Presentation transcript:

BME 110L / BIOL 181L Computational Biology Tools February 19: In-class exercise: a phylogenetic tree for that protein family… Leaving the world of proteins again… A few words on Lab-work relevant tools we’ve encountered NEXT WEEK: Review of midterm Qs (for part/most of the week) This evening: Homework 3, PART I (Parts I and II due on Thu March 4 BEFORE class) Accompanying Reading (B4D): Chp 5

Most likely in the lab, you’ll be working with DNA sequence… - trying to figure out what a DNA-sequence you determined is about (we’ve talked about this plenty) - OR you’ll want to engineer (mutate, express, target) a region of DNA that you find particularly interesting Some of the tools you encountered in HW2 aim to help you in these tasks - “lab tools” are likely to be among those you’ll use the most… We assume that you have a grounding in these very basic topics of molecular biology - a quick refresher on the following slides, + a few gene-finding programs. (See also Ch 5, B4D)

Basic/Old ORF Finding in Bacteria 1. Look for “long”open reading frames (ORFs) 2. Scan sequence at nucleotide #1 (Frame #1), begin ORF at first start codon: ATG (too simple - why?) 3. Continue scanning to first stop codon: TAA, TGA, or TAG 4. That is your ORF! 5. Repeat, starting at nuc#2 (Frame #2), then again, starting at nuc#3 (Frame #3) 6. Take reverse complement of sequence -> (i.e. 5’-CGAAC -> 5’-GTTCG) 7. Scan sequence starting at nuc#1 (Frame –1), nuc#2 (Frame –2), nuc#3 (Frame –3)

Historical Note: Annotating the Yeast Genome Yeast is a eukaryote, but does not have many introns A strict cutoff for ORF length was used: minimum of 300 nucleotides ORF required to be considered a protein-encoding gene in original genome annotation (1996) Since then, many smaller ORFs have been found experimentally

More Sophisticated Methods Can analyze entire genome at once, use codon frequencies, not just one gene GeneMark ( Also, Glimmer from TIGR ( Not available on-line, one must download these programs and run them locally Both methods build on Hidden Markov Model (HMM)-methodology, in which a computer program is trained to distinguish coding from non-coding DNA sequences (humans don’t have to know the rules).

Eukaryotic Gene Finders Introns do not necessarily preserve reading frame (e.g.Exon#1 uses frame +2 Exon#2 uses frame +1 Exon#3 uses frame +2) No start codon at beginning of internal exons, so we must “guess” where exon/intron junction is Note that there are sequence preferences at pre-mRNA splice sites (e.g. most introns begin with GU, end with AG) but as always it’s only that, a preference/trend… Very difficult by hand, so computer programs are required Eukaryotic Gene Finding On-line: GenomeScan (builds on GenScan(HMMs; splice sites etc) and checks up on probability of BLASTX matches against a protein DB)

Gene Scanner Tracks on the Human Genome at UCSC AceView, Twinscan, N-SCAN, GeneID, GeneID, Genscan, Pseudogenes All use codon frequencies, splice site prediction, and other sources of information (Blastxhits, cDNA information, comparative genomics) As more experimental data is collected, less reliance on purely computational predictions

New England BioLabsTools (could be useful to try out) NEBcutter–Display which restriction enzyme cut in your sequence, and where REBsites–Display a “virtual”digest of your DNA, showing how it would look on an agarose gel

Designing PCR Primers

Primer3 (as we used in HW2) Key Input: 1. Sequence 2. Targets (region to be amplified) 3. Product Size Ranges 4. Primer size (usually bp) 5. Primer Tm (annealing temperature) (Rest are usually OK as defaults)