Lecture 1 BNFO 135 Usman Roshan. Course overview Perl programming language (and some Unix basics) –Unix basics –Intro Perl exercises –Programs for comparing.


Lecture 1 BNFO 135 Usman Roshan

Course overview
Perl programming language (and some Unix basics)
–Unix basics
–Intro Perl exercises
–Programs for comparing DNA and protein sequences
Sequence analysis
–Pairwise and multiple sequence comparison
–Sequence alignments
–Application of alignments
–Heuristic alignment (BLAST)

Overview (contd)
Grade: 40% programming assignments, 30% mid-term, and 30% final exam
Recommended texts:
–Perl for Bioinformatics by Arun Jagota
–Introduction to Bioinformatics by Arthur Lesk

Nothing in biology makes sense, except in the light of evolution
[Figure: evolutionary tree with a time axis running from -3 mil yrs to today. The ancestral sequence AAGACTT evolves through the intermediate sequences T_GACTT, AAGGCTT, _GGGCTT, TAGACCTT, and A_CACTT into the present-day sequences TAGGCCTT (Human), TAGCCCTTA (Monkey), ACCTT (Cat), ACACTTC (Lion), and GGCTT (Mouse). With gaps marked by "_", the last three appear as A_C_CTT (Cat), A_CACTTC (Lion), and _G_GCTT (Mouse).]

Representing DNA in a format that computers can manipulate
DNA is a double-helix molecule made up of four nucleotides, named after their bases:
–Adenine (A)
–Cytosine (C)
–Thymine (T)
–Guanine (G)
Since A (adenine) always pairs with T (thymine) and C (cytosine) always pairs with G (guanine), knowing only one side of the ladder is enough.
We represent DNA as a sequence of letters, where each letter is A, C, G, or T. For example, the helix shown here would be represented as CAGT.
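The pairing rule means the second strand can be computed from the first. The following is a minimal sketch in Python (the course itself uses Perl); the function names `is_dna`, `complement`, and `reverse_complement` are illustrative and do not come from the slides. The reverse complement is included on the assumption that the paired strand is read in the opposite direction, as is conventional.

```python
# Base pairing: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_dna(seq):
    """Return True if seq is non-empty and uses only the letters A, C, G, T."""
    return len(seq) > 0 and all(base in COMPLEMENT for base in seq)


def complement(seq):
    """Return the paired strand, position by position."""
    return "".join(COMPLEMENT[base] for base in seq)


def reverse_complement(seq):
    """Return the paired strand read in the opposite direction."""
    return complement(seq)[::-1]


strand = "CAGT"  # the example sequence from the slide
print(is_dna(strand))              # True
print(complement(strand))          # GTCA
print(reverse_complement(strand))  # ACTG
```

Storing one strand as a plain string is enough here because the other strand is always recoverable with a lookup table.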

Transcription and translation

Amino acids
Proteins are chains of amino acids.
There are twenty different amino acids, which chain together in different orders to form different proteins.
For example: FLLVALCCRFGH (this is how we could store it in a file).
This sequence of amino acids folds to form a 3-D structure.
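A protein stored this way is simply a string over a 20-letter alphabet. The sketch below, in Python (the course itself uses Perl), checks a string against the standard one-letter amino acid codes and counts how often each letter occurs. The names `AMINO_ACIDS`, `is_protein`, and `composition` are hypothetical helpers chosen for illustration.

```python
# The 20 standard amino acids, by their one-letter codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def is_protein(seq):
    """Return True if seq is non-empty and uses only the 20 standard codes."""
    return len(seq) > 0 and all(residue in AMINO_ACIDS for residue in seq)


def composition(seq):
    """Count how many times each amino acid letter occurs in seq."""
    counts = {}
    for residue in seq:
        counts[residue] = counts.get(residue, 0) + 1
    return counts


protein = "FLLVALCCRFGH"  # the example sequence from the slide
print(is_protein(protein))        # True
print(len(protein))               # 12
print(composition(protein)["L"])  # 3
```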

Protein folding

The protein folding problem is to determine the 3-D structure of a protein from its sequence.
Experimental techniques are very expensive.
Computational methods are cheap, but the problem is difficult to solve.
By comparing sequences we can deduce the evolutionarily conserved portions, which are also functional (most of the time).
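One simple way to look for conserved portions is to line sequences up and report the positions where every sequence has the same letter. The sketch below, in Python (the course itself uses Perl), does this for three equal-length DNA sequences taken from the evolution figure earlier in the lecture. Treating them as a ready-made alignment is an assumption made for illustration, and `conserved_columns` is a hypothetical helper, not a method from the slides. The same idea applies to protein sequences.

```python
def conserved_columns(aligned):
    """Return the column indices where all sequences share the same letter.

    Sequences must already be aligned to the same length; "_" marks a gap,
    and a column containing a gap is never reported as conserved.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [
        i
        for i in range(length)
        if aligned[0][i] != "_" and all(seq[i] == aligned[0][i] for seq in aligned)
    ]


# Three 7-letter sequences from the tree figure, assumed to be aligned.
sequences = ["AAGACTT", "AAGGCTT", "T_GACTT"]
print(conserved_columns(sequences))  # [2, 4, 5, 6]
```

Real alignments have to be computed first, which is the subject of the sequence alignment part of the course; this sketch only covers the comparison step that follows.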

Protein structure
Primary structure: the sequence of amino acids.
Secondary structure: parts of the chain organize themselves into alpha helices, beta sheets, and coils. Helices and sheets are usually evolutionarily conserved and can aid sequence alignment.
Tertiary structure: the 3-D structure of the entire chain.
Quaternary structure: a complex of several chains.

Key points
DNA can be represented as strings over four letters: A, C, G, and T. These strings can be very long, e.g. thousands or even millions of letters.
Proteins are also represented as strings, over an alphabet of 20 letters (each letter is an amino acid). Their 3-D structure determines their function to a large extent.