DNA in the chromosomes of the genome contains all the information to develop an organism and operate all its cell types.

Presentation transcript:

The Elements of Molecular Biology
A principal goal is to understand cells and organisms as molecular systems / machines. The basic classes of molecules are:
• DNA in the chromosomes of the genome contains all the information to develop an organism and operate all its cell types.
• RNA serves both short-term informational roles and structural roles.
• Proteins execute the functions of a cell and provide its structural integrity.
• Small metabolites (fats, sugars, etc.) provide energy and raw materials, and serve some limited structural roles.

Cells As Molecular Machines
[Diagram: a cell and its nucleus, labeled Genome, Gene, Transcription, Polymerase, TBF, mRNA, Splicing, Transport, Ribosome, Translation, Protein, Secretion, Receptor, Signal Cascade, Activation, and Metabolics: Synthesis, Degradation, Energy]

Understanding Cells at the Molecular Level
• Sequencing: Determining the DNA sequences of the chromosomes of a species.
• Annotation: An accurate parts list of all the proteins and RNAs in the cell.
• Pathways: A graph of all the interactions taking place between these agents.
• Function: What is happening during each interaction.
• Subcellular Localization: Where each interaction is taking place.
• Cellular Dynamics: Simulating the system to predict behavior.

Current State
• We can sequence the euchromatic portions of genomes.
• We can recognize 75% of the genes but not accurately unless they have been experimentally verified. We don’t know much about alternate splicing.
• We can crudely observe expression of mRNAs and with even greater difficulty observe the more abundant proteins.
• Most accurate molecular biological information is still being verified one hypothesis at a time.
• We must either coordinate efforts or reduce experimental costs to the point where each investigator is greatly empowered.

Current Technologies
• Sequencing: Randomly sample and sequence 600bp stretches from the ends of segments of a given length and assemble, followed by a directed finishing phase.
• Expression Assays: High-density arrays where each spot is a set of 18-50bp DNAs complementary to the RNA sequence to be measured, or geometric amplification from a pair of DNA probes complementary to the RNA sequence (quantitative PCR).
• Proteomics: Mass spectrometers can measure the amount and atomic weight of ionized protein pieces (peptides), allowing complex mixtures to be analyzed.
• Light Microscopy: With confocal microscopes and antibody, RNA, or organometallic staining, phenomena involving only a few particles are being observed.
• All of these technologies involve interesting problems in the interpretation of the data.
Data Analysis vs. Data Mining
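The "geometric amplification" behind quantitative PCR can be made concrete with a little arithmetic. The sketch below assumes an idealized reaction in which the product multiplies by a fixed factor every cycle (2.0 for perfect doubling), so a difference in threshold cycles between two samples translates into a fold difference in starting template. The function name, the `efficiency` parameter, and the cycle values are illustrative assumptions, not taken from the slides.

```python
def fold_difference(ct_sample: float, ct_reference: float,
                    efficiency: float = 2.0) -> float:
    """Estimate how many times more starting template the sample holds
    than the reference, given the cycle at which each crosses the same
    detection threshold.

    Assumption: the product multiplies by `efficiency` every cycle,
    so reaching the threshold n cycles earlier implies efficiency**n
    times more starting material.
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1 for amplification")
    return efficiency ** (ct_reference - ct_sample)


if __name__ == "__main__":
    # Hypothetical threshold cycles: the sample crosses 3 cycles earlier.
    print(fold_difference(ct_sample=20.0, ct_reference=23.0))
```

Real reactions rarely double perfectly, which is why the efficiency is left as a parameter rather than fixed at 2.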

On Systems Biology
• The goal is to understand cells and organisms as molecular systems. An objective that can be reached in the next years is the elucidation of the topology of the “circuit” diagram of a species’ cells.
  • A graph whose nodes are proteins, RNAs, DNAs, and metabolites.
  • Edges or hyperedges describe relationships, e.g. Catalyzes A->B, Kinase (adds phosphate group), Complexes with, etc.
  • Nodes carry attributes describing where and when the agent is present.
• We believe that such knowledge will result in tremendous “understanding” in that it will generate many hypotheses that are true.
• Many “circuits”, when modeled as stochastic differential equations, are so robust to their parameters that they have only a handful of operating ranges, and the biologically relevant one(s) is readily evident.
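One way to picture the "circuit" diagram described above is as a small data structure. The following is a minimal sketch, assuming a hyperedge is simply a relation label plus an ordered tuple of participating agents. All class names, field names, and the example molecules (`enzymeE`, `kinaseK`, and so on) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A node: a protein, RNA, DNA, or metabolite."""
    name: str
    kind: str                             # "protein", "RNA", "DNA", "metabolite"
    locations: frozenset = frozenset()    # where the agent is present
    conditions: frozenset = frozenset()   # when the agent is present


@dataclass(frozen=True)
class Interaction:
    """A hyperedge: one relation linking two or more agents."""
    relation: str        # e.g. "catalyzes", "kinase", "complexes_with"
    participants: tuple  # ordered agent names


class Circuit:
    def __init__(self):
        self.agents = {}
        self.interactions = []

    def add_agent(self, agent: Agent) -> None:
        self.agents[agent.name] = agent

    def add_interaction(self, relation: str, *names: str) -> None:
        # Refuse edges that mention agents missing from the graph.
        missing = [n for n in names if n not in self.agents]
        if missing:
            raise KeyError(f"unknown agents: {missing}")
        self.interactions.append(Interaction(relation, tuple(names)))

    def interactions_of(self, name: str) -> list:
        return [i for i in self.interactions if name in i.participants]


def build_example() -> Circuit:
    """Build a toy circuit with made-up agents."""
    c = Circuit()
    c.add_agent(Agent("enzymeE", "protein", frozenset({"cytoplasm"})))
    c.add_agent(Agent("A", "metabolite"))
    c.add_agent(Agent("B", "metabolite"))
    c.add_agent(Agent("kinaseK", "protein", frozenset({"nucleus"})))
    # "Catalyzes A->B" involves three agents, hence a hyperedge.
    c.add_interaction("catalyzes", "enzymeE", "A", "B")
    # "Kinase (adds phosphate group)" links the kinase to its target.
    c.add_interaction("kinase", "kinaseK", "enzymeE")
    return c


if __name__ == "__main__":
    for edge in build_example().interactions_of("enzymeE"):
        print(edge.relation, edge.participants)
```

Storing participants as a tuple rather than a pair is what lets a single edge express a three-way relationship such as an enzyme converting one metabolite into another.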

Apoptosis: Programmed cell death

A Proposal based on Drosophila
• Goal: Develop a comprehensive set of accurate genomics-based data and biological resources for the advancement of cellular biology. Make them openly and readily accessible and available to all.
1. Sequence species of Drosophila, 3-5 to a nearly-finished state and to an assembled state.
2. Understand the evolution of the genomes and accurately annotate the genes and regulatory signals using comparative informatics.
3. Develop a comprehensive, experimentally verified annotation of all possible transcripts of the nearly-finished species. Capture each in a gate-way clone library.
4. Build state of the art expression assays for these genomes and apply them in a number of surveys.
5. Utilize mass spec. technology to begin systematically exploring complex formation and signalling cascades of Drosophila cells.

Why Drosophila?
• Sufficiently interesting model: development, differentiation, behavior.
• Lines are easy to stock, and many experimental methods are available.
• Compact genome: 20 Drosophila genomes = 1 mammalian genome.
• Excellent evolutionary and molecular model for humans.
• Large community: coordinated efforts are possible.

Comparative Gene Finding
Given a ladder of 5-10 Drosophila genomes we should be able to accurately discern most protein encoding genes, many RNA genes, and a host of regulatory signals.
[Figure: evolutionary ladder of Drosophila species (sechellia, mauritiana, simulans, melanogaster, orena, teissieri, yakuba, mimetica, pseudoobscura) drawn against an axis marked 0M, 10M, 20M, 30M, 40M]
Require ab initio models to predict orthologs consistent with the evolutionary ladder amongst all strains.
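The intuition behind comparative gene finding is that functional sequence tends to stay similar across related species. The sketch below is not the ab initio ortholog model the slide calls for. It is a much simpler stand-in that scores each column of an already-aligned set of sequences by how many species share the most common base, then reports windows whose average score is high. The function names, the gap character `-`, and the toy sequences are assumptions made for illustration.

```python
def column_conservation(alignment: dict) -> list:
    """Return, for each alignment column, the fraction of species
    that carry the most common base (gaps count as disagreement)."""
    seqs = list(alignment.values())
    if not seqs:
        raise ValueError("alignment is empty")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")

    scores = []
    for column in zip(*seqs):
        bases = [b for b in column if b != "-"]
        if not bases:
            scores.append(0.0)
            continue
        top_count = max(bases.count(b) for b in set(bases))
        scores.append(top_count / len(column))
    return scores


def conserved_windows(scores: list, window: int, threshold: float) -> list:
    """Return start positions of windows whose mean score meets the threshold."""
    starts = []
    for i in range(len(scores) - window + 1):
        mean = sum(scores[i:i + window]) / window
        if mean >= threshold:
            starts.append(i)
    return starts


if __name__ == "__main__":
    # Made-up aligned fragments, labeled with species names for flavor only.
    toy = {
        "melanogaster":  "ACGT",
        "simulans":      "ACGA",
        "yakuba":        "ACTT",
        "pseudoobscura": "AC-T",
    }
    scores = column_conservation(toy)
    print(scores)
    print(conserved_windows(scores, window=2, threshold=0.9))
```

A real pipeline would weight species by evolutionary distance, since agreement between distant relatives is far stronger evidence than agreement between near-identical ones.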

The Role of Informatics
• We need to make computers easier to program; that is, we need to put scientific computing in the hands of the scientists.
• Our information management technologies are inadequate: data sets are huge, semi-structured, error-prone, and not integrated. We need to model these and develop flexible data mining capabilities over them.
• There will be a continued need for new algorithms and tools as driven by new technologies and protocols.
• Physical simulation systems of various types will be needed: docking, ligand binding, stochastic differential equations.
• Experimental design, driven by analysis and simulation, should be a part of our discipline and is an area where we can contribute but currently do not.
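As a small illustration of the stochastic differential equation simulations mentioned above, the sketch below integrates a one-variable model with the Euler-Maruyama method. The model itself (constant production, first-order degradation, additive noise) and every parameter value are assumptions chosen for illustration and do not come from the slides.

```python
import math
import random


def simulate_expression(production: float, degradation: float, noise: float,
                        x0: float = 0.0, dt: float = 0.01, steps: int = 5000,
                        seed: int = 0) -> list:
    """Euler-Maruyama integration of
        dX = (production - degradation * X) dt + noise dW
    returning the whole trajectory, including the starting value."""
    rng = random.Random(seed)      # seeded so that runs are reproducible
    x = x0
    path = [x]
    for _ in range(steps):
        # A Wiener increment over dt has mean 0 and std deviation sqrt(dt).
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + (production - degradation * x) * dt + noise * dw
        x = max(x, 0.0)            # clamp: an abundance cannot go negative
        path.append(x)
    return path


if __name__ == "__main__":
    path = simulate_expression(production=10.0, degradation=1.0, noise=0.5)
    tail = path[len(path) // 2:]
    print(sum(tail) / len(tail))   # hovers near production / degradation
```

With the noise set to zero the trajectory settles at production divided by degradation, which gives a simple check that the integrator behaves as intended before noise is switched on.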