ICML-Tutorial, Banff, Canada, 2004

Gene Regulation
Systems biology. Gene expression is a two-phase process:
1. The gene is transcribed into mRNA (measured by gene expression microarrays).
2. The mRNA is translated into protein.
Genes that are similarly expressed are often co-regulated and involved in the same cellular processes.
Clustering: identification of clusters of genes and/or experiments that share similar expression patterns. [Segal et al.]
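To make the clustering idea concrete, here is a minimal sketch that groups genes whose expression profiles are highly correlated. It is not the method of Segal et al.: the Pearson correlation, the greedy single-pass grouping, the 0.9 threshold, and the gene names and values are all assumptions chosen for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes neither is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def cluster_genes(profiles, threshold=0.9):
    """Greedy grouping: a gene joins the first cluster whose first member
    correlates with it at or above the threshold, else it starts a new cluster."""
    clusters = []
    for gene, profile in profiles.items():
        for cluster in clusters:
            representative = profiles[cluster[0]]
            if pearson(profile, representative) >= threshold:
                cluster.append(gene)
                break
        else:
            clusters.append([gene])
    return clusters

# Hypothetical expression levels of four genes over four arrays.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 4.0, 2.2],
}
clusters = cluster_genes(profiles)
print(clusters)
```

Here geneA and geneB rise together while geneC and geneD fall together, so the sketch yields two clusters. Note that the similarity is computed over all measurements at once, which is exactly the limitation discussed on the next slide.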

Gene Regulation
Systems biology: heterogeneous data.
Limitations of clustering:
– Similarities are computed over all measurements.
– Background knowledge, such as clinical data or experimental details, is difficult to incorporate.
[Segal et al.]

Gene Regulation: relational context
[Figure: relational schema; Segal et al., simplified representation]
– Expression: ExpressionLevel/1, inArray/2, ofGene/2
– Array: ArrayPhase/1, ArrayCluster/1 (array cluster)
– Gene: GeneCluster/1 (gene cluster)
– Gene features, such as function, localization, ...: Lipid/1, AminoAcidMetabolism/1, Cytoplasm/1, GCN4/1

Gene Regulation
Synthetic data: 1000 genes, 90 arrays (= measurements); each gene has 15 functions and 30 transcription factors. [Segal et al.]
Cluster recovery:
                        Naive Bayes    PRMs
Simulated data          90.8 ± …       … ± 1.07
Noisy simulated data    76.7 ± …       … ± 1.52

Gene Regulation
Real-world data: predicting the array cluster of an array without performing the experiment.
– A link is introduced between arrays and genes.
– This is outside the scope of other approaches.
[Segal et al.]

Protein Fold Recognition
Comparison of protein structures is fundamental to biology, e.g. for function prediction.
Two proteins that show sufficient sequence similarity essentially adopt the same structure.
If one of the two similar proteins has a known structure, one can build a rough model of the protein of unknown structure. [Kersting et al.; Kersting, Gaertner]

Protein Secondary Structure
A secondary structure is written as a logical sequence of helix and strand atoms:
[helix(h(right,3to10),5), helix(h(right,alpha),13), strand(null,7), strand(minus,7), strand(minus,5), helix(h(right,3to10),5),…]
– helix(type of helix, length)
– strand(orientation, length)
– length: quantized number of acids
[Kersting et al.; Kersting, Gaertner]
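The logical-sequence notation can be read mechanically. The following is a minimal sketch of a parser for such terms, assuming well-formed input; the function names parse_term and parse_sequence and the tuple representation are hypothetical, not part of the tutorial's software.

```python
def parse_term(text, pos=0):
    """Parse one term starting at text[pos]; return (term, next_pos).
    An atom becomes a str; a compound term becomes (functor, [args]).
    Assumes well-formed input with balanced parentheses."""
    start = pos
    while pos < len(text) and text[pos] not in "(),":
        pos += 1
    name = text[start:pos].strip()
    if pos < len(text) and text[pos] == "(":
        args = []
        pos += 1  # skip "("
        while True:
            arg, pos = parse_term(text, pos)
            args.append(arg)
            if text[pos] == ",":
                pos += 1  # another argument follows
            elif text[pos] == ")":
                pos += 1  # end of this compound term
                break
        return (name, args), pos
    return name, pos

def parse_sequence(text):
    """Parse a comma-separated sequence of terms into a list."""
    terms, pos = [], 0
    while pos < len(text):
        term, pos = parse_term(text, pos)
        terms.append(term)
        if pos < len(text) and text[pos] == ",":
            pos += 1
    return terms

# The six atoms listed on the slide (the trailing "…" is left out).
sequence = parse_sequence(
    "helix(h(right,3to10),5), helix(h(right,alpha),13), strand(null,7), "
    "strand(minus,7), strand(minus,5), helix(h(right,3to10),5)"
)
# The last argument of every atom is its quantized length.
total_length = sum(int(args[-1]) for _, args in sequence)
print(sequence[0], total_length)
```

Each atom ends up as a (functor, arguments) pair, so the first element is ("helix", [("h", ["right", "3to10"]), "5"]), and the quantized lengths of the six listed atoms sum to 42.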

Model: ~120 parameters vs. over … parameters
Data: secondary structure of domains of proteins (from PDB and SCOP)
– fold1: TIM beta/alpha barrel fold
– fold2: NAD(P)-binding Rossmann-fold
– fold23: Ribosomal protein L4
– fold37: glucosamine 6-phosphate deaminase/isomerase fold
– fold55: leucine aminopeptidase fold
… logical sequences (> … ground atoms) [Kersting et al.]

Results
Accuracy: 74% vs. 82.7% (1622 vs. … / 2187)
Majority vote: 43%
             fold1       fold2    fold23   fold37   fold55
precision    0.86 / …    … / …    … / …    … / …    … / 0.74
recall       0.78 / …    … / …    … / …    … / …    … / 0.86
New class of relational kernels (see Thomas Gaertner's tutorial on Kernels for Structured Data). [Kersting et al.; Kersting, Gaertner]
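As a quick check of the reported figures, the sketch below recomputes the first accuracy value, assuming that 1622 is the number of correctly classified sequences out of 2187. The precision and recall helpers use the standard definitions; the counts passed to them are hypothetical and are not taken from the slide.

```python
def accuracy(correct, total):
    """Fraction of examples classified correctly."""
    return correct / total

def precision(true_pos, false_pos):
    """Fraction of predicted members of a class that really belong to it."""
    return true_pos / (true_pos + false_pos)

def recall(true_pos, false_neg):
    """Fraction of true members of a class that were found."""
    return true_pos / (true_pos + false_neg)

# Counts from the slide: 1622 of 2187 (assumed to mean correct / total).
first_accuracy = accuracy(1622, 2187)
print(f"{first_accuracy:.1%}")

# Hypothetical per-fold counts, for illustration only.
print(precision(true_pos=8, false_pos=2), recall(true_pos=8, false_neg=2))
```

The first value comes out at about 74.2%, consistent with the 74% stated above.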

mRNA
Science magazine: RNA was one of the runner-up breakthroughs of the year.
Task: identifying subsequences in mRNA that are responsible for biological functions.
Secondary structures of mRNAs form tree structures, which are not easily modeled by HMMs. [Kersting et al.; Kersting, Gaertner]

mRNA [Kersting et al.; Kersting, Gaertner]

mRNA
93 logical sequences (in total 3122 ground atoms):
– 15 and 5 SECIS (Selenocysteine Insertion Sequence),
– 27 IRE (Iron Responsive Element),
– 36 TAR (Trans Activating Region) and
– 10 histone stem-loops.
Leave-one-out cross-validation:
– Plug-in estimates: 4.3% error
– Fisher kernels SVM: 2.2% error
[Kersting et al.; Kersting, Gaertner]
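The leave-one-out protocol behind these error rates can be sketched as follows. This is only an illustration of the evaluation loop: the 1-nearest-neighbour classifier stands in for the plug-in and Fisher-kernel SVM classifiers (it is not the method used in the tutorial), and the one-dimensional toy data and all function names are hypothetical.

```python
def leave_one_out_error(examples, train, predict):
    """Hold out each example once, train on the rest, and return the
    fraction of held-out examples that are misclassified."""
    errors = 0
    for i, (features, label) in enumerate(examples):
        training_set = examples[:i] + examples[i + 1:]
        model = train(training_set)
        if predict(model, features) != label:
            errors += 1
    return errors / len(examples)

# Stand-in classifier: 1-nearest neighbour on a single numeric feature.
def train_1nn(training_set):
    return list(training_set)

def predict_1nn(model, x):
    nearest = min(model, key=lambda example: abs(example[0] - x))
    return nearest[1]

# Hypothetical toy data: (feature value, class label).
toy = [
    (1.0, "IRE"), (1.2, "IRE"), (0.9, "IRE"),
    (5.0, "TAR"), (5.3, "TAR"), (2.9, "TAR"),
]
error = leave_one_out_error(toy, train_1nn, predict_1nn)
print(f"{error:.3f}")
```

In the toy data only the example at 2.9 is misclassified when held out, so the leave-one-out error is 1/6. With 93 sequences, as on the slide, the same loop would train 93 models, each on the remaining 92 sequences.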

Web Log Data
Log data of web sites
KDDCup 200 ( RMM over [Anderson et al.]

User Log Data [Anderson et al.]

Collaborative Filtering
User preference relationships for products / information.
Traditionally: a single dyadic relationship between the objects.
[Figure: person classes classPers1, classPers2, ..., classPersN and product classes classProd1, classProd2, ..., classProdM, linked by buys11, buys12, ..., buysNM]
[Getoor, Sahami]

Relational Naive Bayes Collaborative Filtering
[Figure: relational schema; Getoor, Sahami; simplified representation]
– Attributes: classPers/1, incomePers/1, classProd/1, colorProd/1, costProd/1, reputationCompany/1, topicPage/1, topicPeriodical/1
– Relations: buys/2, subscribes/2, visits/2, manufactures