Modeling and Understanding Stress Response Mechanisms with Expresso Ruth G. Alscher Lenwood S. Heath Naren Ramakrishnan Virginia Tech, Blacksburg, VA 24061.

Presentation transcript:

Modeling and Understanding Stress Response Mechanisms with Expresso Ruth G. Alscher Lenwood S. Heath Naren Ramakrishnan Virginia Tech, Blacksburg, VA NSF Site Visit NCSU Forest Biotechnology Group July 12, 2001

Who’s Who
Institutions: Virginia Tech; North Carolina State Univ.
Plant Biology
–Ruth Alscher: Plant Stress
–Boris Chevone: Plant Stress
–Ron Sederoff, Ross Whetten, Len van Zyl, Y-H. Sun: Forest Biotechnology
–Dawei Chen: Molecular Biology, Bioinformatics
Computer Science
–Lenwood Heath (CS): Algorithms
–Naren Ramakrishnan (CS): Data Mining, Problem Solving Environments
–Craig Struble, Vincent Jouenne (CS): Image Analysis
Statistics
–Ina Hoeschele (DS): Statistical Genetics
–Keying Ye (STAT): Bayesian Statistics

People: Ross Whetten, Boris Chevone, Ron Sederoff, Y-H. Sun, Dawei Chen, Lenny Heath, Ruth Alscher, Vincent Jouenne, Naren Ramakrishnan, Keying Ye, Len van Zyl, Craig Struble

Overview
Plant responses to environmental stress
Stress on a chip
Summary of results obtained
Expresso
–Managing expression experiments
–Analyzing expression data
–Reaching conclusions
Where we go from here
–Modeling experiments
–Modeling pathways

Plant-Environment Interactions Several defense systems that respond to environmental stress are known. Their relative importance is not known. Mechanistic details are not known. Redox sensing may be involved.

Scenarios for Effect of Abiotic Stress on Plant Gene Expression

The 1999 Experiment: A Measure of Long Term Adaptation to Drought Stress
Loblolly pine seedlings (two unrelated genotypes, “C” and “D”) were subjected to mild or severe drought stress for four (mild) or three (severe) cycles.
–Mild stress: needles dried down to –10 bars; little effect on growth, new flushes as in control trees.
–Severe stress: needles dried down to –17 bars; growth retardation, fewer new flushes compared to controls.
Harvest RNA at the end of the growing season and determine patterns of gene expression on DNA microarrays.
With algorithms incorporated into Expresso, identify genes and groups of genes involved in stress responses.

Hypotheses There is a group of genes whose expression confers resistance to drought stress. Based on previous work (RGA and others for superoxide dismutases and glutathione reductases) increased expression of defense genes is co-regulated and is correlated with resistance to oxidative stress. Failure to cope is correlated with little or no defense gene activation. A common core of defense genes exists, which responds to several different stresses.

Selection of cDNAs for Arrays 384 ESTs (xylem, shoot tip cDNAs of loblolly) were chosen on the basis of function and grouped into categories. Major emphasis was on processes known to be stress responsive. In cases where more than one EST had similar BLAST hits, all ESTs were used.

Categories within Protective and Protected Processes Plant Growth Regulation Environmental Change Gene Expression Signal Transduction Protective Processes Protected Processes ROS and Stress Cell Wall Related Phenylpropanoid Pathway Development Metabolism Chloroplast Associated Carbon Metabolism Respiration and Nucleic Acids Mitochondrion Cells Tissues Cytoskeleton Secretion Trafficking Nucleus Protease-associated

Hypotheses versus Results
Among the genes responding to mild stress, there exists a population of genes whose expression confers resistance.
–Genes in 69 categories responded positively to mild stress in Genotypes C and D (the positive response was not observed in the severe stress condition in Genotype D).
There is evidence for a response to drought among genes associated with other stresses.
–Isoflavone reductase homologs and GSTs responded positively to mild drought stress.
–These categories are previously documented to respond to biotic stress and xenobiotics, respectively.

Quality Control
Positive: LP-3, a gene known to respond positively to drought stress in loblolly pine, was included. LP-3 was positive in the moist versus mild comparison, and unchanged in the moist versus severe comparison.
Negative: Four clones of human genes used as negative controls in the Arabidopsis Functional Genomics project were included. The clones did not respond.

Candidate Categories
Include
–Aquaporins
–Dehydrins
–Heat shock proteins/chaperones
Exclude
–Isoflavone reductases

Expresso: A Problem Solving Environment (PSE) for Microarray Experiment Design and Analysis
Integration of design and procedures
Integration of image analysis tools and statistical analysis
Connections to web database and sequence alignment tools
The software Aleph was used for inductive logic programming (ILP).

Expresso: A Microarray Experiment Management System

Inductive Logic Programming ILP is a data mining algorithm expressly designed for inferring relationships. By expressing relationships as rules, it provides new information and resultant testable hypotheses. ILP groups related data and chooses in favor of relationships having short descriptions. ILP can also flexibly incorporate a priori biological knowledge (e.g., categories and alternate classifications).
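The preference for relationships with short descriptions can be illustrated with a small scoring sketch. This is not the scoring function the authors or Aleph necessarily used. The `CandidateRule` type, the `compression_score` formula (positive cover minus negative cover minus body length), and the cover counts below are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateRule:
    """A hypothetical candidate rule: head :- body[0], body[1], ..."""
    head: str
    body: List[str]
    pos_cover: int  # examples the rule explains correctly
    neg_cover: int  # examples the rule explains incorrectly


def compression_score(rule: CandidateRule) -> int:
    # Assumed heuristic: reward correct cover, penalize errors and long bodies.
    return rule.pos_cover - rule.neg_cover - len(rule.body)


def best_rule(candidates: List[CandidateRule]) -> CandidateRule:
    # Pick the candidate with the highest score.
    return max(candidates, key=compression_score)


# Two made-up candidates with identical cover; only the body length differs.
short_rule = CandidateRule(
    head="level(A,moist_vs_mild,positive)",
    body=["category(A,heat)"],
    pos_cover=10,
    neg_cover=1,
)
long_rule = CandidateRule(
    head="level(A,moist_vs_mild,positive)",
    body=["category(A,heat)", "level(A,mild_vs_severe,negative)"],
    pos_cover=10,
    neg_cover=1,
)

chosen = best_rule([long_rule, short_rule])
print(chosen.body)
```

With equal cover, the rule with the single-literal body scores higher, which is the behavior the slide describes as choosing in favor of short descriptions.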

Rule Inference in ILP
Infers rules relating gene expression levels to categories, both within a probe pair and across probe pairs, without explicit direction.
Example Rule:
[Rule 142] [Pos cover = 69 Neg cover = 3]
level(A,moist_vs_severe,not positive) :- level(A,moist_vs_mild,positive).
Interpretation: “If the moist versus mild stress comparison was positive for some clone named A, it was negative or unchanged in the moist versus severe comparison for A, with a confidence of 95.8%.”
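The 95.8% confidence quoted for Rule 142 is consistent with dividing the positive cover by the total cover, 69 / (69 + 3). The sketch below reproduces that arithmetic; the function name `rule_confidence` is hypothetical, and reading confidence as pos / (pos + neg) is an assumption that happens to match the slide's numbers.

```python
def rule_confidence(pos_cover: int, neg_cover: int) -> float:
    """Fraction of covered examples for which the rule head also holds.

    Assumes confidence = pos_cover / (pos_cover + neg_cover).
    """
    total = pos_cover + neg_cover
    if total == 0:
        raise ValueError("rule covers no examples")
    return pos_cover / total


# Cover counts reported for Rule 142 on the slide.
confidence = rule_confidence(69, 3)
print(f"{confidence:.1%}")  # prints 95.8%
```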

More Rules We Obtained
[Rule 6]
level(A,moist_vs_mild,positive) :- category(A, transport_protein).
level(A,mild_vs_severe,negative) :- category(A, transport_protein).
[Rule 13]
level(A,moist_vs_mild,positive) :- category(A, heat).
[Rule 17]
level(A,moist_vs_mild,positive) :- category(A, cellwallrelated).
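Rules of the form `level(A, Comparison, Level) :- category(A, Category)` can be checked against data by counting how many clones in the category show the stated level (positive cover) and how many do not (negative cover). The sketch below does this on a made-up table; the clone names, their levels, and the `cover` helper are hypothetical and are not taken from the experiment.

```python
from typing import Dict, Tuple

# Hypothetical clones: name -> (functional category, level in moist_vs_mild).
CLONES: Dict[str, Tuple[str, str]] = {
    "clone_a": ("heat", "positive"),
    "clone_b": ("heat", "positive"),
    "clone_c": ("heat", "unchanged"),
    "clone_d": ("cellwallrelated", "positive"),
    "clone_e": ("transport_protein", "negative"),
}


def cover(category: str, level: str,
          clones: Dict[str, Tuple[str, str]] = CLONES) -> Tuple[int, int]:
    """Return (pos_cover, neg_cover) for the rule
    level(A, moist_vs_mild, level) :- category(A, category)."""
    pos = neg = 0
    for clone_category, observed_level in clones.values():
        if clone_category != category:
            continue  # the rule body does not apply to this clone
        if observed_level == level:
            pos += 1
        else:
            neg += 1
    return pos, neg


# Evaluate the shape of Rule 13 on the toy table.
print(cover("heat", "positive"))
```

An ILP system searches over many such candidate rules rather than evaluating one at a time, so this shows only the evaluation step.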

ILP subsumes two forms of reasoning
Unsupervised learning
–“Find clusters of genes that have similar/consistent expression patterns”
Supervised learning
–“Find a relationship between a priori functional categories and gene expression”
Hybrid reasoning
–“Is there a relationship between genes in a given functional category and genes in a particular expression cluster?”
–ILP mines this information in a single step

ILP in a Data Mining Context
Attribute-Value Methods; Clustering; Conceptual Clustering; SVMs; SOMs; Similarity-Metric; Agglomerative (bottom-up); Divisive (top-down)
ILP combines the expressiveness of conceptual clustering with the efficiency of attribute-value techniques.

Current Status of Expresso
Completely automated and integrated
–Statistical analysis
–Data mining
–Experiment capture in MEL
Current Work: Integrating
–Image processing
–Querying by semi-structured views
–Expresso-assisted experiment composition

Future Directions: Next Generation Stress Chips
1. Further work on Expresso and pine cDNA microarray experiments recently funded by an NSF Next Generation Software grant.
2. Time course, short and long term, to capture gene expression events underlying “emergency” and adaptive events following drought stress imposition. (Use all currently available pine ESTs for candidate stress resistance genes.)
3. Initiate modeling of kinetics of drought stress responses.
4. Generate cDNA library from stressed seedlings.

Future Directions: Expresso
An open, integrated system for design, process, analysis, data mining, data storage, and integration of information from web-based resources.
Supports closing the experimental loop. Accumulated results influence later experiments, as well as enable construction of testable models of pathways.
Multiple models are refined and evaluated within Expresso.
Biologists have interactive access to models and control Expresso’s components.