Study of Arabidopsis’ Copper Regulation by High Throughput Sequence Data Analysis Steven A. Cardenas, SoCal BSI Dr. Pellegrini, PI, UCLA Dr. Casero Diaz-Cano,

Presentation transcript:

Study of Arabidopsis’ Copper Regulation by High Throughput Sequence Data Analysis
Steven A. Cardenas, SoCal BSI
Dr. Pellegrini, PI, UCLA
Dr. Casero Diaz-Cano, Post Doc, UCLA

Objective of Project
• Analyze Sets of Genes Differentially Expressed Between Plus and Minus Copper Conditions in Arabidopsis WT
• Identify SPL7-Regulated Genes
• Find Potential Upstream Motifs That Regulate These Genes

Project Significance
• To Further the Development of Techniques Used in High Throughput Analysis
• To Study Copper Regulation in Arabidopsis
• These Data Could Help Increase Our Understanding of Copper Regulation in the Human Body

Outline of Presentation
• Arabidopsis thaliana
• Tools Used
• Solexa Sequencing
• Low Level Data Analysis
• Downstream Data Analysis
• Future Work

Arabidopsis thaliana
• A Small Flowering Plant Related to Cabbage and Mustard
• Found in Europe, Asia, and Northwestern Africa
• The First Plant Genome to Be Sequenced, and It Is Well Annotated

Tools Used
• TAIR
• SOAP
• MATLAB
• Excel

Solexa Sequencing
1. Prepare Genomic DNA Sample
2. Attach DNA to Flow Cell Surface
3. Amplification
4. Determine First Base
5. Image First Base
6. Determine Second Base
7. Sequence Reads Over Multiple Chemistry Cycles

Illumina mRNA Sample Preparation by Whole Transcriptome Analysis (WTA)
[Workflow diagram. Step labels, in protocol order:]
1. Poly(A) mRNA (AAAA)
2. Metal Catalyzed Fragmentation
3. Random Hexamer Primed 1st Strand cDNA Synthesis
4. 2nd Strand cDNA Synthesis
5. End Repair and Adaptor Ligation
6. Size Selection (200 bp)
7. PCR
8. Sequence (33 nt sequence)
Other values shown on the diagram: 60 – 200 nt; > 250 – 500 Mb

Experimental Conditions of Analyzed Data
• Genotypes: Arabidopsis Wild Type, spl7 Mutant
• Tissues (Each Genotype): Root Cell, Shoot Cell
• Conditions: +Cu and -Cu

Data Analysis
[Pipeline diagram: Solexa Data → Align Data to TAIR RefSeq → Analysis Steps → Spreadsheet of Results]
Analysis Steps:
• Calculate Hits per Gene
• Normalize
• Regularize
• Check for Reproducibility
• Differentially Expressed Gene Statistical Analysis
• SPL7 Motif Statistical Analysis
Tools Shown: SOAP, MATLAB, Excel
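The normalization and fold-change steps of this pipeline can be sketched roughly as follows. This is only an illustration in Python, not the MATLAB code used in the project. It assumes that "Normalize" means scaling each sample to hits per million (the unit used in the results slides) and that "Regularize" means adding a pseudocount before taking log ratios; the gene names and counts are hypothetical.

```python
import math

def hits_per_million(hits_per_gene):
    """Scale raw per-gene alignment hits so that each sample sums to one million."""
    total = sum(hits_per_gene.values())
    return {gene: 1e6 * hits / total for gene, hits in hits_per_gene.items()}

def log2_fold_change(plus_cu, minus_cu, pseudocount=1.0):
    """log2(-Cu / +Cu); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((minus_cu + pseudocount) / (plus_cu + pseudocount))

# Hypothetical raw hit counts for two genes in two samples.
raw_plus_cu = {"geneA": 50, "geneB": 950}
raw_minus_cu = {"geneA": 400, "geneB": 1600}

hpm_plus = hits_per_million(raw_plus_cu)
hpm_minus = hits_per_million(raw_minus_cu)
fold = {g: log2_fold_change(hpm_plus[g], hpm_minus[g]) for g in raw_plus_cu}
```

Scaling to a common total makes samples with different sequencing depths comparable before any fold change is computed.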

Data Reproducibility
[Scatter plot: Arabidopsis WT Root Cell, Minus Copper Condition. X axis: Replicate 1 (Alignment Hits per Million). Y axis: Replicate 2 (Alignment Hits per Million).]
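A replicate-versus-replicate comparison like this one can be summarized with a single number. The sketch below computes a Pearson correlation on log-transformed hits per million; the slides do not say which measure was used, so the log transform, the pseudocount, and the sample values are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def replicate_correlation(rep1, rep2, pseudocount=1.0):
    """Correlate log10(hits per million + pseudocount) between two replicates."""
    log1 = [math.log10(v + pseudocount) for v in rep1]
    log2 = [math.log10(v + pseudocount) for v in rep2]
    return pearson(log1, log2)

# Hypothetical hits-per-million values for four genes in two replicates.
replicate_1 = [0.0, 9.0, 99.0, 999.0]
replicate_2 = [1.0, 7.0, 120.0, 900.0]
r = replicate_correlation(replicate_1, replicate_2)
```

Working in log space keeps a few very highly expressed genes from dominating the correlation.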

Statistical Analysis for Differential Expression
• Differential Expression of Genes in Plus Copper vs. Minus Copper
• Statistical Problems
  • Only Two Replicates
  • Large Dynamic Range of Data
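To make the two-replicate problem concrete, here is a minimal sketch of a pooled-variance Student's t-test for two conditions with two replicates each. With only two degrees of freedom the two-sided p-value has a closed form, so no statistics library is needed. The function name and the numbers are hypothetical, and the reading of the difficulty given after the code is an interpretation rather than a statement from the slides.

```python
import math

def two_replicate_t_test(group_a, group_b):
    """Pooled-variance t-test for two groups of exactly two replicates each."""
    if len(group_a) != 2 or len(group_b) != 2:
        raise ValueError("this sketch handles exactly two replicates per group")
    mean_a = sum(group_a) / 2
    mean_b = sum(group_b) / 2
    # Sample variance with n - 1 = 1 is just the sum of squared deviations.
    var_a = sum((v - mean_a) ** 2 for v in group_a)
    var_b = sum((v - mean_b) ** 2 for v in group_b)
    pooled = (var_a + var_b) / 2           # 2 degrees of freedom in total
    std_err = math.sqrt(pooled * (1 / 2 + 1 / 2))
    if std_err == 0:
        return math.inf, 0.0               # identical replicates: variance estimate collapses
    t = (mean_b - mean_a) / std_err
    # Two-sided p-value of the t distribution with 2 degrees of freedom.
    p = 1 - abs(t) / math.sqrt(2 + t * t)
    return t, p

# Hypothetical hits per million: a weakly and a strongly expressed gene.
t_low, p_low = two_replicate_t_test([10, 12], [40, 44])
t_high, p_high = two_replicate_t_test([10000, 12000], [40000, 44000])
```

Both genes get the same t statistic, because the test depends only on the ratio of the mean difference to the replicate scatter. It cannot tell a gene supported by a handful of reads from one supported by thousands, and a variance estimated from two values is very noisy.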

Statistical Analysis for Differential Expression
• Student’s T-test
  • Fails With Large Dynamic Range
• Bayesian T-test
  • Makes Use of Genes With Similar Expression Levels
  • Currently Still Fails With Large Dynamic Range
• Binomial Test
  • Combined Replicates
  • Fails When Reproducibility Is Bad
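A binomial test on combined replicates can be sketched as below. The idea is to pool the hits for a gene across replicates in each condition and ask whether the split between +Cu and -Cu is consistent with a fixed null proportion. The slides give no implementation details, so the function names, the null proportion of 0.5 (equal sequencing depth in both conditions), and the counts are all assumptions.

```python
import math

def log_binom_pmf(k, n, p):
    """Natural log of the binomial probability of k successes in n trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def binomial_two_sided_p(k, n, p=0.5):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    observed = log_binom_pmf(k, n, p)
    total = 0.0
    for i in range(n + 1):
        log_prob = log_binom_pmf(i, n, p)
        if log_prob <= observed + 1e-9:    # small tolerance for floating-point ties
            total += math.exp(log_prob)
    return min(1.0, total)

def combined_replicate_log10_p(plus_cu_hits, minus_cu_hits, p=0.5):
    """Pool integer hit counts across replicates, then return log10 of the p-value."""
    k = sum(plus_cu_hits)
    n = k + sum(minus_cu_hits)
    p_value = binomial_two_sided_p(k, n, p)
    return math.log10(p_value) if p_value > 0 else -math.inf

# Hypothetical raw hits for one gene, two replicates per condition.
log10_p = combined_replicate_log10_p([0, 0], [5, 5])
```

Because the replicates are summed before testing, any disagreement between them is invisible to the test, which is in line with the slide's remark that this approach fails when reproducibility is bad. For very large counts the p-value can underflow to zero, in which case this sketch returns negative infinity.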

Top Differentially Expressed Genes with Binomial Test (Root)
[Table; the numeric cell values are not present in the transcript.]
Values: Reference mRNA Sequence Hits per Million (Unique Hits)
WT Columns: +Cu GAN1, +Cu GAN5, -Cu GAN2, -Cu GAN6, log10(P value) Bayesian, log10(P value) Binomial, log10(P value) t-test, log2(fold change)
spl7 Columns: +Cu GAN4, +Cu GAN8, -Cu GAN3, -Cu GAN7, log10(P value) Bayesian, log10(P value) Binomial, log10(P value) t-test, log2(fold change)
Genes:
• Glycosyl Hydrolase Family 17 Protein (AT4G16260)
• Copper Ion Transporter (COPT2)
• Copper Chaperone (CCH)
• Ferric-Chelate Reductase (ATFRO5/FRO5)
• Zinc Ion Transporter (ZIP2)
• Peroxidase, Putative (AT1G49570)
• Pentatricopeptide (PPR) Repeat-Containing Protein (AT1G07590)
• Manganese Ion Binding (GLP5)
• Copper Ion Binding (UCC2)
• Peroxidase, Putative (AT5G19890)
Min: Bayesian Binomial -inf Student T-test -5.63

Motif Analysis: The First Approach
1. Select Potential Targets of Transcription Factor SPL7
2. Retrieve Promoter Sequences From the Genome
3. Calculate Word Count for SPL7 Motif
4. Statistical Test (Background Distribution Derived From Word Counts in the Whole Genome)
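The word-count approach can be sketched as follows. The slides name neither the motif word nor the statistical test, so both are assumptions here: the word GTAC stands in for the SPL7 motif, and a Poisson upper tail is used as one simple way to compare the observed count with a genome-wide background rate. The sequences are toy data.

```python
import math

def count_word(sequence, word):
    """Count occurrences of word in sequence, allowing overlaps."""
    count, start = 0, 0
    while True:
        index = sequence.find(word, start)
        if index == -1:
            return count
        count += 1
        start = index + 1

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for a Poisson variable with mean `expected`."""
    if observed == 0:
        return 1.0
    if expected == 0:
        return 0.0
    below = sum(math.exp(-expected + k * math.log(expected) - math.lgamma(k + 1))
                for k in range(observed))
    return max(0.0, 1.0 - below)

def motif_enrichment(promoters, genome, word):
    """Compare the word count in promoters with the genome-wide background rate."""
    genome_positions = len(genome) - len(word) + 1
    background_rate = count_word(genome, word) / genome_positions
    promoter_positions = sum(max(0, len(p) - len(word) + 1) for p in promoters)
    observed = sum(count_word(p, word) for p in promoters)
    expected = background_rate * promoter_positions
    return observed, expected, poisson_upper_tail(observed, expected)

# Toy data: a 100 nt "genome" and two short "promoters" (hypothetical).
toy_genome = "A" * 96 + "GTAC"
toy_promoters = ["GTACGTAC", "TTGTACTT"]
observed, expected, p_value = motif_enrichment(toy_promoters, toy_genome, "GTAC")
```

GTAC reads the same on both strands, so counting one strand is enough for this word; a non-palindromic word would also need its reverse complement counted.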

Future Work
• Research New Statistical Methods to Better Identify Differentially Expressed Genes
  • Use of Non Fixed Window for Bayesian T-test
• Finish Analysis of Motifs That Regulate the Differentially Expressed Genes
• Identify Transcribed Non Coding RNAs (e.g. microRNAs)

Acknowledgements
• UCLA and the Pellegrini Lab
  • Dr. Matteo Pellegrini
  • Dr. David Casero Díaz-Cano
  • Dr. Shawn Cokus
• Collaborators
  • Ute Krämer, University of Heidelberg, Germany
  • Sabeeha Merchant, University of California Los Angeles
• SoCalBSI Instructors and Fellow Researchers
  • Dr. Jamil Momand
  • Dr. Sandy Sharp
  • Dr. Nancy Warter-Perez
  • Dr. Wendie Johnston
  • Dr. Beverly Krilowicz
  • Dr. Silvia Heubach
  • Dr. Jennifer Faust
• Funding
  • National Institutes of Health
  • National Science Foundation
  • Economic & Workforce Development
  • The Department of Energy