Alternative Splicing As an introduction to microarrays.


Human Genome
About 90,000 human proteins are known, so the number of genes was initially assumed to be similar (initial estimates ran to 153,000).
The roughly 1,000-cell roundworm Caenorhabditis elegans has 19,500 genes; corn has 40,000 genes.
Current estimates for humans are 25,000 genes or fewer.
Alternative splicing allows different tissue types to perform different functions with the same assortment of genes.

Implications
About 75% of human genes are subject to alternative splicing.
Faulty splicing can lead to cancer and congenital diseases.
Gene therapy can make use of splicing.

Application
We talked before about apoptosis, which is triggered when the cell determines that it can’t be repaired.
Bcl-x, a regulator of apoptosis, is alternatively spliced to produce either Bcl-x(L), which suppresses apoptosis, or Bcl-x(S), which promotes it.

Spliceosome
Five snRNA molecules (U1, U2, U4, U5, U6) combine with as many as 150 proteins to form the spliceosome.
It recognizes the sites where introns begin and end:
  – cuts introns out of the pre-mRNA
  – joins the exons

Spliceosome
The 5’ splice site is at the beginning of the intron; the 3’ splice site is at the end.
The average human protein-coding gene is […] nucleotides long, with 8.8 exons separated by 7.8 introns.
Exons are about 120 nucleotides long, while introns are […],000 nucleotides long.
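
The cut-and-join step can be sketched in a few lines. This is only a toy illustration, not a model of the spliceosome: the sequence, the intron coordinates, and the function name `splice` are all made up for the example. Introns are written in lower case and follow the common pattern of starting with GU at the 5’ site and ending with AG at the 3’ site.

```python
def splice(pre_mrna, introns):
    """Remove introns (0-based, end-exclusive spans) and join the remaining exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # exon lying before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Hypothetical pre-mRNA: three exons (upper case) and two introns (lower case).
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUCGGA" + "guaugccccuag" + "UAA"

# Constitutive splicing: both introns are removed, all three exons are kept.
mature = splice(pre_mrna, [(6, 18), (24, 36)])

# Exon skipping: the middle exon is removed together with its flanking introns.
skipped = splice(pre_mrna, [(6, 36)])

print(mature)   # AUGGCCUUCGGAUAA
print(skipped)  # AUGGCCUAA
```

The second call shows how one pre-mRNA can yield two different mRNAs simply by changing which splice sites are paired.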

Splicing errors
Familial dysautonomia results from a single-nucleotide mutation that causes a gene to be alternatively spliced in nervous system tissue.
The resulting decrease in the IKBKAP protein leads to abnormal nervous system development (half of patients die before age 30).
More than 15% of the gene mutations that cause genetic diseases and cancers act through splicing errors.

Why splicing
Each gene generates about 3 alternatively spliced mRNAs.
Why so much intron? Only 1-2% of the genome is exons.
Mouse and human differences are almost all in splicing.
Half of the human genome is made up of transposable elements, Alus being the most abundant (1.4 million copies).
  – They continue to multiply and insert themselves into the genome at a rate of one insertion per 100 human births.
A mutation in an Alu can create a 5’ or 3’ splice site within an intron, causing part of the intron to become an exon.
Such a mutation doesn’t affect the existing exons.
It has an effect only when the new exon is alternatively spliced in.

Microarrays For Alt. Splicing
Use short oligonucleotides.
Get an estimate of the expression level of the sequence each oligo targets.
Figure: gene model with Exon 1 to Exon 5.

Affymetrix Microarrays For Alt. Splicing
Figure: gene model with Exon 1 to Exon 5 and its two isoforms.
Isoform 1: Exon 1, Exon 2, Exon 4, Exon 5
Isoform 2: Exon 1, Exon 3, Exon 5
Probe types: constitutive, junction, exon, unique (“cassette”)

Ideal Microarray Readings
Isoform 1: Exon 1, Exon 2, Exon 4, Exon 5
Isoform 2: Exon 1, Exon 3, Exon 5
Probe types: constitutive, exon, junction, unique (“cassette”)
Figure: probes a to e marked on the isoforms, with a plot of expression against probe.
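
To make the idea of “ideal” readings concrete, here is a minimal sketch under strong assumptions: each probe's signal is taken to be exactly the summed abundance of the isoforms that contain its exon or junction, with no noise or cross-hybridization. The isoform structures come from the slide; the abundance values and the function name `expected_signals` are hypothetical.

```python
from collections import defaultdict


def expected_signals(isoforms, abundance):
    """Return idealized exon-probe and junction-probe signals.

    A probe's signal is the total abundance of every isoform that
    contains the exon (or exon-exon junction) the probe targets.
    """
    exon_signal = defaultdict(float)
    junction_signal = defaultdict(float)
    for name, exons in isoforms.items():
        level = abundance[name]
        for exon in exons:
            exon_signal[exon] += level
        for left, right in zip(exons, exons[1:]):  # adjacent exons form a junction
            junction_signal[(left, right)] += level
    return dict(exon_signal), dict(junction_signal)


# Isoform structures as listed on the slide.
isoforms = {"isoform1": [1, 2, 4, 5], "isoform2": [1, 3, 5]}
# Hypothetical abundances, in arbitrary units.
abundance = {"isoform1": 3.0, "isoform2": 1.0}

exon_signal, junction_signal = expected_signals(isoforms, abundance)
print(exon_signal)      # constitutive exons 1 and 5 see both isoforms
print(junction_signal)  # each junction is specific to one isoform
```

In this idealized picture, probes on the constitutive exons (1 and 5) report the total expression of the gene, while exon-specific and junction probes separate the two isoforms.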

Motivation
Why alternatively splice? How does it affect the resulting proteins?
Look at domains:
  – a high-level summary of a protein
  – ~80% of eukaryotic proteins are multi-domain
  – domains are big relative to an exon

Some Previous Work
Signatures of domain shuffling in the human genome (Kaessmann)
  – Intron phase symmetry around domain boundaries
The Effects of Alternative Splicing On Transmembrane Proteins in the Mouse Genome (Cline)
  – Half of the TM proteins studied are affected by alt-splicing.

Method
Predict alternative splicing.
Predict protein domains.
Look for effects of alt-splicing on the predicted domains:
  – “Swapping”
  – “Knockout”
  – “Clipping”

Microarray Design
Genes based on mRNA and EST data in mouse
Mapped to Feb mouse genome freeze
~500,000 probes (~66,000 sets)
~100,000 transcripts
~13,000 gene models

Technical work
Figure: data-flow diagram. Labels: Genome Space, transcripts, probes (Provided data); Overlap gene models, cc-chr, Probe to transcript mapping (Generated Data).

Predicting Alternative Splicing
Using mouse alt-splicing microarrays
Data from Manny Ares:
  – 8 tissues
  – 3 replicates of each tissue

Predicting Alternative Splicing
General approach: clustering, then anti-clustering
Figure: 107 clusters, with a detail view.

Gene Expression Measurement
mRNA expression reflects the dynamic state of the cell.
mRNA expression can be measured with current technology.
mRNA is isolated and labeled with a fluorescent dye.
The labeled sample is hybridized to the probes on the array; the level of hybridization corresponds to the fluorescence emitted under laser excitation, which is measured by a scanner.

Gene Expression Microarrays
The main types of gene expression microarrays:
  – short oligonucleotide arrays (Affymetrix)
  – cDNA or spotted arrays (Brown/Botstein)
  – long oligonucleotide arrays (Agilent Inkjet)
  – fiber-optic arrays
  – ...

Affymetrix Microarrays
Figure: raw image of a chip, with scale labels 50 µm and 1.28 cm.
~10^7 oligonucleotides: half perfectly match the mRNA (PM), half have one mismatch (MM).
Raw gene expression is the intensity difference: PM - MM.
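
The PM - MM rule is easy to work through numerically. The sketch below assumes that the per-gene value is the plain average of the differences over a gene's probe pairs; the slide only states the difference itself, so the averaging step, the function name `raw_expression`, and the intensity values are illustrative assumptions.

```python
def raw_expression(probe_pairs):
    """Average the PM - MM intensity differences over a gene's probe pairs."""
    differences = [pm - mm for pm, mm in probe_pairs]
    return sum(differences) / len(differences)


# Hypothetical (PM, MM) intensities for one gene's probe set.
pairs = [(1200.0, 300.0), (950.0, 410.0), (1800.0, 520.0), (400.0, 450.0)]

value = raw_expression(pairs)
print(value)  # 667.5
```

Note that an individual difference can be negative when the mismatch probe is brighter than the perfect-match probe (the last pair above), so raw values below zero are possible.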

Microarray Potential Applications
Biological discovery
  – new and better molecular diagnostics
  – new molecular targets for therapy
  – finding and refining biological pathways
Recent examples
  – molecular diagnosis of leukemia, breast cancer, ...
  – appropriate treatment for genetic signature
  – potential new drug targets

Microarray Data Analysis Types
Gene Selection
  – find genes for therapeutic targets
  – avoid false positives (FDA approval ?)
Classification (Supervised)
  – identify disease
  – predict outcome / select best treatment
Clustering (Unsupervised)
  – find new biological classes / refine existing ones
  – exploration
  – …

Microarray Data Mining Challenges
Too few records (samples), usually < 100
Too many columns (genes), usually > 1,000
Too many columns are likely to lead to false positives.
For exploration, a large set of all relevant genes is desired.
For diagnostics or identification of therapeutic targets, the smallest set of genes is needed.
The model needs to be explainable to biologists.
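
The false-positive problem can be quantified with simple arithmetic. As a rough sketch: if every gene is tested independently at significance level alpha and none is truly different, about `n_genes * alpha` genes will still pass by chance. The Bonferroni threshold shown alongside is one standard correction, chosen here only for illustration; the slides do not name a method, and the function names and the 0.05 level are assumptions.

```python
def expected_false_positives(n_genes, alpha):
    """Expected number of genes passing by chance when no gene truly differs."""
    return n_genes * alpha


def bonferroni_threshold(n_genes, alpha):
    """Per-gene significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_genes


n_genes = 1000  # "usually > 1,000" columns
alpha = 0.05    # assumed per-test significance level

false_hits = expected_false_positives(n_genes, alpha)
threshold = bonferroni_threshold(n_genes, alpha)
print(false_hits)  # about 50 genes flagged by chance alone
print(threshold)   # per-gene level of about 0.00005
```

With fewer than 100 samples, such a strict per-gene threshold is hard to reach, which is part of why small, well-validated gene sets are preferred for diagnostics.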

Microarray Data Classification
Figure: microarray chips, images scanned by laser, and datasets; a data mining model takes a new sample and gives a prediction: ALL or AML.

Gene              Value
D26528_at           193
D26561_cds1_at      -70
D26561_cds2_at      144
D26561_cds3_at       33
D26579_at           318
D26598_at          1764
D26599_at          1537
D26600_at          1204
D28114_at           707
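
A supervised classifier of this kind can be sketched with a nearest-centroid rule, which is a deliberately simple stand-in and not necessarily the model used in the original study. The three gene IDs and the new sample's values are taken from the slide; the training profiles and their ALL/AML labels are entirely hypothetical, so the printed label illustrates the mechanics only and carries no biological meaning.

```python
import math

GENES = ["D26528_at", "D26579_at", "D26598_at"]


def centroid(samples):
    """Mean expression profile of a list of samples (one value per gene)."""
    n = len(samples)
    return [sum(values) / n for values in zip(*samples)]


def predict(sample, centroids):
    """Return the class label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(sample, centroids[label]))


# Hypothetical training profiles, values ordered as in GENES.
training = {
    "ALL": [[200.0, 300.0, 1700.0], [180.0, 350.0, 1900.0]],
    "AML": [[900.0, 100.0, 400.0], [1100.0, 150.0, 600.0]],
}
centroids = {label: centroid(samples) for label, samples in training.items()}

# Values for the three genes as listed on the slide.
new_sample = [193.0, 318.0, 1764.0]
label = predict(new_sample, centroids)
print(label)
```

A model this simple is also easy to explain to biologists: the prediction is just "which class average does this sample most resemble".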