‘Omics’ - Analysis of high dimensional Data


1 ‘Omics’ - Analysis of high dimensional Data
Achim Tresch Computational Biology

2 Schedule
Monday
Lecture: Introduction to Omics
Motivation: Transcriptomics; experimental techniques; data analysis (overview)
Data exploration of univariate data: measures of location and scale; bar plot, box plot, histogram, density plot
Data exploration of bivariate data: odds ratio, correlation; crosstable, scatter plot, QQ-plot
Sebastian Dümcke, Henrik Failmezger
Exercises: Introduction to R and Bioconductor; forensic bioinformatics
Arijit Das

3 Omics
Omics (Wikipedia):
Omics informally refers to a field of study in biology such as genomics, proteomics or metabolomics. Omics aims at the collective characterization and quantification of pools of biologically / biochemically similar molecules that translate into the structure, function, and dynamics of an organism or organisms.
Ingredients for omics research:
High-throughput experimental techniques for the simultaneous measurement of large numbers of molecules
Statistical methods for the appropriate analysis of high-dimensional data
Generally, omics data analysis takes longer than data generation!

4 Genomics: Transcriptomics
Techniques for RNA quantification
Low-medium throughput: Northern blot, reporter genes, reverse transcriptase PCR
High throughput: microarrays, RNA-sequencing

5 Northern Blot
RNA (or DNA) is separated by size on a gel, transferred to a membrane and hybridized with a gene-specific probe.
RNA -> Northern blot
DNA -> Southern blot
Low throughput and poor quantification
Molecular Biology of the Cell (© Garland Science 2008)

6 RT-PCR
RNA -> reverse transcription -> DNA -> PCR
The course of the PCR (amount of double-stranded DNA) is monitored using a specific fluorescent dye.
Differences in the concentration of a particular mRNA in different samples can be calculated as 2^N, with N being the difference in the number of cycles needed to obtain the same amount of product.
Medium throughput, high precision
Molecular Biology of the Cell (© Garland Science 2008)
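The rule that concentrations differ by a factor of 2 to the power N can be illustrated with a minimal sketch. It assumes that the product doubles exactly once per cycle; the function name and the cycle numbers are hypothetical and not taken from the slides.

```python
def fold_change_from_cycles(cycles_sample_a, cycles_sample_b):
    """Relative mRNA concentration of sample A versus sample B.

    Assumption: the amount of product doubles exactly once per PCR cycle,
    so a sample that reaches the same amount of product N cycles earlier
    started with 2**N times more template.
    """
    n = cycles_sample_b - cycles_sample_a  # N: difference in cycle numbers
    return 2.0 ** n


# Hypothetical cycle numbers at which each sample reaches the same fluorescence.
ratio = fold_change_from_cycles(cycles_sample_a=22.0, cycles_sample_b=25.0)
print(ratio)
```

In this made-up case sample A needs three cycles fewer than sample B, which corresponds to an eightfold higher starting concentration.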

7 Microarrays
mRNA is converted to cDNA and labeled, and subsequently hybridized to an array of gene-specific probes (either spotted cDNA samples or oligonucleotides, either one or two sample(s) per microarray).
Differences in expression between samples are determined as a ratio of fluorescence signals at individual spots.
High throughput, medium precision (low dynamic range)
Molecular Biology of the Cell (© Garland Science 2008)
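The ratio of fluorescence signals at individual spots is commonly reported on a log2 scale, so that up- and down-regulation are symmetric around zero. The sketch below assumes background-corrected, strictly positive intensities; the function name and the numbers are hypothetical.

```python
import math


def log2_ratios(signal_a, signal_b):
    """Per-spot log2 ratio of fluorescence signals (sample A over sample B)."""
    return [math.log2(a / b) for a, b in zip(signal_a, signal_b)]


# Hypothetical background-corrected intensities for three spots.
sample_a = [800.0, 200.0, 500.0]
sample_b = [200.0, 800.0, 500.0]
print(log2_ratios(sample_a, sample_b))
```

A fourfold increase and a fourfold decrease give values of the same magnitude with opposite sign, and an unchanged spot gives zero.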

8 Next generation sequencing (NGS)
Massively parallel sequencing techniques enable sequencing of genome-wide cellular RNA pools.
Typical sequencing read length is … nucleotides -> RNA or cDNA has to be fragmented.
A single run comprises … reactions, depending on the platform, so most RNAs are covered by multiple “reads” -> read occurrence for a particular gene reflects its expression level.
High throughput, precision depends on sequencing depth (#reads)
Zyklusvorlesung Molekularbiologie (lecture series in molecular biology), WS 2009/10
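Because read occurrence scales with sequencing depth, raw counts from runs of different depth are not directly comparable. One simple way to account for this, shown here as a sketch and not as the method used in the lecture, is to express counts per million reads. The gene names and counts are hypothetical, and gene length is deliberately ignored.

```python
def counts_per_million(read_counts):
    """Scale raw per-gene read counts by sequencing depth (total reads)."""
    depth = sum(read_counts.values())
    return {gene: count * 1_000_000 / depth for gene, count in read_counts.items()}


# Hypothetical read counts from two runs of the same sample at different depth.
shallow = {"geneA": 10, "geneB": 30, "geneC": 60}
deep = {"geneA": 1000, "geneB": 3000, "geneC": 6000}

print(counts_per_million(shallow))
print(counts_per_million(deep))
```

Both runs give the same scaled values, although the deeper run rests on a hundred times more reads per gene and is therefore the more precise estimate.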

9 Next generation sequencing (NGS)
Illumina (Solexa) sequencing
DNA fragments are coupled to a glass slide and subjected to bridge amplification. Individual reads of … bp are produced at a time by using fluorescently labeled removable terminator tags.
Figure panels: Sample preparation, Sequencing

10 Transcriptomics with Microarrays
Workflow of a microarray experiment
Experimental design (days): Frame a biological question; Choose a microarray platform; Decide on biological and technical replicates; Design the series of hybridizations
Technical performance (weeks): Obtain the samples; Isolate total RNA; Label cDNA or mRNA; Perform the hybridizations; Scan the chips
Statistical analysis (days-weeks): Extract fluorescence intensities; Normalize data to remove biases; Estimate expression changes; Identify differentially expressed transcripts
Data mining (months): Cluster analysis and pattern recognition; Study lists of genome ontologies; Search for regulatory motifs; Reconstruct regulatory circuits; Design validation and follow-up experiments
After: Gibson, G and SV Muse, 2004.
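The statistical analysis steps "normalize data to remove biases" and "estimate expression changes" can be sketched as follows. Median scaling is used here only as one simple normalization under the assumption that most genes do not change between arrays; the slides do not name a specific method, and all names and numbers are hypothetical.

```python
import math
from statistics import median


def median_scale(intensities, target=1000.0):
    """Rescale one array so that its median intensity equals `target`."""
    factor = target / median(intensities)
    return [x * factor for x in intensities]


# Hypothetical intensities of the same five genes on two arrays.
# array_2 is uniformly twice as bright (a technical bias); only the
# last gene is truly changed.
array_1 = [100.0, 200.0, 400.0, 800.0, 1600.0]
array_2 = [200.0, 400.0, 800.0, 1600.0, 12800.0]

norm_1 = median_scale(array_1)
norm_2 = median_scale(array_2)

# Expression change per gene as a log2 ratio of normalized intensities.
log2_changes = [math.log2(b / a) for a, b in zip(norm_1, norm_2)]
print(log2_changes)
```

Without the scaling step every gene would appear twofold up-regulated; after it, only the last gene shows a change.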

11 Transcriptomics with Microarrays
RNA sample -> sample amplification and labeling -> labeled sample -> sample injected into microarray -> probe array hybridization -> probe array washing and staining -> probe array scanning and intensity quantitation -> fluorescence intensity translated into mRNA abundance

12 RNA Sample preparation

13 RNA Sample preparation

14 Hybridization onto microarray
Quackenbush, 2006

15 Hybridization onto microarray

16 Hybridization onto microarray
Figure labels: mismatch probes, perfect match probes, probe pair
Each gene is represented by probe pairs of 25 nt length, each consisting of a perfect match probe and a mismatch probe.
Perfect match probes are complementary to specific sequences of the target gene, preferentially located at the 3’ end of the gene.
The mismatch probe is identical to the perfect match probe, except for the middle base. It is designed to detect unspecific binding.
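The probe pair design can be illustrated with a short sketch. Two points are assumptions of the sketch and are not stated on the slide: the middle base is replaced by its complement, and the gene signal is summarized as the mean difference between perfect match and mismatch intensities. The sequence and intensities are hypothetical.

```python
from statistics import mean

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_probe(perfect_match):
    """Copy the perfect match probe, changing only its middle base.

    Assumption of this sketch: the middle base is replaced by its complement.
    """
    middle = len(perfect_match) // 2  # index 12, the 13th base of a 25-nt probe
    changed = COMPLEMENT[perfect_match[middle]]
    return perfect_match[:middle] + changed + perfect_match[middle + 1:]


def average_difference(probe_pairs):
    """Mean of (perfect match - mismatch) intensity over a gene's probe pairs."""
    return mean([pm - mm for pm, mm in probe_pairs])


pm_probe = "ACGTACGTACGTACGTACGTACGTA"  # hypothetical 25-nt sequence
mm_probe = mismatch_probe(pm_probe)
print(mm_probe)

# Hypothetical (perfect match, mismatch) intensities for four probe pairs.
pairs = [(1000.0, 200.0), (900.0, 300.0), (1500.0, 400.0), (950.0, 250.0)]
print(average_difference(pairs))
```

Subtracting the mismatch intensity removes the part of the signal attributed to unspecific binding before the probe pairs of a gene are averaged.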

17 Affymetrix Microarrays – Probe Synthesis
For the extension of all oligonucleotides by one base, four lithographic steps with complementary masks are performed, one mask for each base A, C, T, G.
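The masking scheme can be simulated in a few lines. This is a simplified sketch: it only records which spots each mask would expose at each position and ignores the chemistry. The three short probes are hypothetical; real probes are 25 nt long, which gives 4 x 25 = 100 steps.

```python
def synthesis_steps(probes):
    """Yield (position, base, exposed_spots) for each lithographic step.

    For every probe position there are four steps, one mask per base.
    A mask exposes exactly those spots whose probe has that base next.
    """
    length = len(probes[0])
    for position in range(length):
        for base in "ACTG":
            exposed = [i for i, probe in enumerate(probes) if probe[position] == base]
            yield position, base, exposed


probes = ["ACG", "AGT", "TCG"]  # hypothetical 3-nt probes on three spots
steps = list(synthesis_steps(probes))
print(len(steps))
for step in steps[:4]:
    print(step)
```

Within one position the four masks are complementary in the sense that every spot is exposed by exactly one of them.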

18 Affymetrix raw Data
Greyscale and false color image of the fluorescence readout

19 Data Analysis
Detection of differentially expressed genes
Identification of similar samples and co-regulated genes in a multi-sample comparison
Figure axes: genes, samples
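As an illustration of detecting differentially expressed genes, the sketch below ranks genes by the absolute value of Welch's t statistic between two groups of samples. The slide does not name a test, so this choice is an assumption; the gene names and log2 expression values are hypothetical, and a real analysis would also need p-values and a multiple-testing correction.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for one gene measured in two groups of samples."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical log2 expression values: (group A samples, group B samples).
expression = {
    "geneA": ([8.0, 8.2, 7.9], [10.1, 9.8, 10.0]),
    "geneB": ([6.0, 6.3, 5.8], [6.1, 5.9, 6.2]),
}

t_values = {gene: welch_t(a, b) for gene, (a, b) in expression.items()}

# Genes with the largest absolute t statistic are the strongest candidates.
ranked = sorted(t_values, key=lambda gene: abs(t_values[gene]), reverse=True)
print(ranked)
```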

20 Data Analysis
Figure panels: Cluster analysis; Pearson correlation matrix; Venn diagram; Summary statistics of up-/down-regulation; [Phenotypic analysis]
Koschubs et al., EMBO J, 2009.
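A Pearson correlation matrix between samples, the usual starting point for a cluster analysis, can be computed as in the following sketch. The sample names and expression profiles are hypothetical and are not data from Koschubs et al.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(profiles):
    """Nested dict holding the correlation of every pair of samples."""
    names = list(profiles)
    return {a: {b: pearson(profiles[a], profiles[b]) for b in names} for a in names}


# Hypothetical expression profiles (four genes) of three samples.
profiles = {
    "sample1": [1.0, 2.0, 3.0, 4.0],
    "sample2": [2.0, 4.0, 6.0, 8.0],
    "sample3": [4.0, 3.0, 2.0, 1.0],
}

matrix = correlation_matrix(profiles)
for name, row in matrix.items():
    print(name, [round(value, 2) for value in row.values()])
```

Samples 1 and 2 differ only by a scale factor and are perfectly correlated, so a correlation-based clustering would group them together and separate them from sample 3.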

21 Data Analysis: Gene Ontology
Response to chemical stimulus
Vitamin metabolic process

22 Microarray Databases
ArrayExpress (European Bioinformatics Institute)
Gene Expression Omnibus (NCBI)

23 Acknowledgement
Dietmar Martin, Gene Center, LMU Munich

