Introduction to genome biology and microarrays experiment

Slides:



Advertisements
Similar presentations
Annie Perales and Miranda Viorst
Advertisements

MOLECULAR GENETICS. DNA- deoxyribonucleic acid James Watson and Francis Crick discover the structure of the DNA molecule DNA is a double helix (twisted.
Classical and Modern Genetics.  “Genetics”: study of how biological information is carried from one generation to the next –Classical Laws of inheritance.
Microarray Simultaneously determining the abundance of multiple(100s-10,000s) transcripts.
Biological background: Molecular Biology Class web site: Statistics for Microarrays.
RNA and Protein Synthesis
Introduction to Bioinformatics Spring 2008 Yana Kortsarts, Computer Science Department Bob Morris, Biology Department.
Basic Biology for CS262 OMKAR DESHPANDE (TA) Overview Structures of biomolecules How does DNA function? What is a gene? How are genes regulated?
Introduction to the biology and technology of DNA microarrays Sandrine Dudoit PH 296, Section 33 10/09/2001.
DNA and RNA. I. DNA Structure Double Helix In the early 1950s, American James Watson and Britain Francis Crick determined that DNA is in the shape of.
Transcription & Translation Biology 6(C). Learning Objectives Describe how DNA is used to make protein Explain process of transcription Explain process.
Introduce to Microarray
Biomolecules Nucleic acids.  Are the genetic materials of all organisms and determine inherited characteristics.  The are two kinds of nucleic acids,
Nucleic Acids DNA vs. RNA
Gene Structure: DNA RNA Protein Dr. Jason Tasch. Nucleic Acids Sequence of Nucleotides Nucleotide composed of: –Nitrogenous Base Purine Pyrimidine –Sugar.
Introduction to gene expression Seema Zargar. Lecture outline Introduction to all terms used in Gene expression.
From DNA to Proteins Lesson 1. Lesson Objectives State the central dogma of molecular biology. Describe the structure of RNA, and identify the three main.
DNA & Genetics Biology. Remember chromosomes? What are genes? Made up of DNA and are units of heredity; unique to everyone What are traits? Are physical.
Chapter 11 DNA and Genes. Proteins Form structures and control chemical reactions in cells. Polymers of amino acids. Coded for by specific sequences of.
Protein Synthesis Pages Part 3. Warm-Up: DNA DNA is a double stranded sequence of ___________ (smallest unit of DNA). 2.Short segments of.
13.1 RNA.
RNA Structure and Transcription Mrs. MacWilliams Academic Biology.
GENETIC CONTROL OF PROTEIN SYNTHESIS, CELL FUNCTION, AND CELL REPRODUCTION PART 1.
End Show Slide 1 of 39 Copyright Pearson Prentice Hall 12-3 RNA and Protein Synthesis RNA and Protein Synthesis.
Chapter 11 DNA Within the structure of DNA is the information for life- the complete instructions for manufacturing all the proteins for an organism. DNA.
DNA Notes DAY 2 Replication, overview of transcription, overview of translation WARM UP What is the base pairing rule? Who created it?
RNA & Protein Synthesis.
Introduction to DNA microarray technologies Sandrine Dudoit, Robert Gentleman, Rafael Irizarry, and Yee Hwa Yang Bioconductor short course Summer 2002.
How Genes Work Ch. 12.
Lecture #3 Transcription Unit 4: Molecular Genetics.
Gene expression. The information encoded in a gene is converted into a protein  The genetic information is made available to the cell Phases of gene.
Introduction to Gene Expression
Lesson Overview Lesson OverviewFermentation Lesson Overview 13.1 RNA.
What is central dogma? From DNA to Protein
Nucleic Acids Comparing DNA and RNA. Both are made of nucleotides that contain  5-carbon sugar,  a phosphate group,  nitrogenous base.
RNA. What is RNA?  RNA stands for Ribonucleic acid  Made up of ribose  Nitrogenous bases  And a phosphate group  The code used for making proteins.
Chapter 11: DNA & Genes Sections 11.1: DNA: The Molecular of Heredity Subsections: What is DNA? Replication of DNA.
Thursday, March 31 st Objective: Explain and apply laws of heredity and their relationship to the structure and function of DNA Agenda: 1. Introduction.
DNA Structure and Protein Synthesis (also known as Gene Expression)
RNA, transcription & translation Unit 1 – Human Cells.
Transcription Objectives: Trace the path of protein synthesis.
PROTEIN SYNTHESIS TRANSCRIPTION AND TRANSLATION. TRANSLATING THE GENETIC CODE ■GENES: CODED DNA INSTRUCTIONS THAT CONTROL THE PRODUCTION OF PROTEINS WITHIN.
DNA Microarray Overview and Application. Table of Contents Section One : Introduction Section Two : Microarray Technique Section Three : Types of DNA.
DNA, RNA. Genes A segment of a chromosome that codes for a protein. –Genes are composed of DNA.
Learning Outcome 16: Describe the structure and function of the nucleus. Nucleus.
Microbial Genetics Structure and Function of Genetic Material The Regulation of Bacterial Gene Expression Mutation: Change in Genetic Material Genetic.
RiboNucleic Acid (RNA) -Contrast RNA and DNA. -Explain the process of transcription. - Differentiate between the 3 main types of RNA -Differentiate between.
Microarray: An Introduction
Gene Expression DNA, RNA, and Protein Synthesis. Gene Expression Genes contain messages that determine traits. The process of expressing those genes includes.
Chapter 8 Section 8.4: DNA Transcription 1. Objectives SWBAT describe the relationship between RNA and DNA. SWBAT identify the three kinds of RNA and.
All proteins consist of ________. 1.DNA molecules 2.RNA molecules 3.triglyceride chains 4.polypeptide chains
CH 12.3 RNA & Protein Synthesis. Genes are coded DNA instructions that control the production of proteins within the cell…
Copyright © by Holt, Rinehart and Winston. All rights reserved. ResourcesChapter menu Overview Section 2 The Structure of DNA DNA.
8.2 KEY CONCEPT DNA structure is the same in all organisms.
DNA Microarray. Microarray Printing 96-well-plate (PCR Products) 384-well print-plate Microarray.
Unit 2.1: BASIC PRINCIPLES OF HUMAN GENETICS
Chapter 13.1: RNA Essential Questions
RNA Ribonucleic Acid Single-stranded
RNA (Ch 13.1).
RNA and Transcription Lecture #19 Honors Biology Ms. Day
Transcription.
Unit 2.1: BASIC PRINCIPLES OF HUMAN GENETICS
What is RNA? Do Now: What is RNA made of?
RNA and Transcription DNA RNA PROTEIN.
Lesson Overview 13.1 RNA
12-3 RNA and Protein Synthesis
Unit Animal Science.
Genes and Protein Synthesis Review
The Structure of DNA.
DNA Deoxyribonucleic Acid.
Presentation transcript:

Introduction to genome biology and microarrays experiment Statistics for Microarray Data Analysis – Lecture 1 The Fields Institute for Research in Mathematical Sciences May 25, 2002

The human genome The cell is the fundamental working unit of every living organism. Humans: trillions of cells (metazoa); other organisms like yeast: one cell (protozoa). Cells are of many different types (e.g. blood, skin, nerve cells), but all can be traced back to a single cell, the fertilized egg.

Genes The human genome is distributed along 23 pairs of chromosomes. 22 autosomal pairs; the sex chromosome pair, XX for females and XY for males. In each pair, one chromosome is paternally inherited, the other maternally inherited. Chromosomes are made of compressed and entwined DNA. A (protein-coding) gene is a segment of chromosomal DNA that directs the synthesis of a protein.

Chromosomes and DNA

DNA A deoxyribonucleic acid or DNA molecule is a double-stranded polymer composed of four basic molecular units called nucleotides. Each nucleotide comprises a phosphate group, a deoxyribose sugar, and one of four nitrogen bases: adenine (A), guanine (G), cytosine (C), and thymine (T). The two chains are held together by hydrogen bonds between nitrogen bases. Base-pairing occurs according to the following rule: G pairs with C, and A pairs with T.

Genetic and physical maps

Genetic and physical maps Physical distance: number of base pairs (bp). Genetic distance: expected number of crossovers between two loci, per chromatid, per meiosis. Measured in Morgans (M) or centiMorgans (cM). 1cM ~ 1 million bp (1Mb) in humans

Central dogma The expression of the genetic information stored in the DNA molecule occurs in two stages: (i) transcription, during which DNA is transcribed into mRNA; (ii) translation, during which mRNA is translated to produce a protein. DNA  mRNA  protein Other important aspects of regulation: methylation, alternative splicing, etc. The correspondence between DNA's four-letter alphabet and a protein's twenty-letter alphabet is specified by the genetic code, which relates nucleotide triplets to amino acids.

Idea: measure the amount of mRNA to see which genes are being expressed in (used by) the cell. Measuring protein might be better, but is currently harder.

Differential expression Each cell contains a complete copy of the organism's genome. Cells are of many different types and states E.g. blood, nerve, and skin cells, dividing cells, cancerous cells, etc. What makes the cells different? Differential gene expression, i.e., when, where, and in what quantity each gene is expressed. On average, 40% of our genes are expressed at any given time.

RNA A ribonucleic acid or RNA molecule is a nucleic acid similar to DNA, but single-stranded; ribose sugar rather than deoxyribose sugar; uracil (U) replaces thymine (T) as one of the bases. RNA plays an important role in protein synthesis and other chemical activities of the cell. Several classes of RNA molecules, including messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), and other small RNAs.

Exons and introns Genes comprise only about 2% of the human genome; the rest consists of non-coding regions, whose functions may include providing chromosomal structural integrity and regulating when, where, and in what quantity proteins are made (regulatory regions). The terms exon and intron refer to coding (translated into a protein) and non-coding DNA, respectively.

Functional genomics The various genome projects have yielded the complete DNA sequences of many organisms. E.g. human, mouse, yeast, fruitfly, etc. Human: 3 billion base-pairs, 30-40 thousand genes. Challenge: go from sequence to function, i.e., define the role of each gene and understand how the genome functions as a whole.

Introduction to microarrays

Gene expression assays DNA microarrays rely on the hybridization properties of nucleic acids to monitor DNA or RNA abundance on a genomic scale in different types of cells. The main types of gene expression assays: High-density nylon membrane arrays; Serial analysis of gene expression (SAGE); Short oligonucleotide arrays (Affymetrix); Long oligonucleotide arrays (Agilent); Fibre optic arrays (Illumina); cDNA arrays (Brown/Botstein)*.

Applications of microarrays Measuring transcript abundance; Genotyping; Estimating DNA copy number; Determining identity by descent; Measuring mRNA decay rates; Identifying protein binding sites; Determining sub-cellular localization of gene products; ……..

The process Building the chip: RNA preparation: Hybing the chip: PCR PURIFICATION and PREPARATION MASSIVE PCR PREPARING SLIDES PRINTING RNA preparation: Hybing the chip: CELL CULTURE AND HARVEST POST PROCESSING ARRAY HYBRIDIZATION RNA ISOLATION DATA ANALYSIS cDNA PRODUCTION PROBE LABELING Taken from Schena & Davis

bacterial glycerol stocks) Building the chip PCR amplification Directly from colonies with SP6-T7 primers in 96-well plates Consolidate into 384-well plates Arrayed Library (96 or 384-well plates of bacterial glycerol stocks) Slide 4 Ideally, you will have a slides consist of all genes in the genomeThe design of the array / template is an imporatnt bioinformatics question. As you are only able to detect expression of genes on this template. Spot as microarray on glass slides Ngai Lab, UC Berkeley)

384 well plate Glass Slide cDNA clones Pins collect cDNA from wells Contains cDNA probes cDNA clones Spotted in duplicate Print-tip group 1 Glass Slide Array of bound cDNA probes 4x4 blocks = 16 print-tip groups Print-tip group 6

Sample preparation

Hybridization Binding of cDNA target samples to cDNA probes on the slide cover slip Hybridize for 5-12 hours

Hybridization chamber 3XSSC HYB CHAMBER ARRAY LIFTERSLIP SLIDE LABEL SLIDE LABEL Humidity Temperature Formamide (Lowers the Tm)

Expression profiling with DNA microarrays cDNA “B” Cy3 labeled cDNA “A” Cy5 labeled Laser 1 Laser 2 Hybridization Scanning Slide 5 You will have another sample of interests consists of a portion of the genes and the questions is to find out what genes are being expressed in this sample. The simplified version of the experiments is to label the sample with a color dye hybridize the sample on the slides and scan the slides under some laser to see which spots that light up. This provides a list of genes and it’s corresponding expression levels in our samples. Our interest is on comparing the expression levels between the two samples. + Analysis Image Capture

RGB overlay of Cy3 and Cy5 images

Raw data Human cDNA arrays ~ 43K spots; 16–bit TIFFs: ~ 20Mb per channel; ~ 2,000 x 5,500 pixels per image; Spot separation: ~ 136um; For a “typical” array: Mean = 43, med = 32, SD = 26 pixels per spots

Image analysis The raw data from a cDNA microarray experiment consist of pairs of image files, 16-bit TIFFs, one for each of the dyes. Image analysis is required to extract measures of the red and green fluorescence intensities for each spot on the array.

Image analysis GenePix

Image analysis R and G for each spot on the array. 1. Addressing. Estimate location of spot centers. 2. Segmentation. Classify pixels as foreground (signal) or background. 3. Information extraction. For each spot on the array and each dye signal intensities; background intensities; quality measures. Motivation behind background adjustment: A spot’s measured fluorescence intensity includes a contribution that is not specifically due to the hybridization of the target to the probe, but to something else, e.g. the chemical treatment of the slide, autofluorescence etc. Want to estimate and remove this unwanted contribution. R and G for each spot on the array.

Spot Batch automatic addressing. Segmentation. Seeded region growing (Adams & Bischof 1994): adaptive segmentation method, no restriction on the size or shape of the spots. Intensity extraction Foreground. Mean of pixel intensities within a spot. Background. Morphological opening: non-linear filter which generates an image of the estimated background intensity for the entire slide. Spot quality measures. Software package. Spot, built on the freely available software package R.

Quality measures Spot One channel, R or G Signal/noise ratio; Variation in pixel intensities; Identification of “bad spots” (no signal), etc. Two channels, R/G Brightness: foreground/background ratio; Uniformity: variation in pixel intensities and ratios of intensities; Morphology: area, perimeter, circularity. Array (slide) Percentage of spots with no signal; Range of intensities; Distribution of spot signal area, etc. How to use quality measures in subsequent analyses?

Terminology Probe: DNA spotted on the array, aka. spot, immobile substrate. Target: DNA hybridized to the array, mobile substrate. Sector: collection of spots printed using the same print-tip (or pin), aka. print-tip-group, pin-group, spot matrix, grid. Batch: collection of slides with the same probe layout. The terms slide or array are often used to refer to the printed microarray.

Microarray Life Cycle Biological Question Data Analysis & Modelling Sample Preparation Microarray Life Cycle MicroarrayDetection Microarray Reaction Taken from Schena & Davis

WWW resources Complete guide to “microarraying” http://cmgm.stanford.edu/pbrown/mguide/ http://www.microarrays.org Parts and assembly instructions for printer and scanner; Protocols for sample prep; Software; Forum, etc. Animation: http://www.bio.davidson.edu/courses/genomics/chip/chip.html

Acknowledgments Introduction to biology: based on Bioconductor short course lecture 1 (http://www.bioconductor.org/) and Temple short course lecture 1 with pictures from http://www.accessexcellence.org/ Introduction to microarray: based on IPAM, UCLA http://www.ipam.ucla.edu/programs/fg2000/tutorials.html#terry_speed Thanks also to: Natalie Thorne (WEHI) Ingrid Lönnstedt (Uppsala)