Published by Bridget Smith. Modified over 9 years ago.
1
3/24/2005 TIGP 1 Bioinformatics for Microarray Studies at IBS Pei-Ing Hwang, Ph.D. Mar. 24, 2005
2
Different aspects of life science research: genomics, transcriptomics, proteomics.
3
Building blocks for DNA or RNA. DNA: A, T, G, C. RNA: A, U, G, C.
4
DNA: deoxyribonucleic acid. Double stranded; antiparallel.
5
Why microarray?
Gene expression: to study multiple genes simultaneously; to obtain an overview of gene expression at the transcriptional level under specific experimental conditions; to study gene interaction networks from the transcriptional aspect.
Genome: SNP detection; to find recombination sites in the chromosome/genome; hopefully to discover the gene responsible for a genetic disease.
6
Outline: introduction to microarray experiments; experiences at IBS for the cDNA arrays; data generated with microarray; DNA annotation; data analysis; data management.
7
About Microarray Technology-1: up to hundreds of thousands of spots in a fixed area on a glass slide or a membrane; one species of DNA molecule per spot; a spot is also called a "feature"; the DNA fixed on the chip or membrane is also called a "probe"; the sequence and/or function of each DNA species on a spot is known.
8
About Microarray Technology-2: makes use of the hybridization method (A pairs with T or U; G pairs with C); image processing; data analysis; result interpretation from the biology aspect.
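The pairing rules behind hybridization (A with T, G with C) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and exact reverse complementarity is a toy criterion, since real hybridization tolerates mismatches depending on the experimental conditions.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the antiparallel partner strand of `seq`, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def can_hybridize(probe: str, target: str) -> bool:
    """Toy criterion: the target is the exact reverse complement of the probe."""
    return reverse_complement(probe) == target.upper()
```

Because the two strands are antiparallel, the partner strand is read in the reverse direction, which is why the sequence is reversed before each base is complemented.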
9
Types of Microarray
Types of DNA immobilized on the solid support: cDNA vs. oligonucleotides.
Manufacturing methods: printing vs. photolithography.
Solid support: glass slides; membrane.
Nucleotide labeling (slide scanning condition): one color vs. two colors.
10
GeneChip® Array Manufacturing. Figure 1. Affymetrix uses a unique combination of photolithography and combinatorial chemistry to manufacture GeneChip® Arrays.
11
Microarray printing machine http://arrayit.com/Products/MicroarrayI/NanoPrint/Nano-Print-new-600.jpg
12
Procedure for one-channel array
13
Experimental Procedure for 2-channel Microarray
14
Data Analyses
Feature intensity acquisition: image analyses.
To identify differentially expressed genes: normalization (global, local, print-tip, between-array, etc.); clustering or classification.
Analyses from the biology aspect: significant genes; transcriptional regulation study; cellular pathway or network finding.
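As an illustration of the simplest normalization listed (global), the sketch below centers the log2 red/green ratios of a two-channel array on their median. The slides do not say which formula was used at IBS, so median centering is an assumption, and the function name is hypothetical.

```python
import math
from statistics import median


def global_median_normalize(red, green):
    """Return log2(R/G) ratios shifted so that their median is zero.

    `red` and `green` are equal-length lists of positive feature
    intensities, one pair per spot (assumed background-corrected).
    """
    log_ratios = [math.log2(r / g) for r, g in zip(red, green)]
    center = median(log_ratios)
    return [m - center for m in log_ratios]
```

Global normalization applies one correction to the whole array. Local and print-tip methods would instead compute the correction separately for subsets of spots.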
15
Experiences at IBS for the cDNA arrays
16
About IBS tomato arrays: ~13,000 spots/features per chip; 1 clone per spot; cDNA clones from about a dozen different cDNA libraries; at least two different protocols were followed and six different vectors were used; more than ten technicians were involved.
17
Bioinformatics for Microarray at IBS (cont'd): IBS tomato EST database construction; installation, management and maintenance of data analysis software; reference information searching; batch submission of EST sequences.
18
Bioinformatics Needs for Microarray Studies at IBS
Pre-arraying data management: cDNA info collection, vector trimming, sequence annotation, EST submission, etc.
Array information management: gene set characterization, data storage, data retrieval.
Post-hybridization data analysis and management: array data analyses, storage of the scanning results, biology-oriented bioinformatics analyses.
19
Bioinformatics Service Work for Microarray Studies at IBS
Data pre-processing for the cDNAs: clone id assignment; sequence trimming; gene annotation; function classification.
Data sheet preparation for commercial software to analyze microarray data: GAL file preparation for GenePix Pro; Master Gene List preparation for GeneSpring.
20
Workflow diagram. Labels: cDNA clones; sequencing; PCR; vector trimming; assembly; function annotation; database; GenePix; feature intensities; normalization; Spotfire, GeneSpring; data analysis: normalization, variance, clustering; biological meaning: pathway analysis, transcription network, gene-gene interaction.
21
Pre-array Bioinformatics
Clones from labs; sequencing; raw EST sequences.
Data processing and management: 1. Clone id generation 2. Vector trimming 3. Sequence assembly 4. Sequence annotation (BLAST) 5. EST submission to NCBI 6. Database construction
22
Clone id generation
Data centralization following sequencing.
Rules for re-arraying: 96-well plate to/from 384-well plate; PCR from 96-well plates and spotting from 384-well plates; order of A1, A2, B1, B2.
23
Diagram. Labels: cDNA clones; sequencing; PCR; 96 or 384 well; 96 well; 384 well.
24
96-well to 384-well plates (diagram labels: A1, B2, A2, B1)
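A re-arraying rule of this kind can be sketched as a coordinate mapping. The slides give only the order "A1, A2, B1, B2", so the quadrant interleave below is an assumption, not the documented IBS rule: the first well of source plates 1 to 4 is assumed to land in 384-well positions A1, A2, B1 and B2. The function name is hypothetical.

```python
ROWS_96 = "ABCDEFGH"            # 8 rows x 12 columns
ROWS_384 = "ABCDEFGHIJKLMNOP"   # 16 rows x 24 columns


def well_96_to_384(plate_index: int, well: str) -> str:
    """Map a well of one of four 96-well plates onto a 384-well plate.

    plate_index is 0..3. Under the assumed interleave, well A1 of plates
    0, 1, 2, 3 goes to 384-well positions A1, A2, B1, B2 respectively.
    """
    if plate_index not in (0, 1, 2, 3):
        raise ValueError("plate_index must be 0, 1, 2 or 3")
    row = ROWS_96.index(well[0].upper())
    col = int(well[1:]) - 1
    if not 0 <= col < 12:
        raise ValueError("column must be between 1 and 12")
    row_384 = 2 * row + plate_index // 2   # plates 2 and 3 use the odd rows
    col_384 = 2 * col + plate_index % 2    # plates 1 and 3 use the odd columns
    return f"{ROWS_384[row_384]}{col_384 + 1}"
```

Whatever the actual convention, encoding it once in a function keeps clone ids consistent between the 96-well PCR plates and the 384-well spotting plates.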
25
Data collection: raw sequencing data obtained from the sequencing company; both ABI and text files organized and stored by lab and by date; clone information confirmed with each sequence contributor; clone ids matched with raw sequences.
26
Processing the sequencing data: cDNA library procedures confirmed with each individual lab; vector/linker/primer trimming (Seqclean); function annotation (BLAST against different databases, Gene Ontology annotation); sequence assembly (Phrap).
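The idea of vector trimming can be illustrated with a deliberately naive sketch. It is not how Seqclean works: real trimming tools typically rely on alignment so that sequencing errors are tolerated, whereas this toy version removes only exact matches at the read ends. The function name and arguments are hypothetical.

```python
def trim_vector(read: str, vector_5p: str, vector_3p: str) -> str:
    """Remove an exact vector fragment from the start and/or end of a read.

    vector_5p and vector_3p are the vector sequences expected to flank
    the insert; either may be an empty string to skip that end.
    """
    seq = read.upper()
    if vector_5p and seq.startswith(vector_5p.upper()):
        seq = seq[len(vector_5p):]
    if vector_3p and seq.endswith(vector_3p.upper()):
        seq = seq[: len(seq) - len(vector_3p)]
    return seq
```

With six different vectors in use, as noted for the IBS tomato arrays, each library would need its own flanking sequences, which is one reason the cloning procedure had to be confirmed with each lab.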
27
Procedure to generate cDNA clones
28
IBS tomato EST Database: cloning information; sequencing data; vector/adaptor trimming information; EST assembly; function annotation; cross reference.
29
The Tomato Database Entity-Relationship model
ID MAP: 1. Seq id 2. Clone_id 3. Contig id 4. Lab_id#1 5. Lab_id#2 6. NCBI_sbmt_id93 7. NCBI_sbmt_id94 8. dbEST_accn_no 9. note
Trimmed Sequence: 1. Seq id 2. Trimmed Sequence 3. Method 4. Trim set
Untrimmed Sequence: 1. Seq id 2. Trimmed Sequence
Assembly Information: 1. Contig_id 2. Contig Sequence 3. BLAST Result 4. Position 5. Component seq id
TAIR Result: 1. Seq id 2. At number 3. E-Value 4. Description 5. Identity 6. Other result
NCBI BLAST Result: 1. Seq id 2. NCBI_id 3. E-Value 4. Description 5. Identity 6. Other result
TIGR Result: 1. Seq id 2. TC number 3. E-Value 4. Description 5. Identity 6. Other result
Lab info: 1. Seq id 2. Comment 3. Primer 4. Biotech 5. Sender 6. Collect From
cDNA Library Information: 1. Clone_id(3)(4) 2. Name 3. Date made 4. Developmental stage 5. Cloning sites 6. Description 7. Library 8. Host 9. Species 10. Vector 11. Antibiotic 12. Authors 13. Tissue 14. Primer
Gene Ontology: 1. TC number 2. EC number 3. Process (GO_id, Description) 4. Function (GO_id, Description) 5. Component (GO_id, Description)
Relationship labels in the diagram: Clone_id, Seq_id, TC number; cardinalities n and 1; TOM 3, TOM 4.
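A small part of this entity-relationship model can be sketched as relational tables. The slides do not name the database engine or the column types used for the tomato EST database, so SQLite (from the Python standard library) and the types below are assumptions. Only three of the entities are shown, and the function names are hypothetical.

```python
import sqlite3

# Subset of the entities, with Seq id as the key shared between tables.
SCHEMA = """
CREATE TABLE id_map (
    seq_id        TEXT PRIMARY KEY,
    clone_id      TEXT NOT NULL,
    contig_id     TEXT,
    dbest_accn_no TEXT,
    note          TEXT
);
CREATE TABLE trimmed_sequence (
    seq_id           TEXT PRIMARY KEY REFERENCES id_map(seq_id),
    trimmed_sequence TEXT,
    method           TEXT
);
CREATE TABLE tair_result (
    seq_id      TEXT REFERENCES id_map(seq_id),
    at_number   TEXT,
    e_value     REAL,
    description TEXT
);
"""


def build_db() -> sqlite3.Connection:
    """Create an empty in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def tair_hits_for_clone(conn: sqlite3.Connection, clone_id: str):
    """Return (seq_id, at_number, e_value) rows for a clone, best hit first."""
    rows = conn.execute(
        "SELECT m.seq_id, t.at_number, t.e_value "
        "FROM id_map AS m JOIN tair_result AS t ON t.seq_id = m.seq_id "
        "WHERE m.clone_id = ? ORDER BY t.e_value",
        (clone_id,),
    )
    return rows.fetchall()
```

Routing every lookup through the ID map is what lets a clone id from the lab be cross-referenced to trimmed sequences, contigs and annotation results.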
30
Information to be further analyzed. Gene set characterization: number of unique genes on the array; number of known/unknown genes; coordinates of each spotted sequence; statistics about spotted cDNA (grouped by function/pathway, grouped by sequence similarity).
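One way to estimate the number of unique genes on the array is to count assembly contigs plus unassembled singletons. The slides do not state how uniqueness was defined at IBS, so treating each contig and each singleton as one putative gene is an assumption here, and the function name is hypothetical.

```python
def count_unique_genes(clone_to_contig: dict) -> int:
    """Count putative unique genes among spotted clones.

    clone_to_contig maps each clone id to its contig id, or to None
    when the clone's sequence did not assemble into any contig.
    """
    contigs = {c for c in clone_to_contig.values() if c is not None}
    singletons = sum(1 for c in clone_to_contig.values() if c is None)
    return len(contigs) + singletons
```

With roughly 13,000 spots drawn from about a dozen libraries, redundant clones are likely, so this count may be well below the number of spots.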
31
Post-hybridization data analysis and management
32
Post-hybridization data analysis. Software for microarray analysis at IBS: GenePix Pro 5.0 (image processing); GeneSpring (microarray data analysis); Spotfire (microarray data analysis and data storage); TransPath (pathway searching).
33
Image Processing: GenePix Pro 5.0; GAL (GenePix Array List) file.
34
From multi-well plate to microarray
35
GAL online
36
GeneSpring at IBS: for microarray data analyses; standalone software; provides statistical methods for data analysis; some bioinformatics; provides visualization; licensed annually; rigid format requirement for input data; requires installation of a master gene list (master table) prior to data analysis.
37
Master table for GeneSpring
The master table contains: id; source of DNA; gene name; gene function annotation (from BLAST results); GO annotation.
Each array needs its own master table.
The format of the master table may vary between versions of the software.
38
To generate the master table for GeneSpring: batch BLAST against three sequence databases; parse the BLAST results; incorporate EC number, GO number and other related data from the best BLAST matches; integrate all required data from the various files and generate the master table; checking.
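The parsing step can be sketched as picking the best hit per query from BLAST results. The slides do not say which BLAST output format or parser was used, so this assumes the standard 12-column tab-delimited output (query id, subject id, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, e-value, bit score). The function name is hypothetical.

```python
import csv
import io


def best_hits(tabular_text: str) -> dict:
    """Return {query_id: (subject_id, e_value, bit_score)} for the best hit.

    The best hit is the one with the lowest e-value; ties are broken
    by the higher bit score. Blank lines and '#' comments are skipped.
    """
    best = {}
    for fields in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue
        query, subject = fields[0], fields[1]
        e_value, bit_score = float(fields[10]), float(fields[11])
        current = best.get(query)
        if current is None or (e_value, -bit_score) < (current[1], -current[2]):
            best[query] = (subject, e_value, bit_score)
    return best
```

The subject id of each best hit would then serve as the key for looking up the EC number, GO number and description that go into the corresponding master table row.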
39
Spotfire: for microarray data analyses; server-client software; linked to an Oracle database for data storage; provides various statistical methods for data analysis; can establish links to more bioinformatics tools; can record the analysis procedure; more flexible format requirement for input data.
40
One-color array for Arabidopsis: Affymetrix ATH1 chip; annotation information provided by the company and available on the internet.
41
Bioinformatics support at Affymetrix
42
Projects for now and the near future: infrastructure build-up; microarray data management system; platform for bioinformatics analyses; Plant Signaling Pathway Database.
43
Team
44
Thank you!