International Conference on Bioinformatics HKUST, Hong Kong 2007

Presentation transcript:

The complexity of the transcriptional landscape of the human genome. Roderic Guigó, Center for Genomic Regulation, Barcelona. November 9, 2018.

Genes and proteins. One gene, one enzyme (Beadle and Tatum). The Central Dogma (Francis Crick).

The standard model of the eukaryotic gene: most of the transcriptional output of the human genome is localized in well-defined genomic loci, which encode mRNAs that, when exported into the cytosol, are translated into proteins.

GENSCAN HMM (Burge & Karlin). Slide from M. Alexandersson.

1% of the genome, 44 regions. Target selection: a committee selected the sequence targets. Manual targets: regions with a lot of information. Random targets: stratified by non-exonic conservation with mouse and by gene density.

Figure labels: Genes and Transcripts; Cis-regulatory elements (promoters, transcription factor binding sites); Long-range regulatory elements (enhancers, repressors/silencers, insulators); DNase Hypersensitive Sites; DNA Replication; Epigenetic.

gencode: encyclopedia of genes and gene variants. Goal: identify all protein coding genes in the ENCODE regions. First, identify one complete mRNA sequence for at least one splice isoform of each protein coding gene; eventually, identify a number of additional alternative splice forms.
Participants: Roderic Guigó, IMIM-UPF-CRG; Stylianos Antonarakis, Geneva; Alexandre Reymond; Ewan Birney, EBI; Michael Brent, WashU; Lior Pachter, Berkeley; Manolis Dermitzakis, Sanger; Jennifer Ashurst; Tim Hubbard.

The gencode pipeline:
- mapping of known transcript sequences (ESTs, cDNAs, proteins) onto the human genome
- manual curation to resolve conflicting evidence
- additional computational predictions
- experimental verification
- FINAL ANNOTATION

The gencode pipeline. Manual curation: HAVANA (Sanger). Experimental verification: Geneva. Bioinformatics: IMIM.
2608 transcripts in 487 loci:
- 137 transcripts in 53 non-coding loci
- 1097 coding transcripts and 1374 non-coding transcripts in 434 protein coding loci
Most protein coding loci encode a mixture of protein coding and non-coding transcripts.

One gene, many proteins: very complex transcription units.

Distribution of DNaseI HSs vs. TSS in Different Gene Annotation Sets. From the ENCODE Chromatin and Replication Group, John Stamatoyannopoulos.

EGASP’05. The complete annotation of 13 regions was released on January 30, 2005. The annotation of the remaining 31 regions was still being obtained, and it was withheld. Gene prediction groups were asked to submit predictions for the remaining 31 regions by April 15, 2005. 18 groups participated, submitting 30 prediction sets. Predictions were compared to the annotations in an NHGRI-sponsored workshop at the Wellcome Trust Sanger Institute on May 6 and 7, 2005.

EGASP’05. Two main goals:
- to assess how well automatic methods are able to reproduce the (costly) manual/computational/experimental gencode annotation
- to assess how complete the gencode annotation is: are there still genes consistently predicted by computational methods?

Accuracy measures.
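The transcript does not preserve the definitions shown on this slide. As a minimal sketch, assuming the conventional gene-prediction measures (sensitivity = fraction of annotated exons predicted exactly, specificity = fraction of predicted exons that are annotated), exon-level accuracy could be computed as follows. The function name and the toy coordinates are hypothetical and are not EGASP data.

```python
def exon_level_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) at the exon level.

    Exons are (start, end) tuples. A predicted exon counts as correct
    only if both boundaries match an annotated exon exactly (assumption:
    this is the usual exon-level convention, not taken from the slide).
    """
    annotated = set(annotated)
    predicted = set(predicted)
    true_positives = len(annotated & predicted)
    sensitivity = true_positives / len(annotated) if annotated else 0.0
    specificity = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Toy example with made-up coordinates.
annotated_exons = [(100, 200), (300, 400), (500, 650), (800, 900)]
predicted_exons = [(100, 200), (300, 400), (500, 600), (800, 900), (1000, 1100)]

sn, sp = exon_level_accuracy(annotated_exons, predicted_exons)
print(f"exon sensitivity = {sn:.2f}, exon specificity = {sp:.2f}")
```

In this toy case three of four annotated exons are recovered and three of five predictions are correct. Note that "specificity" in this convention is what other fields call precision.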

Accuracy at the coding exon level, by program category: evidence-based, dual genome, “ab initio”.

EGASP’05. Programs are quite good at calling the protein coding exons (accuracy at 80%; not as good at calling the transcribed exons), but the best of the programs correctly predict only 40% of the complete transcripts (considering only the coding fraction).

EGASP’05. Many novel exons predicted:
- 8,634 unique exons predicted in intergenic regions
- we ranked the exons according to the accuracy of the predicting programs
- tested 238 exon pairs by RT-PCR in 24 tissues
- only 7 (less than 3%) were confirmed positive

Genome tiling arrays. Slide from http://signal.salk.edu/msample.html, Salk Institute Genomic Analysis Laboratory.

TRANSCRIPTION OF PROCESSED POLY A+ RNA, based on a number of high-throughput technologies.
Total # of nucleotides: 29,998,060. Non repeat masked: 14,707,189.

                                 Nb of nucleotides covered   % nucleotides covered
Annotated exons                  1,650,821                   9.8%
transfrag/tar                    1,278,588                   9.3%
CAGE Tags*                       151,149                     0.5%
Ditags*                          24,939                      0.1%
TOTAL UNIQUE Transcribed Bases   2,355,238                   14.7%

Table 1: Summary of Transcriptional Coverage of ENCODE Regions.

Column [footnote no.]                     TOTAL (interrogated and uninterrogated)   INTERROGATED
Total Bases [1]                           29998060
Total Interrogated Bases [2] (%)          14707189 (49.)
PROCESSED TRANSCRIPTS (PT)
  bp in Exons [3] (%)*                    1776157 (5.9%)                            1447192 (9.8%)
  bp in CAGE tags [4] (%)*                151149 (0.5%)                             116013 (0.8%)
  bp in PET [5] (%)*                      24939 (0.1%)                              19629
  bp in TF [6] (%)*                       1369611 (4.6%)                            1369304 (9.3%)
  Total Bases in PT [7] (%)*              2519280 (8.4%)                            2163303 (14.7%)
  Bases in PT (ESTs included) [8] (%)*    4826292 (16.1%)                           3545358 (24.1%)
PRIMARY TRANSCRIPTS
  Bases in Exons and Introns [9] (%)*     17758738 (59.2%)                          9496360 (64.6%)
  Bases with 5'RACE [10] (%)*             23318182 (77.7%)                          11763410 (80.0%)
  Bases between PETs [11] (%)*            19658563 (65.5%)                          9767311 (66.4%)
  Total Bases [12] (%)*                   27325931 (91.1%)                          13618240 (92.6%)
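The percentages in Table 1 appear to be plain ratios of covered bases to the relevant denominator: total bases for the TOTAL row, interrogated (non repeat masked) bases for the INTERROGATED row. The sketch below rechecks a few entries under that assumption. The dictionary keys are illustrative labels, not names used in the original table.

```python
TOTAL_BASES = 29_998_060         # all bases in the ENCODE regions (Table 1)
INTERROGATED_BASES = 14_707_189  # non repeat masked bases (Table 1)


def pct(covered, denominator):
    """Percentage of the denominator that is covered, to one decimal."""
    return round(100.0 * covered / denominator, 1)


# Selected entries copied from Table 1.
total_row = {
    "exons": 1_776_157,
    "processed_transcripts": 2_519_280,
    "primary_transcripts": 27_325_931,
}
interrogated_row = {
    "exons": 1_447_192,
    "processed_transcripts": 2_163_303,
    "primary_transcripts": 13_618_240,
}

for name in total_row:
    print(
        f"{name}: {pct(total_row[name], TOTAL_BASES)}% of all bases, "
        f"{pct(interrogated_row[name], INTERROGATED_BASES)}% of interrogated bases"
    )
```

Under this reading, the "more than 90% transcribed" figure corresponds to the primary transcript totals in the last column of the table.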

Other recent studies. Many individual studies suggest unanticipated complexity of the transcriptional map of the human genome:
- Kapranov et al. (2007): RNA onto tiling arrays; novel RNA classes, hundreds of thousands of novel sites of transcription
- Peters et al. (2007): LongSAGE; evidence for thousands of novel transcripts
- Roma et al. (2007): gene trap sequence tags in mouse embryonic stem cells; thousands of novel transcripts
- Unneberg and Claverie (2007): interchromosomal transcript chimerism
- Denoeud et al. (2007): RACEarrays; doubling the number of annotated exons in protein coding transcripts, widespread transcript chimerism

Tiling arrays reveal many novel sites of transcription. TRANSCRIPTION MAP of HL-60 DEVELOPMENTAL TIME COURSE (data by Tom Gingeras, Affymetrix). Speaker note: check if this is after induction with retinoic acid.

Characteristics of unannotated transfrags:
- short: 78 bp on average, compared with 121 bp for exonic transfrags
- very GC-rich: 56% vs 42% in the background of unannotated regions
- lack splice sites
- no matches to protein or domain databases
- lack of selective constraints
HOWEVER:
- reproducible across cell lines
- supported by independent evidence of transcription (mostly unspliced ESTs)
- enriched for RNA structures

Rozowsky et al., 2007. Novel TARs/transfrags are associated with known genes by identifying novel TARs that are co-expressed with known genes across 11 cell lines and conditions.

Rozowsky et al., 2007.

Denoeud et al., 2007. The ENCODE experiments: summary of the RACEarray experiments.
- 5’ RACE on 12 tissues
- primers in internal exons of 399 protein coding loci
- RACE products hybridized onto genome tiling arrays
- 4573 RACE exons detected, 2324 of them novel

5’ RACE from the TMEM15 gene (region Enr232) identifies several tissue-specific distal 5’ exons. (Figure label: Target gene.)

More than 30% of RACEfrags are more than 3Mb away from the index exon.

Distal RACEfrags are associated with independently predicted sites of transcription initiation.

Cloning and sequencing of RACEarray products: almost 30% of the sequenced products incorporate exons from upstream genes in chimeric structures.

RACEarrays: a strategy for normalization of RACE libraries and exhaustive identification of alternative transcripts.

Array-based normalization of RACE libraries. If we select 40 clones at random from the RACE reaction, the probability of selecting a clone from the less abundant form is 0.01 (assuming a multinomial distribution). However, if the transcript forms could be segregated by RT-PCR, then by again selecting 40 random clones, 10 from each RT-PCR, the probability of selecting the less abundant form is now 0.6.
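The reasoning behind these two probabilities can be sketched with the usual "at least one hit" formula: if a transcript form makes up a fraction f of the clones and n clones are picked independently, the chance of seeing it at least once is 1 - (1 - f)^n. The relative abundances used on the slide are not preserved in this transcript, so the two fractions below are hypothetical values, chosen only so that the formula gives numbers close to the slide's 0.01 and 0.6.

```python
def prob_at_least_one(fraction, n_clones):
    """Probability that at least one of n_clones independently sampled
    clones comes from a form present at the given relative abundance."""
    return 1.0 - (1.0 - fraction) ** n_clones


# Hypothetical abundance of the rare form in the pooled RACE reaction.
pooled_fraction = 0.00025
# Hypothetical abundance of the same form within its own RT-PCR product,
# after the forms have been segregated.
segregated_fraction = 0.0876

p_pooled = prob_at_least_one(pooled_fraction, 40)          # 40 clones from the pool
p_segregated = prob_at_least_one(segregated_fraction, 10)  # 10 clones from one RT-PCR

print(f"pooled, 40 clones:     {p_pooled:.3f}")
print(f"segregated, 10 clones: {p_segregated:.3f}")
```

The point of the comparison is that segregation raises the effective abundance of the rare form within its own reaction, so far fewer clones are needed to recover it.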

RT-PCR cloning and sequencing pilot (Kourosh Salehi-Ashtiani, DFCI)
* 24 novel RACEfrags tested by RT-PCR, including 6 cases previously confirmed in Denoeud et al. (2007). Positive RT-PCRs cloned, and 32 randomly selected clones sequenced.
RESULTS
- 14 positive RT-PCRs, 13 confirmed by sequencing
- 42 novel transcript variants, compared with the 52 previously known for the RT-PCR positive loci
- nearly all canonical splice boundaries
- genomic extensions from 2.5 to 145 Kb
* Difficulties in obtaining sequences for long cDNAs (which correlate with long genomic extensions), even in previously verified cases. Problems with the assignment of RACEfrags to loci.

A very efficient strategy for targeted large-scale transcript discovery.
RACEarray normalization: 448 attempted clone sequences → 42 novel transcripts; 1 novel transcript per 10 clones sequenced.
Carninci et al. (Genome Research 2003, 13:1273-1289): 1,989,385 ESTs → 70,214 transcripts (mouse); 1 transcript after 30 sequenced ESTs (and the majority of transcripts already known).
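The two yield figures on this slide are rounded. A short sketch of the arithmetic, using only the counts quoted above (the function name is illustrative):

```python
def sequences_per_transcript(n_sequenced, n_transcripts):
    """Average number of sequenced clones or ESTs per transcript found."""
    return n_sequenced / n_transcripts


# RACEarray normalization pilot: attempted clone sequences vs novel transcripts.
racearray = sequences_per_transcript(448, 42)
# Mouse EST sequencing as quoted on the slide: ESTs vs transcripts.
est_project = sequences_per_transcript(1_989_385, 70_214)

print(f"RACEarray: one novel transcript per {racearray:.1f} clones")
print(f"ESTs:      one transcript per {est_project:.1f} ESTs")
print(f"ratio:     {est_project / racearray:.1f}x fewer sequences per transcript")
```

The comparison understates the difference in one respect noted on the slide: the RACEarray count covers novel transcripts only, whereas most transcripts in the EST figure were already known.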

CONCLUSIONS
- There is a substantial amount of transcription which does not appear to be associated with protein coding loci.
- Only a fraction of the transcript diversity of protein coding loci appears to have been surveyed so far. In particular, protein coding loci appear to have tissue-specific distal alternative transcriptional start sites.
- RACEarrays are an effective normalization strategy for the identification of rare transcripts.
- ENCODE transcriptional landscape: a network of overlapping coding and non-coding transcripts, resulting in a continuum of transcription (more than 90% of the ENCODE regions are transcribed in at least one strand).

PROVING THE FUNCTIONALITY OF NOVEL TRANSCRIPTS

The GENCODE annotation: 487 loci, 2608 transcripts.
- 53 non-coding loci: 137 transcripts
- 434 protein coding loci: 1097 coding transcripts, 1374 non-coding transcripts
- 5.7 transcripts per protein coding locus
- 2.5 coding transcripts per locus
- 1.7 proteins per locus

The combined analysis of BioSapiens, Kellis and Goldman identified 184 annotated protein coding transcripts which challenged (from the structural, functional and/or evolutionary standpoint) our current view of proteins. Footnote: removing these 184 proteins from the set of 738 GENCODE proteins will leave 554 proteins for 434 loci; barely 1.3 proteins per locus.
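The per-locus figures on the GENCODE annotation slide and in the footnote above follow directly from the quoted counts. A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
PROTEIN_CODING_LOCI = 434
CODING_TRANSCRIPTS = 1097
NONCODING_TRANSCRIPTS_IN_CODING_LOCI = 1374
GENCODE_PROTEINS = 738
CHALLENGED_PROTEINS = 184  # flagged by the BioSapiens, Kellis and Goldman analyses

transcripts_per_locus = (
    CODING_TRANSCRIPTS + NONCODING_TRANSCRIPTS_IN_CODING_LOCI
) / PROTEIN_CODING_LOCI
coding_transcripts_per_locus = CODING_TRANSCRIPTS / PROTEIN_CODING_LOCI
proteins_per_locus = GENCODE_PROTEINS / PROTEIN_CODING_LOCI

remaining_proteins = GENCODE_PROTEINS - CHALLENGED_PROTEINS
remaining_per_locus = remaining_proteins / PROTEIN_CODING_LOCI

print(f"transcripts per protein coding locus: {transcripts_per_locus:.1f}")
print(f"coding transcripts per locus:         {coding_transcripts_per_locus:.1f}")
print(f"proteins per locus:                   {proteins_per_locus:.1f}")
print(f"proteins left after filtering:        {remaining_proteins}")
print(f"proteins per locus after filtering:   {remaining_per_locus:.1f}")
```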

Structural Effects of Pepsinogen C Alternative Splice Variant. Locus RP11-298J23.1 codes for pepsinogen C. The structure of pepsinogen C is 1htrA. Isoform -003 is missing 80 residues with respect to pepsinogen C. Here the missing section of -003 is in light green. The missing section in this isoform would remove the core from both subdomains of the structure. Both the N-terminal sub-domain (on the left) and the C-terminal sub-domain would have to refold. This is the view from above, looking down into the active cleft of the proteinase. Active site aspartates are shown in ball-and-stick. One of the two active site residues is in the missing section. The symmetry apparent in this isoform suggests that, although it will have to refold, it may very well be able to reform into a single subdomain. Michael Tress & Alfonso Valencia, CNB, Madrid.

Expression levels, alternative vs constitutive. Q-PCR in three cell lines: SKNAS, GM06990, HelaS3.

Polysomal association, alternative vs constitutive.

ACKNOWLEDGEMENTS
ENCODE GT GROUP: Stylianos Antonarakis (Geneva), Robert Baertsch (UCSC), Ian Bell (Affx), Ewan Birney (EBI), Robert Castelo (IMIM), Jill Cheng (Affx), Evelyn Cheung (Affx), Hiram Clawson (UCSC), France Denoeud (IMIM), Sam Deutsch (Geneva), Sujit Dike (Affymetrix), Jorg Drenkow (Affymetrix), Olof Emanuelsson (Yale), Paul Flicek (Sanger), Mark Gerstein (Yale), Srinka Ghosh (Affx), Jenn Harrow (Sanger), Greg Helt (Affx), Ivo Hofacker (U. Vienna), Tim Hubbard (Sanger), Phil Kapranov (Affx), Damian Keefe (EBI), Jan Korbel (Yale), Julien Lagarde (IMIM), Jeff Long (Affx), Todd Lowe (UCSC), G. Madhavan (Affx), Anton Nekrutenko (Penn State), David Nix (Affx), Jakob Pedersen (UCSC), Alex Reymond (Geneva), Joel Rozowsky (Yale), Yijun Ruan (GIS), Albin Sandelin (RIKEN), Mike Snyder (Yale), Peter F. Stadler (U. Vienna), Kevin Struhl (Harvard), Hari Tammana (Affx), Scott Tenenbaum (SUNY, Albany), Chia Lin Wei (GIS), Matt Weirauch (UCSC), Deyou Zheng (Yale), Adam Frankish (Sanger), Tom Gingeras (Affymetrix), Roderic Guigó (CRG).
Speaker note: Before we get into the data, it is important to acknowledge the efforts of those whose work is represented in this data: AFFX group, Kevin’s group, Scott, Brad.

CENTER FOR GENOMIC REGULATION, PRBB, BARCELONA