Identification of obesity-associated intergenic long noncoding RNAs

Slides:



Advertisements
Similar presentations
Genetic Analysis of Genome-wide Variation in Human Gene Expression Morley M. et al. Nature 2004,430: Yen-Yi Ho.
Advertisements

Geuvadis RNAseq UNIGE Genetic regulatory variants
SHI Meng. Abstract The genetic basis of gene expression variation has long been studied with the aim to understand the landscape of regulatory variants,
Ch 17 Gene Expression I: Transcription
1 Harvard Medical School Mapping Transcription Mechanisms from Multimodal Genomic Data Hsun-Hsien Chang, Michael McGeachie, and Marco F. Ramoni Children.
Ribosome footprinting
Lecture 4: DNA transcription
Andrea Hand BIOL 303. Large Intergenic Non-Coding RNAs Intergenic Non-Coding: transcribed from a region of DNA between two genes that are used to make.
20,000 GENES IN HUMAN GENOME; WHAT WOULD HAPPEN IF ALL THESE GENES WERE EXPRESSED IN EVERY CELL IN YOUR BODY? WHAT WOULD HAPPEN IF THEY WERE EXPRESSED.
Transcriptomics Breakout. Topics Discussed Transcriptomics Applications and Challenges For Each Systems Biology Project –Host and Pathogen Bacteria Viruses.
Genetic Analysis of Lac Operon Make partial diploids to do complementation tests: 1 copy of lac operon on E. coli chromosome. 2nd copy of lac operon on.
Identifying promoters and regulatory elements for DNA variation
Class activity: What are my asthma variants doing? In the subset of individuals for whom expression data are available, the T nucleotide allele at rs
Introduction to Linkage Analysis March Stages of Genetic Mapping Are there genes influencing this trait? Epidemiological studies Where are those.
Alternative splicing and evolution Daniel Jeffares.
Chris Chander, Luke Adea BioSci D145 Feb. 12, 2015
CS 374: Relating the Genetic Code to Gene Expression Sandeep Chinchali.
Office hours Wednesday 3-4pm 304A Stanley Hall Review session 5pm Thursday, Dec. 11 GPB100.
Gene Structure and Identification
Higher BMI (body mass index) is linked to greater brain atrophy in 700 MCI and AD patients, and in healthy elderly ADNI (N=587,critical P-value: 0.025)
Nuevas perspectivas en análisis genomico: implicaciones del proyecto ENCODE 1 Rory Johnson Bioinformatics and Genomics Centre for Genomic Regulation AEEH.
Supervisor: Yihong Jennifer Tan Eric Gähwiler Karim Hamidi
Linkage and LOD score Egmond, 2006 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.
Modes of selection on quantitative traits. Directional selection The population responds to selection when the mean value changes in one direction Here,
Geuvadis RNAseq analysis at UNIGE Analysis plans
Epigenome 1. 2 Background: GWAS Genome-Wide Association Studies 3.
Igor Ulitsky.  “the branch of genetics that studies organisms in terms of their genomes (their full DNA sequences)”  Computational genomics in TAU ◦
AP Biology From Gene to Protein How Genes Work AP Biology What do genes code for? proteinscellsbodies How does DNA code for cells & bodies?  how are.
From Gene to Protein Chapter 17.
Experimental validation. Integration of transcriptome and genome sequencing uncovers functional variation in human populations Tuuli Lappalainen et al.
Regulation of gene expression in the mammalian eye and its relevance to eye disease Todd Scheetz et al. Presented by John MC Ma.
GWAS Hits and Functional Implications Peter Castaldi February 1, 2013.
Marco Magistri , Journal Club. A non-coding RNA (ncRNA) is any RNA molecule that is not translated into a protein “Structural genes encode proteins.
AP Biology From Gene to Protein How Genes Work.
The generalized transcription of the genome Víctor Gámez Visairas Genomics Course 2014/15.
Gene Regulations and Mutations
Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520
Transcriptomics Sequencing. over view The transcriptome is the set of all RNA molecules, including mRNA, rRNA, tRNA, and other non coding RNA produced.
Lecture 6. Functional Genomics: DNA microarrays and re-sequencing individual genomes by hybridization.
MEME homework: probability of finding GAGTCA at a given position in the yeast genome, based on a background model of A = 0.3, T = 0.3, G = 0.2, C = 0.2.
Complexities of Gene Expression Cells have regulated, complex systems –Not all genes are expressed in every cell –Many genes are not expressed all of.
Recombination breakpoints Family Inheritance Me vs. my brother My dad (my Y)Mom’s dad (uncle’s Y) Human ancestry Disease risk Genomics: Regions  mechanisms.
In The Name of GOD Genetic Polymorphism M.Dianatpour MLD,PHD.
Jason Ernst Broad Institute of MIT and Harvard
1 Paper Outline Specific Aim Background & Significance Research Description Potential Pitfalls and Alternate Approaches Class Paper: 5-7 pages (with figures)
CFE Higher Biology DNA and the Genome Transcription.
Genetics of Gene Expression BIOS Statistics for Systems Biology Spring 2008.
Notes: Human Genome (Right side page)
Different microarray applications Rita Holdhus Introduction to microarrays September 2010 microarray.no Aim of lecture: To get some basic knowledge about.
Understanding GWAS SNPs Xiaole Shirley Liu Stat 115/215.
Do we know it all?. John L. Rinn and Howard Y. Chang Annu. Rev. Biochem
Chapter 4 “DNA Finger Printing”
Gene Expression: from DNA to protein
Gene Hunting: Design and statistics
Prokaryotic cells turn genes on and off by controlling transcription.
محاضرة عامة التقنيات الحيوية (هندسة الجينات .. مبادئ وتطبيقات)
Prokaryotic cells turn genes on and off by controlling transcription.
By Michael Fraczek and Caden Boyer
Analogy Video Central Dogma Analogy Video (Resources Page)
Nat. Rev. Gastroenterol. Hepatol. doi: /nrgastro
Relationship between Genotype and Phenotype
In these studies, expression levels are viewed as quantitative traits, and gene expression phenotypes are mapped to particular genomic loci by combining.
Prokaryotic cells turn genes on and off by controlling transcription.
One SNP at a Time: Moving beyond GWAS in Psoriasis
Prokaryotic cells turn genes on and off by controlling transcription.
Prokaryotic cells turn genes on and off by controlling transcription.
Prokaryotic cells turn genes on and off by controlling transcription.
Genetic and Epigenetic Regulation of Human lincRNA Gene Expression
Prokaryotic cells turn genes on and off by controlling transcription.
Presentation transcript:

Identification of obesity-associated intergenic long noncoding RNAs Project for “Solving Biological Problems that require Math” Jennifer Tan

Intergenic long noncoding RNAs Pervasive transcription of eukaryotic genomes (not just in protein- coding regions) Large proportion of functional information is embedded within noncoding regions - Completion of the human genome led to the surprising revelation that only 2% of the genome encode for proteins - Evidence of biochemical activity and evolutionary constraint of sequences indicate that a much bigger proportion of the genome is likely to be biologically functional a large proportion of functional information is embedded within noncoding regions of the genome Indeed, many important noncoding functional elements have been identified We focus on noncoding RNAs Here we show a timeline showing the discoveries of functional RNAs in biological regulation Timeline of discoveries of functional RNAs in biological regulation (Rinn and Chang, 2012).

Intergenic long noncoding RNAs Pervasive transcription of eukaryotic genomes (not just in protein- coding regions) Large proportion of functional information is embedded within noncoding regions - Completion of the human genome led to the surprising revelation that only 2% of the genome encode for proteins - Evidence of biochemical activity and evolutionary constraint of sequences indicate that a much bigger proportion of the genome is likely to be biologically functional a large proportion of functional information is embedded within noncoding regions of the genome Indeed, many important noncoding functional elements have been identified We focus on noncoding RNAs Here we show a timeline showing the discoveries of functional RNAs in biological regulation Timeline of discoveries of functional RNAs in biological regulation (Rinn and Chang, 2012). Rinn and Chang, 2012

Intergenic long noncoding RNAs > 200 bp No apparent open reading frame (ORF) Pol II transcribed, capped and polyadenylated, and frequently spliced Tend to be expressed at lower levels and with temporal/tissue-specific profiles Currently > 10,000 lincRNAs annotated in human genome Similar to protein-coding messenger RNAs (mRNAs), a subset of lincRNAs are transcribed by RNA-polymerase II (Pol II), capped and polyadenylated and frequently spliced at canonical splicing sites, although at a seemingly reduced splicing efficiency compared to mRNAs. In addition, lincRNAs are enriched in the same chromatin marks as protein-coding genes: histone H3K4 trimethylation at their 5’-end and histone H3K36 trimethylation across the gene body. Lower expression (10 times lower), tend to be temporal and tissue-specifically expressed Currently, there are more than 10,000 lincRNAs annotated in the human genome

Intergenic long noncoding RNAs lincRNAs have been implicated in all levels of gene regulation < 0.5% of lincRNAs have an established biological function LincRNAs have already been implicated to contribute at all levels of gene regulation by modulating levels of genomically adjacent or distally located gene products via cis- or trans-acting molecular mechanisms, respectively (1) transcriptional regulation, including regulatory element control, chromatin remodeling and epigenetic control (2) post-transcriptional regulation where lincRNAs can act as molecular decoys for other species of RNA transcripts However, only a relatively small number of lincRNAs having been functionally characterized to date their frequent association with global signatures of functionality for a subset of these lincRNAs motivated continuous efforts from scientists in different research fields, including molecular biology, biochemistry, genetics, and computational genomics, to characterize and establish their molecular, cellular, and organismal functions. Ulitsky and Bartel, 2013

Project aim Identify lincRNAs likely associated with genetic variants (SNPs) linked to OBESITY Obesity is a growing global health problem Over 13% of adults were considered to be obese in 2014 (WHO) Body Mass Index (BMI) is commonly used to assess obesity Genetic factors found to contribute to variability in BMI between individuals Here, we use obesity as a disease model, which has a strong genetic basis. We aim to identify lincRNAs likely associated with genetic variants (SNPs) recently linked to obesity through genome-wide association studies (GWAS) Such lincRNAs would be excellent candidates for future validation and functional characterization studies. Obesity is defined as abnormal or excessive fat accumulation that may impair health. Obesity is a growing worldwide health problem associated with increase morbidity and mortality that imposes an enormous burden on individual and public health According to WHO statistics, 13% of adults (over 18 years of age) were obese in 2014 BMI is commonly used as a proxy to assess obesity and variability in BMI between individuals has been found to be attributed to genetic factors

Project aim Identify lincRNAs likely associated with genetic variants (SNPs) linked to OBESITY Aim of the project is to identify lincRNAs whose expression is correlated with genetic variants recently linked to obesity/BMI through genome-wide association studies (GWAS) Genome-wide association study (GWAS) is an examination of common genetic variants, typically single nucleotide polymorphisms (SNPs), across most of the genomic landscape that may be associated with a disease. In Locke et al., the authored identified 97 BMI-associated genomic loci above genome-wide significance in a GWAS study.

Project aim Identify lincRNAs likely associated with genetic variants (SNPs) linked to OBESITY Aim of the project is to identify lincRNAs whose expression is correlated with genetic variants recently linked to obesity/BMI through genome-wide association studies (GWAS) Genome-wide association study (GWAS) is an examination of common genetic variants, typically single nucleotide polymorphisms (SNPs), across most of the genomic landscape that may be associated with a disease. In Locke et al., the authored identified 97 BMI-associated genomic loci above genome-wide significance in a GWAS study. 97 genomic loci linked to BMI

Project aim Identify lincRNAs likely associated with genetic variants (SNPs) linked to OBESITY Geuvadis dataset, which has RNA seq for lymphoblastoid cell lines (LCLs) from 373 individuals of European descent 97 genomic loci linked to BMI RNAseq data for 373 individuals

Project aim Identify lincRNAs likely associated with genetic variants (SNPs) linked to OBESITY cis-eQTL analysis: Test correlations between BMI/obesity associated variants and expression levels of lincRNAs in their genomic vicinity cis-eQTL analysis: Test correlations between BMI/obesity associated variants and expression levels of lincRNAs in their genomic vicinity

eQTL analysis expression Quantitative Trait Loci (eQTL) Goal : identify genomic locations where genotype significantly affects gene expression Genomic DNA loci RNA transcripts transcribed in the loci Genetic risk variant, such as SNPs, associated with potential diseases that we have genotype data for In the vicinity of the genetic variant We ask if there is an association between the genotype at the SNP with the expression levels of the RNA transcript

eQTL analysis expression Quantitative Trait Loci (eQTL) Goal : identify genomic locations where genotype significantly affects gene expression Genomic DNA loci RNA transcripts transcribed in the loci Genetic risk variant, such as SNPs, associated with potential diseases that we have genotype data for In the vicinity of the genetic variant We ask if there is an association between the genotype at the SNP with the expression levels of the RNA transcript DNA

eQTL analysis expression Quantitative Trait Loci (eQTL) Goal : identify genomic locations where genotype significantly affects gene expression Genomic DNA loci RNA transcripts transcribed in the loci Genetic risk variant, such as SNPs, associated with potential diseases that we have genotype data for In the vicinity of the genetic variant We ask if there is an association between the genotype at the SNP with the expression levels of the RNA transcript DNA SNPs Genotype data

eQTL analysis expression Quantitative Trait Loci (eQTL) Goal : identify genomic locations where genotype significantly affects gene expression Gene expression data RNA Genomic DNA loci RNA transcripts transcribed in the loci Genetic risk variant, such as SNPs, associated with potential diseases that we have genotype data for In the vicinity of the genetic variant We ask if there is an association between the genotype at the SNP with the expression levels of the RNA transcript DNA SNPs Genotype data

Genotype of risk factor eQTL analysis expression Quantitative Trait Loci (eQTL) Goal : identify genomic locations where genotype significantly affects gene expression Gene expression data RNA Gene expression level Genomic DNA loci RNA transcripts transcribed in the loci Genetic risk variant, such as SNPs, associated with potential diseases that we have genotype data for In the vicinity of the genetic variant We ask if there is an association between the genotype at the SNP with the expression levels of the RNA transcript DNA SNPs Genotype data C/C C/T T/T Genotype of risk factor

eQTL analysis Goal : Test correlations between BMI/obesity associated variants and expression levels of lincRNAs in their genomic vicinity Here, we aim to identify lincRNAs likely associated with genetic variants (SNPs) recently linked to obesity through genome-wide association studies (GWAS) Such lincRNAs would be excellent candidates for future validation and functional characterization studies. Specifically, the majority of disease-associated SNPs (approximately 88%) were mapped to non-coding regions, revealing a larger than anticipated role for noncoding SNPs in diseases (Hindorff et al. 2009).

eQTL analysis Goal : Test correlations between BMI/obesity associated variants and expression levels of lincRNAs in their genomic vicinity Gene expression data RNA Here, we aim to identify lincRNAs likely associated with genetic variants (SNPs) recently linked to obesity through genome-wide association studies (GWAS) Such lincRNAs would be excellent candidates for future validation and functional characterization studies. DNA SNPs Genotype data

BMI/obesity associated SNPs eQTL analysis Goal : Test correlations between BMI/obesity associated variants and expression levels of lincRNAs in their genomic vicinity Gene expression data RNA Here, we aim to identify lincRNAs likely associated with genetic variants (SNPs) recently linked to obesity through genome-wide association studies (GWAS) Such lincRNAs would be excellent candidates for future validation and functional characterization studies. DNA BMI/obesity associated SNPs Genotype data

BMI/obesity associated SNPs eQTL analysis Goal : Test correlations between BMI/obesity associated variants and expression levels of lincRNAs in their genomic vicinity Gene expression data lincRNA Here, we aim to identify lincRNAs likely associated with genetic variants (SNPs) recently linked to obesity through genome-wide association studies (GWAS) Such lincRNAs would be excellent candidates for future validation and functional characterization studies. DNA BMI/obesity associated SNPs Genotype data

BMI/obesity associated SNPs eQTL analysis Goal : Test correlations between BMI/obesity associated variants and expression levels of lincRNAs in their genomic vicinity Gene expression data lincRNA Gene expression level Here, we aim to identify lincRNAs likely associated with genetic variants (SNPs) recently linked to obesity through genome-wide association studies (GWAS) Such lincRNAs would be excellent candidates for future validation and functional characterization studies. DNA BMI/obesity associated SNPs Genotype data C/C C/T T/T Genotype of risk factor

Project aim eQTL analysis: Test correlations between BMI/obesity associated variants and expression levels of lincRNAs in their genomic vicinity Data: RNAseq: gene expression levels from lymphoblastoid cell lines (LCLs) from 373 individuals of European descent (Lappalainen et al. 2013) Genotype: BMI/obesity associated genetic markers (Locke et al. 2015) To identify BMI/obesity associated lincRNAs, we will take advantage of publicly available RNA sequencing (RNAseq) data of lymphoblastoid cell lines (LCLs) from 373 individuals of European descent (Lappalainen et al. 2013). LCL – derived from human B lymphocytes (B cells) We will test the correlations between BMI/obesity associated variants and the expression levels of lincRNAs in their genomic vicinity.

Project aim eQTL analysis: Test correlations between BMI/obesity associated variants and expression levels of lincRNAs in their genomic vicinity Data: RNAseq: gene expression levels from lymphoblastoid cell lines (LCLs) from 373 individuals of European descent (Lappalainen et al. 2013) Genotype: BMI/obesity associated genetic markers (Locke et al. 2015) What does it involve? Gene expression data normalization Estimation of correlation between lincRNA expression and genotype genetic variants Multiple hypothesis testing correction Data manipulation in R The gene expression data will first be normalized to remove unwanted technical variations/confounding factors in order to ensure accurate inference of gene expression levels. Next, correlations are estimated between expression levels of lncRNAs and genotypes of the GWAS variants, which are then compared to randomly permutated correlations for multiple hypothesis testing correction. Significantly correlated lncRNA:SNP pairs above background levels will be useful for further functional characterizations and may provide insight on biological pathways involved in obesity. Such lincRNAs would be excellent candidates for future validation and functional characterization studies.

Thank you ☺