Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer

Presentation transcript:

Whole Genome Polymorphism Analysis of Regulatory Elements in Breast Cancer
AAGTCGGTGATGATTGGGACTGCTCT[C/T]AACACAAGCGAGATGAAGAAACTGA
Jacob Biesinger
Dr. Garry Larson, City of Hope

Topics Covered Today
- Cancer and Gene Regulation
- Combining Data: Bioinformatics
- Progress So Far
- Molecular Cause of Genetic Disease

Single Nucleotide Polymorphisms and Genetic Disease
- SNPs in coding regions: Sickle Cell Anemia (Glu Pro Phe Ser Thr STOP vs. Val Pro Phe Ser Thr STOP). Source: Port/files/SICKLE CELL WEBSITE/whatissickle.htm
- Genetic disease may also be caused by differential expression of vital proteins
- Chunky sheep from miRNA binding site destruction. Nature Rev. Genet. 5, 202–212 (2004)
[Figure: example sequences ATGCCGGCTTACCATATCTACCTAAATCCGGTA, ATGCCGGCTTACCATAAT, and TGTAGA, with labels Protein Coding Region, Untranslated region, Promoter Binding Mechanism, Micro RNA Binding Mechanism]

Breast Cancer Expression
- Tumor expression patterns diverge sharply from those of normal cells
- Could SNPs in the regulatory regions of genes associated with breast cancer explain their overexpression in tumors?
[Figure panels: Normal Breast Expression, Breast Tumor Expression]

Statistical Search for Dysregulated Genes
- Expression patterns in cancers give two categories: Estrogen Receptor positive (ER+) and ER-
- A recent meta-analysis pooled tumor expression data for 9 studies and >15,000 genes
- Top 1% with ER+ > ER-: 150 genes
- Top 1% with ER+ < ER-: 150 genes
[Figure axes: Normalized expression difference between ER+ and ER-, Consistency across studies]
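The top and bottom 1% selection can be sketched in a few lines of Python. This is only an illustration of the arithmetic (1% of 15,000 genes is 150), not the meta-analysis procedure itself. The function name and the `scores` mapping of gene name to normalized ER+ minus ER- expression difference are assumptions made for the example.

```python
def top_and_bottom_percent(scores, percent=1.0):
    """Split genes into the highest- and lowest-scoring `percent`.

    `scores` is a hypothetical dict: gene name -> normalized expression
    difference (ER+ minus ER-). Returns (up_genes, down_genes).
    """
    # Number of genes in each tail, e.g. 1% of 15,000 = 150.
    n = max(1, round(len(scores) * percent / 100))
    # Rank genes from most ER+-biased to most ER--biased.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n], ranked[-n:]
```

With 15,000 scored genes, each returned list holds 150 gene names, matching the two gene sets on the slide.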

Regulation Motifs
- Which TF binding sites exist in our selected genes?
- A recent study identified motifs conserved in regulatory regions across 4 organisms
- Promoter motifs: 123 known motifs, 174 phylogenetically conserved
- Downstream motifs: 273 conserved 3’ UTR, 343 conserved miRNA 6mer, 368 conserved miRNA 7mer
[Figure label: lymphocyte transmembrane adaptor 1]

Motif Search
Use Python and UCSC Genome Browser to:
- Get promoter region DNA: 2 kb upstream of the transcription start site (TSS) plus at most 2 kb downstream of the TSS, limited by the translation start
- Get 3’ untranslated region RNA
- Search for motifs on the + and – strands
Results for Top 1% up and down (hit counts missing from the transcript):
- 3’ UTR hits
- 6mer hits
- 7mer hits
- known motif hits
- phylo motif hits
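The two steps above, choosing the promoter window and scanning both strands, might look roughly like this in Python. It is a minimal sketch under stated assumptions: the function names, the coordinate convention (plain genomic integers, with the translation start given as `cds_start`), and exact string matching of motifs are all illustrative choices, not the code used in the study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def promoter_window(tss, cds_start, strand, upstream=2000, downstream=2000):
    """Genomic (start, end) of the promoter region for one gene.

    Takes `upstream` bases before the TSS and at most `downstream`
    bases after it, stopping at the translation start (`cds_start`).
    """
    if strand == "+":
        return tss - upstream, min(tss + downstream, cds_start)
    # On the minus strand, "downstream" runs toward smaller coordinates.
    return max(tss - downstream, cds_start), tss + upstream

def find_motif(seq, motif):
    """List (position, strand) for exact motif matches on both strands.

    Minus-strand hits are found by searching for the motif's reverse
    complement and are reported in plus-strand coordinates.
    """
    seq, motif = seq.upper(), motif.upper()
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid reporting palindromic motifs twice
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        pos = seq.find(pattern)
        while pos != -1:
            hits.append((pos, strand))
            pos = seq.find(pattern, pos + 1)  # allow overlapping matches
    return sorted(hits)
```

For instance, scanning the left flank of the title SNP, `find_motif("AAGTCGGTGATGATTGGGACTGCTCT", "CCAAT")` reports a single minus-strand hit, because the sequence contains `ATTGG`, the reverse complement of `CCAAT`.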

SNP Databases
SNP information comes from two databases:
- HapMap: four groups (270 people in total) genotyped for the same SNPs; ~4 million SNPs
- CGEMS: breast cancer association study, complete with p-values; ~550k SNPs. A late-comer to our study (June 2007)

Mapping SNPs
- Use MSSQL 2003 and Python (pymssql) to perform a join of dbSNP, HapMap and CGEMS SNPs with regulatory motifs
[Figure: HapMap (~4 million) and CGEMS (~550k) SNPs mapped onto Motif Matches in Gene Promoters and 3’ UTR]
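The join of SNP positions against motif intervals can be illustrated with a small, self-contained sketch. The slides used MSSQL through pymssql; the example below swaps in Python's built-in `sqlite3` with an in-memory database so that it runs anywhere. The table names, column names, and the function name are assumptions for illustration and do not reflect the study's schema.

```python
import sqlite3

def snps_in_motifs(snps, motifs):
    """Return (rsid, motif_id) pairs where a SNP falls inside a motif.

    snps:   iterable of (rsid, chrom, pos)
    motifs: iterable of (motif_id, chrom, motif_start, motif_end),
            with both interval ends inclusive.
    Uses an in-memory SQLite database as a stand-in for MSSQL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE snps (rsid TEXT, chrom TEXT, pos INTEGER)")
    conn.execute(
        "CREATE TABLE motifs (motif_id TEXT, chrom TEXT, "
        "motif_start INTEGER, motif_end INTEGER)"
    )
    conn.executemany("INSERT INTO snps VALUES (?, ?, ?)", snps)
    conn.executemany("INSERT INTO motifs VALUES (?, ?, ?, ?)", motifs)
    # Interval join: same chromosome, SNP position within the motif span.
    rows = conn.execute(
        "SELECT s.rsid, m.motif_id "
        "FROM snps AS s JOIN motifs AS m "
        "ON s.chrom = m.chrom "
        "AND s.pos BETWEEN m.motif_start AND m.motif_end "
        "ORDER BY s.rsid, m.motif_id"
    ).fetchall()
    conn.close()
    return rows
```

At the scale on the slides (millions of SNPs), an index on chromosome and position would matter; the sketch leaves that out for brevity.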

Verify Motif Significance
How do we know that these motifs are significant?
- Hypothesis: Due to negative selection, there will be fewer SNPs in motifs than in random areas within the same region.
- Method: Contrast how many motifs have at least one SNP in them against how many of 100 random sequences from the same region have at least one SNP in them.
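One way to turn this comparison into a one-sided p-value is Fisher's exact test on a 2x2 table (motif vs. random, with SNP vs. without SNP). The slides do not say which test was used, so the choice of test here is an assumption, as are the function and argument names. The sketch needs only the standard library (`math.comb`, Python 3.8+).

```python
from math import comb

def fisher_one_sided_less(motif_with, motif_without, random_with, random_without):
    """One-sided Fisher's exact test for 'motifs carry fewer SNPs'.

    Returns the probability, under the null hypothesis of no difference,
    of seeing `motif_with` or fewer SNP-containing motifs given the
    table's row and column totals (hypergeometric distribution).
    """
    row_motif = motif_with + motif_without
    row_random = random_with + random_without
    col_with = motif_with + random_with
    total = row_motif + row_random
    denom = comb(total, col_with)
    # Smallest count of SNP-containing motifs compatible with the margins.
    k_min = max(0, col_with - row_random)
    p = 0.0
    for k in range(k_min, motif_with + 1):
        p += comb(row_motif, k) * comb(row_random, col_with - k) / denom
    return p
```

A small p-value would support the hypothesis that motifs contain SNPs less often than random sequences from the same region. Any counts passed in while trying this out are made up; the study's actual counts are not given in the transcript.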

Motif Counting Results
[Table, Known Top 1%: columns Motif with SNP, Motif without SNP, Total, 1-Sided P-Value; rows Actual, Random, Total; values missing from the transcript]
[Table, Phylo Top 1%: same columns and rows; values missing from the transcript]
- 3’ UTR results not yet available
- There is a significant difference between motifs and random sequences.

CGEMS Results
- A number of SNPs that fall within motifs are associated with breast cancer
- The highest-ranking one was 1514 out of 550,000
- Further analysis is required to say whether this is significant

Thanks!
- SoCalBSI mentors
- City of Hope: Dr. Garry Larson, Dr. David Smith, Dr. Päl Sætrom, Cathryn Lundberg
- All the SoCalBSI students!
Funded by: