miRNA workshop miRNA target prediction in animals

Presentation transcript:

miRNA workshop: miRNA target prediction in animals
Thomas Bradley (thomas.bradley@tgac.ac.uk)

Background
The miRNA associates with the Argonaute protein (Ago) via low-specificity hydrogen bonding between its sugar-phosphate backbone and Ago.
[Diagram: miRNA + AGO -> AGO-miRNA complex]
The Ago-miRNA complex is guided to targets by high-specificity base pairing between the bases of the miRNA and the bases of the target.

Plants vs. Animals

Background
Most animal miRNAs (unlike plant miRNAs) do not mediate transcript cleavage.
Each miRNA can target multiple transcripts, and each transcript can be targeted by multiple miRNAs.
[Diagram: Transcript A and Transcript B, each drawn as m7G cap, 5’ UTR, coding sequence, 3’ UTR and poly(A) tail; labels: miR-X, miR-Y, Alternative Cleavage and Polyadenylation (APA)]

Experimental Validation
There are many ways to experimentally validate a candidate target. They are not discussed in detail here, but it is important to note that:
1. There are multiple ways of experimentally validating targets (e.g. luciferase assay, microarrays, RNA-Seq, immunoprecipitation)
2. Each of these methods has its own idiosyncrasies, which should be appreciated when analysing results
3. Experimental validation of targets is a rapidly evolving area, with new techniques and protocols being developed year on year

Exercise 1a
1. Visit the Tarbase website (http://diana.imis.athena-innovation.gr/DianaTools/index.php?r=tarbase/index), or just type ‘tarbase’ into Google if that is easier
2. Input ‘GNAI3’ as your gene
3. Click “Submit”
4. What is the most common method for discovering targets?
5. How can you find where your gene of interest is expressed?
6. In which tissue was the top target identified?
7. Optional/extension: Repeat steps using a different gene symbol

Exercise 1b
1. Visit the Tarbase website (http://diana.imis.athena-innovation.gr/DianaTools/index.php?r=tarbase/index), or just type ‘tarbase’ into Google if that is easier
2. Input ‘hsa-miR-16-5p’ as your miRNA of interest
5. What is the most common method for discovering targets?
6. How can you find where your gene of interest is expressed?
7. In which tissue was the top target identified?
8. Optional/extension: Repeat steps using a different miRNA

Background
Most targets bind the miRNA 5’ end seed region.
The term covers a set of different binding subsequences (site types).
Bartel (2009)

Background
In the event of a seed region mismatch, 3’ compensatory binding can occur.
3’ supplementary binding can also occur.
Bartel (2009)

Exercise 2a
1. Visit the TargetScan 7 website (http://www.targetscan.org/vert_71/), or just type ‘targetscan7’ into Google if that is easier
2. Select the Human species in the first drop-down menu
3. Input ‘GNAI3’ as your human gene symbol
4. Click “Submit”
5. Tally the total number of sites of each type
6. What proportion of sites have a higher probability of preferential conservation?
7. Optional/extension: Repeat step 5 looking at poorly conserved sites
8. Repeat steps using a different gene symbol

Exercise 2b
1. Visit the TargetScan 7 website (http://www.targetscan.org/vert_71/), or just type ‘targetscan7’ into Google if that is easier
2. Select the Human species in the first drop-down menu
3. Choose ‘miR-9-5p’ as your broadly conserved miRNA family
4. Click “Submit”
5. Look at the top 4-5 results
6. Determine the proportion of conserved sites belonging to each site type
7. Repeat the process for poorly conserved site types
8. Optional/extension: Repeat steps using different miRNA families

Background
Most target prediction models score candidate interactions on the following basis:
  General sequence features
  Specific base-pairing to the seed region (plus additional 3’ supplementary binding)
  Thermodynamics of binding
  Conservation of the target site (also known as the miRNA Response Element, MRE)
Ritchie and Rasko (2014)

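Select features
26 features were selected by manual curation (from published data).
These 26 features were then filtered by stepwise regression using the Akaike Information Criterion (AIC):
AIC = 2k - 2ln(L)
where k is the number of estimated parameters and L is the maximised likelihood of the model. Lower AIC is better, so an extra feature is kept only if the gain in fit outweighs the penalty for the added parameter.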
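As a rough illustration of AIC-driven stepwise regression, the sketch below runs forward selection on synthetic data with NumPy. The slides do not say whether the selection was forward, backward or bidirectional, so forward selection is an assumption here, as are the function names, the Gaussian-error likelihood and the invented data.

```python
import numpy as np


def gaussian_aic(X, y, columns):
    """AIC = 2k - 2 ln(L) for a least-squares fit on the chosen columns."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in columns])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1  # regression coefficients plus the error variance
    # Maximised Gaussian log-likelihood of a linear model with RSS = rss.
    log_l = -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1)
    return 2 * k - 2 * log_l


def forward_stepwise(X, y):
    """Greedily add the feature that lowers AIC most; stop when none does."""
    selected = []
    remaining = list(range(X.shape[1]))
    best_aic = gaussian_aic(X, y, selected)
    while remaining:
        candidate_aic, candidate = min(
            (gaussian_aic(X, y, selected + [j]), j) for j in remaining
        )
        if candidate_aic >= best_aic:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_aic = candidate_aic
    return selected, best_aic


# Synthetic data (invented): only columns 0 and 2 influence y.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 6))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + rng.normal(scale=0.5, size=300)
chosen, final_aic = forward_stepwise(X_demo, y_demo)
print(sorted(chosen), round(final_aic, 1))
```

Because the penalty is only 2 per parameter, a noise column can occasionally slip in by chance, which is one reason the selected set should be checked on held-out data.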

14 Features
The 26 features are reduced to 14 to limit overfitting.
The 14 features are:
  3’-UTR target-site abundance (TA_3UTR)
  Predicted seed-pairing stability (SPS)
  sRNA position 1 (sRNA1)
  sRNA position 8 (sRNA8)
  Site position 8 (site8)
  Local AU content (local_AU)
  3’ supplementary pairing (3P_score)
  Predicted structural accessibility (SA)
  Minimum distance from stop codon or polyadenylation site (min_dist)
  Probability of conserved targeting (PCT)
  ORF length (len_ORF)
  3’-UTR length (len_3UTR)
  Number of offset-6mer sites (off6m)
  ORF 8mer sites (ORF8m)

Simple linear regression
y = β0 + β1x + ε
Example: the output y is the house price and the input x is the number of bedrooms.
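A minimal sketch of the house-price example, assuming NumPy and five invented data points (prices in arbitrary units). `np.polyfit` with degree 1 returns the least-squares slope (β1) and intercept (β0).

```python
import numpy as np

# Invented data for illustration only.
bedrooms = np.array([1, 2, 3, 4, 5], dtype=float)
price = np.array([150, 200, 260, 300, 340], dtype=float)

# Degree-1 polynomial fit: coefficients come back highest power first.
slope, intercept = np.polyfit(bedrooms, price, 1)


def predict_price(n_bedrooms):
    """Predicted price from the fitted line y = intercept + slope * x."""
    return intercept + slope * n_bedrooms


print(slope, intercept, predict_price(6))
```

The residuals (observed price minus predicted price) play the role of ε in the equation above.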

Multilinear regression (2 features)
y = β0 + β1x1 + β2x2 + ε
Example: the output y is the house price; the inputs are x1, the size of the house (arbitrary units), and x2, the number of bedrooms.

Multilinear regression (14 features)
Sorry, no pretty picture this time!
y = β0 + β1x1 + β2x2 + … + β14x14 + ε
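The same fit extends to any number of inputs. The sketch below, assuming NumPy and purely synthetic data, estimates β0 to β14 by ordinary least squares; the random columns stand in for the 14 features and are not real TargetScan inputs, and the function names are hypothetical.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares estimate of [β0, β1, ..., βp] for y = β0 + Σ βj xj + ε."""
    design = np.column_stack([np.ones(len(y)), X])  # leading column of 1s for β0
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict(beta, X):
    """Apply fitted coefficients to new rows of features."""
    return beta[0] + X @ beta[1:]


# Synthetic example: 14 invented features, known coefficients, small noise.
rng = np.random.default_rng(1)
n_sites, n_features = 200, 14
X_demo = rng.normal(size=(n_sites, n_features))
true_beta = np.linspace(-1.0, 1.0, n_features + 1)  # β0 first, then β1..β14
y_demo = true_beta[0] + X_demo @ true_beta[1:] + rng.normal(scale=0.1, size=n_sites)

beta_hat = fit_linear(X_demo, y_demo)
print(np.round(beta_hat, 2))
```

With 200 rows and little noise the estimates land close to the coefficients used to generate the data; with real features, correlated inputs make individual coefficients harder to interpret.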

Multilinear regression
Agarwal et al. (2015)

TargetScan7

Exercise 3a
1. Visit the TargetScan 7 website (http://www.targetscan.org/vert_71/), or just type ‘targetscan7’ into Google if that is easier
2. Select the Human species in the first drop-down menu
3. Input ‘GNAI3’ as your human gene symbol
4. Click “Submit”
5. For conserved targets, find the average context++ score for each site type
6. Optional/extension: Repeat step 5 looking at poorly conserved sites
8. Repeat steps using a different gene symbol

Exercise 3b
1. Visit the TargetScan 7 website (http://www.targetscan.org/vert_71/), or just type ‘targetscan7’ into Google if that is easier
2. Select the Human species in the first drop-down menu
3. Choose ‘miR-7-5p’ as your broadly conserved miRNA family
4. Click “Submit”
5. What is the difference between ‘cumulative weighted context++’ and ‘total context++’?
7. What is the relationship, if any, between these two variables and the aggregate PCT?