Identifying Conserved microRNAs in a Large Dataset of Wheat Small RNAs

1 Identifying Conserved microRNAs in a Large Dataset of Wheat Small RNAs
Md Safiur Rahman Mahdi. Advisor: Dr. Michael Domaratzki, Department of Computer Science. July 27, 2015. Good afternoon, and welcome to my thesis defence. My thesis is titled “Identifying Conserved microRNAs in a Large Dataset of Wheat Small RNAs”. My advisor is Dr. Michael Domaratzki of the Department of Computer Science.

2 Key concepts MicroRNA (miRNA) Small RNA Differential expression

3 From cell to DNA Cells are the basic building blocks of all living things. Image taken from:

4 From DNA to Protein Transcription Translation
Central dogma of molecular biology: DNA (contains genes) → RNA → proteins. Image taken from:

5 Noncoding RNA (ncRNA) Image taken from:

6 miRNA: how does it work? Part of a gene encodes a small non-coding RNA called miRNA, which is only about 22 nucleotides long. miRNAs do not make proteins; they act in a completely different way: a miRNA finds the complementary portion of an mRNA and binds to it, which blocks translation from mRNA to protein. They are fundamental regulators of gene expression. Image taken from:

7 miRNA: 1000 nt → 70-600 nt → mature/star miRNA (17-24 nt), shown 5’ to 3’
There should be a difference of 0-4 nt between the mature miRNA and the star miRNA [Kozomara et al., 2014]

8 What is differential expression?
Gene expression that responds to signals or triggers. Changes in expression level between two sample groups (control vs. treatment). Up-regulated: increased in expression. Down-regulated: decreased in expression.

9 Motivation Wheat is an 11 billion dollar industry [NRCC, 2015]
How can we improve wheat breeding under possible climate change? Which miRNAs are differentially expressed under different stresses? NRCC = National Research Council Canada

10 Related work
Mayer et al., 2014: 98,068 putative precursor miRNAs; 52 miRNA sequences. Sun et al., 2014: used Mayer et al.’s precursors; 260 mature and star miRNAs.

11 Contribution Designed and implemented a toolchain that identifies conserved miRNAs. Identified differentially expressed miRNAs.

12 Methodology
4 conditions: no stress (control), heat (37 °C), light, UV (2 min). 6 days. 3 replicates. 72 input files (4*6*3), ~523 million reads, 21 GB → filtering → 15,158 sequences → BLASTn against plant miRNAs → conserved miRNAs

13 Tools used
Python/Biopython (Ubuntu Linux system), bash shell scripting (Hermes, WestGrid), BLAST, Bowtie2, MAFFT, RNAfold

14 Unique sequence identification
For each distinct read sequence: read count, and normalized read count in reads per million (RPM) = 1,000,000 * read count / total number of sequences
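The RPM normalization on this slide can be sketched in a few lines of Python. This is only an illustration, not the thesis code: the function name `unique_sequences_with_rpm` is hypothetical, and it assumes that "total number of sequences" means the total number of reads in one sample.

```python
from collections import Counter


def unique_sequences_with_rpm(reads):
    """Collapse reads into distinct sequences with raw count and RPM.

    Assumption: the RPM denominator is the total number of reads
    in this one sample.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    # RPM = 1,000,000 * read count / total number of sequences
    return {seq: (n, 1_000_000 * n / total) for seq, n in counts.items()}


# Toy sample: four reads, two distinct sequences.
reads = ["ACGT", "ACGT", "TTGA", "ACGT"]
table = unique_sequences_with_rpm(reads)
for seq, (count, rpm) in table.items():
    print(seq, count, rpm)
```

With real data the reads would come from the sequencing files rather than a hard-coded list.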

15 Removal of ncRNA sequences
Split each unique file into 300 files. Performed BLASTN against the Rfam database. Aggregated the 300 files into a single file.
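The split-and-aggregate step might look like the sketch below. The slide does not say how records were distributed across the 300 files, so the round-robin scheme and the helper names `split_records` and `aggregate` are assumptions, and the BLASTN run on each part is left out.

```python
def split_records(records, n_parts=300):
    """Distribute records round-robin into n_parts lists (assumed scheme)."""
    parts = [[] for _ in range(n_parts)]
    for i, record in enumerate(records):
        parts[i % n_parts].append(record)
    return parts


def aggregate(parts):
    """Merge the per-part results back into a single list."""
    merged = []
    for part in parts:
        merged.extend(part)
    return merged


# Toy input standing in for the unique sequences of one file.
records = [f"seq{i}" for i in range(1000)]
parts = split_records(records, 300)
merged = aggregate(parts)
print(len(parts), len(merged))
```

Splitting this way keeps the parts within one record of each other in size, which helps when each part is submitted as a separate cluster job.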

16 Filtering
Consistent naming. Kept sequences with at least 10 RPM [Montes et al., 2014]. Identified 15,158 sequences in total.
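A minimal sketch of the 10 RPM cut-off follows. The function name `filter_by_rpm` is hypothetical, and the filter is applied to a single sample here; how the threshold was applied across replicates is not stated on the slide.

```python
def filter_by_rpm(rpm_by_sequence, min_rpm=10.0):
    """Keep only sequences whose normalized count is at least min_rpm."""
    return {seq: rpm for seq, rpm in rpm_by_sequence.items() if rpm >= min_rpm}


# Toy RPM values for three sequences.
rpm_by_sequence = {"ACGTACGT": 152.3, "TTGACCAA": 9.9, "GGCATTCA": 10.0}
kept = filter_by_rpm(rpm_by_sequence)
print(sorted(kept))
```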

17 Conserved miRNA identification
Used miRNA database: miRBase. Discarded all miRNAs except those from plants. Performed BLASTN against miRBase. Considered 0-4 mismatches, no indels [Michael Axtell].

18 Conserved miRNA identification [continued]

19 Multiple sequence alignment
Matched species → Bowtie2 with wheat genome → contigs. Aligned: wheat, precursor, and conserved miRNA sequences, together with the experimental sequences.

20 Methodology (Mayer et al.’s supplementary materials)
4 conditions: no stress (control), heat (37 °C), light, UV (2 min). 6 days. 3 replicates. 72 input files (4*6*3), ~523 million reads, 21 GB → filtering → 15,158 sequences → Bowtie 2 against the putative precursors → putative mature miRNAs → star miRNA prediction → conserved miRNAs

21 Matched species

22 Differential gene expression
Mayer et al. and Sun et al.: 36 miRNAs, 232 sequences. Comparisons run with edgeR: Day 0: control vs. heat, control vs. light, control vs. UV. Day 1: control vs. heat, control vs. light, control vs. UV. … Day 10: control vs. heat, control vs. light, control vs. UV.

23 Result - miRBase
4 conditions: no stress (control), heat (37 °C), light, UV (2 min). 6 days. 3 replicates. 72 input files (4*6*3), ~523 million reads, 21 GB → filtering → 15,158 sequences → BLASTn against plant miRNAs → 87 plant miRNAs, 613 experimental sequences

24 Result - Mayer et al. and Sun et al.
4 conditions: no stress (control), heat (37 °C), light, UV (2 min). 6 days. 3 replicates. 72 input files (4*6*3), ~523 million reads, 21 GB → filtering → 15,158 sequences → Bowtie 2 against the putative precursors (Mayer et al.’s supplementary materials) → putative mature miRNAs (Sun et al.’s supplementary materials) → star miRNA prediction → 36 wheat miRNAs, 232 experimental sequences

25 Result - differential gene expression

26 miRNAs: day 7, all stresses

27 miRNAs - in all days, heat stress
miR 398, miR 5064, miR 2020b

28 Number of miRNA differentially expressed
Heat: 34 families. Light: 8 families. UV: 7 families.

29 Per day expression

30 Expression of miRNA families
miRNA 395 and 398 were strongly suppressed. miRNA 1439, 2020, 5064 and 5175 showed increased expression (+) with heat. miRNA 395 was suppressed on all days and under all stresses.

31 Toolchain validation Validated our toolchain with a Brassica rapa dataset [Bilichak et al., 2015]. The dataset includes pollen, embryo, endosperm and progeny tissue, with control and heat stress only.

32 Identified the same differential expression
The same miRNA-168 is expressed in endosperm tissue of heat-stressed plants. Bilichak et al. identified bra-miR168 with a log fold change of 6.48. We identified cca-miR168 with a log fold change of 5.83. Our log fold change may be lower because we mapped the Brassica rapa dataset to miRBase with 0 to 4 mismatches, which may have excluded some Brassica rapa miRNAs.

33 Comparison – miRBase result
More research has been done on miRNAs in Triticum aestivum and related species than in Brassica rapa, so miRBase contains more entries similar to Triticum aestivum miRNAs than to Brassica rapa miRNAs.

34 Summary
~523 million reads: filtered down to 15,158 sequences. miRBase: identified 87 plant miRNAs (613 sequences). Mayer et al. and Sun et al.: identified 36 wheat miRNAs (232 sequences). Differential gene expression: heat stress (most significant).

35 Summary
Find out the known conserved miRNAs

36 Future Work Novel miRNA prediction Predict novel miRNA from isomiR

37 Thank you

38 Related work
Mayer et al., 2014: identified 98,068 putative miRNA precursors; identified 52 miRNA sequences. Sun et al., 2014: used 11 libraries (dry grain, embryo etc.); used Mayer et al.’s 98,068 miRNA precursors; reported 260 mature miRNA and star sequences.

39 Our experimental data
21 GB dataset (~523 million small RNA reads) from leaf samples of 96 wheat plants. 72 files: 6 days (0, 1, 2, 3, 7, 10) * 4 stresses (for 3 days) * 3 replicates. Stresses: heat (37 °C), light (continuous light), UV (2 minutes of exposure/day), control. File size: 250 to 450 MB.
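The 72-file design (4 conditions, 6 days, 3 replicates) can be enumerated as a quick sanity check. The condition labels and the tuple layout below are illustrative only; the actual file names are not given in the slides.

```python
from itertools import product

# Labels are illustrative; the slides list the conditions but not file names.
conditions = ["control", "heat", "light", "uv"]
days = [0, 1, 2, 3, 7, 10]
replicates = [1, 2, 3]

# One (condition, day, replicate) tuple per input file.
samples = list(product(conditions, days, replicates))
print(len(samples))  # 4 * 6 * 3
```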

40 Methodology
Pre-processing: 1. Unique sequence identification, 2. ncRNA removal, 3. Filtering. Processing: 4. Conserved miRNA identification. Post-processing: 5. Multiple sequence alignment.

41 4. Conserved miRNA identification [continued]

42 4. Conserved miRNA identification [continued]

43 Identification of conserved miRNAs using supplementary materials of Mayer et al., 2014
Aligned the 10 RPM experimental sequences with the precursor database using Bowtie2. Finding an energetically stable structure of an RNA from its sequence is known as the minimum free energy (MFE) method. In the structure string, a dot “.” represents an unpaired base; an open parenthesis “(” represents a base (5' end) that is paired to another base ahead of it (3' end); a closed parenthesis “)” represents a base (3' end) that is paired to another base behind it (5' end). Considered exact matches and MFE <0.2 kcal/mol/nt [Kozomara et al., 2014]. Considered only the 52 wheat miRNAs provided by Kurtoglu et al., 2014.
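The dot and parenthesis notation described above can be decoded into explicit base pairs with a stack. This is a generic sketch of how such a structure string is read, not the thesis code; the function name `parse_dot_bracket` is hypothetical and positions are 0-based.

```python
def parse_dot_bracket(structure):
    """Return sorted (i, j) index pairs for a dot-bracket structure string.

    "(" opens a pair, ")" closes the most recently opened one,
    and "." marks an unpaired base.
    """
    stack = []
    pairs = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# A small hairpin: two paired bases on each side of a three-base loop.
hairpin_pairs = parse_dot_bracket("((...))")
print(hairpin_pairs)
```

Each returned pair links a base near the 5' end to its partner nearer the 3' end, which is the information needed to look for the strand opposite a mature miRNA in the hairpin.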

44 Finding star sequence:
5’ end: 2 base pair overhang [Kozomara et al., 2014]

45 Finding star sequence:
5’ end: 3’ end: 2 base pair overhang [Kozomara et al., 2014]

46 Exceptions

47 Exceptions [continued]

48 Differential gene expression
Combination of Mayer et al. and Sun et al., 10 RPM. 36 conserved miRNA families. 232 unique sequences (325 in total): 205 sequences from Mayer et al., 12 sequences from Sun et al., 15 sequences common to both. Used the edgeR package for the R programming language. 3 comparison files (control vs. heat, light, UV) * 6 days = 18 input files.

49 Differential gene expression

50 Conserved miRNA identification using miRBase database
Identified 87 conserved miRNA families. Matched with 613 sequences from the experiment (10 RPM). Many miRNA families matched multiple experimental sequences. Tae-miR159b matched 150 experimental sequences (the maximum).

51 Conserved miRNA identification using the supplementary materials of Mayer et al. and Sun et al.
232 unique sequences (325 in total): 205 sequences from Mayer et al., 12 sequences from Sun et al., 15 sequences common to both.

