28 June 2016, BSc presentation, M. Haberbusch

1 FORNAVIS
A tool that helps to optimize the Forna Container's RNA secondary structure graph generation
Max Haberbusch, BSc presentation, 28 June 2016

2 Overview
- RNA Secondary Structure
- RNA Secondary Structure Graph Plotting
- Forna Container
- Metrics
- Simulation
- Visualization Prototype
- Advantages & Disadvantages

3 RNA Secondary Structure

4 RNA Secondary Structure (1)
- RNA consists of nucleotides
- RNA encodes the blueprint for proteins
- Secondary structure describes base-pairing interactions between nucleotides
- Important for predicting and determining the function of RNA molecules
- FASTA format, extended with a dot-bracket line, is a common format for storing an RNA sequence together with its secondary structure:

> Somesequence
; this is a random RNA sequence and its
; secondary structure
AGAUAUGUGCCGGCCUAACUCUAACGGUAAUCUUCUGUCACGACCUACGCGCCGAGGUGACCUAUAAUAGCACGCACUACGCCGUCAACUACAGAGCAUU
((((...((((( )))))))))(((((...(((....(((...(((((..(((...)))..))).)).))).)))....))))).....
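To make the dot-bracket line concrete, the following minimal sketch turns such a string into a list of base pairs. It assumes the plain notation with only `(`, `)` and `.` (no pseudoknot symbols); the function name `parse_dot_bracket` is hypothetical, and the short sequence/structure pair is the one shown on the Simulation Workflow (2) slide.

```python
def parse_dot_bracket(structure: str) -> list[tuple[int, int]]:
    """Return base pairs (i, j), 0-based with i < j, from a dot-bracket string."""
    stack = []   # positions of '(' that are still waiting for a partner
    pairs = []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {j}")
            pairs.append((stack.pop(), j))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {j}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


sequence = "CGCUUCAUAUAAUCCUAAUGACCUAU"
structure = "((..((....)).(((....))).))"
pairs = parse_dot_bracket(structure)
for i, j in pairs:
    print(f"{sequence[i]}{i} pairs with {sequence[j]}{j}")
```

Each opening bracket is matched with the nearest unmatched closing bracket to its right, which is why a simple stack suffices.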

5 RNA Secondary Structure (2)
Different RNA secondary structure representations, depending on the purpose:
- Undirected Graph
- Mountain Plot
- Bonding Graph
- FASTA Format
[Figure: panels (a), (b), (c)]
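As an illustration of how one representation can be derived from another, here is a small sketch that computes mountain plot heights from a dot-bracket string. It assumes one common convention (the height at a position is the number of base pairs enclosing it, counting the position's own pair); other tools may offset the values differently. The function name `mountain_heights` is hypothetical.

```python
def mountain_heights(structure: str) -> list[int]:
    """Height per position: number of base pairs enclosing it (own pair included)."""
    heights = []
    level = 0
    for symbol in structure:
        if symbol == "(":
            level += 1              # step up when a pair opens
            heights.append(level)
        elif symbol == ")":
            heights.append(level)   # closing base sits at the same height as its partner
            level -= 1              # step down after the pair closes
        else:
            heights.append(level)   # unpaired base: plateau
    return heights


print(mountain_heights("((..))"))
```

Plotting these heights against the sequence position gives the characteristic "mountain" outline, with stems as slopes and loops as plateaus.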

6 RNA Secondary Structure as Undirected Graph

7 RNA Secondary Structure as Undirected Graph
- Important for identifying substructures
- Substructures: loops, hairpins, stems, etc.
- Visualization of canonical and non-canonical interactions between nucleotides
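A minimal sketch of the undirected-graph view: nucleotides become nodes, consecutive nucleotides are joined by backbone links, and paired nucleotides by base-pair links. The dictionary layout and the function names `build_structure_graph` and `hairpin_loops` are assumptions made for illustration, not the Forna Container's data model, and only plain dot-bracket input is handled.

```python
def build_structure_graph(structure: str) -> dict:
    """Build node, backbone-link and base-pair-link lists from a dot-bracket string."""
    n = len(structure)
    backbone = [(i, i + 1) for i in range(n - 1)]  # links along the sequence
    stack, basepairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            basepairs.append((stack.pop(), j))
    return {"nodes": list(range(n)), "backbone": backbone, "basepairs": sorted(basepairs)}


def hairpin_loops(structure: str, graph: dict) -> list[tuple[int, int]]:
    """Closing pairs (i, j) that enclose only unpaired bases, i.e. hairpin loops."""
    return [(i, j) for i, j in graph["basepairs"] if set(structure[i + 1:j]) <= {"."}]


structure = "((..((....)).(((....))).))"
graph = build_structure_graph(structure)
hairpins = hairpin_loops(structure, graph)
print(len(graph["backbone"]), "backbone links,", len(graph["basepairs"]), "base-pair links")
print("hairpins closed by:", hairpins)
```

Other substructures (stems, interior loops, multiloops) can be found in the same way by inspecting which base-pair links are nested directly inside one another.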

8 Examples from Google Search

9 Forna Container

10 Forna Container (1)
- Force-directed graph layout algorithm for plotting RNA secondary structure graphs in the web browser
- Implemented by the Theoretical Biochemistry Group at the Institute for Theoretical Chemistry, University of Vienna
- P. Kerpedjiev, S. Hammer, I. Hofacker. "Forna (force-directed RNA): simple and effective online RNA secondary structure diagrams." Bioinformatics (2015): btv372.

11 Forna Container (3)

12 Example Outputs
- Friction: 0.65, Charge: -40, Chargedistance: 80
- Friction: 0.65, Charge: -140, Chargedistance: 40
- Friction: 0.95, Charge: -200, Chargedistance: 150
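To give a feeling for what the three parameters might do, here is a toy force-directed step. It is a simplified sketch, not the Forna Container's implementation. It assumes that friction is a multiplier applied to the velocity on every step, that a negative charge makes nodes repel each other, and that chargedistance is a cutoff beyond which the charge force is ignored. The function `force_step` and the spring parameters `link_distance` and `link_strength` are hypothetical.

```python
import math


def force_step(pos, vel, links, friction, charge, charge_distance,
               link_distance=15.0, link_strength=0.5):
    """Advance a toy force-directed layout by one step; returns (new_pos, new_vel)."""
    n = len(pos)
    force = [[0.0, 0.0] for _ in range(n)]

    # Pairwise charge force, ignored beyond charge_distance (the cutoff).
    for a in range(n):
        for b in range(a + 1, n):
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy)
            if dist == 0.0 or dist > charge_distance:
                continue
            k = -charge / (dist * dist)  # negative charge gives k > 0: repulsion
            force[b][0] += k * dx
            force[b][1] += k * dy
            force[a][0] -= k * dx
            force[a][1] -= k * dy

    # Links act as springs pulling their endpoints toward link_distance.
    for a, b in links:
        dx = pos[b][0] - pos[a][0]
        dy = pos[b][1] - pos[a][1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue
        k = link_strength * (dist - link_distance) / dist
        force[a][0] += 0.5 * k * dx
        force[a][1] += 0.5 * k * dy
        force[b][0] -= 0.5 * k * dx
        force[b][1] -= 0.5 * k * dy

    # Friction damps the velocity: values near 1 keep nodes moving longer.
    new_vel = [((vel[i][0] + force[i][0]) * friction,
                (vel[i][1] + force[i][1]) * friction) for i in range(n)]
    new_pos = [(pos[i][0] + new_vel[i][0], pos[i][1] + new_vel[i][1]) for i in range(n)]
    return new_pos, new_vel


pos = [(0.0, 0.0), (10.0, 0.0)]
vel = [(0.0, 0.0), (0.0, 0.0)]
moved, _ = force_step(pos, vel, [], friction=0.65, charge=-40, charge_distance=80)
unmoved, _ = force_step(pos, vel, [], friction=0.65, charge=-40, charge_distance=5)
print(moved, unmoved)
```

Under these assumptions, a stronger negative charge spreads the structure out, a larger chargedistance lets distant parts of the graph push on each other, and a higher friction value lets the layout keep moving for more steps before it settles.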

13 Metrics

14 Nodecollisions & Linkcollisions
- Backbone link overlaps
- Node overlaps
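The slides do not give the exact definitions of these two metrics, so the following is a sketch under assumed definitions: a node collision is a pair of node circles whose centers are closer than two radii, and a link collision is a pair of backbone links that cross without sharing a node. The function names and the `radius` parameter are hypothetical.

```python
from itertools import combinations


def count_node_collisions(positions, radius):
    """Number of node pairs whose circles of the given radius overlap."""
    limit = (2 * radius) ** 2
    return sum(1 for (x1, y1), (x2, y2) in combinations(positions, 2)
               if (x1 - x2) ** 2 + (y1 - y2) ** 2 < limit)


def _orient(p, q, r):
    """Signed area test: > 0 if r lies to the left of the line p -> q."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _segments_cross(a, b, c, d):
    """True if segment a-b properly crosses segment c-d."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def count_link_collisions(positions, links):
    """Number of link pairs that cross each other without sharing a node."""
    count = 0
    for (a, b), (c, d) in combinations(links, 2):
        if len({a, b, c, d}) < 4:
            continue  # neighbouring links always touch at their shared node
        if _segments_cross(positions[a], positions[b], positions[c], positions[d]):
            count += 1
    return count


positions = [(0, 0), (10, 10), (10, 0), (0, 10)]
links = [(0, 1), (2, 3)]  # two diagonals of a square
print(count_link_collisions(positions, links))
print(count_node_collisions(positions, radius=6))
```

Both counts are zero for a clean drawing, so lower values indicate a better layout under these definitions.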

15 Linklength Deviation & Looproundness
Friction: 0.95, Charge: -190, Chargedistance: 150
Linkcollisions: 0, Nodecollisions: 0, Linklength Deviations: 6.5, Loop Roundness: 0.77
vs
Friction: 0.65, Charge: -40, Chargedistance: 80
Linkcollisions: 0, Nodecollisions: 0, Linklength Deviations: 1.48, Loop Roundness: 0.22

16 Looproundness

17 Simulation

18 Simulation

19 Sampling
- Friction: [0.3 ; 0.95] with step size of 0.05: {0.3, 0.35, 0.4, …, 0.95}
- Charge: [-30 ; -200] with step size of 10: {-30, -40, -50, …, -200}
- Chargedistance: [30 ; 150] with step size of 10: {30, 40, 50, …, 150}
- Basic quantity (the set of all combinations): BQ = friction × charge × chargedistance, |BQ| = 14 × 18 × 13 = 3276 combinations
- Random sampling: 1000 combinations, picked at random from the basic quantity
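The sampling scheme can be reproduced in a few lines. This is a sketch rather than the author's script: it assumes sampling without replacement, and the fixed seed and the helper name `grid` are choices made here for reproducibility.

```python
import itertools
import random


def grid(start, stop, step, digits=2):
    """Evenly spaced values from start to stop inclusive (step may be negative)."""
    n = round((stop - start) / step)
    return [round(start + k * step, digits) for k in range(n + 1)]


frictions = grid(0.30, 0.95, 0.05)      # 14 values
charges = grid(-30, -200, -10)          # 18 values
charge_distances = grid(30, 150, 10)    # 13 values

# Basic quantity: the Cartesian product of the three value sets.
base_set = list(itertools.product(frictions, charges, charge_distances))

rng = random.Random(0)                  # assumed seed, for reproducibility only
sample = rng.sample(base_set, 1000)     # 1000 distinct combinations

print(len(base_set), len(sample))
```

Rounding inside `grid` avoids floating-point drift such as 0.35000000000000003 in the friction values.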

20 Random Sampling
Why?
- Clutter reduction (visualization): 1000 points vs 3276 points per structure
- Runtime reduction (simulation):
  - Basic quantity: 3276 × 100 × 5 s ≈ 19 days
  - Random sample: 1000 × 100 × 5 s ≈ 6 days
- File size reduction (simulation results)
Literature
- G. Ellis, Random Sampling as a Clutter Reduction Technique to Facilitate Interactive Visualisation of Large Datasets (2008)
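The runtime estimates can be checked with a short calculation. The factors 100 and 5 s are taken from the slide as they stand; the slide does not say what the factor 100 counts, so the parameter name `runs` below is only a placeholder.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def runtime_days(combinations, runs, seconds_per_run):
    """Total sequential runtime in days for the given number of simulation runs."""
    return combinations * runs * seconds_per_run / SECONDS_PER_DAY


full = runtime_days(3276, 100, 5)      # all combinations of the basic quantity
sampled = runtime_days(1000, 100, 5)   # random sample of 1000 combinations

print(f"basic quantity: {full:.1f} days, random sample: {sampled:.1f} days")
```

Both figures on the slide are these values rounded to whole days.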

21 Simulation
- 1000 parameter combinations per structure
- Currently 74 structures

22 Visualization Prototype

23 Advantages & Disadvantages

24 Advantages
- Allows quickly determining optima via the Pareto view
- Identifying the distribution of simulation results
- Understanding the performance of specific parameter combinations on different structure sizes
- Identifying input parameter trends
- Comparing the quality of the graph with regard to two metrics
- Testing and comparing parameter combinations directly via side-by-side comparison of the drawn structures
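The idea behind a Pareto view can be sketched as follows: among all simulated parameter combinations, keep only those that no other combination beats on both metrics at once. The sketch assumes that smaller is better for both metrics (a metric where larger is better would have to be negated first); the function names and the sample values are made up for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b on every metric and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Points that are not dominated by any other point (smaller is better)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (metric_1, metric_2) results for five parameter combinations.
results = [(0, 5), (1, 1), (2, 2), (5, 0), (1, 3)]
front = pareto_front(results)
print(front)
```

The quadratic scan is adequate for about 1000 points per structure; larger datasets would call for a sort-based approach.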

25 Disadvantages
- Just two metrics in the visualization
- Only three input parameters in the visualization
- No multiple point selection for better comparison
- No statement on the stability of the algorithm
- Performance (load times)
- Difficult or impossible to identify combinations in the input parameter scatter plot
- Red-green-blue color coding is problematic for colorblind viewers

26 Improvements
- Parallel coordinates instead of, or in addition to, input parameter scatter plots
- Possibly color coding of points in input parameter scatter plots
- Performance tuning (fine-tuning of caching)
- Highlighted points placed on top of others
- Input parameter trends depending on structure size
- Two Pareto scatter plots side by side to compare outcomes for two different structure sizes

27 Thank you for your attention!

28 Any Questions?

29 Simulation Workflow (1)
- rsampling.php to generate N random parameter combinations
- rstructuress.php to generate N random RNA sequences of different lengths in FASTA format
- RNAfold from the ViennaRNA Package to generate the secondary structures
- startsimulation.sh to run the simulation

30 Simulation Workflow (2)
combinations.json
structures.struct
CGCUUCAUAUAAUCCUAAUGACCUAU
((..((....)).(((....))).))
[…]
./startsimulation.sh combinations.json structures.struct simulationresults

31 Simulation Workflow (3)
Simulation output:
- File format: JSON
- One file per structure containing all iterations

32 Simulation Dataset
Structure of the dataset to load in the simulation

