Development of an interactive pipeline for Genome wide association analysis Falola Damilare & Adigun Taiwo – Covenant University Bioinformatics research.


Development of an interactive pipeline for Genome wide association analysis Falola Damilare & Adigun Taiwo – Covenant University Bioinformatics research – Nigeria (dare.falola@cu.edu.ng & taiwo.adigun@covenantuniversity.edu.ng) WACREN e-Research Hackfest – Lagos (Nigeria)

Outline Background information; Scientific Problem, Aim and benefits; Computational & Data model; Implementation strategy; Typical user action workflow; Summary and Conclusion

Background Information The need to tailor healthcare and treatment therapies to individual patients, based on their genetic make-up and other biological features, is becoming more essential in today's clinical practice. Genome-Wide Association Studies (GWAS) have been applied extensively to uncover variations, known as Single Nucleotide Polymorphisms (SNPs), and genes related to different diseases, traits and clinical symptoms.

Background Information A genome-wide association study involves: collecting many unrelated individuals with and without a specific trait or disease; using high-throughput genotyping technologies to assay hundreds of thousands of single-nucleotide polymorphisms (SNPs) in those individuals; and relating the genotyped SNPs to clinical conditions and measurable traits, using appropriate statistical techniques (e.g. chi-square tests, logistic regression), to find which SNPs might be associated with the disease.
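As an illustration of the statistical step, the chi-square test named above can be sketched for a single SNP as a 2x2 test on allele counts in cases and controls. This is a minimal sketch, not the pipeline's own code: the function name `allelic_chi_square` and the example counts are assumptions made for illustration, and a real analysis would also need quality control and multiple-testing correction.

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele table.

    case_counts and control_counts are (minor_allele_count, major_allele_count)
    tuples. Hypothetical helper, written only to illustrate the test.
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row_cases, row_controls = a + b, c + d
    col_minor, col_major = a + c, b + d
    if 0 in (row_cases, row_controls, col_minor, col_major):
        raise ValueError("every row and column total must be non-zero")
    # Closed form of the Pearson statistic for a 2x2 table (no continuity correction).
    chi2 = n * (a * d - b * c) ** 2 / (row_cases * row_controls * col_minor * col_major)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up counts: 60 minor / 40 major alleles in cases, 40 / 60 in controls.
    chi2, p = allelic_chi_square((60, 40), (40, 60))
    print(f"chi2={chi2:.3f}, p={p:.5f}")
```

In a genome-wide scan the same test is repeated for every genotyped SNP, which is why the resulting p-values are compared against a much stricter threshold than the usual 0.05.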

Background Information

Typical GWAS workflow

Scientific Problem, Aim and benefits Problem: A typical GWAS analysis involves numerous complex commands from different languages, which complicates the work for researchers. State-of-the-art GWAS data analysis also requires large computing and storage resources, which might not be available to most researchers in Africa or other developing regions. Aim: The aim of this project is to develop and implement an e-infrastructure that will provide state-of-the-art GWAS analysis to local researchers. The tool will bundle all the tools that the analysis requires. Benefits: This allows users to focus mainly on the research problem by making the analysis process a black box, which should lead to better and more accurate research results. The solution also adds user interactivity, providing better visualization of results, swift comparison of results from different types of analysis, and management of several projects.

Computational & Data model

Typical user action workflow: The main users of the system are public health or medical researchers, scientists, and bioinformaticians who have genotype and phenotype data to upload. The data can be uploaded as a raw-intensity file, for analysis starting at the first phase; in PLINK format, for analysis starting at the second phase; or as a list of significant SNPs, for the third phase. A typical GWAS analysis involves three main phases: SNP chip genotype calling, association testing, and post-GWAS analysis.

Typical user action workflow: Phase I includes four stages: initial quality control, genotype calling, post-calling quality control, and conversion to PLINK file format. Phase II includes four stages: quality control, population stratification correction, association testing, and result visualization. Phase III involves the annotation of the biologically significant markers that were associated with the disease phenotype in Phase II.
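The relationship between the uploaded input type and the phase at which analysis starts can be sketched as a small lookup. This is only an illustrative sketch: the phase and stage names are taken from the slides, while the identifiers `PHASES`, `START_PHASE`, `plan_analysis` and the input-kind labels are hypothetical and not part of the described system.

```python
# Phases and their stages, in execution order, as listed on the slides.
PHASES = [
    ("Phase I: SNP chip genotype calling", [
        "initial quality control",
        "genotype calling",
        "post-calling quality control",
        "conversion to PLINK file format",
    ]),
    ("Phase II: association testing", [
        "quality control",
        "population stratification correction",
        "association testing",
        "result visualization",
    ]),
    ("Phase III: post-GWAS analysis", [
        "annotation of significant markers",
    ]),
]

# Hypothetical labels for the three accepted upload types.
START_PHASE = {
    "raw_intensity": 0,  # raw-intensity file: start at Phase I
    "plink": 1,          # PLINK-format data: start at Phase II
    "snp_list": 2,       # list of significant SNPs: start at Phase III
}


def plan_analysis(input_kind):
    """Return the phases (with their stages) that remain to run for an upload."""
    try:
        start = START_PHASE[input_kind]
    except KeyError:
        raise ValueError(f"unsupported input kind: {input_kind!r}") from None
    return PHASES[start:]


if __name__ == "__main__":
    for name, stages in plan_analysis("plink"):
        print(name)
        for stage in stages:
            print("  -", stage)
```

Keeping the phases in one ordered list means that a user who uploads data at a later phase simply skips the earlier entries, without a separate code path per input type.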

Implementation strategy Back-end Each sub-stage of every phase has been implemented as standalone Bash, Perl and R scripts and Java source code. The business logic of the system will be implemented using Java technologies, including Servlets and JavaServer Pages. The scripts for each phase will be parallelized using the process input and output declarations of the Nextflow DSL (Domain Specific Language). Complex stages such as population stratification will be put into separate Nextflow pipeline scripts. The Java API for RESTful Web Services (JAX-RS) and JavaScript Object Notation (JSON) will be used to give developers programmatic access to the web application. FutureGateway will be used to provide access to distributed computing resources such as grid, cloud and HPC systems.

Implementation strategy Front-end Datasets will be uploaded into a storage element via FTP or the Globus Online APIs for Java. gLibrary will be used to manage metadata about the data. HTML5 and JavaScript will be used for UI design. The interface will be styled using Cascading Style Sheets (CSS), and the system will be made mobile responsive using CSS3 @media queries. The database will be built using MySQL, a relational database management system (RDBMS).

Summary and conclusions This solution makes GWAS analysis easier to perform by requiring only a limited understanding of its computational needs from researchers. This allows them to focus mainly on the research problem and to give a better biological interpretation of the results.

Thank you! Special appreciation to Abayomi Mosaku, Bruce Becker and Mario Torrisi. sci-gaia.eu info@sci-gaia.eu