Agenda: VAPrototype, Genome Services, iPlant API

Presentation transcript:

1 Agenda
- VAPrototype
- Genome Services
- iPlant API

GenAV: Analysis & Visualization of Affymetrix Gene Expression Data
[Diagram. Labels: User; GUI; Design info/file; GEO#; GO terms of interest; AffyGen Analyser; GenExpr2Ddata; GenExprSum; ContrastFiles; Gene Lists; GO Analyser (BiNGO); Enriched GO terms; EnrichedGO graphs; GeneMANIA; GeneMANIA output; Output (text files, graphs)]

3 LIVE DEMO

4 How did we get here?
Working group-led iteration and discussion (Jan-June)
- Componentization
- Reusability
- Identification of potential GUI representations for work products
Summer Supercomputing Institute Meeting (July)
- Refinement of workflow
- Identification of entry and exit points
- Iteration on GUI representations
Cyberinfrastructure-oriented design
Implementation decisions
- Technology/language
- Work allocation

5 Expression Analysis
BioConductor limma:
- Retrieve data
- Specify experiment design
- Normalize (gcRMA)
- Linear model fit
- Bayesian correction
- Hypothesis testing
- Emit results
Diagram labels: NCBI GEO; iPlant Data Storage API
Notes:
- Limma is a standard module for expression analysis.
- Limma incorporates translation and integration code to handle most common array platforms.
- Limma writes verbose but consistent delimited results.
- People know how to use BioC/Limma and can do so on their desktop systems.
- The entry point is a user uploading an expression file into the iPlant Data API.

6 VAPrototype
VAPrototype:
- Retrieve data via /data API
- Iterate over experiments
- Perform category enrichment
- Consolidate results
- Return as JSON data structure
1. Invoke VAPrototype via iPlant /jobs API
2. Poll for service to complete
3. Fetch results as JSON
4. Render to dynamic table
5. Interpret user interactions
Lecong.cgi:
- Accept gene list
- Accept control list
- Accept parameters
- Run analysis using call to R
- Return JSON data structure
Diagram labels: iPlant Jobs API; iPlant Data API; R/Bioconductor/HyperGO
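The slide lists "Perform category enrichment" without saying how it is done; it only names R/Bioconductor/HyperGO as the engine. As a rough illustration, and not the VAPrototype code itself, the one-sided hypergeometric test that over-representation analyses of this kind are commonly built on could be sketched in Python as follows. The function name and the toy counts are assumptions made for this example.

```python
from math import comb


def enrichment_p_value(universe_size, category_size, list_size, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    universe_size: genes in the background (e.g. all genes on the array)
    category_size: background genes annotated with the category
    list_size:     genes in the user's gene list
    overlap:       genes in the list that carry the category
    """
    if not 0 <= category_size <= universe_size:
        raise ValueError("category_size must lie within the universe")
    if not 0 <= list_size <= universe_size:
        raise ValueError("list_size must lie within the universe")

    max_overlap = min(category_size, list_size)
    if overlap > max_overlap:
        return 0.0  # an overlap this large cannot occur

    outside = universe_size - category_size
    total = comb(universe_size, list_size)
    # Sum the probability mass of every overlap at least as large as observed.
    tail = sum(
        comb(category_size, k) * comb(outside, list_size - k)
        for k in range(max(overlap, 0), max_overlap + 1)
    )
    return tail / total


# Toy numbers (hypothetical): 10 genes in total, 4 in the category,
# a list of 3 genes, 2 of which are in the category.
p = enrichment_p_value(10, 4, 3, 2)
```

A real pipeline would run this per category and then correct for multiple testing, which this sketch leaves out.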

7
1. Interpret user interactions
   1. Sorting
   2. Downloading
   3. Invoke Network Analysis service via iPlant /jobs API
   4. Poll /jobs for completion
   5. Fetch results (GraphML)
   6. Render in Cytoscape Web
BuildNetwork:
- Accept gene list
- Accept parameters (species, etc.)
- Accept algorithm name (GeneMANIA)
- Invoke GeneMANIA plugin (Java) to predict network
- Convert all gene names to AGI codes
- Convert domain-specific report to GraphML
Diagram labels: iPlant Jobs API; iPlant Genome Service API; GeneMANIA
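The last BuildNetwork step, converting a report to GraphML, can be illustrated with a minimal sketch. The slides do not describe the GeneMANIA report format, so the input here is assumed to be already reduced to (source, target, weight) tuples keyed by AGI code; the function name, the "weight" attribute, and the sample edge are all hypothetical.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def edges_to_graphml(edges):
    """Serialize (source, target, weight) tuples as an undirected GraphML graph."""
    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})
    # Declare one edge attribute; "weight" is an assumed name for this sketch.
    ET.SubElement(root, "key", {
        "id": "w", "for": "edge",
        "attr.name": "weight", "attr.type": "double",
    })
    graph = ET.SubElement(root, "graph", {"id": "G", "edgedefault": "undirected"})

    # Emit each distinct node once, in a stable order.
    node_ids = sorted({n for source, target, _ in edges for n in (source, target)})
    for node_id in node_ids:
        ET.SubElement(graph, "node", {"id": node_id})

    for index, (source, target, weight) in enumerate(edges):
        edge = ET.SubElement(graph, "edge", {
            "id": f"e{index}", "source": source, "target": target,
        })
        data = ET.SubElement(edge, "data", {"key": "w"})
        data.text = repr(float(weight))

    return ET.tostring(root, encoding="unicode")


# Toy input: one predicted interaction between two AGI codes.
graphml_text = edges_to_graphml([("AT1G01010", "AT1G01020", 0.5)])
```

The resulting string is the kind of document a GraphML-aware viewer such as Cytoscape Web would be handed in the rendering step.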


10 What’s next
- VAPrototype won’t see any explicit additional development, since it is a proof of principle.
- We need to focus on delivering robust versions of the functions that are mocked up.
- It serves as a reference implementation for a 3rd-party DE.
- It also illuminates specific data integration needs.
- We may use it as a testing ground for new ideas in GUI, service coordination, and API design.
- It will be ported to use the full implementation of the iPlant API and used as an example for potential developers.
  - Web application portion: 1 day
  - Web services: 1 week

11 Genome Services
Why is this needed? This is G2P, not genomics!
- Support multiple genomes in UHTS services
- Support germplasms and natural accessions
- Pave the way to supporting user genomes
- Make best use of existing resources
- Sane, authority-led approach to data integration

Current Ideas
- Return a structured list of taxonomic identifiers (genus, species, version, germplasm/accession) supported by iPlant
- Given a genus, species, version, and germplasm/accession identifier:
  - Return a URI pointing to a multiple-FASTA containing the genome sequence
  - Return a URI pointing to a GFF3 version of the genome annotation
  - Return a URI pointing to a GTF version of the genome annotation
  - Return a URI pointing to the dummy expr files needed by Cufflinks for RNAseq
  - Be able to actually return the files referenced by these URIs for download
- Given the taxonomic identifier plus a name or synonym of a gene:
  - Return an authoritative name for said gene
- Given the taxonomic identifier plus a microarray platform name plus a probe identifier:
  - Return the canonical gene name mapped to that microarray probe
Diagram labels: iPlant Genome Services API; Clade-specific data authorities; NCBI and EBI; Local Knowledge; Mirroring relationships
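To make the proposed lookups concrete, here is a small in-memory sketch of the three kinds of query listed above. It is not the iPlant Genome Services API, which the slides present only as a set of ideas; every class, method, and field name is invented for illustration, the URIs point at example.org, and the probe identifier is a toy value.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True, order=True)
class GenomeKey:
    """Taxonomic identifier: genus, species, version, germplasm/accession."""
    genus: str
    species: str
    version: str
    accession: str


@dataclass
class GenomeRecord:
    uris: Dict[str, str]                      # resource kind -> URI
    synonyms: Dict[str, str] = field(default_factory=dict)   # lowercase name -> authoritative name
    probes: Dict[Tuple[str, str], str] = field(default_factory=dict)  # (platform, probe) -> gene


class GenomeRegistry:
    def __init__(self) -> None:
        self._records: Dict[GenomeKey, GenomeRecord] = {}

    def register(self, key: GenomeKey, record: GenomeRecord) -> None:
        self._records[key] = record

    def list_genomes(self) -> List[GenomeKey]:
        """Structured list of supported taxonomic identifiers."""
        return sorted(self._records)

    def resource_uri(self, key: GenomeKey, kind: str) -> str:
        """URI of a genome resource such as 'fasta', 'gff3' or 'gtf'."""
        return self._records[key].uris[kind]

    def authoritative_name(self, key: GenomeKey, name: str) -> str:
        """Map a gene name or synonym to its authoritative name."""
        return self._records[key].synonyms[name.lower()]

    def gene_for_probe(self, key: GenomeKey, platform: str, probe_id: str) -> str:
        """Canonical gene name mapped to a microarray probe."""
        return self._records[key].probes[(platform, probe_id)]


# Toy data, for illustration only.
col0 = GenomeKey("Arabidopsis", "thaliana", "TAIR10", "Col-0")
registry = GenomeRegistry()
registry.register(col0, GenomeRecord(
    uris={
        "fasta": "https://example.org/genomes/ath/TAIR10.fa",
        "gff3": "https://example.org/genomes/ath/TAIR10.gff3",
        "gtf": "https://example.org/genomes/ath/TAIR10.gtf",
    },
    synonyms={"nac001": "AT1G01010", "at1g01010": "AT1G01010"},
    probes={("toy-array", "probe_0001"): "AT1G01010"},
))
```

Returning URIs rather than file contents keeps the lookup cheap and leaves the actual download to a separate step, which matches the split the slide draws between "return a URI" and "return the files".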

Genome Services
[Diagram. Labels: iPlant Genome Services API; Clade-specific data authorities; NCBI and EBI; Local Knowledge; Mirroring relationships; Direct relationships; Indirect relationships (CoGe); Taxonomic Name Resolution Service (TNRS); Discovery Environment; TAIR; Gramene; Phytozome; etc.]

14 The iPlant API
The iPlant API will support the following use cases:
1. I have a command-line tool that performs a specific type of bioinformatics analysis and I want to make it available to others.
2. I have a web service that performs a specific type of bioinformatics analysis and I want to make it available to others.
3. I have a web site that people can use to perform analyses and I want to make it available to others.
4. I want to write a web application that chains multiple types of tools together.
5. I want to use a workflow manager like Taverna or Kepler to orchestrate a set of analytical steps.

Architecture

Core Services
- Eventing
- I/O
- Data Transforms
- App Discovery
- Job Mgmt.
- User Profile Mgmt.
- Authentication
- User/Project Auditing
- Mashups (Orchestration)

I/O Services
Getting raw data into and out of the iPlant CI and moving data around internally
- /io: upload files and stage URIs (http, https, ftp, sftp, gsiftp, jdbc, amazon s3, irods)
- /io/list: list iPlant files
- /io/ : download, delete file
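A client might want to check that a URI uses one of the listed protocols before asking /io to stage it. The sketch below shows one way to do that with the Python standard library. It is an assumption-laden illustration: the slide does not say how the service validates URIs, the helper name is made up, and mapping "amazon s3" to the scheme "s3" is a guess.

```python
from urllib.parse import urlsplit

# Schemes taken from the slide; "s3" stands in for "amazon s3" (assumed).
STAGEABLE_SCHEMES = frozenset(
    {"http", "https", "ftp", "sftp", "gsiftp", "jdbc", "s3", "irods"}
)


def stageable_scheme(uri):
    """Return the URI's scheme if it is in the supported set, else None."""
    scheme = urlsplit(uri).scheme.lower()
    return scheme if scheme in STAGEABLE_SCHEMES else None


# Hypothetical inputs.
ok = stageable_scheme("gsiftp://gridftp.example.org/data/run1.cel")
rejected = stageable_scheme("file:///etc/hosts")
```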

Job Management Services
Submitting and managing jobs to run supported applications as well as querying for historical information about jobs
- /job: submit a job
- /job/history: retrieve job history
- /job/ : kill an active job or get information about a job
- /job/ /input/list: get a listing of the input files associated with a specific job
- /job/ /input/ : retrieve a specific input file in the format it was in when the job ran
- /job/ /output/list: get a listing of the output files associated with a specific job
- /job/ /output/ : retrieve a specific output file associated with the job
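These endpoints imply the submit, poll, fetch cycle that the VAPrototype client slides describe. A minimal sketch of that loop follows, with the HTTP calls passed in as plain functions so the control flow can be shown without guessing at request formats. The status strings, the function name, and the parameter names are all hypothetical; the slides do not specify them.

```python
import time

# Assumed status values; the real service's vocabulary is not given in the slides.
SUCCESS_STATE = "FINISHED"
TERMINAL_STATES = {SUCCESS_STATE, "FAILED", "KILLED"}


def run_job(submit, get_status, fetch_output, request,
            poll_interval=5.0, max_polls=120, sleep=time.sleep):
    """Submit a job, poll until it reaches a terminal state, return its output.

    submit(request) -> job id          (would wrap a call to /job)
    get_status(job_id) -> status str   (would wrap a call to the per-job endpoint)
    fetch_output(job_id) -> result     (would wrap a call to the job's output endpoint)
    """
    job_id = submit(request)
    for _ in range(max_polls):
        status = get_status(job_id)
        if status in TERMINAL_STATES:
            break
        sleep(poll_interval)
    else:
        # The loop ran out of polls without seeing a terminal state.
        raise TimeoutError(f"job {job_id} still running after {max_polls} polls")

    if status != SUCCESS_STATE:
        raise RuntimeError(f"job {job_id} ended in state {status}")
    return fetch_output(job_id)
```

Passing `sleep` as a parameter keeps the loop testable with canned responses; a real client would supply functions that issue the HTTP requests and parse the replies.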

Application Discovery Services
Application discovery and management (different from semantic web service discovery)
- /apps: add a new application to the iPlant CI
- /apps/list: list all supported applications
- /apps/search: search for a specific application
- /apps/type/list: list all supported application types
- /apps/type/ : list all supported applications of a specific type
- /apps/name/ : list all supported applications matching a given name