Developing Cyberinfrastructure to Support Computational Chemistry Workflows Marlon Pierce (IU), Suresh Marru (IU), Sudhakar Pamidighantam (NCSA) Sashikiran.

Slides:



Advertisements
Similar presentations
LEAD Portal: a TeraGrid Gateway and Application Service Architecture Marcus Christie and Suresh Marru Indiana University LEAD Project (
Advertisements

Pulan Yu School of Informatics Indiana University Bloomington Web service based Varuna.Net.
S.J. Coles a*, J.G. Frey a, M.B. Hursthouse a, L. Carr b & C.J. Gutteridge b. a School of Chemistry, University of Southampton, UK.; b School of Electronics.
The CLARION Project for the Infrastructure for Integration in Structural Sciences (I2S2) mtg, Rutherford Labs, 11 th February 2010 CLARION – Chemical Laboratory.
GridChem and ParamChem: Science Gateways for Computational Chemistry (and More) Marlon Pierce, Suresh Marru Indiana University Sudhakar Pamidighantam NCSA.
Hydra Partners Meeting March 2012 Bill Branan DuraCloud Technical Lead.
ProActive Task Manager Component for SEGL Parameter Sweeping Natalia Currle-Linde and Wasseim Alzouabi High Performance Computing Center Stuttgart (HLRS),
1 Software & Grid Middleware for Tier 2 Centers Rob Gardner Indiana University DOE/NSF Review of U.S. ATLAS and CMS Computing Projects Brookhaven National.
Indiana University QuakeSim Activities Marlon Pierce, Geoffrey Fox, Xiaoming Gao, Jun Ji, Chao Sun.
Open Grid Computing Environments Marlon Pierce, Suresh Marru, Gregor von Laszewski, Mary Thomas, Maytal Dahan, Gopi Kandaswamy, and Wenjun Wu.
Integrating Chemistry Scholarship with Web Architectures, Grid Computing and Semantic Web Sashi Kiran Challa, Marlon Pierce, Suresh Marru Indiana University,
TeraGrid Gateway User Concept – Supporting Users V. E. Lynch, M. L. Chen, J. W. Cobb, J. A. Kohl, S. D. Miller, S. S. Vazhkudai Oak Ridge National Laboratory.
The Open Gateway Computing Environment: Experiences Developing Tools for Scientific Communities in the Apache Software Foundation Marlon Pierce Indiana.
Apache Airavata GSOC Knowledge and Expertise Computational Resources Scientific Instruments Algorithms and Models Archived Data and Metadata Advanced.
E-Science Workflow Support with Grid-Enabled Microsoft Project Gregor von Laszewski and Leor E. Dilmanian, Rochester Institute of Technology Abstract von.
Cluster Computing through an Application-oriented Computational Chemistry Grid Kent Milfeld and Chona Guiang, Sudhakar Pamidighantam, Jim Giuliani Supported.
EUROPEAN UNION Polish Infrastructure for Supporting Computational Science in the European Research Space The Capabilities of the GridSpace2 Experiment.
Future Grid Future Grid User Portal Marlon Pierce Indiana University.
National Center for Supercomputing Applications The Computational Chemistry Grid: Production Cyberinfrastructure for Computational Chemistry PI: John Connolly.
National Center for Supercomputing Applications GridChem: Integrated Cyber Infrastructure for Computational Chemistry Sudhakar.
Towards a Javascript CoG Kit Gregor von Laszewski Fugang Wang Marlon Pierce Gerald Guo
CloudCom Software for Science Gateways: Open Grid Computing Environments Marlon Pierce, Suresh Marru Pervasive Technology Institute Indiana University.
Software for Science Gateways: Open Grid Computing Environments Marlon Pierce, Suresh Marru Pervasive Technology Institute Indiana University
OGCE Workflow Suite GopiKandaswamy Suresh Marru SrinathPerera ChathuraHerath Marlon Pierce TeraGrid 2008.
Flexibility and user-friendliness of grid portals: the PROGRESS approach Michal Kosiedowski
Grids and Portals for VLAB Marlon Pierce Community Grids Lab Indiana University.
GridFE: Web-accessible Grid System Front End Jared Yanovich, PSC Robert Budden, PSC.
Open Grid Computing Environments: Advanced Gateway Support Activities RT Project Review October 7 th, 2010.
The ACGT Workflow Editing & Enactment Environment Giorgos Zacharioudakis Institute of Computer Science, Foundation for Research & Technology – Hellas (ICS-FORTH)
UltraScan Gateway Advanced Support GIG Team: Suresh Marru, Raminder Singh, Marlon Pierce Pervasive Technology Institute Indiana University Gateway Personal:
GEM Portal and SERVOGrid for Earthquake Science PTLIU Laboratory for Community Grids Geoffrey Fox, Marlon Pierce Computer Science, Informatics, Physics.
Apache Airavata (Incubating) Gateway to Grids & Clouds Suresh Marru Nov 10 th 2011.
Resource Brokering in the PROGRESS Project Juliusz Pukacki Grid Resource Management Workshop, October 2003.
Wrapping Scientific Applications As Web Services Using The Opal Toolkit Wrapping Scientific Applications As Web Services Using The Opal Toolkit Sriram.
Large Scale Nuclear Physics Calculations in a Workflow Environment and Data Provenance Capturing Fang Liu and Masha Sosonkina Scalable Computing Lab, USDOE.
 Apache Airavata Architecture Overview Shameera Rathnayaka Graduate Assistant Science Gateways Group Indiana University 07/27/2015.
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
GRID Overview Internet2 Member Meeting Spring 2003 Sandra Redman Information Technology and Systems Center and Information Technology Research Center National.
Ruth Pordes November 2004TeraGrid GIG Site Review1 TeraGrid and Open Science Grid Ruth Pordes, Fermilab representing the Open Science.
NEES Cyberinfrastructure Center at the San Diego Supercomputer Center, UCSD George E. Brown, Jr. Network for Earthquake Engineering Simulation NEES TeraGrid.
6 February 2009 ©2009 Cesare Pautasso | 1 JOpera and XtremWeb-CH in the Virtual EZ-Grid Cesare Pautasso Faculty of Informatics University.
OGCE Components for Enhancing UltraScan Job Management. Suresh Marru,Raminder Singh, Marlon Pierce.
INFSO-RI Enabling Grids for E-sciencE Running ECCE on EGEE clusters Olav Vahtras KTH.
Building Grid Portals with OGCE: Big Red Portal and GTLAB Mehmet A. Nacar, Jong Youl Choi, Marlon Pierce, Geoffrey Fox Community Grids Lab Indiana University.
Development of e-Science Application Portal on GAP WeiLong Ueng Academia Sinica Grid Computing
A Technical Overview Bill Branan DuraCloud Technical Lead.
GridChem Architecture Overview Rion Dooley. Presentation Outline Computational Chemistry Grid (CCG) Current Architectural Overview CCG Future Architectural.
GridChem Sciene Gateway and Challenges in Distributed Services Sudhakar Pamidighantam NCSA, University of Illinois at Urbaba- Champaign
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI How to integrate portals with the EGI monitoring system Dusan Vudragovic.
BioVLAB-Microarray: Microarray Data Analysis in Virtual Environment Youngik Yang, Jong Youl Choi, Kwangmin Choi, Marlon Pierce, Dennis Gannon, and Sun.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Computational chemistry with ECCE on EGEE.
IU OREChem Summary Slides Marlon Pierce, Geoffrey Fox, Sashikiran Challa.
The Gateway Computational Web Portal Marlon Pierce Indiana University March 15, 2002.
OGCE Workflow and LEAD Overview Suresh Marru, Marlon Pierce September 2009.
Holding slide prior to starting show. Lessons Learned from the GECEM Portal David Walker Cardiff University
A Desktop Client for HPC Chemistry Applications: GridChem Kent Milfeld Supported by the NSF NMI Program under Award #
EMI is partially funded by the European Commission under Grant Agreement RI Open Source Software and the ScienceSoft Initiative Alberto DI MEGLIO,
Lightweight OGCE Gadget Portal for Science Gateways Zhenhua Guo, Marlon Pierce Community Grids Laboratory, Pervasive Technology Institute, Indiana University,
InSilicoLab – Grid Environment for Supporting Numerical Experiments in Chemistry Joanna Kocot, Daniel Harężlak, Klemens Noga, Mariusz Sterzel, Tomasz Szepieniec.
Shaowen Wang 1, 2, Yan Liu 1, 2, Nancy Wilkins-Diehr 3, Stuart Martin 4,5 1. CyberInfrastructure and Geospatial Information Laboratory (CIGI) Department.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI EGI Overview for ENVRI Gergely Sipos, Malgorzata Krakowian EGI.eu
Enabling Grids for E-sciencE University of Perugia Computational Chemistry status report EGAAP Meeting – 21 rst April 2005 Athens, Greece.
Open Grid Computing Environment Summary
OGCE Short Summary Marlon Pierce Community Grids Lab
Enable computational and experimental  scientists to do “more” computational chemistry by providing capability  computing resources and services at their.
Shaowen Wang1, 2, Yan Liu1, 2, Nancy Wilkins-Diehr3, Stuart Martin4,5
WS-PGRADE for Molecular Sciences and XSEDE
Marlon Pierce Indiana University February 14, 2012
Open Grid Computing Environments
OGCE Portal Applications for Grid Computing
Presentation transcript:

Developing Cyberinfrastructure to Support Computational Chemistry Workflows Marlon Pierce (IU), Suresh Marru (IU), Sudhakar Pamidighantam (NCSA) Sashikiran Challa (IU), Ye Fan (NCSA/IU), Patanachai Tangchaisin (IU)

Part 1: Reusable Middleware for OREChem Services and workflows for OREChem

Microsoft Research’s ORECHEM Project “A collaboration between chemistry scholars and information scientists to develop and deploy the infrastructure, services, and applications to enable new models for research and dissemination of scholarly materials in the chemistry community.” 3

PSU NMR Spectra and Structural Data Experiment data Bibliographic metadata Citations Figures Tables Chunks Reactions Molecular Compounds Cambridge Indiana Workflows, TeraGrid services Triplestore On Azure Cloud Triplestore On Azure Cloud Southampton A not particularly accurate summary of OREChem 4

IU’s Objective To build a pipeline to: Fetch ORE ATOM feeds Transform ATOM feeds into triples and store them into a triple store ( Using GRDDL/Saxon HE) Extract crystallographically obtained 3D coordinates information Submit compute intensive electronic structure calculations, geometry optimization tasks to tools like Gaussian09 on TeraGrid supercomputing resources. Transform the Gaussian output into triples and store them into a triple store 5

Extract Moiety feeds in CML format Convert CML to Gaussian Input format Gaussian on TeraGrid Gaussian Output to RDF triples Triplestore ATOM Feeds from eCrystals or CrystalEye OREChem-Computation Workflow N3 files or RDF/XML 6 Moiety files

ORECHEM REST Services Web serviceDescriptionInputOutput InChIExtractorExtracts InChIs by parsing the ATOM Feed entries ATOM feed URLString of InChI’s InChIto3DGenerates 3D coordinates of an InChI. (Open Babel) InChI string3D coordniates in CML format CML2GaussGenerates Gaussian input file. (Jumbo Converters) 3D coordinates (CML) Gaussian input file URL ATOM2RDFATOM to RDF/XML SAXON-XSLT (or GRDDL transformation) ATOM feed URLRDF/XML triples file URL RDFIntoVirtuosoPut the triples into Triple Store. (Jack-rabbit WEBDAV Client) POST RDF/XML triples file URL GRAPH IRI for SPARQL queries 7

ORECHEM REST Services Web serviceDescriptionInputOutput FeedsHarvest er Fetch the moiety feeds from Crystal Eye. (crystal-eye harvester) harvester name, number of feeds to be fetched URLs of the cml.xml files CML2Gaussia nSemCompCh em Generate Gaussian Input file. (Semantic Comp Chem) POST cml.xml file URL URL of the Gaussian Input file oiety&numofentries=5 nputgenerator 8

9 OREChem Workflow in XBaya

Part 2: Computational Chemistry Middleware Reusing software from the Open Gateway Computing Environments (OGCE) Project

What Is a Science Gateway? User Interface and supporting Web services to scientific applications, data sets, and resources running on cyberinfrastructure. – Science portals, Grid Computing Environments, … – Broaden and simplify usage Cyberinfrastructure: Distributed computing resources and overlaying middleware for scientific computing. – Prominent examples include TeraGrid, Open Science Grid – Middleware includes Globus, Condor, iRods/SRB, … – Some of these approaches being pushed by scientific cloud computing – That is another topic

TeraGrid is one of the largest investments in shared CI from NSF’s Office of Cyberinfrastructure Soon to become TeraGrid/XD

Computational Chemistry Grid Has a long history (S. Pamidighantam) – Started in 1998 as Quantum Chemistry Workbench – Evolved into ChemViz in NCSA Expedition Era – A pioneer of the TeraGrid Science Gateway and Community Account concepts – Manages software installations and licensing as well as middleware Currently in two incarnations – GridChem - Science Gateway for Molecular Sciences Production gateway – ParamChem – Automatic Parameterization of Molecular Mechanics Infrastructure research built on GridChem

GridChem Science Gateway Supported Applications – Gaussian, CHARMM, NWChem, GAMESS, Molpro, QMCPack, MD Amber, ACES, NAMD, Wien2K, Gromacs, Castep Usage Statistics (December 2010) – 431 Distinct Users – 37,500 Computational jobs’ metadata in DB – Over 2,000,000 Service Units consumed – Tracked over 50 peer reviewed publications – Reportable metrics are an important issue Supported Applications – Gaussian, CHARMM, NWChem, GAMESS, Molpro, QMCPack, MD Amber, ACES, NAMD, Wien2K, Gromacs, Castep Usage Statistics (December 2010) – 431 Distinct Users – 37,500 Computational jobs’ metadata in DB – Over 2,000,000 Service Units consumed – Tracked over 50 peer reviewed publications – Reportable metrics are an important issue

Simplified GridChem Architecture OGCE/GridChem Middleware GridChem Client Gaussian, GAMES & Other Molecular Editors & Input Generators Output Analysis & Visualization Gaussian, CHARMM, NWChem, GAMESS, NAMD, Amber … Job Managers & Data Movement Interfaces Job Managers & Data Movement Interfaces Configure Inputs Submit & Monitor Jobs Download Output Monitor Resources Manage Jobs

Sample GridChem Post Processing

Collaborations with Open Gateway Computing Environments The OGCE has several general purpose tools that are being phased into GridChem’s production middleware. XBaya: Graphical composition and execution of sequence of tasks. Workflow Interpreter Service and GFAC – Supports long running executions and asynchronous invocations. – Stop, rewind, and replay executions. – Support parametric sweeps of workflows. – Integrate human interactions into workflow executions. The OGCE has several general purpose tools that are being phased into GridChem’s production middleware. XBaya: Graphical composition and execution of sequence of tasks. Workflow Interpreter Service and GFAC – Supports long running executions and asynchronous invocations. – Stop, rewind, and replay executions. – Support parametric sweeps of workflows. – Integrate human interactions into workflow executions.

OGCE Workflow & Job Management Java CoG Abstraction Java CoG Abstraction DRMAA & SSH Utilities GridChem Client TeraGrid/X D Globus Campus Resources Condor, SSH, (SLURM) OGCE-Generalized GridChem Infrastructure Cloud API’s Amazon, Eucalyptus EC2 Interface Other Grid Middleware European Grids Unicore, Open Nebula (Requirements Driven) Molecular Editors & Input Generators Output Analysis and Visualization

ParamChem Overview Collaboration between University of Maryland, NCSA, University of Kentucky, University of Florida Goal: automate the process of parameterization for classical molecular mechanics and semi-empirical methods – These are realized as parameter sweeps of workflows. – Results disseminated through GridChem data management tools – Coupled execution of Quantum Chemistry and Molecular Mechanics. OGCE partners with ParamChem through the NSF SDCI program to provide workflow and job management middleware. Dynamics applications with optimization algorithms are being constructed as workflow chains. Workflow chains are submitted as part of parametric sweeps – In progress

Empirical ForceFields Parameterization Need Process Vanommeslaeghe et al. J. Comp.Chem 2010, 31, Lack of Accurate Force Fields Produce Erroneous Property Estimation

ParamChem Workflows Initial Structur e Optimized Structure

ParamChem Workflow

Part 3: Developing Sustainable Science Gateway Software The Open Gateway Computing Environments Project and Apache Software Foundation

OGCE Software NameDescription OGCE Gadget Container An OpenSocial and Google gadget-compatible Web container for running Web gadgets. GFACA Web service for generating, securely invoking, and managing the lifecycle of scientific applications on Grids and Clouds Workflow ToolsComposer (XBaya), interpreter (enactment) engine, event system, and service registry to support scientific workflows on Grids and Clouds. Gadgets and Gadget Building Tools Tools for building secure Google-gadget based Science Gateways. We try very hard to keep software scope under control. We don’t build data management systems, for example. We collaborate with groups who do.

OGCE Funds Software Lifecycle Obvious but new of NSF as it becomes more interested in sustaining its research investments.

Apache Incubators Joining Apache is our software sustainability strategy – Open source licensing, meritocracy, visibility Apache’s community development model is our experiment – More important than simply being open source. Need to go beyond SourceForge – Distributed control, distributed credit. Airavata: tools for science gateway services and workflows – XBaya, GFAC, Messenger, XRegistry – Collaboration with WS02/LSF, IBM – Builds on Apache Axis2, Apache ODE, (Apache Hadoop) Rave: OpenSocial gadget manager, general purpose gadgets – Collaboration with Hippo, Mitre, SURFnet – Builds on Apache Shindig Joining Apache is our software sustainability strategy – Open source licensing, meritocracy, visibility Apache’s community development model is our experiment – More important than simply being open source. Need to go beyond SourceForge – Distributed control, distributed credit. Airavata: tools for science gateway services and workflows – XBaya, GFAC, Messenger, XRegistry – Collaboration with WS02/LSF, IBM – Builds on Apache Axis2, Apache ODE, (Apache Hadoop) Rave: OpenSocial gadget manager, general purpose gadgets – Collaboration with Hippo, Mitre, SURFnet – Builds on Apache Shindig

More Information OGCE Web Site: News Feed/Blog: Contact us: – – Software Downloads: Software is available via SVN from our SourceForge project. – – See ogce.org/ogce/index.php/Portal_download