Science Gateways Marlon Pierce Science Gateway Group Indiana University
What I Want to Accomplish Invite you to participate in the Science Gateway Institute – NSF S2I2 planning grant Invite you to participate in Apache Airavata – Open community software for building scientific workflows for gateways
What Are Science Gateways? Web-based user interfaces and services that provide a science-centric view of cyberinfrastructure. Often centered around running science applications and workflows on grids and clouds. But can also be data-centric, information-centric – Earth System Grid – eBird and citizen science portals Or community-centric – Such as many HUBzero hubs.
Gateways Support Science
Gateway | Domain | Metrics
NanoHUB | Nanotechnology | Has supported nanotechnology simulations and data sharing among more than 250,000 users since 2000 and has been cited in more than 900 publications.
CIPRES | Bioinformatics | Made it possible for more than 6,000 biologists to run phylogenetic analyses on XSEDE computing resources over the past 3 years and has enabled more than 475 publications over that period.
UltraScan | Biophysics | Supported the data analysis needs of over 120 active scientists and has contributed to over 60 publications during the last 3 years.
GridChem | Chemistry | Provided access to computational chemistry tools for more than 800 users, enabling 47 publications between 2007 and
Science Gateways Institute
Software is critical to today’s scientific advances Science is all about connections – Instruments, sensor networks, HPC facilities, campus laboratories, visualization facilities, data stores – Connections are often made through software A critical, but often overlooked component NSF vision for cyberinfrastructure in the 21st century
Software Infrastructure for Sustained Innovation (SI2) program Software vision implemented in 2010 Scientific Software Elements (SSE) – Small groups create software that advances one or more areas Scientific Software Integration (SSI) – Larger interdisciplinary teams, software frameworks Scientific Software Innovation Institutes (S2I2)
Institutes: Long term hubs of excellence Serve a research community of substantial size and disciplinary breadth Expertise, processes, architectures, resources and implementation mechanisms to transform research practices and productivity Support, outreach, workforce development, proactive approach to diversity Pathways to community involvement
The Science Gateway Institute Partners Project Leadership – Nancy Wilkins-Diehr, SDSC Community Workshop Organization – Katherine Lawrence, University of Michigan Workforce Development – Linda Hayden, ECSU Gateway Providers – iPlant: Dan Stanzione, Rion Dooley – HUBzero: Michael McLennan, Michael Zentner – Apache Airavata: Marlon Pierce, Suresh Marru
Millions of dollars are spent on gateways, but developers face several challenges: They often work in isolation even though development can be quite similar across domain areas. They need to bridge cyberinfrastructure—locally, campus-wide, nationally, and sometimes internationally. They need foundational building blocks so they can focus on higher-level, grand-challenge functionality. They struggle to secure sustainable funding because gateways span the worlds of research and infrastructure.
Incubator Service Assist with the entire lifecycle of a gateway: Business plan development and review Development environment, consulting, documentation and software recommendations Software repositories Software engineering facilities Software assessment services – like Open Source Software Advisory Service, Apache assessment service, Software Sustainability Institute (UK) Build-and-test facilities Hosting service Offering gateways expertise in the following areas: – Usability assessment – Licensing – Sustainability – Project management – Security
Apache Airavata: Software for Scientific Workflows
What Is Apache Airavata? Science Gateway software system to: – Compose, manage, execute, and monitor distributed, computational workflows – Wrap legacy command line scientific applications with Web services – Run jobs on computational resources ranging from local resources to computational grids and clouds
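The ideas on this slide (wrapping a command-line application, composing steps into a workflow, executing and monitoring them) can be illustrated with a minimal Python sketch. This is not Apache Airavata's API: `wrap_command`, `run_workflow`, and the step names are hypothetical, the "legacy applications" are stand-in one-line Python scripts, and everything runs locally rather than on a grid or cloud.

```python
import subprocess
import sys
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def wrap_command(base_argv):
    """Wrap a command-line program as a Python callable (hypothetical helper)."""
    def run(*inputs):
        completed = subprocess.run(
            [*base_argv, *map(str, inputs)],
            capture_output=True, text=True, check=True,
        )
        return completed.stdout.strip()
    return run


# Stand-ins for legacy command-line scientific applications.
square = wrap_command([sys.executable, "-c",
                       "import sys; print(int(sys.argv[1]) ** 2)"])
add = wrap_command([sys.executable, "-c",
                    "import sys; print(sum(int(a) for a in sys.argv[1:]))"])


def run_workflow(steps, dependencies, inputs):
    """Run steps in dependency order, feeding upstream outputs downstream.

    steps:        step name -> wrapped callable
    dependencies: step name -> list of upstream step names
    inputs:       step name -> list of literal arguments
    """
    results, status = {}, {}
    for name in TopologicalSorter(dependencies).static_order():
        upstream = [results[dep] for dep in dependencies.get(name, [])]
        results[name] = steps[name](*inputs.get(name, []), *upstream)
        status[name] = "COMPLETED"  # simple stand-in for monitoring
    return results, status


steps = {"a": square, "b": square, "total": add}
dependencies = {"a": [], "b": [], "total": ["a", "b"]}
inputs = {"a": [3], "b": [4]}

results, status = run_workflow(steps, dependencies, inputs)
print(results["total"], status)
```

A real gateway would replace the local `subprocess` call with job submission to remote resources and would persist workflow state; the sketch only shows how wrapped applications compose into a dependency graph.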
Apache Airavata Architecture [Figure. Components: Workflow Interpreter, Application Factory, Message Box, Registry, Apache Airavata API. Other labels: End Users, Gateway Developer, Scientific Application, Core Developer, Computational Resources.]
Apache Airavata in Action
Domain | Description
Astronomy | Image processing pipeline for One Degree Imager instrument on XSEDE
Astrophysics | Supporting workflow of Dark Energy Survey simulations working group on XSEDE
Bioinformatics | Supported workflow executions on Amazon EC2 for BioVLAB project
Biophysics | Manage large scale data analysis of analytical ultracentrifugation experiments on XSEDE and campus resources
Computational Chemistry | Manage workflows to support computational chemistry parameter studies for ParamChem.org on XSEDE
Nuclear Physics | Workflows for nuclear structure calculations using Leadership Class Configuration Interaction (LCCI) computations on DOE resources
Cyberinfrastructure: How Open is Open Source Software? What’s missing? Open source licensing Open standards Open codes (GitHub, SourceForge, Google Code, etc.) We also need open governance
Open Community Software and Governance Open source projects need diversity, governance. – Reproducibility – Sustainability Incentives for projects to diversify their developer base. Govern: – Software releases – Contributions – Credit sharing – Members are added – Project direction decisions – IP, legal issues Our approach: Apache Software Foundation Collaborate Compete
More Information Science Gateway Institute: – Nancy Wilkins-Diehr, PI Contact me: Apache Airavata: You can contribute to Apache Airavata! Join the mailing list: YouTube presentation on Apache and NSF Cyberinfrastructure:
science gateway /sī′ əns gāt′ wā′/ n. 1. an online community space for science and engineering research and education. 2. a Web-based resource for accessing data, software, computing services, and equipment specific to the needs of a science or engineering discipline. Are you building gateways that serve your science discipline? Do you wish you could connect with and learn from others who are doing the same thing? We are building an institute to serve you, and others like you, with resources, services, experts, and ideas for creating and sustaining science gateways. Sign up to join the conversation:
NSF CI Advisory Committee commissions 6 task forces Software task force recommends to NSF:
1. Multi-level, long term support (individual, team, institute)
2. Responsibility for verification, validation, reproducibility
3. Consistent policy on open source
4. Collaborations across divisions, agencies and industry
5. Use of ACCI to obtain community input on priorities
Figure 1. High-level architecture of software offerings and value-added services provided by the institute.
Science Gateways: Enabling & Democratizing Scientific Research [Figure labels: Knowledge and Expertise; Computational Resources; Scientific Instruments; Algorithms and Models; Archived Data and Metadata; Advanced Science Tools]
Science Gateway Institute conceptualization award in 2012 Simultaneous NSF study identifies limitations to short-lived science portals or gateways Characteristics of short funding cycles – Build exciting prototypes with input from scientists – Work with early adopters to extend capabilities – Tools are publicized, more scientists interested – Funding ends – Scientists who invested their time to use new tools are disillusioned Less likely to try something new again – Start again on new short-term project Need to break this cycle and fund for long-term success
Gateway-Building Support Institute staff assigned to a project for months, up to a year – Assist with gateway development or implementation of advanced features Workflows, fault tolerance, sensor feeds, HPC simulations – Teach research teams what it takes to build, enhance, operate, and maintain gateways after support ends – Peer-reviewed request process open to all
Gateway Forum Gathering place for scientific web developers across NSF directorates, agencies, and international boundaries Social forums, white papers, blogs, testimonials and user stories Annual conference Broad and engaging symposium series Gateway training program – Synchronous and asynchronous, video tutorials – Best practices, case studies Showcase of successful projects Environment that enables continuous community feedback
Gateway Framework Modular, layered approach – Supports community contributions – Grocery store approach allows developers to pick and choose the components they need Tiered architecture
1. Value-added services: Publication channel for delivering content to a wider audience; Information repositories for good design practices; Information/code samples for best practices in user-interface and user-experience design
2. Core web framework which includes hosted site creation and content management
3. Platform API to provide a cohesive set of RESTful web services upon which the previous two layers rely
4. Systems layer where the hardware and low-level middleware reside: Clouds and cloud services, HPC systems, grid middleware, data warehouses, databases, instrumentation, and distributed data stores
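The tiered architecture above can be sketched in a few lines of Python to show the dependency direction: the two upper layers talk only to the platform API, which alone touches the systems layer. All class and method names here are hypothetical illustrations, not part of any institute software, and plain method calls stand in for the RESTful web services named on the slide.

```python
class SystemsLayer:
    """Layer 4: stand-in for HPC systems, clouds, and data stores."""

    def __init__(self):
        self._jobs = {}

    def submit(self, command):
        job_id = len(self._jobs) + 1
        self._jobs[job_id] = {"command": command, "state": "QUEUED"}
        return job_id

    def state(self, job_id):
        return self._jobs[job_id]["state"]


class PlatformAPI:
    """Layer 3: the single interface that the upper two layers rely on."""

    def __init__(self, systems):
        self._systems = systems

    def create_job(self, command):
        job_id = self._systems.submit(command)
        return self.get_job(job_id)

    def get_job(self, job_id):
        return {"id": job_id, "state": self._systems.state(job_id)}


class CoreWebFramework:
    """Layer 2: hosted site content; launches work only through the API."""

    def __init__(self, api):
        self._api = api
        self.pages = {}

    def add_page(self, path, body):
        self.pages[path] = body

    def launch(self, command):
        return self._api.create_job(command)


class ValueAddedServices:
    """Layer 1: e.g. a publication channel built on API data."""

    def __init__(self, api):
        self._api = api

    def job_summary(self, job_id):
        job = self._api.get_job(job_id)
        return f"Job {job['id']} is {job['state']}"


api = PlatformAPI(SystemsLayer())
site = CoreWebFramework(api)
site.add_page("/simulate", "Run a simulation")
job = site.launch("simulate --steps 100")
summary = ValueAddedServices(api).job_summary(job["id"])
print(summary)
```

Because neither upper layer holds a reference to `SystemsLayer`, a developer could pick only the components needed, in the spirit of the "grocery store approach", and swap the systems layer without touching the layers above the API.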
Workforce Development Terrific opportunities for students and IT professionals – Much science gateway development currently done by campus IT Gateway building training – Web development is a natural interest area for students Very visual, see results of programming instantly – Builds cross-disciplinary communication skills Talk to scientists, construct a gateway that meets their needs – Utilize existing programming opportunities such as Google Summer of Code Opportunities to proactively address diversity
Community engagement activities in conceptualization grant One-on-one interviews with community leaders Group-based data collection – Focus groups, BOFs, workshops – Broad online surveys Social-feedback services – Get Satisfaction, UserVoice, HUBzero Continued events in the full institute to stay in touch with the community – Annual conference – Rolling 5-minute polls