R.J. Allan
Portals and User Interfaces for Data Management and Grid Computing
Rob Allan, Leader of the Grid Technology Group
9th November 2007
Making the Grid easier to use…
1. Why are we doing it?
2. What solutions are we investigating?
3. What has research given us?
4. How is it being used?
Institutions need Autonomy and Security
Host–client relationship
Example solution suggested by the Web server–browser model
Communication must be initiated by the client because of the firewall around the client’s institution.
Can use a proxy or gateway.
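A minimal sketch of the client-initiated pattern described above, in Python. It assumes a hypothetical in-memory `Gateway` class standing in for a proxy outside the client's firewall; the class and method names are illustrative and do not come from any of the tools named in this talk. The point is that every call originates from the client side, so results are fetched by polling rather than pushed.

```python
import time


class Gateway:
    """Hypothetical gateway; in practice this would sit behind a Web server."""

    def __init__(self):
        self._results = {}  # job id -> result, or None while still pending
        self._next_id = 0

    def submit(self, payload):
        """Called by the client: register work and return a job id at once."""
        self._next_id += 1
        self._results[self._next_id] = None
        return self._next_id

    def complete(self, job_id, result):
        """Called by the back end (not the client) when the work finishes."""
        self._results[job_id] = result

    def poll(self, job_id):
        """Called by the client: the gateway never opens a connection to it."""
        return self._results[job_id]


def wait_for_result(gateway, job_id, max_polls=10, interval=0.0):
    """Poll until a result appears or the poll budget is used up."""
    for _ in range(max_polls):
        result = gateway.poll(job_id)
        if result is not None:
            return result
        time.sleep(interval)
    return None
```

A client would call `submit` once and then `wait_for_result`; no inbound port needs to be opened at the client's institution.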
The Grid “Client Problem”
Many clients want to access a few Grid-enabled resources
Consumer clients: PC, TV, video, AG
Workplace: desktop clients
Portable clients: phones, laptop, PDA, data entry…
Middleware, e.g. Globus
Grid Core
Grid Infrastructure Deployment (adapted from Foster and Kesselman)
How to Deliver e-Science Services?
Provide heavyweight functionality (Globus, Condor, SRB?), but only on Grid-enabled hosts;
Implied need for client-server software architecture, e.g.:
– Web-based portal with familiar browser
– Client programming library, API in C, C++, Java, Perl, Python, R etc.
– Ability to link to existing applications/GUIs
– Command-based shell interface
– Drag and Drop interface
Need a published set of services on Grid hosts – OGSA model needs registry and semantics;
Need easy development and deployment framework for applications and client tools.
International workshops:
– Portals and Portlets, NeSC, 14-17th July 2003
– Portals and Portlets, NeSC, 17-19th July 2006
Lightweight Grid Computing
Concept:
Aim was to develop “lightweight” client interfaces to the Grid using Web services via an intermediate server.
Lessons were learned from portal development, which has a similar architecture.
The server is “trusted” and can have ports open, enabling Globus and other Grid middleware to be used at the back end.
Research and outcomes:
GROWL – developed as a JISC-funded Virtual Research Environment prototype
MCS/RMCS – developed in the NERC-funded e-Minerals project: Environment from the Molecular Level
AgentX – developed with e-Science core funding to support data interoperability for Collaborative Computational Project (CCP) applications
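The intermediate-server idea can be sketched as follows. This is an illustration only, not the GROWL or RMCS interface: the message format (JSON rather than a real Web service protocol), the class names and the `fake_middleware_submit` function are all assumptions made for the example. The client carries no Grid middleware; the trusted server translates each message into a back-end call.

```python
import json


def fake_middleware_submit(executable, arguments):
    """Stand-in for a back-end middleware call (hypothetical, not a Globus API)."""
    return {"state": "QUEUED", "command": [executable, *arguments]}


class IntermediateServer:
    """Trusted server: the only component that talks to the middleware."""

    def __init__(self, backend):
        self._backend = backend
        self._jobs = {}

    def handle(self, message):
        """Decode one request message, dispatch it, and encode the reply."""
        request = json.loads(message)
        if request.get("op") == "submit":
            job_id = f"job-{len(self._jobs) + 1}"
            self._jobs[job_id] = self._backend(
                request["executable"], request["arguments"]
            )
            reply = {"ok": True, "job_id": job_id}
        elif request.get("op") == "status":
            job = self._jobs.get(request["job_id"])
            reply = {"ok": job is not None, "state": job["state"] if job else None}
        else:
            reply = {"ok": False, "error": "unknown operation"}
        return json.dumps(reply)


class LightweightClient:
    """Thin client: needs only a way to send a string and receive a string."""

    def __init__(self, send):
        self._send = send  # in practice an HTTP POST to the server

    def _call(self, request):
        return json.loads(self._send(json.dumps(request)))

    def submit(self, executable, *arguments):
        reply = self._call(
            {"op": "submit", "executable": executable, "arguments": list(arguments)}
        )
        return reply["job_id"]

    def status(self, job_id):
        return self._call({"op": "status", "job_id": job_id})["state"]
```

For a local trial the client can be wired straight to the server with `LightweightClient(IntermediateServer(fake_middleware_submit).handle)`; replacing `send` with a network call leaves the client code unchanged.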
Architecture
Users of Lightweight Interfaces (adapted from Foster and Kesselman)
Opportunities
GROWL, RMCS and AgentX now constitute a computational framework which is being deployed for application users in the CCPs as a collaboration with CSED; for NCeSS, the National Centre for e-Social Science, via the ESRC e-Infrastructure project; and for large-scale facilities such as the Diamond Light Source.
We organised a workshop in May 2005 to compare solutions for lightweight Grid computing: in addition to RMCS, GROWL and AgentX, there are AHE, GridSAM, GridSite, WSRF::Lite, …
Also considered SAGA from GGF as a standard interface
There could be a significant chance for the UK to establish a lightweight Grid client toolkit combining the best aspects of these solutions
A joint bid to OMII was unsuccessful
Portals
Concept:
Aim was to develop Web-based interfaces using technology accessible to standard browsers.
Embrace all aspects of this, including Web 2.0.
Best practice from existing tools for on-line learning, information management, collaboration and research.
Use current Java language standards for interoperability: JSR-168, JSR-170, JSR-286, WSRP, AJAX, …
Research and outcomes:
DataPortal – example of pilot project for data management
HPCPortal/InfoPortal – examples of pilot projects for Grid computing
NGS Portal – evolved from HPCPortal and InfoPortal into a generic interface for the National Grid Service
e-HTPX – management of protein crystallography pipeline, data reduction and structure analysis
Sakai – development of a framework and tools as a Virtual Research Environment for different communities
JSR-168 is the key Java portlet technology. It was announced shortly after our first international conference in 2003.
Users of Portals (adapted from Foster and Kesselman)
NGS Portal
NGS Portal is a generic portal for HPC applications on Grid clusters
Embraces new ideas and standards, e.g. JSDL
Goal: to build communities, sharing job descriptions and best practice
Portlets tried and tested - can be used by new e-Science projects (available on CD?)
Opportunities to collaborate to enhance the suite of portlets and develop a repository, e.g. for workflow
Additional funding from OMII
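To make the idea of a shareable job description concrete, here is a hedged Python sketch that builds a small JSDL document with the standard library. The `build_job` helper and the example values are hypothetical, and the output has not been validated against the JSDL schema; it shows the general shape of a job description, not what the NGS Portal itself generates.

```python
import xml.etree.ElementTree as ET

# Namespaces from the JSDL 1.0 specification and its POSIX application extension.
JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_job(name, executable, arguments, output):
    """Return a minimal JSDL job description as an XML string."""
    ET.register_namespace("jsdl", JSDL)
    ET.register_namespace("jsdl-posix", POSIX)

    definition = ET.Element(f"{{{JSDL}}}JobDefinition")
    description = ET.SubElement(definition, f"{{{JSDL}}}JobDescription")

    # Human-readable identification, useful when descriptions are shared.
    identification = ET.SubElement(description, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(identification, f"{{{JSDL}}}JobName").text = name

    # What to run: executable, its arguments, and where stdout should go.
    application = ET.SubElement(description, f"{{{JSDL}}}Application")
    posix = ET.SubElement(application, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable
    for argument in arguments:
        ET.SubElement(posix, f"{{{POSIX}}}Argument").text = argument
    ET.SubElement(posix, f"{{{POSIX}}}Output").text = output

    return ET.tostring(definition, encoding="unicode")
```

Because the description is plain XML, it can be stored, exchanged between community members and resubmitted without reference to the portal that produced it.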
Virtual Research Environments
Sakai: Open Source, Java, Collaborative Learning Environment. Enterprise-level - can support 10,000s of users
1. We were funded by JISC to carry out the Sakai Evaluation Exercise (2005)
2. I developed much of the background material for the JISC VRE-1 Programme – realisation that VLE and VRE are similar
3. We were funded by JISC to develop a Virtual Research Environment based on Sakai (2005-7)
4. Groups at Cambridge and Hull/East Anglia were funded to apply Sakai to targeted research areas (2005-7)
5. We were funded by JISC to carry out a review of how the Information Environment could meet the needs of researchers using portals (2006)
6. We are currently further developing the CREE and SPP portlets for access and cross-search of open archival institutional repositories (2007)
Some Questions about VRE Development
How can existing tools be re-purposed to rapidly meet the needs of e-Research? – collaboration, information management, training
How can we make use of the best Java technology available? – JSR-168, Spring, Hibernate, AJAX, etc.
How can we include legacy code, e.g. Perl, C++? – WSRP, bridges
Can we federate services underpinning a portal, e.g. using remote portlets? – WSRP
Can we develop portlets using service registries? – UDDI, WSDL
Can portlets truly be made re-usable in different frameworks?
Can we support a repository of such portlets?
Can we include commercial portlets and frameworks (IBM, Sun, BEA, Oracle)?
How can we include Workflow, Semantics, Web 2.0?
Emerging Architecture
Some Questions about VRE Usage
Deployment and evaluation of such a VRE tests and extends our understanding of practical ICT-based support for research in the following areas:
How can portal frameworks be configured to best suit the expectations and work practices of different research user communities and institutional or organisational contexts?
Can tools from multiple institutions and organisations be brought together coherently to enable sharing of information, processes and collaboration?
Can community-specific tools be integrated meaningfully alongside generic and remotely-hosted Web tools?
Can a portal-based approach provide the flexibility to enable effective use by both researchers and administrators?
At what points are desktop tools, or those provided by a mobile platform, more effective? How might these be best integrated within a meaningful user experience?
How can we engage HCI and CSCW specialists?
Science Gateways
Sakai is now the preferred delivery framework for portals to support growing communities of users who want to manage data and information, access Grid resources, and collaborate on-line using Web 2.0 style tools. The generic portal and VRE tools are being customised to their needs. We refer to these as “science gateways”.
ReDRESS: Resource Discovery for e-Social Science
NCeSS: National Centre for e-Social Science and the ESRC e-Infrastructure users
University of Cambridge – teaching and research
University of Manchester – research services
CCP9 and Psi-k: large EU network of excellence (1,900 users)
NW-GRID: project management and users
Diamond Light Source is investigating Sakai-based tools
Opportunities
Hosting Science Gateways for a growing number of communities
– may require special-purpose servers to enhance the Grid infrastructure, e.g. for data pre-processing, visualisation or statistical analysis
– engage the community in contributing to a research portlet repository, e.g. via NGS WP6
– provide CD version of pre-configured portals for new projects; they can then develop and add their own tools
– engage further with international community, Sakai, GridSphere, uPortal, etc.
Growth of Campus Grids requires easy access for non-experts
Summary
Demonstrated need to make distributed resources accessible to end users engaged in research activities
Address needs to integrate and manage access to underlying services
Delivery to a variety of users with different needs
Common architecture emerging
Addressing issues of rapid deployment and re-usability of software
Engaging with activities of NeSC, NIEeS and NCeSS for outreach and training
Beginning to address usability issues
An ongoing programme of targeted research to address the above has produced outcomes – papers, best practice and software – all available from STFC e-Science Centre.