The Collaboratory: computing environments and infrastructure for structural biology research Timothy M. McPhillips Stanford Synchrotron Radiation Laboratory.


What is the Collaboratory?
Technically: an R&D program funded by NIH
- NIH’s definition of a Collaboratory: “A laboratory without walls.”
- Pilot program to investigate whether collaboration and remote access tools could improve the efficiency of NCRR resources.
- Supplement to the NCRR grant that funds the SMB group.
- Currently funds three full-time employees in the SMB group: Thomas Eriksson, Ken Sharp, and Tim McPhillips.
- Funding has been extended through the end of the NCRR parent grant; the Collaboratory program will be renewed within the context of the parent grant in [...].
In practice: a group-wide effort to create a coherent computational research environment for our users
- Goal is to provide users with a coherent, overarching system for collecting data and solving structures, not just a bunch of tools.
- Software development, systems management, instrument design, hardware development, beam line automation, and maintenance of equipment are all critical to the Collaboratory.
- Everyone in the PX group contributes to the Collaboratory effort.

The core Collaboratory development team

“Something there is that doesn’t love a wall…”
What kind of walls has the Collaboratory removed?
- Walls between beam lines: Users can move between beam lines and find the same computer systems, user accounts and file systems wherever they go.
- Walls of geographical distance: Users can access the beam line, computing resources, and their data from anywhere in the world.
- Walls between collaborators: Local and remote coworkers can see samples, monitor the beam line, view data, and share data collection sessions.
- Walls between detectors and disk storage: High performance network and file server allows users to collect data from large area detectors at maximum speed.
- Walls between data and solved structures: High performance computers enable users to process their data and solve structures in real time.
And coming down this year:
- Walls between traditional and web-based applications.
- Walls between users and support staff.
- Walls between users and archived data.

…but “good fences make good neighbors!”
What kind of fences has the Collaboratory put up?
- Fences between user groups: Each user group’s data is secure from snooping, theft, and tampering by other groups.
- Fences between networks: Computer systems at the beam lines are protected from network disturbances elsewhere at SSRL; instrument control computers are on an isolated network.
- Fences that keep users from damaging equipment remotely: Access control and rights restrictions in Blu-Ice make remote control of beam lines safe.
- Fences between computer systems and crackers: High level of security means users need not worry about data loss or system downtime due to marauders from the Internet.

Implications of the automated sample mounting system
- SSRL cassette design allows hundreds of pre-frozen crystals to be examined without entering the hutch.
- Automatic crystal centering system allows the crystal to be aligned automatically in the beam.
- In 2003, users of the robot on 11-1 entered the hutch only once, to install cassettes in the dispensing dewar.
- In 2004, users will not be allowed to use the robot if they re-enter the hutch after cassettes are loaded under staff supervision.
- Cassettes of crystals can be shipped to the beam line via FedEx.
- Cassettes can be placed in the hutch by staff, allowing users to work remotely.
- Local and remote users will have equal access to the hutch when using the robot (i.e., none).
- In theory, many users of the sample mounting robot need not come on site at all.
- BUT: appropriate computing, network, and software infrastructure is needed to enable remote access to the full experimental capabilities of the beam lines.

Collaboratory tools and sample mounting robots will allow SSRL users to work completely remotely in 2004
Blu-Ice for beam line control
- Can run locally or remotely.
- Multiple copies may run simultaneously.
- Security features prevent unsafe actions.
Beam line video system
- Monitor sample in beam, experimental hardware, and crystals under microscope.
- Video streams may be viewed via Blu-Ice or through a web browser.
Archive System
- Back up data to multi-terabyte robot tape system at SDSC over network.
- Simple web interface for data archival and retrieval.
- No need to use backup tapes.
Remote Unix desktop
- Fully functional Unix desktop environment.
- Blu-Ice and all data processing software may be run remotely.
- Free ICA client from Citrix.

Why a high capacity, long term data archive is needed
Need a replacement for tapes
- Tapes age and medium formats change rapidly.
- Storage capacity and reliability of tapes are limited.
- Much manual book-keeping is needed to keep track of data stored on tapes.
Need to support large-area CCD detectors
- Three Q315 detectors and a MAR 325 will each be generating [...] MB of image data every 5 seconds when the SPEAR3 upgrade is complete.
- RAID data storage at SSRL will be 24 TB in [...]; all that data must be backed up somehow!
- Need to archive data as rapidly as it is collected.
Need to support high-throughput structural biology
- Automated beam lines will generate huge amounts of data.
- Large numbers of samples and targets require that metadata be stored and tracked systematically.
- Data must be archived automatically and be easy to retrieve.

High Performance Storage System and Storage Resource Broker at SDSC
High Performance Storage System (HPSS)
- Long term data storage system at SDSC.
- Currently stores over 344 TB of data in 18 million files.
- Currently provides 0.9 PB of storage.
Storage Resource Broker (SRB)
- Client-server middleware for accessing heterogeneous resources over the network.
- May be used to store and retrieve data on the HPSS at SDSC.
- Powerful metadata querying system allows data sets to be accessed based on their attributes.
- Data sets can be replicated over multiple resources.
The challenge
- Capabilities of HPSS and SRB far exceed the perceived needs of our beam line users.
- Educating users to effectively use these systems for managing their data is a challenge.
- Our users need a customized interface with simplified functionality.
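As a rough illustration of what "accessing data sets based on their attributes" and "replicating over multiple resources" mean, here is a minimal in-memory sketch. It is not the SRB client API: the `Catalog` class, its methods, the resource names, and the sample attribute values are all hypothetical and stand in for whatever the real middleware provides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DataSet:
    """One archived data set with descriptive attributes and replica locations."""
    name: str
    attributes: Dict[str, str]
    replicas: Set[str] = field(default_factory=set)


class Catalog:
    """Hypothetical in-memory metadata catalog (an illustration, not the SRB API)."""

    def __init__(self) -> None:
        self._sets: Dict[str, DataSet] = {}

    def register(self, name: str, resource: str, **attributes: str) -> None:
        # Record a new data set stored on one resource, tagged with attributes.
        self._sets[name] = DataSet(name, dict(attributes), {resource})

    def replicate(self, name: str, resource: str) -> None:
        # Note that a copy of the data set also lives on another resource.
        self._sets[name].replicas.add(resource)

    def query(self, **wanted: str) -> List[str]:
        # Return names of data sets whose attributes match every requested value.
        return sorted(
            ds.name
            for ds in self._sets.values()
            if all(ds.attributes.get(k) == v for k, v in wanted.items())
        )

    def replicas(self, name: str) -> List[str]:
        return sorted(self._sets[name].replicas)


catalog = Catalog()
catalog.register("lysozyme_01", "hpss-sdsc", beam_line="11-1", protein="lysozyme")
catalog.register("insulin_07", "hpss-sdsc", beam_line="9-2", protein="insulin")
catalog.replicate("lysozyme_01", "disk-ssrl")
print(catalog.query(beam_line="11-1"))
```

The point of the sketch is that users find data by what it is (beam line, protein) rather than by where it is stored, which is the capability a simplified interface would need to expose selectively.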

InQ SRB client for Microsoft Windows
SRB client applications
- Users must be able to upload data, download data, and view the data in the archive.
- Users perform these functions via SRB client applications.
InQ for Microsoft Windows
- InQ is the easiest to use client provided by SDSC.
- Individual files or entire folders may be uploaded or downloaded.
- Files in the archive may be browsed either by directory structure or by data attributes.
Limitations of InQ
- Runs only on Microsoft Windows platforms. Windows is not the major platform used at synchrotron light sources or in crystallography research labs.
- No batch job capability for long archive jobs.
- Exposes confusing SRB features and terminology (resources, containers, collections, etc.).

MySRB web browser-based SRB client
MySRB
- MySRB is a powerful web-based SRB client.
- Can be run from standard web browsers.
- Files in the archive may be browsed either by directory structure or by data attributes.
Limitations of MySRB
- No way to upload or download more than one file at a time.
- The otherwise rich functionality and powerful features are confusing to users.
The bottom line:
- Additional infrastructure must be designed and implemented in order to make the SRB a viable storage system for crystallographic data.
- A browser-based user interface is ideal.

The Collaboratory interface for using the SRB archive
Simple archive job definition
- Users may rapidly browse their data sets at SSRL.
- Directory contents are listed in the browser window.
- Directories may be navigated by clicking on directory names.
- Files to be uploaded may be filtered according to a list of wildcards.
- Subdirectories may be archived recursively.
- The only SRB-related information required is the name of the new data collection to create.
Convenient web browser interface
- Users may define archive jobs over the web from anywhere in the world using any common type of computer.
- Users need only log in to the Collaboratory portal with their Unix account name and password.
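The archive job definition described above (a source directory, a list of wildcards, an optional recursive descent, and a collection name) can be sketched in a few lines. This is only an illustration under stated assumptions: the `ArchiveJob` class, the `select_files` helper, and the example collection name and path are hypothetical and are not taken from the Collaboratory code.

```python
import fnmatch
import os
from dataclasses import dataclass
from typing import Iterator, List


def select_files(root: str, patterns: List[str], recursive: bool = True) -> Iterator[str]:
    """Yield paths under root whose file names match any of the wildcard patterns."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a predictable order
        for name in sorted(filenames):
            if any(fnmatch.fnmatchcase(name, p) for p in patterns):
                yield os.path.join(dirpath, name)
        if not recursive:
            dirnames.clear()  # emptying the list stops os.walk from descending


@dataclass
class ArchiveJob:
    """Hypothetical job definition: everything the user supplies in the browser."""
    collection: str        # name of the new data collection to create
    source_dir: str        # directory at the facility to archive
    patterns: List[str]    # wildcard filters such as "*.img"
    recursive: bool = True

    def files(self) -> List[str]:
        return list(select_files(self.source_dir, self.patterns, self.recursive))


if __name__ == "__main__":
    job = ArchiveJob("lysozyme_run", "/data/alice/run1", ["*.img", "*.log"])
    for path in job.files():
        print(path)
```

Keeping the job definition this small mirrors the stated design goal: the collection name is the only archive-specific detail the user has to supply.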

Monitoring archive jobs and downloading data
Batch operation
- Archive job runs in background once definition is confirmed.
- Browser does not hang during archival.
- New jobs may be started while previously defined jobs are in progress.
- A job status page indicates definitions and status of all running jobs.
- A notification is sent to the user when a job is complete.
Similar interface for data download
- Users browse their archived data sets in exactly the same fashion.
- Data may be downloaded from the archive to a directory at SSRL (analogous to an upload job).
- Another option is to download selected files in one or more tar files directly to any computer on the Internet.
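The batch behavior described above (jobs run in the background, new jobs can start while others are in progress, a status page lists them, and the user hears back on completion) can be sketched with a thread pool. This is a minimal sketch, not the actual archive system: `JobQueue`, its method names, and the `notify` callback are hypothetical.

```python
import itertools
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class JobQueue:
    """Hypothetical background job runner with a status view and completion notices."""

    def __init__(self, notify: Callable[[str, int, str], None], max_workers: int = 2) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[int, Future] = {}
        self._ids = itertools.count(1)
        self._notify = notify  # called as notify(user, job_id, final_state)

    def submit(self, user: str, work: Callable[[], object]) -> int:
        """Start work in the background and return immediately with a job id."""
        job_id = next(self._ids)
        future = self._pool.submit(work)
        self._jobs[job_id] = future
        future.add_done_callback(
            lambda f, j=job_id: self._notify(user, j, self._state(f))
        )
        return job_id

    @staticmethod
    def _state(future: Future) -> str:
        if not future.done():
            return "in progress"
        return "failed" if future.exception() else "done"

    def status(self) -> Dict[int, str]:
        """What a job status page would show: every job and its current state."""
        return {job_id: self._state(f) for job_id, f in self._jobs.items()}

    def wait(self) -> None:
        """Block until all submitted jobs have finished."""
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    queue = JobQueue(notify=lambda user, job, state: print(f"to {user}: job {job} {state}"))
    queue.submit("alice", lambda: sum(range(1000)))
    queue.wait()
    print(queue.status())
```

Because `submit` returns as soon as the work is queued, the caller (here standing in for the web request) never blocks on the archive transfer itself.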

Significant infrastructure is required to provide this “simple” interface--but the payoff is huge.
Authentication Gateway Server
- Java servlet that provides a common authentication protocol for all Collaboratory applications.
- Used to authenticate archive system users.
- All web-based Collaboratory software is being updated to use this single authentication server.
- Support for the authentication server has already been integrated into Blu-Ice/DCS.
- Allows users to navigate between web applications seamlessly without authenticating multiple times.
- Will allow access to be controlled based on the beam port schedule.
- Will allow users to start web-based applications from within Blu-Ice without requiring the user to authenticate again within the browser.
Impersonation Server
- Unix daemon that can run any non-interactive program on behalf of any Unix user.
- Enables web applications to run background jobs for a user with the actual rights of the Unix user account.
- Accepts commands via the HTTP protocol.
- Verifies authentication information with the Authentication Server.
- Used by the Collaboratory archive system to list directories in the web browser and run background archive jobs as the user.
- Will enable fluorescence scans and autochooch to be executed by the scripting engine in DCSS.
- Will allow further analyses to be initiated by the beam line control system automatically.
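The division of labor between the two servers can be illustrated with a small sketch: one component issues and validates session IDs, and the other refuses to run anything until the session has been validated, then runs the command as the authenticated user. The real Authentication Gateway Server is a Java servlet and the Impersonation Server is a Unix daemon speaking HTTP. The Python classes, method names, and status codes below are hypothetical stand-ins, and the plain-text password table exists only to keep the example short.

```python
import secrets
import shlex
import subprocess
from typing import Callable, Dict, List, Optional, Tuple


class AuthGateway:
    """Stand-in for a shared authentication service: issues and checks session IDs."""

    def __init__(self, passwords: Dict[str, str]) -> None:
        self._passwords = passwords          # illustration only; never store plain text
        self._sessions: Dict[str, str] = {}  # session id -> user name

    def login(self, user: str, password: str) -> Optional[str]:
        if self._passwords.get(user) != password:
            return None
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = user
        return session_id

    def validate(self, session_id: str) -> Optional[str]:
        """Return the user name for a live session, or None."""
        return self._sessions.get(session_id)


def run_as_unix_user(user: str, argv: List[str]) -> str:
    """Run a non-interactive program with the rights of the given Unix account.

    Assumes Python 3.9+ on POSIX and a daemon privileged enough to switch users.
    """
    result = subprocess.run(
        argv, user=user, stdin=subprocess.DEVNULL,
        capture_output=True, text=True, check=False,
    )
    return result.stdout


class ImpersonationServer:
    """Checks each request with the gateway before running anything for the user."""

    def __init__(self, gateway: AuthGateway,
                 runner: Callable[[str, List[str]], str] = run_as_unix_user) -> None:
        self._gateway = gateway
        self._runner = runner

    def handle(self, session_id: str, command_line: str) -> Tuple[int, str]:
        user = self._gateway.validate(session_id)
        if user is None:
            return 403, "invalid session"
        argv = shlex.split(command_line)
        if not argv:
            return 400, "empty command"
        return 200, self._runner(user, argv)
```

Because every application validates the same session ID against one gateway, a user who has logged in once can move between applications without authenticating again. The runner is passed in as a parameter so that the request-handling logic can be exercised without root privileges.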

Projects for the next year
Integration of web-based Collaboratory tools
- A new web-based environment for monitoring beam lines and viewing results will be developed over the next year.
- The diffraction image viewer, beam line video web application, and archive system will be integrated into this system.
- Will enable real-time monitoring of beam line operations and experimental results via the web.
- Layout of user interface will likely mimic Blu-Ice’s tab look and feel to leverage user familiarity and experience.
- Currently investigating tools for rapidly developing powerful web-based applications in a component-based framework (e.g., WebObjects).

Projects for the next year
Web-based proposal management system
- Provide all SSRL users with web-browser based tools for submitting proposals and beam time requests; updating personal information; and viewing personalized beam time schedules.
- Facilitate communication with user administration and user support staff.
- Integrate with production SSRL database system; eliminate older user interfaces and reporting tools.
- SSRL will run a separate instance of the Authentication Gateway Server for this purpose.
- Users will be able to use this system to specify which Unix accounts are enabled to collect data at the beam line when a particular proposal is active. No more editing the MySQL table!
- First new interfaces will be rolled out by the end of 2003; major features will likely be released in late 2004.
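The rule that only the Unix accounts enabled for a proposal may collect data while that proposal is active can be stated as a simple schedule lookup. The sketch below is an assumption about how such a check could look. The `BeamTime` record, the `may_collect` function, the proposal ID, and the dates are hypothetical; only the beam line name 11-1 appears in the slides.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class BeamTime:
    """One scheduled block of beam time and the accounts enabled for it."""
    beam_line: str
    proposal: str
    start: datetime
    end: datetime
    accounts: FrozenSet[str]


def may_collect(schedule: Iterable[BeamTime], account: str,
                beam_line: str, now: datetime) -> bool:
    """True if the account is enabled for a proposal active on this beam line now."""
    return any(
        slot.beam_line == beam_line
        and slot.start <= now < slot.end
        and account in slot.accounts
        for slot in schedule
    )


schedule = [
    BeamTime("11-1", "P-0001",
             datetime(2004, 3, 1, 9), datetime(2004, 3, 2, 9),
             frozenset({"alice", "bob"})),
]
print(may_collect(schedule, "alice", "11-1", datetime(2004, 3, 1, 12)))
```

Letting users maintain the account list through a web form, with a check of this kind behind it, would replace the manual table edits the slide refers to.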

Collaboratory projects for the next 5 years…
Ice-Floe
- Provide users with the databases, user interfaces, and project management capabilities required to make maximum use of high-throughput structural biology resources.
- Present users with a high-level interface to automated beam lines and automated structure determination systems.
- Enable users to focus on the workflow of carrying out their research rather than the details of each operation.
Ice-Breaker
- Develop an open protocol for communicating with beam line automation systems.
- Work with developers at other light sources to make the protocol compatible across a large fraction of structural biology beam lines worldwide.
- Enable anyone to develop their own interface to automated beam lines, support in-house LIMS, interface to other software packages, etc.
- Allow users to choose the interface most useful to them, independent of the light source.

Where we’re going: data grids, compute grids and experimental resource grids