Grid Computing
Chip Watson, Jefferson Lab
Hall B Collaboration Meeting, 1-Nov-2001


What is the Grid?
– Some wax philosophical, and say it is an unlimited capacity for computing: like the power grid, you just plug in and use it, without caring who provides it. Difficulty: "metering" your use of resources and charging for them. We aren't there yet.
– Simpler view: it is a large computer center with a geographically distributed file system and batch system. This view assumes you have a right to use each piece of the distributed system, subject perhaps to local accounting constraints.

Key Aspects of the Grid
– Data Grid: a location-independent file system. If you know the "logical name" of a data set, you can find it (normal access controls apply). Files can migrate around the grid to optimize usage, and may exist in multiple locations.
– Computational Grid: submit a job to "the grid". You describe the requirements of your job, and grid middleware finds an appropriate place to run it. Jobs can be batch, or even interactive.

Other Important Aspects
– Single sign-on: you "log on" to the grid once, and you can use the distributed resources for a certain period of time (much as with the AFS file system). Analog: an all-day metro ticket.

Distributed Computing Model
– In the "old" model, a lab has a large computer center, provisioned for all demanding data storage, analysis, and simulation requirements.
– In the "current" model, only a fraction resides at the lab. This is already widely used in HEP experiments; a large experiment may enlist a major computing partner site, e.g. IN2P3 for BaBar.
– In the "new" model, many sites, large and small, participate. Some sites may be special based upon capacity or specialized capabilities (such as robotic storage). LHC will use a three-tier model, with a large central facility (tier 0) distributing data to moderately large national centers (tier 1), which in turn serve small nodes (tier 2).
– What is a reasonable distribution for Hall D?

Why desert a working model?
– Easier to get additional funds: state matching funds; also NSF or other funding agencies.
– Easier to involve students: a room full of computers is more attractive than an account on a machine 1000 km away.
– Opportunity for innovation: easier to experiment with a local machine than to get root access on a machine 1000 km away.

Case Study: The Lattice Portal
A prototype virtual computer center for Jefferson Lab (under development).

Contents
Components of the virtual computer center:
1. Data management
2. Batch system
3. Interactive system
Architectural components:
1. Information services using XML (Java servlets): replica catalog, data grid server (file cache & transfer agent), batch server
2. Authentication using X.509, SSL
3. Java client packages

A Virtual Computer Center: Data Management
Global logical file system (possibly constrained to a project):
1. Logical names for files (location independent)
2. Active files possibly cached at multiple sites
3. Inactive files in off-line storage (tape silo, multi-site)
Data Grid Node (see the sketch below):
– Manages a cache of logical files, perhaps using multiple file servers; exports files locally via NFS
– Maps a logical name to a local (physical) file name
– Supports file transfers between managed and unmanaged storage, and between grid nodes (queuing of transfer requests)
Replica Catalog:
– Tracks which logical files exist at which data grid nodes
– Contains some file metadata to allow file selection by attributes as well as by name
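A minimal Java sketch of the data grid node role described above. The interface and method names (DataGridNode, locate, stage, cachedFiles) are hypothetical illustrations, not the actual JLab API.

    import java.io.File;

    // Hypothetical interface for a data grid node: it manages a cache of
    // logical files, maps logical names to local physical files, and queues
    // transfers when a file is not currently cached here.
    public interface DataGridNode {
        // Map a location-independent logical name to a local physical file,
        // or return null if the file is not cached at this node.
        File locate(String logicalName);

        // Queue a transfer that stages the logical file into this node's
        // managed cache (from off-line storage or another grid node).
        void stage(String logicalName);

        // List the logical names currently cached at this node.
        String[] cachedFiles();
    }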

In picture form…
Diagram: a client program (with a replica catalog library) talks to a metadata catalog host, a replica catalog host, and a data grid server / file server host.
1. Get file names from the metadata catalog (energy, target, magnet settings).
2. Contact the replica catalog to locate the desired file; get a referral to a Data Grid Server.
3. From the data grid server, get the file state (on disk), additional info, and a referral to a transfer agent.
4. Get the file (parallel streams).
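The same sequence as a client-side sketch in Java. These interfaces and method names only label the steps in the diagram; they are not the actual catalog or server APIs.

    // Hypothetical client-side sketch of the lookup sequence above.
    public class FileLookupExample {

        interface MetaDataCatalog { String[] select(String attributeQuery); }

        interface ReplicaCatalog { DataGridServer locate(String logicalName); }

        interface DataGridServer {
            String prepare(String logicalName);                 // returns a transfer-agent referral
            void fetch(String transferUrl, String localPath);   // parallel-stream download
        }

        static void fetchByAttributes(MetaDataCatalog meta, ReplicaCatalog replicas) {
            // 1. Get logical file names from the metadata catalog
            //    (energy, target, magnet settings).
            String[] names = meta.select("energy=4.0 target=LH2 magnet=normal");

            // 2. Contact the replica catalog to locate the desired file;
            //    the reply is a referral to a data grid server.
            DataGridServer node = replicas.locate(names[0]);

            // 3. Ask that server for the file's state and a transfer-agent referral.
            String transferUrl = node.prepare(names[0]);

            // 4. Get the file (parallel streams) into local scratch space.
            node.fetch(transferUrl, "/scratch/" + names[0]);
        }
    }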

A Virtual Computer Center: Batch System
Global queue(s):
– A user submits a batch job, perhaps using a web interface, to the virtual computer center (a.k.a. meta-facility). Based upon the locations of the executable, the input data files, and the necessary compute resources, the job is assigned to a particular compute grid node (cluster of machines).
Compute grid node:
– A set of co-located compute resources managed by a batch system, typically co-located with a data grid node. E.g. Jefferson Lab's Computer Center.

Virtual Computer Center: Interactive
Conventional remote login is expected to be less common, as all capabilities are remotely accessible. Nevertheless…
Interactive services:
1. ssh login to a machine of the desired architecture and operating system
2. interactive access to small clusters for serial and parallel jobs (or fast turnaround on a local batch system)

Implementation?
As with any distributed system, there are many ways to construct a meta-facility or grid:
– CORBA (distributed object system)
– DCOM (Windows only)
– Custom protocols over TCP/IP or UDP/IP
– Grid middleware: Globus (from ANL), Legion (UVA)
– Web services
… or some combination of the above.

What are Web Services?
Web services are functions or methods that can be accessed across the web. Think of this as a "better" RPC (remote procedure call) system. Why better?

Why Web Services?
– Use of industry standards: HTTP, HTTPS, XML, SOAP, WSDL, UDDI, …
– Support for many languages, compiled and scripted
– Self-describing protocols: easier management of versioning and evolution
– Support for authentication
– Strong industry support: Microsoft's .NET initiative, Sun's ONE (Open Net Environment), IBM contributions to Apache SOAP
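To make "industry standards" concrete, here is a minimal sketch of what a SOAP web service call looks like on the wire: an XML envelope POSTed over HTTP using only standard Java classes. The endpoint URL and the "listDirectory" operation are hypothetical, not the actual Lattice Portal service.

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class SoapCallSketch {
        public static void main(String[] args) throws Exception {
            // A SOAP request is just XML; any language that can speak HTTP can send it.
            String envelope =
                "<soap:Envelope xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'>" +
                "  <soap:Body>" +
                "    <listDirectory xmlns='urn:datagrid-example'>" +
                "      <path>/home/halld/run001</path>" +
                "    </listDirectory>" +
                "  </soap:Body>" +
                "</soap:Envelope>";

            URL url = new URL("https://portal.example.org/datagrid/soap");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
            conn.setDoOutput(true);

            OutputStream out = conn.getOutputStream();
            out.write(envelope.getBytes("UTF-8"));
            out.close();

            // The reply is another XML document, readable by any client language.
            InputStream reply = conn.getInputStream();
            int b;
            while ((b = reply.read()) >= 0) System.out.write(b);
            System.out.flush();
            reply.close();
        }
    }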

A Three-Tier Web Services Architecture
Diagram: a web browser connects over authenticated connections to a web server (portal) hosting an XML-to-HTML servlet and web service applications; the portal in turn calls web services and grid services on remote web servers, which front local back-end services (batch system, storage system, grid resources such as Condor).

Web Services Details: Data Grid
Replica Catalog & Data Grid Node (common operations):
– List – contents of a directory
– Navigate – to another directory, or follow a soft link
– Mkdir – make a new directory
– Link – make a new link
– Delete – a logical file, directory, or link
– Properties – set / retrieve properties of a file, directory, or link (including protection / access control)
Replica Catalog specific:
– Create – a new logical file
– Add/Remove/Select/Delete replica – manipulate references to where a file is stored
Data Grid Node specific:
– Allocate – space for an incoming file
– Copy – a file to/from unmanaged space, or another grid node
– Locate – get a reference to the physical file for transfer or local access
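As one way to picture these operations as an API, here is a hypothetical Java binding; the signatures are illustrative sketches, not the published service interface or WSDL.

    // Hypothetical Java view of the data grid operations listed above.
    public interface DataGridService {
        // Common to replica catalog and data grid node
        String[] list(String directory);                   // contents of a directory
        void mkdir(String directory);
        void link(String linkPath, String target);
        void delete(String path);                          // logical file, directory, or link
        String getProperty(String path, String name);      // incl. protection / access control
        void setProperty(String path, String name, String value);

        // Replica catalog specific
        void create(String logicalFile);
        void addReplica(String logicalFile, String gridNode);
        void removeReplica(String logicalFile, String gridNode);
        String[] selectReplicas(String logicalFile);       // where is this file stored?

        // Data grid node specific
        void allocate(String logicalFile, long bytes);     // space for an incoming file
        void copy(String source, String destination);      // to/from unmanaged space or another node
        String locate(String logicalFile);                 // reference to the physical file
    }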

Web Services Details: Batch System
User job operations:
– Submit: resource requirements (CPU, memory, disk, net, …); dependencies on other jobs / events; executables, libraries, etc.; input files, output files, …
– Cancel
– Suspend / Resume
– List – by queue, owner, site, …
– View allocation, usage
Operator operations:
– On systems, queues, jobs
– On the quota / allocation system
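A matching hypothetical Java view of the user job operations; again the names and signatures are illustrative only.

    // Hypothetical binding of the batch operations listed above.
    public interface BatchService {
        // Submit a job description (resource requirements, dependencies,
        // executables, input/output files) and get back a job identifier.
        String submit(String jobDescriptionXml);

        void cancel(String jobId);
        void suspend(String jobId);
        void resume(String jobId);

        // List jobs by queue, owner, or site (any argument may be null).
        String[] list(String queue, String owner, String site);

        // View allocation and usage for a user or project.
        String usage(String project);
    }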

Technology Choice: XML + …
Advantages:
– Self-describing data (contains metadata)
– Facilitates heterogeneous systems
– Robust against evolution (none of the fragile versioning that distributed object systems encounter): a new server generates additional tags, which are ignored by an old client; a new client detects the absence of new tags and knows it is talking to an old server (and/or supplies defaults)
– Capable of defining all key concepts and operations for both client-server and client-portal communications
Technologies:
– XML – eXtensible Markup Language
– SOAP – Simple Object Access Protocol (~ a modern RPC system)
– WSDL – Web Services Description Language (~ an IDL)
– UDDI – Universal Description, Discovery and Integration
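The versioning point can be shown with a few lines of standard Java XML parsing: the client reads the tags it knows, ignores tags it does not, and supplies a default when a newer tag is absent. The <fileInfo> format here is invented for illustration.

    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;

    public class TolerantXmlClient {
        public static void main(String[] args) throws Exception {
            String reply =
                "<fileInfo>" +
                "  <name>run001.evt</name>" +
                "  <size>123456789</size>" +
                "  <checksum>abcd1234</checksum>" +   // tag unknown to old clients: simply ignored
                "</fileInfo>";

            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(reply)));

            String name = text(doc, "name", "unknown");
            String size = text(doc, "size", "0");
            // An old server would omit <checksum>; the new client just uses a default.
            String checksum = text(doc, "checksum", "(not provided)");

            System.out.println(name + " " + size + " " + checksum);
        }

        // Return the text of the first element with this tag, or a default if absent.
        static String text(Document doc, String tag, String defaultValue) {
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : defaultValue;
        }
    }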

Technology Choices: Java Servlets
Java advantages:
1. Rapid code development
2. No memory leaks
3. Easy-to-use interfaces to SQL databases and XML libraries
4. Rich library of low-level components (containers, etc.)
Web + servlet advantages:
1. Java (see above)
2. Scalability (see e-commerce)
3. Modular web services: one servlet can invoke another, e.g. to translate XML to HTML
Minor web inconvenience:
1. Asynchronous notification of web service clients
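A minimal sketch of the servlet approach: one servlet returns raw XML for programmatic clients, and another servlet (or an XSL sheet, as in the later slides) can turn that XML into HTML for browsers. The URL parameter and the XML tags are illustrative, not the portal's actual format.

    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Returns a directory listing as XML; deployed under the Tomcat servlet engine.
    public class DirectoryListingServlet extends HttpServlet {
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws IOException {
            String path = request.getParameter("path");   // e.g. ?path=/home/halld

            response.setContentType("text/xml");
            PrintWriter out = response.getWriter();
            out.println("<directory path='" + path + "'>");
            out.println("  <file name='run001.evt' cached='true'/>");
            out.println("  <file name='run002.evt' cached='false'/>");
            out.println("</directory>");
        }
    }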

PPDG Collaboration: JLAB/SRB
– Web services as a general portal to a variety of back-end storage systems (JLAB, SRB, …), and to other services, e.g. batch
– The project should define the abstractions at the web services level, defining all metadata for interacting with a storage system
– Define XML to describe digital objects and collections/directories (ALL): metadata to describe the logical namespace of the grid (SRB, JLAB, GridFTP attributes, …); a standard structure for organizing it as XML
– Define (WSDL?) operations for browse, query, manage (ALL): listing files available through the interface; caching, replication, pinning, staging, resource allocation, etc.
– Back-end implementations: JASMine (JLAB), SRB (SDSC), (SRM, Globus)
– Implement a demonstration web services client (JLAB); web services clients should be able to interact with any of these back ends

JLAB MSS – JASMine
Tape storage system:
– STK silos; 8 Redwood and other drives
– 7 data movers, ~300 GB buffer each
– Software: JASMine
JASMine-managed mass storage sub-systems:
– 15 TB experiment cache pools
– 2 TB farm cache
– 0.5 TB LQCD cache pool
Features:
– Stand-alone cache manager with pluggable policies (see the sketch below)
– Implemented in Java; distributed, scalable
– Pluggable security: authentication & authorization, to be integrated with GSI
– Scheduling of drives
– Can manage tape, tape and disk, or disk alone
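One way to read "pluggable policies" is that the cache manager delegates eviction decisions to a policy object; the interface below is an illustrative guess, not JASMine's actual API.

    import java.util.List;

    // Hypothetical pluggable cache policy: the cache manager asks the policy
    // which cached files to evict (or migrate to tape) when space is needed.
    public interface CachePolicy {
        List selectForEviction(List cachedFiles, long bytesNeeded);
    }

A site could then plug in, for example, a least-recently-used policy for the farm cache and a pin-aware policy for the LQCD cache pool.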

Example – Demo Client
Similar to a graphical ftp client, but each half can attach to a grid node:
– Cache – managed filesystem
– User's home directory
– Other file systems at the web server
– Replica catalog
– Local MSS, if it is separate from the replica system
It can move files in and out of the managed store, and it negotiates compatible protocols between grid nodes, e.g. http, SRB, gridFTP, ftp, bbftp, JASMine, etc. (see the sketch below).
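The protocol negotiation can be as simple as each node advertising the transfer protocols it supports and the client picking the first one both ends understand. This is a sketch of that idea, using example protocol names from the slide.

    import java.util.Arrays;
    import java.util.List;

    public class ProtocolNegotiation {
        public static void main(String[] args) {
            // Protocols advertised by the source and destination grid nodes.
            List source = Arrays.asList(new String[] {"gridftp", "bbftp", "http"});
            List dest   = Arrays.asList(new String[] {"jasmine", "bbftp", "ftp"});

            // Pick the first protocol supported by both ends.
            String chosen = null;
            for (int i = 0; i < source.size(); i++) {
                if (dest.contains(source.get(i))) { chosen = (String) source.get(i); break; }
            }
            System.out.println(chosen != null ? "use " + chosen : "no common protocol");
        }
    }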

Technologies Employed
– Apache web server
– Tomcat servlet engine, SOAP libraries
– Java Server Pages (JSP)
– XML data format
– XSL style sheets for presentation
– X.509 certificate authentication
– Web interface to a simple certificate authority that issues certificates valid within the meta-facility (signed by Jefferson Lab)
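The "XSL style sheets for presentation" step corresponds to a standard JAXP transform: the portal applies a style sheet to the XML a web service returns, producing HTML for the browser. The file names here are placeholders.

    import java.io.File;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.stream.StreamResult;
    import javax.xml.transform.stream.StreamSource;

    public class XmlToHtml {
        public static void main(String[] args) throws Exception {
            // Apply an XSL style sheet to a web-service XML reply to produce HTML.
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new File("listing.xsl")));
            t.transform(new StreamSource(new File("listing.xml")),
                        new StreamResult(new File("listing.html")));
        }
    }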

Data Grid Capabilities Planned
– Replicated data (multi-site), global tree-structured name space (like the Unix file system)
– Replica catalog, replicated across sites using mySQL as the back end, probably using mySQL's ability to replicate the catalog (fault tolerance)
– Browse by attributes as well as by name
– Parallel file transfers (bbftp, gridftp, …); Jpars – 100% Java parallel file transfers (with third-party transfers and authentication)
– Drag-and-drop between sites
– Policy-based replication (automatic migration between sites)
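With mySQL as the back end, a replica-catalog lookup is an ordinary JDBC query. This sketch assumes a MySQL JDBC driver on the classpath; the host, credentials, table, and column names (replicas, logical_name, grid_node) are invented for illustration.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // Find which grid nodes hold replicas of a given logical file.
    public class ReplicaLookup {
        public static void main(String[] args) throws Exception {
            Connection db = DriverManager.getConnection(
                    "jdbc:mysql://catalog.example.org/datagrid", "reader", "secret");
            PreparedStatement q = db.prepareStatement(
                    "SELECT grid_node FROM replicas WHERE logical_name = ?");
            q.setString(1, "/halld/raw/run001.evt");

            ResultSet rs = q.executeQuery();
            while (rs.next()) {
                System.out.println("replica at " + rs.getString("grid_node"));
            }
            db.close();
        }
    }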

Status
Prototype:
– Browse the contents of a prototype disk cache / tape storage file system
– Move files between managed and unmanaged storage on a data node
– Move files (including entire directories) between desktop and data node
– Displays whether a file is currently in the disk cache
– Can request a move from tape to disk (not released)
Soon:
– Third-party file transfers (between two servers)

Near Term
– Convert from raw XML to SOAP (this month)
– Deploy the disk cache manager to FSU & MIT (4Q01)
– Generalize the current system's disk-to-tape migration to WAN site-to-site migration of files, wrapping e.g. gridftp or another parallel transfer tool (1Q02)

Conclusions
– Grid capabilities are starting to emerge
– Jefferson Lab will have a functioning data grid in FY02
– Jefferson Lab will have a functioning meta-facility in FY03