Grid Analysis Environment (GAE) (overview)
Frank van Lingen (on behalf of the GAE group)
Korea Workshop, May 2005

Outline
- GAE System View
- Frameworks
- Early Results
- Projects

Goal
Provide a transparent environment for a physicist to perform his/her analysis (batch/interactive) in a distributed, dynamic environment:
- Identify your data (Catalogs), submit your (complex) job (Scheduling, Workflow, JDL), get "fair" access to resources (Priority, Accounting), monitor job progress (Monitor, Steering), get the results (Storage, Retrieval), then repeat the process and refine the results.
- Support data transfers ranging from the (predictable) movement of large-scale (simulated) data to the highly dynamic analysis tasks initiated by rapidly changing teams of scientists.
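The analysis loop above maps naturally onto a sequence of service calls. Below is a minimal sketch of that loop, assuming hypothetical XML-RPC endpoints and method names (the URLs, `find`, `submit`, `status`, and `output_location` are invented for illustration and are not taken from the talk):

```python
# Illustrative only: hypothetical endpoints and method names for the
# identify -> submit -> monitor -> retrieve loop described on this slide.
import time
import xmlrpc.client

catalog = xmlrpc.client.ServerProxy("https://gae.example.org/catalog")
scheduler = xmlrpc.client.ServerProxy("https://gae.example.org/scheduler")
monitor = xmlrpc.client.ServerProxy("https://gae.example.org/monitor")

# 1. Identify your data (Catalogs)
datasets = catalog.find({"owner": "cms", "type": "simulated"})

# 2. Submit your (complex) job (Scheduling, Workflow, JDL)
job_id = scheduler.submit({"executable": "analysis.C", "input": datasets})

# 3. Monitor job progress (Monitor, Steering)
while monitor.status(job_id) not in ("Done", "Failed"):
    time.sleep(60)

# 4. Get the results (Storage, Retrieval), then refine and repeat
print("results at", scheduler.output_location(job_id))
```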

Constraints
The "Acid Test" for Grids; crucial for the LHC experiments:
- A large, diverse, distributed community of users
- Support for 100s to 1000s of analysis tasks, shared among dozens of sites
- Widely varying task requirements and priorities
- Need for priority schemes, robust authentication, and security
Operates in a resource-limited and policy-constrained global system:
- Dominated by collaboration policy and strategy (resource usage and priorities)
- Requires real-time monitoring; task and workflow tracking; decisions often based on a global system view

System View
(Diagram: a layered Service Oriented Architecture. Resources (Network, Compute, Storage) at the bottom; Local Services, High Level Services, and Global Services above them; (Domain) Portal and (Domain) Applications on top. Monitoring, Testing, Development, Deployment, and Support/Feedback span all layers, with interface specifications connecting the multiple system stages. Input: not only research, but also software engineering, sociology, and software lifecycle design.)

System View (Details)
Domains
- Virtual Organization and Role management
Service Oriented Architecture
- Authorized access
- Access control management (groups/individuals)
- Discoverable
- Protocols (XML-RPC, SOAP, ...)
- Service version management
- Frameworks: Clarens, MonALISA, ...
Monitoring
- End-to-end monitoring, collecting and disseminating information
- Provide visualization of monitor data to users
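To make the Service Oriented Architecture bullets concrete, here is a minimal sketch of wrapping an application function as a discoverable, versioned XML-RPC service using only the Python standard library; the service name, port, and methods are invented for the example:

```python
# Sketch of exposing an application as a web service in the SOA spirit
# described above: discoverable (introspection) and version-managed.
from xmlrpc.server import SimpleXMLRPCServer

def version():
    # Service version management: clients can check compatibility.
    return "catalog-0.1"

def find(query):
    # Placeholder lookup; a real service would consult a database.
    return [name for name in ("dataset_a", "dataset_b") if query in name]

server = SimpleXMLRPCServer(("localhost", 8080), allow_none=True)
server.register_introspection_functions()  # discoverable: system.listMethods
server.register_function(version)
server.register_function(find)
server.serve_forever()
```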

System View (Details)
Local Services (local view)
- Local Catalogs, Storage Systems, Task Tracking (single-user tasks), Policies, Job Submission
Global Services (global view)
- Discovery Service, Global Catalogs, Job Tracking (multi-user tasks), Policies
High Level Services ("autonomous")
- Act on monitor data and have a global view
- Scheduling, Data Transfer, Network Optimization, Task Tracking (many users)

System View (Details)
(Domain) Portal
- One-stop shop for applications/users to access and use Grid resources
- Task Tracking (single-user tasks)
- Graphical user interface
- User session logging (provide feedback when failures occur)
(Domain) Applications
- ORCA/COBRA, IGUANA, PHYSH, ...

Single User View
(Diagram: a client application interacting with Discovery, Planner/Scheduler, Monitor Information, Policy, Steering, Catalogs, Job Submission, Storage Management, Execution, Dataset, and Data Transfer services.)
- Catalogs to select datasets
- Resource and application discovery
- Schedulers guide jobs to resources
- Policies enable "fair" access to resources
- Robust (large size) data(set) transfer
- Feedback to users (e.g. status of their jobs)
- Crash recovery of components (identify and restart)
- Secure, authorized access to resources and services
- Thousands of user jobs (multi-user environment!)

Service Oriented View
(Diagram: an analysis client talks HTTP, SOAP, or XML-RPC to a "Grid Services Web Server" providing discovery, ACL management, and certificate-based access, which fronts the Grid services: a scheduler with fully abstract and fully concrete planners, catalogs (metadata, virtual data, replica), monitoring, and a Grid-wide execution service.)
- Clients talk standard protocols to the "Grid Services Web Server"
- A simple Web service API allows simple or complex analysis clients
- Typical clients: ROOT, Web browser, ...
- The Clarens portal hides complexity
- Key features: global scheduler, catalogs, monitoring, Grid-wide execution service
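As a client-side counterpart to the server sketch shown earlier: any client that speaks a standard protocol can introspect and call the services. The URL is illustrative and matches the toy server from the earlier sketch, not a real GAE endpoint:

```python
# Any XML-RPC-capable client (ROOT, a browser gateway, a script, ...)
# can discover and call services through the web server.
import xmlrpc.client

grid = xmlrpc.client.ServerProxy("http://localhost:8080")
print(grid.system.listMethods())  # discovery via introspection
print(grid.find("dataset"))       # simple API: one call per task
```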

Peer-to-Peer View
A "peer-to-peer" configuration (discover, publish, subscribe) enhances:
- Robustness
- Scalability
It provides:
- Discovery (of services, software, ...)
- Publication
- Subscription
No single point of failure.
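A toy registry can make the publish/subscribe/discover pattern concrete. In actual GAE deployments this role was played by Clarens and MonALISA; the class below is purely illustrative:

```python
# Toy illustration of the peer-to-peer publish/subscribe/discover pattern.
from collections import defaultdict

class PeerRegistry:
    def __init__(self):
        self.services = {}                    # service name -> endpoint
        self.subscribers = defaultdict(list)  # service name -> callbacks

    def publish(self, name, endpoint):
        self.services[name] = endpoint
        for callback in self.subscribers[name]:
            callback(name, endpoint)          # push updates to subscribers

    def subscribe(self, name, callback):
        self.subscribers[name].append(callback)

    def discover(self, name):
        return self.services.get(name)

# Running several registries on different peers, each mirroring the
# others' publications, removes any single point of failure.
registry = PeerRegistry()
registry.subscribe("catalog", lambda n, e: print("new peer:", n, e))
registry.publish("catalog", "https://peer1.example.org/catalog")
print(registry.discover("catalog"))
```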

Framework (Clarens)
- Authentication (X509)
- Access control on Web Services
- Remote file access (with access control)
- Discovery of Web Services and software
- Shell service: shell-like access to remote machines (managed by access control lists)
- Proxy certificate functionality
- Group management: VO and role management
- Good performance of the Web Service framework
- Integration with MonALISA
(Diagram: third-party applications and clients talk to Clarens web servers and services over http/https via XML-RPC, SOAP, Java RMI, JSON-RPC, ...)
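To illustrate the access-control bullet, here is a minimal sketch of ACL-guarded dispatch keyed on VO/role pairs; the data structures and method names are invented for the example and are not taken from the Clarens API:

```python
# Illustrative ACL check in the spirit of Clarens' access control on
# web services: each method lists the VO/role pairs allowed to call it.
ACLS = {
    "file.ls":   {"cms/analysis", "cms/admin"},
    "shell.run": {"cms/admin"},
}

def authorize(method, vo, role):
    """Return True if the given VO/role pair may invoke the method."""
    return f"{vo}/{role}" in ACLS.get(method, set())

assert authorize("file.ls", "cms", "analysis")
assert not authorize("shell.run", "cms", "analysis")
```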

Framework (MonALISA)
(Diagram: monitor sensors in Web Services and applications (1) publish to MonALISA station servers, which (2) disseminate over a JINI network; clients (3) subscribe with predicates and (4) steer/retrieve.)
- Monitor sensors (Web Services, applications, ...)
- Active filter agents: process data; application-specific monitoring
- Mobile agents: decision support; global optimisations
- MonALISA services are able to dynamically register and discover each other
- Based on a multi-threaded engine; very scalable
- Services are self-describing
- Code updates are automatic and secure
- Dynamic configuration for services via a secure admin interface
- Fully distributed, no single point of failure!
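The publish/subscribe-with-predicates flow can be sketched as follows. MonALISA itself is a Java/JINI framework, so this Python version is only a conceptual illustration with invented names:

```python
# Conceptual sketch of a station server with predicate-based
# subscriptions, mirroring the publish -> disseminate -> subscribe flow.
def matches(predicate, record):
    return all(record.get(k) == v for k, v in predicate.items())

class StationServer:
    def __init__(self):
        self.subscriptions = []  # (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self.subscriptions.append((predicate, callback))

    def publish(self, record):  # sensors publish monitor records
        for predicate, callback in self.subscriptions:
            if matches(predicate, record):
                callback(record)  # disseminate only matching data

server = StationServer()
server.subscribe({"site": "caltech", "metric": "cpu_load"},
                 lambda r: print("alert:", r))
server.publish({"site": "caltech", "metric": "cpu_load", "value": 0.95})
```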

GAE Related Projects
- DISUN (deployment): deployment and support for distributed scientific analysis
- Ultralight (development): treating the network as a resource; "vertically" integrated monitor information; multi-user, resource-constrained view
- MCPS (development): provides Clarens-based Web Services for batch analysis (workflow)
- SPHINX (development): policy-based scheduling (global service), exposed as a Clarens Web Service, using MonALISA monitor information
- SRM/dCache (development): service-based data transfer (local service)
- Lambda Station (development): authorized programmability of routers using MonALISA and Clarens
- PHYSH: Clarens-based services for command-line user analysis
- CRAB: client to support user analysis using the Clarens framework

GAE and Ultralight
Make the network an integrated, managed resource:
- Unpredictable multi-user analysis; overall demand typically fills the capacity of the resources
- Real-time monitoring systems for networks, storage, computing resources, ...: E2E monitoring
- (Diagram: application interfaces on top of request planning, network planning, monitoring, and network resources.)
- Support data transfers ranging from the (predictable) movement of large-scale (simulated and real) data to the highly dynamic analysis tasks initiated by rapidly changing teams of scientists

Combining Grid Projects into the Grid Analysis Environment
(Diagram: DISUN, Ultralight, MCPS, Clarens applications, MonALISA applications, SPHINX, SRM/dCache, Lambda Station, policy, PHEDEX, CRAB, PHYSH, and other frameworks feed into the Grid Analysis Environment through a development, testing, deployment, and support/feedback cycle.)
GAE focuses on integration.

GAE Deployment
- Installation of CMS (ORCA, COBRA, IGUANA, ...) and LCG (POOL, SEAL, ...) software on the Caltech GAE testbed, which serves as an environment for integrating applications as web services into the Clarens framework
- Demonstrated a distributed multi-user GAE prototype at SC03 and SC04 (using BOSS and SPHINX)
- The January 2005 analysis prototype is currently being upgraded to work with CRAB
- PHEDEX deployed at Caltech, UFL, and UCSD, and transferring data

GAE Deployment
- Clarens has been deployed on 30+ machines at Caltech, Florida, Fermilab, CERN, Pakistan, and INFN (see the Clarens Discovery Service); available in Java and Python
- Core system: Discovery, Proxy, Authentication, Remote File Access, Shell Service, ...
- The Clarens Discovery Service is part of OSG (uses MonALISA); talking with EGEE about a common interface
- A Software Discovery Service is being developed (a SCRAM-based prototype is available)
- A GAE distributed test framework (Java, Python) is available, with daily testing and verification
- A Generic Catalog Service was developed to expose pooldbs, refdb, phedex, ... and other DBs as entities with key/values
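The "entities with key/values" abstraction of the Generic Catalog Service might look like the following sketch, where each backend (pooldbs, refdb, phedex, ...) implements the same query interface; all class names and data are invented for illustration:

```python
# Sketch of a generic catalog: heterogeneous databases exposed as
# lists of key/value entities behind one query interface.
class CatalogBackend:
    def __init__(self, rows):
        self.rows = rows  # list of {key: value} entities

    def query(self, **selection):
        # Return every entity whose key/value pairs match the selection.
        return [row for row in self.rows
                if all(row.get(k) == v for k, v in selection.items())]

refdb = CatalogBackend([
    {"dataset": "ttbar_mc", "site": "caltech", "events": 10000},
    {"dataset": "ttbar_mc", "site": "ufl", "events": 10000},
])

print(refdb.query(site="caltech"))
```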

GAE Deployment
- PHEDEX and PubDB Web Services deployed in Florida and at Caltech; part of the test framework
- Working with the MCPS group to develop a Clarens-based MCPS Web Service interface and a browser-based GUI, supporting production and analysis
- Web Service interface to BOSS (the CMS job/task book-keeping system)
- SPHINX: distributed scheduler developed at UFL
- Clarens/MonALISA integration: facilitating user-level job/task interaction; first version of a Steering Service
- A Java Webstart dashboard is being developed for Clarens-based services; also cross-browser support for the JavaScript GUI (Firefox, IE, Safari, ...) based on JSON
- CAVES: analysis code-sharing environment developed at UFL
- Working with CERN to have the GAE components included in the CMS software distribution; GAE components are being integrated into the DPE and VDT distributions used in US-CMS

Lessons Learned
- Quality of service(s): a lot of exception handling is needed for robust services (graceful failure); time-outs are important
- Composite services need very good performance
- Discovery services enable location-independent service composition; the semantics of services are important (different names, namespaces, and/or WSDL)
- Web service design: not every application is developed with a web service interface in mind
- Interfaces of 3rd-party applications change: rapid application development
- Applications overlap in functionality (but not in interfaces!)
- There is no single solution for CMS
- Not every problem has a technical solution; conventions are also important
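The time-out and graceful-failure lessons suggest a calling discipline like the sketch below; the retry policy, endpoint handling, and exception set are illustrative, not a description of how GAE services actually implemented this:

```python
# Sketch of a robust remote call: bounded waits, retries, and graceful
# failure instead of hanging a composite service.
import socket
import xmlrpc.client

def robust_call(url, method, *args, timeout=10, retries=3):
    """Call an XML-RPC method, failing gracefully instead of hanging."""
    for attempt in range(retries):
        try:
            socket.setdefaulttimeout(timeout)  # cap each network wait
            proxy = xmlrpc.client.ServerProxy(url)
            return getattr(proxy, method)(*args)
        except (OSError, xmlrpc.client.Fault) as exc:
            print(f"attempt {attempt + 1} failed: {exc}")
    return None  # callers can then degrade gracefully, not crash
```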

Future Directions
- Data movement using PHEDEX
- Integration of runjob into the current deployment of services: full chain of end-to-end analysis
- Develop/deploy an accounting service
- Improved GUI interface (Webstart client)
- Improved exception handling
- Integration/interoperability of mass storage applications (e.g. SRM) with the Clarens environment
- Steering service
- Autonomous replication
- Trend analysis using monitor data
- E2E error trapping and diagnosis: cause and effect
- Strategic workflow re-planning
- Adaptive steering and optimization algorithms
- Multi-user distributed environment

Korea Workshop May  More Information GAE: Ultralight: WIKI: Thank You. Questions?