1 Towards a US (and LHC) Grid Environment for HENP Experiments
CHEP 2000 Grid Workshop
Harvey B. Newman, Caltech
Padova, February 12, 2000

2 Data Grid Hierarchy: Integration, Collaboration, Marshal Resources
[Diagram: the tiered LHC data grid]
- Tier 0: CERN Computer Center with an Offline Farm of ~20 TIPS, fed by the Online System
- Tier 1: Regional Centers: Fermilab (~4 TIPS), France, Italy and Germany Regional Centers
- Tier 2: Tier2 Centers of ~1 TIPS each
- Tier 3: Institute servers (~0.25 TIPS each), holding a physics data cache
- Tier 4: Physicists' workstations
- Link speeds in the diagram: ~PBytes/sec off the detector; ~100 MBytes/sec from the Online System; ~2.4 Gbits/sec, ~622 Mbits/sec (or air freight) and 100 - 1000 Mbits/sec between the tiers
- Bunch crossings every 25 nsec; 100 triggers per second; each event is ~1 MByte in size
- Physicists work on analysis "channels". Each institute has ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
- 1 TIPS = 25,000 SpecInt95; a PC (today) = 10-15 SpecInt95

3 To Solve: the LHC "Data Problem"
- The proposed LHC computing and data handling will not support FREE access, transport or processing for more than a small part of the data
- Balance between proximity to large computational and data handling facilities, and proximity to end users and more local resources for frequently accessed datasets
- Strategies must be studied and prototyped, to ensure both acceptable turnaround times and efficient resource utilisation
- Problems to be explored:
  - How to meet the demands of hundreds of users who need transparent access to local and remote data, in disk caches and tape stores
  - Prioritise hundreds of requests from local and remote communities, consistent with local and regional policies (a sketch follows below)
  - Ensure that the system is dimensioned, used and managed optimally for the mixed workload
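A minimal sketch of how such policy-consistent prioritisation might look, assuming a simple scheme in which a site assigns policy weights to requesting communities; the community names, weights and scoring rule are illustrative, not from the talk:

    import heapq
    import itertools

    # Illustrative policy: relative weights for requesting communities at one site.
    POLICY_WEIGHTS = {"local": 1.0, "regional": 0.6, "remote": 0.3}

    class RequestQueue:
        """Orders data requests by policy weight and requested volume (toy model)."""
        def __init__(self):
            self._heap = []
            self._counter = itertools.count()  # tie-breaker keeps ordering stable

        def submit(self, community, user, dataset, gbytes):
            weight = POLICY_WEIGHTS.get(community, 0.1)
            # Smaller score = served earlier: favour high policy weight, penalise big pulls.
            score = gbytes / weight
            heapq.heappush(self._heap, (score, next(self._counter), user, dataset, gbytes))

        def next_request(self):
            score, _, user, dataset, gbytes = heapq.heappop(self._heap)
            return user, dataset, gbytes

    q = RequestQueue()
    q.submit("remote", "alice", "/store/AOD/higgs", 200)
    q.submit("local", "bob", "/store/AOD/top", 150)
    print(q.next_request())   # bob's local request is served first under this policy

The point is only that the policy is explicit data, so local and regional administrators can change it without touching the scheduling code.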

4 Regional Center Architecture Example by I. Gaines (MONARC)
[Diagram: a Tier 1 Regional Center, connected to CERN (network and tapes), to Tier 2 and simulation centers, and to local institutes and desktops]
- Tape mass storage & disk servers; database servers
- Production Reconstruction (Raw/Sim -> ESD): scheduled, predictable; experiment/physics groups
- Production Analysis (ESD -> AOD, AOD -> DPD): scheduled; physics groups
- Individual Analysis (AOD -> DPD and plots): chaotic; physicists' desktops
- Physics software development; R&D systems and testbeds
- Info servers, code servers, web servers, telepresence servers
- Training, consulting, help desk

5 Grid Services Architecture [*]
- Applications: HEP data-analysis related applications
- Application Toolkits: remote visualization toolkit, remote computation toolkit, remote data toolkit, remote sensors toolkit, remote collaboration toolkit
- Grid Services: protocols, authentication, policy, resource management, instrumentation, data discovery, etc.
- Grid Fabric: networks, data stores, computers, display devices, etc.; associated local services (local implementations)
[*] Adapted from Ian Foster

6 Grid Hierarchy Goals: Better Resource Use and Faster Turnaround
- "Grid" integration and (de facto standard) common services to ease development, operation, management and security
- Efficient resource use and improved responsiveness through:
  - Treatment of the ensemble of site and network resources as an integrated (loosely coupled) system
  - Resource discovery, query estimation (redirection), co-scheduling, prioritization, local and global allocations
  - Network and site "instrumentation": performance tracking, monitoring, forward prediction, problem trapping and handling

7 GriPhyN: First Production-Scale "Grid Physics Network"
Develop a new integrated distributed system, while meeting the primary goals of the US LIGO, SDSS and LHC programs
- Unified Grid system concept; hierarchical structure
- ~Twenty centers, with three sub-implementations
  - 5-6 each in the US for LIGO, CMS, ATLAS; 2-3 for SDSS
  - Emphasis on training, mentoring and remote collaboration
- Focus on LIGO and SDSS (+ BaBar and Run2) handling of real data, and LHC Mock Data Challenges with simulated data
Making the process of discovery accessible to students worldwide
GriPhyN Web Site: http://www.phys.ufl.edu/~avery/mre/
White Paper: http://www.phys.ufl.edu/~avery/mre/white_paper.html

8 Grid Development Issues
- Integration of applications with Grid middleware
  - A performance-oriented user application software architecture is required, to deal with the realities of data access and delivery
  - Application frameworks must work with system state and policy information ("instructions") from the Grid
- O(R)DBMSs must be extended to work across networks
  - E.g. "invisible" (to the DBMS) data transport, and catalog update
- Inter-facility cooperation at a new level, across world regions
  - Agreement on the choice and implementation of standard Grid components, services, security and authentication
  - Interface the common services locally to match heterogeneous resources, performance levels, and local operational requirements
  - Accounting and "exchange of value" software to enable cooperation

9 Roles of Projects for HENP Distributed Analysis
- RD45, GIOD: networked object databases
- Clipper/GC: high-speed access to object or file data; FNAL/SAM for processing and analysis
- SLAC/OOFS: distributed file system + Objectivity interface
- NILE, Condor: fault-tolerant distributed computing with heterogeneous CPU resources
- MONARC: LHC computing models: architecture, simulation, strategy, politics
- PPDG: first distributed data services and Data Grid system prototype
- ALDAP: OO database structures and access methods for astrophysics and HENP data
- GriPhyN: production-scale Data Grid
  - Simulation/modeling, application + network instrumentation, system optimization/evaluation
- APOGEE

10 (GIOD/ODBMS milestones, from the slide graphics)
- Objectivity/DB: creation of a 32,000-database federation
- DRO WAN tests with CERN
- Production on CERN's PCSF and file movement to Caltech
- Other ODBMS tests: tests with Versant (fallback ODBMS)

11 The China Clipper Project: A Data-Intensive Grid (ANL-SLAC-Berkeley)
China Clipper goal: develop and demonstrate middleware allowing applications transparent, high-speed access to large data sets distributed over wide-area networks.
- Builds on expertise and assets at ANL, LBNL & SLAC; NERSC, ESnet
- Builds on Globus middleware and a high-performance distributed storage system (DPSS from LBNL)
- Initial focus on large DOE HENP applications: RHIC/STAR, BaBar
- Demonstrated data rates to 57 Mbytes/sec

12 Grand Challenge Architecture (GCA)
An order-optimized prefetch architecture for data retrieval from multilevel storage in a multi-user environment
- Queries select events and specific event components based upon tag attribute ranges
  - Query estimates are provided prior to execution
  - Queries are monitored for progress and multi-use
- Because event components are distributed over several files, processing an event requires delivery of a "bundle" of files
- Events are delivered in an order that takes advantage of what is already on disk, with multi-user policy-based prefetching of further data from tertiary storage (a sketch of such ordering follows below)
- GCA inter-component communication is CORBA-based, but physicists are shielded from this layer
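A minimal sketch of the "serve what is already on disk first" idea, assuming each file bundle knows which files it needs and the cache contents are known; the bundle and cache structures are illustrative, not the actual GCA interfaces:

    from dataclasses import dataclass

    @dataclass
    class Bundle:
        """A set of files that must all be on disk before its events can be processed."""
        event_ids: list
        files: set

    def order_bundles(bundles, disk_cache):
        """Serve bundles needing the fewest tape fetches first; prefetch the rest."""
        def missing(b):
            return len(b.files - disk_cache)
        return sorted(bundles, key=missing)

    disk_cache = {"f1", "f2", "f3"}
    bundles = [
        Bundle(event_ids=[1, 2], files={"f1", "f9"}),        # one file missing
        Bundle(event_ids=[3],    files={"f2", "f3"}),        # fully cached
        Bundle(event_ids=[4, 5], files={"f7", "f8", "f9"}),  # all on tape
    ]

    for b in order_bundles(bundles, disk_cache):
        to_stage = b.files - disk_cache
        print(b.event_ids, "stage from tape:", to_stage or "nothing")

In the real system the ordering also folds in per-user and per-group policy, but the principle is the same: delivery order follows cache state rather than submission order.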

13 GCA System Overview
[Diagram: clients talk to the GCA/STACS layer, which uses an index of event tags and a file catalog to stage event files out of HPSS via pftp onto disk, alongside other disk-resident event data]

14 STorage Access Coordination System (STACS)
[Diagram: Query Estimator, Query Monitor, Cache Manager and Policy Module, built on a bit-sliced index and a file catalog]
- Query Estimator: returns a query estimate (from the bit-sliced index) before execution (a toy bitmap sketch follows below)
- Query Monitor: tracks query status and the cache map; passes lists of file bundles and events
- Cache Manager: takes requests for file caching and purging; issues pftp and file-purge commands
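The bit-sliced index is essentially a bitmap index over event-tag attributes; here is a toy sketch of selecting events by attribute range with bitmaps and producing a cheap estimate before execution. The attribute names are invented for illustration, not the actual STACS schema:

    # Toy bitmap ("bit-sliced") index over event tags: one bit per event per condition.
    events = [
        {"id": 0, "n_jets": 2, "met": 35.0},
        {"id": 1, "n_jets": 4, "met": 80.0},
        {"id": 2, "n_jets": 3, "met": 120.0},
        {"id": 3, "n_jets": 4, "met": 15.0},
    ]

    def build_bitmap(events, attr, predicate):
        """Return an integer whose bit i is set if event i satisfies the predicate."""
        bits = 0
        for i, ev in enumerate(events):
            if predicate(ev[attr]):
                bits |= 1 << i
        return bits

    jets_ge_3 = build_bitmap(events, "n_jets", lambda v: v >= 3)
    met_gt_50 = build_bitmap(events, "met", lambda v: v > 50.0)

    selection = jets_ge_3 & met_gt_50          # combine range conditions with a bit AND
    selected_ids = [i for i in range(len(events)) if selection >> i & 1]
    estimate = bin(selection).count("1")        # query estimate, without touching the data
    print("estimated events:", estimate, "ids:", selected_ids)   # -> 2 events: [1, 2]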

15 The Particle Physics Data Grid (PPDG)
ANL, BNL, Caltech, FNAL, JLAB, LBNL, SDSC, SLAC, U.Wisc/CS
- First-year goal: optimized cached read access to 1-10 GBytes, drawn from a total data set of order one Petabyte
[Diagram: a Site-to-Site Data Replication Service at 100 MBytes/sec between a primary site (data acquisition, CPU, disk, tape robot) and a secondary site (CPU, disk, tape robot); and a Multi-Site Cached File Access Service linking a primary site, satellite sites (tape, CPU, disk, robot) and university users (CPU, disk)]

16 The Particle Physics Data Grid (PPDG)
To provide the ability to query and partially retrieve hundreds of terabytes across wide-area networks within seconds, PPDG uses advanced services in three areas:
- Distributed caching: to allow for rapid data delivery in response to multiple requests
- Matchmaking and request/resource co-scheduling: to manage workflow and use computing and network resources efficiently; to achieve high throughput
- Differentiated services: to allow particle-physics bulk data transport to coexist with interactive and real-time remote collaboration sessions, and other network traffic

17 PPDG: Architecture for Reliable High-Speed Data Delivery
[Diagram: object-based and file-based application services on top of a cache manager, file access service, matchmaking service, cost estimation, file fetching service, file replication index, end-to-end network services, mass storage manager, resource management and file mover, spanning a site boundary / security domain]
+ Future: file and object export; cache & state tracking; forward prediction

18 First-Year PPDG "System" Components
Middleware components (initial choice); see the PPDG proposal:
- Object- and file-based application services: Objectivity/DB (SLAC-enhanced); GC Query Object, Event Iterator, Query Monitor; FNAL SAM system
- Resource management: start with human intervention (but begin to deploy resource discovery & management tools: Condor, SRB)
- File access service: components of OOFS (SLAC)
- Cache manager: GC Cache Manager (LBNL)
- Mass storage manager: HPSS, Enstore, OSM (site-dependent)
- Matchmaking service: Condor (U. Wisconsin)
- File replication index: MCAT (SDSC)
- Transfer cost estimation service: Globus (ANL)
- File fetching service: components of OOFS
- File mover(s): SRB (SDSC); site-specific
- End-to-end network services: Globus tools for QoS reservation
- Security and authentication: Globus (ANL)

19 Condor Matchmaking: A Resource Allocation Paradigm
- Parties use ClassAds to advertise properties, requirements and ranking to a matchmaker
- ClassAds are self-describing (no separate schema)
- ClassAds combine query and data (a toy matchmaking sketch follows below)
[Diagram: the Application and its Application Agent act through a Customer Agent; the resource's Owner Agent and local resource management advertise into the matchmaking environment]
http://www.cs.wisc.edu/condor
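A toy sketch of two-sided matchmaking using Python dictionaries in place of real ClassAds; the attribute names and the requirements/rank callables are illustrative and are not Condor's actual ClassAd language:

    # Each "ad" carries its own attributes plus requirements and rank expressions
    # evaluated against the other party's ad (self-describing, schema-free).
    job_ad = {
        "ImageSize": 512,
        "requirements": lambda other: other["Memory"] >= 512 and other["Arch"] == "x86_64",
        "rank": lambda other: other["KFlops"],          # prefer faster machines
    }

    machine_ads = [
        {"Name": "slot1", "Arch": "x86_64", "Memory": 1024, "KFlops": 9.0e5,
         "requirements": lambda other: other["ImageSize"] <= 1024, "rank": lambda other: 0},
        {"Name": "slot2", "Arch": "x86_64", "Memory": 256, "KFlops": 2.0e6,
         "requirements": lambda other: True, "rank": lambda other: 0},
    ]

    def matchmake(job, machines):
        """Return a machine both sides accept, preferring the highest job-side rank."""
        acceptable = [m for m in machines
                      if job["requirements"](m) and m["requirements"](job)]
        return max(acceptable, key=job["rank"], default=None)

    match = matchmake(job_ad, machine_ads)
    print(match["Name"] if match else "no match")   # -> slot1 (slot2 has too little memory)

The essential feature carried over from ClassAds is that both the job and the resource express constraints and preferences against each other, and the matchmaker only evaluates them.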

20 Agents for Remote Execution in Condor
[Diagram: on the submission side, the Application, Application Agent and Customer Agent manage the request queue, the data & object files and the checkpoint files; on the execution side, the Owner Agent and Execution Agent run the Application Process from the delivered object files, with remote I/O and checkpointing performed back to the submission machine]

21 Beyond Traditional Architectures: Mobile Agents (Java Aglets)
"Agents are objects with rules and legs" -- D. Taylor
Mobile agents:
- Execute asynchronously
- Reduce network load: local conversations
- Overcome network latency, and some outages
- Adaptive -> robust, fault tolerant
- Naturally heterogeneous
- Extensible concept: agent hierarchies

22 Using the Globus Tools
- Tests with "gsiftp", a modified ftp server/client that allows control of the TCP buffer size (a sketch of this tuning knob follows below)
- Transfers of Objectivity database files from the Exemplar to:
  - Itself
  - An O2K at Argonne (via CalREN2 and Abilene)
  - A Linux machine at INFN (via the US-CERN transatlantic link)
- Target /dev/null, in multiple streams (1 to 16 parallel gsiftp sessions)
- Measured aggregate throughput as a function of the number of streams and the send/receive buffer sizes
  - ~25 MB/sec on the HiPPI loop-back
  - ~4 MB/sec to Argonne after tuning the TCP window size, saturating the available bandwidth to Argonne
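The key knob in these tests is the TCP send/receive buffer (window) size, which is what gsiftp exposes; a minimal sketch of setting it on a plain Python socket, with an illustrative bandwidth-delay-product calculation (the link speed and round-trip time are example numbers, not measurements from the talk):

    import socket

    def tuned_socket(bufsize):
        """Create a TCP socket with enlarged send/receive buffers (set before connecting)."""
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bufsize)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bufsize)
        return s

    # The buffer should cover the bandwidth-delay product of the path; e.g. for an
    # OC-12 (~622 Mbit/s) link with a 60 ms round-trip time:
    bdp_bytes = int(622e6 / 8 * 0.060)          # ~4.7 MBytes
    sock = tuned_socket(bdp_bytes)
    print("requested buffer:", bdp_bytes, "granted:",
          sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    sock.close()

    # Several parallel streams (as in the 1-16 gsiftp sessions above), each with a
    # smaller buffer, provide a similar aggregate window on paths where a single
    # large buffer is capped by the operating system.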

23 Distributed Data Delivery and LHC Software Architecture
Software architectural choices:
- Traditional, single-threaded applications
  - Wait for data location, arrival and reassembly
OR
- Performance-oriented (complex)
  - I/O requests up front; multi-threaded; data-driven; respond to an ensemble of (changing) cost estimates
  - Possible code movement as well as data movement
  - Loosely coupled, dynamic
(a sketch of the data-driven style follows below)
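A minimal sketch of the "I/O requests up front, data-driven" style, using a thread pool so that processing proceeds in whatever order the data arrives; the fetch delay is simulated and the object names are placeholders:

    from concurrent.futures import ThreadPoolExecutor, as_completed
    import random
    import time

    def fetch(obj_id):
        """Stand-in for a remote object/file fetch with variable delivery cost."""
        time.sleep(random.uniform(0.05, 0.3))
        return obj_id, b"payload"

    def process(obj_id, data):
        return f"processed {obj_id} ({len(data)} bytes)"

    object_ids = [f"event-{i}" for i in range(8)]

    # Issue all I/O requests up front, then consume results in arrival order,
    # instead of blocking on each object in sequence as a single-threaded
    # application would.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(fetch, oid): oid for oid in object_ids}
        for fut in as_completed(futures):
            oid, data = fut.result()
            print(process(oid, data))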

24 GriPhyN Foundation
Build on the distributed-system results of the GIOD, MONARC, NILE, Clipper/GC and PPDG projects
Long-term vision in three phases:
1. Read/write access to high-volume data and processing power
   - Condor/Globus/SRB + NetLogger components to manage jobs and resources
2. WAN-distributed data-intensive Grid computing system
   - Tasks move automatically to the "most effective" node in the Grid
   - Scalable implementation using mobile-agent technology
3. "Virtual Data" concept for multi-PB distributed data management, with large-scale agent hierarchies
   - Transparently match data to sites, manage data replication or transport, co-schedule data & compute resources
Build on VRVS developments for remote collaboration

25 GriPhyN/APOGEE: Production Design of a Data Analysis Grid
Instrumentation, simulation, optimization, coordination
- SIMULATION of a production-scale Grid hierarchy
  - Provide a toolset for HENP experiments to test and optimize their data analysis and resource usage strategies
- INSTRUMENTATION of Grid prototypes
  - Characterize the Grid components' performance under load
  - Validate the simulation
  - Monitor, track and report system state, trends and "events"
- OPTIMIZATION of the Data Grid
  - Genetic algorithms, or other evolutionary methods (a toy sketch follows below)
  - Deliver an optimization package for HENP distributed systems
  - Applications to other experiments; accelerator and other control systems; other fields
- COORDINATE with experiment-specific projects: CMS, ATLAS, BaBar, Run2
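As a toy illustration of the evolutionary-optimization idea, a short evolutionary-search sketch (selection plus mutation) that looks for a disk-cache split across a few sites maximizing an assumed hit-value model; the site names, demand figures and objective function are all invented for illustration:

    import random

    SITES = ["Tier1", "Tier2a", "Tier2b", "Tier2c"]
    TOTAL_DISK = 100.0                    # TB to distribute (illustrative)
    DEMAND = [50.0, 25.0, 15.0, 10.0]     # assumed relative demand per site

    def fitness(alloc):
        """Toy objective: diminishing-returns cache value per site."""
        return sum(d * (a / (a + 10.0)) for a, d in zip(alloc, DEMAND))

    def random_alloc():
        w = [random.random() for _ in SITES]
        s = sum(w)
        return [TOTAL_DISK * x / s for x in w]

    def mutate(alloc, step=5.0):
        i, j = random.sample(range(len(alloc)), 2)
        delta = min(step * random.random(), alloc[i])
        child = list(alloc)
        child[i] -= delta                 # move some disk from site i to site j
        child[j] += delta
        return child

    population = [random_alloc() for _ in range(30)]
    for generation in range(200):
        population.sort(key=fitness, reverse=True)
        parents = population[:10]                      # keep the fittest allocations
        population = parents + [mutate(random.choice(parents)) for _ in range(20)]

    best = max(population, key=fitness)
    print({s: round(a, 1) for s, a in zip(SITES, best)}, round(fitness(best), 2))

A real optimization package would evaluate candidate strategies against the simulated Grid hierarchy rather than a closed-form toy objective, but the search loop has the same shape.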

26 Grid (IT) Issues to be Addressed
- Dataset compaction; data caching and mirroring strategies
  - Using large time-quanta or very high bandwidth bursts for large data transactions
- Query estimators, query monitors (cf. GCA work)
  - Enable flexible, resilient prioritisation schemes (marginal utility)
  - Query redirection, fragmentation, priority alteration, etc.
- Pre-emptive and real-time data/resource matchmaking
  - Resource discovery
    - Data and CPU location brokers
  - Co-scheduling and queueing processes
- State, workflow, and performance-monitoring instrumentation; tracking and forward prediction
- Security: authentication (for resource allocation/usage and priority); running a certificate authority

27 CMS Example: Data Grid Program of Work (I)
- FY 2000
  - Build basic services; "1 million event" samples on proto-Tier2s
    - For HLT milestones and detector/physics studies with ORCA
  - MONARC Phase 3 simulations for study/optimization
- FY 2001
  - Set up the initial Grid system, based on PPDG deliverables, at the first Tier2 centers and Tier1-prototype centers
    - High-speed site-to-site file replication service
    - Multi-site cached file access
  - CMS Data Challenges in support of the DAQ TDR
  - Shakedown of preliminary PPDG (+ MONARC and GIOD) system strategies and tools
- FY 2002
  - Deploy the Grid system at the second set of Tier2 centers
  - CMS Data Challenges for the Software and Computing TDR and the Physics TDR

28 Data Analysis Grid Program of Work (II)
- FY 2003
  - Deploy Tier2 centers at the last set of sites
  - 5%-scale Data Challenge in support of the Physics TDR
  - Production-prototype test of the Grid hierarchy system, with the first elements of the production Tier1 center
- FY 2004
  - 20% production (online and offline) CMS Mock Data Challenge, with all Tier2 centers and a partly completed Tier1 center
  - Build the production-quality Grid system
- FY 2005 (Q1 - Q2)
  - Final production CMS (online and offline) shakedown
  - Full distributed-system software and instrumentation
  - Using the full capabilities of the Tier2 and Tier1 centers

29 Summary
- The HENP/LHC data handling problem
  - Multi-Petabyte scale, binary pre-filtered data, resources distributed worldwide
  - Has no analog now, but will be increasingly prevalent in research and industry by ~2005
- Development of a robust PB-scale networked data access and analysis system is mission-critical
- An effective partnership exists, HENP-wide, through many R&D projects:
  - RD45, GIOD, MONARC, Clipper, Globus, Condor, ALDAP, PPDG, ...
- An aggressive R&D program is required to develop
  - Resilient "self-aware" systems for data access, processing and analysis across a hierarchy of networks
  - Solutions that could be widely applicable to data problems in other scientific fields and industry, by LHC startup
- Focus on Data Grids for next-generation physics

30 LHC Data Models: 1994-2000
- HEP data models are complex!
  - Rich hierarchy of hundreds of complex data types (classes)
  - Many relations between them
  - Different access patterns (multiple viewpoints)
- OO technology
  - OO applications deal with networks of objects (and containers)
  - Pointers (or references) are used to describe relations (a sketch of such an object network follows below)
- Existing solutions do not scale
  - Solution suggested by RD45: an ODBMS coupled to a Mass Storage System
- Construction of "compact" datasets for analysis: rapid access/navigation/transport
[Diagram: Event -> TrackList -> Tracks -> HitList -> Hits; Event -> Tracker, Calorimeter]
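A minimal sketch of such a network of objects and references, with class names taken from the slide's diagram; the attributes and values are illustrative:

    class Hit:
        def __init__(self, position, energy):
            self.position = position
            self.energy = energy

    class Track:
        def __init__(self, momentum, hits):
            self.momentum = momentum
            self.hits = hits              # reference to a list ("HitList") of Hit objects

    class Event:
        def __init__(self, event_id, tracks):
            self.event_id = event_id
            self.tracks = tracks          # "TrackList": relations expressed as references

    # Navigating the object network follows references, with no explicit joins:
    event = Event(
        event_id=42,
        tracks=[Track(momentum=12.3, hits=[Hit((0.1, 0.2, 0.3), 0.05),
                                           Hit((0.4, 0.5, 0.6), 0.08)])],
    )
    total_energy = sum(h.energy for t in event.tracks for h in t.hits)
    print(f"event {event.event_id}: {total_energy:.2f} in track hits")

An ODBMS persists exactly this kind of reference graph, which is why navigation-style access maps onto it more naturally than onto flat relational tables.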

31 Web-Based Server-Farm Networks Circa 2000: Content Delivery Networks (CDN)
Dynamic (Grid-like) content delivery engines: Akamai, Adero, Sandpiper
- 1200 -> thousands of network-resident servers
  - 25 -> 60 ISP networks
  - 25 -> 30 countries
  - 40+ corporate customers
  - ~$25 B capitalization
- Resource discovery
  - Build a "weathermap" of the server network (state tracking)
  - Query estimation; matchmaking/optimization; request rerouting
  - Virtual IP addressing: one address per server farm
- Mirroring, caching
- (1200-server) autonomous-agent implementation

32 Strawman Tier 2 Evolution
                             2000                2005
  Linux Farm:                1,200 SI95          20,000 SI95 [*]
  Disks on CPUs:             4 TB                50 TB
  RAID Array:                1 TB                30 TB
  Tape Library:              1-2 TB              50-100 TB
  LAN Speed:                 0.1 - 1 Gbps        10 - 100 Gbps
  WAN Speed:                 155 - 622 Mbps      2.5 - 10 Gbps
  Collaborative              MPEG2 VGA           Realtime HDTV
  Infrastructure:            (1.5 - 3 Mbps)      (10 - 20 Mbps)
[*] Reflects lower Tier 2 component costs due to less demanding usage. Some of the CPU will be used for simulation.

33 US-CMS S&C Spending Profile
[Chart: spending profile by year; 2006 is a model year for the operations phase of CMS]

34 GriPhyN Cost
- System support: $8.0 M
- R&D: $15.0 M
- Software: $2.0 M
- Tier 2 networking: $10.0 M
- Tier 2 hardware: $50.0 M
- Total: $85.0 M

35 Grid Hierarchy Concept: Broader Advantages
- Partitioning of users into "proximate" communities for support, troubleshooting and mentoring
- Partitioning of facility tasks, to manage and focus resources
- Greater flexibility to pursue different physics interests, priorities, and resource allocation strategies by region
  - Lower tiers of the hierarchy -> more local control

36 Storage Request Brokers (SRB)
- Name transparency: access to data by attributes stored in an RDBMS (MCAT)
- Location transparency: logical collections (by attributes) spanning multiple physical resources
- Combined location and name transparency means that datasets can be replicated across multiple caches and data archives (PPDG)
- Data-management protocol transparency: SRB with custom-built drivers in front of each storage system
  - The user does not need to know how the data is accessed; SRB deals with the local file system managers
- SRBs (agents) authenticate themselves and users using the Grid Security Infrastructure (GSI)
(a toy sketch of attribute-based resolution follows below)
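A toy sketch of the name/location-transparency idea: a small in-memory catalog standing in for MCAT, resolving an attribute query to logical objects and then to whichever physical replica is preferred. All names, attributes, URLs and the "prefer disk over tape" rule are invented for illustration:

    # Toy metadata catalog: logical objects described by attributes,
    # each with replicas on different physical storage systems.
    CATALOG = [
        {"logical": "run1234/ev000-099", "experiment": "CMS", "type": "AOD",
         "replicas": [("hpss://tape.site-a/x1", "tape"), ("file://disk.site-b/x1", "disk")]},
        {"logical": "run1234/ev100-199", "experiment": "CMS", "type": "AOD",
         "replicas": [("hpss://tape.site-a/x2", "tape")]},
    ]

    def query(**attrs):
        """Select logical objects by attribute values (name transparency)."""
        return [e for e in CATALOG if all(e.get(k) == v for k, v in attrs.items())]

    def resolve(entry, prefer="disk"):
        """Pick a physical replica; the caller never sees which storage driver is used."""
        for url, medium in entry["replicas"]:
            if medium == prefer:
                return url
        return entry["replicas"][0][0]          # fall back to whatever replica exists

    for entry in query(experiment="CMS", type="AOD"):
        print(entry["logical"], "->", resolve(entry))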

37 Role of Simulation for Distributed Systems
Simulations are widely recognized and used as essential tools for the design, performance evaluation and optimisation of complex distributed systems
- From battlefields to agriculture; from the factory floor to telecommunications systems
- Discrete-event simulations with an appropriate, high level of abstraction
- Just beginning to be part of the HEP culture
  - Some experience in trigger, DAQ and tightly coupled computing systems: CERN CS2 models (event-oriented)
  - MONARC (process-oriented; Java 2 threads + class library)
- These simulations are very different from HEP "Monte Carlos"
  - "Time" intervals and interrupts are the essentials (see the sketch below)
Simulation is a vital part of the study of site architectures, network behavior, and data access/processing/delivery strategies, for HENP Grid design and optimization
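A minimal discrete-event simulation sketch in Python, to show the style of model meant here: an event queue ordered by simulated time, with jobs arriving at a site and competing for a fixed number of CPUs. The arrival rate, service time and capacity are arbitrary illustration values, not MONARC parameters:

    import heapq
    import random

    CPUS, SIM_END = 4, 1000.0            # illustrative site capacity and time horizon
    events = []                           # (time, sequence, kind), ordered by simulated time
    seq = 0

    def schedule(t, kind):
        global seq
        heapq.heappush(events, (t, seq, kind))
        seq += 1

    busy, queued, completed = 0, 0, 0
    schedule(random.expovariate(0.05), "arrival")

    while events:
        now, _, kind = heapq.heappop(events)
        if now > SIM_END:
            break
        if kind == "arrival":
            schedule(now + random.expovariate(0.05), "arrival")       # next arrival
            if busy < CPUS:
                busy += 1
                schedule(now + random.expovariate(1 / 15.0), "done")  # ~15 time-unit jobs
            else:
                queued += 1
        else:                              # a job finished; start a queued one if any
            completed += 1
            if queued > 0:
                queued -= 1
                schedule(now + random.expovariate(1 / 15.0), "done")
            else:
                busy -= 1

    print(f"completed {completed} jobs; {queued} still queued at t={SIM_END}")

Unlike a physics Monte Carlo, nothing here samples detector physics; the "events" are purely scheduling events, and what matters is the ordering of times and interrupts.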

38 Monitoring Architecture: Use of NetLogger in Clipper
End-to-end monitoring of Grid assets is necessary to:
- Resolve network throughput problems
- Dynamically schedule resources
Approach:
- Add precision-timed event monitor agents to:
  - ATM switches
  - Storage servers
  - Testbed computational resources
- Produce trend-analysis modules for the monitor agents
- Make the results available to applications
(a sketch of a precision-timestamped monitor event follows below)
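A minimal sketch of the precision-timestamped monitoring-event idea: an agent emitting one record per event with a microsecond wall-clock timestamp, in a NetLogger-like key=value style. The field names are illustrative and not the exact NetLogger schema:

    import socket
    import time
    from datetime import datetime, timezone

    HOSTNAME = socket.gethostname()

    def log_event(event, **fields):
        """Emit one monitoring record with a microsecond-resolution timestamp."""
        ts = datetime.now(timezone.utc).isoformat(timespec="microseconds")
        extras = " ".join(f"{k}={v}" for k, v in fields.items())
        line = f"DATE={ts} HOST={HOSTNAME} EVENT={event} {extras}".strip()
        print(line)            # in practice: append to a log file or send to a collector
        return line

    # Bracketing an operation with start/end events lets a trend-analysis module
    # correlate timestamps across hosts and locate the end-to-end bottleneck.
    log_event("disk_read.start", file="run1234.db")
    t0 = time.perf_counter()
    time.sleep(0.12)           # stand-in for the actual disk or network operation
    elapsed = time.perf_counter() - t0
    log_event("disk_read.end", file="run1234.db", seconds=round(elapsed, 6))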

