SAM Resource Management
Lee Lueking, FNAL
CHEP 2001, September 3-8, 2001, Beijing, China
September 4, 2001
Intro to SAM
SAM is Sequential Access to data via Meta-data.
The project started in 1997 to handle D0’s needs for the Run II data system.
Current SAM team includes:
–Lauri Loebel-Carpenter, Lee Lueking*, Carmenita Moore, Igor Terekhov, Julie Trumbo, Sinisa Veseli, Matthew Vranicar, Stephen P. White, Victoria White* (*project leaders)
Overview
Goals of Resource Management
Users, Groups and Access modes
Resources and Resource Management Strategies
Implementation
–System Configuration
–Rules and Policies
–Disk Cache Management
–Fair Share scheduling
–Resource Co-allocation
Plans and Conclusion
Goals of Resource Management
Implement experiment policies on prioritization and fair sharing of resource usage, by user category (access mode, research group, etc.).
Maximize throughput in terms of real work done (i.e. user jobs, not system-internal jobs such as data transfers).
Groups
Users whose datasets, processing styles and goals are largely shared. Defined by:
–physics topics, like Higgs, Top, W/Z, B, QCD, and New Phenomena
–detector elements, like calorimeter, silicon tracking, muon, and so on
–particle identification, like jets, electron, muon, and tau
Users must be registered, and each individual can belong to many groups.
Access Modes
Storage
–Data acquisition storage
–Monte Carlo data storage
–General User data storage
Delivery
–Frequently accessed data
–Cooperative access and processing
–Data file delivery on demand
–Random access event selection
Resources
Tape mounts
Tape volume access
Tape drive usage
Network throughput
Disk cache
Processing CPU
Memory cache
Management Strategies
Divide the problem into a three-tier hierarchy: Local (station), Site, Global.
Hardware Configuration: Mass Storage System (ATL) access, Network, Disk assignments.
Establish Rules: Group allocations, Access mode priorities, Data routing paths, Type of processing, etc.
Algorithms to combine rules.
The Hierarchy of Resource Managers
[Diagram: three levels of resource manager. Global RM: sites connected by WAN. Site RM: stations and MSS’s connected by LANs. Station (Local RM): batch queues and disks. Inputs: Experiment Policies, Fair Share Allocations, Cost Metrics.]
Implementation
Overview of SAM
[Diagram: Database Server(s) (to Central DB), Name Server, Site or Global Resource Manager(s), Log Server, Station 1 through Station n Servers, Mass Storage System(s). Legend: Shared Globally, Local, Shared Locally. Arrows indicate control and data flow.]
The SAM Station
Responsibilities
–Cache Management
–Project (Job) Management
–Movement of data files to/from MSS or other Stations
Consists of a set of inter-communicating servers:
–Station Master Server
–File Storage Server
–File Stager(s)
–Project (Job) Manager(s)
Components of a SAM Station
[Diagram: Station & Cache Manager, File Storage Server, File Stager(s), Project Managers, Producers/Consumers, eworkers, File Storage Clients, Cache Disk, Temp Disk, MSS or Other Station (shown twice). Legend: Data flow, Control.]
Station Configuration
Disks assigned to the cache
Batch system used
Batch queues available
Batch queue depth
Processing capacity: CPU and physical memory
Mass Storage Systems available
Inter-station transfer mechanism: BBFTP, rcp
Disk accessibility for distributed cluster
Network connection, bandwidth, subnet for each machine
Security issues, access to Kerberos tickets, etc.
Waits, timeouts and retries on failure conditions
Rules and Policies
Disk cache allocated to each group
Disk cache refreshment algorithm for each group: LRU, FIFO, etc.
Minimum amount of data to deliver at a time from each tape for a project
Order in which files are brought into the cache
Through which station files will be routed when retrieving from a particular Mass Storage System
Which data access activities have the highest priority
Which data storing activities have the highest priority
To which MSS’s files are stored, and to which tapes
Sharing of the resources of a station among groups
Which users belong to which groups
How many projects per group are allowed
What processing activities are allowed on each station? *
To which stations should data access and processing activities be sent? *
How should the resources of a local cluster of stations be shared among groups? *
* Currently done by administrators
Station Management
Caches
–Allocations established for groups on each station.
–Resources are allocated by group: Total Size; Lock (pin) Size; Refresh algorithm: LRU, FIFO, …
–No rigid assignment to particular physical disks.
Projects
–Number of concurrent projects for each group, on each station.
Administration is by authorized users only
–Station admins
–Group admins
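The per-group cache allocation described on this slide (total size, lock size, LRU refresh) can be illustrated with a minimal Python sketch. This is not SAM's implementation: the class name `GroupCache`, its methods, and the use of kilobyte units are all assumptions made for illustration.

```python
from collections import OrderedDict


class GroupCache:
    """Hypothetical per-group cache: total quota, lock (pin) quota, LRU refresh."""

    def __init__(self, max_kb, max_locked_kb):
        self.max_kb = max_kb                # total size allocated to the group
        self.max_locked_kb = max_locked_kb  # how much of it may be pinned
        self.files = OrderedDict()          # name -> size in KB, least recently used first
        self.locked = set()                 # names of pinned files

    def used_kb(self):
        return sum(self.files.values())

    def locked_kb(self):
        return sum(self.files[name] for name in self.locked)

    def touch(self, name):
        """Mark a cached file as most recently used."""
        self.files.move_to_end(name)

    def lock(self, name):
        """Pin a cached file if the group's lock quota allows it."""
        if name in self.locked:
            return True
        if self.locked_kb() + self.files[name] > self.max_locked_kb:
            return False
        self.locked.add(name)
        return True

    def unlock(self, name):
        self.locked.discard(name)

    def add(self, name, size_kb):
        """Cache a file, evicting unlocked files in LRU order.

        Returns the list of evicted names, or None if the file cannot fit
        (in which case nothing is evicted).
        """
        if name in self.files:
            self.touch(name)
            return []
        free = self.max_kb - self.used_kb()
        victims = []
        for candidate, candidate_size in self.files.items():
            if free >= size_kb:
                break
            if candidate not in self.locked:
                victims.append(candidate)
                free += candidate_size
        if free < size_kb:
            return None
        for victim in victims:
            del self.files[victim]
        self.files[name] = size_kb
        return victims
```

Because the quota is a number rather than a disk, the sketch mirrors the slide's point that allocations are not tied to particular physical disks. A FIFO policy would differ only in that `touch` would leave the order unchanged.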
Station Administration: Dump (1)
% sam dump station –groups
*** BEGIN DUMP STATION central-analysis, id=21 running at d0mino 5 days 22 hours 24 minutes 20 seconds, admins: lueking
Known batch systems: lsf
Default batch system: lsf
No Source location is preferred
There are 1 authorized transfer groups
Full delivery unit is enforced; external deliveries are unconstrained
Station Administration: Dump (2)
AUTHORIZED GROUPS:
group algo: admins: cope lueking melanson terekhov veseli white, swap policy: LRU, fair share: 0, quotas (cur/max): projects = 5/50, disk: KB/ KB, locks:0B/ KB
group cal: admins: lueking terekhov veseli white, swap policy: LRU, fair share: 0, quotas (cur/max): projects = 1/10, disk: KB/78125MB, locks:0B/78125MB
group demo: admins: lueking terekhov veseli white, swap policy: LRU, fair share: , quotas (cur/max): projects = 2/50, disk: KB/ KB, locks:0B/0KB
group dzero: admins: lueking melanson terekhov veseli white, swap policy: LRU, fair share: , quotas (cur/max): projects = 10/100, disk: KB/ KB, locks:0B/ KB
group emid: admins: lueking terekhov veseli white, swap policy: LRU, fair share: 0, quotas (cur/max): projects = 0/10, disk: KB/ KB, locks:0B/ KB
group test: admins: lueking terekhov veseli white, swap policy: LRU, fair share: , quotas (cur/max): projects = 1/20, disk: KB/ KB, locks:237179KB/ KB
group thumbnail: admins: lueking melanson schellma, swap policy: LRU, fair share: , quotas (cur/max): projects = 0/5, disk: KB/ KB, locks:0B/0KB
*** END OF STATION DUMP ***
Adding Data to the System
Metadata descriptions for:
–Detector data
–Monte Carlo data
–Processing details
Mapping to storage locations (which we call auto-destinations)
Station forwarding specification
Forwarding + Caching = Global Replication
[Diagram: User (producer), Station, Mass Storage System, Replica, Site, WAN, Data flow. Labels: NIKHEF (Amsterdam), Sara, 155 Mbps, Fermilab, D0robot.]
Routing + Caching = Global Replication
[Diagram: User (producer), Station, Mass Storage System, Replica, Site, WAN, Data flow.]
Resource Management Approaches
Fair Sharing (policies)
–Allocation of resources and scheduling of jobs
–The goal is to ensure that, in a busy environment, each abstract user gets a fixed share of “resources” or gets a fixed share of “work” done
Co-allocation and reservation (optimization)
Fair Share and Computational Economy
Jobs, when executed, incur costs (through resource utilization) and realize benefits (through getting work done).
Maintain a tuple (vector) of cumulative costs/benefits for each abstract user and compare it to that user's allocated fair share to raise or lower priority.
Incorporate all known resource types and benefit metrics; the scheme is fully flexible.
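As a rough illustration of the cost-vector idea, the Python sketch below keeps cumulative costs per abstract user and ranks users by how far their consumed fraction falls below their allocated share. The slides do not give the actual formula, so the weighted sum, the priority definition (share minus consumed fraction), and all names here (`Account`, `charge`, `priorities`, `next_user`, the resource keys) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical record for one abstract user (e.g. a group or access mode)."""
    share: float                                # allocated fair share, as a fraction
    usage: dict = field(default_factory=dict)   # resource name -> cumulative cost


def charge(account, costs):
    """Add the costs incurred by a finished job to the cumulative vector."""
    for resource, amount in costs.items():
        account.usage[resource] = account.usage.get(resource, 0.0) + amount


def weighted_cost(account, weights):
    """Collapse the cost vector to one number using per-resource weights."""
    return sum(weights.get(r, 0.0) * c for r, c in account.usage.items())


def priorities(accounts, weights):
    """Priority = allocated share minus fraction of total cost consumed so far."""
    total = sum(weighted_cost(a, weights) for a in accounts.values())
    result = {}
    for name, account in accounts.items():
        consumed = weighted_cost(account, weights) / total if total else 0.0
        result[name] = account.share - consumed
    return result


def next_user(accounts, weights):
    """Pick the user furthest below its fair share."""
    ranked = priorities(accounts, weights)
    return max(ranked, key=ranked.get)
```

For example, with two groups at a share of 0.5 each and tape mounts weighted ten times as heavily as CPU hours, a group that has used three tape mounts and ten CPU hours ranks below a group that has used only ten CPU hours. Changing the weights is how a site would express which resource is currently scarce.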
Job Control: Station Integration with the Abstract Batch System
[Diagram: Client, Local RM (Station Master), Batch System, Process Manager (SAM wrapper script), User Task, Job Manager (Project Master). Arrow labels: sam submit, submit, dispatch, invoke, SAM condition satisfied, resubmit, setJobCount/stop, invoke, jobEnd.]
1. Fair Share Job Scheduling
2. Resource Co-allocation
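One possible reading of the "SAM condition satisfied" and "resubmit" arrows is that the wrapper script, once dispatched by the batch system, starts the user task only when the resources it needs are available together, and otherwise puts the job back in the queue. The Python sketch below encodes that reading. The specific condition (all needed files cached and a free job slot) and the function name `wrapper_step` are assumptions, not taken from the slides.

```python
def wrapper_step(needed_files, cached_files, running_jobs, job_limit):
    """Hypothetical decision made by the wrapper script after dispatch.

    Returns "invoke" when the assumed co-allocation condition holds
    (data is in the cache and a job slot is free), else "resubmit".
    """
    data_ready = set(needed_files) <= set(cached_files)
    slot_free = running_jobs < job_limit
    if data_ready and slot_free:
        return "invoke"
    return "resubmit"
```

For instance, `wrapper_step(["f1", "f2"], ["f1"], 0, 5)` yields `"resubmit"` because `f2` has not been staged yet, which keeps a CPU slot from sitting idle while it waits for tape.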
Future Plans
Tape mounts were a critical resource in the past, but inter-station movement of data is expected to become a constraint as more stations are deployed with large disk caches.
In addition to moving the data to computing resources, the system will evolve to move the processing to the data.
A job control language will specify each task at a level that allows the system to decide when and where it can optimally be processed.
Incorporate standard grid components as availability and need dictate: GridFTP, GSI, Condor, DAGMan, etc.
Conclusion
The SAM system used for D0 data management and access represents a large step toward a global data grid.
Resources are managed at station, site and global levels.
The system is governed by station configuration and rules/policies.
Fair share resource allocation and scheduling control the amount of work done by each group, access mode, etc.
Co-allocation coordinates data and processing to use the overall system most effectively.