CDF Data Handling: Resource Management and Tests
E. Buckley-Geer, S. Lammel, F. Ratnikov, T. Watts
CHEP2000, Paper 368, 9 February 2000


Outline
- Hardware and Resources
- Organization of Data
- User View of Access to Data
- Batch Queues
- Disk Management
- Tests

Hardware Resources; Organization of Data

Hardware resources:
- CPU resource: a mixed-flavor Unix cluster.
- Disk resource: Fibre Channel disk arrays, currently on each node of the cluster.
- Tape drive resource: tape drives and a robotic tape library, with drives connected directly to each node.
- This talk concentrates on the resources used during reading of data.

Organization of data:
- Datasets, filesets, and files of 1 GB.
- Datasets: raw, primary, secondary…
- Tapes store a group of filesets.
- The associations are kept in the Datafile Catalog (see Paper 367).

User View

User View of Access to Data
- Batch queues manage CPU cycles.
- Data are accessed only from disk, not from tape.
- Staging jobs run in parallel.
- A disk inventory manager package handles the shared disk space.
- Batch queues manage the tape drives.

Batch Queues
- LSF (Platform Computing) is proposed.
- Fairshare scheduling.
- Combined quotas across queues are desirable.
- CPU queues for analysis jobs: allocate CPU cycles by group, by user, and by special project.
- I/O queues for staging jobs: input, output, and event pick.
  - 1 tape drive per queue slot.
  - The CPU use of an I/O job is proportional to its data volume.
  - Allocate drives and data volume by group, user, and project.
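The idea of allocating a resource by group can be illustrated with a toy fairshare dispatcher. This is only a sketch under stated assumptions: it is not LSF's fairshare algorithm, and the names `next_group`, `schedule`, `shares`, and `usage` are hypothetical. Each group holds a number of shares, and the next job is taken from the waiting group whose accumulated usage is smallest relative to its shares. The cost of a job could stand for CPU time in a CPU queue or for data volume in an I/O queue.

```python
def next_group(shares, usage):
    """Pick the group whose accumulated usage is lowest relative to its shares."""
    return min(shares, key=lambda g: usage.get(g, 0.0) / shares[g])


def schedule(shares, jobs):
    """Dispatch jobs one at a time.

    shares: group -> number of shares (hypothetical allocation).
    jobs:   group -> list of job costs; every group must appear in shares.
    Returns the dispatch order as a list of (group, cost) pairs.
    """
    usage = {g: 0.0 for g in shares}
    queues = {g: list(costs) for g, costs in jobs.items()}
    order = []
    while any(queues.values()):
        # Only groups that still have waiting jobs compete for the slot.
        waiting = {g: s for g, s in shares.items() if queues.get(g)}
        g = next_group(waiting, usage)
        cost = queues[g].pop(0)
        usage[g] += cost
        order.append((g, cost))
    return order


if __name__ == "__main__":
    # A group with twice the shares is dispatched about twice as often.
    print(schedule({"top": 2, "bottom": 1}, {"top": [1, 1, 1, 1], "bottom": [1, 1]}))
```

In this toy model a group that has used little of its allocation moves to the front, which is the behaviour the slide asks of the CPU and I/O queues.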

Disk Management
- Disk space is managed by fileset, which reduces bookkeeping overhead.
- Static filesets are allowed for important datasets.
- Filesets remain on disk until the space is needed.
- A use-reservation prevents deletion of a fileset.
- The delete algorithm looks at frequency of use and the time since the last use-reservation.
- The space-allocation algorithm uses quotas by group and user.
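The deletion rules above can be sketched as follows. The slide names the inputs (frequency of use, time since the last use-reservation) but not how they are combined, so the scoring formula here is an assumption, as are the names `Fileset` and `pick_victims`. Filesets with an active use-reservation and static filesets are never candidates.

```python
from dataclasses import dataclass


@dataclass
class Fileset:
    name: str
    size_gb: float
    use_count: int = 0          # how often the fileset has been reserved
    last_reserved: float = 0.0  # time of the most recent use-reservation
    reservations: int = 0       # number of active use-reservations
    static: bool = False        # static filesets are never deleted


def pick_victims(filesets, needed_gb, now):
    """Return filesets to delete to free at least needed_gb, or [] if impossible."""
    candidates = [f for f in filesets if f.reservations == 0 and not f.static]
    # Hypothetical score: long-idle, rarely used filesets are deleted first.
    candidates.sort(
        key=lambda f: (now - f.last_reserved) / (1 + f.use_count),
        reverse=True,
    )
    victims, freed = [], 0.0
    for f in candidates:
        if freed >= needed_gb:
            break
        victims.append(f)
        freed += f.size_gb
    return victims if freed >= needed_gb else []


if __name__ == "__main__":
    disk = [
        Fileset("a", 0.5, use_count=10, last_reserved=90),
        Fileset("b", 0.5, use_count=1, last_reserved=10),
        Fileset("c", 0.5, reservations=1),
        Fileset("d", 0.5, static=True),
    ]
    print([f.name for f in pick_victims(disk, 0.5, now=100)])
```

Returning an empty list when the request cannot be met leaves the decision to the caller, for example to wait until a use-reservation is released.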

User Job and Disk Management
- The user gives a dataset.
- The dataset is converted to a list of filesets.
- The stager manages the list and returns the next fileset when asked.

The stager is part of the user job. It:
- Maintains a small buffer of use-reservations to keep ahead of the analysis job.
- Adds use-reservations for filesets on disk, or spawns input staging jobs, to maintain the buffer.
- Releases the use-reservation when a fileset has been processed.
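The stager's behaviour can be illustrated with a small model. This is a sketch, not the CDF stager: the class and attribute names are hypothetical, the disk is modelled as a plain set, and a staging job is assumed to finish instantly so that the example stays self-contained. Preferring filesets that are already on disk is also an assumption of the sketch.

```python
from collections import deque


class Stager:
    """Toy stager that keeps a small buffer of use-reservations ahead of the job."""

    def __init__(self, filesets, on_disk, buffer_size=2):
        self.pending = deque(filesets)  # filesets not yet reserved
        self.on_disk = set(on_disk)     # hypothetical view of the staging disk
        self.buffer_size = buffer_size
        self.reserved = deque()         # filesets holding a use-reservation
        self.staging_requests = []      # input staging jobs spawned so far
        self.current = None             # fileset the analysis job is reading

    def _fill_buffer(self):
        while self.pending and len(self.reserved) < self.buffer_size:
            # Prefer a fileset already on disk; otherwise take the next in the list.
            choice = next(
                (f for f in self.pending if f in self.on_disk), self.pending[0]
            )
            self.pending.remove(choice)
            if choice not in self.on_disk:
                self.staging_requests.append(choice)  # stand-in for a staging job
                self.on_disk.add(choice)              # assume staging completes at once
            self.reserved.append(choice)

    def next_fileset(self):
        """Release the processed fileset and return the next one, or None when done."""
        self.current = None  # releases the use-reservation of the previous fileset
        self._fill_buffer()
        if not self.reserved:
            return None
        self.current = self.reserved.popleft()
        self._fill_buffer()  # top the buffer up again to stay ahead of the job
        return self.current


if __name__ == "__main__":
    stager = Stager(["f1", "f2", "f3", "f4"], on_disk={"f3"})
    while (fileset := stager.next_fileset()) is not None:
        print("processing", fileset)
```

With `f3` already on disk, the job reads `f3` first while `f1` is staged, so the analysis job rarely waits for tape as long as the buffer stays filled.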

Effects of Disk Management
- A job processes the filesets on disk first (different orders, different times).
- Multiple jobs using the same fileset share staging jobs.
- A fast analysis job gets multiple staging jobs.
- Only a fraction of a dataset is present on disk at one time, which conserves disk space.

Prototype Tests

The prototype consists of:
- A set of basic queues on a workstation (LSF).
- Basic staging software.
- Simulated analysis jobs that process dummy data.
- A set of big and small dummy datasets.
- Basic CDF Data Catalog software with contents for this simulation.

The purpose is to test ideas on resource management and to evaluate how analysis jobs interact in a resource-limited environment.

Prototype Scaled-Down Environment
- Single-CPU workstation, b0ib04.
- Staging disk of 9 GB.
- Filesets of size 0.5 GB.
- 4 small datasets of 1 GB, i.e. 10% of disk.
- 4 large datasets of 10 GB, i.e. 100% of disk.
- 2 CPU queues, short and long.
- Analysis jobs with variable CPU time.
- 4 execution slots for each CPU queue.
- 2 simulated tape drives (2 slots in the I/O queue).
- 1 real tape drive in the Emass robot.
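The sizes above fix how many filesets each dataset contains and how much of the staging disk it would occupy. The short calculation below works this out, assuming that the quoted 1 GB and 10 GB are the sizes of a single small and a single large dataset; the function name `dataset_footprint` is hypothetical.

```python
DISK_GB = 9.0      # staging disk
FILESET_GB = 0.5   # size of one fileset


def dataset_footprint(size_gb, fileset_gb=FILESET_GB, disk_gb=DISK_GB):
    """Return (number of filesets, fraction of the staging disk) for one dataset."""
    return round(size_gb / fileset_gb), size_gb / disk_gb


if __name__ == "__main__":
    print("disk holds", round(DISK_GB / FILESET_GB), "filesets")
    for kind, size_gb in {"small": 1.0, "large": 10.0}.items():
        n_filesets, fraction = dataset_footprint(size_gb)
        print(f"{kind}: {n_filesets} filesets, {fraction:.0%} of disk")
```

Under that assumption the disk holds 18 filesets, a small dataset is 2 filesets (about 11% of the disk), and a large dataset is 20 filesets (about 111% of the disk), so the quoted 10% and 100% are round figures. A large dataset cannot sit on the staging disk all at once.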

Simulation Scenarios

Purpose:
- Investigate the effect of patterns of use by the collaboration (CDF “spin” jobs, repetitive small-dataset jobs).
- Exercise data access features.

Choosing scenarios:
A. Short vs. long job competition.
B. Several jobs using the same big dataset (CDF “spin” jobs).
C. Competition for tape drives and disk space.

Some Scenarios Studied
- One long job vs. a stream of short jobs.
- Three long jobs on the same dataset (see figure).
- Ten long jobs on the same dataset.
- A mixed set of different long jobs and users (6 jobs, 6 users, 4 datasets).
- A stream of short jobs vs. 4 different long jobs.

The disk allowed 4 different big datasets to be processed together, as expected for this simulation. Extra staging jobs for the streams of short jobs occurred when expected, that is, when the short jobs competed against 4 or more different big datasets.

Trial 45: Three Long Jobs (figure)

Trial 29: Stream of shorts vs. 1 long, 1 short (figure)

Conclusions from Prototype Tests
- The DIM (disk inventory manager) and Stager worked well.
- The stager functions are appropriate and simple.
- The tests gave guidance for the full implementation (client/server structure, cleanup, admin functions).
- A limited test of LSF (batch queues) worked well.

Mock Data Challenge 1

During December 1999 and January 2000, CDF successfully tested the movement of MC simulated data from the online Level 3 trigger farm of processors to the tape library, and through the offline reconstruction farm back to the tape library. Many sub-groups were involved. The resource management methods discussed here were implemented and used, but they will not be stressed until the rate tests of Challenge 2 in spring 2000.

Summary
- Resource management methods were explained.
- Prototype tests were extolled.
- Full implementation of the methods is underway.
- More tests are to come.
- The CDF Engineering run occurs in August 2000.