Grid Data Management

A network of computers forming prototype grids currently operates across Britain and the rest of the world, working on the data challenges of particle physics. Future users of the Grid will require transparent scheduling and data management, provided by intelligent middleware. These users will be able to run a wide range of programs, each of which can run on any computer. Each type of program requires a selection of data files, which must be accessible from every site even though they can be stored anywhere on the Grid. To optimise the efficiency of the whole Grid, controlled, intelligent copying (replication) of data files is required. File usage patterns are logged in order to continuously optimise replica creation and deletion. Optimising replica placement reduces network load, minimises program execution times and makes the best use of storage.

Grid Simulation - Optor

To dynamically optimise the placement of replicas within the Grid, a Replica Optimiser is needed. It is currently being tested within a simulation of the Grid to check its performance and stability. In this demonstration, Optor is used to model aspects of the UK particle physics Grid in order to test strategies for dynamic replica optimisation.

The inputs to Optor are: the storage capacity and processing power available at each site; the network configuration between pairs of sites; program file definitions, including file sizes; and per-site policy definitions controlling which experiments' programs are allowed to run. In this simplified demonstration all sites have the same processing power and storage resources, all files are the same size, all network links have the same bandwidth, and input files are far larger than output files. Programs within Optor are described by a selection of files, which can be found at one or more sites on the Grid and must be copied to the local site; each program reads two or three selected files from an experiment's data set.

[Figure: Grid simulation sites within Britain, showing the number of parallel network connections between them.]

A day in the life of the Grid

When the simulation starts, master files are randomly placed across the Grid and initially there are no replicas. As time evolves, CPU resources at the individual sites run in parallel. A program running at one site reads its selected data, which may not (yet) be stored locally; files may then be retrieved from remote storage at another site on the Grid. Each remote retrieval introduces network traffic, which is monitored as the number of connections between sites at a given time. As network traffic builds up, bottlenecks may form, increasing the time taken to run programs. Initially the network load is high because the files are randomly scattered throughout the Grid, but as the simulation progresses the load falls. Program running times can be reduced by at least a factor of two using basic replica optimisation: the histograms show the time taken for programs to run when files are read remotely (yellow), and when replicas are made at the local site, deleting the oldest replicas when space is required (green).

[Figures: simulation flow charts for the overview, the resource broker and the computing element; program running times with and without optimisation.]
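To make the shape of the simulation inputs concrete, the sketch below models sites, network links and programs as simple data structures. All names, fields and values here are illustrative assumptions for this demonstration; they are not Optor's actual configuration format.

```python
# A minimal sketch of the kind of inputs Optor takes. All class, field
# and site names here are hypothetical, chosen only for illustration.
from dataclasses import dataclass, field

@dataclass
class Site:
    name: str
    storage_gb: float               # storage capacity available at the site
    cpu_power: float                # relative computing power
    allowed_experiments: set        # policy: whose programs may run here
    replicas: set = field(default_factory=set)  # files currently held locally

@dataclass
class Link:
    a: str                          # site names at each end of the link
    b: str
    bandwidth_mbps: float           # identical for every link in the demo

@dataclass
class Program:
    experiment: str
    input_files: list               # two or three files from the data set
    file_size_gb: float             # all files the same size in the demo

# In the simplified demonstration every site and every link is identical:
sites = [Site(n, storage_gb=100.0, cpu_power=1.0,
              allowed_experiments={"atlas", "cms"})
         for n in ("RAL", "Glasgow", "Manchester")]
links = [Link("RAL", "Glasgow", 100.0), Link("RAL", "Manchester", 100.0)]
```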
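The replication strategy behind the green histogram can be illustrated with a toy simulation: a program reads a file remotely unless a local replica already exists, and when local storage is full the oldest replica is deleted to make room. The file counts, capacity and uniform access pattern below are invented for illustration, and this is not the Optor code itself; with the skewed access patterns of real workloads the saving from replication would be larger.

```python
# A toy comparison of the two strategies in the histograms: reading files
# remotely every time versus replicating locally and deleting the oldest
# replica when space is required. All numbers are assumed, not measured.
import random
from collections import OrderedDict

N_FILES = 50        # master files scattered across the Grid (assumed)
SITE_CAPACITY = 10  # replicas one site can hold (assumed)
N_JOBS = 2000       # programs run at the local site (assumed)

def run(replicate: bool) -> int:
    """Return the number of remote transfers incurred by N_JOBS programs."""
    random.seed(1)
    # Insertion order is kept, so the first key is the oldest replica.
    local = OrderedDict()
    transfers = 0
    for _ in range(N_JOBS):
        # Each program reads two or three distinct files from the data set.
        for f in random.sample(range(N_FILES), random.choice((2, 3))):
            if f in local:
                continue                      # served from a local replica
            transfers += 1                    # remote read adds network traffic
            if replicate:
                if len(local) >= SITE_CAPACITY:
                    local.popitem(last=False) # delete the oldest replica
                local[f] = True               # keep a local replica
    return transfers

print("remote reads only:", run(replicate=False))
print("with replication: ", run(replicate=True))
```

Running the sketch prints the number of remote transfers under each strategy; fewer transfers means less network traffic and, in the full simulation, shorter program running times.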