JOINT INSTITUTE FOR NUCLEAR RESEARCH
OFF-LINE DATA PROCESSING GRID-SYSTEM MODELLING FOR NICA
Nechaevskiy A.
Dubna, 2012
AGENDA
- NICA off-line data processing parameters
- Tasks for simulation
- Simulation platform choice
- Model efficiency estimation
- First results
- Conclusion
DATA PROCESSING SCHEMA FOR NICA MPD
NICA’s data flow parameters:
- high event generation rate (up to 6 kHz);
- about 1000 particles are produced in a central Au-Au collision;
- the file with simulated detector information for events occupies about 5 TB.
MPD parameters
No. | Parameter | Value
1 | Data collection rate from all detector components | 4.7 GB/s
2 | Duration of the statistics collection period per year | 120 days
3 | Event rate at the installation output | 6 kHz
4 | Dead time after an event | 1 cycle (50%)
5 | Average number of tracks in an event | 500
6 | Average number of particle collisions | 20
7 | Average number of bytes per collision | 45
8 | Average event reconstruction time on a 1 kSI2K processor | 2 s
SOURCE DATA
The specification of requirements for NICA experiment off-line data processing
No. | Requirement | Value
1 | Number of events to process per year | 1.87 × 10^10
2 | Total data volume to store per year | 8.4 PB
3 | Total disk space per year if storage is RAID6 (+25%) | 10 PB
4 | Total CPUs in the grid structure, the minimum needed to reconstruct data at the rate at which events are collected, based on 7000 thousand astronomical hours of work a year |
5 | Number of grid sites | 20
6 | Minimum data transfer speed from JINR to sites | 2.5 Gb/s
The expected number of processed events is about 19 billion. If the data transfer speed from the sensors is 4.7 GB/s, the total amount of source data can be estimated as 30 PB annually, or 8.4 PB after processing.
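The 8.4 PB figure can be cross-checked against the MPD parameters. The sketch below assumes that the average event size is the product of tracks per event, collisions per track and bytes per collision. The slides do not state this relation explicitly, and reading "average of particles collisions" as a per-track value is also an assumption. Under that reading the numbers agree with the table.

```python
# Figures quoted on the slides
TRACKS_PER_EVENT = 500        # average number of tracks in an event
COLLISIONS_PER_TRACK = 20     # average number of collisions (ASSUMED to be per track)
BYTES_PER_COLLISION = 45      # average number of bytes per collision
EVENTS_PER_YEAR = 1.87e10     # number of events to process per year
RAID6_OVERHEAD = 0.25         # +25% disk space for RAID6

PB = 1e15  # decimal petabyte in bytes

# ASSUMPTION: event size is the plain product of the three averages
event_size_bytes = TRACKS_PER_EVENT * COLLISIONS_PER_TRACK * BYTES_PER_COLLISION
annual_volume_pb = EVENTS_PER_YEAR * event_size_bytes / PB
disk_space_pb = annual_volume_pb * (1 + RAID6_OVERHEAD)

print(f"event size: {event_size_bytes / 1e3:.0f} kB")
print(f"annual volume: {annual_volume_pb:.1f} PB")
print(f"disk space with RAID6: {disk_space_pb:.1f} PB")
```

With these inputs the event size is 450 kB, the annual volume comes out at about 8.4 PB, and the RAID6 estimate at about 10.5 PB, which the table rounds to 10 PB.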
GRID FOR EXPERIMENTS
A hierarchical grid infrastructure with Tier 0/1/2 computing centers is already used by the ALICE experiment and others. The PANDA experiment also wants to use it.
Questions for simulation:
- Grid infrastructure architecture?
- Number of resource centers?
- Amount of resources?
- Network capacity?
- Resource distribution between user groups?
- etc.
Urgency: recommendations and a specification for creating the NICA grid infrastructure
SIMULATION TASKS
Task 1.
Task 2.
GRIDSIM SIMULATION PACKAGE
- Allows simulation of various classes of heterogeneous resources, users, applications and brokers;
- There is no restriction on the number of jobs that can be sent to a resource;
- The capacity of the network between resources can be set;
- Supports simulation of static and dynamic schedulers;
- Statistics for all or selected operations can be recorded;
- Implemented in Java;
- Configuration files are used to set simulation parameters;
- Source code is available;
- Many examples of GridSim usage are available;
- The multilevel architecture makes it easy to add new components.
GridSim Architecture
MODEL EFFICIENCY ESTIMATION
Model efficiency parameters:
a) Average network load per day [%]
b) Number of running/waiting jobs
c) Number of CPUs in use
d) Total data transferred per hour [GB]
e) Total storage usage [%]
f) Cluster usage [%]
g) Refused CPUs [%]
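As an illustration of how two of these parameters could be derived from simulation output, the sketch below computes the number of running and waiting jobs at a given moment and the cluster usage over a time window. The job record format `(submit, start, end)` and both function names are hypothetical, and each job is assumed to occupy exactly one CPU. This is not how the model or GridSim stores its statistics.

```python
def job_counts(jobs, t):
    """Return (running, waiting) job counts at time t.

    jobs is a list of (submit, start, end) tuples (hypothetical format).
    """
    running = sum(1 for _, start, end in jobs if start <= t < end)
    waiting = sum(1 for submit, start, _ in jobs if submit <= t < start)
    return running, waiting


def cluster_usage_percent(jobs, n_cpus, horizon):
    """Share of CPU time spent on jobs in the window [0, horizon], in percent.

    ASSUMPTION: every job occupies exactly one CPU while it runs.
    """
    busy = sum(min(end, horizon) - min(start, horizon) for _, start, end in jobs)
    return 100.0 * busy / (n_cpus * horizon)


# Three illustrative jobs on a 2-CPU cluster; the third waits for a free CPU
jobs = [(0, 0, 10), (0, 0, 10), (0, 10, 20)]
print(job_counts(jobs, 5))
print(cluster_usage_percent(jobs, n_cpus=2, horizon=20))
```

Sampling `job_counts` at regular intervals would give the series needed for a running/waiting jobs plot.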
MODEL COMPONENTS
1. User Interface (edit/add model)
2. MySQL database to save simulation parameters
3. Simulation System
4. Results Visualization Tools
TEST SIMULATION
Clusters: 1 machine, 2 CPUs
Users: 1
Jobs: 10
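A scenario of this size is small enough to reproduce by hand. The sketch below is a toy first-come-first-served scheduler written in plain Python, not GridSim code. The job length of 100 time units, the assumption that all jobs are submitted at time 0, and the function name `simulate_fifo` are all hypothetical.

```python
import heapq


def simulate_fifo(job_lengths, n_cpus):
    """Run jobs in submission order on n_cpus identical CPUs.

    ASSUMPTION: all jobs are submitted at time 0.
    Returns a list of (start, end) times, one per job, in submission order.
    """
    free_at = [0.0] * n_cpus          # time at which each CPU becomes free
    heapq.heapify(free_at)
    schedule = []
    for length in job_lengths:
        start = heapq.heappop(free_at)    # earliest available CPU
        end = start + length
        heapq.heappush(free_at, end)
        schedule.append((start, end))
    return schedule


# Test scenario: 2 CPUs, 1 user, 10 jobs (job length is a hypothetical value)
schedule = simulate_fifo([100.0] * 10, n_cpus=2)
makespan = max(end for _, end in schedule)
print(schedule)
print(makespan)
```

With equal job lengths the jobs run in five waves of two, so the last job finishes at five times the job length.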
EXAMPLE OF GRAPHIC REPRESENTATION OF THE SIMULATION RESULTS
1. Waiting and Running Jobs
2. Average Clusters Usage
DONE!
- A web interface for editing the model, with one test scenario of grid operation, has been created;
- The key parameters for estimating the model have been identified;
- Results visualization tools have been created;
- The simulation has passed the debugging and verification phase.
CONCLUSION
The model will allow:
- estimating different architectures (parameters) of the data processing system by changing only the input data;
- comparing various technical solutions and choosing the optimal one, using a library of scenarios (data processing, architectures, other).
Plans:
- development of the user interface;
- debugging the model in a client-server architecture;
- development of sets of scenarios of grid system operation;
- user editing and adding of grid model parameters.
Questions?