Grid operations in 2014 ALICE Offline week 20 March 2015 Latchezar Betev.

Presentation transcript:

Grid operations in 2014 ALICE Offline week 20 March 2015 Latchezar Betev

The ALICE Grid sites today: 53 in Europe, 10 in Asia, 2 in Africa, 2 in South America, 8 in North America. Sites labelled on the map: UNAM, CHPC, WUT, KISTI T1, Bandung, Cibinong, RRC-KI T1

New sites
KISTI: officially a T1 in WLCG
UNAM: MoU for T2 in November 2014, towards a T1
WUT (Poland): in production September 2014
RRC-KI T1: in production January 2014
ZA_CHPC: x4 capacity in November 2014
Bandung and Cibinong: in production September

A new job record: 70K concurrent jobs on 4 November 2014 … and we have been breaching 72K jobs every week since February

CPU resources evolution [figure: year-on-year increase of +30%, +12%, +30%]

Resources evolution
From 2011 to 2014: 88% CPU increase in total, compounding the year-on-year steps of +30%, +12% and +30%
– The average increase per year is slightly above the WLCG projection
– Due to new sites (!) and above-flat-budget capacity increase
Storage capacity is growing at ~15% per year
– Also slightly above the flat-budget scenario
– Remains critical in terms of what we can store: timely cleanup and reviews must continue
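The relation between the year-on-year steps and the overall CPU growth can be checked with a few lines of Python. This is only a sketch: it assumes the three labels on the CPU resources evolution slide (+30%, +12%, +30%) are the successive yearly increases between 2011 and 2014, and the function names are illustrative.

```python
# Year-on-year CPU increases as labelled on the slide (assumed to be
# the three successive steps between 2011 and 2014).
yearly_increases = [0.30, 0.12, 0.30]


def compound_growth(increases):
    """Total relative growth obtained by chaining the yearly increases."""
    factor = 1.0
    for inc in increases:
        factor *= 1.0 + inc
    return factor - 1.0


def average_yearly_growth(increases):
    """Constant yearly growth rate that gives the same total (geometric mean)."""
    total_factor = 1.0 + compound_growth(increases)
    return total_factor ** (1.0 / len(increases)) - 1.0


total = compound_growth(yearly_increases)
average = average_yearly_growth(yearly_increases)
print(f"total increase 2011-2014: {total:.1%}")
print(f"equivalent average per year: {average:.1%}")
```

With the rounded slide labels the total comes out at roughly 89%, in line with the quoted 88%, and the equivalent constant rate is a little under 24% per year.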

Yearly job profile [figure annotations: out of productions; opportunistic resources]

Resources distribution: continuous and remarkable 50/50 share between large (T0/T1) and smaller computing centres

Central services operation
Certificate expiration: 2h30m
Power cut: 4h30m
TQ broken: 36h
Total downtime 43 hours => 99.5% availability
The blue grass above the sites profile: site update announcements, see individual sites for details
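The availability figure follows directly from the downtime total. A minimal sketch, assuming the reporting period is one calendar year of 365 days (the slide does not state the period explicitly):

```python
HOURS_PER_YEAR = 365 * 24  # assumed reporting period: one non-leap year

# Central services incidents and their durations in hours, from the slide.
downtime_hours = {
    "certificate expiration": 2.5,
    "power cut": 4.5,
    "TQ broken": 36.0,
}


def availability(total_downtime, period_hours=HOURS_PER_YEAR):
    """Fraction of the period during which the service was up."""
    return 1.0 - total_downtime / period_hours


total_downtime = sum(downtime_hours.values())
central_availability = availability(total_downtime)
print(f"total downtime: {total_downtime:.0f} h")
print(f"availability: {central_availability:.1%}")
```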

Catalogue stats: 294 Mio new entries, 262 Mio new entries, 342 Mio new entries
New generation servers (being ordered) can still hold the entire catalogue in memory

Computing tasks and workflow [diagram]
Centrally managed: RAW, MC productions
PWG managed: organized analysis (trains)
Individually managed: individual analysis
All tasks pass through the AliEn Central Task Queue to the computing centres
Priority labels in the diagram: Priority 3, Priority 1, Priority 2

Wall time resources share
Organized analysis: centres
MC productions: centres
RAW data processing: T0/T1s only
Individual analysis: centres, 432 users

Organized analysis: 3000 jobs, then 4400 jobs (+47%), then 5800 jobs (+32%); year-on-year increase
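The year-on-year percentages for organized analysis can be reproduced from the job counts. The sketch below assumes the three counts are yearly averages in chronological order (3000, 4400, 5800); the helper name is illustrative.

```python
def year_on_year(values):
    """Relative change between each consecutive pair of values."""
    return [(later - earlier) / earlier
            for earlier, later in zip(values, values[1:])]


# Average number of organized analysis jobs, assumed chronological order.
organized_jobs = [3000, 4400, 5800]

changes = year_on_year(organized_jobs)
for change in changes:
    print(f"{change:+.0%}")
```

Rounded to whole percent, the two steps match the +47% and +32% labels on the slide.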

Individual analysis: 6900 jobs (465 individual users), 4600 jobs (446 individual users), 4700 jobs (432 individual users)
Year-on-year change, individual analysis: -50%, +3%
Year-on-year increase, organized analysis: +47%, +32%

Analysis evolution
From 2012 to 2014 the individual user analysis has decreased by 50%
– It has remained at the same level of resources utilization between 2013 and 2014
The organized analysis fully compensated the ‘loss’ of individual analysis already in 2013
Since 2013, the amount of resources used by analysis has grown by 35%, all of it organized
The number of individual users has remained steady at ~445
There is still ample room to increase the share of the organized analysis

Efficiency per workflow: average over all sites

Grid efficiency: 76%, then 84% (+8%), then 86% (+2%); year-on-year change

Grid efficiency evolution
Since the re-introduction of TTree Cache, the efficiency has stabilized at ~85%
– The dramatic decrease of individual analysis also helped the efficiency increase
In the past year there has been a slight upward trend, which could be attributed to the better availability of storage (see next)
We could expect a slight (2-5%) increase
– If the individual analysis is decreased by a factor of 2
– If the current efficiency level of the other activities remains the same

Storage availability: 83%, then 87%, then 91%; year-on-year change +4%

Storage availability evolution [figure: top 15 SEs, one year average]
Constant improvement in availability
– SEs are independent, no correlation in downtime
Directly affecting the workload efficiency
Room for further increase!
– Allowed downtime for availability >99% = 88 hours
Current replica model (2 copies) => with independent SEs, the probability for both replicas to be unavailable at the same time is (1 - SE availability)^2 = 0.25%
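Both numbers on this slide are short calculations. The sketch below assumes independent SEs, as the slide states, and a one-year period of 8760 hours. The per-SE availability of 95% is an assumption, not a value that survives in the transcript: it is simply the figure that yields the quoted 0.25% for two copies.

```python
HOURS_PER_YEAR = 365 * 24  # assumed period: one non-leap year


def prob_all_replicas_down(se_availability, copies=2):
    """Probability that every replica is unavailable at once,
    assuming SE downtimes are independent."""
    return (1.0 - se_availability) ** copies


def allowed_downtime_hours(target_availability, period_hours=HOURS_PER_YEAR):
    """Downtime budget that still meets the target availability."""
    return (1.0 - target_availability) * period_hours


# 0.95 is an assumed per-SE availability, chosen to match the 0.25% result.
both_down = prob_all_replicas_down(0.95, copies=2)
budget = allowed_downtime_hours(0.99)
print(f"both replicas unavailable: {both_down:.2%}")
print(f"downtime budget at 99%: {budget:.1f} h")
```

The downtime budget comes out at 87.6 hours, which the slide rounds to 88.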

Storage use, year-on-year change
Read: 180PB, then 237PB (+32%), then 284PB (+20%)
Write: 30PB, then 28PB (-7%), then 29PB (+4%)
Ratio r/w

Storage use evolution
Increase in read volume
– Directly correlated with the increase in analysis activity
– Improved read/write ratio
In 1 year ALICE overwrites the entire disk storage completely
– Timely cleanup is critical to keep the SEs in good health
– … and to have free space for the new data
– The disk cleanup is a continuous activity
– Minimal amount of ‘dark data’ and files with low popularity
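The improving read/write ratio can be illustrated from the volumes on the storage use slide. The pairing of the values into years is an assumption: it is the ordering under which the read changes come out as +32% and +20% and the write changes as -7% and +4%.

```python
# Yearly volumes in PB, assumed chronological order (see lead-in).
read_pb = [180, 237, 284]
write_pb = [30, 28, 29]

# Read/write ratio for each year.
ratios = [r / w for r, w in zip(read_pb, write_pb)]


def relative_changes(values):
    """Relative change between each consecutive pair of values."""
    return [(b - a) / a for a, b in zip(values, values[1:])]


read_changes = relative_changes(read_pb)
write_changes = relative_changes(write_pb)

for ratio in ratios:
    print(f"read/write ratio: {ratio:.1f}")
```

Under this pairing the ratio rises every year, from 6 reads per write to nearly 10.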

Resources usage [figure: requirements, pledges, usage]

Summary
2014 was (another) successful year for Grid operations
Despite the absence of data taking, the Grid resources use was uninterrupted
– In fact it has increased, as has the available capacity
New centres have entered production
– The Grid is expanding above the ‘flat budget’ scenario
Substantial increase of analysis, most of it organized
Efficiency remains high, and can be increased further
The computing centres operation continues to be smooth
– Software and hardware updates have negligible effect on general Grid availability