ALICE DATA ACCESS MODEL
05/13/2014

Outline
 ALICE data access model
 Infrastructure and SE monitoring
 Replica discovery mechanism
 Plans

ALICE data access model
 Central catalogue of LFNs and replica locations
 Data files are accessed directly from the storage
 Jobs are sent to a site holding a copy of the data
 Other required files are read remotely (configuration, calibration, executing and validating scripts etc.)
 Urgent tasks (organized analysis) relax the locality constraint to finish the 'tail' (last few percent) of the jobs quickly
 For every request the client receives a list of replica locations sorted by storage availability and distance to the client, i.e. closest (local) first

Data management
 Exclusive use of the xrootd protocol for data access
   http, ftp and torrent are also supported for downloading other input files
 At the end of the job N replicas are uploaded by the job itself (2x ESDs, 3x AODs, etc.)
 Scheduled data transfers with xrd3cp
   T0 -> one T1 per run, selected at data-taking time
   Calibration data replication
   Storage migration / decommissioning
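The per-type replica counts above (2x ESDs, 3x AODs) can be pictured as a small policy table consulted when the job uploads its outputs. The sketch below is only an illustration: the counts come from the slide, while the function and the choose_closest_ses hook into the SE discovery mechanism described later are assumptions.

```python
# Illustrative end-of-job upload policy: each output type gets a fixed number
# of replicas. The ESD/AOD counts are from the slide; choose_closest_ses is an
# assumed hook into the replica discovery mechanism described later.
REPLICAS_PER_TYPE = {"ESD": 2, "AOD": 3}
DEFAULT_REPLICAS = 2

def upload_targets(output_name, candidate_ses, choose_closest_ses):
    """Return the SEs a job output should be uploaded to."""
    ftype = next((t for t in REPLICAS_PER_TYPE if t in output_name), None)
    n = REPLICAS_PER_TYPE.get(ftype, DEFAULT_REPLICAS)
    return choose_closest_ses(candidate_ses, n)

# e.g. upload_targets("AliAOD.root", ses, picker) -> the 3 closest working SEs
```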

Organized analysis trains
 Run many user tasks over the same input data
 Users are strongly encouraged to join the trains instead of running their own tasks
 The most IO-demanding central processing
 The average analysis task needs about 2 MB/s/core to be 100% CPU efficient, but the majority of the current infrastructure cannot sustain that rate (see the sketch below)
 Newer CPUs require even more
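To get a feel for what 2 MB/s/core means at site level, the per-core rate can simply be multiplied by the number of concurrently running analysis cores. The sketch below does that for a few hypothetical site sizes; only the 2 MB/s/core figure comes from the slide.

```python
# Aggregate SE read throughput needed to keep analysis cores 100% CPU efficient.
# The 2 MB/s/core figure is from the slide; the site sizes are hypothetical.
PER_CORE_MBPS = 2.0

def required_throughput(cores: int) -> float:
    """MB/s the local storage must sustain for `cores` analysis cores."""
    return cores * PER_CORE_MBPS

for cores in (500, 2000, 10000):
    mbps = required_throughput(cores)
    print(f"{cores:>6} cores -> {mbps:7.0f} MB/s (~{mbps * 8 / 1000:.0f} Gb/s)")
```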

Analysis trains activity
 Read volume of the last week of organized analysis trains:
   Local site storage: 1.56 PB
   Remote storage: 64 TB
   0.38% remote data access (failover, lifted locality restrictions)
 Read throughput:
   Local site storage avg: 1.35 MB/s
   Remote site storage avg: 0.73 MB/s
 Reading remotely introduces a large penalty: roughly half the local rate per stream!

Job file access stats (figure)

Remote access efficiency
 Problems can come from both the network and the storage
 The IO performance seen by jobs doesn't always match the tests
   Congested firewall or network segment, different OS settings, saturated storage IO
 This is reflected in the overall job efficiency

Measured read throughput (MB/s) from storage (rows) to worker nodes at CERN, LEGNARO, TORINO, CNAF and FZK (columns); not every storage-WN pair has a value:
 CERN: 2.668, 0.27
 FZK: 0.486, 0.161, 0.213, 2.963
 LEGNARO: 1.611, 2.628, 0.673, 0.749
 TORINO: 1.848, 1.609, 0.684, 0.891
 CNAF: 2.193, 0.623, 2.126
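Combining the rates above with the 2 MB/s/core requirement from the analysis-train slide gives a rough idea of the efficiency impact of remote reads. The min() model below is an illustrative assumption, not the actual ALICE efficiency accounting.

```python
# Rough model: a core needs ~2 MB/s to stay CPU-bound, so its CPU efficiency is
# approximately capped by achieved_read_rate / 2 MB/s. Illustrative only.
REQUIRED_MBPS = 2.0

def cpu_efficiency_cap(read_rate_mbps: float) -> float:
    return min(1.0, read_rate_mbps / REQUIRED_MBPS)

# Example rates taken from the matrix above
for label, rate in (("local-like access", 2.668), ("remote-like access", 0.27)):
    print(f"{label} ({rate} MB/s): CPU efficiency capped at ~{cpu_efficiency_cap(rate):.0%}")
```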

SE monitoring
 xrootd and EOS data servers publish two monitoring streams
 An ApMon daemon reports the data server host monitoring and external xrootd parameters
   Node total traffic, load, IO, sockets, disk IO, memory ...
   Version, total and used space
 xrootd monitoring is configured as:
   xrootd.monitor all flush 60s window 30s dest files info user MONALISA_HOST:9930
   This stream reports the client IP and the bytes read and written
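A minimal sketch of the ApMon side of this, assuming the standard MonALISA apmon Python client; the destination host, cluster/node names and parameter values are illustrative, not the actual ALICE configuration.

```python
# Minimal sketch of publishing SE host metrics via ApMon (assumes the MonALISA
# apmon.py client is available; host name, cluster name and values below are
# illustrative placeholders).
import apmon

# Destination: the MonALISA service collecting SE monitoring (hypothetical host)
mon = apmon.ApMon(['monalisa.example.org:8884'])

# One snapshot of data-server parameters, named after the slide's list
params = {
    'total_traffic_mbps': 812.4,
    'load1': 3.2,
    'sockets_open': 148,
    'disk_io_mbps': 220.0,
    'xrootd_version': '4.0.0',
    'space_total_tb': 500,
    'space_used_tb': 341,
}
mon.sendParameters('ALICE::SITE::SE', 'xrootd-server-01', params)
mon.free()  # stop ApMon background threads before exiting
```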

Infrastructure monitoring
 On each site VoBox a MonALISA service collects
   Local SE monitoring data (network interface activity, load, sockets, client access stats etc.)
   Job resource consumption, WN host monitoring ...
 Traffic data is aggregated per client IPv4 C-class, LAN/WAN, client site and server site (see the sketch below)
 MonALISA services perform VoBox-to-VoBox measurements
   traceroute / tracepath
   Single-stream available bandwidth measurements with FDT; this is what impacts the job efficiency
 All results are archived; the network topology and utilization are also inferred from them
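The sketch below shows the kind of aggregation described above: grouping transfer records by client IPv4 C-class (/24) and LAN/WAN. The record format, the server network and the sample values are hypothetical; only the grouping keys follow the slide.

```python
# Group transfer records by client C-class (/24) and LAN/WAN scope.
# Record format, server network and sample data are hypothetical.
import ipaddress
from collections import defaultdict

def c_class(ip: str) -> str:
    """Return the /24 ('C-class') network a client IP belongs to."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))

def aggregate(records, server_net="188.184.0.0/16"):
    """Sum read bytes per (client C-class, LAN/WAN) bucket."""
    server = ipaddress.ip_network(server_net)
    totals = defaultdict(int)
    for client_ip, bytes_read in records:
        scope = "LAN" if ipaddress.ip_address(client_ip) in server else "WAN"
        totals[(c_class(client_ip), scope)] += bytes_read
    return totals

# Example records: (client IP, bytes read) - made-up values
sample = [("188.184.10.5", 7_000_000), ("188.184.10.9", 3_000_000),
          ("141.2.5.17", 9_500_000)]
for key, total in aggregate(sample).items():
    print(key, f"{total / 1e6:.1f} MB")
```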

AS level topology view in MonALISA (figure)

Available bandwidth per stream
(figure: measured available bandwidth per stream for TCP buffer sizes of 4, 8 and 16 MB and path lengths of ~1000, 2000 and 4000 km; annotations point out ICMP throttling and the discrete effect of the congestion control algorithm on congested links, in multiples of 8.39 Mbps)
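The buffer sizes on this plot matter because a single TCP stream cannot exceed window/RTT. The sketch below estimates that ceiling for the plotted buffer sizes; the round-trip times for the three path lengths are illustrative assumptions, not measurements from the slide.

```python
# Why the TCP buffer size caps single-stream bandwidth: throughput <= window / RTT.
# Buffer sizes match the plot; the RTTs are rough illustrative values.
BUFFERS_MB = (4, 8, 16)
RTTS_MS = {"~1000 km": 20, "~2000 km": 40, "~4000 km": 80}  # assumed RTTs

for label, rtt_ms in RTTS_MS.items():
    for buf_mb in BUFFERS_MB:
        max_mbps = (buf_mb * 8) / (rtt_ms / 1000)  # Mbit of window per second
        print(f"{label}, {buf_mb:>2} MB buffer -> <= {max_mbps:6.0f} Mb/s per stream")
```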

SE functional tests
 Performed centrally every 2 hours, targeting the declared redirector
   add/get/rm suite using the entire AliEn stack
   Or just get if the storage is full
 The dynamically discovered xrootd data servers are tested individually with a simplified suite
 Discrepancies between the declared volume and the total space currently seen by the redirector are monitored
   Site admins are prompted to solve these issues
 Plus many other related checks, e.g. for insufficiently large TCP buffer sizes
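A very rough sketch of what one add/get/rm cycle could look like, assuming the AliEn command-line tools alien_cp and alien_rm are available; the LFN path and the "@SE" target-selection syntax are illustrative and may not match the exact commands used by the central tests.

```python
# Rough add/get/rm functional-test cycle for one SE, assuming the AliEn CLI
# tools (alien_cp, alien_rm) are in PATH. LFN path and "@SE" syntax are
# illustrative; the real central tests use the full AliEn stack.
import os
import subprocess
import tempfile

def ok(cmd):
    """Run a command and report success/failure."""
    return subprocess.run(cmd, capture_output=True).returncode == 0

def test_se(se_name, lfn_dir="/alice/cern.ch/user/s/setest"):
    lfn = f"{lfn_dir}/functional_test_{se_name.replace('::', '_')}.bin"
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(1024 * 1024))  # 1 MB test payload
        local = f.name
    results = {
        "add": ok(["alien_cp", f"file:{local}", f"alien:{lfn}@{se_name}"]),
        "get": ok(["alien_cp", f"alien:{lfn}", f"file:{local}.back"]),
        "rm":  ok(["alien_rm", lfn]),
    }
    for p in (local, local + ".back"):
        if os.path.exists(p):
            os.unlink(p)
    return results

# e.g. test_se("ALICE::CERN::EOS") -> {"add": True, "get": True, "rm": True}
```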

Replica discovery mechanism
 The closest working replicas are used for both reading and writing
 SEs are sorted by their network distance to the client making the request
   Combining network topology data with the geographical location
 SEs that fail the respective functional test are left as a last resort
 The distance is weighted with the SE's free space and recent reliability
 Writing is slightly randomized for a more 'democratic' data distribution

Distance metric function
 distance(IP, IP), in increasing order of distance:
   Same C-class network
   Common domain name
   Same AS
   Same country (+ f(RTT between the respective ASes, if known))
   If the distance between the ASes is known, use it
   Same continent
   Far, far away
 distance(IP, Set<IP>): the client's public IP to all known IPs of the storage
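A sketch of the tiered distance function: the ordering of the tiers follows the slide, while the numeric tier values and the topology helper (topo) with its predicates are illustrative placeholders.

```python
# Tiered distance metric, smaller = closer. Tier order follows the slide;
# numeric values and the `topo` helper are illustrative placeholders.

def distance(client_ip, server_ip, topo):
    if topo.same_c_class(client_ip, server_ip):
        return 0.0
    if topo.same_domain(client_ip, server_ip):
        return 1.0
    if topo.same_as(client_ip, server_ip):
        return 2.0
    if topo.same_country(client_ip, server_ip):
        rtt = topo.as_rtt(client_ip, server_ip)        # may be None
        return 3.0 + (0.0 if rtt is None else rtt / 1000.0)
    as_dist = topo.as_distance(client_ip, server_ip)   # may be None
    if as_dist is not None:
        return 4.0 + as_dist
    if topo.same_continent(client_ip, server_ip):
        return 10.0
    return 100.0  # far, far away

def distance_to_se(client_ip, se_ips, topo):
    """distance(IP, Set<IP>): closest of all known IPs for the storage."""
    return min(distance(client_ip, ip, topo) for ip in se_ips)
```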

Weight factors
 Free space modifies the distance with f(ln(free space / 5 TB))
 The recent history of add (for writing), respectively get (for reading), contributes with
   75% * last day success ratio + 25% * last week success ratio
 The result is a uniform federation with a fully automatic data placement procedure based on monitoring data
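The slide gives the ingredients of the weighting (the ln(free space / 5 TB) term and the 75%/25% success-ratio mix) but not how they are combined with the distance, so the combination in the sketch below is an illustrative placeholder.

```python
# Combine the distance from the previous sketch with the weight factors.
# The ingredients follow the slide; the exact formula is an illustration.
import math

def reliability(day_success_ratio: float, week_success_ratio: float) -> float:
    """Recent add/get history: 75% last day + 25% last week."""
    return 0.75 * day_success_ratio + 0.25 * week_success_ratio

def se_rank(dist: float, free_space_tb: float,
            day_ok: float, week_ok: float) -> float:
    """Lower is better: distance shrunk by free space, penalized by failures."""
    space_factor = math.log(max(free_space_tb, 0.001) / 5.0)  # f(ln(free/5TB))
    rel = reliability(day_ok, week_ok)
    return dist - space_factor + (1.0 - rel) * 10.0           # illustrative mix

# Candidate SEs would be sorted by se_rank(); writes add a small random jitter
# to the rank for a more 'democratic' data distribution.
```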

Plans
 In the near future ALICE will upgrade centrally to xrootd 4.0
   AliEn, which uses the xrootd command line
   ROOT, which uses xrootd as a library
   Eventually replacing xrd3cp with the new client, which implements the same functionality
 Implement IPv6 network topology discovery and use it for SE discovery
   We have already started receiving requests over IPv6
 Retry using async IO in ROOT with the new releases

Site plans
 Long overdue xrootd upgrade
   Some sites still run 5-year-old versions
   All existing sites will be asked to upgrade to 4.0 as soon as it is stable
 For newly deployed storage we plan to use EOS
   Using the RAIN functionality between well-connected sites
 Work more closely with the sites to identify and solve IO bottlenecks
   Keeping in mind the 2 MB/s/core requirement