Scalla/xrootd WAN globalization tools: where we are
Now my WAN is well tuned! So what? The network is there. Can we use it?
F. Furano, A. Hanushevsky (CHEP09)
CERN IT Department, CH-1211 Genève 23, Switzerland

Outline
– WAN specifics
– What can be done, what can be desired
  – Two simple use cases
  – What improved recently
– Reference test on a 10Gb WAN
  – How does direct access behave?
– An interesting usage example
– About storage globalization
– Conclusions

Buy it, tune it… use it!
– We could have a very powerful WAN connecting all the sites – and there is one. We also learned how to tune it. But then?
– Over the WAN I can use my fancy HTML browser to read my CHEP09 slides in INDICO.
  – What about first transferring the whole of INDICO to my hard disk? Disk space is cheap now… Is it practical, robust or reasonable to do that?
  – Is it true that for my analysis I can just transfer files around among N fancy storage pools?

WANs are difficult
– In a WAN each client/server response comes much later, e.g. 180 ms later.
– Even with well-tuned WANs one needs apps and tools built with WANs in mind; otherwise the latency is a wall impossible to climb, i.e. VERY bad performance, unusable.
– Bulk transfer apps are the easy case (gridftp, xrdcp, fdt, etc.).
– There are more interesting use cases, and much more benefit to get:
  – ROOT has the right things in it, if used in the right way.
  – With XROOTD… OK! CASTOR too.
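To make the latency argument concrete, here is a back-of-the-envelope sketch (not from the talk; all the numbers are illustrative assumptions): with a 180 ms round trip, an application that issues one synchronous request per small read spends almost all of its time waiting, while pipelined or prefetched reads pay the latency only a handful of times.

    // Illustrative sketch: why a 180 ms WAN round trip dominates naive
    // request-per-read access, and why read-ahead / pipelined reads help.
    #include <cstdio>

    int main() {
        const double rtt_s   = 0.180;   // client-server round trip over the WAN
        const double bw_MBps = 1250.0;  // ~10 Gb/s link, ideal case
        const double data_MB = 100.0;   // data the job actually reads
        const double read_kB = 32.0;    // typical small ROOT read

        const double n_reads = data_MB * 1024.0 / read_kB;

        // One synchronous request per read: every read pays a full round trip.
        const double t_sync = n_reads * rtt_s + data_MB / bw_MBps;

        // Prefetched/pipelined reads (read-ahead, vectored reads): the round
        // trip is paid only a few times and the transfer time dominates.
        const double t_pipelined = 10.0 * rtt_s + data_MB / bw_MBps;

        std::printf("synchronous: %.0f s   pipelined: %.1f s\n",
                    t_sync, t_pipelined);
        return 0;
    }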

What can we do
Basically, with an XROOTD-based frontend we can do two things over the WAN:
– Access remote data
– Aggregate remote storages
  – Build a unique storage pool with subclusters in different sites
  – No practical size limits (up to 262K servers in theory)
  – No third-party software needed
– So we don’t need to know in advance where a file is; we just need to know which file we need.
– There are pitfalls and things to consider, but a great benefit to get as well. Let’s see what’s possible and some of the new ideas.

A simple use case
– I am a physicist, waiting for the results of my analysis jobs: many bunches, several outputs, saved e.g. to an SE at CERN.
– My laptop is configured to show histograms etc. with ROOT.
– I leave for a conference; the jobs finish while I am on the plane.
– Once there, I want to simply draw the results from my home directory, and save my new histograms in the same place.
– I have no time to lose tweaking things to get a copy of everything; copies get lost in the confusion. I want to leave the things where they are.
– I know nothing about things to tweak. What can I expect? Can I do it?
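A minimal ROOT macro sketch of this workflow, reading a histogram directly from a remote xrootd SE and writing a new one back with no intermediate copy; the host name, paths, object names and macro name are placeholders, not taken from the talk.

    // Minimal sketch (hypothetical host, paths and names): draw a histogram
    // straight from a remote xrootd SE and write a new one back over the WAN.
    #include "TFile.h"
    #include "TH1F.h"

    void wan_histos() {
        // Direct read: only the buffers actually requested travel over the WAN
        TFile *in = TFile::Open("root://myse.cern.ch//user/f/furano/results.root");
        if (!in || in->IsZombie()) return;

        TH1 *h = nullptr;
        in->GetObject("hPt", h);   // "hPt" is just an illustrative name
        if (h) h->Draw();

        // Direct write back to the same remote storage
        TFile *out = TFile::Open("root://myse.cern.ch//user/f/furano/newhistos.root",
                                 "RECREATE");
        if (out && !out->IsZombie()) {
            TH1F hNew("hNew", "New histogram", 100, 0., 10.);
            hNew.Write();
            out->Close();
        }
        in->Close();
    }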

Another use case
– ALICE analysis on the Grid: each job reads ~100 MB from ALICE::CERN::SE.
– These are conditions data, accessed directly rather than copied as whole files – i.e. VERY efficient, a job reads only what it needs.
– It would be nice to speed it up:
  – At 10 MB/s it takes 10 secs
  – At 5 MB/s it takes 20 secs
  – At 1 MB/s it takes 100 secs
– Sometimes the data are accessed from elsewhere; it would be nice if that were more efficient: better usage of resources, more processed jobs per day.
– After all, ROOT/AliRoot cannot read or write data at more than ~20 MB/s while fully using one core (it will probably do better in the future). This fits perfectly with the current WAN status.

Up to now
– Up to now the WAN speedup was possible with ROOT + XrdClient + XROOTD: up to x with respect to basic client-server protocols.
– But it needed a tweak to enable/disable the WAN mode of XrdClient. When to switch it on? Difficult to automatize, very technical, hence nobody cared.
– Now things have gone further:
  – The good old WAN mode is still OK for bulk transfers.
  – The new internal improvements exploit the efficiency of newer kernels and TCP stacks.
  – Interactive use should now need nothing special.
– So, if you have the new client (bundled in ROOT!) and a new server (available through xrd-installer, with the new fixed configuration), you can expect a good improvement over the past, without doing anything special and with no user tweaks.

Exercise
– Caltech machinery: 10Gb network, client and server super-well tuned.
– Selectable latency:
  – ~0.1 ms = super-fast LAN
  – ~180 ms = client here, server in California (almost a worst case for WAN access)
– Various tests:
  – Populate a 30GB repository, read it back
  – Draw various histograms, much heavier than normal to make them measurable: from a minimal access to reading the whole files, putting heavy calculations on the read data, up to reading and computing everything (analysis-like behaviour)
  – Write a big output (~600 MB) from ROOT
Thanks to Iosif Legrand and Ramiro Voicu.

Exercise
– This is not a “bandwidth race”: the goal is not to fill the 10Gb bandwidth. Others are interested in that, and do it very well.
– We wanted to see:
  – Can we use all this to live better with data?
  – How does a normal task perform in LAN/WAN, in a measurable and stable WAN environment?
– Local disk vs XROOTD vs HTTP (Apache2). Why HTTP? Because it is simply the toughest opponent:
  – Efficient (LAN+WAN) and lightweight, no bandwidth waste*
  – Very robust server (but not sufficient for HEP data management)
  – Well integrated in ROOT, works well (except writes, which are not supported)
* See the talk about GPFS/xrootd (on Thursday) and the Lustre analysis by A. Peters (ACAT08).

10Gb WAN, 180 ms: analysis (results plot)

10Gb WAN, 180 ms: analysis – an estimation of overheads and write performance (results plot)

Comments
– Things look quite interesting:
  – By the way, the same order of magnitude as a local RAID disk (and who has a RAID in their laptop?)
  – Writing really gets a boost. Aren’t job outputs sometimes written that way, even with TFile::Cp?
– We have to remember that this is a worst case:
  – Very far repository
  – Much more data than a personal histogram or an analysis debug (who draws 30GB of personal histograms? If you do, the Grid is probably a better choice.)
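As an aside, a minimal sketch of the kind of bulk copy mentioned above, using ROOT's TFile::Cp to push a job output to a remote xrootd SE; the host name, paths and function name are placeholders.

    // Minimal sketch (hypothetical host and paths): copy a local job output
    // to a remote xrootd SE with ROOT's built-in bulk copy.
    #include "TFile.h"
    #include "TError.h"

    void copy_output() {
        // TFile::Cp(src, dst) streams with large buffers, returns kTRUE on success
        Bool_t ok = TFile::Cp("local_output.root",
                              "root://myse.cern.ch//user/f/furano/local_output.root");
        if (!ok) Error("copy_output", "copy to the remote SE failed");
    }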

Comments
– As always, this is not supposed to substitute a good local storage cluster. But it can be a good thing for:
  – Interactive life on multicore laptops
  – Saving the life of a job that landed in a place where its input is not present
  – Federating relatively close sites, e.g. one has the storage, the other has the WNs
  – A user who wants to debug their analysis code locally, without copying the whole repository
  – Whatever could come to the mind of a WWW user

A nice example: the ALICE conditions data repository
– Regular ROOT files annotated in the AliEn catalogue, populated from various online DBs and runtime detector tasks. Nothing strange.
– Primary copy on xrootd storage servers at CERN (5 servers, 30 TB total).
– Accessed directly by all MC and reconstruction jobs on the Grid: up to 12K jobs, up to 6–8K connections.
– Directly means no pre-copy, i.e. very byte-efficient.

A nice example (monitoring plot)

More than globalization: the Virtual Mass Storage System (VMSS)
– Each xrootd site (e.g. GSI, CERN, any other) runs its own xrootd/cmsd cluster and subscribes to the ALICE global redirector, forming one globalized cluster.
– Local clients work normally at each site.
– Missing a file? The site asks the global redirector, gets redirected to the right collaborating cluster, and fetches the file. Immediately.
– A smart client could also point directly at the global redirector.
– A Virtual Mass Storage System… built on data globalization.
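A hedged sketch of how such a federation is typically wired together with xrootd/cmsd roles; the directives below are the standard cluster-management ones, but the host names, port and export path are placeholders, not the actual ALICE configuration.

    # On the global (meta) redirector host:
    all.role meta manager
    all.export /alice

    # On each site's local redirector, which subscribes to the global one:
    all.role manager
    all.manager meta globalredirector.example.org:1213
    all.export /alice

    # On each site's data servers:
    all.role server
    all.manager localredirector.example.org:1213
    all.export /alice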

Conclusion
– Many things are possible; e.g. solving the conditions data problem was a breakthrough for ALICE.
– It would be nice to use the globalization to lighten the File Catalog, and use it as a metadata catalog, as it should be.
– Technologically it is satisfactory, but it does not end here; there are new possible things, like:
  – Torrent-like “extreme copy”, to boost data movement also at difficult sites
  – Mounting a globalized WAN xrootd metacluster locally, as a local file system, using the XCFS tools

Questions? Thank you!

Backup: 10Gb LAN – reference (plot 1)

Backup: 10Gb LAN – reference (plot 2)

Backup: 10Gb WAN, 180 ms – bulk transfer (plot)