Evolution of WLCG Data & Storage Management: Outcome of the Amsterdam Jamboree
Andrea Sciabà
Réunion des sites LCG-France, June, CPPM, Marseille

Introduction
Held in Amsterdam
Two and a half days, 100 attendees, 30 presentations
Check the agenda for details on the talks and the attached documents

Goals
Challenges
  – Performance and scalability for analysis
  – Long-term sustainability of current solutions
  – Keep up with technological advances
  – Look at similar solutions
The goal is to have a better solution by 2013
  – Focus on analysis and user access to data
  – Based on available tools as far as possible, trying to avoid HEP-only solutions
  – More "network-centric" cloud of storage
  – Less complexity, more focused use cases

Technical areas
Use tape just as a backup archive
Allow remote data access
Look into P2P technologies
"Global home directory"
Address catalogue consistency
Revisit authorisation mechanisms
  – Quotas, ACLs, ...
Virtualisation, multicore

Agenda
Day 1: setting the scenario
Day 2: review of existing technologies and potential solutions
Day 3: summary, agreement on demonstrators and prototypes, plan and timeline

The "strawman" model
Some past assumptions do not hold anymore
  – "Network will be a bottleneck, disk will be scarce, need to send jobs to the data, ..."
Key features
  – Tape is a true archival system: e.g. "stage" from a remote site's disk rather than from tape
  – Transparent (also remote) data access, for more efficient use of networks
  – More CPU-efficient data access (also remote)
  – Less deterministic system (i.e. more flexible and responsive)
P2P to be seriously investigated

Networks and tape storage
The LHCOPN is fine today but limited to T0-T1 and T1-T1 flows
  – Flows are larger than expected
  – T1-T2 and T2-T2 traffic is becoming significant
Need to study the traffic patterns, design an architecture and build it
  – If not done, the network will become a problem
HSMs are not really used as such
  – Data are explicitly pre-staged
  – Users are often forbidden to access tape
Introduce the notion of file-set?
  – More efficient dataset placement on disk and tape
Use disks for archiving?
  – Focus on cost and power efficiency, not performance
Would clustered storage be more operationally efficient?

Data access and transfer
Various well-known issues
  – Heterogeneity of storage and data access protocols: they need fine tuning for CPU efficiency
  – Authorisation depends on the system
Desiderata
  – A transparent, efficient, fault-tolerant data access layer
  – Reliable data transfer and popularity-aware dataset replication
  – Global namespace
  – Transparent caching
Use the UNIX paradigm?
The focus should be on a common data access layer
Efficient use of the network, meltdown avoidance and sustainable operations are also concerns
Depending on the use case, prefer sparse access to objects and events over scheduled dataset transfers

Namespaces, catalogues, quotas
Need for global namespaces: hierarchical (directories) and flat (GUIDs), illustrated in the sketch below
Catalogue: should be simple and consistent with the storage
  – The LFC and the AliEn File Catalogue meet the requirements
  – Lack of consistency is a serious issue
ACLs: should be global (with no backdoors) and support DNs and VOMS groups
  – And quotas, too
Quotas are still missing but should be easy to implement on top of the catalogue
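
A minimal illustration of the two namespace views discussed above. This is a toy sketch only: the class and method names are hypothetical and are not the LFC or AliEn APIs; it just shows a hierarchical LFN view and a flat GUID view mapping onto the same replicas.

```python
# Toy catalogue sketch: hierarchical (LFN) and flat (GUID) namespaces over replicas.
# Hypothetical names; not the LFC or AliEn interfaces.
import uuid

class ToyCatalogue:
    def __init__(self):
        self._by_lfn = {}    # hierarchical view: /grid/vo/... -> GUID
        self._by_guid = {}   # flat view: GUID -> list of replica URLs

    def register(self, lfn, replica_url):
        """Register a replica so it is reachable through both namespaces."""
        guid = self._by_lfn.setdefault(lfn, str(uuid.uuid4()))
        self._by_guid.setdefault(guid, []).append(replica_url)
        return guid

    def replicas_by_lfn(self, lfn):
        return self._by_guid.get(self._by_lfn.get(lfn), [])

    def replicas_by_guid(self, guid):
        return self._by_guid.get(guid, [])

cat = ToyCatalogue()
guid = cat.register("/grid/atlas/data10/file1.root",
                    "srm://se.example.org/atlas/data10/file1.root")
print(guid, cat.replicas_by_lfn("/grid/atlas/data10/file1.root"))
```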

Multicore and global namespaces
Problems with the current way of using multicore CPUs (one job per core)
  – Increasing memory needs
  – Increasing number of independent readers/writers to disk
  – Increasing number of incoherent jobs competing for resources
Must learn to use all cores from a single job (a minimal sketch follows this slide)
  – Will provide many opportunities for optimisation
Global namespace
  – Look for viable implementations
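
A minimal sketch, not experiment code, of the "one job using all cores" idea: a single process drives a worker pool over a batch of events, instead of N independent single-core jobs competing for memory and disk. The process_event function is a stand-in for real work.

```python
# One "job" that uses all available cores via a worker pool.
from multiprocessing import Pool, cpu_count

def process_event(event_id):
    # placeholder for real reconstruction/analysis work on one event
    return sum(i * i for i in range(1000)) % (event_id + 1)

if __name__ == "__main__":
    events = range(10000)
    with Pool(processes=cpu_count()) as pool:
        # the pool shares one input dataset and one output stream,
        # avoiding many incoherent jobs competing for resources
        results = pool.map(process_event, events, chunksize=100)
    print("processed", len(results), "events")
```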

Second Day

File system tests and summary of the IEEE MSST symposium
The HEPiX storage WG is benchmarking several storage systems against CMS (serial access) and ATLAS (random access) applications
  – AFS, GPFS, Lustre, xrootd, dCache, Hadoop
  – Metrics are events/s and MB/s (a toy computation of these metrics is sketched below)
Results "today" (work in progress)
  – GPFS excellent (but expensive)
  – AFS VICE/GPFS and AFS VICE/Lustre excellent
Highlights from the discussions at IEEE MSST at Lake Tahoe
  – HEP data management is too complex, unreliable, non-standard, not reusable and expensive to manage
  – Should use standard protocols and building blocks
  – NFS 4.1 is very attractive
  – SSDs are not yet ready for production
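
A toy sketch of how the two benchmark metrics above are computed; in the real tests the timing and byte counts come from the CMS and ATLAS applications, while here a hypothetical read_event callback stands in for them.

```python
# Toy computation of the events/s and MB/s metrics.
import time

def run_benchmark(read_event, n_events):
    """read_event(i) -> number of bytes read for event i (hypothetical callback)."""
    start = time.perf_counter()
    total_bytes = sum(read_event(i) for i in range(n_events))
    elapsed = time.perf_counter() - start
    return {
        "events_per_s": n_events / elapsed,
        "MB_per_s": total_bytes / elapsed / 1e6,
    }

# usage with a fake reader that pretends each event is 100 kB
print(run_benchmark(lambda i: 100_000, n_events=10000))
```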

NFS 4.1 and xrootd
NFS 4.1 is very attractive
  – Copes well with high latency, full security, standard protocol, pNFS is scalable, industry support, available in the OS, funded by EMI, simple migration path (for dCache), ...
xrootd is a well-established solution for HEP use cases
  – Well integrated with ROOT (see the sketch below)
  – Catalogue consistency by definition
  – Seamless data access via LAN and WAN
  – Strongly plug-in based
Support is best effort by experts
Protocol partially coupled with the implementation?
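
A minimal PyROOT sketch of the "seamless LAN/WAN access" point: the same TFile.Open call works for a local path or a root:// URL served by xrootd. The server name, file path and tree name below are made up for illustration.

```python
# Open a remote file over the xrootd protocol from ROOT (hypothetical URL).
import ROOT

url = "root://xrootd.example.org//store/data/file.root"  # hypothetical
f = ROOT.TFile.Open(url)
if f and not f.IsZombie():
    tree = f.Get("Events")          # hypothetical tree name
    print("entries:", tree.GetEntries() if tree else "no such tree")
    f.Close()
else:
    print("could not open", url)
```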

FTS and the file catalogues
FTS limitations
  – The channel concept is becoming insufficient for any-to-any transfers: abandon it?
  – Easy to overload the storage
  – The FTS server depends on the link: use message queues to submit anywhere?
  – Have the system choose the source site?
  – Allow partial transfers to be restarted?
LFC: the main issue is consistency (true for any catalogue)
  – Solve it with a messaging system between the catalogues and the SEs? (a sketch follows this slide)
AliEn File Catalogue
  – Provides a UNIX-like global namespace with quotas
  – Includes a powerful metadata catalogue
A comparison between the LFC and the AliEn FC is missing
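
A toy sketch of the messaging idea above: the storage element publishes file events and the catalogue consumes them to stay consistent. A real deployment would use a message broker; here a plain queue.Queue and a dict stand in for the broker and the catalogue, and all names are illustrative.

```python
# Catalogue-SE synchronisation via messages (illustrative only).
import queue

events = queue.Queue()          # stand-in for a broker topic
catalogue = {}                  # stand-in for catalogue entries: LFN -> SURL

def se_delete(lfn, surl):
    """Storage element removes a replica and publishes the event."""
    events.put({"type": "deleted", "lfn": lfn, "surl": surl})

def catalogue_consumer():
    """Catalogue applies events so it never advertises replicas that are gone."""
    while not events.empty():
        msg = events.get()
        if msg["type"] == "deleted":
            catalogue.pop(msg["lfn"], None)

catalogue["/grid/cms/run1.root"] = "srm://se.example.org/cms/run1.root"
se_delete("/grid/cms/run1.root", "srm://se.example.org/cms/run1.root")
catalogue_consumer()
print(catalogue)   # {} -> catalogue and SE are consistent again
```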

CDN and SRM
Investigate Content Distribution Networks
  – A network of disk caches from which files are read
  – Use Distributed Hash Tables for cache resolution (a sketch follows this slide)
  – CoralCDN is a popular CDN, but it has no security
SRM problems
  – Protocol development was rushed
  – Overly complex space management
  – Incoherent implementations
  – Addresses both data management and data access
SRM future: a subset of it?
  – Drop data access
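
A toy sketch of DHT-style cache resolution as mentioned above: each file name is hashed onto a ring of cache nodes, so any client can compute which cache to ask without a central index. The node names are made up; a real DHT also handles node churn and replication.

```python
# Consistent-hashing ring for cache resolution (illustrative only).
import bisect
import hashlib

def h(key):
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        self.ring = sorted((h(n), n) for n in nodes)
        self.keys = [k for k, _ in self.ring]

    def node_for(self, filename):
        i = bisect.bisect(self.keys, h(filename)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["cache01.example.org", "cache02.example.org", "cache03.example.org"])
print(ring.node_for("/grid/atlas/data10/file1.root"))
print(ring.node_for("/grid/atlas/data10/file2.root"))
```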

ROOT and ARC
Smart caching makes access via the WAN possible and efficient
  – The TTreeCache reduces the number of network transactions by a large factor
  – A cache on the local disk (or a proxy) would further improve performance
  – Also optimising for multicore machines
In ARC, the CE can cache files
  – Can schedule dataset transfers on demand
  – File locations in the caches are stored in a global index using Bloom filters (a sketch follows this slide)
  – But with a probability of some cache misses (false positives in the filter)
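
An illustrative Bloom-filter cache index in the spirit of the ARC approach described above (not ARC's actual code): a site publishes a compact bit array summarising its cache content, and lookups may return false positives, which show up as occasional cache misses, but never false negatives.

```python
# Minimal Bloom filter used as a cache-content index (illustrative only).
import hashlib

class BloomFilter:
    def __init__(self, size_bits=8192, n_hashes=4):
        self.size = size_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        for i in range(self.n_hashes):
            digest = hashlib.sha1(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def maybe_contains(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

site_cache = BloomFilter()
site_cache.add("/grid/atlas/data10/file1.root")
print(site_cache.maybe_contains("/grid/atlas/data10/file1.root"))   # True
print(site_cache.maybe_contains("/grid/atlas/data10/file2.root"))   # almost certainly False
```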

xrootd at KIT and StoRM/GPFS/TSM at CNAF
KIT is a successful example of integrating xrootd with a tape system
  – Scalability, load balancing, high availability
  – Also integrated with the BeStMan SRM and GridFTP
CNAF has completely moved to GPFS for storage
  – StoRM for SRM, GridFTP, TSM for tape, xrootd for ALICE
  – GPFS is complex but extremely powerful
  – Performance is most satisfactory

HDFS at Tier-2s
The filesystem component of Hadoop
  – Strong points are fault tolerance, scalability and easy management
  – Aggregates the worker node disks
Integrated with the BeStMan SRM, GridFTP, xrootd and FUSE (see the sketch below)
No security
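
A minimal sketch of the FUSE integration point above: once HDFS is mounted (the mount point and file path below are hypothetical), jobs read files with ordinary POSIX I/O while HDFS handles replication and block placement underneath.

```python
# Plain POSIX read through a hypothetical HDFS FUSE mount.
import os

MOUNT = "/mnt/hadoop"                              # hypothetical mount point
path = os.path.join(MOUNT, "store/user/example/file.root")

if os.path.exists(path):
    with open(path, "rb") as f:                    # ordinary read; no HDFS API needed
        header = f.read(1024)
    print("read", len(header), "bytes")
else:
    print(path, "not found (is the FUSE mount available?)")
```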

Conclusion
What now?
  – Define demonstrators and corresponding metrics for success
  – Define a plan including resource needs and milestones
  – Track progress (TWiki, GDB meetings)
  – Conclusions by the end of 2010

Demonstrators
ATLAS Tier-3s: use local storage as a cache via xrootd
Use CoralCDN for a proxy network, accessed over HTTP via ROOT
PanDA dynamic data placement: trigger on-demand replication to Tier-2s and/or queue jobs where the data are
  – Study an algorithm to make the optimal choice
xrootd redirector layered on other SEs
ARC caching
Messaging for catalogue-SE synchronisation
Comparative study of the catalogues
Proxy caches in ROOT
NFS 4.1
…