Ian Bird, CERN & WLCG — CNAF, 19th November 2015




Presentation transcript:

Storage Evolution
Ian Bird, CERN & WLCG
CNAF, 19th November 2015

High Level View
- Today we have ~500 PB of storage (50% tape, 50% disk); ~60-70% of global WLCG spending is on disk.
- BUT: much of the data on disk is replicated very many times and often never touched.
- Experiments find it difficult to make use of dispersed disk pools; they would rather have fewer, larger pools, and networks now allow this.
- Can we afford more compute by saving on disk?
- Computing models are evolving to allow far more remote data access; e.g. our current use of commercial clouds relies on this.
- Anticipate the roles of centres evolving along these lines: smaller Tier 2s may become essentially "diskless" (except for cache space).

Storage Evolution for WLCG
SRM
- SRM is more and more only being used to interact with tape.
- Disk and tape systems are becoming separated (user groups schedule their accesses).
Disk pools
- Different cases:
  - Large pools (dCache, EOS; today also DPM, StoRM): medium-term storage, used for highly accessible and frequently accessed data.
  - Smaller "caches" (today DPM, etc.): used for temporary space, i.e. job caching for processing and analysis.
- A future model may emphasise large (semi-permanent) pools, which need real management tools; caches could simply be staging space accessed via HTTP etc.

Access Protocols
- Today: SRM, rfio, dcap, gridftp, xroot, …
- Can we move towards xroot and HTTPS (plus SRM, if still needed for tape)?
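As a rough illustration of that direction, the sketch below fetches the same replica once over xroot and once over HTTPS. It assumes the standard xrootd (xrdcp) and Davix (davix-get) command-line clients are installed; the endpoint and paths are hypothetical placeholders, not real WLCG URLs.

```python
#!/usr/bin/env python
# Illustrative only: fetch the same file via xroot and via HTTPS/WebDAV.
# The endpoints and paths are hypothetical placeholders; assumes the xrdcp
# and davix-get clients are available on the worker node.
import subprocess

XROOT_URL = "root://eospublic.example.org//eos/experiment/data/file.root"   # hypothetical
HTTPS_URL = "https://eospublic.example.org//eos/experiment/data/file.root"  # hypothetical

# xroot access, as used today for high-performance analysis I/O
subprocess.check_call(["xrdcp", XROOT_URL, "/tmp/file_xroot.root"])

# the same file over HTTPS/WebDAV, the direction proposed on the slide
subprocess.check_call(["davix-get", HTTPS_URL, "/tmp/file_https.root"])
```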

Other interesting topics
- Sync'ing: CERN strategy for home directories / CERNBox / physics data.
- Future of AFS: its main use case is replaced by CVMFS, and no other remaining uses require its global nature; it will never be IPv6 compliant, so it has a limited future.
- Use of federated identities for access (eduGAIN).

Vision for data management
- Examples from LHC experiment data models.
- Two building blocks to empower data processing:
  - Data pools with different qualities of service
  - Tools for data transfer between pools

A flexible architecture
- Disk pools (EOS): arbitrary reliability, arbitrary performance, high scalability; designed to sustain 1 kHz access; main platform for physics analysis at CERN; standard access interfaces.
- Tapes (CASTOR Tape Archive): high reliability, data preservation, low cost, high throughput.
- Together they cover the entire area of requirements, from "reliable and cheap (high latency)" through "fast and reliable (expensive)" to "fast and cheap (unreliable)".
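The three corners of that requirement space can be summarised as a small lookup table. This is purely illustrative and simply restates the slide; the class names and attributes are not an actual EOS or CASTOR configuration.

```python
# Illustrative only: the quality-of-service corners named on the slide,
# expressed as a lookup table (not an actual EOS/CASTOR configuration).
QOS_CLASSES = {
    "reliable-and-cheap": {"latency": "high", "cost": "low",  "reliability": "high",
                           "example": "tape archive (CASTOR Tape Archive)"},
    "fast-and-reliable":  {"latency": "low",  "cost": "high", "reliability": "high",
                           "example": "EOS disk pool for analysis"},
    "fast-and-cheap":     {"latency": "low",  "cost": "low",  "reliability": "low",
                           "example": "scratch / cache space"},
}

def pick_class(need_low_latency: bool, need_durability: bool) -> str:
    """Toy selector: map workload needs onto one of the three QoS corners."""
    if not need_low_latency:
        return "reliable-and-cheap"
    return "fast-and-reliable" if need_durability else "fast-and-cheap"
```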

Storage services for HEP: achieved, and planned improvements
- Meet (and anticipate) the requirements of the LHC experiments; fast response to changing requirements.
- Arbitrary reliability / arbitrary performance.
- Binary preservation, data archives, data integrity.
- Unlimited scalability (objects, number of files, metadata, …).
- Standard access methods and tools: HTTP protocol, mounted file system, tools for data transfer and synchronization (high throughput, high latency, unreliable networks, …).
- Xroot guarantees maximum resource efficiency and performance.
- Ensures that our services can directly use standard industrial services, and that other sciences can directly use our scientific services.

A global storage service…
- Access to the whole storage repository: all physics (and user) data at CERN, plus access to WLCG / HTTP federated storage worldwide.
- Mounted online file access: on Linux and Mac via FUSE or WebDAV; on Windows via WebDAV (…).
- Synced offline file access: supported on all platforms (Linux, Mac, Windows, iOS, Android).
- Web browser file access: available anywhere there is internet connectivity; easy sharing to unregistered identities, read/write sharing, bulk download.
- Programmatic, high-performance, custom access: xroot (and HTTP) libraries for all relevant programming languages (see the sketch below).
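For the programmatic-access point, a minimal sketch assuming the XRootD Python bindings and the requests library are available; the server name and file path are hypothetical placeholders, and authentication is omitted.

```python
# Illustrative only: programmatic access to the same file over xroot and HTTP.
# Assumes the XRootD Python bindings and 'requests'; endpoint and path are
# hypothetical placeholders, and grid/SSO authentication is omitted.
from XRootD import client
import requests

PATH = "//eos/experiment/data/file.root"           # hypothetical
XROOT_URL = "root://eospublic.example.org" + PATH  # hypothetical endpoint
HTTPS_URL = "https://eospublic.example.org" + PATH

# xroot: open the remote file and read the first kilobyte
f = client.File()
status, _ = f.open(XROOT_URL)
if status.ok:
    status, data = f.read(offset=0, size=1024)
    print("xroot read", len(data), "bytes")
    f.close()

# HTTP(S): a ranged GET for the same bytes
resp = requests.get(HTTPS_URL, headers={"Range": "bytes=0-1023"})
print("HTTP status", resp.status_code, "got", len(resp.content), "bytes")
```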

Architecture
[Architecture diagram] Clients reach the storage through several paths: web-browser access via the CERNBOX web interface; offline (synced) access via the CERNBOX sync client, which talks HTTP using the OC sync protocol (Etag, X-OC-Mtime, If-Match, OC-FileId); and online access over HTTP, FUSE, WebDAV and xroot from clusters (Batch, Plus, Zenodo, Analytics, …) and application interfaces. All paths converge on the disk pools (EOS), backed by the Castor Tape Archive.
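To make the sync-protocol headers in the diagram concrete, here is a minimal sketch of an ownCloud-style upload over WebDAV. The endpoint, credentials and Etag value are hypothetical placeholders; this illustrates the generic ownCloud sync protocol rather than any CERNBOX-specific behaviour.

```python
# Illustrative only: the sync headers named in the diagram (Etag, X-OC-Mtime,
# If-Match, OC-FileId) attached to a WebDAV PUT. Endpoint, credentials and
# the Etag value are hypothetical placeholders.
import os
import requests

URL = "https://cernbox.example.org/remote.php/webdav/docs/notes.txt"  # hypothetical
local_path = "notes.txt"

with open(local_path, "rb") as fh:
    resp = requests.put(
        URL,
        data=fh,
        headers={
            "X-OC-Mtime": str(int(os.path.getmtime(local_path))),  # preserve local mtime
            "If-Match": '"previously-seen-etag"',  # only overwrite the version we last saw
        },
        auth=("user", "app-password"),  # hypothetical credentials
    )

# On success the server reports the new Etag and the file id (OC-FileId)
print(resp.status_code, resp.headers.get("ETag"), resp.headers.get("OC-FileId"))
```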

CERN Plans for 2016
- Operate storage services at the highest quality and availability.
- EOS as the strategic storage platform: new, scalable name server (+2 orders of magnitude); performance boost from the new FUSE mount interface (see the sketch below).
- CERNBOX as the strategic service to access EOS storage: new home directory service; offer migration paths out of AFS (90% could go to EOS).
- Evolve Castor to become "tape only": CTA, the "CASTOR/CERN Tape Archive"; disk used only for internal data staging and buffers.
- Make the CERN storage infrastructure a showcase for scientific computing.
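On the FUSE point: once EOS is FUSE-mounted, analysis code can treat it as an ordinary POSIX tree. A minimal sketch, with a hypothetical mount point and file path.

```python
# Illustrative only: with EOS FUSE-mounted, files are read as plain POSIX paths.
# The mount point and file path below are hypothetical placeholders.
import os

path = "/eos/experiment/data/file.root"  # hypothetical path under the FUSE mount

if os.path.exists(path):
    with open(path, "rb") as fh:
        header = fh.read(1024)           # plain POSIX read through the FUSE layer
    print(f"read {len(header)} bytes from {path}")
else:
    print("EOS FUSE mount not available on this node")
```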

CERN Plans for 2016 (continued)
- HTTP as a strategic data access protocol; XROOT remains for "custom high performance" access; better and improved data transfer agents (FTS, …).
- CEPH as the backend storage solution: based on commodity hardware; used by OpenStack, filers, ….
- Hadoop: consolidate the existing clusters; Hadoop on EOS? Integrate with CERNBOX? Collaboration with PH-SFT on "analysis from the web browser".
- Openlab: new collaborations in 2015 (Seagate, Comtrade); new collaborations in the pipeline (DSI (?), Cloudera, …).

FTS
- Run 2 production: > 2 million transfers/day.
- Continuous improvements: functionality (driven by the experiments) and performance (development and infrastructure support).
- WLCG deployments at CERN, RAL, BNL and Fermilab: more transfers than FTS2 with fewer instances; zero config.
- Prototype implementation of federated identity management, with WebFTS as a demonstrator; crucial for the future of grid services.
- CERNBox integration; S3 support demonstrated by BNL; DynaFed as a data bridge.
- Plans: intelligent transfer optimisation using advanced analytics techniques; federated identity; user-driven improvements; embedded FTS, i.e. FTS as an internal component of complex data and storage systems (a hedged submission sketch follows).
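For context, a hedged sketch of an FTS3 job submission, assuming the fts3-rest Python "easy" bindings and a valid X.509 proxy; the FTS endpoint and the source/destination URLs are hypothetical placeholders.

```python
# Illustrative only: submitting a single transfer through FTS3, assuming the
# fts3-rest Python "easy" bindings and a valid X.509 proxy. The FTS endpoint
# and the source/destination URLs are hypothetical placeholders.
import fts3.rest.client.easy as fts3

context = fts3.Context("https://fts3.example.org:8446")  # hypothetical FTS3 REST endpoint

transfer = fts3.new_transfer(
    "gsiftp://source-se.example.org/path/file.root",   # hypothetical source replica
    "https://dest-se.example.org/path/file.root",      # hypothetical destination (HTTP/WebDAV)
)
job = fts3.new_job([transfer])

job_id = fts3.submit(context, job)
print("submitted FTS job", job_id)
print(fts3.get_job_status(context, job_id)["job_state"])
```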

DPM and LFC
- 188 instances managing 18% of the WLCG data volume.
- DM-lite (DPM core): performance improvements; configuration support for sites; SRM-free operation; HTTP interface. Focus: ease of maintenance and operation.
- LFC has been selected by Belle II (against our advice ;-) ); MoU signed.
- Plans: remove legacy code to ease maintenance and reduce potential security risks; explore different strategies for caching and federated access, with the long-term goal of gradually converting DPMs into caches.

Access components: Dynafed and Davix
- Dynafed: used by Canada ATLAS and LHCb for data federation; volunteer computing for data-bridge functionality; at BNL as a bridge to S3 storage; cloud integration (Vcycle for status reports).
- Davix: completion of cloud support (S3, Azure, Ceph, …); verified performance on par with xroot for analysis (Johannes Elmsheuser for ATLAS at CHEP).
- Plans: depend on uptake; reach out to other sciences.

Convergence
- For smaller sites, envisage (as also requested by the experiments) that less storage is located there, mainly for caching and staging (see the sketch below).
- Currently existing DPM instances can become caches; a more lightweight "DPM".
- Also trial EOS WAN clusters, i.e. a single head node with remote pools, which reduces management needs at remote sites.
- Eventually converge these two views.
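A rough sketch of the simplest form such a site-local cache could take: check a local staging area and, on a miss, stage the file over HTTP from a larger pool. The paths, endpoint and helper function are hypothetical; a real cache (DPM-based or otherwise) would add eviction, authentication and consistency handling.

```python
# Illustrative only: the simplest "cache at the small site" pattern, staging
# files over HTTP on a miss. Paths and the remote endpoint are hypothetical;
# real caches would add eviction, auth and consistency checks.
import os
import shutil
import requests

CACHE_DIR = "/var/cache/site-storage"                  # hypothetical staging area
REMOTE_BASE = "https://big-pool.example.org/eos/data"  # hypothetical large disk pool

def fetch_cached(relative_path: str) -> str:
    """Return a local path for the file, staging it over HTTP if not cached."""
    local_path = os.path.join(CACHE_DIR, relative_path)
    if not os.path.exists(local_path):
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        with requests.get(f"{REMOTE_BASE}/{relative_path}", stream=True) as resp:
            resp.raise_for_status()
            with open(local_path, "wb") as out:
                shutil.copyfileobj(resp.raw, out)      # stream into the staging area
    return local_path

# jobs then open the returned local path exactly as they would a managed pool
print(fetch_cached("experiment/run1234/file.root"))
```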

Trends: data centres
- Moving data around the world to hundreds of sites is unnecessarily expensive.
- Much better to have large-scale data centres (still distributed, but O(10) rather than O(100)), connected via very high bandwidth networks.
- Bulk processing capability should be located close or adjacent to these.
- Data access via the network, but in a truly "cloud-like" way: don't move data out except for the small data end-products.

A Virtual Data Centre for Science?
[Diagram: a handful of data centres interconnected at 1-10 Tb/s, with attached compute, simulation factories/HPC, external instruments and clients.]
- Data centres are primarily academic, allowing hybrid deployments; data is replicated across all of them.
- Compute: commercial or private, chosen on cost/connectivity.
- Data input from external sources: instruments, simulation factories.
- Clients see it as a single data centre and access data products, either existing or produced on demand.
- Allows new types of analysis models.

Planning for HL-LHC
- Foresee two technical activities:
  - Medium term (Run 2, Run 3): evolution of the current models and infrastructure; set up technical groups to study the various aspects.
  - Longer term (Run 4): need some brainstorming of new ideas, since the current models will not be sufficient.
- Use the WLCG workshop in February in Lisbon to initiate these activities: one day on each topic; set up the groups.
- Anticipate being reviewed on this by the LHCC starting in 2016; computing is an important component of the upgrade costs.