Federated Data Stores: Volume, Velocity & Variety
Future of Big Data Management Workshop, Imperial College London, June 27-28, 2013
Andrew Hanushevsky, SLAC

Presentation transcript:


Big Data Access & The 3 V's
- Volume
  - Increasing amount of data
  - No single site can host all of the data
- Velocity
  - Increasing number of analysis jobs
  - No single site can host all of the jobs
- Variety
  - Increasing number of sites
  - Introduces many different storage systems

Data & Access & The World
- Data
  - Many places
  - Complete subsets
  - Sometimes not
- Compute
  - Many places
  - Data co-located
  - Sometimes not
- Data is distributed and often replicated, largely driven by computational needs

Multiple Sites – Unified View
- Reality check…
  - Multiple sites
  - Different administrative domains
- How to logically combine all the storage?
  - Provide storage access across multiple sites
  - Requires a minimal set of rules
    - Intersecting security model
    - Promise of minimal service

Data Storage Federations
"A collection of disparate space resources managed by co-operating but independent administrative domains transparently accessible via a common name space."
- Unifies storage access
- Independent of data and compute location

A Solution Using XRootD
- A system for scalable cluster data access
  - Not a file system
  - Not just for file systems (to handle variety)
- Used in HEP and astrophysics
- Components shown on the slide: xrootd, cmsd

XRootD Synergistic Approach
- Minimize latency
- Minimize hardware requirements
- Minimize human cost
- Maximize scaling
- Maximize utility
[Figure labels: Velocity, Volume, Variety]

Variety Via Plug-In Architecture
[Figure: plug-in layers and example plug-ins]
- Protocol Driver: any n protocols
- Protocol: cms, http, xroot, …
- Authentication: krb5, sss, x.509, …
- Logical File System: dpm, sfs, sql, …
- Authorization: entity names
- Clustering (cmsd)
- Storage System: HDFS, gpfs, Lustre, UFS, …
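The layering on this slide can be illustrated with a small plug-in registry. This is only a sketch of the general idea in Python: the layer and plug-in names echo the slide, but `REGISTRY`, `register`, `build_stack`, and the classes are hypothetical and do not reflect XRootD's actual plug-in interfaces.

```python
from typing import Callable, Dict

# layer name -> plug-in name -> factory (all names here are illustrative)
REGISTRY: Dict[str, Dict[str, Callable[[], object]]] = {}


def register(layer: str, name: str):
    """Class decorator that files a plug-in under a layer and a name."""
    def decorator(factory):
        REGISTRY.setdefault(layer, {})[name] = factory
        return factory
    return decorator


@register("storage", "ufs")
class UnixFileStore:
    def describe(self) -> str:
        return "local Unix file system"


@register("storage", "hdfs")
class HdfsStore:
    def describe(self) -> str:
        return "HDFS back end"


@register("authentication", "krb5")
class Krb5Auth:
    def describe(self) -> str:
        return "Kerberos 5 authentication"


def build_stack(config: Dict[str, str]) -> Dict[str, object]:
    """Instantiate one plug-in per layer, as selected by `config`."""
    stack = {}
    for layer, name in config.items():
        try:
            factory = REGISTRY[layer][name]
        except KeyError:
            raise ValueError(f"no plug-in {name!r} for layer {layer!r}") from None
        stack[layer] = factory()
    return stack


stack = build_stack({"storage": "hdfs", "authentication": "krb5"})
print(stack["storage"].describe())  # prints "HDFS back end"
```

Swapping `"hdfs"` for `"ufs"` in the configuration changes the storage back end without touching the other layers, which is the point the slide makes about handling variety.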

Volume Via B-64 Scaling
[Figure: a 64-ary tree of xrootd/cmsd node pairs, with the Manager (root node) at the top, Supervisors (interior nodes) in the middle, and Data Servers (leaf nodes) at the bottom. Each added level multiplies capacity by 64: 64^1 = 64, 64^2 = 4,096, 64^3 = 262,144, 64^4 = 16,777,216. Other labels on the diagram: Private Cluster, GCE Ephemeral Storage, SLAC.]
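The arithmetic behind the B-64 tree can be checked with a few lines of Python. This is a minimal sketch that assumes a uniform fan-out of 64 at every level, as the slide suggests; the function names are made up for illustration.

```python
FANOUT = 64  # per-node fan-out taken from the slide ("B-64")


def max_leaf_nodes(levels: int, fanout: int = FANOUT) -> int:
    """Upper bound on nodes reachable at depth `levels` below the root."""
    if levels < 0:
        raise ValueError("levels must be non-negative")
    return fanout ** levels


def levels_needed(servers: int, fanout: int = FANOUT) -> int:
    """Smallest depth whose capacity covers `servers` data servers."""
    if servers < 1:
        raise ValueError("servers must be at least 1")
    levels, capacity = 0, 1
    while capacity < servers:
        capacity *= fanout
        levels += 1
    return levels


print(max_leaf_nodes(3))    # prints 262144
print(max_leaf_nodes(4))    # prints 16777216
print(levels_needed(5000))  # prints 3
```

Because capacity grows as a power of 64, the tree depth grows only logarithmically with the number of data servers: 5,000 servers fit under three levels.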

WYSIWYG Scalable Access
[Figure: the client issues open() to the top of the xrootd/cmsd tree and is redirected down level by level (open(), redirect, open(), redirect, open()) until it reaches a data server. Capacity labels: 64^1 = 64, 64^2 = 4,096.]
- Request routing is very different from traditional data management models
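The open/redirect sequence in the figure can be mimicked with a toy model. The sketch below is not the XRootD protocol or API: `Node`, `client_open`, the host names, and the file paths are all hypothetical, and the recursive lookup stands in for whatever location mechanism the real cluster managers use.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list[Node] = field(default_factory=list)
    files: set[str] = field(default_factory=set)  # non-empty only on data servers

    def has_file(self, path: str) -> bool:
        """True if this node or any node below it holds `path`."""
        return path in self.files or any(c.has_file(path) for c in self.children)

    def open(self, path: str):
        """Return ("data", self) if served here, else ("redirect", child)."""
        if path in self.files:
            return ("data", self)
        for child in self.children:
            if child.has_file(path):
                return ("redirect", child)
        raise FileNotFoundError(path)


def client_open(manager: Node, path: str) -> list[str]:
    """Follow redirects from the manager; return the names of nodes contacted."""
    hops = []
    node = manager
    while True:
        hops.append(node.name)
        kind, target = node.open(path)
        if kind == "data":
            return hops
        node = target


leaf_a = Node("server-a", files={"/atlas/data/f1.root"})
leaf_b = Node("server-b", files={"/atlas/data/f2.root"})
supervisor = Node("supervisor-1", children=[leaf_a, leaf_b])
manager = Node("manager", children=[supervisor])

hops = client_open(manager, "/atlas/data/f2.root")
print(hops)  # prints ['manager', 'supervisor-1', 'server-b']
```

The client only ever asks to open a path; it never consults a central catalog. Each node it contacts either serves the file or points it one level closer to a server that can.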

Real World Example (HEP)
- Federated ATLAS XRootD (FAX)
- Independent sites federated by region
[Figure: regional federation diagram with labels a, b, c and the relation c = max(a, b). Graphic courtesy of Rob Gardner.]

ATLAS FAX Infrastructure (From Rob Gardner)
- Provides a global namespace
- Unifies dCache, DPM, Lustre/GPFS, Xrootd storage backends
- Xrootd: an efficient protocol for WAN access
- Main fall-back use case in production at many sites
- Regional redirection network provides lookup scalability
- A powerful capability which must be introduced to production carefully
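One reading of the fall-back use case is that a job tries its local storage first and goes to the federation only when the local open fails. The sketch below illustrates that pattern under this assumption; `open_with_fallback` and the two stand-in openers are hypothetical and do not call any real XRootD client library.

```python
from typing import Callable, Sequence, Tuple

Opener = Callable[[str], object]


def open_with_fallback(path: str, sources: Sequence[Tuple[str, Opener]]):
    """Try each (label, opener) in order; return (label, handle) on first success."""
    failures = []
    for label, opener in sources:
        try:
            return label, opener(path)
        except OSError as exc:
            failures.append(f"{label}: {exc}")
    raise OSError("all sources failed: " + "; ".join(failures))


# Stand-in openers for illustration only.
def open_local(path: str):
    raise FileNotFoundError(f"{path} not on local storage")


def open_via_federation(path: str):
    return f"remote handle for {path}"


source, handle = open_with_fallback(
    "/atlas/data/f1.root",
    [("local", open_local), ("federation", open_via_federation)],
)
print(source)  # prints "federation"
```

Ordering the sources from cheapest to most expensive keeps WAN traffic to the cases where local storage cannot serve the file.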

HEP Deployment
- LHC ALICE: data catalog driven federation
- LHC ATLAS: regional topology
- LHC CMS: uniform topology
- LSST (Large Synoptic Survey Telescope): clusters MySQL servers for parallel queries

Conclusion
- Federated storage is key for big data
  - Distributed management + uniform access
  - Preserves administrative autonomy
  - Inherently scalable
  - The whole is greater than the sum of its parts
- XRootD provides flexible federation
  - Addresses volume, velocity, and variety
  - Three main big data challenges

Acknowledgements
Current Software Contributors
- ATLAS: Doug Benjamin, Patrick McGuigan
- CERN: Lukasz Janyst, Andreas Peters, Justin Salmon
- Fermi: Tony Johnson
- JINR: Danila Oleynik, Artem Petrosyan
- Root: Gerri Ganis, Bertrand Bellenot, Fons Rademakers
- SLAC: Andrew Hanushevsky, Wilko Kroeger, Daniel Wang, Wei Yang
- UCSD: Matevz Tadel
- UNL: Brian Bockelman
- WLCG: Fabrizio Furano, David Smith
US Department of Energy Contract DE-AC02-76SF00515 with Stanford University