BNL Box Hironori Ito Brookhaven National Laboratory

Slides:



Advertisements
Similar presentations
Objectives Overview Define an operating system
Advertisements

IPads in Education Part 2: Apps for Teachers Dr. Gita Phelps Information Systems & Computer Science GCSU.
Amazon EC2 Quick Start adapted from EC2_GetStarted.html.
A crash course in njit’s Afs
Section 6.1 Explain the development of operating systems Differentiate between operating systems Section 6.2 Demonstrate knowledge of basic GUI components.
Hands-On Microsoft Windows Server 2008 Chapter 1 Introduction to Windows Server 2008.
SharePoint and SharePoint Online: Today and what's next? Presented by Luke Abeling – IT Platforms.
Explain the purpose of an operating system
DCE (distributed computing environment) DCE (distributed computing environment)
Component 4: Introduction to Information and Computer Science Unit 4: Application and System Software Lecture 3 This material was developed by Oregon Health.
Introduction to U.S. ATLAS Facilities Rich Baker Brookhaven National Lab.
Introduction Thomson Chan Rosaryhill School
Campus File Services Past the Traditional File Server.
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES Data Replication Service Sandeep Chandra GEON Systems Group San Diego Supercomputer Center.
FCNS – Technology NEW TECHNOLOGY SOFTWARE: COMING SOON!
Introduction to dCache Zhenping (Jane) Liu ATLAS Computing Facility, Physics Department Brookhaven National Lab 09/12 – 09/13, 2005 USATLAS Tier-1 & Tier-2.
Store your files in the sky Intro to Cloud file storage.
1 NETE4631 Working with Cloud-based Storage Lecture Notes #11.
Securely Synchronize and Share Enterprise Files across Desktops, Web, and Mobile with EasiShare on the Powerful Microsoft Azure Cloud Platform MICROSOFT.
File sharing requirements of remote users G. Bagliesi INFN - Pisa EP Forum on File Sharing 18/6/2001.
Module 4 Planning for Group Policy. Module Overview Planning Group Policy Application Planning Group Policy Processing Planning the Management of Group.
BOINC: Progress and Plans David P. Anderson Space Sciences Lab University of California, Berkeley BOINC:FAST August 2013.
Future home directories at CERN
Sync and Exchange Research Data b2drop.eudat.eu This work is licensed under the Creative Commons CC-BY 4.0 licence B2DROP EUDAT’s Personal.
Chapter 8: Installing Linux The Complete Guide To Linux System Administration.
Chapter 9 Operating Systems Discovering Computers Technology in a World of Computers, Mobile Devices, and the Internet.
Introduction TO Network Administration
RHIC/US ATLAS Tier 1 Computing Facility Site Report Christopher Hollowell Physics Department Brookhaven National Laboratory HEPiX Upton,
What’s New Data Loss Prevention 14. Information is Everywhere Brings Productivity, Agility, Convenience ……and Problems Copyright © 2015 Symantec Corporation.
ITS Lunch & Learn November 13, What is Office 365? Office 365 is Microsoft’s software as a service offering. It includes hosted and calendaring.
CloudBerry Explorer for S3. CB Explorer Free to use Browse and manage files PowerShell functions Open and edit files  CloudBerry Explorer is an easy.
Tech Tuesday.  Dropbox is a big name in cloud storage, having become one of the most frequently used file sharing platforms in the world. With improvements.
WHAT IS CLOUD COMPUTING? Pierce County Library System.
CHAPTER 7 Operating System Copyright © Cengage Learning. All rights reserved.
DISCOVERING COMPUTERS 2018 Digital Technology, Data, and Devices
Compute and Storage For the Farm at Jlab
2nd year Computer Science & Engineer
Unit 3 Virtualization.
File Syncing Technology Advancement in Seafile -- Drive Client and Real-time Backup Server Johnathan Xu CTO, Seafile Ltd.
Chapter 6: Securing the Cloud
File Management in the Cloud
Chapter 1: Introduction
Introduction to Operating Systems
Lecture 1-Part 2: Operating-System Structures
Web Application.
Section 6 Object Storage Gateway (RADOS-GW)
Jenny Pange University of Ioannina
BNL Tier1 Report Worker nodes Tier 1: added 88 Dell R430 nodes
Amazon Storage- S3 and Glacier
Virtual Network Computing
Experiences with http/WebDAV protocols for data access in high throughput computing
Simple Storage Service
ATLAS Sites Jamboree, CERN January, 2017
Artem Trunov and EKP team EPK – Uni Karlsruhe
Store, Share, Sync and Collaborate
XROOTd for cloud storage
Brookhaven National Laboratory Storage service Group Hironori Ito
Cloud based Open Source Backup/Restore Tool
TYPES OFF OPERATING SYSTEM
Introduction to Cloud Computing
Chromebooks and Cloud Computing
Chapter 4.
Interoperability of Digital Repositories
Face-Off: Cloud Storage Alternatives
SUSE Linux Enterprise Desktop Administration
Software - Operating Systems
Introducing Windows Operating Systems
Web Servers (IIS and Apache)
IT Office hours – 1 Data Sharing 101
Windows 10 An Operating System
Presentation transcript:

BNL Box Hironori Ito Brookhaven National Laboratory Spring HEPiX, Budapest, Hungary April, 2017

Concept All of us need the convenient method to transfer or access data in different systems Users might need to copy their analysis scripts and the data between their workstations and central analysis farm separated by different network and firewalls System administrators might need to transfer custom software packages to their systems for installations. In BNL RACF, AFS has been the storage of choice for moving small amount of data in/out of various systems. AFS limitation Being phased out Not really universally accessible. Not easiest one to use in various platform. Commercial cloud storage seems to be popular among some of users and sys-admins. Dropbox, Box, Amazon Cloud Drive, Google Drive, MS OneDrive, etc… Advantages of commercial cloud storage Already available for use Easy to use. All of then provide https- based access. Free (up to some level) Available in various platforms. Limitations Size/Cost/Performance. Archive Not really meant to stream data

Target users All users of BNL HEP/Nuclear physics comminities Sys-admins Users from different science domains than HEP NSLS-II (National user facility) Massive data producers for many beamlines by many users. Nano Center (National user facility) Another large data producers. Chemistry Biology Etc…

Target usage Transfer small data in & out of BNL between central interactive farms, workstations, laptops, and tablets/smart phones. Transfer large data in & out of BNL between detector data stores, central storage, remote storage of users. Access data to/from analysis computing farm Copy to scratch Stream data Archive data

BNL BOX Owncloud Software Clients are available in many popular platforms; Linux, Mac OS, MS Win, Android and IOS Extremely easy to use. Synchronize data automatically NOTE: Requires the same amount of storage in local and remote storage. Quota for each users Users can share data Ceph Storage Currently Infernalis. Targeting Kraken. Reliability CephFS 3.8PB Raw -> 7.5PB by the end of 2017 Performance 40Gbps for BNL Box

Diagram Owncloud CephFS Laptop / Desktop App Mobile APP Worker Nodes MS Win / Mac / Linux Worker Nodes Mobile APP Android / iOS Owncloud Custom easy-to-use copy-command Request to restore the data from archive Tape Restore Control Custom XROOTd Name-to-name Translation to isolate a user to own data XROOTd CephFS dCache HPSS Tape Archive

WebDAV access and Sync Default sync app seems to synchronize data at the top rate of about 100MB/s per client. (100MB/s = 360GB/Hr = 8.6TB) Sufficient for small data ~ less than TB. Most users won’t need or physically have higher throughputs in their systems. Spinning DiskIO on desktop(~100MB/s). Wifi N (max 300Mbps~40MB/s) LAN (1Gbps=120MB/s) Disks are not much larger (currently max at about 10TB) High demand users require higher throughput. 10TB or more. Owncloud supports standard WebDAV protocol Easy to write a custom copy tool. Easily achieve 150MB/s per single file transfer. Concurrent multiple transfer of files will results in obtaining desired throughputs. NOTE: Different SSL library seem to impact the observed throughput of WebDAV command. For an example, “curl” in RHEL 7 is compiled with NSS. This version of “curl” produces 1/5 of throughput of “curl” using OpenSSL.

Stream Access XROOTd and WebDAV can stream data Would like to separate the data-sync operations from the data-read access as much as reasonably possible. XROOTd can cleverly map user data in BNL Box in a very simple way. Owncloud web URL maps a user data by https://host/owncloud/index.php/apps/files/MYDATA This is different from how Owncloud physically stores user data in its storage as /base-directory/username/files/MYDATA XROOTd can clervery hide “username” of physical files by providing access by root://host/files/MYDATA Courtesy of Andrew Hanushevsky from XROOTd

Archive data Some users would like to archive or store data in the tape system. Will the data be read again? Difficulties Efficiency Read throughput Reading small fraction in many different tapes will results in low throughput. Seek is slow. Mounting a tape is very slow. Must write in a particular way to produce the good read-IO. Rule “/Tapes/” directory will be used to indicate data to be stored to the tape system. Files small than certain size (1GB) will be tarred to produce a large file Tar files smaller than 1GB will be archived to tape only after certain period. Once files are transferred to the archival system, they will be removed from “/Tapes/” directory. Reduce the usage of quota. Create index or individual local catalog file to record the data in the archival system. The above index will be synchronized by the owncloud to their local machine. Also update the central catalog for archived data Restore requests will be made through Web interface. Data will be restored to different directory.

Sample images Users only see their own directory.

Share data Users can share their data publicly or privately with password.

Users decide what to sync Using the provided app, users can decides what to sync automatically. For an example Data and Tapes directories are not synchronized. Codes, Documents, Photos directories are synchronized automatically. Desktop/Laptop apps are available in MS Win, Mac and Linux. The performance seems to be limited to the maximum of 100MB/s.

Conclusion Cloud storage could be potentially useful for data intensive scientific communities. BNL Box will provide our users with ability to store and access their data anywhere by the easy-to-use applications on various platforms. BNL Box allows the owners of the data to share with anyone without involvement of the system administrator.