Slide 1: TeraGrid Data Plan/Issues
Phil Andrews
TeraGrid Quarterly, Sep '07

"In Xanadu did Kubla Khan / A stately pleasure-dome decree: / Where Alph, the sacred river, ran / Through caverns measureless to man / Down to a sunless sea." – Samuel T. Coleridge

Users' view: data is stored somewhere; it must always be available, there must always be room for more, it must be easy to access, and it should be fast.

Slide 2: Convenience requirements will always increase

Each generation of users requires more convenience than the last: thus we must always be adding new layers of software while maintaining and extending existing reliability and capability.

"Change is the only constant." – Heraclitus (c. 535–475 BC)

Slide 3: Major User Data Access/Transfer

- GridFTP: well established and non-controversial, but a bit clunky and not very user-friendly; requires scheduling extensions for parallel use (see the transfer sketch after this slide).
- WAN file systems: inherently parallel (given enough client nodes), convenient and intuitive; users like the idea (Zimmerman survey). A big impingement on the local-system world.
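To make the GridFTP path concrete, here is a minimal sketch of a parallel transfer driven from a user script. It assumes the Globus Toolkit's globus-url-copy client is installed and a valid proxy credential already exists; the endpoint host names and paths are hypothetical.

    # Sketch: parallel GridFTP third-party transfer between two TeraGrid endpoints.
    # Assumes globus-url-copy (Globus Toolkit) is installed and a grid proxy has
    # already been created; host names and paths are made up for illustration.
    import subprocess

    SRC = "gsiftp://gridftp.rp-a.example.org/gpfs-wan/user/run42/output.dat"
    DST = "gsiftp://gridftp.rp-b.example.org/archive/user/run42/output.dat"

    subprocess.run(
        [
            "globus-url-copy",
            "-vb",                  # report transfer performance as it runs
            "-p", "8",              # 8 parallel TCP streams
            "-tcp-bs", "4194304",   # 4 MiB TCP buffer for long, fat WAN links
            SRC, DST,
        ],
        check=True,
    )

The "-p" and "-tcp-bs" knobs are exactly the kind of thing the scheduling extensions mentioned above would choose automatically rather than leaving to the user.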

Slide 4: Both approaches progressing

- GridFTP is gaining direct access to archival systems: already there for SAM-QFS, and for HPSS with the installation of 6.2 (later in '07). Work continues on scheduling extensions for parallel transfers.
- WAN file systems are becoming integrated with archival systems (GA for GPFS-HPSS later this year; some Lustre-HPSS already in place).

Slide 5: Pontifications

- All archival system access will be indirect.
- No one will actually know where their data is.
- Today's middleware capabilities (replication, caching, etc.) will migrate into infrastructure; new ones (sblest.org, mutdb.org) will appear.
- Data availability will be much more important than individual computational access.
- Cost recovery will be essential.

"It's tough to make predictions," Yogi Berra once said, "especially about the future."

Slide 6: What do we need to do?

- Must extend WAN file system access: pNFS should eliminate licensing issues, and caching extensions should improve reliability.
- Must integrate data capabilities within the TeraGrid, e.g., federation of archival systems and cross-mounting of file systems.
- Data must be the equal of computation.
- Policies must catch up with technology.

"Understand that most problems are a good sign. Problems indicate that progress is being made, wheels are turning, you are moving toward your goals. Beware when you have no problems. Then you've really got a problem... Problems are like landmarks of progress." – Scott Alexander

Slide 7: New Policies?

- TeraGrid becomes a "sea" of data, with much cross-mounting and federation of archives.
- Users are already becoming worried about long-term data preservation.
- Need "communal responsibility for data."
- A "Lloyd's of London (1688)" approach? Not a single company but many syndicates, with rollover of responsibility every 3 years.

"All that is old is new again." – traditional adage

Slide 8: Single Biggest Problem: How Do We Do Cost Recovery for Data Services?

- What we call "charging" is really "proportional allocation": we need money!
- Delivering a flop is a simultaneous transaction/purchase.
- Storing data is like writing an insurance policy: it is a long-term commitment of uncertain cost.

Slide 9: Data Charging Options

1) Don't do it: could be overwhelmed; long-term problems?
2) Simple yearly rate: could be wrong; how to connect it to $$?
3) Charge by transaction: unable to predict; again, $$?
4) Lloyd's of London: RPs "bid" to NSF for so much data stewardship, turning over at 3- (4-?) year intervals; the contract includes picking up existing data as well as new data. Data integrity is guaranteed as long as NSF continues funding. Depends on separate Data Stewardship funding by NSF and a pool of funded RP "syndicates" that will bid on providing data storage services. The TeraGrid contracts with the users and provides oversight.

"Plus ça change, plus c'est la même chose" – the more things change, the more they stay the same.

Slide 10: Near-Term Implementations

- Archival federation: already populating the STK silo at PSC with remote backups from SDSC.
- RP sites routinely use other TG RP sites for archival metadata backups.
- HPSS and other archival systems are moving towards more federation.
- Need to respond to the user requirement for more global file systems!

Slide 11: How do we make WAN global file systems ubiquitous?

- Experience in production with GPFS-WAN.
- Further adoption is hindered by licensing issues and vendor specifics.
- Would like to eliminate any vendor specifics at the clients and keep them at the servers: the aim of the pNFS extension to NFSv4.
- Would really like clients on all nodes (can use tunneling if IP addresses are invisible).

Slide 12: Current gpfs-wan remote mounts

- NCSA: production mount on Mercury
- ANL: production mount on TG cluster
- NCAR: production mount on front ends
- PSC: testing on BigBen
- TACC: tested on Maverick

"Within the Universe there is a Web which organizes events for the good of the whole." – Marcus Aurelius

Slide 13: What is pNFS?

- The first extension to NFSv4; standard development at U. Michigan.
- Parallel clients for proprietary parallel servers, with a separate path for metadata.
- Should have performance similar to gpfs-wan.
- Should eliminate the need for client licensing: the server vendor provides the pNFS server code, and the local vendor provides the pNFS clients (see the mount sketch after this slide).
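From the client side, the licensing point shows up in how little is required: the client simply mounts an NFSv4.1 export, and pNFS is negotiated if the server offers it. A minimal sketch, assuming a Linux client with an NFSv4.1-capable kernel and root privileges; the server name, export, and mount point are hypothetical.

    # Sketch: mounting a pNFS-exported file system from a stock Linux client.
    # NFSv4.1 (minorversion=1) is the protocol level that carries pNFS; no
    # file-system-specific client software or license is involved.
    import subprocess

    SERVER = "pnfs-md.rp.example.org"   # hypothetical pNFS metadata server
    EXPORT = "/gpfs-wan"
    MNT = "/mnt/gpfs-wan"

    subprocess.run(
        ["mount", "-t", "nfs4", "-o", "minorversion=1", f"{SERVER}:{EXPORT}", MNT],
        check=True,
    )

    # Show the negotiated mount options; vers=4.1 indicates an NFSv4.1 session.
    subprocess.run(["nfsstat", "-m"], check=True)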

Slide 14: pNFS model [diagram]. The pNFS client speaks NFSv4 + pNFS to the metadata server (control path) and plain NFSv4 READ/WRITE to the NFSv4 data servers (data path); a storage management protocol, outside the spec, connects the metadata server to the data servers.

Slide 15: pNFS/MPI-IO integration [diagram]. The same control-path/data-path structure as the pNFS model, but with multiple pNFS clients coordinated through an MPI-IO head node.
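The MPI-IO integration is easiest to picture from the application side: each MPI rank writes its own block of a single shared file that lives on the pNFS mount, so the parallel I/O flows straight to the data servers. A sketch using mpi4py, assuming an MPI launcher is available; the file path is hypothetical.

    # Sketch: collective MPI-IO write into a file on a pNFS-mounted file system.
    # Run with e.g.: mpiexec -n 4 python mpiio_write.py
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    block = np.full(1 << 20, rank, dtype=np.uint8)   # 1 MiB of data per rank

    fh = MPI.File.Open(comm, "/mnt/gpfs-wan/demo/output.bin",
                       MPI.MODE_CREATE | MPI.MODE_WRONLY)
    fh.Write_at_all(rank * block.nbytes, block)      # collective write, offset by rank
    fh.Close()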

Slide 16: pNFS Timetable

- Currently in a vendor "bake-a-thon."
- IBM, Sun, Panasas: expect a beta release next summer, production one year later.
- Lustre promises support, but no date yet.
- SDSC-NCSA-ORNL-IBM demo at SC'07; others?

"Cut the cackle and come to the hosses." – Physical applications of the operational method, Jeffreys and Jeffreys

Slide 17: SC'07 Demo: pNFS SDSC/NCSA/ORNL/IBM TG global file system

- No GPFS license required for clients.
- Should be as fast as gpfs-wan to GPFS clients.

[Diagram: GPFS-WAN server at SDSC connected over the TeraGrid network to multiple pNFS clients.]

Slide 18: pNFS paradigm

- The server file system vendor (IBM, Lustre, Sun, ...) provides the pNFS server interface.
- The client OS vendor (IBM, Linux, Sun, ...) provides the pNFS client software (NFSv4).
- No licenses are needed by clients (a quick client-side check follows this slide).

"Children and lunatics cut the Gordian knot which the poet spends his life patiently trying to untie." – Jean Cocteau
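One way to see that the license-free client is really doing pNFS, rather than falling back to plain NFS, is to look at the per-operation counters Linux keeps for each NFS mount: an NFSv4.1 mount that has issued LAYOUTGET and GETDEVICEINFO operations is fetching layouts and talking to the data servers directly. A small sketch, assuming a Linux client and a hypothetical mount point.

    # Sketch: check /proc/self/mountstats for pNFS layout operations on one mount.
    MOUNT_POINT = "/mnt/gpfs-wan"
    PNFS_OPS = {"LAYOUTGET", "GETDEVICEINFO", "LAYOUTCOMMIT", "LAYOUTRETURN"}

    in_target = False
    with open("/proc/self/mountstats") as stats:
        for line in stats:
            if line.startswith("device "):
                in_target = f" mounted on {MOUNT_POINT} " in line
            elif in_target and ":" in line:
                op, _, counts = line.strip().partition(":")
                if op in PNFS_OPS:
                    # the first counter on each per-op line is the number of calls
                    print(f"{op}: {counts.split()[0]} calls")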

Slide 19: SC'07 pNFS/GPFS TeraGrid Bandwidth Challenge [network diagram]. pNFS/GPFS servers at SDSC (150+ GPFS NSDs, 0.75 PB) serving pNFS clients at NCSA, NCAR, and ARSC across SCinet and the TeraGrid network, over 10 Gbps links with round-trip times ranging from 18 ms to 72 ms.

Slide 20: pNFS Architecture [diagram]. NFSv4.1 clients (AIX, Linux, Sun) send NFSv4.1 metadata operations to an NFSv4.1 server acting as the parallel file system metadata server, and perform parallel file system I/O directly against the parallel file system storage nodes; a parallel file system management protocol ties the servers together.

Slide 21: pNFS with GPFS [diagram: NFSv4.1 clients (AIX, Linux, Sun) send NFSv4.1 metadata to a state server and file-based NFSv4 parallel I/O to data servers drawn from the GPFS NSD servers/SAN; a management protocol links in the remaining GPFS servers.]

- A pNFS client can mount and retrieve a layout from any GPFS node, load-balancing metadata requests across the cluster.
- Any number of GPFS nodes can be pNFS data servers.
- The metadata server creates the layout to load-balance I/O requests across the data servers (see the sketch after this slide).
- A compute cluster can consist of pNFS or GPFS nodes.
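To make the load-balancing point concrete, here is an illustrative-only sketch of the mapping a file-based layout implies: the metadata server hands the client a stripe pattern, and successive blocks of a file go round-robin to the available data servers. This is not the GPFS/pNFS implementation, just the idea; the server names and stripe unit are made up.

    # Sketch: round-robin mapping of file offsets onto pNFS data servers.
    DATA_SERVERS = ["nsd01", "nsd02", "nsd03", "nsd04"]   # GPFS nodes acting as data servers
    STRIPE_UNIT = 1 << 20                                  # 1 MiB stripe unit

    def server_for_offset(offset: int) -> str:
        """Data server that would receive the I/O for this byte offset."""
        return DATA_SERVERS[(offset // STRIPE_UNIT) % len(DATA_SERVERS)]

    # A 4 MiB request starting at offset 0 touches all four data servers in turn.
    for off in range(0, 4 * STRIPE_UNIT, STRIPE_UNIT):
        print(f"offset {off:>8}: {server_for_offset(off)}")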

Slide 22: Bandwidth Challenge

Slide 23: NCSA-SDSC I/O Performance [read/write chart]

- 10 Gbps link
- 10 Gbps clients and servers
- 62 ms RTT
- pNFS uses 3 data servers and 1 metadata server
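The 62 ms round-trip time is worth a back-of-the-envelope check: the bandwidth-delay product of this path is the amount of data that must be in flight to keep the 10 Gbps link full, which is far beyond what a single default TCP connection carries, and is one reason the traffic is spread over several data servers. The figures below are the ones on the slide.

    # Sketch: bandwidth-delay product for the NCSA-SDSC path on this slide.
    link_bps = 10e9    # 10 Gbps link
    rtt_s = 0.062      # 62 ms round-trip time

    bdp_bytes = link_bps * rtt_s / 8
    print(f"bandwidth-delay product ~ {bdp_bytes / 2**20:.1f} MiB in flight")
    # ~ 73.9 MiB: hence parallel streams / multiple data servers and large buffers.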