Slide 1: TeraGrid Data Plan/Issues
Phil Andrews, TeraGrid Quarterly, September 2007

"In Xanadu did Kubla Khan / A stately pleasure-dome decree: / Where Alph, the sacred river, ran / Through caverns measureless to man / Down to a sunless sea." -- Samuel T. Coleridge

Users' view: data is stored somewhere; it must always be available, there must always be room for more, it must be easy to access, and it should be fast.
Slide 2: Convenience Requirements Will Always Increase
Each generation of users requires more convenience than the last: we must continually add new layers of software while maintaining and extending existing reliability and capability.

"Change is the only constant." -- Heraclitus (c. 535-475 BC)
Slide 3: Major User Data Access/Transfer Approaches
- GridFTP: well established and non-controversial, but a bit clunky and not very user-friendly; requires scheduling extensions for parallel use.
- WAN file systems: inherently parallel (given enough client nodes), convenient and intuitive; users like the idea (Zimmerman survey). A significant impingement on the local-system world, however.
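As a concrete illustration of the two access models above, here is a minimal Python sketch: the GridFTP path shells out to globus-url-copy with multiple parallel streams, while the WAN file-system path is just an ordinary file copy onto a cross-mounted path. The endpoint name and paths (tg-gridftp.example.org, /gpfs-wan/...) are hypothetical, and the exact globus-url-copy behavior depends on the installed Globus toolkit version; this is a sketch, not a prescribed workflow.

```python
"""Two ways to move a dataset in the TeraGrid data model (illustrative sketch).

Assumptions (hypothetical): a GridFTP endpoint named tg-gridftp.example.org,
a valid grid proxy already initialized, and a WAN file system mounted at
/gpfs-wan on the local node.
"""
import shutil
import subprocess


def gridftp_copy(local_path: str, remote_url: str, streams: int = 4) -> None:
    """Explicit transfer: invoke globus-url-copy with parallel TCP streams."""
    subprocess.run(
        ["globus-url-copy", "-p", str(streams),        # parallel data streams
         f"file://{local_path}", remote_url],
        check=True,
    )


def wanfs_copy(local_path: str, wan_path: str) -> None:
    """WAN file system: the remote storage looks like an ordinary local path."""
    shutil.copyfile(local_path, wan_path)


if __name__ == "__main__":
    gridftp_copy("/scratch/run01/output.dat",
                 "gsiftp://tg-gridftp.example.org/archive/run01/output.dat")
    wanfs_copy("/scratch/run01/output.dat", "/gpfs-wan/run01/output.dat")
```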
Slide 4: Both Approaches Are Progressing
- GridFTP is gaining direct access to archival systems: already available for SAM-QFS, and for HPSS with the installation of version 6.2 (later in 2007). Work continues on scheduling extensions for parallel transfers.
- WAN file systems are becoming integrated with archival systems (GPFS-HPSS general availability later this year; some Lustre-HPSS already in place).
Slide 5: Pontifications
- All archival-system access will be indirect.
- No one will actually know where their data is.
- Today's middleware capabilities (replication, caching, etc.) will migrate into the infrastructure, and new ones (sblest.org, mutdb.org) will appear.
- Data availability will be much more important than individual computational access.
- Cost recovery will be essential.

"It's tough to make predictions," Yogi Berra once said, "especially about the future."
Slide 6: What Do We Need to Do?
- Extend WAN file system access: pNFS should eliminate licensing issues, and caching extensions should improve reliability.
- Integrate data capabilities within the TeraGrid, e.g., federation of archival systems and cross-mounting of file systems.
- Data must be the equal of computation.
- Policies must catch up with technology.

"Understand that most problems are a good sign. Problems indicate that progress is being made, wheels are turning, you are moving toward your goals. Beware when you have no problems. Then you've really got a problem... Problems are like landmarks of progress." -- Scott Alexander
Slide 7: New Policies?
- The TeraGrid becomes a "sea" of data, with much cross-mounting and federation of archives.
- Users are already becoming worried about long-term data preservation.
- We need "communal responsibility for data."
- A "Lloyd's of London (1688)" approach? Not a single company but many syndicates, with rollover of responsibility every 3 years.

"All that is old is new again." -- traditional adage
Slide 8: Single Biggest Problem: How Do We Do Cost Recovery for Data Services?
- What we call "charging" is really "proportional allocation": we need money!
- Delivering a flop is a simultaneous transaction/purchase.
- Storing data is like writing an insurance policy: a long-term commitment of uncertain cost.
Slide 9: Data Charging Options
1) Don't do it: we could be overwhelmed; long-term problems?
2) Simple yearly rate: could be wrong; how to connect it to dollars?
3) Charge by transaction: unable to predict; again, how to connect it to dollars?
4) Lloyd's of London: RPs "bid" to NSF for a given amount of data stewardship, turning over at 3- (or 4-) year intervals; the contract includes picking up existing data as well as new data. Data integrity is guaranteed as long as NSF continues funding. This depends on separate data-stewardship funding by NSF and a pool of funded RP "syndicates" that will bid on providing data storage services. The TeraGrid contracts with the users and provides oversight.

"The more things change, the more they stay the same." (Plus ça change, plus c'est la même chose.)
Slide 10: Near-Term Implementations
- Archival federation: already populating the STK silo at PSC with remote backups from SDSC.
- RP sites routinely use other TeraGrid RP sites for archival metadata backups.
- HPSS and other archival systems are moving towards more federation.
- We need to respond to the user requirement for more global file systems!
Slide 11: How Do We Make WAN Global File Systems Ubiquitous?
- We have production experience with GPFS-WAN.
- Further adoption is hindered by licensing issues and vendor specifics.
- We would like to eliminate vendor specifics at the clients and keep them at the servers: the aim of the pNFS extension to NFSv4.
- We would really like clients on all nodes (tunneling can be used if IP addresses are not visible).
Slide 12: Current GPFS-WAN Remote Mounts
- NCSA: production mount on Mercury
- ANL: production mount on the TG cluster
- NCAR: production mount on front ends
- PSC: testing on BigBen
- TACC: tested on Maverick

"Within the Universe there is a Web which organizes events for the good of the whole." -- Marcus Aurelius
Slide 13: What Is pNFS?
- The first extension to NFSv4; standard development at U. Michigan.
- Parallel clients for proprietary parallel servers, with a separate path for metadata.
- Should have performance similar to GPFS-WAN.
- Should eliminate the need for client licensing (the server vendor provides the pNFS server code; the local OS vendor provides the pNFS clients).
Slide 14: pNFS Model
[Diagram: a pNFS client speaks NFSv4 + pNFS to a pNFS metadata server (control path) and issues NFSv4 READ/WRITE operations directly to multiple NFSv4 data servers (data path); a storage management protocol, outside the spec, links the metadata server to the data servers.]
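To make the control/data split in the model above concrete, here is an illustrative Python sketch (not real pNFS client code, which lives in the kernel): the client first asks a metadata service for a layout describing which data server holds each stripe, then fetches the stripes from the data servers in parallel. The class names, server names, and stripe size are invented for illustration.

```python
"""Illustrative sketch of the pNFS control/data split (not a real pNFS client).

The layout, server names, and stripe size below are invented; in real pNFS the
client kernel obtains the layout on the control path and issues NFSv4 READs to
the data servers on the data path.
"""
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

STRIPE_SIZE = 1 << 20  # 1 MiB stripes (illustrative)


def get_layout(metadata_server: str, path: str, nstripes: int) -> List[Dict]:
    """Control path: ask the metadata server which data server holds each stripe."""
    data_servers = ["ds1", "ds2", "ds3"]          # hypothetical data servers
    return [{"stripe": i, "server": data_servers[i % len(data_servers)]}
            for i in range(nstripes)]


def read_stripe(entry: Dict, path: str) -> bytes:
    """Data path: read one stripe directly from its data server (stubbed out)."""
    # A real client would issue an NFSv4 READ to entry["server"] here.
    return b"\0" * STRIPE_SIZE


def parallel_read(path: str, nstripes: int) -> bytes:
    layout = get_layout("mds.example.org", path, nstripes)    # control path
    with ThreadPoolExecutor(max_workers=len(layout)) as pool:  # data path, in parallel
        stripes = list(pool.map(lambda e: read_stripe(e, path), layout))
    return b"".join(stripes)


if __name__ == "__main__":
    data = parallel_read("/gpfs-wan/run01/output.dat", nstripes=8)
    print(f"read {len(data)} bytes across 8 stripes")
```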
Slide 15: pNFS/MPI-IO Integration
[Diagram: the same pNFS model as the previous slide, but with multiple pNFS clients driven by MPI-IO from a head node; each client talks to the pNFS metadata server on the control path and to the NFSv4 data servers on the data path.]
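As a hedged illustration of what MPI-IO over a pNFS-mounted file system could look like from application code, here is a small mpi4py sketch in which each rank writes its own block of a shared file with a collective call. The /pnfs-wan mount point and file name are assumptions; the example assumes mpi4py and an MPI library built with MPI-IO support, and says nothing about how the underlying pNFS layout is chosen.

```python
"""Collective MPI-IO write to a shared file on a (hypothetically) pNFS-mounted path.

Assumes mpi4py and an MPI implementation with MPI-IO support; the /pnfs-wan
mount point is an assumption for illustration.
"""
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

count = 1 << 17                                   # 131072 float64s = 1 MiB per rank
buf = np.full(count, rank, dtype=np.float64)      # each rank writes its own pattern

fh = MPI.File.Open(comm, "/pnfs-wan/demo/shared.dat",
                   MPI.MODE_CREATE | MPI.MODE_WRONLY)
offset = rank * buf.nbytes                        # contiguous, non-overlapping blocks
fh.Write_at_all(offset, buf)                      # collective write at explicit offsets
fh.Close()

if rank == 0:
    print("collective write complete")
```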
Slide 16: pNFS Timetable
- In a vendor "Bake-a-thon" right now.
- IBM, Sun, and Panasas expect a beta release next summer, with production one year later.
- Lustre promises support, but no date yet.
- SDSC-NCSA-ORNL-IBM demo at SC'07; others?

"Cut the cackle and come to the hosses." -- "Physical applications of the operational method," Jeffreys and Jeffreys
Slide 17: SC'07 Demo: pNFS (SDSC/NCSA/ORNL/IBM)
- A TeraGrid global file system.
- No GPFS license required for clients.
- Should be as fast as GPFS-WAN is to GPFS clients.
[Diagram: a GPFS-WAN server at SDSC serving multiple pNFS clients across the TeraGrid network.]
Slide 18: pNFS Paradigm
- Server: the file system vendor (IBM, Lustre, Sun, ...) provides the pNFS server interface.
- Client: the OS vendor (IBM, Linux, Sun, ...) provides the pNFS client software (NFSv4).
- No licenses needed by clients.

"Children and lunatics cut the Gordian knot which the poet spends his life patiently trying to untie." -- Jean Cocteau
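Since the client side is just the OS vendor's standard NFSv4.1 client, mounting a pNFS export would look like an ordinary NFS mount. The sketch below wraps the mount command in Python for illustration; the server name, export, and mount point are hypothetical, the option spelling (vers=4.1 vs. minorversion=1) varies by client kernel, and the command needs root privileges.

```python
"""Mount a pNFS export with the stock NFSv4.1 client (illustrative sketch).

Hypothetical names: pnfs-mds.example.org, /export/gpfs-wan, /pnfs-wan.
Requires root; the NFSv4.1 option spelling depends on the client kernel.
"""
import subprocess


def mount_pnfs(server: str, export: str, mountpoint: str) -> None:
    subprocess.run(
        ["mount", "-t", "nfs",
         "-o", "vers=4.1,rsize=1048576,wsize=1048576",  # NFSv4.1 => pNFS-capable
         f"{server}:{export}", mountpoint],
        check=True,
    )


if __name__ == "__main__":
    mount_pnfs("pnfs-mds.example.org", "/export/gpfs-wan", "/pnfs-wan")
```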
Slide 19: SC'07 pNFS/GPFS TeraGrid Bandwidth Challenge
[Diagram: wide-area topology for the bandwidth challenge. GPFS-WAN servers at SDSC (150+ GPFS NSDs, 0.75 PB) connect over the TeraGrid/SCinet network to pNFS clients at SC'07 and at partner sites including NCSA, NCAR, and ARSC; links are 10 Gbps (one 155 Mbps path), with round-trip times of roughly 18, 34, 70, and 72 ms.]
Slide 20: pNFS Architecture
[Diagram: NFSv4.1 clients (AIX, Linux, Sun) send NFSv4.1 metadata operations to an NFSv4.1 server acting as the parallel file system metadata server, and perform parallel file system I/O directly against the parallel FS storage nodes; a parallel FS management protocol links the metadata server and the storage nodes.]
Slide 21: pNFS with GPFS
[Diagram: NFSv4.1 clients (AIX, Linux, Sun) send NFSv4.1 metadata to a GPFS state server and file-based NFSv4 parallel I/O to GPFS data servers; the remaining GPFS servers, NSD servers, and SAN sit behind them, tied together by a management protocol.]
- A pNFS client can mount and retrieve a layout from any GPFS node, load-balancing metadata requests across the cluster.
- Any number of GPFS nodes can be pNFS data servers.
- The metadata server creates the layout to load-balance I/O requests across the data servers.
- The compute cluster can consist of pNFS or GPFS nodes.
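The layout-driven load balancing described above can be pictured with a small sketch: given a list of GPFS nodes acting as data servers, a hypothetical metadata server assigns file stripes round-robin so I/O is spread evenly. The node names and the round-robin policy are illustrative assumptions, not GPFS's actual layout algorithm.

```python
"""Illustrative round-robin layout, standing in for the metadata server's
load-balancing of stripes across pNFS data servers (not GPFS's real algorithm)."""
from typing import Dict, List


def build_layout(file_size: int, stripe_size: int,
                 data_servers: List[str]) -> List[Dict]:
    """Assign each stripe of the file to a data server, round-robin."""
    nstripes = (file_size + stripe_size - 1) // stripe_size
    return [{"stripe": i,
             "offset": i * stripe_size,
             "server": data_servers[i % len(data_servers)]}
            for i in range(nstripes)]


if __name__ == "__main__":
    servers = ["gpfs-ds01", "gpfs-ds02", "gpfs-ds03"]   # hypothetical GPFS nodes
    for entry in build_layout(file_size=10 * (1 << 20), stripe_size=1 << 20,
                              data_servers=servers):
        print(entry)
```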
Slide 22: Bandwidth Challenge
Slide 23: NCSA-SDSC I/O Performance
[Chart: read and write bandwidth between NCSA and SDSC.]
- 10 Gbps link
- 10 Gbps clients and servers
- 62 ms RTT
- pNFS uses 3 data servers and 1 metadata server
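To put the 10 Gbps / 62 ms figures in context, the snippet below computes the bandwidth-delay product, i.e. roughly how much data must be in flight (in total, and per data server) to fill the link. This arithmetic is added for illustration and is not from the original slides.

```python
"""Bandwidth-delay product for the NCSA-SDSC path (illustrative arithmetic)."""

link_bps = 10e9        # 10 Gbps link
rtt_s = 0.062          # 62 ms round-trip time
data_servers = 3       # pNFS data servers used in the demo

bdp_bytes = link_bps * rtt_s / 8
print(f"bandwidth-delay product: {bdp_bytes / 2**20:.1f} MiB in flight to fill the link")
print(f"per data server:         {bdp_bytes / data_servers / 2**20:.1f} MiB")
```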