On-demand Grid Storage Using Scavenging Sudharshan Vazhkudai Network and Cluster Computing, CSMD Oak Ridge National Laboratory


1 On-demand Grid Storage Using Scavenging
Sudharshan Vazhkudai, Network and Cluster Computing, CSMD, Oak Ridge National Laboratory
http://www.csm.ornl.gov/~vazhkuda
vazhkudaiss@ornl.gov
PDPTA, June 21, 2004
Acknowledgments: ORNL Collaborators: Dr. Xiaosong Ma and Dr. Vincent Freeh (NCSU)

2 Outline
Grid Storage Fabric Background
The Evolving Computing Landscape: An Analogy
Storage Scavenging of User Desktop Workstations
Use Cases
Related Work and Design Choices
Architecture
– Storage Layer
– Management Layer
– Current Status

3 Grid Storage Fabric Background
Scientific discoveries are driven by analyses of massively distributed, bulk data.
– Proliferation of high-end mass storage systems, SANs and datacenters
– Providers such as IBM, HP, Panasas, etc.
Merits:
– Excellent price/performance ratio
– Good storage speeds and access control
– Support for intelligent parallel file systems
– Optimized for wide-area, bulk transfers
– Reliability!
– Successful demonstration in production Grids: DOE Science Grid, Earth System Grid, TeraGrid, etc.
Drawbacks:
– Increasing deployment, maintenance and administrative costs
– Specialized software and central points of failure
– Costs and specialized features hinder wider adoption and limit deployment to a select few research labs and organizations
– The aforementioned production Grids span barely half a dozen sites!
Meta-Message: If grids are to become prevalent and grow beyond the confines of a few organizations, exploiting commodity fabric features is absolutely essential!

4 The Evolving HPC Landscape
– Computing fabric for the Grid (tightly coupled to loosely coupled, as time flies): Supercomputers → Beowulf-style clusters → Aggregating idle CPU cycles from commodity PCs
– Storage fabric…?? (tightly coupled to loosely coupled, as time flies again): Datacenters → RAID-like aggregation → Aggregating idle storage space from commodity PCs
– Concerns: Volatility, Trust, Performance
Meta-Message: Proprietary systems are being replaced with commodity clusters, delivering new levels of performance and availability at a dramatically more affordable price point.

5 Storage Scavenging of User Desktop Workstations
Harnessing the collective storage potential of individual workstations ~ Harnessing idle CPU cycles
Why can storage scavenging be viable?
– Buying gigabytes of storage is increasingly affordable
– The ratio of used space to available storage is low
– Increasing numbers of workstations are online most of the time
– Even a modest contribution per workstation (Contribution << Available) can add up to a staggering aggregate amount of storage!
Concerns:
– Vagaries of volatility…
– Question of Trust: datasets on arbitrary user workstations
– Performance of such aggregate storage
Meta-Message: Despite the high maintenance and administrative costs, a factor that attracts the Grid community to high-end storage and data centers is their ability to deliver sustained high throughput for data operations.
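The "Contribution << Available" point can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name and every number in the usage note are hypothetical and do not come from the talk.

```python
def aggregate_contribution_gb(workstations, available_gb, contribution_fraction):
    """Total scavenged space if every workstation donates the same fraction.

    workstations          -- number of participating desktops (hypothetical)
    available_gb          -- free disk space per desktop, in GB (hypothetical)
    contribution_fraction -- share of the free space donated, between 0 and 1
    """
    if not 0.0 <= contribution_fraction <= 1.0:
        raise ValueError("contribution_fraction must be between 0 and 1")
    return workstations * available_gb * contribution_fraction
```

For instance, `aggregate_contribution_gb(1000, 40, 0.25)` models 1000 desktops that each donate a quarter of 40 GB of free space, which comes to 10000 GB in aggregate.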

6 Use Cases
Storage cloud as a:
– Cache
– Intermediate hop
– Local, client-side scratch
– Grid replica
– RAS for Terascale Supercomputers

7 Related Work and Design Choices
Related Work:
– Network/Distributed File Systems (NFS, LOCUS)
– Parallel File Systems (PVFS, XFS)
– Serverless File Systems (FARSITE, xFS, GFS)
– Peer-to-Peer Storage (OceanStore, PAST, CFS)
– Grid Storage Services (LegionFS, SRB, IBP, SRM, GASS)
Design Choices & Assumptions:
– Scalability: O(100) or O(1000)
– Commodity Components: Quality & Quantity
– User Autonomy
– Well connected & Secure
– Heterogeneity
– Large, “write once read many” datasets
– Transparent
– Grid Aware

8 Architecture
[Diagram labels: Grid Data Access Tools; Management Layer; Storage Layer; Pool A, Pool m, Pool n, each with a Registration link.]
– Management Layer: Data Placement, Replication, Grid Awareness, Metadata Management
– Storage Layer: Morsel Access, Data Integrity, Non-Invasiveness
Meta-Message: Imagine “Condor” for Storage.

9 Storage Layer
Benefactors:
– Morsels as a unit of contribution
– Basic morsel operations as RPC services [new(), free(), get(), put()…]
– Space Reclaim: User withdrawal
  – Which morsels to relocate/evict?
  – Which benefactor workstations to relocate to?
– Data Integrity through checksums
– Performance Traces
Pools:
– Benefactor registrations (soft state)
– Dataset distributions
– Metadata
– Selection heuristics
[Figure: File 1 (morsels 1, 2, 3) and File n (morsels 1a, 2a, 3a, 4a), with the morsels spread across several benefactors.]
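The benefactor operations named on this slide, new(), free(), get() and put(), together with checksum-based integrity, can be sketched as below. This is a minimal in-memory stand-in, not the project's implementation: the slide describes RPC services, whereas this sketch uses plain method calls, and the fixed morsel size, the morsel-count limit, the use of SHA-256 and the `IntegrityError` name are all assumptions made for illustration.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when a stored morsel no longer matches its checksum."""


class Benefactor:
    """In-memory stand-in for a workstation that donates morsels."""

    def __init__(self, capacity_morsels, morsel_size=1024):
        self.capacity_morsels = capacity_morsels  # assumed contribution limit
        self.morsel_size = morsel_size            # assumed fixed morsel size (bytes)
        self._store = {}                          # morsel_id -> (data, checksum)
        self._next_id = 0

    def new(self):
        """Allocate an empty morsel and return its id."""
        if len(self._store) >= self.capacity_morsels:
            raise RuntimeError("benefactor has no free morsels")
        morsel_id = self._next_id
        self._next_id += 1
        self._store[morsel_id] = (b"", hashlib.sha256(b"").hexdigest())
        return morsel_id

    def put(self, morsel_id, data):
        """Write data into an allocated morsel and record its checksum."""
        if morsel_id not in self._store:
            raise KeyError(morsel_id)
        if len(data) > self.morsel_size:
            raise ValueError("data does not fit in one morsel")
        data = bytes(data)
        self._store[morsel_id] = (data, hashlib.sha256(data).hexdigest())

    def get(self, morsel_id):
        """Return the morsel's data after verifying it against the checksum."""
        data, checksum = self._store[morsel_id]
        if hashlib.sha256(data).hexdigest() != checksum:
            raise IntegrityError(f"morsel {morsel_id} failed checksum verification")
        return data

    def free(self, morsel_id):
        """Release a morsel, for example when the user reclaims the space."""
        self._store.pop(morsel_id, None)
```

A caller would allocate with `m = b.new()`, write with `b.put(m, data)` and read back with `b.get(m)`; a mismatch between stored data and checksum surfaces as `IntegrityError` rather than as silently corrupted data.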

10 Management Layer
Manager:
– Pool registrations
– Metadata: datasets-to-pools; pools-to-benefactors, etc.
– Availability: Redundant Array of Replicated Morsels
  – Minimum replication factor for morsels
  – Where to replicate?
  – Which morsel replica to choose in response to user file fetches?
– Grid Awareness:
  – Information Providers
  – Space reservations
  – Transfer protocol agnostic
– Transparent Access: Namespace
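The slide poses replica placement and replica selection as open questions. One possible pair of heuristics is sketched below, purely as an assumption: place each morsel on the benefactors with the most free space, and serve fetches from the least-loaded online holder. The function names, the data shapes and both policies are hypothetical and are not taken from the talk.

```python
def place_replicas(free_morsels, replication_factor):
    """Pick distinct benefactors for one morsel, most free space first.

    free_morsels       -- dict: benefactor id -> number of free morsels
    replication_factor -- minimum number of copies to keep
    """
    candidates = [b for b, free in free_morsels.items() if free > 0]
    if len(candidates) < replication_factor:
        raise ValueError("not enough benefactors with free space")
    # Sort by free space (descending); break ties by id for determinism.
    ranked = sorted(candidates, key=lambda b: (-free_morsels[b], b))
    return ranked[:replication_factor]


def choose_replica(holders, online, load):
    """Pick the least-loaded online benefactor holding a replica, or None.

    holders -- benefactor ids that store a copy of the morsel
    online  -- set of benefactor ids currently reachable
    load    -- dict: benefactor id -> outstanding requests (missing means 0)
    """
    live = [b for b in holders if b in online]
    if not live:
        return None
    return min(live, key=lambda b: (load.get(b, 0), b))
```

Returning `None` from `choose_replica` when no holder is online makes the volatility concern explicit: the caller has to decide whether to wait, re-replicate, or fall back to another source.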

11 Current Status
[Diagram labels: Application, Proxy, Manager, Benefactor and OS (two instances); links labeled ftp/GridFTP, rpc (A), rpc (B), rpc (C), rpc (D).]
Operations shown:
– reserve(): cancel()
– store(): open(); benefactorID.put()
– retrieve(): open(); benefactorID.get()
– delete()
– new(), free(), get(), put()
rpc (A) services:
– Create/delete files
– Reserve…
rpc (B) services:
– File fetches
– Hints…
rpc (C) services:
– Control
– Dataset distributions
– Benefactor alerts, warnings, alarms to manager
– …
rpc (D) services:
– Morsel relocations
– Status info
– Load balancing
– …
rpc (E) services:
– Morsel relocations to different pools
– Under direction of manager
– …
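The store and retrieve paths on this slide (reserve space, open() at the manager, then put() or get() directly on the benefactors) can be sketched as follows. This is a hedged, single-process illustration: real components would talk over RPC, and the round-robin layout, the `name#index` morsel keys, the `mode` argument to open() and the tiny default morsel size are all assumptions introduced here.

```python
class Benefactor:
    """Minimal stand-in exposing put()/get()/free() on donated space."""

    def __init__(self):
        self._morsels = {}

    def put(self, key, data):
        self._morsels[key] = bytes(data)

    def get(self, key):
        return self._morsels[key]

    def free(self, key):
        self._morsels.pop(key, None)


class Manager:
    """Tracks reservations and which benefactor holds which morsel."""

    def __init__(self, benefactors, morsel_size=4):
        self.benefactors = benefactors  # benefactor id -> Benefactor
        self.morsel_size = morsel_size  # assumed morsel size in bytes
        self.reserved = {}              # file name -> reserved bytes
        self.layout = {}                # file name -> [(benefactor id, key), ...]

    def reserve(self, name, nbytes):
        self.reserved[name] = nbytes

    def cancel(self, name):
        self.reserved.pop(name, None)

    def open(self, name, mode):
        """Return the morsel layout; writing consumes a prior reservation."""
        if mode == "w":
            nbytes = self.reserved.pop(name)  # KeyError if nothing was reserved
            count = max(1, -(-nbytes // self.morsel_size))  # ceiling division
            ids = sorted(self.benefactors)
            self.layout[name] = [
                (ids[i % len(ids)], f"{name}#{i}") for i in range(count)
            ]
        return self.layout[name]

    def delete(self, name):
        for bid, key in self.layout.pop(name, []):
            self.benefactors[bid].free(key)


def store(manager, name, data):
    """store(): open() at the manager, then put() on each benefactor."""
    layout = manager.open(name, "w")
    size = manager.morsel_size
    if len(data) > len(layout) * size:
        raise ValueError("data exceeds the reserved space")
    for i, (bid, key) in enumerate(layout):
        manager.benefactors[bid].put(key, data[i * size:(i + 1) * size])


def retrieve(manager, name):
    """retrieve(): open() at the manager, then get() from each benefactor."""
    layout = manager.open(name, "r")
    return b"".join(manager.benefactors[bid].get(key) for bid, key in layout)
```

A typical sequence is `manager.reserve("f", len(data))`, then `store(manager, "f", data)`, and later `retrieve(manager, "f")`. Keeping the manager out of the data path, so that it only hands out the layout, mirrors the slide's split between open() at the manager and put()/get() at the benefactor.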

12 Philosophical Musings...
It’s all about commoditizing…
– Quality
– Trust
– Performance
What the scavenged storage “is not”:
– Not a replacement for high-end storage
What it “is”:
– A low-cost, fault-tolerant alternative to be used in conjunction with high-end storage

13 Further Information
My Website:
– http://www.csm.ornl.gov/~vazhkuda

