
1 FreeLoader: borrowing desktop resources for large transient data
Vincent Freeh 1, Xiaosong Ma 1,2, Stephen Scott 2, Jonathan Strickland 1, Nandan Tammineedi 1, Sudharshan Vazhkudai 2
1. North Carolina State University  2. Oak Ridge National Laboratory
September 2004

2 Roadmap
 Motivation
 FreeLoader architecture
 Design choices
 Results
 Future work

3 Motivation: Data Avalanche
 More data to process: science, industry, government
 Example: scientific data
  - Better instruments
  - More simulation power
  - Higher resolution
[Figure: space telescope; P&E gene sequencer (image from http://www.genome.uci.edu/). Picture courtesy: Jim Gray, SLAC Data Management Workshop]

4 Data acquisition and storage
[Figure: data acquisition, reduction, analysis, visualization, and storage. A Data Acquisition System feeds raw data and metadata over a high-speed network to remote storage and supercomputers, serving local users and remote users with their own local computing and storage]

5 Remote Data Sources
 Data serving at supercomputing sites
  - Shared file systems (e.g., GPFS)
  - Archival systems (e.g., HPSS)
 Data centers
 Expensive, high-end solutions with guaranteed capacity and access rates
 Tools used for access: FTP, GridFTP, Grid file systems, customized data migration programs, web browsers

6 User perspective
 End users typically process data locally
  - Convenience and control
  - Better CPU/memory configurations
  - Problem 1: needs local space to hold the data
  - Problem 2: getting data from remote sources is slow
    - Central point of failure
    - High contention: multiple incoming requests hurt availability
 Dataset characteristics
  - Write-once, read-many access patterns
  - Raw data often discarded after processing
  - Groups share interest in the same data (cf. Squirrel, a P2P web cache)
  - Primary copy archived elsewhere

7 Harnessing idle disk storage
 Harnessing the storage of individual workstations ~ harnessing idle CPU cycles
 LAN environments
  - Desktops with 100 Mbps or Gbps connectivity
  - Increasing hard disk capacities
  - An increasing fraction of total capacity is unused: 50% and upwards
 Even when each contribution is far smaller than the space available, the aggregate storage is impressive
 Increasing numbers of workstations are online most of the time
 Benefits: access locality, aggregate I/O and network bandwidth, data sharing

8 Use Cases
 The FreeLoader storage cloud as a:
  - Cache
  - Local, client-side scratch space
  - Intermediate hop
  - Grid replica

9 Intended Role of FreeLoader
 What scavenged storage is not:
  - Not a replacement for high-end storage
  - Not a file system
  - Not intended to integrate resources at wide-area scale
  - Does not emphasize replica discovery, routing protocols, or consistency, as P2P storage systems do
 What it is:
  - A low-cost, best-effort alternative to remote high-end storage
  - Intended to facilitate transient access to large, read-only datasets and data sharing within an administrative domain
  - To be used in conjunction with higher-end storage systems

10 FreeLoader Architecture
[Figure: two-layer architecture. Storage layer: benefactor pools (pool A … pool m, pool n) register with the management layer and handle morsel access, data integrity, and non-invasiveness. Management layer: data placement, replication, Grid awareness, and metadata management; interfaces with Grid data access tools]

11 Storage Layer
 Donors/benefactors:
  - Morsel: the unit of contribution
  - Basic morsel operations: new(), free(), get(), put(), …
  - Space reclamation on user withdrawal or space shrinkage
  - Data integrity through checksums
  - Performance history kept per benefactor
 Pools:
  - Benefactor registrations (soft state)
  - Dataset distributions
  - Proximity and performance characteristics
[Figure: example morsel layouts for dataset 1 … dataset n distributed across benefactors]
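The slide only names the morsel operations; as a purely illustrative sketch, a benefactor's morsel store could look like the toy class below. Everything here except the operation names (new(), free(), get(), put()), the 1 MB morsel size used in the experiments, and the checksum-based integrity idea is a hypothetical filling-in, not FreeLoader's actual implementation.

```python
import hashlib

MORSEL_SIZE = 1 << 20  # 1 MB, the morsel size used in the experiments

class Benefactor:
    """Toy in-memory morsel store; a real benefactor persists morsels on disk."""

    def __init__(self, capacity_morsels):
        self.capacity = capacity_morsels   # contributed space, in morsels
        self.morsels = {}                  # morsel id -> bytes
        self.checksums = {}                # morsel id -> digest, for data integrity
        self.next_id = 0

    def new(self):
        """Allocate an empty morsel; fails when the contributed space is full."""
        if len(self.morsels) >= self.capacity:
            raise MemoryError("contributed space exhausted")
        mid = self.next_id
        self.next_id += 1
        self.morsels[mid] = b""
        return mid

    def put(self, mid, data):
        assert len(data) <= MORSEL_SIZE
        self.morsels[mid] = data
        self.checksums[mid] = hashlib.sha1(data).hexdigest()

    def get(self, mid):
        data = self.morsels[mid]
        # Verify integrity on every read, per the slide's checksum idea.
        if hashlib.sha1(data).hexdigest() != self.checksums[mid]:
            raise IOError("morsel corrupted")
        return data

    def free(self, mid):
        """Reclaim space, e.g. on user withdrawal or space shrinkage."""
        self.morsels.pop(mid, None)
        self.checksums.pop(mid, None)
```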

12 Management Layer
 Manager:
  - Pool registrations
  - Metadata: datasets-to-pools, pools-to-benefactors, etc.
  - Availability: Redundant Array of Replicated Morsels
    - Minimum replication factor for morsels
    - Where to replicate? Which morsel replica to choose?
  - Clients are oblivious to metadata; all metadata requests go to the manager
  - Cache replacement policy
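The slide specifies only the manager's responsibilities: the metadata mappings, a minimum replication factor, and replica selection. A minimal sketch of that bookkeeping might look as follows; all class and method names are hypothetical, and the replica-choice heuristic (pick the benefactor with the best performance history, echoing the per-benefactor history on the previous slide) is an assumption:

```python
class Manager:
    """Toy manager tracking morsel replica metadata (hypothetical names)."""

    def __init__(self, min_replicas=2):
        self.min_replicas = min_replicas
        self.replicas = {}   # (dataset, morsel index) -> list of benefactor ids
        self.perf = {}       # benefactor id -> observed throughput (MB/s)

    def register_replica(self, dataset, idx, benefactor):
        """Record that a benefactor holds a copy of this morsel."""
        self.replicas.setdefault((dataset, idx), []).append(benefactor)

    def under_replicated(self, dataset, idx):
        """True when a morsel falls below the minimum replication factor."""
        return len(self.replicas.get((dataset, idx), [])) < self.min_replicas

    def choose_replica(self, dataset, idx):
        """Pick the replica on the benefactor with the best performance history."""
        candidates = self.replicas[(dataset, idx)]
        return max(candidates, key=lambda b: self.perf.get(b, 0.0))
```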

13 Dataset Striping
 Stripe datasets across benefactors
  - The morsel doubles as the basic unit of striping
  - The manager decides the allocation of data blocks to morsels across benefactors
 Multiple benefits
  - Higher aggregate access bandwidth
  - Lower impact per benefactor
  - Load balancing
 A greedy algorithm makes best use of available space
 Stripe width and stripe size are tunable striping parameters
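The slide does not give the greedy algorithm itself; one plausible toy version, under the assumption that "greedy" means preferring benefactors with the most free space, assigns runs of stripe_size consecutive morsels round-robin across stripe_width benefactors:

```python
def stripe(num_morsels, free_space, stripe_width, stripe_size):
    """Map morsel index -> benefactor id.

    free_space: benefactor id -> free morsels (hypothetical input).
    Greedy step (assumed): choose the stripe_width benefactors with the
    most free space, then rotate stripe_size-morsel runs across them.
    """
    chosen = sorted(free_space, key=free_space.get, reverse=True)[:stripe_width]
    placement = {}
    for i in range(num_morsels):
        # Consecutive runs of stripe_size morsels land on the same benefactor.
        placement[i] = chosen[(i // stripe_size) % stripe_width]
    return placement
```

Keeping runs contiguous per benefactor is what lets the client later request morsels in parallel and still reassemble contiguous blocks, as described on the client-interface slide.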

14 Client interface
 Obtains metadata from the manager
 Performs gets and puts directly to the benefactors
 All control messages exchanged via UDP; all data transfers via TCP
 Morsel requests are sent to benefactors in parallel; the striping strategy ensures the returned blocks are contiguous
 Efficient buffering strategy
  - Buffer pool of size (stripe size + 1) × stripe width
  - Double buffering allows network transfer and disk I/O to proceed in parallel
  - Once a buffer pool fills, its contents are flushed to disk
  - Waiting for a filled buffer of contiguous blocks before writing reduces disk seeks
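The double-buffering idea above can be sketched as follows. The fetch and flush callables are hypothetical stand-ins for the TCP morsel retrieval and the contiguous disk write; only the overlap of network fetches with the previous buffer's flush, and the parallel per-benefactor requests, come from the slide:

```python
from concurrent.futures import ThreadPoolExecutor

def retrieve(morsel_ids, fetch, flush, stripe_width):
    """Double-buffered retrieval sketch: fetch the next stripe of morsels
    in parallel while the previous stripe's buffer is flushed to disk."""
    with ThreadPoolExecutor(max_workers=stripe_width + 1) as pool:
        pending_flush = None
        for start in range(0, len(morsel_ids), stripe_width):
            batch = morsel_ids[start:start + stripe_width]
            # Parallel gets, one per benefactor, up to stripe_width at a time.
            data = list(pool.map(fetch, batch))
            if pending_flush is not None:
                pending_flush.result()          # wait for the previous flush
            # Contiguous write of the filled buffer reduces disk seeks.
            pending_flush = pool.submit(flush, b"".join(data))
        if pending_flush is not None:
            pending_flush.result()
```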

15 Current Status
[Figure: prototype components. The application talks to the client through an I/O interface (open(), close(), read(), write()); the client talks to the manager over UDP (A) and to benefactors over UDP/TCP (B, D); benefactors register with the manager over UDP (C). Benefactor-side calls shown: reserve(), cancel(), store(), retrieve(), delete(), plus morsel operations new(), free(), get(), put()]
 (A) services: dataset creation/deletion, space reservation
 (B) services: dataset retrieval, hints
 (C) services: registration; benefactor alerts, warnings, and alarms to the manager
 (D) services: dataset store, morsel requests
 Simple data striping implemented

16 Results: Experiment Setup
 FreeLoader prototype running at ORNL
 Client box
  - AMD Athlon 700 MHz, 400 MB memory
  - Gig-E card
  - Linux 2.4.20-8
 Benefactors
  - Group of heterogeneous Linux workstations
  - Each contributing 7–30 GB
  - 100 Mb cards

17 Data Sources
 Local GPFS
  - Attached to ORNL supercomputers
  - Accessed through GridFTP (1 MB TCP buffer, 4 parallel streams)
 Local HPSS
  - Accessed through the HSI client, highly optimized
  - Hot: data in disk cache, no tape loading needed
  - Cold: data purged; retrievals done at large intervals
 Remote NFS
  - At the NCSU HPC center
  - Accessed through GridFTP (1 MB TCP buffer, 4 parallel streams)
 FreeLoader
  - 1 MB morsel size for all experiments
  - Varying configurations

18 Testbed
[Figure: testbed diagram]

19 Best-of-class performance comparisons
[Chart: throughput (MB/s) per data source]

20 Effect of stripe width variation (stripe size = 1 morsel)
[Chart]

21 Effect of stripe width variation (stripe size = 8 morsels)
[Chart]

22 Effect of stripe size variation (stripe width = 4 benefactors)
[Chart]

23 Impact Tests
 How uncomfortable do the donors feel when running
  - CPU-intensive tasks?
  - Disk-intensive tasks?
  - Network-intensive tasks?
 A set of tests at NCSU
  - Benefactor performing local tasks
  - Client retrieving datasets at a given rate
  - The rate is varied to study the impact on the user
  - Pentium 4, 512 MB memory, 100 Mbps connectivity
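The impact tests vary the client's retrieval rate, and the conclusions slide notes that donor impact is "controlled by throttling request rate". The slides do not say how the throttling works; one standard mechanism for capping a request rate is a token bucket, sketched here purely for illustration (not FreeLoader's actual mechanism):

```python
import time

class Throttle:
    """Token-bucket rate limiter (illustrative sketch): callers pay tokens
    proportional to the megabytes they request; tokens refill at a fixed
    MB/s rate, bounding the sustained load imposed on a benefactor."""

    def __init__(self, rate_mb_per_s, burst_mb=1.0):
        self.rate = rate_mb_per_s   # sustained rate cap
        self.burst = burst_mb       # maximum short-term burst
        self.tokens = burst_mb
        self.last = time.monotonic()

    def acquire(self, mb):
        """Block until mb worth of tokens is available, then spend them."""
        while True:
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at the burst size.
            self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= mb:
                self.tokens -= mb
                return
            time.sleep((mb - self.tokens) / self.rate)
```

A client would call acquire(1.0) before each 1 MB morsel request; lowering rate_mb_per_s directly trades retrieval throughput for lower donor impact.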

24 CPU-intensive and Mixed
[Chart: time (s)]

25 Network-intensive Task
[Chart: normalized download time]

26 Disk-intensive Task
[Chart: throughput (MB/s)]

27 Sample application: formatdb
 Subset of basic file APIs implemented
 formatdb, from the NCBI BLAST toolkit, preprocesses a biological sequence database to create a set of sequence and index files
 The raw database is an ideal candidate for caching on FreeLoader
 formatdb is not the ideal application for FreeLoader
 Run time (sec): Local 598; NFS 585; FreeLoader with 1 benefactor 599, 2 benefactors 563, 4 benefactors 556

28 Significant results
[Chart]

29 Significant results (contd.)
 2x and 4x speedups w.r.t. GPFS and HPSS, respectively
 Management overhead is minimal
 14% worst-case performance hit for CPU-intensive tasks
 ≤ 25% for network-intensive tasks
 formatdb tests the upper bound of FreeLoader's internal overhead
  - Same as local disk with 1 benefactor, 2% slower than NFS
  - 5% faster than NFS with 4 benefactors
 ~10 MB/s performance gain for each benefactor added, until saturation

30 Conclusions
 Goal: saturate the client-side link; striping helps achieve this
 Low-cost commodity parts
 Harnesses idle disk bandwidth
 Low impact on donors, controlled by throttling the request rate
 Better availability; more suitable for large transient datasets than a regular file system

31 In-progress and Future Work
 In progress: Windows support
 Future
  - Complete pool structure and registration
  - Intelligent data distribution, service profiling
  - Benefactor impact control, self-configuration
  - Naming and replication
  - Grid awareness
 Potential extensions
  - Harnessing local storage at cluster nodes?
  - Complementing commercial storage servers?

32 Further Information
 http://www.csm.ornl.gov/~vazhkuda/Morsels/


