1 FreeLoader: borrowing desktop resources for large transient data. Vincent Freeh (1), Xiaosong Ma (1,2), Stephen Scott (2), Jonathan Strickland (1), Nandan Tammineedi (1), Sudharshan Vazhkudai (2). (1) North Carolina State University; (2) Oak Ridge National Laboratory. September 2004

2 Roadmap  Motivation  FreeLoader architecture  Design choices  Results  Future work

3 Motivation: Data Avalanche  More data to process: science, industry, government  Example: scientific data, driven by better instruments, more simulation power, and higher resolution (pictures: space telescope, P&E gene sequencer; courtesy Jim Gray, SLAC Data Management Workshop)

4 Data acquisition and storage (diagram): data acquisition, reduction, analysis, visualization, and storage. A data acquisition system feeds raw data and metadata over a high-speed network to supercomputers and remote storage, serving local users and remote users who have their own local computing and storage.

5 Remote Data Sources  Data serving at supercomputing sites: shared file systems (GPFS), archiving systems (HPSS)  Data centers  Expensive, high-end solutions with guaranteed capacity and access rates  Tools used for access: FTP, GridFTP, grid file systems, customized data migration programs, web browsers

6 User perspective  End users typically process data locally: convenience and control, better CPU/memory configurations. Problem 1: they need local space to hold the data. Problem 2: getting data from remote sources is slow  The remote server is a central point of failure  High contention from multiple incoming requests hurts availability  Dataset characteristics: write-once, read-many access patterns; raw data often discarded; shared interest in the same data among groups; primary copy archived elsewhere (related work: Squirrel, a P2P web cache)

7 Harnessing idle disk storage  Harnessing the storage resources of individual workstations is analogous to harnessing idle CPU cycles  LAN environments: desktops with 100 Mbps or Gbps connectivity, increasing hard disk capacities, and an increasing unused fraction (50% and upwards)  Even if each machine contributes far less than its available space, the aggregate storage is impressive  Increasing numbers of workstations are online most of the time  Benefits: access locality, aggregate I/O and network bandwidth, data sharing

8 Use Cases  FreeLoader storage cloud as a: cache; local, client-side scratch space; intermediate hop; grid replica

9 Intended Role of FreeLoader  What the scavenged storage “is not”: not a replacement for high-end storage; not a file system; not intended for integrating resources at wide-area scale; does not emphasize replica discovery, routing protocols, and consistency the way P2P storage systems do  What it “is”: a low-cost, best-effort alternative to remote high-end storage  Intended to facilitate transient access to large, read-only datasets and data sharing within an administrative domain  To be used in conjunction with higher-end storage systems

10 FreeLoader Architecture (diagram): a management layer (data placement, replication, grid awareness, metadata management) above a storage layer of benefactor pools (pool A … pool m … pool n), which handle morsel access, data integrity, and non-invasiveness; pools register with the management layer, and grid data access tools sit alongside.

11 Storage Layer  Donors/benefactors: morsels as the unit of contribution; basic morsel operations [new(), free(), get(), put()…]; space reclaim on user withdrawal / space shrinkage; data integrity through checksums; performance history per benefactor  Pools: benefactor registrations (soft state), dataset distributions, proximity and performance characteristics (diagram: morsels of datasets 1…n distributed and replicated across benefactors)
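
To make the storage-layer design concrete, here is a minimal sketch of a benefactor-side morsel store in Python. The class name MorselStore, its method signatures, and the SHA-1 checksumming are illustrative assumptions built around the operations named on the slide (new(), free(), get(), put()), not the actual FreeLoader code.

```python
import hashlib
import os

MORSEL_SIZE = 1 << 20  # 1 MB morsels, the size used in the experiments below


class MorselStore:
    """Illustrative benefactor-side morsel store (all names are assumptions)."""

    def __init__(self, root, contributed_bytes):
        self.root = root                   # directory holding the donated space
        self.capacity = contributed_bytes  # space the donor agreed to contribute
        self.checksums = {}                # morsel id -> SHA-1 digest (data integrity)
        os.makedirs(root, exist_ok=True)

    def new(self, morsel_id):
        """Allocate an empty morsel if contributed space remains."""
        if (len(self.checksums) + 1) * MORSEL_SIZE > self.capacity:
            raise IOError("benefactor out of contributed space")
        open(self._path(morsel_id), "wb").close()
        self.checksums[morsel_id] = hashlib.sha1(b"").hexdigest()

    def put(self, morsel_id, data):
        """Store up to one morsel of data and remember its checksum."""
        data = data[:MORSEL_SIZE]
        with open(self._path(morsel_id), "wb") as f:
            f.write(data)
        self.checksums[morsel_id] = hashlib.sha1(data).hexdigest()

    def get(self, morsel_id):
        """Return morsel data, verifying it against the stored checksum."""
        with open(self._path(morsel_id), "rb") as f:
            data = f.read()
        if hashlib.sha1(data).hexdigest() != self.checksums[morsel_id]:
            raise IOError("morsel %s failed its integrity check" % morsel_id)
        return data

    def free(self, morsel_id):
        """Release a morsel, e.g. when the donor reclaims the space."""
        os.remove(self._path(morsel_id))
        del self.checksums[morsel_id]

    def _path(self, morsel_id):
        return os.path.join(self.root, "morsel_%s" % morsel_id)
```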

12 Management Layer  Manager: pool registrations; metadata (datasets-to-pools, pools-to-benefactors, etc.); availability via a Redundant Array of Replicated Morsels  a minimum replication factor for morsels  where to replicate?  which morsel replica to choose?  Clients do not track metadata themselves – all metadata requests are sent to the manager  Cache replacement policy
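
A correspondingly minimal sketch of the manager-side metadata maps and the replication bookkeeping described above; the data structures, the MIN_REPLICAS constant, and the replica-selection policy are assumptions for illustration, not the real manager.

```python
import random
from collections import defaultdict

MIN_REPLICAS = 2  # assumed minimum replication factor per morsel


class Manager:
    """Illustrative manager metadata: datasets -> morsels -> benefactor replicas."""

    def __init__(self):
        self.pools = {}                           # pool id -> list of benefactor ids
        self.morsel_replicas = defaultdict(set)   # (dataset, morsel idx) -> benefactors
        self.dataset_morsels = defaultdict(list)  # dataset -> ordered morsel indices

    def register_pool(self, pool_id, benefactors):
        """Soft-state registration of a pool and its benefactors."""
        self.pools[pool_id] = list(benefactors)

    def record_placement(self, dataset, morsel_idx, benefactor):
        """Record that a benefactor holds a replica of one morsel."""
        if morsel_idx not in self.dataset_morsels[dataset]:
            self.dataset_morsels[dataset].append(morsel_idx)
        self.morsel_replicas[(dataset, morsel_idx)].add(benefactor)

    def under_replicated(self, dataset):
        """Morsels that have fallen below the minimum replication factor."""
        return [idx for idx in self.dataset_morsels[dataset]
                if len(self.morsel_replicas[(dataset, idx)]) < MIN_REPLICAS]

    def choose_replica(self, dataset, morsel_idx):
        """Pick one replica to read from; the real manager could use
        per-benefactor performance history instead of a random choice."""
        return random.choice(sorted(self.morsel_replicas[(dataset, morsel_idx)]))
```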

13 Dataset Striping  Stripe datasets across benefactors: the morsel doubles as the basic unit of striping; the manager decides the allocation of data blocks to morsels across benefactors  Multiple benefits: higher aggregate access bandwidth, lower impact per benefactor, load balancing  Greedy algorithm to make the best use of available space  Stripe width and stripe size are tunable striping parameters
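
The round-robin layout below illustrates how stripe width and stripe size interact when morsels are assigned to benefactors; it is a simplified stand-in for the manager's greedy, space-aware placement, and the function name and signature are assumptions.

```python
def stripe_layout(num_morsels, benefactors, stripe_width, stripe_size):
    """Map morsel indices to benefactors in round-robin stripes.

    stripe_width: how many benefactors the dataset is spread across
    stripe_size:  how many consecutive morsels go to one benefactor
                  before moving on to the next
    """
    targets = benefactors[:stripe_width]
    return {idx: targets[(idx // stripe_size) % len(targets)]
            for idx in range(num_morsels)}


# Example: 8 one-MB morsels, stripe width 4, stripe size 2
print(stripe_layout(8, ["b0", "b1", "b2", "b3", "b4"], stripe_width=4, stripe_size=2))
# {0: 'b0', 1: 'b0', 2: 'b1', 3: 'b1', 4: 'b2', 5: 'b2', 6: 'b3', 7: 'b3'}
```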

14 Client interface  Obtains metadata from the manager  Performs gets or puts directly to the benefactors  All control messages are exchanged via UDP; all data transfers use TCP  Morsel requests are sent to benefactors in parallel; the striping strategy ensures the returned blocks are contiguous  Efficient buffering strategy: a buffer pool of size (stripe size + 1) * stripe width and a double-buffering scheme  allows network and disk I/O to proceed in parallel  once the pool fills up, buffer contents are flushed to disk  reduces disk seeks by waiting for filled buffers to form contiguous blocks before writing to disk
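
The sketch below shows one way a client could overlap parallel morsel fetches over the network with ordered, contiguous writes to the local disk, using an in-memory window of (stripe size + 1) * stripe width morsels as the slide describes; the function names and the thread-pool mechanics are assumptions, not FreeLoader's implementation.

```python
import concurrent.futures


def retrieve_dataset(layout, fetch_morsel, out_path, stripe_width, stripe_size):
    """Fetch morsels from benefactors in parallel and write them to disk in order.

    layout:       dict of morsel index -> benefactor (e.g. from stripe_layout above)
    fetch_morsel: callable(benefactor, morsel_idx) -> bytes (the network get)
    """
    window = (stripe_size + 1) * stripe_width  # morsels buffered in memory at once
    indices = sorted(layout)
    with open(out_path, "wb") as out, \
         concurrent.futures.ThreadPoolExecutor(max_workers=stripe_width) as pool:
        for start in range(0, len(indices), window):
            batch = indices[start:start + window]
            # issue the whole window of requests in parallel ...
            futures = {idx: pool.submit(fetch_morsel, layout[idx], idx) for idx in batch}
            # ... and drain it in order as one large, contiguous write, so later
            # morsels keep arriving over the network while earlier ones hit the disk
            for idx in batch:
                out.write(futures[idx].result())
```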

15 Current Status (diagram: the application calls the client's I/O interface [open(), close(), read(), write()]; the client exposes dataset operations [reserve(), cancel(), store(), retrieve(), delete()]; benefactors expose morsel operations [new(), free(), get(), put()]; interfaces A and C use UDP, B and D use UDP/TCP)  (A) services: dataset creation/deletion, space reservation  (B) services: dataset retrieval, hints  (C) services: registration; benefactor alerts, warnings, and alarms to the manager  (D) services: dataset store, morsel requests  Simple data striping implemented
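
For completeness, this is how an application might drive the client-level dataset operations named above; the FreeLoaderClient class, its constructor, and the call signatures are hypothetical, inferred only from the operation names on the slide, and the method bodies are stubs.

```python
class FreeLoaderClient:
    """Hypothetical wrapper around the client-side dataset operations."""

    def __init__(self, manager_host, manager_port):
        self.manager = (manager_host, manager_port)  # UDP control endpoint (interface A)

    def reserve(self, dataset, nbytes):
        """Ask the manager to reserve space for a new dataset."""
        ...

    def store(self, dataset, local_path):
        """Stripe the local file's morsels across benefactors (interface D)."""
        ...

    def retrieve(self, dataset, local_path):
        """Fetch morsels in parallel and reassemble the dataset locally (interface B)."""
        ...

    def delete(self, dataset):
        """Tell the manager to drop the dataset and free its morsels."""
        ...


# hypothetical host name and dataset name, for illustration only
client = FreeLoaderClient("manager.example.org", 5000)
client.reserve("sc04_viz_dataset", nbytes=2 * 1024**3)
client.store("sc04_viz_dataset", "/tmp/sc04_viz_dataset.raw")
client.retrieve("sc04_viz_dataset", "/scratch/sc04_viz_dataset.raw")
```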

16 Results: Experiment Setup  FreeLoader prototype running at ORNL  Client box: AMD Athlon 700 MHz, 400 MB memory, Gig-E card, Linux  Benefactors: a group of heterogeneous Linux workstations, each contributing 7 GB–30 GB, 100 Mbps cards

17 Data Sources  Local GPFS: attached to ORNL SCs, accessed through GridFTP (1 MB TCP buffer, 4 parallel streams)  Local HPSS: accessed through the HSI client, highly optimized; hot: data in disk cache, no tape unloading; cold: data purged, retrievals done at large intervals  Remote NFS: at the NCSU HPC center, accessed through GridFTP (1 MB TCP buffer, 4 parallel streams)  FreeLoader: 1 MB morsel size for all experiments, varying configurations

18 Testbed

19 Best-of-class performance comparisons (chart: throughput, MB/s)

20 Effect of stripe width variation (stripe size = 1 morsel)

21 Effect of stripe width variation (stripe size = 8 morsels)

22 Effect of stripe size variation (stripe width = 4 benefactors)

23 Impact Tests  How uncomfortable do the donors feel when running CPU-intensive tasks? Disk-intensive tasks? Network-intensive tasks?  A set of tests at NCSU: a benefactor performs local tasks while a client retrieves datasets at a given rate  The rate is varied to study the impact on the user  Test machine: Pentium 4, 512 MB memory, 100 Mbps connectivity
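
The impact tests vary the rate at which the client pulls data from a busy benefactor; a simple request pacer like the sketch below is one way to impose such a target rate, and is the kind of throttling the conclusions later refer to. It is entirely illustrative, not FreeLoader's actual mechanism.

```python
import time


class RatePacer:
    """Pace morsel requests so the stream stays at or below a target throughput."""

    def __init__(self, target_bytes_per_sec):
        self.rate = target_bytes_per_sec
        self.next_send = time.monotonic()

    def wait(self, nbytes):
        """Block until sending nbytes more keeps us within the target rate."""
        now = time.monotonic()
        if self.next_send > now:
            time.sleep(self.next_send - now)
        # earliest time the following request may be issued
        self.next_send = max(now, self.next_send) + nbytes / self.rate


# Example: pull 1 MB morsels at no more than 5 MB/s from one benefactor
pacer = RatePacer(5 * 1024 * 1024)
for idx in range(16):
    pacer.wait(1 << 20)
    # fetch_morsel(benefactor, idx)  # hypothetical network call, see earlier sketches
```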

24 CPU-intensive and mixed tasks (chart: time, s)

25 Network-intensive task (chart: normalized download time)

26 Disk-intensive task (chart: throughput, MB/s)

27 Sample application – formatdb  Subset of basic file APIs implemented  formatdb, from the NCBI BLAST toolkit, preprocesses a biological sequence database into a set of sequence and index files  The raw database is an ideal candidate for caching on FreeLoader  formatdb itself is not the ideal application for FreeLoader (chart: run time in seconds for local disk, NFS, and benefactors)

28 Significant results

29 Significant results – contd.  2x and 4x speedups with respect to GPFS and HPSS  Management overhead is minimal  14% worst-case performance hit for CPU-intensive tasks  <= 25% for network-intensive tasks  formatdb tests the upper bound of FreeLoader’s internal overhead: same as local disk with 1 benefactor (2% slower than NFS); 5% faster than NFS with 4 benefactors  10 MB/s performance gain for each benefactor added, until saturation

30 Conclusions  Goal is to achieve saturation on the client side; striping helps achieve this  Low-cost commodity parts  Harnesses idle disk bandwidth  Low impact on donors, controlled by throttling the request rate  Better availability; more suitable for large transient datasets than a regular file system

31 In-progress and Future Work In-progress  Windows support Future  Complete pool structure, registration  Intelligent data distribution, service profiling  Benefactor impact control, self-configuration  Naming and replication  Grid awareness Potential extensions  Harnessing local storage at cluster nodes?  Complementing commercial storage servers?

32 Further Information 
