1 FreeLoader: Lightweight Data Management for Scientific Visualization. Vincent Freeh (1), Xiaosong Ma (1,2), Nandan Tammineedi (1), Jonathan Strickland (1), Sudharshan Vazhkudai (2). 1. North Carolina State University, 2. Oak Ridge National Laboratory. September 2004

2 Roadmap  Motivation  FreeLoader architecture  Initial design and optimization  Preliminary results  In-progress and future work

3 Motivation: Data Avalanche  More data to process: science, industry, government  Example: scientific data Better observational instruments Better experimental instruments More simulation power [Pictures: space telescope, P&E gene sequencer; courtesy Jim Gray, SLAC Data Management Workshop]

4 Motivation: Needs for Remote Data Data acquisition, reduction, analysis, visualization, storage [Diagram: a data acquisition system sends metadata and raw data over a high-speed network to remote storage and supercomputers, serving local users as well as remote users with their own computing and storage]

5 Motivation: Remote Data Sources  Supercomputing centers Shared file systems Archiving systems  Data centers  Internet World Wide Telescope Virtual Observatory NCBI bio databases  Tools used for access FTP, GridFTP Grid file systems Customized data migration programs Web browser

6 Motivation: Insufficient Local Storage  End user consumes data locally Convenience and control Better CPU/memory configurations Problem 1: needs local space to hold data Problem 2: getting data from remote sources is slow  Dataset characteristics Write-once, read-many (or a few) Raw data often discarded Shared interest in the same data among groups Primary copy archived somewhere

7 Condor for Storage?  Harnessing storage resources of individual workstations ~ Harnessing idle CPU cycles

8 Why would it work, and work well?  Average workstations have more and more GBs of disk space  And half of that space is idle! Even a modest contribution (contribution << available) amasses a staggering aggregate storage pool!  Increasing numbers of workstations are online most of the time [desktop grid research]  Access locality, aggregate I/O and network bandwidth, data sharing
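A back-of-envelope sketch of this aggregate-capacity argument follows; the workstation count, disk size, and contribution fraction are illustrative assumptions, not figures from the talk.

```python
# Back-of-envelope estimate of aggregate scavenged capacity.
# All numbers below are illustrative assumptions, not measurements.

workstations = 200           # desktops in one administrative domain
disk_gb = 80                 # average disk size per workstation
idle_fraction = 0.5          # roughly half of the space sits unused
contribution_fraction = 0.1  # each donor contributes far less than is available

idle_gb = workstations * disk_gb * idle_fraction
contributed_gb = workstations * disk_gb * contribution_fraction

print(f"Idle space across the domain:           {idle_gb / 1024:.1f} TB")
print(f"Scavenged with a modest 10% contribution: {contributed_gb / 1024:.1f} TB")
```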

9 Use Cases  FreeLoader storage cloud as a: Cache Local, client-side scratch Intermediate hop Grid replica RAS for Terascale Supercomputers

10 Related Work and Design Issues  Related Work: Network/Distributed File Systems (NFS, LOCUS) Parallel File Systems (PVFS, XFS) Serverless File Systems (FARSITE, xFS, GFS) Peer-to-Peer Storage (OceanStore, PAST, CFS) Grid Storage Services (LegionFS, SRB, IBP, SRM, GASS)  Design Issues & Assumptions: Scalability: O(100) or O(1000) Commodity Components User Autonomy Security and trust Heterogeneity Large, “write once read many” datasets Transparent Naming Grid Aware

11 Intended Role of FreeLoader  What the scavenged storage “is not”: Not a replacement for high-end storage Not a file system Not intended for integrating resources at wide-area scale  What it “is”: Low-cost, best-effort alternative to scientific data sources Intended to facilitate  transient access to large, read-only datasets  data sharing within an administrative domain To be used in conjunction with higher-end storage systems

12 FreeLoader Architecture [Diagram: grid data access tools sit above a management layer (data placement, replication, grid awareness, metadata management), which coordinates a storage layer of registered pools (Pool A … Pool m, Pool n); benefactors within the pools provide morsel access, data integrity, and non-invasiveness]

13 Storage Layer  Benefactors: Morsels as the unit of contribution Basic morsel operations [new(), free(), get(), put()…] Space reclaim:  user withdrawal / space shrinkage Data integrity through checksums Performance history  Pools: Benefactor registrations (soft state) Dataset distributions Metadata Selection heuristics [Diagram: morsels of dataset 1 (1, 2, 3) and dataset n (1a, 2a, 3a, 4a) replicated and spread across benefactors]
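A minimal, in-memory Python sketch of the benefactor-side morsel interface named above (new(), free(), get(), put()) with checksum-based integrity; the 1 MB morsel size, the MD5 checksum, and the class layout are assumptions for illustration, not the actual FreeLoader implementation.

```python
import hashlib

MORSEL_SIZE = 1 << 20  # 1 MB; the real morsel size is a tunable, this value is an assumption

class Benefactor:
    """Toy in-memory benefactor exposing the morsel operations named on the slide."""

    def __init__(self):
        self._morsels = {}    # morsel id -> bytes
        self._checksums = {}  # morsel id -> MD5 digest, for data-integrity checks
        self._next_id = 0

    def new(self):
        """Allocate space for one morsel and return its id."""
        mid = self._next_id
        self._next_id += 1
        self._morsels[mid] = b""
        return mid

    def free(self, mid):
        """Release a morsel, e.g. on user withdrawal or space shrinkage."""
        self._morsels.pop(mid, None)
        self._checksums.pop(mid, None)

    def put(self, mid, data):
        """Store morsel contents and remember a checksum."""
        assert len(data) <= MORSEL_SIZE
        self._morsels[mid] = data
        self._checksums[mid] = hashlib.md5(data).hexdigest()

    def get(self, mid):
        """Return morsel contents, verifying integrity before serving them."""
        data = self._morsels[mid]
        if hashlib.md5(data).hexdigest() != self._checksums[mid]:
            raise IOError(f"morsel {mid} failed its checksum")
        return data
```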

14 Management Layer  Manager: Pool registrations Metadata: datasets-to-pools; pools-to-benefactors, etc. Availability:  Redundant Array of Replicated Morsels  Minimum replication factor for morsels  Where to replicate?  Which morsel replica to choose? Grid Awareness:  Information Providers  Space reservations  Transfer protocols Transparent Access:  Namespace
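Below is a toy sketch of the manager-side bookkeeping described above: replica maps, a minimum replication factor, and simple heuristics for which replica to read and where to place a new one. The data structures, the MIN_REPLICAS value, and the bandwidth-based choice are assumptions for illustration only.

```python
import random

MIN_REPLICAS = 2  # minimum replication factor for morsels; the real value is a policy knob

class Manager:
    """Toy manager tracking which benefactors hold which morsel replicas."""

    def __init__(self):
        # dataset -> morsel index -> set of benefactor ids holding a replica
        self.replicas = {}
        # benefactor id -> observed throughput (performance history), MB/s
        self.bandwidth = {}

    def record_replica(self, dataset, morsel, benefactor):
        self.replicas.setdefault(dataset, {}).setdefault(morsel, set()).add(benefactor)

    def under_replicated(self, dataset):
        """Morsels that have fallen below the minimum replication factor."""
        return [m for m, hosts in self.replicas.get(dataset, {}).items()
                if len(hosts) < MIN_REPLICAS]

    def choose_replica(self, dataset, morsel):
        """Pick a replica to read, preferring the benefactor with the best history."""
        hosts = self.replicas[dataset][morsel]
        return max(hosts, key=lambda b: self.bandwidth.get(b, 0.0))

    def choose_replication_target(self, dataset, morsel, all_benefactors):
        """Pick where to place a new replica: any benefactor not already holding one."""
        candidates = set(all_benefactors) - self.replicas[dataset][morsel]
        return random.choice(sorted(candidates)) if candidates else None
```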

15 Dataset Striping  Stripe datasets across benefactors Morsel doubles as basic unit of striping  Multiple-fold benefits Higher aggregate access bandwidth Better resource usage Lowering impact per benefactor  Tradeoff between access rates and availability  Need to consider Heterogeneity, network connections Working together with replication Serving partial datasets
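A small sketch of morsel-level striping follows: plain round-robin by default, with an optional bandwidth-weighted variant to account for heterogeneity. The weighting scheme is an assumption chosen for illustration, not the deck's actual placement policy.

```python
def stripe(num_morsels, benefactors, weights=None):
    """Assign morsel indices to benefactors.

    Plain round-robin by default; if per-benefactor weights (e.g. measured
    bandwidth) are given, morsels are handed out proportionally so faster or
    better-connected donors receive more of the stripe.
    """
    if weights is None:
        return {i: benefactors[i % len(benefactors)] for i in range(num_morsels)}

    total = sum(weights[b] for b in benefactors)
    layout, credit = {}, {b: 0.0 for b in benefactors}
    for i in range(num_morsels):
        # Give each benefactor credit in proportion to its weight, then
        # assign the morsel to whoever has accumulated the most credit.
        for b in benefactors:
            credit[b] += weights[b] / total
        target = max(credit, key=credit.get)
        credit[target] -= 1.0
        layout[i] = target
    return layout

# Example: 8 morsels over three benefactors, one with twice the bandwidth.
print(stripe(8, ["b1", "b2", "b3"], {"b1": 200, "b2": 100, "b3": 100}))
```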

16 Current Status [Diagram: the application uses the client I/O interface (open(), close(), read(), write()); client, manager, and benefactors communicate over four channels: UDP (A), UDP/TCP (B), UDP (C), UDP/TCP (D); manager-side calls include reserve(), cancel(), store(), retrieve(), delete(); benefactors expose the morsel operations new(), free(), get(), put()]  (A) services: Dataset creation/deletion Space reservation  (B) services: Dataset retrieval Hints  (C) services: Registration Benefactor alerts, warnings, alarms to manager  (D) services: Dataset store Morsel request  Simple data striping
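The client-side retrieval flow suggested by the diagram can be sketched as below; manager.layout() and fetch_morsel() are hypothetical stand-ins for the real control and data channels, and the parallel fetch is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

def retrieve_dataset(manager, dataset, fetch_morsel):
    """Sketch of a client-side dataset retrieval.

    `manager.layout(dataset)` (hypothetical) returns (morsel_index, benefactor)
    pairs over the control channel; morsels are then fetched from their
    benefactors in parallel and reassembled in order.
    `fetch_morsel(benefactor, dataset, index)` stands in for the real
    UDP/TCP data transfer.
    """
    layout = manager.layout(dataset)  # metadata lookup over the control channel
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = {idx: pool.submit(fetch_morsel, host, dataset, idx)
                   for idx, host in layout}  # data-channel fetches
    # The executor waits for all fetches on exit; reassemble in morsel order.
    return b"".join(futures[idx].result() for idx in sorted(futures))
```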

17 Preliminary Results: Experiment Setup  FreeLoader prototype running at ORNL Client Box  AMD Athlon 700MHz  400MB memory  Gig-E card  Linux Benefactors  Group of heterogeneous Linux workstations  Contributing 7GB-30GB each  100Mb cards

18 Sample Data Sources  Local GPFS Attached to ORNL SPs Accessed through GridFTP 1MB TCP buffer, 4 parallel streams  Local HPSS Accessed through HSI client, highly optimized Hot: data in disk cache without tape unloading Cold: data purged, retrieval done in large intervals  Remote NFS At NCSU HPC center Accessed through GridFTP 1MB TCP buffer, 4 parallel streams

19 FreeLoader Data Retrieval Performance [Chart: retrieval throughput in MB/s]

20 Impact Tests  How uncomfortable might donors feel?  A set of tests at NCSU: a benefactor performs local tasks while a client retrieves datasets at a given rate
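A rough sketch of such an impact test is shown below; the 1 MB read granularity, the background thread, and the rate control are assumptions about the methodology, not the actual harness used in these experiments.

```python
import subprocess
import threading
import time

def impact_test(local_task_cmd, retrieve_fn, rate_mb_s):
    """Time a local task on the benefactor while a client pulls data at a fixed rate.

    `retrieve_fn(nbytes)` stands in for one remote morsel read; `local_task_cmd`
    is the donor's own workload (e.g. a compile or download).
    """
    stop = threading.Event()

    def reader():
        chunk = 1 << 20                 # pull 1 MB at a time from the benefactor
        interval = 1.0 / rate_mb_s      # pause between 1 MB reads to hold the target rate
        while not stop.is_set():        # (ignores the time spent in the read itself)
            retrieve_fn(chunk)
            time.sleep(interval)

    threading.Thread(target=reader, daemon=True).start()
    start = time.time()
    subprocess.run(local_task_cmd, check=True)  # the CPU-, network-, or disk-intensive job
    elapsed = time.time() - start
    stop.set()
    return elapsed
```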

21 CPU-intensive Task [Chart: task completion time in seconds]

22 Network-intensive Task [Chart: normalized download time]

23 Disk-intensive Task [Chart: throughput in MB/s]

24 Mixed Task: Linux Kernel Compilation [Chart: compilation time in seconds]

25 In-progress and Future Work In-progress  APIs for use as scratch space  Windows support Future  Complete pool structure, registration  Intelligent data distribution, service profiling  Benefactor impact control, self-configuration  Naming and replication  Grid awareness Potential extensions  Harnessing local storage at cluster nodes?  Complementing commercial storage servers?

26 Further Information 