Robust Storage Management on Desktop, in Machine Room, and Beyond

Presented by Xiaosong Ma
Computer Science and Mathematics, Oak Ridge National Laboratory
Department of Computer Science, North Carolina State University

Contributors: Sudharshan Vazhkudai, Stephen Scott

Ma_FL_0611

Storage availability challenge – data producer side

- More data will be produced by petaflop machines
- High-end computing (HEC) storage systems face severe performance and reliability challenges
- Storage failures are a significant contributor to system downtime
- There is a need to reduce I/O-related job failures and resubmissions

System      | # CPUs | MTBF/I         | Outage source
ASCI Q      | …      | … hrs          | Storage, CPU
ASCI White  | …      | … hrs          | Storage, CPU
Google      | …      | … reboots/day  | Storage, memory

Microscopic view:
- 3 to 7% per year of disk failures; 3 to 16% per year of controller failures; up to 12% per year of SAN switches
- Ten times the rate expected from the disk vendors' specification sheets

Storage availability challenge – data consumer side

- Data eventually get analyzed and visualized at desktop machines
  - These are indispensable human-interaction devices
  - They are highly limited in both storage capacity and performance
- Meanwhile, many nearby workstations have unused disk space and idle system resources
  - Large scientific datasets can be striped across the donated space
  - Access can be much faster than using the local disk
  - However, donated nodes can become unavailable at any time

Common point: need to address the availability of transient data

- Scientific data in HEC settings are different from general-purpose data
  - They are produced and accessed in a distributed manner; supercomputers are neither the source nor the destination of the data
  - Input data are pre-staged and read-only
  - Output data are offloaded after the run, write-once during the simulation, and read-only during analysis
  - Data are analyzed with temporal and spatial locality
- Storage space is a precious shared resource
- New technology is needed to improve data availability

FreeLoader overview

Enabling trends:
- Unused storage: more than 50% of desktop storage is unused
- Immutable data: data are usually write-once, read-many, with remote source copies
- Connectivity: well-connected, secure LAN settings

FreeLoader aggregate storage cache:
- Scavenges O(GB) contributions from desktops
- Parallel I/O environment across loosely connected workstations, aggregating both I/O and network bandwidth
- NOT a file system, but a low-cost, local storage solution enabling client-side caching and locality
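The aggregation idea above, striping a dataset across donated space and reading it back in parallel, can be sketched as follows. This is a minimal illustration with in-memory "donors"; the round-robin layout, chunk size, and node names are assumptions for the sketch, not FreeLoader's actual parameters.

```python
CHUNK_SIZE = 4  # bytes per stripe unit (tiny, for illustration only)

def stripe(data: bytes, donors: list) -> dict:
    """Assign consecutive chunks of `data` to donor nodes round-robin."""
    layout = {d: [] for d in donors}
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    for i, chunk in enumerate(chunks):
        layout[donors[i % len(donors)]].append(chunk)
    return layout

def reassemble(layout: dict, donors: list) -> bytes:
    """Read chunks back in round-robin order to rebuild the dataset."""
    out, i = [], 0
    while True:
        donor = donors[i % len(donors)]
        slot = i // len(donors)
        if slot >= len(layout[donor]):
            break
        out.append(layout[donor][slot])
        i += 1
    return b"".join(out)
```

Because consecutive chunks live on different donors, a client can issue reads to all donors concurrently, which is where the aggregated I/O and network bandwidth comes from.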

Dealing with unreliable FreeLoader nodes

A combination of two techniques:
- Collective downloading
  - Multiple nodes patch missing data, each retrieving a long, contiguous file segment
  - Local data shuffling restores the desired striping pattern
- Prefix caching
  - Only part of each dataset is stored
  - Tail patching overlaps with serving the cached prefix
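A minimal sketch of how the two techniques fit together, assuming an in-memory source copy; `partition_tail()`, `fetch()`, and the prefix length are illustrative stand-ins, not FreeLoader's actual interfaces, and the tail fetch runs sequentially here whereas the real system overlaps it with serving the prefix.

```python
FULL = b"0123456789" * 4     # immutable remote source copy of the dataset
PREFIX_LEN = 10              # fraction kept in the cache (assumption)
CACHED_PREFIX = FULL[:PREFIX_LEN]

def partition_tail(tail_len: int, n_nodes: int, offset: int) -> list:
    """Collective downloading: split the missing tail into one long,
    contiguous segment per patching node (one large remote read each)."""
    base, extra = divmod(tail_len, n_nodes)
    segments, start = [], offset
    for i in range(n_nodes):
        length = base + (1 if i < extra else 0)
        segments.append((start, length))
        start += length
    return segments

def fetch(offset: int, length: int) -> bytes:
    """Stand-in for one node's contiguous remote read from the source."""
    return FULL[offset:offset + length]

def serve() -> bytes:
    """Prefix caching: the cached prefix is available immediately; the
    tail is patched collectively from the source."""
    tail = b"".join(fetch(off, ln)
                    for off, ln in partition_tail(len(FULL) - PREFIX_LEN,
                                                  3, PREFIX_LEN))
    return CACHED_PREFIX + tail
```

The point of the contiguous per-node segments is that each patching node issues one large sequential remote read; the shuffle back to the striped layout then happens locally over the fast LAN.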

Extending FreeLoader fault tolerance to the machine-room scenario

- Node-local storage can be efficiently aggregated
  - Temporary scratch space to store transient job data
- Augmenting parallel file system metadata
  - Extend file system metadata to include recovery hints
  - Information regarding the "source" and "sink" of the user's job data
  - Sample metadata items: URIs, credentials, etc.
  - Metadata automatically extracted from job scripts
- Enables elegant, automatic data recovery and offloading without manual intervention
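Extracting recovery hints from a job script might look like the sketch below. The #STAGEIN/#STAGEOUT directive names and the sample script are hypothetical; the slides say only that metadata such as URIs and credentials are extracted automatically from job scripts.

```python
import re

# Hypothetical batch script; only gsiftp://source/dataset comes from
# the slides, the rest of the paths and directives are invented.
JOB_SCRIPT = """\
#PBS -l nodes=64
#STAGEIN  gsiftp://source/dataset  /scratch/job42/input
#STAGEOUT /scratch/job42/output   gsiftp://sink/results
mpirun ./simulation
"""

def extract_recovery_hints(script: str) -> dict:
    """Parse staging directives into 'recovery metadata' recording the
    source and sink of the job's data."""
    hints = {"stage_in": [], "stage_out": []}
    for line in script.splitlines():
        m = re.match(r"#STAGEIN\s+(\S+)\s+(\S+)", line)
        if m:
            hints["stage_in"].append({"source": m.group(1),
                                      "dest": m.group(2)})
        m = re.match(r"#STAGEOUT\s+(\S+)\s+(\S+)", line)
        if m:
            hints["stage_out"].append({"source": m.group(1),
                                       "sink": m.group(2)})
    return hints
```

With these hints attached to the file system metadata, both recovery (re-fetch from the stage-in source) and offloading (push to the stage-out sink) can proceed without manual intervention.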

Online data recovery

- Motivation
  - Staged input data have natural redundancy and are immutable
  - RAID does not help with I/O-node failures
  - Idea: patch data from the staging source
- How?
  - Enhance parallel file systems with "recovery metadata"
  - Deploy multiple nodes for parallel patching
- Preliminary results
  - Large remote I/O requests (256 MB) and local shuffling can be overlapped with client activity
  - Data reconstruction from ORNL to PSC shows good scalability
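A minimal sketch of the reconstruction step, assuming a round-robin stripe layout and an in-memory source copy: given a failed I/O node, the byte ranges of the stripe units it held are recomputed from the striping parameters and re-fetched from the staging source. The stripe size and node count are illustrative assumptions.

```python
STRIPE_SIZE = 4   # bytes per stripe unit (tiny, for illustration only)
N_NODES = 4       # I/O nodes the file is striped across (assumption)

def lost_ranges(file_len: int, failed_node: int) -> list:
    """Byte ranges of the stripe units stored on the failed node,
    under a round-robin layout."""
    ranges = []
    offset = failed_node * STRIPE_SIZE
    stride = N_NODES * STRIPE_SIZE
    while offset < file_len:
        ranges.append((offset, min(STRIPE_SIZE, file_len - offset)))
        offset += stride
    return ranges

def patch(source: bytes, failed_node: int) -> dict:
    """Re-fetch only the lost units from the immutable source copy;
    `source` stands in for a remote read via the recovery metadata."""
    return {off: source[off:off + ln]
            for off, ln in lost_ranges(len(source), failed_node)}
```

In the real system, the per-range fetches would be distributed over multiple patching nodes and issued as large remote requests, which is what makes the reconstruction overlap well with ongoing client activity.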

Eager offloading of result data

- Offloading result data is equally important for end-user visualization and interpretation
- Storage-system failures and the purging of scratch space can cause loss of result data
- Eager offloading:
  - Transparent data migration using the destination embedded in the metadata
  - Data offloading can be overlapped with computation
  - Can fail over to intermediate storage or archives
  - Needs coordination with the parallel file system and job-management tools
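Overlapping offloading with computation can be sketched as a producer/consumer pattern: the simulation enqueues each finished output file, and a background thread migrates it while computation continues. `transfer()`, the file names, and the destination URI are stand-ins for the real mechanism (e.g., a GridFTP transfer to the sink recorded in the metadata).

```python
import queue
import threading

offloaded = []  # record of completed transfers, for illustration

def transfer(path: str, dest: str) -> None:
    """Stand-in for migrating one result file to the user's destination."""
    offloaded.append((path, dest))

def offloader(q: queue.Queue, dest: str) -> None:
    """Drain the queue of finished output files; a None sentinel means
    the computation is done."""
    while True:
        path = q.get()
        if path is None:
            break
        transfer(path, dest)

q = queue.Queue()
t = threading.Thread(target=offloader, args=(q, "gsiftp://sink/results"))
t.start()
for step in range(3):               # the "computation": one file per step
    q.put(f"/scratch/out.{step}")   # enqueue as soon as the file is closed
q.put(None)                         # signal completion
t.join()
```

Failover to intermediate storage would amount to the offloader retrying `transfer()` against an alternate destination when the primary sink is unreachable.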

Summary: System overview

[Diagram: compute nodes at the supercomputer center access a parallel file system, backed by archives, with caches near the I/O nodes. End-user/mirror sites hold the source copy of the dataset (gsiftp://source/dataset). On failure, staged data are reconstructed online from the data source; result data are offloaded to the end-user sites.]

Contacts

Xiaosong Ma
Assistant Professor, Department of Computer Science, North Carolina State University
Joint Faculty, Computer Science and Mathematics, Oak Ridge National Laboratory
(919)

Sudharshan S. Vazhkudai
Computing & Computational Sciences Directorate
Computer Science and Mathematics, Oak Ridge National Laboratory
(865)