
Presentation transcript:

Storage Issues

Replica Placement
Most existing work focuses on how to place replicas at low cost. Would it be safer to separate the replicas as far as possible?
◦ Replicas on the same server => all lost in a server crash
◦ Replicas in the same rack => all lost in a rack failure
◦ Replicas in the same datacenter => all lost in an earthquake or other cataclysm
Placement should consider both distance and cost.
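The distance-versus-cost trade-off can be sketched in a few lines of Python. This is only an illustration, not a method from the slides: the `Node` fields, the 0-3 separation scale, the per-node `cost` values, and the `cost_weight` knob are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Node:
    """A candidate storage location (all fields are hypothetical)."""
    name: str
    datacenter: str
    rack: str
    server: str
    cost: float  # assumed cost of keeping one replica here


def separation(a: Node, b: Node) -> int:
    """0 = same server, 1 = same rack, 2 = same datacenter, 3 = different datacenters."""
    if a.datacenter != b.datacenter:
        return 3
    if a.rack != b.rack:
        return 2
    if a.server != b.server:
        return 1
    return 0


def score(placement, cost_weight: float) -> float:
    """Reward the weakest pairwise separation, penalize the total cost."""
    weakest = min(separation(a, b) for a, b in combinations(placement, 2))
    total_cost = sum(node.cost for node in placement)
    return weakest - cost_weight * total_cost


def place_replicas(nodes, replicas: int, cost_weight: float = 0.1):
    """Exhaustively pick the best-scoring set of nodes (fine for small inputs only)."""
    return max(combinations(nodes, replicas), key=lambda p: score(p, cost_weight))


nodes = [
    Node("a", "dc1", "r1", "s1", cost=1.0),
    Node("b", "dc1", "r1", "s1", cost=1.0),  # same server as "a"
    Node("c", "dc1", "r2", "s3", cost=1.0),
    Node("d", "dc2", "r1", "s4", cost=5.0),  # remote and expensive
]
far_apart = place_replicas(nodes, replicas=2, cost_weight=0.1)
cheap = place_replicas(nodes, replicas=2, cost_weight=1.0)
```

With a small `cost_weight` the search accepts the expensive remote datacenter to survive a site-wide failure; with a large one it stays inside one datacenter but still spreads the replicas across racks.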

Data Deduplication
Data deduplication is a specialized data compression technique for eliminating coarse-grained redundant data.
◦ It improves storage utilization.
Issues:
◦ How to improve the efficiency of duplicate detection and chunk-existence queries. Techniques include efficient chunking, faster hash indexing, locality-preserving index caching, and efficient Bloom filters.
◦ Compressing the unique chunks and performing large (fixed-size) writes through containers or similar structures.
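A minimal Python sketch of the write path described above, under loud assumptions: fixed-size chunking, SHA-256 fingerprints, an in-memory dict standing in for the hash index, a toy Bloom filter, and tiny containers. The class names, `CHUNK_SIZE`, and `CONTAINER_CHUNKS` are hypothetical, and compression and index caching are left out.

```python
import hashlib

CHUNK_SIZE = 4        # bytes per chunk; unrealistically small, for illustration
CONTAINER_CHUNKS = 2  # unique chunks written together as one container


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, bits: int = 1024, hashes: int = 3):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray(bits)  # one byte per bit, for readability

    def _positions(self, key: bytes):
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.array[pos] = 1

    def might_contain(self, key: bytes) -> bool:
        return all(self.array[pos] for pos in self._positions(key))


class DedupStore:
    def __init__(self):
        self.bloom = BloomFilter()
        self.index = {}       # fingerprint -> (container id, slot)
        self.containers = []  # sealed containers, each a list of chunks
        self.open = []        # chunks buffered for the next container

    def write(self, data: bytes) -> list:
        """Store a stream and return its recipe (ordered chunk fingerprints)."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).digest()
            # Cheap Bloom check first; consult the full index only on a "maybe".
            is_duplicate = self.bloom.might_contain(fp) and fp in self.index
            if not is_duplicate:
                self.index[fp] = (len(self.containers), len(self.open))
                self.bloom.add(fp)
                self.open.append(chunk)
                if len(self.open) == CONTAINER_CHUNKS:
                    self.containers.append(self.open)  # one large write
                    self.open = []
            recipe.append(fp)
        return recipe

    def _chunk(self, container_id: int, slot: int) -> bytes:
        if container_id < len(self.containers):
            return self.containers[container_id][slot]
        return self.open[slot]  # chunk still sits in the unsealed container

    def read(self, recipe: list) -> bytes:
        """Reconstruct the original stream from its recipe."""
        return b"".join(self._chunk(*self.index[fp]) for fp in recipe)

    def unique_chunks(self) -> int:
        return sum(len(c) for c in self.containers) + len(self.open)


store = DedupStore()
first = store.write(b"AAAABBBBAAAACCCC")  # the second AAAA is a duplicate
second = store.write(b"BBBBCCCCDDDD")     # only DDDD is new
```

After both writes the store holds four unique chunks for seven logical ones, and `store.read(first)` returns the original bytes.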

Read Performance of Deduplication Storage
From a publication by David H. C. Du, HPCC'11.
Read performance is critical when reconstructing the original data stream.
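One way to see why reconstruction can be slow: after deduplication, consecutive chunks of a stream may live in different containers, so a restore has to fetch many containers. The sketch below is an assumption-laden illustration and not the method of the cited publication; `container_reads`, the LRU container cache, and its `cache_size` parameter are hypothetical.

```python
from collections import OrderedDict


def container_reads(chunk_locations, cache_size: int = 1) -> int:
    """Count container fetches needed to restore a stream.

    chunk_locations lists, in stream order, the container id holding each chunk.
    Fetched containers are kept in an LRU cache of cache_size entries.
    """
    cache = OrderedDict()
    reads = 0
    for container_id in chunk_locations:
        if container_id in cache:
            cache.move_to_end(container_id)  # mark as most recently used
            continue
        reads += 1                           # cache miss: fetch the container
        cache[container_id] = True
        if len(cache) > cache_size:
            cache.popitem(last=False)        # evict the least recently used
    return reads


sequential = container_reads([0, 0, 1, 1, 2, 2])              # chunks stored in stream order
scattered = container_reads([0, 1, 0, 1, 0, 1])               # chunks interleaved across containers
scattered_cached = container_reads([0, 1, 0, 1, 0, 1], cache_size=2)
```

The same six chunks cost three fetches when laid out in stream order and six when interleaved, unless the cache is large enough to hold both containers.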

Read Performance of Deduplication Storage (Cont.)
One example is storing VM images (process/memory/disk) on shared network storage.
◦ VM images of idle desktops are migrated to network storage to save energy.

Benchmarks
◦ Filebench x.php
◦ Phoronix Test Suite – disk test suite
◦ Bonnie++