Chapter 6 Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism Topic 4 Storage Prof. Zhang Gang gzhang@tju.edu.cn School of.

Presentation transcript:

Chapter 6 Warehouse-Scale Computers to Exploit Request-Level and Data-Level Parallelism Topic 4 Storage Prof. Zhang Gang gzhang@tju.edu.cn School of Computer Sci. & Tech. Tianjin University, Tianjin, P. R. China

Storage
Storage options: use disks inside the servers, or use network attached storage (NAS) over InfiniBand.
The NAS solution is generally more expensive per terabyte of storage, but it provides many features, including RAID techniques that improve the dependability of the storage.
WSCs generally rely on local disks.
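To make the RAID remark concrete, here is a minimal sketch of the XOR parity idea behind parity-based RAID levels. It is an illustration only, not taken from the slides: the block contents and the helper name xor_blocks are hypothetical, and real RAID controllers work on disk stripes rather than short byte strings.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per disk (all the same length).
data = [b"wsc-", b"disk", b"data"]

# The parity block is stored on a fourth disk.
parity = xor_blocks(data)

# Suppose the disk holding data[1] fails. XOR-ing the surviving data
# blocks with the parity block reconstructs the lost block.
survivors = [data[0], data[2], parity]
recovered = xor_blocks(survivors)
```

One parity block protects against the loss of any single block in the group, which is the kind of dependability feature a NAS system can provide and that a WSC using plain local disks has to obtain in another way.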

Storage
Google File System (GFS) uses local disks and maintains at least three replicas.
This redundancy covers not just local disk failures, but also power failures to racks and to whole clusters.
Because GFS only requires eventual consistency, keeping the replicas consistent costs less, which also reduces the network bandwidth the storage system needs.
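The point about rack and cluster failures can be sketched as a replica placement problem. The code below is a simplified illustration under stated assumptions, not the actual GFS placement policy: the Server type, the greedy place_replicas function, and the server, rack, and cluster names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Server:
    name: str
    rack: str      # rack identifiers are assumed unique across clusters
    cluster: str

def place_replicas(servers, copies=3):
    """Greedily pick servers in distinct racks, preferring unused clusters."""
    chosen = []
    for _ in range(copies):
        used_racks = {s.rack for s in chosen}
        used_clusters = {s.cluster for s in chosen}
        candidates = [s for s in servers if s.rack not in used_racks]
        if not candidates:
            raise ValueError("not enough distinct racks for the replicas")
        # False sorts before True, so servers in a new cluster come first.
        candidates.sort(key=lambda s: s.cluster in used_clusters)
        chosen.append(candidates[0])
    return chosen

def survives(replicas, failed_racks=(), failed_clusters=()):
    """True if at least one replica is outside every failed rack and cluster."""
    return any(
        s.rack not in failed_racks and s.cluster not in failed_clusters
        for s in replicas
    )

servers = [
    Server("s1", "c1-r1", "c1"),
    Server("s2", "c1-r1", "c1"),
    Server("s3", "c1-r2", "c1"),
    Server("s4", "c2-r1", "c2"),
    Server("s5", "c2-r2", "c2"),
]
replicas = place_replicas(servers)
```

With this placement, the data stays reachable after a power failure that takes out a single rack or the whole of cluster c1, because at least one replica sits in the other cluster. A RAID array inside one server or one NAS box cannot offer that, which is why replication across failure domains fits WSCs that rely on local disks.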