Unistore: A Unified Storage Architecture for Cloud Computing

April 9-10, 2015 Texas Tech University Semiannual Meeting Unistore: A Unified Storage Architecture for Cloud Computing Project Members: Wei Xie,
Presentation transcript:

Unistore: A Unified Storage Architecture for Cloud Computing
Wei Xie, Jiang Zhou, and Yong Chen

Unistore: review of project plan
- Based on the Sheepdog distributed store for virtual machines
- Optimization for heterogeneous storage (SSDs and HDDs)
- Optimization for heterogeneous workloads

Benchmark
- Planned testing tools: fio and dd for generating synthetic workloads; real workload benchmarks; iostat
- Comparison with other products: GlusterFS, Ceph

Sheepdog: object storage
- Workload characterization
- Hot/cold data detection and separation: multiple Bloom filters [1], temporal locality [2]
- I/O size, write/read ratio, inter-arrival time, queue depth, latency, and IOPS
- Online vs. offline workload tracing and characterization

Sheepdog: gateway
- Responsible for deciding where to store objects (data placement)
- Consistent hashing: adding or removing a node does not significantly change the mapping
- I/O load balance
- How can consistent hashing support heterogeneous devices? Two hash rings, one for HDDs and one for SSDs

Planned schedule (status legend: finished, on-going, to-be-done)
Year 2015
- Q1: investigation and survey about Unistore
- Q2: characterization component development of Unistore
- Q3: metadata management of Unistore
- Q4: data distribution management of Unistore
Year 2016
- Q1: VM image store and loading
- Q2: advanced functions of Unistore
- Q3: performance optimization of Unistore
- Q4: module integration and system benchmarking

Initial characterization result
- Plan to implement online hot/cold data detection
- Hot data is stored on SSD, cold data on HDD
- Initial result collected from an OLTP I/O trace
[Figure: plotted series Write_ops, Write_to_HDD, Write_to_SSD]

Sheepdog: components
- Cluster manager
- QEMU block driver
- Object storage
- Gateway
- Object manager

Acknowledgements
We are grateful to Nimboxx and the Cloud and Autonomic Computing site at Texas Tech University for their valuable support of this project.

Deployment environment
- 3 CentOS 6.5 virtual machines on an iMac workstation
- Sheepdog is built on the 3 virtual machines, which form a cluster
- corosync manages the cluster (can switch to ZooKeeper if necessary)
- Will migrate to a real Linux cluster later (for testing)
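
The workload characterization metrics named in the transcript (I/O size, write/read ratio, inter-arrival time, IOPS) can be illustrated with a small offline pass over a trace. The sketch below is only an illustration: the `IORecord` layout and the `characterize` function are hypothetical and are not taken from Unistore or Sheepdog, and queue depth and latency are left out because they need per-request completion times.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class IORecord:
    """One traced request (hypothetical record layout)."""
    timestamp: float  # arrival time in seconds
    op: str           # "R" for read, "W" for write
    size: int         # request size in bytes


def characterize(trace):
    """Summarize a non-empty trace offline: size, mix, arrival gaps, IOPS."""
    trace = sorted(trace, key=lambda r: r.timestamp)
    reads = sum(1 for r in trace if r.op == "R")
    writes = sum(1 for r in trace if r.op == "W")
    # Inter-arrival time is the gap between consecutive request arrivals.
    gaps = [b.timestamp - a.timestamp for a, b in zip(trace, trace[1:])]
    duration = trace[-1].timestamp - trace[0].timestamp
    return {
        "mean_io_size": mean(r.size for r in trace),
        "write_read_ratio": writes / reads if reads else float("inf"),
        "mean_inter_arrival": mean(gaps) if gaps else 0.0,
        "iops": len(trace) / duration if duration > 0 else float("nan"),
    }
```

An online variant would keep running counters and update them per request instead of sorting a stored trace, which is the trade-off behind the "online vs. offline" item in the transcript.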
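
The transcript mentions hot/cold data detection with multiple Bloom filters but does not describe the scheme. The following is a simplified sketch of one way such a detector could work, under stated assumptions: writes are recorded into the current filter, the detector rotates to the next filter (clearing it) after a fixed number of writes, and a block counts as hot when enough filters remember it. The class names, the rotation policy, and all parameter values are hypothetical, not the project's implementation or the exact method of reference [1].

```python
import hashlib


class BloomFilter:
    """A small Bloom filter (one byte per bit, for readability)."""

    def __init__(self, bits=4096, hashes=2):
        self.bits = bits
        self.hashes = hashes  # at most 8 with the 32-byte digest below
        self.array = bytearray(bits)

    def _positions(self, key):
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        for i in range(self.hashes):
            yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.bits

    def add(self, key):
        for p in self._positions(key):
            self.array[p] = 1

    def __contains__(self, key):
        return all(self.array[p] for p in self._positions(key))

    def clear(self):
        self.array = bytearray(self.bits)


class HotColdDetector:
    """Counts in how many recent time windows a block was written."""

    def __init__(self, num_filters=4, window=100, threshold=2):
        self.filters = [BloomFilter() for _ in range(num_filters)]
        self.window = window        # writes per window (assumed policy)
        self.threshold = threshold  # filters needed to call a block hot
        self.current = 0
        self.seen = 0

    def record_write(self, block):
        self.filters[self.current].add(block)
        self.seen += 1
        if self.seen % self.window == 0:
            # Rotate: the oldest filter is cleared and becomes current,
            # so old writes gradually stop counting (decay).
            self.current = (self.current + 1) % len(self.filters)
            self.filters[self.current].clear()

    def is_hot(self, block):
        hits = sum(block in f for f in self.filters)
        return hits >= self.threshold
```

A block written in several consecutive windows shows up in several filters and is classified hot; a block written once appears in one filter and stays cold, apart from the small false-positive rate inherent to Bloom filters.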
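
The gateway notes propose two consistent-hash rings, one for HDDs and one for SSDs. A minimal sketch of that idea follows, assuming a hot/cold flag is already available for each object. `HashRing`, `TieredPlacement`, the virtual-node count, and the node names are hypothetical illustrations and do not reflect Sheepdog's actual gateway code.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative helper)."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            p = _hash(f"{node}#{i}")
            self._owner[p] = node
            bisect.insort(self._points, p)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, object_id):
        # First ring position clockwise from the object's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(object_id))
        return self._owner[self._points[idx % len(self._points)]]


class TieredPlacement:
    """Two rings: hot objects go to SSD nodes, cold objects to HDD nodes."""

    def __init__(self, ssd_nodes, hdd_nodes):
        self.ssd = HashRing(ssd_nodes)
        self.hdd = HashRing(hdd_nodes)

    def place(self, object_id, hot):
        ring = self.ssd if hot else self.hdd
        return ring.lookup(object_id)
```

For example, `TieredPlacement(["ssd-a", "ssd-b"], ["hdd-a", "hdd-b"]).place("obj-1", hot=True)` returns one of the SSD node names. Because each tier has its own ring, adding or removing an HDD node only remaps objects within the HDD tier, and within that tier only the objects that fall on the affected ring segments move.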