Unistore: Project Updates

Presentation transcript:

Unistore: Project Updates
Presenter: Wei Xie
Project members: Jiang Zhou, Mark Reyes, Jason Noble, David Cauthron, and Yong Chen
Data-Intensive Scalable Computing Laboratory (DISCL), Computer Science Department, Texas Tech University
We are grateful to Nimboxx and the Cloud and Autonomic Computing site at Texas Tech University for their valuable support of this project.

Unistore Overview
Objective: build a unified storage architecture (Unistore) for Cloud storage systems in which heterogeneous HDDs and SCM (Storage Class Memory) devices co-exist and are integrated efficiently.
Prototype development is based on Sheepdog and/or Ceph.
Data distribution management and workload characterization are handled by a Data Placement Component and a Characterization Component; access patterns guide placement.
I/O pattern: random/sequential, read/write, hot/cold workloads.
Placement algorithm: modified consistent hash.
Device characteristics: bandwidth, throughput.
Replication algorithm: pattern-directed replication.
Meeting notes (4/8/15 14:37): Objective slide.
Meeting notes (4/8/15 15:09): Unistore: design, with the idea implemented on Sheepdog. Change the title to "Unistore objective and design". Prototype development is based on Sheepdog, a distributed store for VMs. The components reflect the objective.

Project Updates
Two new papers completed and submitted: Pattern-directed Replication Scheme (to ICPP 2016) and SUORA (to SC16).
Two new papers under preparation: version consistent hashing, and Tiered-CRUSH (in preparation for SoCC16).
Simulation code and a prototype were developed to evaluate the proposed schemes.
The proposed schemes improve the performance of heterogeneous storage systems while maintaining balanced storage utilization.

Tiered-CRUSH
CRUSH (the algorithm used in Ceph) ensures that data is placed across multiple independent locations to improve data availability.
Tiered-CRUSH integrates storage tiering into CRUSH data placement.
The access frequency of objects is recorded per volume; hotter data is more likely to be placed on faster tiers.
Implemented in a benchmark tool compiled with the CRUSH library functions.
Simulation showed that data distribution uniformity can be maintained.
Simulation showed a 1.5x to 2x improvement in overall bandwidth in our experimental settings.
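The idea that hotter data is more likely to land on faster tiers can be illustrated with a small sketch. This is not the CRUSH algorithm or the project's implementation: it substitutes weighted rendezvous hashing for CRUSH, and the names `place`, `hot_threshold`, and `fast_boost`, the tier labels, and the threshold values are all assumptions made for illustration.

```python
import hashlib
import math


def _unit_hash(obj_id, dev_name):
    """Map (object, device) to a deterministic float strictly inside (0, 1)."""
    digest = hashlib.sha256(f"{obj_id}:{dev_name}".encode()).digest()
    value = int.from_bytes(digest[:8], "big")
    return (value + 1) / (2**64 + 1)


def place(obj_id, access_count, devices, replicas=2,
          hot_threshold=100, fast_boost=4.0):
    """Pick `replicas` distinct devices for an object.

    devices: list of (name, tier) pairs, tier being "ssd" or "hdd"
    (hypothetical labels). Hot objects get a larger weight on the fast
    tier, so they are more likely, not guaranteed, to be placed there.
    """
    is_hot = access_count >= hot_threshold
    scored = []
    for name, tier in devices:
        weight = fast_boost if (is_hot and tier == "ssd") else 1.0
        # Weighted rendezvous score: a larger weight gives a larger expected score.
        score = -weight / math.log(_unit_hash(obj_id, name))
        scored.append((score, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:replicas]]


devices = [("ssd0", "ssd"), ("ssd1", "ssd")] + [(f"hdd{i}", "hdd") for i in range(6)]
print(place("volume7/object42", access_count=500, devices=devices))
```

Because placement is computed from a hash rather than stored in a table, any client can recompute an object's location, which is the property CRUSH-style schemes rely on. Cold objects in this sketch spread over all devices with equal weight.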

Version Consistent Hashing Scheme
Build versions into the virtual nodes.
Avoid data migration when nodes are added or fail.
Maintain efficient data lookup.
Data lookup algorithm, example placements by version: v1: 1, 2; v2: 4, 1; v3: 4, 6; v4: 4, 6. Lookup locations: {4, 6, 1, 2}.
[Figure: performance improvement]
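The slide does not spell out the lookup rule, but its example (v1: 1,2; v2: 4,1; v3: 4,6; v4: 4,6, giving lookup locations {4, 6, 1, 2}) is consistent with probing the newest version first and skipping nodes already listed. The sketch below encodes that reading only; the function name `lookup_locations` and the list-of-lists representation are assumptions.

```python
def lookup_locations(placements_by_version):
    """Return candidate nodes to probe for an object, newest version first.

    placements_by_version: list of node lists ordered from the oldest
    version to the newest. Nodes seen in a newer version are not repeated.
    """
    ordered = []
    for nodes in reversed(placements_by_version):
        for node in nodes:
            if node not in ordered:
                ordered.append(node)
    return ordered


# Example taken from the slide: v1..v4 placements of one object.
versions = [[1, 2], [4, 1], [4, 6], [4, 6]]
print(lookup_locations(versions))  # [4, 6, 1, 2]
```

Under this reading, data written under an older version stays where it is and is still found by falling back through earlier versions, which is how migration could be avoided when membership changes.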

Pattern-directed Replication
Trace object I/O requests while applications execute.
Analyze the traces, find correlations, and group objects.
Reorganize objects for replication in the background for better performance.
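The trace analysis and grouping steps might look roughly like the following. The slides do not describe how correlation is measured, so this sketch assumes a simple co-access count within a sliding window and merges correlated objects with union-find; `window`, `min_count`, and both function names are hypothetical.

```python
from collections import Counter


def correlated_pairs(trace, window=2, min_count=2):
    """Count how often two objects are accessed within `window` requests."""
    counts = Counter()
    for i, obj in enumerate(trace):
        for other in set(trace[i:i + window]):
            if other != obj:
                counts[frozenset((obj, other))] += 1
    return {pair for pair, count in counts.items() if count >= min_count}


def group_objects(trace, window=2, min_count=2):
    """Merge correlated objects into groups that could be replicated together."""
    parent = {obj: obj for obj in trace}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for pair in correlated_pairs(trace, window, min_count):
        a, b = tuple(pair)
        parent[find(a)] = find(b)

    groups = {}
    for obj in parent:
        groups.setdefault(find(obj), set()).add(obj)
    return sorted(sorted(group) for group in groups.values())


trace = ["a", "b", "a", "b", "c", "d", "c", "d"]
print(group_objects(trace))  # [['a', 'b'], ['c', 'd']]
```

A background task could then lay out each group contiguously in one replica so that a correlated access pattern is served with fewer separate requests.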

Conclusions
The Tiered-CRUSH algorithm achieves better I/O performance and higher data availability at the same time for heterogeneous storage systems.
The version consistent hashing scheme improves the manageability of data centers.
PRS (the Pattern-directed Replication Scheme) provides high-performance data replication by reorganizing the placement of data replicas.

On-going/Future Work
Implement and evaluate the proposed schemes in Sheepdog (evaluation is currently simulation-based).
Instrument Sheepdog with the LTTng framework for more flexible tracing capability.

Thank You
Please visit: http://cac.ttu.edu/, http://discl.cs.ttu.edu/
Acknowledgement: The CAC@TTU is funded by the National Science Foundation under grants IIP-1362134 and IIP-1238338.

Please take a moment to fill out your L.I.F.E. forms. http://www.iucrc.com Select “Cloud and Autonomic Computing Center” then select “IAB” role. What do you like about this project? What would you change? (Please include all relevant feedback.)