http://secret.cis.uab.edu | http://thecenter.uab.edu/ A Comparative Study on I/O Performance between Compute and Storage Optimized Instances of Amazon EC2

Presentation transcript:

A Comparative Study on I/O Performance between Compute and Storage Optimized Instances of Amazon EC2
Abu Awal Md Shoeb, Ragib Hasan, Md. Haque, and Meng Hu
University of Alabama at Birmingham, USA
http://secret.cis.uab.edu | http://thecenter.uab.edu/
IEEE CLOUD 2014

The Center
The UAB Center for Information Assurance and Joint Forensics Research (CIA|JFR)
A DHS/NSA Center of Excellence in Information Assurance Research (CAE-R)
http://TheCenter.uab.edu/
SECRETLab: http://secret.cis.uab.edu

Outline
Introduction
Problem of Cloud Computing
Motivation
Available EC2 Families
Benchmark Tool
Experimental Setup
Results
Future Direction
Conclusion

What is the Problem?
Along with security issues, clients face the following problems:
Clients cannot verify whether they really consume the resources as claimed by the provider.
Clients cannot confirm whether the cloud provider provides resources efficiently.
Moreover, Amazon itself recommends load testing before selecting an instance for a specific application or service (http://aws.amazon.com/ec2/instance-types/#measuring-performance).

Why is it Difficult to Solve?
Clients have little access to architectural information because of the black-box nature of clouds.
Providers are reluctant to share information about the actual location of a VM inside a physical machine.
Clients cannot trace whether resource usage is attributed to the correct customer.

Cloud Usability Issues

Our Hypothesis
Amazon’s claim: Storage Optimized instances should have the highest I/O performance at a low cost.
Our hypothesis: We took two instances, one from the Compute Optimized family and one from the Storage Optimized family, and compared their I/O performance.
In our experiment, Amazon’s claim was disproved in almost 50% of cases!

Different Families in EC2
General Purpose: provides a balance of compute, memory, and network resources.
Compute Optimized: provides the highest-performing processors and the lowest price/compute performance available in EC2.
Memory Optimized: optimized for memory-intensive applications, with the lowest cost per GiB of RAM.
Storage Optimized: provides very fast SSD-backed storage for very high random I/O performance at a low cost.

Compute and Storage Optimized Instances
Group              Instance Type  vCPU  Memory (GiB)  Storage (GB)  Networking Performance  Physical Processor     Clock Speed (GHz)
Compute Optimized  c3.large       2     3.75          2 x 16 SSD    Moderate                Intel Xeon E5-2680 v2  2.8
                   c3.xlarge      4     7.5           2 x 40 SSD
                   c3.2xlarge     8     15            2 x 80 SSD    High
                   c3.4xlarge     16    30            2 x 160 SSD
                   c3.8xlarge     32    60            2 x 320 SSD   10 Gigabit
Storage Optimized  i2.xlarge            30.5          1 x 800 SSD                           Intel Xeon E5-2670 v2  2.5
                   i2.2xlarge           61            2 x 800 SSD
                   i2.4xlarge           122           4 x 800 SSD
                   i2.8xlarge           244           8 x 800 SSD
                   hs1.8xlarge          117           24 x 2,048    10 Gigabit              Intel Xeon Family

Variables that Affect the I/O Performance
Instance Type: performance varies even when two instances of different types have the same configuration.
Time: performance of the same instance varies over time.
File Size: I/O performance of the same instance varies with file size.
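One way to quantify how much performance varies with instance, time, or file size is the coefficient of variation of repeated measurements. The sketch below is illustrative only: the function names and the record layout (a dict with an "mb_per_s" field) are assumptions, not part of the benchmark tool used in the study.

```python
from collections import defaultdict
from statistics import mean, stdev


def coefficient_of_variation(samples):
    """Return the sample standard deviation divided by the mean."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    avg = mean(samples)
    if avg == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return stdev(samples) / avg


def variation_by(records, key):
    """Group hypothetical measurement records by `key` and report the CV per group.

    Each record is assumed to be a dict with a numeric "mb_per_s" field plus
    descriptive fields such as "instance" or "file_size".
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["mb_per_s"])
    return {name: coefficient_of_variation(vals) for name, vals in groups.items()}
```

Calling `variation_by(records, "instance")` compares instances, while `variation_by(records, "file_size")` compares file sizes using the same records.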

About Benchmark Tool
We use a standard benchmark tool available at http://www.roylongbottom.org.uk/
We modified the tool (i.e., the file size and the number of files to be read, written, and deleted) for our experiment.
It measures disk write and read speeds for multiple files at different block sizes.
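The kind of measurement described here can be sketched in a few lines of Python. This is not the Roy Longbottom tool or the authors' modified version; it is a minimal stand-in, with hypothetical function names, that times writing, reading, and deleting several files at a chosen block size.

```python
import os
import tempfile
import time


def benchmark_files(directory, num_files, file_size, block_size):
    """Time writing, reading, and deleting num_files files of file_size bytes.

    Returns elapsed seconds per phase. Reads may be served from the OS page
    cache, so this sketch does not measure raw device speed.
    """
    block = os.urandom(block_size)
    paths = [os.path.join(directory, f"bench_{i}.dat") for i in range(num_files)]
    timings = {}

    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as f:
            written = 0
            while written < file_size:
                chunk = block[: min(block_size, file_size - written)]
                f.write(chunk)
                written += len(chunk)
            f.flush()
            os.fsync(f.fileno())  # push data to the device before stopping the clock
    timings["write"] = time.perf_counter() - start

    start = time.perf_counter()
    bytes_read = 0
    for path in paths:
        with open(path, "rb") as f:
            while True:
                data = f.read(block_size)
                if not data:
                    break
                bytes_read += len(data)
    timings["read"] = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.remove(path)
    timings["delete"] = time.perf_counter() - start

    timings["bytes_read"] = bytes_read
    return timings


def throughput_mb_s(total_bytes, seconds):
    """Convert bytes and seconds to MB/s (1 MB = 10**6 bytes)."""
    return total_bytes / seconds / 1e6 if seconds > 0 else float("inf")
```

For example, `benchmark_files(tempfile.mkdtemp(), 10, 1_000_000, 4096)` times ten 1 MB files written and read in 4 KiB blocks; the file count, file size, and block size are the knobs the slide says were modified.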

Experimental Setup
Disk Performance Benchmark: random and sequential read, write, and delete, with different numbers of files and file sizes (1 KB to 160 MB)
Number of Instances and Location: around 10 instances (5 from each group, C3 and I2) in US East (Virginia)
Number of Benchmark Executions: 20 for each instance
Time Frame: over one month
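The setup can be written down as a small run plan. The sketch below is an assumption-laden illustration: only the size endpoints (1 KB and 160 MB), the 20 executions per instance, and the 5 instances per group come from the slide; the intermediate file sizes and the instance labels are hypothetical.

```python
from itertools import product

KB = 1024
MB = 1024 * KB

# Only the endpoints (1 KB and 160 MB) come from the slide; the
# intermediate sizes are illustrative assumptions.
FILE_SIZES = [1 * KB, 64 * KB, 1 * MB, 16 * MB, 160 * MB]
ACCESS_PATTERNS = ["random", "sequential"]
OPERATIONS = ["read", "write", "delete"]
RUNS_PER_INSTANCE = 20  # from the slide
# Hypothetical labels for 5 instances per group (about 10 in total).
INSTANCES = [f"{group}-{i}" for group in ("c3", "i2") for i in range(1, 6)]


def build_plan():
    """Enumerate every measurement as a dict, one per combination."""
    return [
        {
            "instance": inst,
            "run": run,
            "pattern": pattern,
            "operation": op,
            "file_size": size,
        }
        for inst, run, pattern, op, size in product(
            INSTANCES,
            range(1, RUNS_PER_INSTANCE + 1),
            ACCESS_PATTERNS,
            OPERATIONS,
            FILE_SIZES,
        )
    ]
```

Enumerating the plan up front makes it easy to spread the runs over the month-long time frame and to check afterwards that no combination was skipped.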

Comparison Summary: Compute vs. Storage Optimized Instances
Operation  Type        c3.xlarge  i2.xlarge
Read       Random      46%        54%
           Sequential  40%        60%
Write                  50%
                       51%        49%
Delete                 92%        8%
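The slides do not define how these percentages were computed. Assuming each figure is the share of benchmark cases in which that instance was faster, a summary of this kind could be derived from paired timings as sketched below; the function name and the tie-handling rule are assumptions.

```python
def win_shares(c3_times, i2_times):
    """Return (c3_percent, i2_percent) of paired cases each instance wins.

    Inputs are paired elapsed times where lower is better. Ties are
    excluded from the denominator (an assumption of this sketch).
    """
    if len(c3_times) != len(i2_times):
        raise ValueError("timing lists must be paired one-to-one")
    c3_wins = sum(1 for a, b in zip(c3_times, i2_times) if a < b)
    i2_wins = sum(1 for a, b in zip(c3_times, i2_times) if b < a)
    decided = c3_wins + i2_wins
    if decided == 0:
        return (0.0, 0.0)
    return (100.0 * c3_wins / decided, 100.0 * i2_wins / decided)
```

Under this reading, a row such as 46% versus 54% would mean the Storage Optimized instance was faster in slightly more than half of the compared cases.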

Performance Variation – I2 Read Operation

Performance Variation–Random Write

Future Direction
Benchmark other instances from the same group
Benchmark instances located in other regions
Benchmark with different file types and storage

Conclusion
Our Findings
Performance differs across instances of the same group.
Performance of an instance varies over time.
Performance of I2 instances varies with file size.
An instance from the seemingly appropriate group might not be the best for your application. In our case, the Compute Optimized instance (C3) delivered better I/O performance than the Storage Optimized instance (I2).

Thank You
Q&A
For further queries, contact shoeb@cis.uab.edu

Why Amazon EC2
Why we used Amazon for our research/experiment:
Well known
Some instances can be used for free
Comparatively less expensive than other clouds
Our lab had some funds from Amazon