Presenter: Zhengyu Yang

Presentation transcript:

Presenter: Zhengyu Yang. Advisor: Ningfang Mi. Electrical and Computer Engineering Department, Northeastern University. ningfang@ece.neu.edu

Data Capacity Catastrophe! [Chart, source IBM and SINTEF ICT: data creation percentage versus available storage inventory; 92% (2016~2017).] Data creation is exploding: 92% of the world’s data was created in the last two years alone. At the current rate, the world’s data storage capacity will be overtaken by this spring. If we do nothing, a data capacity catastrophe is no longer a joke.

Data Access: People can’t wait! [Chart: patience time by generation. Baby Boomers ~30 sec; Generation X ~15 sec; Millennials ~5 sec; Generation Z ~1 sec; Generation ?: ?] Besides capacity, people also expect their files to download lightning-fast, anywhere, anytime, on any device. Researchers have found that patience differs across generations. Generation Z, the iPhone generation, can no longer tolerate responses longer than 1 sec. What about the future? For VR, even 1 sec is too much. So we conclude that people can’t wait!

Backend Infrastructure. Framework for Applications in Big Data Era (2017). [Diagram: User, Web, Virtualized Servers, Data Process Engine (Real Time, Batch, Streaming), SQL/NoSQL Database, Machine Learning, Analytics, with a delay at each stage. It’s all about resource management!] Let’s take a look at the backend infrastructure of cloud computing and see where the main bottleneck is. User applications send requests through the cloud, and the datacenter has thousands of virtualized servers hosting the backend programs that serve those requests. Inside these VMs, data processing engines need to talk to NoSQL databases, machine learning apps, and even real-time, batch, and streaming apps. All these apps trigger a huge amount of I/O, and I/O is in fact the bottleneck of cloud computing; solving it can shorten the waiting time.

Research Focus. To investigate a new resource management layer between diverse applications and heterogeneous servers in a large-scale cluster system. In such a cluster system, a large variety of applications are running: data processing or image processing applications, web applications, and so on. Different applications often show different workload features: CPU-intensive or I/O-intensive, light and stable traffic or heavy load with high variance over time. For example, we monitored HP email servers and backup servers and found very clear usage patterns: a big spike appears in the morning for email servers, while the traffic load increases significantly around midnight for backup servers. On the other hand, the backend servers are no longer homogeneous. Heterogeneous hardware platforms are found in large-scale data centers because hardware architectures vary widely in the speeds and capacities of their processor, memory, storage, and networking subsystems. Maintaining such a large system with high efficiency and QoS at low cost is an inherently difficult problem. Co-scheduling a large set of applications can incur severe resource contention. Simultaneously launching jobs from different applications within a short time period can immediately cause a significant burst, which further aggravates resource competition and load imbalance in data centers. This motivates us to develop new techniques for capacity planning and resource management of such cluster systems, to improve system performance and resource utilization and to provide high QoS, especially under temporally dependent workloads. A large data center typically hosts tens of thousands of applications with diverse workloads each day. These applications need different performance management solutions to meet their varying resource and performance requirements.

Efficiency Improvement for Data Processing Platforms. Scheduler and Resource Management: LsPS (job-size-based scheduler); HaSTE (YARN scheduler); TuMM (self-adjusting slot configuration); OpRM (idleness management); AutoPath (Spark scheduler); sCache (Spark RDD caching).

Flash-based Storage Management. [1] SSD-HDD Caching: GREM. [2] All-Flash Tiering: AutoTiering. [3] Deduplication: ElasticDedup. [4] Reliability: AutoReplica. [5] Datacenter Cost: minTCO. [6] I/O Stack: H-NVMe. [7] Compute vs. Cache: SparkCache. [8] Docker vs. VM: DockerVMSpark. The first-generation datacenter was all-HDD. Since 2008, SSDs have been used as a cache in SSD-HDD hybrid datacenters. With SSD prices decreasing and SSD capacities increasing, the all-flash datacenter is arriving earlier than people expected. For example, TLC SSDs replace HDDs, MLC SSDs replace TLC SSDs, and NVMe SSDs sit at the top end for high performance. Is SSD always good? The answer is no. Focusing first on the hybrid case: for performance, we present GREM, a dynamic SSD partitioning solution that shares SSD among multiple VMs. For reliability, we present AT, a replication manager for hybrid clusters to recover from disasters.
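To make the SSD-as-cache idea in the hybrid datacenter concrete, here is a minimal sketch of a small, fast tier sitting in front of a large, slow one. It is a generic LRU read cache, not GREM or any other system named on the slide; the class name `HybridStore`, the `ssd_capacity` parameter, and the block IDs are all hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class HybridStore:
    """Toy two-tier store: a small LRU 'SSD' cache in front of a large 'HDD' dict.

    Illustrative only: real hybrid caches also handle writes, block sizes,
    and sharing between VMs, none of which is modeled here.
    """

    def __init__(self, hdd, ssd_capacity):
        self.hdd = hdd                  # slow tier: block_id -> data
        self.ssd = OrderedDict()        # fast tier, ordered oldest -> newest
        self.ssd_capacity = ssd_capacity
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.ssd:
            # Cache hit: mark the block as most recently used.
            self.ssd.move_to_end(block_id)
            self.hits += 1
            return self.ssd[block_id]
        # Cache miss: fetch from the slow tier and promote into the cache.
        self.misses += 1
        data = self.hdd[block_id]
        self.ssd[block_id] = data
        if len(self.ssd) > self.ssd_capacity:
            # Evict the least recently used block.
            self.ssd.popitem(last=False)
        return data


hdd = {i: f"block-{i}" for i in range(100)}
store = HybridStore(hdd, ssd_capacity=3)
for block_id in [1, 2, 3, 1, 4, 1, 2]:
    store.read(block_id)
print(store.hits, store.misses, list(store.ssd))
```

With a fixed-size cache like this, the hit ratio depends heavily on how the capacity compares with the working set, which is one reason a static split of SSD space among several VMs can be wasteful.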

2018 REU Research Project
Big data era and popular big data platforms:
1. Hadoop MapReduce: typical two-stage process
2. Spark: DAGs (directed acyclic graphs) with multiple processing stages
3. Ex: Iterative machine learning algorithms (K-means)
4. Complex dependency and memory access for intermediate data
Project goals:
1. Construct a distributed, virtualized environment for extensively running a variety of data processing applications
2. Deploy emerging storage (e.g., NVMe) devices to accelerate in-memory processing required by Spark-based applications
3. Understand Spark I/O access patterns and the interference between Spark scheduling and memory management
4. Develop new resource management solutions for improving efficiency of Spark-based applications
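As a rough illustration of why iterative algorithms such as K-means stress memory access to intermediate data, the sketch below runs K-means as a repeated two-stage pass: a map-like step that assigns points to centroids and a reduce-like step that averages them. It is plain single-process Python on one-dimensional points, not Hadoop or Spark code, and the function names (`assign`, `recompute`, `kmeans`) and sample data are hypothetical.

```python
def assign(point, centroids):
    """Map-like step: emit (index of nearest centroid, point)."""
    nearest = min(range(len(centroids)),
                  key=lambda i: (point - centroids[i]) ** 2)
    return nearest, point


def recompute(pairs, centroids):
    """Reduce-like step: average the points assigned to each centroid."""
    sums = [0.0] * len(centroids)
    counts = [0] * len(centroids)
    for idx, point in pairs:
        sums[idx] += point
        counts[idx] += 1
    # Keep the old centroid if no point was assigned to it.
    return [sums[i] / counts[i] if counts[i] else centroids[i]
            for i in range(len(centroids))]


def kmeans(points, centroids, max_iters=20):
    """Repeat the two stages until the centroids stop changing."""
    for iteration in range(1, max_iters + 1):
        # Every iteration scans the full input again.
        pairs = [assign(p, centroids) for p in points]
        new_centroids = recompute(pairs, centroids)
        if new_centroids == centroids:
            return centroids, iteration
        centroids = new_centroids
    return centroids, max_iters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_centroids, passes = kmeans(points, centroids=[0.0, 5.0])
print(final_centroids, passes)
```

Each pass reads the whole `points` list again. If every pass were a separate two-stage job that reloaded its input from disk, that repeated read would dominate the run time, so keeping the dataset in memory across iterations pays off.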

NUCSRL Lab Members. Current Ph.D. and M.S. Students. Graduated Ph.D. Students: (VMware), (Uber). Graduated M.Sc. Students: (Motorola), (EMC), (Acer), (Virgin HealthMiles), (Microsoft), (Seagate), (Amazon), (Amazon).