Hadoop-in-a-Hybrid-Cloud. Luis Russi¹, Carlos R. Senna¹, Edmundo R. M. Madeira¹, Xuan Liu², Shuai Zhao², and Deep Medhi². GEC21, The 21st GENI Engineering Conference.


Hadoop-in-a-Hybrid-Cloud
Luis Russi¹, Carlos R. Senna¹, Edmundo R. M. Madeira¹, Xuan Liu², Shuai Zhao², and Deep Medhi²
¹Institute of Computing, State University of Campinas – Brazil
²University of Missouri–Kansas City – USA
GEC21, The 21st GENI Engineering Conference, Oct 20-23, Bloomington, IN, USA

Agenda
• Motivation and Objectives
• Proposed Architecture
  – Web Cloud Portal
  – Execution Engine
  – Execution Service
• Why use the GENI testbed

Motivation and Objectives
• Why
  – Hadoop installed in a private cloud may not have sufficient resources for all types of computational requirements
  – A seamless environment is needed in which Hadoop in a private cloud can access resources in other clouds
• Hybrid cloud: an architecture for orchestrating Hadoop applications in hybrid clouds
  – Automatic preparation of a cross-domain cluster
  – File provisioning
  – Making the results available to the user

Motivation and Objectives (cont.)
• Executing Hadoop applications in a hybrid cloud is not easy!
  – It takes time
  – It requires technical knowledge
  – Continuous evaluation of cloud resources
  – On-demand preparation of public cloud resources
  – A hybrid cloud requires an appropriate model that combines performance with minimal cost
• GENI platforms allow us to test the Hadoop-in-a-hybrid-cloud concept

The Proposed Architecture
HM – Hadoop Master Node
HW – Hadoop Worker Nodes

Web Cloud Portal
• User interface
• Management of files (application, data and submission)
• Simple XML-based submission file
  – Number of virtual machines (VMs)
  – Image identification (Hadoop Master and Workers)
  – VM requirements (memory, disk, flavor, etc.)
• Organizing the application workspace
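The slides do not show the submission file itself. As a rough illustration only, the sketch below assumes a hypothetical layout (the element and attribute names `submission`, `vms`, `image`, and `requirements` are invented here, not taken from the portal) and reads it with Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical submission file: every element and attribute name below is an
# assumption made for illustration, not the portal's actual schema.
SUBMISSION_XML = """
<submission application="wordcount.jar">
  <vms count="4"/>
  <image role="master" id="hadoop-master-img"/>
  <image role="worker" id="hadoop-worker-img"/>
  <requirements memory_mb="2048" disk_gb="20" flavor="m1.small"/>
</submission>
"""


@dataclass
class Submission:
    application: str
    vm_count: int
    master_image: str
    worker_image: str
    memory_mb: int
    disk_gb: int
    flavor: str


def parse_submission(xml_text: str) -> Submission:
    """Turn the (assumed) XML submission file into a typed record."""
    root = ET.fromstring(xml_text)
    # Map each image role ("master" / "worker") to its image identifier.
    images = {img.get("role"): img.get("id") for img in root.findall("image")}
    req = root.find("requirements")
    return Submission(
        application=root.get("application"),
        vm_count=int(root.find("vms").get("count")),
        master_image=images["master"],
        worker_image=images["worker"],
        memory_mb=int(req.get("memory_mb")),
        disk_gb=int(req.get("disk_gb")),
        flavor=req.get("flavor"),
    )


submission = parse_submission(SUBMISSION_XML)
print(submission)
```

Keeping the three groups from the slide (VM count, image identification, VM requirements) as separate elements would let a portal validate each one independently before anything is provisioned.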

Orchestration Engine
• Prepares a workspace in the private cloud's storage
• Creates an Execution Service Instance (ESI) already associated with this cloud storage area
• Releases the ESI to manage the application execution (asynchronously)
• Copies the resulting files from the cloud storage to the user's workspace
• Eliminates the ESI
• Notifies the Web Cloud Portal (WCP)
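The sequence of Orchestration Engine steps can be sketched roughly as follows. This is not the authors' implementation: the class `ExecutionServiceInstance`, the function `orchestrate`, and the use of plain dictionaries in place of cloud storage are all assumptions made so that the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor


class ExecutionServiceInstance:
    """Hypothetical stand-in for an ESI bound to one cloud storage area."""

    def __init__(self, storage_area: dict):
        self.storage_area = storage_area

    def run(self, application: str) -> None:
        # Placeholder for the real work (cluster setup, Hadoop job, cleanup).
        self.storage_area["results"] = f"output of {application}"


def orchestrate(application: str, user_workspace: dict, notifications: list) -> None:
    # 1. Prepare a working area in the private cloud's storage (a dict here).
    storage_area = {"application": application}
    # 2. Create an ESI already associated with that storage area.
    esi = ExecutionServiceInstance(storage_area)
    # 3. Release the ESI to manage the execution asynchronously,
    #    then wait for it to report completion.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(esi.run, application).result()
    # 4. Copy the resulting files from cloud storage to the user's workspace.
    user_workspace["results"] = storage_area["results"]
    # 5. Eliminate the ESI.
    del esi
    # 6. Notify the Web Cloud Portal (modelled as a simple message list).
    notifications.append(f"{application}: finished")


workspace, inbox = {}, []
orchestrate("wordcount", workspace, inbox)
print(workspace, inbox)
```

In this sketch the engine blocks on the future only to keep the example short; a real engine would more likely register a callback and return immediately.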

Execution Engine
• The ES instance interacts with the private cloud monitoring system to evaluate the condition of the computational resources
• Checks for extra resources from the public cloud (if needed)
• Automatic Hadoop cluster preparation (Master and Workers)
• Copies the resulting files from HDFS to the cloud storage accessible by the Orchestration Engine
• Eliminates all involved VMs
• Notifies the Orchestration Engine about the end of the processes
• Monitors all stages of processing
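The slides do not say how the ES instance decides when to draw on the public cloud. One simple possibility, shown here purely as an assumption, is a "private first" policy: fill the request from free private capacity and send only the remainder to the public cloud. The function name `plan_workers` and its parameters are hypothetical.

```python
def plan_workers(requested: int, private_free: int, public_limit: int) -> dict:
    """Split the requested worker VMs between the private and public clouds.

    Assumed policy (not from the slides): use private capacity first and
    request only the shortfall from the public cloud.
    """
    if requested < 0 or private_free < 0 or public_limit < 0:
        raise ValueError("all arguments must be non-negative")
    private = min(requested, private_free)
    public = requested - private
    if public > public_limit:
        raise RuntimeError("not enough resources in either cloud")
    return {"private": private, "public": public}


# Six workers requested, four free slots in the private cloud.
plan = plan_workers(requested=6, private_free=4, public_limit=8)
print(plan)
```

A policy that also weighs cost against performance would replace the `min` call with a comparison of estimated runtimes and prices.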

Why use GENI?
• Great environment for testing the hybrid cloud
• High-speed networks
• Provisionable environments for cloud computing
• Public cloud deployment
• Cluster installation automation
• API integration

UNICAMP-UMKC Hybrid Testbed
• Word Count Java software prototype
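For readers unfamiliar with Word Count, the map and reduce logic it relies on can be illustrated in a few lines. This is a plain in-memory sketch of the idea, not the Java prototype mentioned on the slide and not Hadoop API code; the function names and sample lines are made up.

```python
from collections import defaultdict


def map_phase(line: str):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_phase(pairs) -> dict:
    """Sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


lines = ["hadoop in a hybrid cloud", "hybrid cloud with hadoop"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(pairs)
print(counts)
```

In a real Hadoop job the map calls run on the worker nodes that hold the input blocks and the framework groups the pairs by key before reducing, which is what makes the cross-domain cluster relevant.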

Initial Results
• Deployed ExoGENI virtual machines with Hadoop
• Included the UMKC compute node in the UNICAMP cloud controller
• Established a GRE tunnel between UMKC and UNICAMP

Future Work
• Join the ExoGENI virtual machines and the cloud Hadoop cluster
• Execute the Word Count Hadoop application on the cluster
• Integrate the GENI API into the private cloud framework

Thank you!