Large Scale Sky Computing Applications with Nimbus Pierre Riteau Université de Rennes 1, IRISA INRIA Rennes – Bretagne Atlantique Rennes, France

Slides:



Advertisements
Similar presentations
Virtualization, Cloud Computing, and TeraGrid Kate Keahey (University of Chicago, ANL) Marlon Pierce (Indiana University)
Advertisements

Wei Lu 1, Kate Keahey 2, Tim Freeman 2, Frank Siebenlist 2 1 Indiana University, 2 Argonne National Lab
Sponsors and Acknowledgments This work is supported in part by the National Science Foundation under Grants No. OCI , IIP and CNS
Science Clouds: Early Experiences in Cloud Computing for Scientific Applications Chicago, October 2008 Kate Keahey, Renato Figueiredo, Jose Fortes, Tim.
Nimbus or an Open Source Cloud Platform or the Best Open Source EC2 No Money Can Buy ;-) Kate Keahey Tim Freeman University of Chicago.
Advanced Computing and Information Systems laboratory Virtual Private Clusters: Virtual Appliances and Networks in the Cloud Renato Figueiredo ACIS Lab.
Virtualization and Cloud Computing. Definition Virtualization is the ability to run multiple operating systems on a single physical system and share the.
Locality-Aware Dynamic VM Reconfiguration on MapReduce Clouds Jongse Park, Daewoo Lee, Bokyeong Kim, Jaehyuk Huh, Seungryoul Maeng.
A Hadoop Overview. Outline Progress Report MapReduce Programming Hadoop Cluster Overview HBase Overview Q & A.
Cloud Computing Imranul Hoque. Today’s Cloud Computing.
Future Grid Early Projects 2010 User Advisory Board Meeting Pittsburgh, PA.
An Approach to Secure Cloud Computing Architectures By Y. Serge Joseph FAU security Group February 24th, 2011.
Clouds from FutureGrid’s Perspective April Geoffrey Fox Director, Digital Science Center, Pervasive.
Advanced Computing and Information Systems laboratory Educational Virtual Clusters for On- demand MPI/Hadoop/Condor in FutureGrid Renato Figueiredo Panoat.
Aneka: A Software Platform for .NET-based Cloud Computing
Cloud Computing and Virtualization Sorav Bansal CloudCamp 2010 IIT Delhi.
INTRODUCTION TO CLOUD COMPUTING CS 595 LECTURE 4.
B UILDING M ULTI - TIER W EB A PPLICATIONS IN V IRTUAL E NVIRONMENTS.
Virtualization for Cloud Computing
Architecture overview 6/03/12 F. Desprez - ISC Cloud Context : Development of a toolbox for deploying application services providers with a hierarchical.
Cloud Computing Why is it called the cloud?.
A Brief Overview by Aditya Dutt March 18 th ’ Aditya Inc.
Ocean Observatories Initiative Common Execution Environment Kate Keahey OOI Cyberinfrastructure Life Cycle Objectives Milestone Review, Release 1 San Diego,
Virtual Infrastructure in the Grid Kate Keahey Argonne National Laboratory.
INTRODUCTION TO CLOUD COMPUTING CS 595 LECTURE 7 2/23/2015.
OOI CI R2 Life Cycle Objectives Review Aug 30 - Sep Ocean Observatories Initiative OOI CI Release 2 Life Cycle Objectives Review CyberPoPs & Network.
Nimbus & OpenNebula Young Suk Moon. Nimbus - Intro Open source toolkit Provides virtual workspace service (Infrastructure as a Service) A client uses.
A Performance Evaluation of Azure and Nimbus Clouds for Scientific Applications Radu Tudoran KerData Team Inria Rennes ENS Cachan 10 April 2012 Joint work.
Experimenting with FutureGrid CloudCom 2010 Conference Indianapolis December Geoffrey Fox
Cloud Computing 1. Outline  Introduction  Evolution  Cloud architecture  Map reduce operation  Platform 2.
Software Architecture
CS525: Special Topics in DBs Large-Scale Data Management Hadoop/MapReduce Computing Paradigm Spring 2013 WPI, Mohamed Eltabakh 1.
OOI CI LCA REVIEW August 2010 Ocean Observatories Initiative Common Execution Environment Kate Keahey, Tim Freeman, Alex Clemesha, John Bresnahan, David.
W HAT IS H ADOOP ? Hadoop is an open-source software framework for storing and processing big data in a distributed fashion on large clusters of commodity.
Introduction to Apache Hadoop Zibo Wang. Introduction  What is Apache Hadoop?  Apache Hadoop is a software framework which provides open source libraries.
Hadoop/MapReduce Computing Paradigm 1 Shirish Agale.
Presented by: Sanketh Beerabbi University of Central Florida COP Cloud Computing.
Grids, Clouds and the Community. Cloud Technology and the NGS Steve Thorn Edinburgh University Matteo Turilli, Oxford University Presented by David Fergusson.
Magellan: Experiences from a Science Cloud Lavanya Ramakrishnan.
SALSASALSASALSASALSA FutureGrid Venus-C June Geoffrey Fox
EVGM081 Multi-Site Virtual Cluster: A User-Oriented, Distributed Deployment and Management Mechanism for Grid Computing Environments Takahiro Hirofuchi,
Design Discussion Rain: Dynamically Provisioning Clouds within FutureGrid PI: Geoffrey Fox*, CoPIs: Kate Keahey +, Warren Smith -, Jose Fortes #, Andrew.
Windows Azure. Azure Application platform for the public cloud. Windows Azure is an operating system You can: – build a web application that runs.
CS525: Big Data Analytics MapReduce Computing Paradigm & Apache Hadoop Open Source Fall 2013 Elke A. Rundensteiner 1.
Hadoop/MapReduce Computing Paradigm 1 CS525: Special Topics in DBs Large-Scale Data Management Presented By Kelly Technologies
1/23 Distributed Systems Architecture Research Group Universidad Complutense de Madrid Nuevos modelos de provisión de recursos para infrestructuras GRID:
{ Tanya Chaturvedi MBA(ISM) Hadoop is a software framework for distributed processing of large datasets across large clusters of computers.
Grid testing using virtual machines Stephen Childs*, Brian Coghlan, David O'Callaghan, Geoff Quigley, John Walsh Department of Computer Science Trinity.
OpenNebula: Experience at SZTAKI Peter Kacsuk, Sandor Acs, Mark Gergely, Jozsef Kovacs MTA SZTAKI EGI CF Helsinki.
Group # 14 Dhairya Gala Priyank Shah. Introduction to Grid Appliance The Grid appliance is a plug-and-play virtual machine appliance intended for Grid.
INTRODUCTION TO HADOOP. OUTLINE  What is Hadoop  The core of Hadoop  Structure of Hadoop Distributed File System  Structure of MapReduce Framework.
Instituto de Biocomputación y Física de Sistemas Complejos Cloud resources and BIFI activities in JRA2 Reunión JRU Española.
Nimbus Update March 2010 OSG All Hands Meeting Kate Keahey Nimbus Project University of Chicago Argonne National Laboratory.
Building on virtualization capabilities for ExTENCI Carol Song and Preston Smith Rosen Center for Advanced Computing Purdue University ExTENCI Kickoff.
Structured Container Delivery Oscar Renalias Accenture Container Lead (NOTE: PASTE IN PORTRAIT AND SEND BEHIND FOREGROUND GRAPHIC FOR CROP)
Prof. Jong-Moon Chung’s Lecture Notes at Yonsei University
New Paradigms: Clouds, Virtualization and Co.
Cloud Challenges C. Loomis (CNRS/LAL) EGI-TF (Amsterdam)
Introduction to Distributed Platforms
Dag Toppe Larsen UiB/CERN CERN,
Dag Toppe Larsen UiB/CERN CERN,
StratusLab Final Periodic Review
StratusLab Final Periodic Review
Cloud Computing with Nimbus
Cloud Computing Dr. Sharad Saxena.
Outline Virtualization Cloud Computing Microsoft Azure Platform
Sky Computing on FutureGrid and Grid’5000
Virtualization, Cloud Computing, and TeraGrid
Using and Building Infrastructure Clouds for Science
Sky Computing on FutureGrid and Grid’5000
Presentation transcript:

Large Scale Sky Computing Applications with Nimbus Pierre Riteau Université de Rennes 1, IRISA INRIA Rennes – Bretagne Atlantique Rennes, France

INTRODUCTION TO SKY COMPUTING

U M R I R I S A IaaS clouds On demand/elastic model Pay as you go Access to virtual machines with administrator privileges –Portable execution stack Commercial providers –Amazon EC2 => “infinite” resource pool (e.g. 10K cores) Scientific clouds => limited number of resources –Science Clouds –FutureGrid Nimbus clouds

U M R I R I S A Sky Computing Federation of multiple IaaS clouds Creates large scale infrastructures Allows to run software requiring large computational power

U M R I R I S A Sky Computing Benefits Single networking context –All-to-all connectivity Single security context –Trust between all entities Equivalent to local cluster –Compatible with legacy code

LARGE-SCALE SKY COMPUTING EXPERIMENTS

U M R I R I S A Sky Computing Toolkit Nimbus –Resource management –Contextualization (Context Broker) ViNe –All-to-all connectivity Hadoop –Task distribution –Fault tolerance –Resource dynamicity

U M R I R I S A Context Broker Service to configure a complete cluster with different roles Supports clusters distributed on multiple clouds (e.g. Nimbus and Amazon EC2) VMs contact the context broker to –Learn their role –Learn about other VMs in the cluster Ex. : Hadoop master + Hadoop slaves Hadoop slaves configured to contact the master Hadoop master configured to know the slaves

U M R I R I S A Cluster description hadoop-master fc8-i386-nimbus-blast-cluster public hadoop_master hadoop_slave hadoop-slaves fc8-i386-nimbus-blast-cluster public hadoop_slave

U M R I R I S A ViNe Project of the University of Florida (M. Tsugawa et al.) High performance virtual network All-to-all connectivity ViNe router ViNe router ViNe router

U M R I R I S A Hadoop Open-source MapReduce implementation Heavy industrial use (Yahoo, Facebook…) Efficient framework for distribution of tasks Built-in fault-tolerance Distributed file system (HDFS)

U M R I R I S A Sky Computing Architecture IaaS Software ViNe Distributed Application Distributed Application Hadoop MapReduce App

U M R I R I S A Grid’5000 Overview Distributed over 9 sites in France ~1500 nodes, ~5500 CPUs Study of large scale parallel/distributed systems Features –Highly reconfigurable Environment deployment over bare hardware Can deploy many different Linux distributions Even other OS such as FreeBSD –Controlable –Monitorable (metrics access) Experiments on all layers –network, OS, middleware, applications

U M R I R I S A Grid’5000 Node Distribution

U M R I R I S A FutureGrid: a Grid Testbed NSF-funded experimental testbed ~5000 cores 6 sites connected by a private network

U M R I R I S A Resources used in Sky Computing Experiments 3 FutureGrid sites (US) with Nimbus installations –UCSD (San Diego) –UF (Florida) –UC (Chicago) Grid’5000 sites (France) –Lille (contains a white-listed gateway to FutureGrid) –Rennes, Sophia, Nancy, etc. Grid’5000 is fully isolated from the Internet –One machine white-listed to access FutureGrid –ViNe queue VR (Virtual Router) for other sites

U M R I R I S A ViNe Deployment Topology SD UF UC Lille Rennes Sophia All-to-all connectivity! White-listed Queue VR Grid’5000 firewall

U M R I R I S A Experiment scenario Hadoop sky computing virtual cluster already running in FutureGrid (SD, UF, UC) Launch BLAST MapReduce job Start VMs on Grid’5000 resources –With contextualization to join the existing cluster Automatically extend the Hadoop cluster –Number of nodes increases TaskTracker nodes (Map/Reduce tasks execution) DataNode nodes (HDFS storage) –Hadoop starts distributing tasks in Grid’5000 –Job completes faster!

U M R I R I S A Job progress with cluster extension

U M R I R I S A Scalable Virtual Cluster Creation (1/3) Standard Nimbus propagation: scp Nimbus Repository Nimbus Repository scp VMM A VMM B VM1 VM2 VM3 VM4 scp

U M R I R I S A Scalable Virtual Cluster Creation (2/3) Pipelined Nimbus propagation: Kastafior/TakTuk Nimbus Repository Nimbus Repository VMM A VMM B VM1 VM2 VM3 VM4 Propagate

U M R I R I S A Scalable Virtual Cluster Creation (3/3) Leverage Xen Copy-on-Write (CoW) capabilities VMM A Local cache CoW Image 4 CoW Image 4 Backing file VMM B CoW Image 3 CoW Image 3 Backing file Local cache CoW Image 2 CoW Image 2 Backing file CoW Image 1 CoW Image 1 Backing file

U M R I R I S A

CONCLUSION

U M R I R I S A Conclusion Sky Computing to create large scale distributed infrastructures Our approach relies on –Nimbus for resource management, contextualization and fast cluster instantiation –ViNe for all-to-all connectivity –Hadoop for dynamic cluster extension Provides both infrastructure and application elasticity

U M R I R I S A Ongoing & Future Works Elastic MapReduce implementation leveraging Sky Computing infrastructures (presented at CCA ‘11) Migration support in Nimbus –Leverage spot instances in Nimbus

U M R I R I S A Acknowledgments Tim Freeman, John Bresnahan, Kate Keahey, David LaBissoniere (Argonne/University of Chicago) Maurício Tsugawa, Andréa Matsunaga, José Fortes (University of Florida) Thierry Priol, Christine Morin (INRIA)

THANK YOU! QUESTIONS?