Clouds from FutureGrid’s Perspective

Presentation transcript:

Clouds from FutureGrid’s Perspective
April 4, 2012
Geoffrey Fox, gcf@indiana.edu
Director, Digital Science Center, Pervasive Technology Institute
Associate Dean for Research and Graduate Studies, School of Informatics and Computing
Indiana University Bloomington

What is FutureGrid?
The FutureGrid project mission is to enable experimental work that advances:
- Innovation and scientific understanding of distributed computing and parallel computing paradigms,
- The engineering science of middleware that enables these paradigms,
- The use and drivers of these paradigms by important applications, and
- The education of a new generation of students and workforce on the use of these paradigms and their applications.
The implementation of this mission includes:
- Distributed flexible hardware with supported use
- Identified IaaS and PaaS “core” software with supported use
- Outreach
- ~4500 cores in 5 major sites

Distribution of FutureGrid Technologies and Areas (190 Projects)

Recent FutureGrid Projects

Templated Dynamic Provisioning: an abstract specification of an image is mapped to various HPC and Cloud environments.
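To make the idea concrete, here is a minimal sketch of what "one abstract image specification, many rendering targets" could look like. All field names and target identifiers are hypothetical illustrations, not FutureGrid's actual template schema or tooling.

```python
# Hypothetical sketch only: field names and targets are illustrative,
# not FutureGrid's actual image-template schema.
abstract_image = {
    "name": "bio-analysis-node",
    "base_os": "ubuntu",
    "packages": ["openmpi", "python3", "blast"],
    "users": ["researcher"],
}

def render(image_spec, target):
    """Map one abstract specification to a target-specific build recipe."""
    if target == "openstack":
        return {"format": "qcow2", **image_spec}
    if target == "bare_metal_hpc":
        return {"format": "netboot", **image_spec}
    raise ValueError(f"unknown target: {target}")

print(render(abstract_image, "openstack"))
```

The point of the pattern is that the experiment-specific content (packages, users) is written once, while environment-specific details (image format, deployment mechanism) are supplied by the rendering step.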

Using Clouds in a Nutshell
- High Throughput Computing; pleasingly parallel; grid applications
- Multiple users (long tail of science) and usages (parameter searches)
- Internet of Things (sensor nets), as in cloud support of smart phones
- (Iterative) MapReduce, including “most” data analysis
- Exploiting elasticity and platforms (HDFS, queues, ...)
- Use of services, portals (gateways) and workflow
Good strategies: build the application as a service; build on existing cloud deployments such as Hadoop; use PaaS if possible; design for failure; use “as a Service” offerings (e.g. SQLaaS) where possible; and address the challenge of moving data. A minimal sketch of the pleasingly parallel pattern follows below.
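The pleasingly parallel / parameter-search case is the simplest to move onto clouds because every task is independent. The sketch below is a purely local stand-in (Python's process pool), but the same structure maps directly onto independent cloud instances or a "map only" MapReduce job; `run_experiment` is a hypothetical placeholder for the real analysis.

```python
# Minimal local stand-in for a pleasingly parallel parameter sweep: each
# parameter value is processed independently, the same structure that maps
# onto cloud instances or a "map only" MapReduce job.
from concurrent.futures import ProcessPoolExecutor

def run_experiment(param: float) -> float:
    # Placeholder for an independent simulation or analysis task.
    return param ** 2

if __name__ == "__main__":
    params = [0.1 * i for i in range(100)]
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(run_experiment, params))
    print(f"completed {len(results)} independent tasks")
```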

4 Forms of MapReduce
(a) Map Only: input -> map -> output, with no reduce step. Examples: BLAST analysis, parametric sweeps, and other pleasingly parallel applications.
(b) Classic MapReduce: input -> map -> reduce -> output. Examples: High Energy Physics (HEP) histograms, distributed search.
(c) Iterative MapReduce: the map and reduce phases repeat over many iterations. Examples: expectation maximization, clustering (e.g. Kmeans), linear algebra, PageRank.
(d) Loosely Synchronous: classic MPI with pairwise (Pij) communication. Examples: PDE solvers and particle dynamics.
Forms (a)-(c) fall within the domain of MapReduce and its iterative extensions; form (d) is the traditional domain of MPI. A toy sketch of the iterative form follows below.
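What distinguishes form (c) from classic MapReduce is that the map/reduce cycle repeats, with updated state (here, centroids) carried between iterations. The following single-process toy sketch shows that control flow for 1-D Kmeans; it is not Twister4Azure's API, where the phases would run distributed and the new centroids would be broadcast between iterations.

```python
# Single-process sketch of the iterative MapReduce pattern, using 1-D Kmeans.
# Illustrates the control flow only; a real iterative MapReduce runtime would
# distribute the map/reduce phases and broadcast centroids between iterations.
import random

def map_phase(points, centroids):
    # Map: assign each point to the index of its nearest centroid.
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        yield nearest, p

def reduce_phase(assignments, k):
    # Reduce: recompute each centroid as the mean of its assigned points.
    sums, counts = [0.0] * k, [0] * k
    for idx, p in assignments:
        sums[idx] += p
        counts[idx] += 1
    return [sums[i] / counts[i] if counts[i] else random.random() for i in range(k)]

points = [random.random() for _ in range(1000)]
centroids = [0.1, 0.5, 0.9]
for _ in range(10):  # Iterate; a real run would stop when the centroids converge.
    centroids = reduce_phase(map_phase(points, centroids), len(centroids))
print(centroids)
```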

Performance – Kmeans Clustering (Figure 5: KMeansClustering scalability). Left (a): relative parallel efficiency of strong scaling with 128 million data points. Center (b): weak scaling, with the workload per core kept constant (ideal is a straight horizontal line). Right (c): Twister4Azure histograms of task execution time and number of executing map tasks for 128 million data points on 128 Azure small instances.

Outreach etc.
- Programming Paradigms for Technical Computing on Clouds and Supercomputers (Fox and Gannon): http://grids.ucs.indiana.edu/ptliupages/publications/Cloud%20Programming%20Paradigms_for__Futures.pdf
- Science Cloud Summer School, July 30 – August 3, with ~10 faculty and students from MSIs (sent by ADMI); part of the virtual summer school in computational science and engineering, with over 200 participants expected across 10 sites
- Science Cloud and MapReduce XSEDE community groups