Big Data Open Source Software and Projects ABDS in Summary IV: Level 7 I590 Data Science Curriculum August 15 2014 Geoffrey Fox

Slides:



Advertisements
Similar presentations
OGF29 – Cloud Standards Interoperability Demo OCCI, CDMI & OpenNebula Chicago, June 20-22, 2010.
Advertisements

Cloud Computing Development. Shallow Introduction.
Cloud computing is used to describe a variety of computing concepts that involve a large number of computers connected through a real-time communication.
Big Data Open Source Software and Projects ABDS in Summary XIV: Level 14B I590 Data Science Curriculum August Geoffrey Fox
Big Data Open Source Software and Projects ABDS in Summary I I590 Data Science Curriculum August Geoffrey Fox
An Approach to Secure Cloud Computing Architectures By Y. Serge Joseph FAU security Group February 24th, 2011.
Big Data Open Source Software and Projects ABDS in Summary XIX: Layer 14B Data Science Curriculum March Geoffrey Fox
Cloud Interoperability
Big Data Open Source Software and Projects ABDS in Summary XVI: Layer 13 Part 1 Data Science Curriculum March Geoffrey Fox
Big Data Open Source Software and Projects ABDS in Summary II: Layers 3 to 4 Data Science Curriculum March Geoffrey Fox
Unified Logs and Reporting for Hybrid Centralized Management
Big Data Open Source Software and Projects ABDS in Summary XXII: Layer 15B Part 2 Data Science Curriculum March Geoffrey Fox
Big Data Open Source Software and Projects ABDS in Summary XVII: Layer 13 Part 2 Data Science Curriculum March Geoffrey Fox
Big Data Open Source Software and Projects ABDS in Summary IV: Layer 5 Part 2 Data Science Curriculum March Geoffrey Fox
Big Data Open Source Software and Projects ABDS in Summary XIII: Level 14A I590 Data Science Curriculum August Geoffrey Fox
Big Data Open Source Software and Projects ABDS in Summary VI: Layer 6 Part 2 Data Science Curriculum March Geoffrey Fox
Big Data Open Source Software and Projects ABDS in Summary XXI: Layer 15B Part 1 Data Science Curriculum March Geoffrey Fox
Big Data Open Source Software and Projects ABDS in Summary III: Layer 5-Part 1 Data Science Curriculum March Geoffrey Fox
Big Data Open Source Software and Projects ABDS in Summary IX: Level 11C I590 Data Science Curriculum August Geoffrey Fox
Windows Azure Pack Service Provider Foundation 2012 R2 Windows Server 2012 R2 Virtual Machine Manager 2012 R2 Damian Flynn MVP System Center
Big Data Open Source Software and Projects Unit 0 Part B: Class Introduction Data Science Curriculum March Geoffrey Fox
HPC-ABDS: The Case for an Integrating Apache Big Data Stack with HPC
CloudStack and Big Data Sebastien May 22 nd 2013 LinuxTag, Berlin.
A Brief Overview by Aditya Dutt March 18 th ’ Aditya Inc.
Opensource for Cloud Deployments – Risk – Reward – Reality
X-Informatics Cloud Technology (Continued) March Geoffrey Fox Associate.
Cloud based storage. Cloud Storage Storage accessed by a web service API It is a block storage, it exposes its storage to clients as Raw storage that.
INTRODUCTION TO CLOUD COMPUTING CS 595 LECTURE 7 2/23/2015.
 Cloud computing  Workflow  Workflow lifecycle  Workflow design  Workflow tools : xcp, eucalyptus, open nebula.
Interoperability in the Cloud By Alex Espinoza
1 NETE4631 Using Google Web Services and Using Microsoft Cloud Services Lecture Notes #7.
Creative Commons Attribution-ShareAlike 3.0 OSCON Presented by Mark R. Hinkle VP of Community
Cloud Standard API and Contextualization
BIG DATA APPLICATIONS & ANALYTICS LOOKING AT INDIVIDUAL HPCABDS SOFTWARE LAYERS 1/26/2015 Cloud Computing Software 1 Geoffrey Fox January BigDat.
Presented by: Sanketh Beerabbi University of Central Florida COP Cloud Computing.
Windows Azure Conference 2014 Deploy your Java workloads on Windows Azure.
Protect Your Business-Critical Data in the Cloud with SoftNAS, a Full-Featured, Highly Available Solution for the Agile Microsoft Azure Platform MICROSOFT.
Big Data Open Source Software and Projects ABDS in Summary I: Layers 1 to 2 Data Science Curriculum March Geoffrey Fox
FutureGrid Connection to Comet Testbed and On Ramp as a Service Geoffrey Fox Indiana University Infra structure.
Big Data Open Source Software and Projects ABDS in Summary XVIII: Layer 14A Data Science Curriculum March Geoffrey Fox
Techcello Provides SaaS Lifecycle Management Solution to “SaaS-ify” Your Application Efficiently on the Powerful Microsoft Azure Cloud Platform MICROSOFT.
Actualog Social PIM Helps Companies to Manage and Share Product Information Using Secure, Scalable Ease of Microsoft Azure MICROSOFT AZURE ISV PROFILE:
Powered by Microsoft Azure, PointMatter Is a Flexible Solution to Move and Share Data between Business Groups and IT MICROSOFT AZURE ISV PROFILE: LOGICMATTER.
Recipes for Success with Big Data using FutureGrid Cloudmesh SDSC Exhibit Booth New Orleans Convention Center November Geoffrey Fox, Gregor von.
A Technical Overview Bill Branan DuraCloud Technical Lead.
Web Technologies Lecture 13 Introduction to cloud computing.
Bring Your Own Security (BYOS™): Deploy Applications in a Manageable Java Container with Waratek Locker on Microsoft Azure MICROSOFT AZURE ISV PROFILE:
Directions in eScience Interoperability and Science Clouds June Interoperability in Action – Standards Implementation.
Panel Discussion Software Defined Ecosystems June BigSystem Software-Defined Ecosystems at HPDC Vancouver Canada Geoffrey Fox.
Big Data Open Source Software and Projects ABDS in Summary II: Layer 5 I590 Data Science Curriculum August Geoffrey Fox
Servizi di brokering Valerio Venturi CCR Giornata di formazione dedicata al Cloud Computing 6 Febbraio 2013.
DreamFactory for Microsoft Azure Is an Open Source REST API Platform That Enables Mobilization of Data in Minutes across Frameworks and Storage Methods.
Project Cumulus Overview March 15, End Goal Unified Public & Private PaaS for GlassFish/Java EE Simplify deployment of Java EE Apps on top of.
Course: Cluster, grid and cloud computing systems Course author: Prof
Univa Grid Engine Makes Work Management Automatic and Efficient, Accelerates Deployment of Cloud Services with Power of Microsoft Azure MICROSOFT AZURE.
VIRTUALIZATION & CLOUD COMPUTING
FedCloud Blueprint Update
StratusLab Final Periodic Review
StratusLab Final Periodic Review
Free Cloud Management Portal for Microsoft Azure Empowers Enterprise Users to Govern Their Cloud Spending and Optimize Cloud Usage and Planning MICROSOFT.
Couchbase Server is a NoSQL Database with a SQL-Based Query Language
OpenNebula Offers an Enterprise-Ready, Fully Open Management Solution for Private and Public Clouds – Try It Easily with an Azure Marketplace Sandbox MICROSOFT.
I590 Data Science Curriculum August
Designed for Big Data Visual Analytics, Zoomdata Allows Business Users to Quickly Connect, Stream, and Visualize Data in the Microsoft Azure Platform MICROSOFT.
Introduction to Apache
Big Data Open Source Software and Projects ABDS in Summary I
Department of Intelligent Systems Engineering
Big Data, Simulations and HPC Convergence
Microsoft Azure Services Platform
I590 Data Science Curriculum August
Presentation transcript:

Big Data Open Source Software and Projects ABDS in Summary IV: Level 7 I590 Data Science Curriculum August Geoffrey Fox School of Informatics and Computing Digital Science Center Indiana University Bloomington

HPC-ABDS Layers 1)Message Protocols 2)Distributed Coordination: 3)Security & Privacy: 4)Monitoring: 5)IaaS Management from HPC to hypervisors: 6)DevOps: 7)Interoperability: 8)File systems: 9)Cluster Resource Management: 10)Data Transport: 11)SQL / NoSQL / File management: 12)In-memory databases&caches / Object-relational mapping / Extraction Tools 13)Inter process communication Collectives, point-to-point, publish-subscribe 14)Basic Programming model and runtime, SPMD, Streaming, MapReduce, MPI: 15)High level Programming: 16)Application and Analytics: 17)Workflow-Orchestration: Here are 17 functionalities. Technologies are presented in this order 4 Cross cutting at top 13 in order of layered diagram starting at bottom

Libvirt libvirt is an open source LGPL API, daemon and management tool for managing low-level platform virtualization. It can be used to manage KVM, Xen, VMware ESX, QEMU and other virtualization technologies. – libvirt APIs are widely used in the orchestration layer of hypervisors in the development of a cloud-based solution. libvirt itself is a C library, but it has bindings in other languages, notably in Python, Perl, OCaml, Ruby, Java, and PHP. libvirt for these programming languages is composed of wrappers around another class/package called libvirtmod. libvirtmod's implementation is closely associated with its counterpart in C/C++ in syntax and functionality.

Apache Libcloud Python library for interacting with many of the popular cloud service providers using a unified API. (One Interface To Rule Them All) More than 30 supported providers total OpenStack, OpenNebula, Amazon, Google etc. (~all except Azure) Four main APIs: – Cloud Servers and Block Storage - services such as Amazon EC2 and Rackspace CloudServers – Cloud Object Storage and CDN - services such as Amazon S3 and Rackspace CloudFiles – Load Balancers as a Service - services such as Amazon Elastic Load Balancer and GoGrid LoadBalancers – DNS as a Service - services such as Amazon Route 53 Google DNS, and Zerigo

TOSCA Topology and Orchestration Specification for Cloud Applications (TOSCA), open.org/tosca/TOSCA/v1.0/os/TOSCA-v1.0-os.html, is an OASIS standard language to describe a topology of cloud based web services, their components, relationships, and the processes that manage them. The TOSCA standard includes specifications to describe processes that create or modify web services. open.org/tosca/TOSCA/v1.0/os/TOSCA-v1.0-os.html It specifies system (computers, their properties, networks, storage) and is used to guide automated deployments OASIS is a major standards organization Related is Amazon AWS CloudFormation Template, which is a JSON data standard to allow cloud application administrators to define a collection of related AWS resources – OpenStack Heat has a similar specification WS-BPEL specifies software running on system (the workflow)

Apache Whirr Apache Whirr provides cloud-neutral libraries for running cloud services. Whirr uses Apache JClouds as its foundation, to eliminate when possible cloud-specific idiosyncrasies and maximize portability. Main features include: – A cloud-neutral way to run services – A common service API – Smart defaults to get a system running quickly Whirr began in 2007 as a set of scripts (originally in Bash, later in Python) to run Hadoop clusters on Amazon EC2. Those scripts were expanded to add features and support additional cloud providers. Whirr became an Apache Incubator project in 2010, at which time it was converted to Java, with Apache jclouds multi-cloud toolkit as its provisioning library. Whirr became an Apache Top- Level Project in 2011.

OCCI Open Cloud Computing Interface This comes from Open Grid Forum and provides an open API API that acts as a service front-end to an IaaS provider’s internal infrastructure management framework at level of Amazon EC2 interface. OCCI provides commonly understood semantics, syntax and a means of management in the domain of consumer-to-provider IaaS. It covers management of the entire life-cycle of OCCI-defined model entities and is compatible with existing standards including the Open Virtualization Format (OVF) and the Cloud Data Management Interface (CDMI). OpenNebula, CloudStack and OpenStack have OCCI interfaces

Apache JClouds supports cloud (VM) interoperability The portable Compute interface allows users to provision their infrastructure in any cloud provider including deployment configuration, provisioning and bootstrap. BlobStore interface, users can easily store objects in a wide range of blob store providers, regardless of how big the objects to manage are, or how many files are there. Load Balancer abstraction provides a common interface to configure the load balancers in any cloud that supports them. Just define the load balancer and the nodes that should join it, DNS, firewall, storage, configuration management, image management, provider specific APIs Supported clouds include Amazon, CloudStack, Docker, Google Compute Engine, HP, OpenStack, Rackspacehttps://jclouds.apache.org/reference/providers/

CDMI Cloud Data Management Interface A Cloud Storage standard from SNIA (Storage Networking Industry Association CDMI defines RESTful HTTP operations for assessing the capabilities of the cloud storage system – Allocating and accessing containers and objects – Managing users and groups – Implementing access control – Attaching metadata, making arbitrary queries, using persistent queues, specifying retention intervals and holds for compliance purposes, using a logging facility, billing – Moving data between cloud systems – Exporting data via other protocols such as iSCSI and NFS. Transport security is obtained via TLS