Big Data Open Source Software and Projects ABDS in Summary IX: Layer 9 Data Science Curriculum March 5 2015 Geoffrey Fox

Big Data Open Source Software and Projects ABDS in Summary IX: Layer 9 Data Science Curriculum, March 2015 Geoffrey Fox, School of Informatics and Computing, Digital Science Center, Indiana University Bloomington

Functionality of 21 HPC-ABDS Layers
1) Message Protocols
2) Distributed Coordination
3) Security & Privacy
4) Monitoring
5) IaaS Management from HPC to hypervisors
6) DevOps
7) Interoperability
8) File systems
9) Cluster Resource Management
10) Data Transport
11) A) File management B) NoSQL C) SQL
12) In-memory databases & caches / Object-relational mapping / Extraction Tools
13) Inter-process communication: collectives, point-to-point, publish-subscribe, MPI
14) A) Basic Programming model and runtime, SPMD, MapReduce B) Streaming
15) A) High level Programming B) Application Hosting Frameworks
16) Application and Analytics
17) Workflow-Orchestration
These are 21 functionalities (layers 11, 14 and 15 have subparts): the first 4 are cross-cutting and shown at the top; the remaining 17 are listed in the order of the layered diagram, starting at the bottom.

Hadoop YARN; Google Omega, Facebook Corona
Apache Hadoop YARN (Yet Another Resource Negotiator) came about as part of a major redesign of Hadoop in 2011 that addressed some of its limitations. The resource-management and job-scheduling/monitoring functions, both of which had previously been performed by the Hadoop JobTracker, are implemented by the ResourceManager, ApplicationMaster and NodeManager processes within YARN. YARN makes it possible to use Hadoop for non-MapReduce jobs. It also eliminates the acknowledged performance bottleneck associated with the JobTracker of previous Hadoop releases, allowing greater scalability. YARN, which has been a sub-project of Apache Hadoop since 2012, represents a key point in Hadoop's evolution, but it also adds complexity to a development environment that was already considered challenging to use.
Facebook Corona: "Under the Hood: Scheduling MapReduce jobs more efficiently with Corona" (Facebook Engineering blog)
Google Omega: http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf

More on YARN
The YARN ResourceManager has two main components: the Scheduler and the ApplicationsManager.
The Scheduler is responsible for allocating resources to the various running applications, subject to familiar constraints such as capacities and queues. It is a pure scheduler in the sense that it performs no monitoring or tracking of application status, and it offers no guarantees about restarting tasks that fail due to application or hardware failures. The Scheduler performs its scheduling function based on the resource requirements of the applications; it does so using the abstract notion of a resource Container, which incorporates elements such as memory, CPU, disk and network. (In the first version, only memory is supported.) The Scheduler has a pluggable policy plug-in, which is responsible for partitioning the cluster resources among the various queues, applications, etc. The current MapReduce schedulers, such as the CapacityScheduler and the FairScheduler, are examples of such plug-ins; the CapacityScheduler supports hierarchical queues to allow more predictable sharing of cluster resources.
The ApplicationsManager is responsible for accepting job submissions, negotiating the first container for executing the application-specific ApplicationMaster, and providing the service for restarting the ApplicationMaster container on failure.
The NodeManager is the per-machine framework agent that is responsible for containers, monitoring their resource usage (CPU, memory, disk, network) and reporting it to the ResourceManager/Scheduler.
The per-application ApplicationMaster has the responsibility of negotiating appropriate resource containers from the Scheduler, tracking their status and monitoring progress.
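As a concrete (and purely illustrative, not from the original slides) way of seeing these components, the sketch below queries the ResourceManager's REST interface for cluster metrics and running applications. The hostname, port 8088 and the /ws/v1/cluster/... endpoints reflect a default, unsecured YARN setup and should be treated as assumptions.

```python
# Minimal sketch: query a YARN ResourceManager's REST API for cluster
# metrics and running applications. Host, port 8088 and the /ws/v1/...
# endpoints assume a default, unsecured YARN setup; field names follow
# the Hadoop YARN REST API documentation and may differ across versions.
import requests

RM = "http://resourcemanager.example.org:8088"   # hypothetical RM address

def cluster_metrics():
    """Return cluster-wide metrics (nodes, allocated memory, etc.)."""
    r = requests.get(f"{RM}/ws/v1/cluster/metrics", timeout=10)
    r.raise_for_status()
    return r.json()["clusterMetrics"]

def running_apps():
    """Return the applications currently in the RUNNING state."""
    r = requests.get(f"{RM}/ws/v1/cluster/apps",
                     params={"states": "RUNNING"}, timeout=10)
    r.raise_for_status()
    apps = r.json().get("apps") or {}
    return apps.get("app", [])

if __name__ == "__main__":
    m = cluster_metrics()
    print("active nodes:", m["activeNodes"], "allocated MB:", m["allocatedMB"])
    for app in running_apps():
        print(app["id"], app["name"], app["queue"], app["progress"])
```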

Llama (Open Source)
Llama (Low Latency Application Master) is a tool from Cloudera that is used to interface between Impala and YARN.
Llama acquires resources from YARN but manages them as long-lived and able to be switched between different services.
This allows Impala to run thousands of concurrent queries, many of them with few-second (and even sub-second) latencies.
Note that the first ApplicationMaster created from a client process takes around 900 ms before it is ready to submit resource requests; subsequent ApplicationMasters created from the same client process take a mean of 20 ms. Llama allows one to exploit this low latency.

Apache Mesos
Apache Mesos is a cluster manager that supports and simplifies the process of running multiple Hadoop jobs, or other high-performance applications, on a dynamically shared pool of resources. Three use cases:
– Organizations that run multiple cluster applications can use Mesos to share nodes between them, improving resource utilization and simplifying system administration
– Provide a common interface for many cluster programming frameworks (Apache Hama, Microsoft Dryad, and Google's Pregel and Caffeine)
– Allow multiple instances of a framework on the same cluster, to support workload isolation and allow incremental deployment of upgrades
Mesos, which started as a research project at UC Berkeley, became an Apache incubator project in 2011 and has since graduated to a top-level project.

Apache Helix (LinkedIn)
Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes. Helix provides the following features:
– Automatic assignment of resources/partitions to nodes
– Node failure detection and recovery
– Dynamic addition of resources
– Dynamic addition of nodes to the cluster
– Pluggable distributed state machine to manage the state of a resource via state transitions
– Automatic load balancing and throttling of transitions
Helix manages the state of a resource by supporting a pluggable distributed state machine; one can define the state machine table along with the constraints for each state. Some common state models are:
– Master, Slave
– Online, Offline
– Leader, Standby
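Helix itself is a Java framework; purely as an illustration of the pluggable state-model idea (a transition table plus per-state constraints), here is a hypothetical Python sketch. It is not the Helix API, and the names are invented.

```python
# Hypothetical sketch of a Master/Slave-style state model, illustrating the
# idea behind Helix's pluggable state machine and constraints. NOT the
# actual Helix (Java) API; names and structure are for illustration only.
ALLOWED_TRANSITIONS = {
    ("OFFLINE", "SLAVE"),
    ("SLAVE", "MASTER"),
    ("MASTER", "SLAVE"),
    ("SLAVE", "OFFLINE"),
}
MAX_PER_STATE = {"MASTER": 1}   # constraint: at most one MASTER per partition

class Partition:
    def __init__(self, name, replica_nodes):
        self.name = name
        self.states = {node: "OFFLINE" for node in replica_nodes}

    def transition(self, node, target):
        """Apply a state transition if the table and constraints allow it."""
        current = self.states[node]
        if (current, target) not in ALLOWED_TRANSITIONS:
            raise ValueError(f"illegal transition {current} -> {target}")
        limit = MAX_PER_STATE.get(target)
        if limit is not None and list(self.states.values()).count(target) >= limit:
            raise ValueError(f"too many replicas would be in state {target}")
        self.states[node] = target

# Usage: bring two replicas online, then promote one to MASTER.
p = Partition("resource_0", ["node1", "node2"])
p.transition("node1", "SLAVE")
p.transition("node2", "SLAVE")
p.transition("node1", "MASTER")
print(p.states)   # {'node1': 'MASTER', 'node2': 'SLAVE'}
```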

Celery
Celery is an asynchronous task queue/job queue environment based on RabbitMQ (or an equivalent broker) and written in Python.
Task queues are used as a mechanism to distribute work across threads or machines. A task queue's input is a unit of work called a task. Dedicated worker processes constantly monitor task queues for new work to perform.
Celery communicates via messages, usually using a broker to mediate between clients and workers. To initiate a task, a client adds a message to the queue, which the broker then delivers to a worker.
A Celery system can consist of multiple workers and brokers controlled by one or more clients, enabling high availability and horizontal scaling.
It is used in a similar way to Azure worker roles and Azure Queues.
This area is called Many-Task Computing, "farming", or the master (client)–worker pattern.
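A minimal Celery sketch, assuming a RabbitMQ broker on localhost; the broker URL, module name and task below are illustrative:

```python
# tasks.py -- a minimal Celery application; assumes a RabbitMQ broker on
# localhost (broker URL, module name and task are illustrative).
# Start a worker with:  celery -A tasks worker --loglevel=info
from celery import Celery

app = Celery("tasks",
             broker="amqp://guest@localhost//",
             backend="rpc://")        # result backend so the client can fetch results

@app.task
def add(x, y):
    """A unit of work: worker processes pull these tasks off the queue."""
    return x + y

# Client side: enqueue a task and (optionally) wait for its result.
if __name__ == "__main__":
    result = add.delay(4, 4)          # the message goes to the broker
    print(result.get(timeout=10))     # a worker computes it and returns 8
```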

(Open)PBS
Portable Batch System (or simply PBS) is the name of computer software that performs job scheduling. Its primary task is to allocate computational tasks, i.e., batch jobs, among the available computing resources. It is often used in conjunction with UNIX cluster environments.
PBS is supported as a job-scheduler mechanism by several meta-schedulers, including Moab by Cluster Resources (which became Adaptive Computing Enterprises Inc.) and GRAM (Grid Resource Allocation Manager), a component of the Globus Toolkit.
The following versions of PBS are currently available:
– OpenPBS — the original open-source version, released in 1998 (not actively developed)
– TORQUE — a fork of OpenPBS that is maintained by Adaptive Computing Enterprises, Inc. (formerly Cluster Resources, Inc.); see a later slide
– PBS Professional (PBS Pro) — the commercial version of PBS offered by Altair Engineering
– Slurm — not a PBS variant, but a scheduler with similar capabilities; see the next slide
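To make the submission workflow concrete, here is a small sketch that generates a batch script with #PBS directives and submits it with qsub. The resource requests, job name and ./my_application are illustrative, and it assumes it runs on a login node of a TORQUE/OpenPBS cluster.

```python
# Sketch: generate a TORQUE/OpenPBS batch script with #PBS directives and
# submit it with qsub. Resource requests, job name and ./my_application
# are illustrative; qsub must be available on the node running this.
import subprocess, tempfile

PBS_SCRIPT = """#!/bin/bash
#PBS -N example_job
#PBS -l nodes=2:ppn=8
#PBS -l walltime=01:00:00
#PBS -j oe
cd $PBS_O_WORKDIR
mpirun ./my_application
"""

def submit(script_text):
    """Write the script to a file and hand it to qsub; return the job id string."""
    with tempfile.NamedTemporaryFile("w", suffix=".pbs", delete=False) as f:
        f.write(script_text)
        path = f.name
    out = subprocess.run(["qsub", path], capture_output=True, text=True, check=True)
    return out.stdout.strip()          # e.g. "12345.headnode"

if __name__ == "__main__":
    print("Submitted:", submit(PBS_SCRIPT))
```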

Slurm
Slurm is an open-source, C-based job scheduler from the HPC community with functionality similar to OpenPBS.
The Simple Linux Utility for Resource Management (SLURM) is a free and open-source job scheduler for Linux used by many of the world's supercomputers and computer clusters.
– It is best suited to high-end supercomputers, as it scales best.
It provides three key functions:
– First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work.
– Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job such as MPI) on a set of allocated nodes.
– Finally, it arbitrates contention for resources by managing a queue of pending jobs.
SLURM is the workload manager on roughly half of the TOP500 supercomputers, including Tianhe-2, which (as of June 2013) was the world's fastest computer.
SLURM uses a best-fit algorithm based on Hilbert curve scheduling or fat-tree network topology in order to optimize the locality of task assignments on parallel computers.
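For comparison with the PBS example, a sketch of the Slurm equivalent; the directive values and ./my_application are illustrative, and sbatch/squeue come from the cluster's Slurm installation.

```python
# Sketch: the Slurm counterpart of the PBS example -- build a batch script
# with #SBATCH directives, submit it with sbatch, then list this user's
# queued jobs with squeue. Resource values and ./my_application are
# illustrative.
import getpass, subprocess, tempfile

SLURM_SCRIPT = """#!/bin/bash
#SBATCH --job-name=example_job
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --time=01:00:00
#SBATCH --output=example_%j.out
srun ./my_application
"""

with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
    f.write(SLURM_SCRIPT)
    script_path = f.name

result = subprocess.run(["sbatch", script_path],
                        capture_output=True, text=True, check=True)
print(result.stdout.strip())           # e.g. "Submitted batch job 42"

# Show this user's pending and running jobs in the queue.
queue = subprocess.run(["squeue", "-u", getpass.getuser()],
                       capture_output=True, text=True)
print(queue.stdout)
```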

SGE (Sun, Oracle Grid Engine)
Oracle Grid Engine, previously known as Sun Grid Engine (SGE), CODINE (Computing in Distributed Networked Environments) or GRD (Global Resource Director), was a grid-computing cluster software system (otherwise known as a batch-queuing system), developed (originally acquired from a European company) and supported by Sun Microsystems and later Oracle. There have been open-source versions and multiple commercial versions of this technology, initially from Sun, later from Oracle and then from Univa Corporation.
– On October 22, 2013, Univa announced that it had acquired the intellectual property and trademarks for the Grid Engine technology and that it would take over support.
– The original Grid Engine open-source project website closed in 2010, but versions of the technology are still available under its original Sun Industry Standards Source License. These projects were forked from the original project code and are known as Son of Grid Engine and Open Grid Scheduler.
Grid Engine is typically used on a computer farm or high-performance computing (HPC) cluster and is responsible for accepting, scheduling, dispatching, and managing the remote and distributed execution of large numbers of standalone, parallel or interactive user jobs. It also manages and schedules the allocation of distributed resources such as processors, memory, disk space, and software licenses.
Grid Engine used to be the foundation of the Sun Grid utility computing system, made available over the Internet in the United States in 2006 and later in many other countries; it was an early public cloud computing facility, predating Amazon AWS, for instance.
Similar to SLURM, PBS, etc. in capability.

Moab, Maui, Torque: comparing solutions
The open-source Maui scheduler was extended into the commercial Moab system; both compete with Slurm and SGE.
Both build on the open-source TORQUE (Terascale Open-source Resource and QUEue Manager), which is based on PBS and provides the core scheduling capability.
All three are associated with Adaptive Computing (formerly Cluster Resources).
TORQUE provides enhancements over standard OpenPBS in the following areas:
Fault tolerance
– Additional failure conditions checked/handled
– Node health check script support
Scheduling interface
– Extended query interface providing the scheduler with additional and more accurate information
– Extended control interface allowing the scheduler increased control over job behavior and attributes
– Allows the collection of statistics for completed jobs
Scalability
– Significantly improved server-to-MOM communication model
– Ability to handle larger clusters (over 15 TF / 2,500 processors)
– Ability to handle larger jobs (over 2,000 processors)
– Ability to support larger server messages
Usability
– Extensive logging additions

HTCondor (originally Condor), Apache license
HTCondor is an open-source high-throughput computing software framework for coarse-grained distributed parallelization of computationally intensive tasks.
It can be used to manage workload on a dedicated cluster of computers, and/or to farm out work to idle desktop computers (so-called cycle scavenging).
HTCondor runs on Linux, Unix, Mac OS X, FreeBSD, and contemporary Windows operating systems. HTCondor can seamlessly integrate both dedicated resources (rack-mounted clusters) and non-dedicated desktop machines (cycle scavenging) into one computing environment.
HTCondor features include "DAGMan", which provides a mechanism to describe job dependencies.
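Since the slide highlights DAGMan, here is a sketch that writes a two-node DAG, where node B depends on node A, and submits it with condor_submit_dag; the file names, executable and arguments are illustrative.

```python
# Sketch: express a two-step job dependency with HTCondor DAGMan. Writes a
# vanilla-universe submit description shared by both DAG nodes, a DAG file
# declaring that B runs after A, and submits it with condor_submit_dag.
# File names, executable and arguments are illustrative.
import pathlib, subprocess

SUBMIT = """universe   = vanilla
executable = /bin/echo
arguments  = "hello from node $(node)"
output     = $(node).out
error      = $(node).err
log        = pipeline.log
queue
"""

DAG = """JOB A step.sub
JOB B step.sub
VARS A node="A"
VARS B node="B"
PARENT A CHILD B
"""

pathlib.Path("step.sub").write_text(SUBMIT)
pathlib.Path("pipeline.dag").write_text(DAG)
subprocess.run(["condor_submit_dag", "pipeline.dag"], check=True)
```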

Globus Toolkit (GT 6.0 released September 2014)
This is famous software that underlies many Grid computing activities and focuses on enabling the building of distributed systems. It implements the following standards:
– Open Grid Services Architecture (OGSA)
– Open Grid Services Infrastructure (OGSI), originally intended to form the basic "plumbing" layer for OGSA, but since superseded by WSRF and WS-Management
– Web Services Resource Framework (WSRF)
– Job Submission Description Language (JSDL)
– Distributed Resource Management Application API (DRMAA)
– WS-Management
– WS-BaseNotification
– SOAP
– Web Services Description Language (WSDL)
– Grid Security Infrastructure (GSI)
Globus Online (layer 10) started as part of the Globus Toolkit.
GRAM (Grid Resource Allocation Manager) is a major component of GT, as are security, information systems and distributed data management.

HPC Pilot Jobs
A pilot job is a form of multilevel scheduling in which a resource is acquired by an application so that the application can schedule work onto that resource directly, rather than going through a local job scheduler, which might lead to queue waits for each work unit.
The term comes from the Condor High-Throughput Computing System, in which Condor GlideIns provide this functionality. Other examples of pilot jobs are: BigJob, implemented in SAGA (layer 7); Swift Coasters, part of the Swift parallel scripting system; the Falkon lightweight task execution framework; and HTCaaS.
Pilot jobs are most often used on systems that have queues, since part of their purpose is, in some sense, to avoid multiple waits in these queues. Such systems are most often parallel computing systems, but pilot jobs are usually part of a distributed application and are often associated with Many-Task and High-Throughput computing.
The concept can be extended, as in Pilot Data, an abstraction that enables interoperability by decoupling the user of data from the controller of data.
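A hypothetical, scheduler-agnostic sketch of the pattern (not the API of GlideIns, BigJob, Coasters, Falkon or HTCaaS): the pilot function is what would be submitted once through the batch system; once it starts, it drains an application-level queue of small work units, so each unit avoids its own wait in the system queue.

```python
# Hypothetical illustration of the pilot-job pattern (not the API of any of
# the systems named above). The 'pilot' function is the body of the single
# batch job; it runs several workers that pull many small work units from
# an application-level queue.
import multiprocessing as mp
import queue

def worker(task_queue):
    """One pilot worker: keep pulling work units until the queue is drained."""
    while True:
        try:
            func, args = task_queue.get(timeout=1)
        except queue.Empty:
            return
        func(*args)

def pilot(task_queue, n_workers=4):
    """Body of the single batch allocation: run several workers inside it."""
    procs = [mp.Process(target=worker, args=(task_queue,)) for _ in range(n_workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

def simulate(step):
    print(f"work unit {step} done")

if __name__ == "__main__":
    q = mp.Queue()
    for i in range(20):               # 20 small work units, one scheduler submission
        q.put((simulate, (i,)))
    pilot(q)                          # in practice, 'pilot' is what gets submitted via Slurm/PBS/Condor
```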