Sahara Project Onboarding Telles Nobrega



Sahara Overview: Big Data Processing as a Service

Big Data processing framework provisioning
    Ambari
    Cloudera
    Vanilla (upstream Hadoop)
    MapR
    Spark
    Storm
EDP (Elastic Data Processing)
    Running jobs on those frameworks

Sahara Overview: Namings

Services: specific roles of a cluster instance
Node Group Templates: describe a group of nodes within a cluster
Cluster Templates: describe a group of Node Group Templates that form a cluster
Job binary: job executable (jar, .py)
Job template: describes a job to be run
Data Sources: sources used to pull data into Sahara or store data out of Sahara
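The relationship between these names can be sketched in a few lines of Python. This is only an illustration of the hierarchy described on the slide, not Sahara's actual classes or REST schema: the class names, the `count` field, and the service names ("namenode", "datanode", and so on) are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeGroupTemplate:
    """A group of identical nodes and the services (roles) each node runs."""
    name: str
    services: List[str]  # illustrative service names, not a Sahara-defined list
    count: int = 1       # hypothetical field: number of nodes in this group


@dataclass
class ClusterTemplate:
    """A set of node group templates that together form a cluster."""
    name: str
    plugin: str          # e.g. "vanilla", one of the frameworks Sahara provisions
    node_groups: List[NodeGroupTemplate] = field(default_factory=list)

    def total_nodes(self) -> int:
        # Sum the node counts of every group in the cluster.
        return sum(group.count for group in self.node_groups)

    def services(self) -> List[str]:
        # Sorted, de-duplicated list of every service in the cluster.
        return sorted({s for group in self.node_groups for s in group.services})


master = NodeGroupTemplate("master", ["namenode", "resourcemanager"])
workers = NodeGroupTemplate("worker", ["datanode", "nodemanager"], count=3)
cluster = ClusterTemplate("demo", "vanilla", [master, workers])

print(cluster.total_nodes())
print(cluster.services())
```

The point of the sketch is the containment: services belong to a node group, and node groups belong to a cluster template.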

Sahara Overview: Image Generation

Sahara Image Elements
    tox -e venv -- sahara-image-create -p spark -s [1.3.1|1.6.0|2.1.0|2.2.0]  # spark standalone
    tox -e venv -- sahara-image-create -p vanilla -v 2.7.1 -s [1.6.0|2.1.0|2.2.0]  # spark on vanilla
Sahara Image Pack
    tox -e image -- sahara-image-pack --image CentOS.qcow2 \
        --config-file etc/sahara/sahara.conf \
        cdh 5.7.0 [cdh 5.7.0 specific arguments, if any]
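When image builds are scripted, the sahara-image-pack invocation above can be assembled as an argument list. The following is a minimal sketch that only builds and prints the command. The helper name `image_pack_command` is hypothetical, and the flags and values are exactly those shown on the slide.

```python
import shlex
from typing import List, Sequence


def image_pack_command(image: str, config_file: str, plugin: str,
                       version: str, extra_args: Sequence[str] = ()) -> List[str]:
    """Build the argv for the sahara-image-pack call shown on the slide."""
    return [
        "tox", "-e", "image", "--", "sahara-image-pack",
        "--image", image,
        "--config-file", config_file,
        plugin, version,
        *extra_args,  # plugin/version specific arguments, if any
    ]


cmd = image_pack_command("CentOS.qcow2", "etc/sahara/sahara.conf", "cdh", "5.7.0")
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids shell quoting problems if it is later passed to `subprocess.run`.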

Architecture


Sahara Repos

Current
    sahara - https://git.openstack.com/openstack/sahara
    sahara-image-elements - https://git.openstack.com/openstack/sahara-image-elements
    python-saharaclient - https://git.openstack.com/openstack/python-saharaclient
    sahara-dashboard - https://git.openstack.com/openstack/sahara-dashboard
    sahara-extra - https://git.openstack.com/openstack/sahara-extra
Planned
    sahara-plugins

Running Sahara

    # first terminal
    $ sahara-venv/bin/sahara-api --config-file sahara-venv/etc/sahara.conf

    # second terminal
    $ sahara-venv/bin/sahara-engine --config-file sahara-venv/etc/sahara.conf
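For a development setup, the two terminals can be replaced by a small launcher. The sketch below assumes the same `sahara-venv` layout as the commands above. The function names are hypothetical, and `start_services` is defined but not called here, since it needs a working Sahara installation.

```python
import subprocess
from typing import List

VENV = "sahara-venv"                 # virtualenv directory from the slide
CONF = f"{VENV}/etc/sahara.conf"     # config file from the slide
SERVICES = ("sahara-api", "sahara-engine")


def service_command(service: str) -> List[str]:
    """Return the argv for one Sahara service, as run in one terminal."""
    return [f"{VENV}/bin/{service}", "--config-file", CONF]


def start_services() -> List[subprocess.Popen]:
    """Start the API and engine as child processes (not invoked in this sketch)."""
    return [subprocess.Popen(service_command(name)) for name in SERVICES]


# Show what would be launched without starting anything.
for name in SERVICES:
    print(" ".join(service_command(name)))
```

To actually run both services, call `start_services()` and wait on the returned processes; stopping the launcher should also terminate them.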

Running Sahara Tests

Unit tests (from sahara)
    tox -e py27
    tox -e py35
    tox -e pep8
    ...
Scenarios (from sahara-tests)
    sahara-scenario {posargs}
Tempest
    Install the Sahara Tempest plugin from sahara-tests
    Follow the default Tempest process (see the Tempest docs)

Sahara Docs

    https://docs.openstack.org/sahara/latest/
    https://docs.openstack.org/sahara-tests/latest/
    https://docs.openstack.org/sahara/latest/reference/restapi.html
    https://developer.openstack.org/api-ref/data-processing/