HDFS on Kubernetes -- Lessons Learned

Slides:



Advertisements
Similar presentations
RASPro is a secure high performance remote application delivery platform through a perfect combination of application hosting and application streaming.
Advertisements

Buffers & Spoolers J L Martin Think about it… All I/O is relatively slow. For most of us, input by typing is painfully slow. From the CPUs point.
Overview Network security involves protecting a host (or a group of hosts) connected to a network Many of the same problems as with stand-alone computer.
P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
PlanetLab Operating System support* *a work in progress.
Introduction to Virtualization
How Clients and Servers Work Together. Objectives Learn about the interaction of clients and servers Explore the features and functions of Web servers.
MongoDB Sharding and its Threats
Google Distributed System and Hadoop Lakshmi Thyagarajan.
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
Cloud Computing for the Enterprise November 18th, This work is licensed under a Creative Commons.
Module 13: Configuring Availability of Network Resources and Content.
Windows Azure Conference 2014 Running Docker on Windows Azure.
UNIX System Administration OS Kernal Copyright 2002, Dr. Ken Hoganson All rights reserved. OS Kernel Concept Kernel or MicroKernel Concept: An OS architecture-design.
Module 7: Fundamentals of Administering Windows Server 2008.
608D CloudStack 3.0 Omer Palo Readiness Specialist, WW Tech Support Readiness May 8, 2012.
Cisco Confidential © 2010 Cisco and/or its affiliates. All rights reserved. 1 MSE Virtual Appliance Presenter Name: Patrick Nicholson.
1 Week #10Business Continuity Backing Up Data Configuring Shadow Copies Providing Server and Service Availability.
Android Security Auditing Slides and projects at samsclass.info.
Cloud Computing is a Nebulous Subject Or how I learned to love VDF on Amazon.
Data Communications and Networks Chapter 9 – Distributed Systems ICT-BVF8.1- Data Communications and Network Trainer: Dr. Abbes Sebihi.
Cloud Computing project NSYSU Sec. 1 Demo. NSYSU EE IT_LAB2 Outline  Our system’s architecture  Flow chart of the hadoop’s job(web crawler) working.
Distributed File System. Outline Basic Concepts Current project Hadoop Distributed File System Future work Reference.
#msitconf. Damien Caro Technical Evangelist Manager, Что будет, если приложение поместить в контейнер? What happens if the application.
© 2015 MetricStream, Inc. All Rights Reserved. AWS server provisioning © 2015 MetricStream, Inc. All Rights Reserved. By, Srikanth K & Rohit.
Open Source Virtualization Andrey Meganov RHCA, RHCX Consultant / VDEL
Common System Exploits Tom Chothia Computer Security, Lecture 17.
Hadoop Introduction. Audience Introduction of students – Name – Years of experience – Background – Do you know Java? – Do you know linux? – Any exposure.
Petr Škoda, Jakub Koza Astronomical Institute Academy of Sciences
Holland Computing Center STAT802 Create and access Anvil Windows 10 SAS instance 01/23/2017.
HTCondor Annex (There are many clouds like it, but this one is mine.)
Scalable containers with Apache Mesos and DC/OS
Bentley Systems, Incorporated
Apache Ignite Data Grid Research Corey Pentasuglia.
ONAP/K8S Deployment OOM Team
Daniel Templeton, Cloudera, Inc.
Introduction to Distributed Platforms
By Chris immanuel, Heym Kumar, Sai janani, Susmitha
File System Implementation
Spark Presentation.
The Client/Server Database Environment
Introduction to HDFS: Hadoop Distributed File System
In-Memory Performance
Enterprise security for big data solutions on Azure HDInsight
MapReduce Computing Paradigm Basics Fall 2013 Elke A. Rundensteiner
Kubernetes Container Orchestration
Introduction to Docker
Azure Container Instances
Kubernetes intro.
The Basics of Apache Hadoop
HDFS on Kubernetes -- Lessons Learned
Intro to Docker Containers and Orchestration in the Cloud
GARRETT SINGLETARY.
RASPro is a secure high performance remote application delivery platform through a perfect combination of application hosting and application streaming.
Server & Tools Business
Webinar # April 2017 Isolates in the Cloud
Developing for the cloud with Visual Studio
Getting Started with Kubernetes and Rancher 2.0
Introduction to Apache
From Source to Production: The Latest in Container Dev
Orchestration & Container Management in EGI FedCloud
Lecture 16 (Intro to MapReduce and Hadoop)
OpenShift vs. Vanilla k8s on OpenStack IaaS
Linux and TCP/IP Networking
+ Attach service request
OpenShift as a cloud for Data Science
Kubernetes.
OpenStack Summit Berlin – November 14, 2018
Server & Tools Business
SQL Server on Containers
Presentation transcript:

HDFS on Kubernetes -- Lessons Learned Kimoon Kim, Pepperdata

Agenda Kubernetes intro Spark and HDFS on Kubernetes Demo Problems we fixed in Spark on K8s -- HDFS data locality, Secure HDFS

Kubernetes New open-source cluster manager. Runs programs in Linux containers. 1000+ contributors and 40,000+ commits. Designed and built for containers

“My app was running fine until someone installed their software” My spark job failed with ClassNotFound exception I had to change my Tomcat port

More isolation is good Kubernetes provides each program with: a lightweight virtual file system -- Docker image an independent set of S/W packages a virtual network interface a unique virtual IP address an entire range of ports Containers create more isolation layers Still very fast This other person install only on his container What if I have a server program like Tomcat Both my Tomcat and your Tomcat can open the same port under different IPs Like having VMS such that each program is the only program running

Other isolation layers Separate process ID space Max memory limit CPU share throttling Mountable volumes Config files -- ConfigMaps Credentials -- Secrets Local storages -- EmptyDir, HostPath Network storages -- PersistentVolumes

Kubernetes architecture Pod, a unit of scheduling and isolation. runs a user program in a primary container holds isolation layers like a virtual IP in an infra container Pod 1 Pod 2 Pod 3 10.0.0.2 10.0.0.3 10.0.1.2 node A node B We have pod 1, 2, and 3. Each pod gets a unique virtual IP address like 10.0.0.2. Virtual pod IPs look very different from physical IP addresses of cluster nodes Becomes important later when looking at Spark and HDFS together 196.0.0.5 196.0.0.6

Big Data on Kubernetes github.com/apache-spark-on-k8s Bloomberg, Google, Haiwen, Hyperpilot, Intel, Palantir, Pepperdata, Red Hat, and growing. patching up Spark Driver and Executor code to work on Kubernetes. being upstreamed with SPIP-Spark-on-Kubernetes. Related talk: spark-summit.org/2017/events/apache-spark-on- kubernetes/

Spark on Kubernetes Job 1 Job 2 Driver Pod Executor Pod 1 Executor Pod 2 Client 10.0.1.2 10.0.0.2 10.0.0.3 Driver Pod Executor Pod 1 Executor Pod 2 Client 10.0.1.3 10.0.0.4 10.0.0.5 node A node B 196.0.0.5 196.0.0.6

What about storage? Spark on Kubernetes supports cloud storages like S3. Your data is often stored on HDFS. Run HDFS itself on Kubernetes -- github.com/apache-spark-on- k8s/kubernetes-HDFS Access remote HDFS running outside Kubernetes

HDFS on Kubernetes github.com/apache-spark-on-k8s/kubernetes-HDFS Job 1 github.com/apache-spark-on-k8s/kubernetes-HDFS Job 2 HDFS Driver Pod Executor Pod 1 Executor Pod 2 Client 10.0.1.2 10.0.0.2 10.0.0.3 Driver Pod Executor Pod 1 Executor Pod 2 Client 10.0.0.4 10.0.0.5 10.0.1.3 Datanode Pod 2 Namenode Pod Datanode Pod 1 node A node B HostPath 196.0.0.5 HostPath 196.0.0.6 hdfs-namenode-selector = hdfs-namenode-0

Demo - Spark on K8s accessing secure HDFS Launch HDFS on K8s Run a Spark job on K8s to use HDFS Check the job pods

Problems we fixed in Spark on K8s HDFS data locality support -- Send compute to data Node locality Rack locality Where to launch executors Secure HDFS support Executor 1 Executor 2 FAST SLOW Datanode 1 Datanode 2 node A node B HDFS is faster than other distributed storage systems Fast in part because it supports data locality.

Solved: Node locality (/fileA → Datanode 1 → 196.0.0.5) == Location of Executor 1 (/fileA → Datanode 1 → 196.0.0.5) == (Executor 1 →10.0.0.3 → 196.0.0.5) (/fileA → Datanode 1 → 196.0.0.5) != (Executor 1 →10.0.0.3) Location of fileA == Location of Executor 1 Read /fileB Driver Executor Pod 1 Executor Pod 2 Read /fileA 10.0.1.2 10.0.0.2 10.0.0.3 /fileA /fileB Namenode Pod Datanode Pod 2 Datanode Pod 1 node A node B 196.0.0.5 196.0.0.6

Solved: Rack locality SLOW 10.0.1.2 10.0.2.2 10.0.0.2 196.0.0.5 (/fileA → Datanode 1 → 196.0.0.5 → Rack 1) == (Executor 1 →10.0.1.2 → 196.0.0.6 → Rack 1) (/fileA → Datanode 1 → 196.0.0.5 → Rack 1) != (Executor 1 →10.0.1.2) Rack of fileA == Rack of Executor 1 Read /fileB Driver Executor Pod 1 Executor Pod 2 Read /fileA 10.0.1.2 10.0.2.2 10.0.0.2 SLOW /fileA /fileB Datanode Pod 1 Datanode Pod 2 Datanode Pod 3 node A node B node C 196.0.0.5 196.0.0.6 196.0.1.5 Rack 1 Rack 2

Solved: Node preference Hey K8s, I’d like node A much more for my executors Node affinity Driver Executor Pod 1 Executor Pod 2 10.0.0.2 10.0.0.3 10.0.0.4 /fileA Datanode Pod 2 Datanode Pod 1 /fileB node A node B 196.0.0.5 196.0.0.6

Rescued data locality! The job was TeraValidate without data locality fix duration: 25 minutes with data locality fix duration: 10 minutes The job was TeraValidate Two runs processed the same input AWS CloudWatch chart

Problems we fixed in Spark on K8s HDFS data locality support Secure HDFS support Kerberos tickets HDFS tokens Long running jobs

encrypted with HDFS’ password Kerberos, simplified encrypted with A’s password encrypted with session key SK1 Order # SK1 Customer copy Order # SK1 Merchant copy I’m user A. May I talk to HDFS? SK1 Kerberos Server You guys should talk only if the other side knows SK1. I’ll secretly give SK1 only to each of you. I guarantee that the other side is genuine if they know SK1. SK1 copy for User A SK1 copy for HDFS A’s password HDFS’ password User A Ticket to HDFS SK1 copy for HDFS SK1 SK1 HDFS namenode Session 1 Requests/Responses HDFS’ password

HDFS Delegation Token Kerberos ticket, no good for executors on cluster nodes. Stamped with the client IP. Give tokens to driver and executors instead. Issued by namenode only if the client has a valid Kerberos ticket. No client IP stamped. Permit for driver and executors to use HDFS on your behalf across all cluster nodes.

Solved: Share tokens via K8s Secret Driver Pod Executor Pod 1 Executor Pod 2 Client 10.0.1.2 10.0.0.2 10.0.0.3 Secret 1 Kerberos Namenode Pod Datanode Pod 1 Datanode Pod 2 node A node B 196.0.0.5 196.0.0.6

Solved: Refresh tokens with K8s microservice Driver Pod Executor Pod 1 Executor Pod 2 Client 10.0.1.2 10.0.0.2 10.0.0.3 Refresh Pod Secret 1 10.0.0.8 Kerberos Namenode Pod Datanode Pod 2 node A Datanode Pod 1 node B 196.0.0.5 196.0.0.6

Solved: Keep Secret to yourself with K8s RBAC Job 1 Job 2 Driver Pod Executor Pod 1 Executor Pod 2 Client Secret 1 10.0.1.2 10.0.0.2 10.0.0.3 Secret 1 Driver Pod Executor Pod 1 Executor Pod 2 Client 10.0.0.4 10.0.0.5 10.0.1.3 node A node B 196.0.0.5 196.0.0.6

Recap Data locality works. Supports secure HDFS. Open problems Performance? Namenode High Availability

github.com/apache-spark-on-k8s Join us! github.com/apache-spark-on-k8s kimoon@pepperdata.com