Cassandra + λ Scale POC Presented by Lyuben Todorov July 2018.



About Me
Lyuben Todorov, Engineering Director, Instaclustr EMEA
University of Dundee
Distributed Programming / OSS
Social media: /in/lyubent

Overview
λ and C* (Cassandra) introduction
Why use λ and Instaclustr's managed service
High-level setup of λ and C* in Instaclustr
Technical challenges of using λ
Lessons learned

What is λ
Serverless
Pay for execution time (first 1M requests and 400k GB-seconds free per month)
Auto-scale
Always available
What is a GB-second? One GB-second is one second of running with a 1 GB memory allocation.
Traditional model (server, operation, app, database): the app's operations sit idle and just wait, which is a waste of resources.
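The GB-second definition above can be worked through with a little arithmetic. The sketch below is only an illustration: the workload in main (512 MB, 200 ms per call, 3 million calls a month) is hypothetical and not from the talk, and the class and method names are made up. Only the two free-tier figures come from the slide.

```java
// Sketch: GB-seconds consumed by a workload and how much exceeds the free tier.
// Free-tier constants are taken from the slide; everything else is an assumption.
final class GbSeconds {
    static final double FREE_GB_SECONDS = 400_000.0; // 400k GB-seconds free
    static final long FREE_REQUESTS = 1_000_000L;    // first 1M requests free

    // GB-seconds = memory in GB * duration in seconds * number of invocations
    static double gbSeconds(int memoryMb, long durationMs, long invocations) {
        return (memoryMb / 1024.0) * (durationMs / 1000.0) * invocations;
    }

    // GB-seconds left over once the free allowance is used up (never negative).
    static double billableGbSeconds(double used) {
        return Math.max(0.0, used - FREE_GB_SECONDS);
    }

    // Requests left over once the free allowance is used up (never negative).
    static long billableRequests(long invocations) {
        return Math.max(0L, invocations - FREE_REQUESTS);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 512 MB function, 200 ms per call, 3M calls a month.
        long invocations = 3_000_000L;
        double used = gbSeconds(512, 200, invocations);
        System.out.println("GB-seconds used:     " + used);
        System.out.println("Billable GB-seconds: " + billableGbSeconds(used));
        System.out.println("Billable requests:   " + billableRequests(invocations));
    }
}
```

Halving the memory setting halves the GB-seconds for the same duration, which is one reason the memory allocation is worth benchmarking.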

What is λ
With λ, operations are no longer sitting around idle. Operations listen for events.

What is λ
Flow: user event, container creation, λ operation, container teardown.
AWS creates the container and executes the function. Upon completion, the host container can be torn down.
It is not the user's responsibility to:
wake the λ
create or tear down the container
update or manage hardware

λ Use-cases
1. Peaking applications: for apps whose load peaks, λ will scale them for you.
2. Event-driven applications: you only pay for what you use.
3. Short code execution times: λ can't process continuously for hours. EC2 would be better for such an app; building an ML model, for example, is a poor fit for λ.

What is C*
Highly available distributed database
No SPOF (p2p architecture)
Open source
Tunable consistency
Partition tolerant
Consistent

C* Client
Relevant to λ:
Gossip: used by the client to discover nodes (the client learns the cluster topology)
Create a λ per DC and use a DC-aware client
Query with LOCAL consistency levels
Be careful with client timestamps (due to cold start)

Instaclustr Hosted Service
Simple auto scaling service
24/7 support
Access to analytics
Dashboard for monitoring
Security plugins

How to set up λ
1. Create λ VPC
2. Create subnet for VPC
3. Provision C* Cluster
4. VPC Peering Request – C* to λ
5. Update Route Table
6. Create & Deploy λ
7. Connect λ to Backend
8. Deploy and test web app

Create VPC

VPC Subnet
Add a subnet per AZ.

Add Some Instant Awesome
Provision the C* cluster: pick Cassandra.

Pick Your Cloud

Choose Node Capacity and Type
Resizable or Standard

Scalable Backend
We get a backend we don't need to worry about.
Monitoring is in place.
You can customize it if you want.
Alerts can be added if needed.

Peering λ and C*'s VPCs
The λ VPC needs to be connected to Instaclustr's Cassandra VPC via the Instaclustr console.
Remember to accept the peering request in AWS.

Route Tables
Add a rule for the API Gateway.
Add a rule for the Instaclustr VPC.

Create λ in AWS
The role assigned must have CloudWatch log access.

Deploy λ
It is important that the JAR stays small. A small JAR is quick to deploy and leaves little downtime between re-deployments.

Deploy λ
The gateway has 2 resources:
POST to push data to Cassandra
GET to retrieve data

Architecture

The App
Processes web requests
POST is used for inserting an event
GET is used for fetching an event
Cassandra table (model):
CREATE TABLE event (
    id uuid,
    source text,
    type text,
    recorded timestamp,
    PRIMARY KEY (id)
);

The App - API Gateway
Two resources added (POST and GET)

The App - API Gateway
POST /event/ writes an event to C*:
session.execute("INSERT INTO ic.event (id, source, type, recorded) "
    + "VALUES (now(), '10.1.13.77', 'Auth', toTimestamp(now()))");
GET /event/{id} retrieves an event from C* by id:
// id is the java.util.UUID taken from the request path
session.execute("SELECT * FROM ic.event WHERE id = ?", id);

The App Code
Java application
Request processed as a stream
Output as JSON
public void handler(InputStream inputStream, OutputStream outputStream, Context context) {
    // IMPL.
    // Pass request to either GET or POST depending on context.
}
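The "pass request to either GET or POST" step can be sketched without any AWS libraries. This is a minimal illustration under stated assumptions, not the talk's implementation: it assumes the event JSON carries an "httpMethod" field (as in API Gateway's Lambda proxy integration), it finds that field with a simple regex rather than a real JSON parser, it leaves out the Context parameter to stay dependency-free, and handlePost, handleGet and respond are hypothetical placeholders.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EventHandlerSketch {
    // Assumption: the incoming event JSON has an "httpMethod" field.
    private static final Pattern METHOD =
            Pattern.compile("\"httpMethod\"\\s*:\\s*\"([A-Z]+)\"");

    // Reads the request stream, picks a branch by HTTP method, writes JSON out.
    static void handle(InputStream in, OutputStream out) throws IOException {
        String request = readAll(in);
        Matcher m = METHOD.matcher(request);
        String method = m.find() ? m.group(1) : "";
        String body;
        switch (method) {
            case "POST": body = handlePost(request); break;
            case "GET":  body = handleGet(request);  break;
            default:     body = "{\"error\":\"unsupported method\"}";
        }
        out.write(body.getBytes(StandardCharsets.UTF_8));
    }

    // Placeholders: the real versions would run the INSERT / SELECT against C*.
    static String handlePost(String request) { return "{\"status\":\"inserted\"}"; }
    static String handleGet(String request)  { return "{\"status\":\"fetched\"}"; }

    // Convenience wrapper for trying the handler with plain strings.
    static String respond(String eventJson) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            handle(new ByteArrayInputStream(eventJson.getBytes(StandardCharsets.UTF_8)), out);
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Calling EventHandlerSketch.respond with an event containing "httpMethod": "GET" takes the GET branch; any other or missing method falls through to the error body.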

The Challenges
Application scalability
λ warmup time
Reducing memory usage
Connection pooling
Dependency management
Execution environment limits

Scaling Requests
Solving scalability issues: a load balancer can distribute requests, but it
adds complexity (what if a backend changes?)
requires config (rule based)
requires re-deployment on many machines if the app is modified
needs app connection pooling / coordination
adds infrastructure overhead
Cassandra apps don't like load balancers.

Scaling Requests with λ
Lambda scales the app automatically
Re-deploy only 1 thing on app update
Scaling: the predetermined amount depends on the region
Account limits exist to prevent DDoS

Scaling Requests with λ
Configure concurrent execution: how many invocations of the function run in parallel?
Write good app code!
A stateless approach makes parallelism simpler
Avoid the read-before-write pattern!

Function Warmup Time
A cold start is when λ has to initialise resources (container, NIC, other resources) in order to execute a function.
Containers are torn down after 15 min of inactivity, so the next invocation after that is a cold start.
A function avoids cold starts if it is constantly running.

Function Warmup Time
[Plot: request response time (sec) and parallel requests (hundreds) against time (min)]

Function Warmup Time
Cheat: ping the λ every 5-10 mins. Create a Rule in AWS as an Event and schedule it to run every 10 min.
If you ping, you avoid container teardown.
Monitor container changes.
Monitor how bad swapping is. Do you need to act?
stackoverflow.com/questions/42877521/
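When a scheduled rule pings the function, the handler can return early so that keep-warm pings do not touch Cassandra. The sketch below is one possible way to do that, under the assumption that the scheduled event's JSON has its "source" field set to "aws.events"; the check is a rough string match rather than a JSON parse, and WarmupGuard, handle and doRealWork are hypothetical names.

```java
// Sketch: short-circuit keep-warm pings before doing any real work.
final class WarmupGuard {
    // Assumption: scheduled rule events carry "source": "aws.events".
    // Whitespace is stripped first so the match tolerates pretty-printed JSON.
    static boolean isWarmupPing(String eventJson) {
        if (eventJson == null) {
            return false;
        }
        String compact = eventJson.replaceAll("\\s+", "");
        return compact.contains("\"source\":\"aws.events\"");
    }

    static String handle(String eventJson) {
        if (isWarmupPing(eventJson)) {
            // The container stays warm; no database work is done.
            return "{\"warm\":true}";
        }
        return doRealWork(eventJson);
    }

    // Placeholder for the real request handling (the POST / GET paths).
    static String doRealWork(String eventJson) {
        return "{\"handled\":true}";
    }
}
```

Returning early keeps the ping cheap, since billing is by execution time.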

Reduce Memory Usage
512 MB by default: way too much for a simple C* client
CPU is proportional to the memory allocated to the app
Don't reduce too much; benchmark to see whether your app is bottlenecked on CPU or memory

Connection Pool Management
Creating connections is expensive
Connection pooling allows reuse
λ is stateless and asynchronous in nature
Not allowed to save any state from one execution to another
No connection sharing. With asynchronous execution, who will get the connection? Race conditions…

Connection Pool Management
Store session state outside of the handler function's scope
Variables outside of the handler remain initialised across λ calls
// Keep the client wrapper outside of the handleRequest function;
// this keeps the client initialised across λ executions.
private CassandraClient client = new CassandraClient();

public String handleRequest(Map<String,Object> input, Context context) {
    return "C* Version: " + client.getVersion();
}
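The effect of keeping the client outside the handler can be shown with a stand-in class that counts how often it is constructed. This is a sketch under assumptions: FakeClient is a hypothetical substitute for the talk's CassandraClient (no driver is used), the version string is a dummy, and a static field is used here where the slide uses an instance field.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ReuseSketch {
    // Hypothetical stand-in for a Cassandra client wrapper.
    static final class FakeClient {
        static final AtomicInteger CONSTRUCTED = new AtomicInteger();

        FakeClient() {
            // In a real client this is where the expensive connection setup happens.
            CONSTRUCTED.incrementAndGet();
        }

        String getVersion() {
            return "x.y.z"; // dummy value
        }
    }

    // Created once when the class is loaded, then reused by every call
    // that lands on the same container.
    private static final FakeClient CLIENT = new FakeClient();

    static String handleRequest(String input) {
        return "C* Version: " + CLIENT.getVersion();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println(handleRequest("call " + i));
        }
        System.out.println("Clients constructed: " + FakeClient.CONSTRUCTED.get());
    }
}
```

Had the client been created inside handleRequest, the counter would grow with every call, which is the connection cost the slide warns about.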

Dependency Management
Lean dependencies: smaller app, faster deployment, less downtime
pom.xml:
<dependency>
    <groupId>io.symphonia</groupId>
    <artifactId>lambda-logging</artifactId>
    <version>1.0.1</version>
</dependency>
Silly problem to have… Don't add dependencies unless you really need them. Apps can grow big very quickly; once you reach several hundred MB, deployment slows down.
JAR size by logger:
Log4J: 8.6 MB
Symphonia: 8.1 MB
No logger: 7.3 MB

Execution Environment Limits
Per λ invocation:
512 MB of disk
3008 MB memory limit
Max timeout: 5 mins
Max response payload: 6 MB
Event payload: 128 KB
Large Cassandra partitions (groups of rows) are out.
Large input is also out.
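One way to respect the payload limits above is to check sizes before sending a request or returning a response. The sketch below takes the 6 MB and 128 KB figures from the slide and assumes they are measured in binary units over the UTF-8 encoded bytes; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Sketch: guard against the payload limits quoted on the slide.
final class PayloadLimits {
    // Assumption: 1 MB = 1024 * 1024 bytes and 1 KB = 1024 bytes.
    static final int MAX_RESPONSE_BYTES = 6 * 1024 * 1024; // 6 MB response payload
    static final int MAX_EVENT_BYTES = 128 * 1024;         // 128 KB event payload

    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the response body fits in the response payload limit.
    static boolean fitsResponse(String body) {
        return utf8Length(body) <= MAX_RESPONSE_BYTES;
    }

    // True if the event body fits in the event payload limit.
    static boolean fitsEvent(String body) {
        return utf8Length(body) <= MAX_EVENT_BYTES;
    }
}
```

A GET that could return a large partition might run the response check and page or truncate the result when it fails.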

POC Benchmark
Create a client that sends out a periodically increasing number of requests
Run for 7 min 30 sec
Review the Cassandra latency metric
45k requests max
7 min 3 sec duration
Mix of reads and writes

Outcome
[Plots: 75th percentile latency (μs) against time (sec); requests against time (sec)]
Stable latency
No app intervention required
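For readers unfamiliar with the 75th percentile figure, the sketch below computes it from raw samples with the nearest-rank method. It only illustrates what the number means: the sample values are made up, and this is not a claim about how Cassandra derives its own latency metric.

```java
import java.util.Arrays;

// Sketch: nearest-rank percentile over raw latency samples (in μs).
final class Percentile {
    static long nearestRank(long[] samples, double percentile) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (percentile <= 0 || percentile > 100) {
            throw new IllegalArgumentException("percentile must be in (0, 100]");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        // Rank is the smallest position covering at least `percentile` % of samples.
        int rank = (int) Math.ceil(percentile / 100.0 * sorted.length);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Hypothetical latency samples in microseconds.
        long[] latenciesUs = {800, 100, 300, 200, 500, 400, 700, 600};
        System.out.println("p75 = " + nearestRank(latenciesUs, 75) + " μs");
    }
}
```

A stable 75th percentile means three quarters of requests stayed at or below that latency as the request rate rose.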

Q & λ