Cassandra + λ Scale POC Presented by Lyuben Todorov July 2018.

About Me
Lyuben Todorov
Engineering Director, Instaclustr EMEA
University of Dundee
Distributed Programming / OSS
Social Media: /in/lyubent

Overview
λ and C* (Cassandra) introduction
Why use λ and Instaclustr’s managed service
High-level setup of λ and C* in Instaclustr
Technical challenges of using λ
Lessons learned

What is λ
Serverless
Pay for execution time: the first 1M requests and 400k GB-seconds each month are free
Auto-scale
Always available
What is a GB-second? One second of running with a 1 GB memory pool.
In a traditional deployment (server, app, operations, database), the app’s operations sit idle and just wait for requests. That is a waste of resources.

What is λ
With λ, operations are no longer sitting around idle. They listen for events.

What is λ
Flow: user event, container creation, λ operation, container teardown.
AWS creates the container and executes the function in it. Upon completion the host container can be torn down.
It is not the user’s responsibility to wake the λ, create or tear down containers, or update and manage hardware.
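The GB-second definition above can be turned into a quick estimate. This is a minimal sketch, assuming usage is simply invocations × duration × allocated memory and that the 400k GB-second allowance quoted on the slide applies per month. The class and method names are illustrative, and real billing rounds durations up, which the sketch ignores.

```java
final class GbSecondEstimate {
    // Free monthly allowance quoted on the slide.
    static final double FREE_GB_SECONDS = 400_000.0;

    /** GB-seconds consumed: invocations x seconds per invocation x GB allocated. */
    static double gbSeconds(long invocations, double avgDurationMs, int memoryMb) {
        return invocations * (avgDurationMs / 1000.0) * (memoryMb / 1024.0);
    }

    /** GB-seconds left to pay for once the free allowance is used up. */
    static double billableGbSeconds(double used) {
        return Math.max(0.0, used - FREE_GB_SECONDS);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 1M invocations of 200 ms each at 512 MB.
        double used = gbSeconds(1_000_000, 200, 512);
        System.out.println(used + " GB-seconds used, " + billableGbSeconds(used) + " billable");
    }
}
```

With these made-up numbers the workload stays inside the free allowance; doubling the memory setting doubles the GB-seconds consumed for the same run time.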

λ Use-cases
Peaking applications: λ will scale apps with traffic peaks for you
Event-driven applications: you only pay for what you use
Short code execution times: λ can’t process continuously for hours. EC2 would be better for such an app; building an ML model, for example, is a poor fit for Lambda

What is C*
Highly available distributed database
No SPOF (p2p architecture)
Open source
Tunable consistency
Partition tolerant
Consistent

C* Client
Relevant to Lambda:
Gossip: used by the client to discover nodes. The client is smart: it learns the cluster topology
Create a λ per DC and use a DC-aware client
Query with LOCAL consistency levels
Be careful with client-side timestamps (due to cold start)

Instaclustr Hosted Service
Simple Auto scaling service 24/7 Support Access to Analytics Dashboard for Monitoring Security Plugins

How to set up λ
1. Create λ VPC
2. Create subnet for VPC
3. Provision C* cluster
4. VPC peering request: C* to λ
5. Update route table
6. Create and deploy λ
7. Connect λ to backend
8. Deploy and test web app

Create VPC
Create the λ VPC.

VPC Subnet
Create a subnet for the VPC. Add a subnet per AZ.

Add Some Instant Awesome
Provision the C* cluster: pick Cassandra.

Pick Your Cloud

Choose Node Capacity and Type
Resizable or standard.

Scalable Backend
We get a backend we don’t need to worry about.
Monitoring is in place, and you can customize it if you want.
Alerts can be added if needed.

Peering λ and C*’s VPCs
Lambda’s VPC needs to be connected with Instaclustr’s Cassandra VPC via the Instaclustr console.
Remember to accept the peering request in AWS.

Route Tables
Add a rule for the API Gateway.
Add a rule for the Instaclustr VPC.

Create λ in AWS
The role assigned must have CloudWatch Logs access.

Deploy λ
It is important that the JAR stays small: a small JAR deploys quickly, with little downtime between re-deployments.

Deploy λ
The gateway has 2 resources: we can POST to push data to Cassandra, and we can GET to retrieve data.

Architecture

The App
Allows processing of web requests
POST is used for inserting an event
GET is used for fetching an event
Cassandra table (model):
    CREATE TABLE event (
        id uuid,
        source text,
        type text,
        recorded timestamp,
        PRIMARY KEY (id)
    );

The App - API Gateway
Two resources added (POST and GET)

The App - API Gateway
POST /event/ writes an event to C*:
    session.execute("INSERT INTO ic.event (id, source, type, recorded) "
        + "VALUES (now(), ' ', 'Auth', toTimestamp(now()))");
GET /event/{id} retrieves an event from C* by id:
    // id is the UUID parsed from the {id} path parameter
    session.execute("SELECT * FROM ic.event WHERE id = ?", id);

The App Code
Java application
Request processed as a stream
Output as JSON
    public void handler(InputStream inputStream, OutputStream outputStream, Context context) {
        // IMPL.
        // Pass the request to either GET or POST depending on context.
    }
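The routing step hinted at in the handler comment could look roughly like the sketch below. It is not the presenter's implementation: it assumes the request arrives as an API Gateway proxy event carrying an `httpMethod` field, it leaves out the AWS `Context` parameter so that it runs with the JDK alone, and it pulls the method out with a regular expression where real code would use a JSON library. `EventRouter` and its response bodies are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EventRouter {
    // Crude extraction of the HTTP method; a JSON parser is the robust choice.
    private static final Pattern METHOD =
            Pattern.compile("\"httpMethod\"\\s*:\\s*\"([A-Z]+)\"");

    /** Reads the request stream, picks a branch by HTTP method, writes JSON. */
    static void handle(InputStream in, OutputStream out) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        String request = new String(buffer.toByteArray(), StandardCharsets.UTF_8);

        Matcher m = METHOD.matcher(request);
        String method = m.find() ? m.group(1) : "";
        String body;
        switch (method) {
            case "POST": body = "{\"action\":\"insert\"}"; break; // would run the INSERT
            case "GET":  body = "{\"action\":\"fetch\"}";  break; // would run the SELECT
            default:     body = "{\"error\":\"unsupported method\"}";
        }
        out.write(body.getBytes(StandardCharsets.UTF_8));
    }

    /** Convenience wrapper for trying the router with an in-memory request. */
    static String route(String requestJson) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            handle(new ByteArrayInputStream(requestJson.getBytes(StandardCharsets.UTF_8)), out);
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(route("{\"httpMethod\":\"GET\"}"));
    }
}
```

Keeping the stream handling in one place and the per-method work in separate branches mirrors the two API Gateway resources described earlier.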

The Challenges
Application scalability
λ warmup time
Reducing memory usage
Connection pooling
Dependency management
Execution environment limits

Scaling Requests
Solving scalability issues:
A load balancer can distribute requests, but it adds complexity and requires (rule-based) config. What if a backend changes?
If the app is modified, it requires re-deployment on many machines.
App connection pooling / coordination adds infrastructure overhead.
Cassandra apps don’t like load balancers.

Scaling Requests with λ
Lambda scales the app automatically
Re-deploy only one thing on app update
Scaling: the predetermined amount depends on the region
Account limits exist to prevent DDoS

Scaling Requests with λ
Configure concurrent executions: how many invocations of the function run in parallel?
Write good app code!
A stateless approach makes parallelism simpler
Avoid the read-before-write pattern!
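To answer "how many invocations in parallel?", a common rule of thumb is request rate multiplied by average execution time. The sketch below applies that estimate and compares it with a configured limit. The names and the example numbers are assumptions for illustration, not figures from the POC.

```java
final class ConcurrencyEstimate {
    /** Parallel executions needed to sustain a request rate (rate x average duration). */
    static long requiredConcurrency(double requestsPerSecond, double avgDurationMs) {
        return (long) Math.ceil(requestsPerSecond * avgDurationMs / 1000.0);
    }

    /** True when the estimate fits under the configured concurrent-execution limit. */
    static boolean fitsLimit(double requestsPerSecond, double avgDurationMs, long limit) {
        return requiredConcurrency(requestsPerSecond, avgDurationMs) <= limit;
    }

    public static void main(String[] args) {
        // Hypothetical load: 100 requests per second, 250 ms per invocation.
        System.out.println("Needed: " + requiredConcurrency(100, 250));
        System.out.println("Fits a limit of 20: " + fitsLimit(100, 250, 20));
    }
}
```

Shorter invocations lower the required concurrency for the same request rate, which is one more reason to keep the handler lean.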

Function Warmup Time
A cold start is when λ has to initialise resources (container, NIC, other resources) before it can execute a function.
Containers are torn down after 15 min of inactivity, so the next invocation after that is a cold start.
A function avoids cold starts if it is constantly running.

Function Warmup Time
[Figure: request response time (sec) and parallel requests (hundreds) over time (min)]

Function Warmup Time
Cheat: ping the λ every 5-10 mins. Create a rule in AWS as an event and schedule it to run every 10 min.
If you ping, you avoid container teardown.
Monitor container changes and how bad the swapping is. Do you need to act?
stackoverflow.com/questions/ /
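One way to monitor container changes while pinging is a static flag that is only true for the first invocation in a fresh container. The sketch below is an illustration under assumptions: the `"ping"` payload and the `WarmupAwareHandler` name are made up, and a plain `String` stands in for the real event type.

```java
final class WarmupAwareHandler {
    // Static state survives between invocations served by the same container,
    // so this is true only for the first call after a cold start.
    private static boolean coldStart = true;

    static String handle(String event) {
        boolean wasCold = coldStart;
        coldStart = false;

        if ("ping".equals(event)) {
            // Scheduled keep-warm call: report container state and skip real work.
            return wasCold ? "pong (cold start)" : "pong (warm)";
        }
        return "processed " + event;
    }

    public static void main(String[] args) {
        System.out.println(handle("ping"));
        System.out.println(handle("ping"));
    }
}
```

Logging the cold/warm result of each ping gives a cheap record of how often containers are being replaced.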

Reduce Memory Usage
512 MB by default: way too much for a simple C* client
CPU is proportional to the memory allocated to the app
Don’t reduce too much: benchmark whether your app is bottlenecked on CPU or memory

Connection Pool Management
Creating connections is expensive
Connection pooling allows reuse
λ is stateless and asynchronous in nature
Not allowed to save any state from one execution to another
No connection sharing: with asynchronous invocations, who will get the connection? Race conditions…

Connection Pool Management
Store session state outside of the handler function’s scope
Variables outside of the handler remain initialised across λ calls
    // Keep the client wrapper outside of the handleRequest function:
    // it stays initialised across λ invocations served by the same container.
    private CassandraClient client = new CassandraClient();

    public String handleRequest(Map<String, Object> input, Context context) {
        return "C* Version: " + client.getVersion();
    }
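The effect of keeping the client outside the handler can be demonstrated without a cluster. In this sketch `FakeClient` is a hypothetical stand-in for the slide's `CassandraClient` that counts how often it is constructed. It uses a static field where the slide uses an instance field; both live outside the handler method.

```java
final class ReusedClientHandler {
    /** Stand-in for an expensive-to-create Cassandra client (hypothetical). */
    static final class FakeClient {
        static int created = 0;

        FakeClient() {
            created++; // a real client would open connections here
        }

        String getVersion() {
            return "stub";
        }
    }

    // Created once when the class is loaded, then reused by every invocation
    // that lands on the same container.
    private static final FakeClient CLIENT = new FakeClient();

    static String handleRequest(String input) {
        return "C* Version: " + CLIENT.getVersion();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            handleRequest("request-" + i);
        }
        System.out.println("Clients created: " + FakeClient.created);
    }
}
```

Had the client been constructed inside `handleRequest`, the counter would grow with every call, which is the cost the slide warns about.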

Dependency Management
Lean dependencies: smaller app, faster deployment, less downtime
Silly problem to have… Don’t add dependencies unless you really need them. Apps can grow big very quickly; once you reach several hundred MB, deployment slows down.
pom.xml:
    <dependency>
        <groupId>io.symphonia</groupId>
        <artifactId>lambda-logging</artifactId>
        <version>1.0.1</version>
    </dependency>
JAR size by logger:
    Log4J       8.6 MB
    Symphonia   8.1 MB
    No logger   7.3 MB

Execution Environment Limits
Per λ invocation:
Limited to 512 MB of disk
3008 MB memory limit
Max timeout: 5 mins
Max response payload: 6 MB
Event payload: 128 KB
Large Cassandra partitions (groups of rows) are out
Large input is also out
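Because a large partition can exceed the response limit listed above, it can help to check the serialised size before returning. This is a minimal sketch: it takes the 6 MB figure from the slide, and treating it as exactly 6 × 1024 × 1024 bytes is an assumption. `PayloadGuard` and its error body are illustrative.

```java
import java.nio.charset.StandardCharsets;

final class PayloadGuard {
    // 6 MB response limit from the slide; the exact byte count is an assumption.
    static final int MAX_RESPONSE_BYTES = 6 * 1024 * 1024;

    /** True when the UTF-8 encoded payload fits within the response limit. */
    static boolean fitsResponseLimit(String json) {
        return json.getBytes(StandardCharsets.UTF_8).length <= MAX_RESPONSE_BYTES;
    }

    /** Returns the payload, or a short error document when it would exceed the limit. */
    static String guard(String json) {
        return fitsResponseLimit(json) ? json : "{\"error\":\"response too large\"}";
    }

    public static void main(String[] args) {
        System.out.println(guard("{\"id\":\"example\"}"));
    }
}
```

Paging the query, or narrowing it to a single row by id, avoids hitting the guard in the first place.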

POC Benchmark
Create a client to send out periodically increasing requests
Run for 7 min 30 sec
Review the Cassandra latency metric
45k requests max
7 min 3 sec duration
Mix of reads and writes

Outcome
Stable latency
No app intervention required
[Figure: latency 75th percentile (μs) and requests over time (sec)]
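For readers reproducing the latency chart from their own client-side samples, a 75th percentile can be computed with the nearest-rank method, as sketched below. This is not how Cassandra derives its own latency metric (that comes from internal histograms). The class name and sample values are made up.

```java
import java.util.Arrays;

final class LatencyPercentile {
    /** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
    static long percentile(long[] samplesMicros, double p) {
        if (samplesMicros.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samplesMicros.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] latencies = {400, 100, 300, 200}; // microseconds, illustrative only
        System.out.println("p75 = " + percentile(latencies, 75) + " μs");
    }
}
```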

Q & λ
