Fluxo: Simple Service Compiler Emre Kıcıman, Ben Livshits, Madanlal Musuvathi {emrek, livshits,

Slides:

Advertisements

Similar presentations

Mark Simms Principal Program Manager Windows Azure Customer Advisory Team.

Advertisements

The Microsoft Cloud Azure Platform This presentation incorporates some content from Microsoft.

Scalable Content-aware Request Distribution in Cluster-based Network Servers Jianbin Wei 10/4/2001.

Approaches to EJB Replication. Overview J2EE architecture –EJB, components, services Replication –Clustering, container, application Conclusions –Advantages.

Business Continuity and DR, A Practical Implementation Mich Talebzadeh, Consultant, Deutsche Bank

2/11/2004 Internet Services Overview February 11, 2004.

Architecture, Deployment Diagrams, Web Modeling Elizabeth Bigelow CS-15499C October 6, 2000.

1 Design and implementation of a Routing Control Platform Matthew Caesar, Donald Caldwell, Nick Feamster, Jennifer Rexford, Aman Shaikh, Jacobus van der.

Wide-area cooperative storage with CFS

Supervisor: Hadi Salimi Abdollah Ebrahimi Mazandaran University Of Science & Technology January,

Client-Server Processing and Distributed Databases

Chapter 9 Overview  Reasons to monitor SQL Server  Performance Monitoring and Tuning  Tools for Monitoring SQL Server  Common Monitoring and Tuning.

Towards Autonomic Hosting of Multi-tier Internet Services Swaminathan Sivasubramanian, Guillaume Pierre and Maarten van Steen Vrije Universiteit, Amsterdam,

Google AppEngine. Google App Engine enables you to build and host web apps on the same systems that power Google applications. App Engine offers fast.

PNUTS: YAHOO!’S HOSTED DATA SERVING PLATFORM FENGLI ZHANG.

Web Application Architecture: multi-tier (2-tier, 3-tier) & mvc

Client/Server Architectures

Building Offline/Cache Mode Web Apps Using Sync Framework Mike Clark Group Manager Cloud Data Services Team

Distributed Data Stores – Facebook Presented by Ben Gooding University of Arkansas – April 21, 2015.

Windows Azure SQL Database and Storage Name Title Organization.

Word Wide Cache Distributed Caching for the Distributed Enterprise.

Cloud Computing for the Enterprise November 18th, This work is licensed under a Creative Commons.

SOFTWARE SYSTEMS DEVELOPMENT MAP-REDUCE, Hadoop, HBase.

 Anil Nori Distinguished Engineer Microsoft Corporation.

M i SMob i S Mob i Store - Mobile i nternet File Storage Platform Chetna Kaur.

Web Caching By Neeraj Agrawal. Caching Caching is widely used for improving performance in many context( e.g processor caches in hardware, buffer pool.

AZR308. Building distributed systems on an abstraction against commodity hardware at Internet scale, composed of multiple services. Distributed System.

2-Tier,3-Tier datawarehouse Submitted by Manisha Dubey & Akanksha Agrawal.

Distributed Information Systems. Motivation ● To understand the problems that Web services try to solve it is helpful to understand how distributed information.

Building micro-service based applications using Azure Service Fabric

Introduction to EJB. What is an EJB ?  An enterprise java bean is a server-side component that encapsulates the business logic of an application. By.

Features Of SQL Server 2000: 1. Internet Integration: SQL Server 2000 works with other products to form a stable and secure data store for internet and.

Presented by Vishy Grandhi.  Lesson 1: AX Overview  Lesson 2: Role based security  Lesson 3: Monitoring  Troubleshooting.

Features Scalability Manage Services Deliver Features Faster Create Business Value Availability Latency Lifecycle Data Integrity Portability.

Mick Badran Using Microsoft Service Fabric to build your next Solution with zero downtime – Lvl 300 CLD32 5.

Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.

Building Cloud Solutions Presenter Name Position or role Microsoft Azure.

Pinpoint: Problem Determination in Large, Dynamic Internet Services Mike Chen, Emre Kıcıman, Eugene Fratkin {emrek,

(re)-Architecting cloud applications on the windows Azure platform CLAEYS Kurt Technology Solution Professional Microsoft EMEA.

Cloud Computing from a Developer’s Perspective Shlomo Swidler CTO & Founder mydrifts.com 25 January 2009.

Andy Roberts Data Architect

Cloud Computing: Pay-per-Use for On-Demand Scalability Developing Cloud Computing Applications with Open Source Technologies Shlomo Swidler.

E-commerce Architecture Ayşe Başar Bener. Client Server Architecture E-commerce is based on client/ server architecture –Client processes requesting service.

Configuring SQL Server for a successful SharePoint Server Deployment Haaron Gonzalez Solution Architect & Consultant Microsoft MVP SharePoint Server

Cofax Scalability Document Version Scaling Cofax in General The scalability of Cofax is directly related to the system software, hardware and network.

Amazon Web Services. Amazon Web Services (AWS) - robust, scalable and affordable infrastructure for cloud computing. This session is about:

CalvinFS: Consistent WAN Replication and Scalable Metdata Management for Distributed File Systems Thomas Kao.

Connected Infrastructure

Chapter 9: The Client/Server Database Environment

Slide credits: Thomas Kao

Connected Maintenance Solution

The Client/Server Database Environment

Open Source distributed document DB for an enterprise

Connected Maintenance Solution

Cloud Computing Platform as a Service

The Client/Server Database Environment

Connected Infrastructure

CHAPTER 3 Architectures for Distributed Systems

Exploring Azure Event Grid

Service Fabric Patterns & Best Practices

Outline Virtualization Cloud Computing Microsoft Azure Platform

Building a Database on S3

Lecture 1: Multi-tier Architecture Overview

Web Application Architectures

Design pattern for cloud Application

Image Magick in the Cloud Scalable Image Processing Service

Web Application Architectures

Presentation transcript:

Fluxo: Simple Service Compiler Emre Kıcıman, Ben Livshits, Madanlal Musuvathi {emrek, livshits,

Architecting Internet Services Difficult challenges and requirements – 24x7 availability – Over 1000 request/sec CNN on election day: 276M page views Akamai on election day: 12M req/sec – Manage many terabytes or petabytes of data – Latency requirements <100ms

Flickr: Photo Sharing App Servers Databases Cal Henderson, “Scalable Web Architectures: Common Patterns and Approaches,” Web 2.0 Expo NYC $ $ Cache Images $ $ Cache Page Request Image Request

Common Architectural Patterns (In no particular order) Tiering: simplifies through separation Partitioning: aids scale-out Replication: redundancy and fail-over Data duplication & de-normalization: improve locality and perf for common-case queries Queue or batch long-running tasks

Everyone does it differently! Many caching schemes – Client-side, front-end, backend, step-aside, CDN Many partitioning techniques – Partition based on range, hash, lookup Data de-normalization and duplication – Secondary indices, materialized view, or multiple copies Tiering – 3-tier (presentation/app-logic/database) – 3-tier (app-layer / cache / db) – 2-tier (app-layer / db)

Flickr: Photo Sharing App Servers Databases Cal Henderson, “Scalable Web Architectures: Common Patterns and Approaches,” Web 2.0 Expo NYC $ $ Cache Images $ $ Cache Page Request Image Request Different caching schemes!

Flickr: Photo Sharing App Servers Databases Cal Henderson, “Scalable Web Architectures: Common Patterns and Approaches,” Web 2.0 Expo NYC $ $ Cache Images $ $ Cache Page Request Image Request Different partitioning and replication schemes!

Differences for good reason Choices depend on many things Component performance and resource requirements Workload distribution Persistent data distribution Read/write rates Intermediate data sizes Consistency requirements

Differences for good reason Choices depend on many things Component performance and resource requirements Workload distribution Persistent data distribution Read/write rates Intermediate data sizes Consistency requirements These are all measurable in real systems!

Differences for good reason Choices depend on many things Component performance and resource requirements Workload distribution Persistent data distribution Read/write rates Intermediate data sizes Consistency requirements These are all measurable in real systems! Except this one!

F LUXO Goal: Separate service’s logical programming from necessary architectural choices E.g., Caching, partitioning, replication, … Techniques: 1.Restricted programming model Coarse-grained dataflow with annotations 2.Runtime request tracing Resource usage, performance and workload distributions 3.Analyze runtime behavior -> determine best choice Simulations, numerical or queuing models, heuristics…

Architecture Dataflow Program + Annotations Dataflow Program + Annotations F LUXO Compiler Environment Info Environment Info Runtime Profile Runtime Profile Analysis Module Analysis Module Analysis Module Analysis Module Analysis Module Analysis Module Program Transform Program Transform Thin Execution Layer Thin Execution Layer Deployable Program

Dataflow Program CloudDB:: Messages CloudDB:: Messages CloudDB:: Friends CloudDB:: Friends CloudDB:: Messages CloudDB:: Messages UserID Merge message lists Merge message lists List html Restrictions All components are idempotent No internal state State update restrictions Restrictions All components are idempotent No internal state State update restrictions

What do We Annotate? CloudDB:: Messages CloudDB:: Messages CloudDB:: Friends CloudDB:: Friends CloudDB:: Messages CloudDB:: Messages UserID Merge message lists Merge message lists List html Volatile Annotate Semantics Consistency requirements (No strong consistency) Side-effects Annotate Semantics Consistency requirements (No strong consistency) Side-effects

What do We Measure? CloudDB:: Messages CloudDB:: Messages CloudDB:: Friends CloudDB:: Friends CloudDB:: Messages CloudDB:: Messages UserID Merge message lists Merge message lists List html On every edge Data content/hash Data size Component performance and resource profiles Queue info On every edge Data content/hash Data size Component performance and resource profiles Queue info

How do we transform? Caching CloudDB:: Friends CloudDB:: Friends Messages Cache Messages Cache Messages Cache Messages Cache Pick First

How do we transform? Caching Messages Cache Messages Cache Messages Cache Messages Cache Pick First

So, where do we put a cache? CloudDB:: Messages CloudDB:: Messages CloudDB:: Friends CloudDB:: Friends CloudDB:: Messages CloudDB:: Messages UserID Merge message lists Merge message lists List html Volatile 1.Analyze Dataflow: Identify subgraphs with single input, single output 2. Check Annotations: Subgraphs should not contain nodes with side-effects; or volatile 3. Analyze measurements Data size -> what fits in cache size? Content hash -> expected hit rate Subgraph perf -> expected benefit 1.Analyze Dataflow: Identify subgraphs with single input, single output 2. Check Annotations: Subgraphs should not contain nodes with side-effects; or volatile 3. Analyze measurements Data size -> what fits in cache size? Content hash -> expected hit rate Subgraph perf -> expected benefit

Related Work MapReduce/Dryad – separates app from scalability/reliability architecture but only for batch WaveScope – uses dataflow and profiling for partitioning computation in sensor network J2EE – provides implementation of common patterns but developer still requires detailed knowledge SEDA – event driven system separates app from resource controllers

Conclusion Q: Can we automate architectural decisions? Open Challenges: – Ensuring correctness of transformations – Improving analysis techniques Current Status: In implementation – Experimenting with programming model restrictions and transformations If successful would enable easier development and improve agility

Extra Slides

Utility Computing Infrastructure On-demand compute and storage – Machines no longer bottleneck to scalability Spectrum of APIs and choices – Amazon EC2, Microsoft Azure, Google AppEngine Developer figures out how to use resources effectively – Though, AppEngine and Azure restrict programming model to reduce potential problems

Flickr: Photo Sharing App ServerDatabase Web ServerImages Cal Henderson, “Scalable Web Architectures: Common Patterns and Approaches,” Web 2.0 Expo NYC High- Level

Fault Model Best-effort execution layer provides machines – On failure, new machine is allocated Deployed program must have redundancy to work through failures Responsibility of Fluxo compiler

Storage Model Store data in an “external” store – S3, Azure, Sql Data Services – may be persistent, session, soft, etc. Data written as delta-update – Try to make reconciliation after partition easier Writes have deterministic ID for idempotency

Getting our feet wet… Built toy application: Weather service – Read-only service operating on volatile data Run application on workload traces from Popfly – Capture performance and intermediate workload distributions Built cache placement optimizer – Replays traces in simulator to test a cache placement – Simulated annealing to explore the space of choices

Caching choices vary by workload 31% 4% 65% 13% 52% 13% 62% 22% 9%

Example #2: Pre/post compute