Fast Data at Massive Scale: Lessons Learned at Facebook
Bobby Johnson



Me
Director of Engineering
- Scaling and Performance
- Site Security
- Site Reliability
- Distributed Systems
- Development Tools
- Customer Service Tools
Took Facebook from 7M users to 120M.

Architecture
- Load Balancer (assigns a web server)
- Web Server (PHP assembles data)
- Memcache (fast)
- Database (slow, persistent)
- Other services: Search, Feed, etc. (ignore for now)

- 1/2 the time is in PHP
- 1/4 is in memcache
- 1/8 is in database

One year ago, almost half the time was memcache

Network Incast
[Diagram: PHP Client, Switch, memcache]
Many small get requests

Network Incast
[Diagram: PHP Client, Switch, memcache]
Many big data packets

Clustering
[Diagram: PHP Client, memcache, 10 objects]
- 1 round trip for 10 objects

Clustering
[Diagram: PHP Client, memcache, 5 objects]
- 2 round trips total
- 1 round trip per server
- longest request is 5

Clustering
[Diagram: PHP Client, memcache servers with 3 objects, 4 objects, and 3 objects]
- 3 round trips total
- 1 round trip per server
- longest request is 4

Clustering
- If objects are small, round trips dominate, so you want objects clustered.
- If objects are large, transfer time dominates, so you want objects distributed.
- In a web application you will almost always be dealing with small objects.
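
The round-trip arithmetic on the clustering slides can be sketched as a toy cost model. This is only an illustration: the function name `fetch_cost`, the even spread of objects, the assumption that all per-server requests go out in parallel, and the microsecond figures are assumptions, not numbers from the talk.

```python
from math import ceil


def fetch_cost(num_objects, num_servers, rtt_us, transfer_us_per_object):
    """Return (round_trips, longest_request, latency_us) for one parallel fetch.

    Assumes objects are spread as evenly as possible across servers and that
    the requests to all servers are issued in parallel, so latency is set by
    the server that has to return the most objects.
    """
    round_trips = min(num_servers, num_objects)        # one request per server touched
    longest_request = ceil(num_objects / num_servers)  # objects on the busiest server
    latency_us = rtt_us + longest_request * transfer_us_per_object
    return round_trips, longest_request, latency_us


if __name__ == "__main__":
    # Hypothetical numbers: 500 us round trip, 10 us or 1000 us to move one object.
    for label, transfer_us in (("small objects", 10), ("large objects", 1000)):
        for servers in (1, 2, 3):
            trips, longest, latency = fetch_cost(10, servers, 500, transfer_us)
            print(f"{label}: {servers} server(s) -> {trips} round trip(s), "
                  f"longest request {longest}, ~{latency} us")
```

With these assumed numbers, spreading 10 small objects over 3 servers cuts latency only from 600 us to 540 us while tripling the round trips, whereas for large objects it drops from 10500 us to 4500 us. That is the shape of the tradeoff the slide describes.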

Caching
- Basic tools are parallelism and clustering
- Clustering is a latency/throughput tradeoff
- Application code must be aware
- Networking is a burst problem
- Dropped packets kill you
- TCP quick ack
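
One way to picture "parallelism and clustering" together is a client that groups keys by server, sends one batched request per server, and issues those requests in parallel. The sketch below is not memcache client code: `FakeCacheServer`, `server_for`, and `parallel_get` are hypothetical names, the servers are in-memory dictionaries, and the CRC32-modulo placement is just an assumed key-to-server mapping.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class FakeCacheServer:
    """In-memory stand-in for one cache server; counts round trips."""

    def __init__(self):
        self.data = {}
        self.round_trips = 0

    def get_multi(self, keys):
        self.round_trips += 1  # one request, however many keys it carries
        return {k: self.data[k] for k in keys if k in self.data}


def server_for(key, num_servers):
    # Stable hash so a key always maps to the same server (assumed scheme).
    return zlib.crc32(key.encode("utf-8")) % num_servers


def parallel_get(servers, keys):
    """Fetch keys with one batched request per server, issued in parallel."""
    by_server = defaultdict(list)
    for key in keys:
        by_server[server_for(key, len(servers))].append(key)

    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(by_server))) as pool:
        futures = [pool.submit(servers[i].get_multi, ks)
                   for i, ks in by_server.items()]
        for future in futures:
            results.update(future.result())
    return results


if __name__ == "__main__":
    servers = [FakeCacheServer() for _ in range(3)]
    for n in range(10):
        key = f"user:{n}"
        servers[server_for(key, 3)].data[key] = n
    found = parallel_get(servers, [f"user:{n}" for n in range(10)])
    print(len(found), "objects in",
          sum(s.round_trips for s in servers), "round trips")
```

The application has to know which keys it needs up front to batch them this way, which is one reading of "application code must be aware".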

PHP CPU

Application Improvements

know what your libraries do

$results = get_search_results( $needle );
foreach ( $results as $result ) {
    if ( is_pending_friend( $result['id'] ) ) {
        // we'll change the links based on this
        $result['pending'] = true;
    }
}

know what your libraries do

function is_pending_friend( $id ) {
    // this is short-lived, so don't cache
    expensive_db_query( $id …)
}
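
The two slides above show an innocent-looking loop that runs one uncached database query per search result. The slides do not show a fix; one common remedy, sketched here as an assumption, is to ask for the whole batch in a single query. `FakeDb`, `pending_friends_among`, and both `mark_pending_*` functions are hypothetical names used only to make the query count visible.

```python
class FakeDb:
    """Stand-in for a database; counts queries so the cost is visible."""

    def __init__(self, pending_ids):
        self.pending_ids = set(pending_ids)
        self.queries = 0

    def is_pending_friend(self, user_id):
        self.queries += 1  # one query per call, as in the slide's helper
        return user_id in self.pending_ids

    def pending_friends_among(self, user_ids):
        self.queries += 1  # one query for the whole batch
        return self.pending_ids & set(user_ids)


def mark_pending_one_by_one(db, results):
    # Mirrors the loop on the slide: N results means N queries.
    for result in results:
        if db.is_pending_friend(result["id"]):
            result["pending"] = True
    return results


def mark_pending_batched(db, results):
    # Same outcome with a single batched lookup.
    pending = db.pending_friends_among(r["id"] for r in results)
    for result in results:
        if result["id"] in pending:
            result["pending"] = True
    return results


if __name__ == "__main__":
    slow_db, fast_db = FakeDb({3, 7}), FakeDb({3, 7})
    mark_pending_one_by_one(slow_db, [{"id": n} for n in range(50)])
    mark_pending_batched(fast_db, [{"id": n} for n in range(50)])
    print("one by one:", slow_db.queries, "queries; batched:", fast_db.queries)
```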

Databases
- Tend to be slower than lighter-weight alternatives, so avoid using them
- If you do use them, partition them right from the start
- If a query is _really_ slow, like a few seconds or a few minutes, you probably have a bug where you're scanning a table
- The db should have a command to tell you what index it's using for a query, and how many rows it's examining
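
As an illustration of the last point, the sketch below asks a database how it plans to run a query, before and after adding an index. It uses SQLite's `EXPLAIN QUERY PLAN` through Python's standard `sqlite3` module purely because it is self-contained; the talk does not say which command was used. The `users` table, the `idx_users_email` index, and the `query_plan` helper are made up for the example. SQLite's plan names the index but not a row count; MySQL's `EXPLAIN` reports both, in its `key` and `rows` columns.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the plan descriptions SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # last column is the readable detail


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{n}@example.com", f"User {n}") for n in range(1000)],
)

sql = "SELECT * FROM users WHERE email = ?"
before = query_plan(conn, sql, ("user42@example.com",))
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = query_plan(conn, sql, ("user42@example.com",))

print("without index:", before)  # the plan should report a table scan
print("with index:   ", after)   # the plan should name idx_users_email
```

The exact wording of the plan text varies between SQLite versions, so it is best read by eye or matched loosely (for example on the index name) rather than compared verbatim.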

General Lessons
- Your best tool is parallelism
- Look at your data
- Build tools to look at your data
- Don't make assumptions about what components are doing
- Algorithmic and system improvements are almost always better than micro-optimization
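
A very small example of "build tools to look at your data" is a timer that records how much of a request is spent in each component, which is the kind of measurement behind a breakdown like "1/2 in PHP, 1/4 in memcache". This is a sketch only: `TimeBreakdown` and its methods are hypothetical, not a tool described in the talk, and nested `measure` blocks would double-count time.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class TimeBreakdown:
    """Accumulates wall-clock time per named component."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock  # injectable so the tool itself can be tested
        self.totals = defaultdict(float)

    @contextmanager
    def measure(self, component):
        start = self.clock()
        try:
            yield
        finally:
            # Record the time even if the measured code raises.
            self.totals[component] += self.clock() - start

    def fractions(self):
        """Return each component's share of the total measured time."""
        total = sum(self.totals.values())
        if total == 0:
            return {}
        return {name: spent / total for name, spent in self.totals.items()}


if __name__ == "__main__":
    breakdown = TimeBreakdown()
    with breakdown.measure("php"):
        time.sleep(0.02)
    with breakdown.measure("memcache"):
        time.sleep(0.01)
    for name, share in breakdown.fractions().items():
        print(f"{name}: {share:.0%}")
```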