Just-In-Time Scalability: Agile Methods to Support Massive Growth.

What is IMVU?

Behind the scenes...
IMVU is LAMP, plus...
Perlbal, Memcached, Solr, MogileFS
plus...
BuildBot, eAccelerator, Linux (Debian), memcached, Nagios, Perl, Roundup, rrd, Subversion, ADODB, b2evolution, Coppermine, feed2js, FreeTag, Incutio XML-RPC, jrcache, JSON-PHP, Magpie, osCommerce, phpBB, Phorum, SimpleTest, Selenium, Audiere, Boost, Cal3D, CFL, NSIS, Pixomatic, Python, pywin32, SCons, wxPython

Before and After Architecture
Before: We started with a small site, a mess of open source, and a small team that didn't know much about scaling.
After: We ended with a large site, a medium-sized team, and an architecture that has scaled.
We never stopped. We used a roadmap and a compass, made weekly changes in direction, regularly shipped code on Wednesday to handle the next weekend's capacity constraints, and shipped new features the whole time.

Before and After Architecture (1/4) November

Before and After Architecture (2/4) December

Before and After Architecture (3/4) February

Before and After Architecture (4/4) May

Advanced planning vs. fast response
“Driving”
- Continuously figure out what is going to go wrong soon
- Quickly fix it, without breaking something else
- Get feedback along the way
“Rocket ship”
- Figure out in advance what is going to go wrong
- Build a plan that prevents those things from happening
- Execute your plan
- Get feedback when done

Questions to ask
“Driving”
- How do you know you will be able to fix the problem in time?
- How can you be sure you won't cause collateral damage?
- How can you be sure you won't code yourself into a corner?
“Rocket ship”
- Are you sure you know what is going to happen?
- Are you sure you can execute?
- Can you afford it?
- Do you need feedback?

Continuous Ship
- Deploy new software quickly
  - At IMVU, time from check-in to production = 20 minutes
- Tell a good change from a bad change (quickly)
- Revert a bad change quickly
- Work in small batches
  - At IMVU, a large batch = 3 days' worth of work
  - Break large projects down into small batches
- Don't have the same problem twice: fix the root cause of each class of problems
- IMVU pushes code to production … times every day

Cluster Immune System
What it looks like to ship one piece of code to production:
- Run tests locally (SimpleTest, Selenium)
  - Everyone has a complete sandbox
- Continuous Integration Server (BuildBot)
  - All tests must pass or “shut down the line”
  - Automatic feedback if the team is going too fast
- Incremental deploy
  - Monitor cluster and business metrics in real-time
  - Reject changes that move metrics out-of-bounds
- Alerting & Predictive monitoring (Nagios)
  - Monitor all metrics that stakeholders care about
  - If any metric goes out-of-bounds, wake somebody up
  - Use historical trends to predict acceptable bounds
- When customers see a failure:
  - Fix the problem for customers
  - Improve your defenses at each level
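The "reject changes that move metrics out-of-bounds" and "use historical trends to predict acceptable bounds" steps can be sketched as follows. This is a minimal illustration, not IMVU's implementation: the function names, the metric names, and the "mean plus or minus three standard deviations" rule are all assumptions made for the example.

```php
<?php
// Hypothetical helper: derive acceptable bounds for one metric from its
// history as mean +/- $k standard deviations (an assumed rule).
function acceptable_bounds(array $history, float $k = 3.0): array
{
    $n = count($history);
    $mean = array_sum($history) / $n;
    $variance = 0.0;
    foreach ($history as $value) {
        $variance += ($value - $mean) ** 2;
    }
    $stddev = sqrt($variance / $n);
    return [$mean - $k * $stddev, $mean + $k * $stddev];
}

// Return the names of metrics whose current value falls outside its bounds.
function out_of_bounds_metrics(array $current, array $histories, float $k = 3.0): array
{
    $violations = [];
    foreach ($current as $name => $value) {
        [$low, $high] = acceptable_bounds($histories[$name], $k);
        if ($value < $low || $value > $high) {
            $violations[] = $name;
        }
    }
    return $violations;
}

// Decide whether an incremental deploy may continue or should be reverted.
function deploy_decision(array $current, array $histories): string
{
    return out_of_bounds_metrics($current, $histories) === [] ? 'continue' : 'revert';
}

// Made-up sample data: one cluster metric and one business metric.
$histories = [
    'error_rate'      => [0.9, 1.1, 1.0, 1.0, 1.0],
    'signups_per_min' => [48, 52, 50, 51, 49],
];
echo deploy_decision(['error_rate' => 1.05, 'signups_per_min' => 50], $histories), "\n";
echo deploy_decision(['error_rate' => 4.0, 'signups_per_min' => 50], $histories), "\n";
```

The same check could run after each step of an incremental deploy, so a bad change is caught while it is live on only part of the cluster.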

Case Study: Sharding
Problem: Spread write queries across multiple databases
Solution: Intercept and redirect queries based on SQL comments
- Move one table or sub-system at a time
- In our experience, one engineer horizontally partitions one table or small sub-system in one week
- New engineers figure this out in about 5 minutes

    db_query("INSERT INTO inventory (customers_id, products_id) VALUES ($customer_id, $product_id)");
    db_query("/*shard customer://$customer_id */ INSERT INTO inventory (customers_id, products_id) VALUES ($customer_id, $product_id)");

Learning: cross-shard joins & transactions aren’t required
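One way the interception step might work is sketched below. The slides do not show how `db_query` routes a query, so everything here is an assumption: the helper names `shard_key_from_query` and `pick_database`, the database names, and the modulo placement rule (a real system might use a lookup directory instead).

```php
<?php
// Hypothetical: extract the customer id from a /*shard customer://ID */ comment.
// Returns null when the query carries no shard comment.
function shard_key_from_query(string $sql): ?int
{
    if (preg_match('~/\*\s*shard\s+customer://(\d+)\s*\*/~', $sql, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Assumed placement rule: customer id modulo the number of shards.
// Queries without a shard comment go to the default (unsharded) database.
function pick_database(string $sql, array $shards, string $default): string
{
    $key = shard_key_from_query($sql);
    if ($key === null) {
        return $default;
    }
    return $shards[$key % count($shards)];
}

// Made-up database names for the example.
$shards = ['db-shard-0', 'db-shard-1', 'db-shard-2'];
$customer_id = 42;
$product_id = 7;

$sharded = "/*shard customer://$customer_id */ INSERT INTO inventory (customers_id, products_id) VALUES ($customer_id, $product_id)";
$plain   = "INSERT INTO inventory (customers_id, products_id) VALUES ($customer_id, $product_id)";

echo pick_database($sharded, $shards, 'db-main'), "\n";
echo pick_database($plain, $shards, 'db-main'), "\n";
```

Because routing depends only on the comment, unannotated queries keep hitting the original database, which fits moving one table or sub-system at a time.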

Case Study: Caching
Problem: Cache frequently read data in memcached
Solution: Intercept and cache queries based on SQL comments

    db_query_cache(BUDDY_CACHE_TIME, "/*shard customer://$customer_id */ /*cache-class customer://$customer_id/buddies */ SELECT friend_id, buddy_order FROM customers_friends WHERE customers_id=$customer_id");
    db_query("/*shard customer://$customer_id */ DELETE FROM customers_friends WHERE customers_id = $customer_id AND friend_id = $friend_id");
    db_flush_cacheclass("customer://$customer_id/buddies");

Learning: Flushing the cache is critical to users and performance
- When a customer spends $24.95, they want the benefits immediately
Learning: Test the cache behavior for critical systems
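The cache-class idea can be sketched with an in-memory array standing in for memcached. This is an illustration under stated assumptions, not the code behind `db_query_cache` or `db_flush_cacheclass`: the `QueryCache` class and its methods are hypothetical, and expiry (the `BUDDY_CACHE_TIME` argument) is left out to keep the sketch short.

```php
<?php
// In-memory stand-in for memcached, for illustration only.
final class QueryCache
{
    private array $entries = [];      // cache key => cached rows
    private array $keysByClass = [];  // cache class => list of cache keys

    // Hypothetical: pull the class name out of a /*cache-class ... */ comment.
    public static function cacheClass(string $sql): ?string
    {
        return preg_match('~/\*\s*cache-class\s+(\S+)\s*\*/~', $sql, $m) ? $m[1] : null;
    }

    // Return cached rows for $sql, or run $fetch once and remember its result.
    public function query(string $sql, callable $fetch): array
    {
        $key = md5($sql);
        if (!array_key_exists($key, $this->entries)) {
            $this->entries[$key] = $fetch($sql);
            $class = self::cacheClass($sql);
            if ($class !== null) {
                $this->keysByClass[$class][] = $key;
            }
        }
        return $this->entries[$key];
    }

    // Drop every cached result tagged with $class.
    public function flushClass(string $class): void
    {
        foreach ($this->keysByClass[$class] ?? [] as $key) {
            unset($this->entries[$key]);
        }
        unset($this->keysByClass[$class]);
    }
}

// Fake database call that counts how often it is actually hit.
$calls = 0;
$fetch = function (string $sql) use (&$calls): array {
    $calls++;
    return [['friend_id' => 7, 'buddy_order' => 1]];
};

$cache = new QueryCache();
$sql = "/*shard customer://42 */ /*cache-class customer://42/buddies */ SELECT friend_id, buddy_order FROM customers_friends WHERE customers_id=42";

$cache->query($sql, $fetch);                  // miss: hits the database
$cache->query($sql, $fetch);                  // hit: served from the cache
$cache->flushClass('customer://42/buddies');  // e.g. after the DELETE above
$cache->query($sql, $fetch);                  // miss again: fresh data
echo $calls, "\n";
```

Counting database calls, as the fake `$fetch` does here, is one simple way to test cache behavior for a critical code path.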

Case Study: Steering Data Design
Problem: Improve database schemas and data design to meet scalability requirements without downtime
Solution:
- Measure to find the real problems (harder than it sounds)
- Migrate to new design that takes advantage of sharding and/or caching

Case Study: Steering Data Design

Problem: You can’t bulk move large frequently accessed data
Solution:
- Copy on read
  - Use when you are read bound
  - Reads check cache, new location, and copy to new location if missing
  - Writes go to new location if data has been migrated, otherwise old
- Copy on write
  - Use when you are write bound
  - Reads check cache, new location, then old location
  - Writes go to new location, copying to new location if missing
- Copy all
  - Use when file system fills up
  - Reads & writes go to new location, falling back to old location if missing
  - Cron copies data a few records at a time
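The copy-on-read variant can be sketched as below, with plain arrays standing in for the cache and the two storage locations. The `CopyOnReadStore` class is hypothetical, and details the slides leave open are assumptions: the old copy is left in place after migration, and a write invalidates the cached value.

```php
<?php
// Copy-on-read migration sketch; arrays stand in for the cache and both stores.
final class CopyOnReadStore
{
    public array $cache = [];
    public array $old;
    public array $new = [];

    public function __construct(array $old)
    {
        $this->old = $old;
    }

    // Reads check the cache, then the new location, copying from old if missing.
    public function read(string $key)
    {
        if (array_key_exists($key, $this->cache)) {
            return $this->cache[$key];
        }
        if (!array_key_exists($key, $this->new) && array_key_exists($key, $this->old)) {
            $this->new[$key] = $this->old[$key]; // migrate on first read
        }
        $value = $this->new[$key] ?? null;
        if ($value !== null) {
            $this->cache[$key] = $value;
        }
        return $value;
    }

    // Writes go to the new location if the record has migrated, otherwise old.
    public function write(string $key, $value): void
    {
        if (array_key_exists($key, $this->new)) {
            $this->new[$key] = $value;
        } else {
            $this->old[$key] = $value;
        }
        unset($this->cache[$key]); // assumed: avoid serving a stale cached value
    }
}

$store = new CopyOnReadStore(['a' => 'v1', 'b' => 'v1']);
$store->write('a', 'v2');        // 'a' not migrated yet: goes to the old location
echo $store->read('a'), "\n";    // first read copies 'a' to the new location
$store->write('a', 'v3');        // 'a' has migrated: goes to the new location
echo $store->read('a'), "\n";
```

Records that are never read stay in the old location under this scheme, which is why the slides pair it with copy on write and a cron-driven "copy all" for other load patterns.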

“Thank You for Listening!”