1 Mean Time to Innocence Your Dashboards are Green – but your end users are still complaining. Now What? Phil Stanhope October 2015.

Presentation transcript:

1 Mean Time to Innocence Your Dashboards are Green – but your end users are still complaining. Now What? Phil Stanhope October 2015

2 Some Numbers
● 30B real-time steering decisions per day
● 6B traceroute and RUM latency measurements per day
● That’s over 6 light years!
● 13 hops per traceroute
● Traffic covering 80% of ASNs on the internet seen every few minutes
● 52K ASNs monitored
● 200M BGP updates per day
● No major CDN can deliver 99.9% uptime from the end user’s perspective. But is it their fault?
● Real-time feeds
● Cooked time series data, near real time: pre-cooked across ~1000 dimensions every 5 minutes (Geography, Mobile Network, Fixed Line Networks, Target Markets Cities and IPSets)
● Outages & hijacks
● Pairwise comparisons
● Performance alarms
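To put the 99.9 uptime figure in perspective, here is a rough sketch of the downtime budget it implies. It assumes the figure is a percentage and uses a 30-day month; the function name is illustrative and not from the talk.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float = 30.0) -> float:
    """Minutes of downtime allowed over the period at the given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    # Compare a few common availability targets over an assumed 30-day month.
    for pct in (99.0, 99.9, 99.99):
        budget = downtime_budget_minutes(pct)
        print(f"{pct}% -> {budget:.1f} minutes per 30 days")
```

At 99.9% the budget is roughly 43 minutes per 30 days, so a handful of short regional degradations is enough to miss the target for the users affected by them.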

3 Business Impacting
● Major outages: major impact, rare
● Regional outages and degradations: variable impact, always happening
“We experienced an Internet connectivity issue with a provider outside of our network which affected traffic from some end-user networks.” (AWS)

4 What is Internet Intelligence?
● Consolidated view across your Internet infrastructure
● Determine the impact to cloud, CDN and hosting infrastructure globally
● Immediate time to information

5 How is it Done?
Leverage currently deployed Dyn assets:
● Global monitoring infrastructure
● Custom cloud monitoring infrastructure
● Real User Monitoring data
● Global routing infrastructure monitors

6 Global Monitoring Infrastructure

7 Reachability Markets

8 What is being Monitored?

9 Waterfalls & RUM – Where do you start?

10 Let’s Dive in – Some Context
Rather than focus on entire-page RUM and waterfalls, focus on what happens OUTSIDE of your normal span of control as a cloud, content & security consumer:
● Monitor the critical content servers (CDNs, both public and private)
● Monitor the cloud providers, DNS providers & core SaaS providers
● Give you the tooling to start answering mean time to innocence questions
Mean time to innocence questions:
● Is it a problem you have the ability to address? Not if it’s your cloud provider’s transit. Or the ISP’s recursive DNS.
● Is your CDN provider overloaded?
● Is there a more generalized congestion problem on the internet?
● Are the network paths to your users suboptimal, maybe even hijacked?
● Can you see a micro-outage?
● Can you see patterns with providers?
● Did a user come via a proxy gateway? Does the gateway fail to forward websockets?

11 Live Demo
NOTE: This is a fake URL – it won’t work for you. Sorry.
● A single web page that shows a combination of real-time and near-real-time forensic data
● Intentionally unbranded – what can you do with our datasets?
● Covers the internal APIs that we use – they are all becoming public. Talk to me!
● A common set of UX controls can be applied to a variety of real-time and batch data: GeoViews, Sunburst, Matrix & Long-Term Trending
● Under the covers: ReactJS, D3, GeoJSON/TopoJSON, jQuery, Go, Varnish, Nginx, WebSockets

12 Telemetry Data Cooking Pipeline
● Users: cover 80% of the ASNs on the internet every minute
● Relays: globally distributed network, handling 50K/sec per relay
● Probers: network of 300+ probers performing 10K traces/second AND synthetic DIG & HTTP[S]
● Gatherers: real-time geo annotation, data transformation & filtering, handling 100K events/sec
● Cookers: statistical analysis and aggregation services
● Outputs: geo-annotated real-time API, time-series analyzed API

13 [Diagram: Browser, Recursive, Authoritative, Injector & Beacon, Collect, Gatherer]
Injector & Beacon:
● GET - beacon = HMAC(secret, token)
● Javascript “injection” – just like injecting an advertisement into a page
● Writes a transparent iframe into the page
● Loading the iframe requires resolving beacon
● Guaranteed to cause recursive DNS cache miss
Log & analyze:
● HTTP: time, client_ip, beacon
● DNS: time, recursive_ip, beacon
Collect:
● GET - time, client_ip
● Dynamic HTML - containing customer resources to test
● Resources 1.. target origins tested
● Resource timing information sent to collector
● Per resource timing info
KEY: token = encode(cust_id, client_ip, time, nodeid, referer)
Time – 2 - Authoritative
Time – 2 – Recursive (inferred)
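The beacon scheme on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not the production implementation: the slide only gives beacon = HMAC(secret, token) and the token fields, so the pipe-joined encoding, the SHA-256 digest, the truncation to 32 hex characters and the tuple layout of the log records are all hypothetical choices.

```python
import hashlib
import hmac


def encode_token(cust_id: str, client_ip: str, ts: int, node_id: str, referer: str) -> bytes:
    # Hypothetical encoding: the slide does not say how the fields are packed.
    return "|".join([cust_id, client_ip, str(ts), node_id, referer]).encode("utf-8")


def make_beacon(secret: bytes, token: bytes) -> str:
    # beacon = HMAC(secret, token). SHA-256 and the 32-character truncation are
    # assumptions; truncating keeps the value within the 63-octet DNS label limit.
    return hmac.new(secret, token, hashlib.sha256).hexdigest()[:32]


def map_clients_to_recursives(http_log, dns_log):
    """Join HTTP and DNS log records on the beacon they share.

    http_log: iterable of (time, client_ip, beacon)
    dns_log:  iterable of (time, recursive_ip, beacon)
    Returns a dict of client_ip -> recursive_ip.
    """
    recursive_by_beacon = {beacon: recursive_ip for _, recursive_ip, beacon in dns_log}
    return {
        client_ip: recursive_by_beacon[beacon]
        for _, client_ip, beacon in http_log
        if beacon in recursive_by_beacon
    }


if __name__ == "__main__":
    token = encode_token("cust-1", "203.0.113.7", 1444000000, "node-a", "https://example.com/")
    beacon = make_beacon(b"demo-secret", token)
    http_log = [(1444000000.0, "203.0.113.7", beacon)]
    dns_log = [(1444000000.1, "198.51.100.53", beacon)]
    print(map_clients_to_recursives(http_log, dns_log))
```

Because each beacon is unique to one page view, the hostname built from it cannot already be in any resolver cache, which is why the lookup is guaranteed to reach the authoritative server and reveal the recursive resolver in use.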

14 5min, 1H & 1D Cooking – What’s going on in our Data Kitchen?
● MHD: raw MHD formatted data at one minute granularity
● Client IP STATS histograms: 5 minute timing histograms across 6 latency features
● DNS IP STATS histograms: 5 minute timing histograms across 6 latency features
● IP Maps: Client → Recursive, Recursive → Client
● Client IP Sets: typed label IP sets, latencies, country, city, continent, ASN
● DNS IP Sets: typed label IP sets, latencies, country, city, continent, ASN
● Correlation scores and ranks: daily by origin for every TYLIP feature
● All data is GEO redundant: gathering, raw, intermediates & aggregates
● Inputs: geo-annotated real-time API, Gatherers
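A minimal sketch of the 5 minute histogram step, assuming raw samples arrive as (timestamp, IP, feature, latency) tuples. The bucket edges, the sample layout and the feature names are hypothetical; the slide states only that histograms are built per IP at 5 minute granularity across 6 latency features.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 300  # 5 minute cooking window
# Assumed bucket upper edges in milliseconds; the real edges are not given.
BUCKET_EDGES_MS = (10, 25, 50, 100, 250, 500, 1000)


def bucket_index(latency_ms: float) -> int:
    """Index of the first bucket whose upper edge exceeds the latency."""
    for i, edge in enumerate(BUCKET_EDGES_MS):
        if latency_ms < edge:
            return i
    return len(BUCKET_EDGES_MS)  # overflow bucket


def cook(samples):
    """Aggregate raw samples into per-window, per-IP, per-feature histograms.

    samples: iterable of (unix_ts, ip, feature, latency_ms)
    Returns a dict of (window_start, ip, feature) -> Counter of bucket counts.
    """
    histograms = defaultdict(Counter)
    for ts, ip, feature, latency_ms in samples:
        window_start = int(ts) // WINDOW_SECONDS * WINDOW_SECONDS
        histograms[(window_start, ip, feature)][bucket_index(latency_ms)] += 1
    return histograms


if __name__ == "__main__":
    raw = [
        (1000, "203.0.113.7", "dns_ms", 5),
        (1100, "203.0.113.7", "dns_ms", 30),
        (1300, "203.0.113.7", "dns_ms", 2000),
    ]
    for key, hist in sorted(cook(raw).items()):
        print(key, dict(hist))
```

Fixed bucket edges keep the histograms mergeable, so 1H and 1D rollups can be produced by summing the 5 minute counters rather than reprocessing raw data.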

15 QUESTIONS?