The mPlane distributed measurement infrastructure: Overview, insights & hindsights. Journée du Conseil Scientifique de l’Afnic, #JCSA2015, July 9th 2015.


1 The mPlane distributed measurement infrastructure: Overview, insights & hindsights
Journée du Conseil Scientifique de l’Afnic, #JCSA2015, July 9th 2015
Grant Agreement n. 318627
Dario Rossi, Professor, dario.rossi@telecom-paristech.fr, http://www.telecom-paristech.fr/~drossi

2 Today's Internet
Sections: Motivations | Measurement | Architecture | Ecosystem | Example of use | Summary
– User viewpoint: Which ISP is the best in my area? Why is it not working?
– Researcher viewpoint: How does it work? How to do better?
– ISP viewpoint: How to optimize my network for users' apps? Where is traffic coming from?
– Cloud viewpoint: How to keep users happy & engaged?

3 Today's Internet
“The Internet is the first thing that humanity has built that humanity doesn't understand, the largest experiment in anarchy that we have ever had.” Eric Schmidt, ex Google Exec. Chairman

4 Internet measurement
– Shed light on the Internet's operational obscurity
– Lots of measurement tools… missing orchestration
– mPlane: avoid reinventing the wheel & assist in building automated pilots!

5 Two kinds of measurements
– Passive: observe network traffic without interference; similar to Aristotle’s observational method
– Active: perturb the network & measure its reaction; similar to Newton’s experimental method
Ἀριστοτέλης (@aristotle): “All science is either practical, poetical or theoretical” (Metaphysics)
Sir Isaac Newton (@newton): “Every body continues in its state of rest, or of uniform motion in a right line, unless it is compelled to change that state by forces impressed upon it” (Principia)

6 Passive measurements
– Deploy passive sensors
– Extract meaningful analytics
[Diagram labels: User traffic; Passive sensor; Raw (or half-cooked) data; Data collection; Post Process]

7 Active measurements
– Control active sensors
– Perform active measurements
– Extract meaningful analytics
[Diagram labels: Actuator; Active probes; Raw (or half-cooked) data; Data collection; Post Process]

8 Merging all together
– Integration with existing monitoring frameworks
– Active and passive analysis for iterative root-cause analysis
[Diagram labels: Active sensor; Passive sensor; Control; Data]

9 ~~~Big data hype~~~
Traffic measurements: orders of magnitude
– 40 Gbps (157 PB/yr) full-packet processing at each passive sensor, with up to O(10^7) traffic classifications per second…
– O(10^9) active probes in an Internet anycast census… (more on that later if time allows)
To be compared with
– The Large Hadron Collider (LHC) generates ~25 PB/yr and O(10^8) collisions per second
– Sloan Digital Sky Survey (SDSS) generates ~73 TB/yr
– Capacity of the human genome with 2-bit bases ~ O(10^9)
[Image from http://irevolution.net/category/big-data/] [courtesy of Sony Classical]
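
The 157 PB/yr figure can be checked with a few lines of arithmetic. The sketch below assumes a link saturated at 40 Gbps all year, decimal petabytes, and a 365-day year; the function name is illustrative only.

```python
# Back-of-the-envelope check of the yearly volume quoted on the slide.
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year


def yearly_volume_pb(link_gbps: float) -> float:
    """Yearly volume in decimal petabytes for a fully loaded link."""
    bytes_per_second = link_gbps * 1e9 / 8  # bits/s -> bytes/s
    return bytes_per_second * SECONDS_PER_YEAR / 1e15


sensor_pb = yearly_volume_pb(40)  # one passive sensor at 40 Gbps
lhc_pb = 25.0                     # LHC figure quoted on the slide
ratio = sensor_pb / lhc_pb

print(f"{sensor_pb:.1f} PB/yr per sensor, about {ratio:.1f}x the LHC figure")
```

Under these assumptions a single sensor sees roughly six times the yearly volume quoted for the LHC.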

10 mPlane architecture walkthrough
Note: simplified view, yet the devil hides in the details
[Diagram labels: Client/Reasoner; Supervisor (S); Probes; Repositories]

11 mPlane architecture: entities
– Client/Reasoner: orchestrate measurements; expert system (automatic pilot)
– Supervisor: access control; delegation; bootstrap
– Repositories: past sensor data; legacy databases; Big data processing
– Probes: sensor data; proxies for legacy tools/platforms

12 mPlane architecture: interfaces
– Native probe
– Proxy probe/repo (avoid reinventing the wheel!)
– Native repo
[Diagram labels: Client/Reasoner; Supervisor (S); Probes; Repositories]

13 mPlane architecture: messages
– Schema-centric definition, with a type registry
– Schema completely describes the measurement, JSON format
[Diagram labels: Client/Reasoner; Supervisor (S); Probe; Repository; messages C, S, R]

14 mPlane architecture: messages
Note: simple ping example, for illustration purposes

C (capability):
  { "capability": "measure",
    "version": 0,
    "registry": "http://ict-mplane.eu/reg",
    "label": "ping-aggregate",
    "when": "now.. future / 1s",
    "parameters": {"source.ip4": "137.3.1.1", "destination.ip4": "*"},
    "results": ["delay.twoway.icmp.us.min", "delay.twoway.icmp.us.mean", "delay.twoway.icmp.us.max", "delay.twoway.icmp.count"] }

S (specification):
  { "specification": "measure",
    "version": 0,
    "registry": "http://ict-mplane.eu/reg",
    "label": "ping-aggregate-experiment1",
    "when": "now + 30s / 1s",
    "token": "0f31c9033f8fce0c9be41d4944",
    "parameters": {"source.ip4": "137.3.1.1", "destination.ip4": "137.194.164.1"},
    "results": ["delay.twoway.icmp.us.min", "delay.twoway.icmp.us.mean", "delay.twoway.icmp.us.max", "delay.twoway.icmp.count"] }

R (result):
  { "specification": "measure",
    "version": 0,
    "registry": "http://ict-mplane.eu/reg",
    "label": "ping-aggregate-experiment1",
    "when": "2015-09-07 14:30:02.123 […] /1s",
    "token": "0f31c9033f8fce0c9be41d4944",
    "parameters": {"source.ip4": "137.3.1.1", "destination.ip4": "137.194.164.1"},
    "results": ["delay.twoway.icmp.us.min", "delay.twoway.icmp.us.mean", "delay.twoway.icmp.us.max", "delay.twoway.icmp.count"],
    "resultsvalue": [[2390, 2983, 6600, 30]] }

“*grin* If I'd known then that [ping] would be my most famous accomplishment in life, I might have worked on it another day or two and added some more options.” Mike Muuss, US Army Research Laboratory
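
The capability, specification and result exchange can be sketched in Python with plain dictionaries. This is not the mPlane SDK: `make_specification` and `attach_results` are hypothetical helpers written for this illustration, and the field names are copied from the slide as transcribed.

```python
import copy

# Capability as shown on the slide; "*" marks a parameter the client must fill.
capability = {
    "capability": "measure",
    "version": 0,
    "registry": "http://ict-mplane.eu/reg",
    "label": "ping-aggregate",
    "when": "now.. future / 1s",
    "parameters": {"source.ip4": "137.3.1.1", "destination.ip4": "*"},
    "results": ["delay.twoway.icmp.us.min", "delay.twoway.icmp.us.mean",
                "delay.twoway.icmp.us.max", "delay.twoway.icmp.count"],
}


def make_specification(cap, label, when, token, parameters):
    """Hypothetical helper: turn a capability into a concrete specification."""
    filled = dict(cap["parameters"])
    for name, value in parameters.items():
        if name not in filled:
            raise KeyError(f"unknown parameter: {name}")
        if filled[name] not in ("*", value):
            raise ValueError(f"parameter {name} is fixed by the capability")
        filled[name] = value
    if "*" in filled.values():
        raise ValueError("every wildcard parameter must be given a value")
    return {
        "specification": cap["capability"],
        "version": cap["version"],
        "registry": cap["registry"],
        "label": label,
        "when": when,
        "token": token,
        "parameters": filled,
        "results": list(cap["results"]),
    }


def attach_results(spec, rows):
    """Hypothetical helper: add one row of values per measurement.

    The slide's result also rewrites "when" to an absolute timestamp;
    that step is left out here.
    """
    for row in rows:
        if len(row) != len(spec["results"]):
            raise ValueError("each row needs one value per result column")
    result = copy.deepcopy(spec)
    result["resultsvalue"] = rows
    return result


spec = make_specification(
    capability,
    label="ping-aggregate-experiment1",
    when="now + 30s / 1s",
    token="0f31c9033f8fce0c9be41d4944",
    parameters={"destination.ip4": "137.194.164.1"},
)
result = attach_results(spec, [[2390, 2983, 6600, 30]])
```

The point of the schema-centric design shows up in `attach_results`: the `results` list alone is enough to validate the shape of the returned values.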

15 mPlane architecture: transport
mPlane over:
– HTTPS (HTTP over TLS)
– WSS (WebSockets over TLS)
– SSH (Secure Shell)
Indirect export uses custom protocols: IPFIX, FTP, BitTorrent, Apache Flume, RFC1149, RFC6214, etc.
[Diagram labels: Client/Reasoner; Supervisor (S); Probe; Repository]

16 mPlane architecture: workflow
1. Send high-level specification
2. Decompose specification
3. Send low-level specification
4. Measure
5. Indirect export of Big data
6. Analyze
7. Return results
8. Aggregate results
9. Return high-level results
10. Decide and iterate
[Diagram labels: Client/Reasoner; Supervisor (S); Probes; Repositories]

17 mPlane architecture: Inter-domain
[Diagram labels: Client/Reasoner; several Supervisors (S); Probes; Repositories]

18 The broader measurement ecosystem
“From global measurements to local management”
– 10 partners, 3.8 MEUR, FP7 STREP
– More focused use case: build a framework out of probes
– Knowledge sharing (e.g., joint work, Dagstuhl seminars, etc.)
IETF Large-Scale Measurement of Broadband Performance (LMAP)
– Defines the components, protocols, rules, etc., but does not specifically target adding “a brain” to the system
– Common core set, largely interoperable
IETF IP Performance Metrics (IPPM)
– Registry related; we use its vocabulary as much as possible
IETF IP Flow Information Export (IPFIX)
– Indirect export related; active contributors
Others in scope
– IETF DOCTORS; tcpm; ConEx; NETCONF; IRTF NMRG; ETSI STQ; ITU SG12

19 mPlane consortium
– Marco Mellia (POLITO)
– Saverio Nicolini (NEC)
– Dina Papagiannaki (Telefonica)
– Ernst Biersack (Eurecom)
– Brian Trammell (ETH)
– Tivadar Szemethy (NetVisor)
– Dario Rossi (ENST)
– Fabrizio Invernizzi (Telecom Italia)
– Guy Leduc (Univ. Liege)
– Pietro Michiardi (Eurecom)
– Pedro Casas (FTW)
– Andrea Fregosi (Fastweb)
16 partners: 3 ISPs, 6 research centers, 5 universities, 2 SMEs
FP7 IP: 11 MEUR, 3 years long, ends early 2016

20 mPlane achievements so far
Note: hard to summarize!
Standardization & dissemination
– 3 RFCs, 5 drafts
– 80+ papers, including 4 best paper awards and papers at ACM SIGCOMM IMC & CoNEXT, IEEE INFOCOM
Software
– https://www.ict-mplane.eu/public/software and https://github.com/fp7mplane/ (ready-to-use Virtual Machines under preparation)
– Check the demos on https://www.youtube.com/channel/UCHGS6UlUKvGZTyt5DemmPaw
“Biased sampling”
– Anycast geolocation as a showcase of mPlane achievements

21 IP Anycast
Anycast: set of equivalent replicated servers; traffic is routed to the closest replica (in the BGP sense)
Question: where are the Google DNS servers 8.8.8.8?
Commercial databases
– Mountain View, CA (IP2Location)
– New York, NY (Geobytes)
– United States (Maxmind)
1 ms ≈ 100 km, yet 2 ms from EU to US: speed of light violation!!
Distributed measurement
– Tools using distributed measurement aren’t better!
– Time-varying answers
– Unknown accuracy
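
The “1 ms ≈ 100 km” rule of thumb can be turned into a simple plausibility check. The sketch below takes the slide's conversion factor as given (light in fibre covers roughly 200 km per millisecond one way, hence about 100 km per millisecond of round-trip time); the 5,500 km EU-to-US distance is an assumed round figure for illustration, not a value from the talk.

```python
KM_PER_MS_RTT = 100.0  # slide's rule of thumb: 1 ms of round-trip time ~ 100 km


def max_distance_km(rtt_ms: float) -> float:
    """Upper bound on the distance to whatever answered within rtt_ms."""
    return rtt_ms * KM_PER_MS_RTT


def violates_speed_of_light(rtt_ms: float, claimed_distance_km: float) -> bool:
    """True if the claimed location is farther than the RTT allows."""
    return claimed_distance_km > max_distance_km(rtt_ms)


EU_TO_US_KM = 5500.0  # assumption: rough transatlantic distance

bound = max_distance_km(2.0)
impossible = violates_speed_of_light(2.0, EU_TO_US_KM)
print(f"2 ms RTT bounds the replica to within {bound:.0f} km")
print(f"US location consistent with the measurement: {not impossible}")
```

A database entry placing 8.8.8.8 in the US cannot hold for a European host that gets answers in 2 ms, so the answering replica must be a different, nearby instance.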

22 Geolocation technique
– Measure: latency, from PlanetLab/RIPE Atlas
– Detect: speed of light violations
– Enumerate: solve MIS; optimum (brute force) or 5-approx (greedy)
– Geolocate: classification, maximum likelihood
– Iterate: feedback
Side notes: Infrastructure?; Filter latency noise; Scalability; Lightweight
D. Cicalese, et al. “A fistful of Pings: Accurate and lightweight Anycast enumeration and Geolocation”, IEEE INFOCOM 2015

23 Measurement infrastructure: RIPE Atlas vs PlanetLab
Footprint     VPs     ASes    Countries
RIPE Atlas    7k      2k      150
PlanetLab     ~300    180     30
Anecdotal understanding (Microsoft IP/24 example)
– For some deployments, RIPE includes PlanetLab
– For others, the union is greater than the sum of the parts!
mPlane enables/simplifies systematic studies

24 Anycast census
– O(10^7) targets x O(10^2) active sensors
– O(10^3) targets / sensor / second
– O(10^3) targets are anycast: a needle in the IPv4 haystack
– Scalable, lightweight
http://www.telecom-paristech.fr/~drossi/anycast
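
These orders of magnitude combine into the census cost as follows. The sketch assumes each sensor probes each target exactly once and treats every O(·) as its leading power of ten, so the outputs are rough estimates only.

```python
# Orders of magnitude quoted on the slide, taken as plain powers of ten.
targets = 10**7           # targets in the census
sensors = 10**2           # active sensors
rate_per_sensor = 10**3   # targets probed per sensor per second
anycast_targets = 10**3   # targets found to be anycast

total_probes = targets * sensors                # assumption: one probe per pair
seconds_per_sensor = targets / rate_per_sensor  # sensors run in parallel
hours_per_sensor = seconds_per_sensor / 3600
anycast_share = anycast_targets / targets

print(f"total probes: {total_probes:.0e}")
print(f"census duration per sensor: {hours_per_sensor:.1f} h")
print(f"anycast share of targets: {anycast_share:.2%}")
```

The total matches the O(10^9) active probes quoted earlier in the talk, and under these assumptions a full census completes in a few hours per sensor.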

25 Summary
Overview
– Measurements to shed light on the Internet's operational obscurity
Insights
– Measurement plane to facilitate the expression of measurement capabilities and needs
– Allows users to concentrate their effort on hard problems
– Avoids time-consuming tasks
Hindsights
– Crucial to foster adoption (GitHub, ready-to-use VMs) to reach a critical mass (incentive in proxying)

26 Many thanks ?? || //

27 Backup slides

28 mPlane architecture: simplistic view
[Diagram labels: Reasoner; Client; Client proxy; Supervisor (S); mPlane API; Measurement components (aka Probes); Repository components]

29 mPlane architecture: gory details

30 Methodology overview
– Measure: latency, from PlanetLab/RIPE
– Detect: speed of light violations
– Enumerate: solve MIS; optimum (brute force) or 5-approx (greedy)
– Geolocate: classification, maximum likelihood
– Iterate: feedback

31 Measure
PlanetLab
– 300 vantage points
– Geolocated with Spotter (ok for unicast)
– Freedom in type/rate of measurement: ICMP, DNS, TCP 3-way delay, etc.
RIPE
– 6000 vantage points
– Geolocated with MaxMind (ok for unicast)
– More constrained (ICMP, traceroute)
In this talk: min over 10 ICMP samples

32 Detect
Packets cannot travel faster than the speed of light
The vantage points p and q refer to two different instances if: [equation missing from the transcript]
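
A minimal sketch of one way to express the detection test, under stated assumptions: each vantage point's RTT bounds the replica to a disc of radius RTT × 100 km (the rule of thumb from slide 21), and two vantage points must be reaching different instances when their discs do not intersect. The disjoint-disc criterion is assumed here because the equation did not survive in the transcript; the coordinates, RTTs and function names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0
KM_PER_MS_RTT = 100.0  # rule of thumb from the talk: 1 ms RTT ~ 100 km


def geodesic_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def different_instances(p, rtt_p_ms, q, rtt_q_ms):
    """True if no single replica can explain both RTTs (assumed criterion)."""
    radius_p = rtt_p_ms * KM_PER_MS_RTT
    radius_q = rtt_q_ms * KM_PER_MS_RTT
    return geodesic_km(p, q) > radius_p + radius_q


# Illustrative vantage points (approximate city coordinates).
paris = (48.8566, 2.3522)
frankfurt = (50.1109, 8.6821)
tokyo = (35.6762, 139.6503)

far_apart = different_instances(paris, 5.0, tokyo, 5.0)      # discs disjoint
nearby = different_instances(paris, 5.0, frankfurt, 4.0)     # discs overlap
print(far_apart, nearby)
```

Overlapping discs do not prove the two vantage points reach the same replica; they only fail to prove the opposite, which is why enumeration yields a lower bound on the number of replicas.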

33 Enumerate
Find a maximum independent set E of discs such that: [equation missing from the transcript]
– Brute force (optimum) vs greedily from smallest (5-approximation)
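
The greedy variant can be sketched as follows: sort discs by increasing radius and keep a disc only if it overlaps none of the discs already kept. This is an illustrative sketch, not the authors' iGreedy code; the `Disc` type, the overlap test and the sample values are assumptions.

```python
import math
from typing import List, NamedTuple

EARTH_RADIUS_KM = 6371.0


class Disc(NamedTuple):
    name: str         # vantage point label (illustrative)
    lat: float        # degrees
    lon: float        # degrees
    radius_km: float  # latency converted to distance


def geodesic_km(a: Disc, b: Disc) -> float:
    """Great-circle (haversine) distance between two disc centres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a.lat, a.lon, b.lat, b.lon))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def overlaps(a: Disc, b: Disc) -> bool:
    return geodesic_km(a, b) <= a.radius_km + b.radius_km


def greedy_independent_set(discs: List[Disc]) -> List[Disc]:
    """Greedy from smallest: keep each disc that overlaps no kept disc."""
    kept: List[Disc] = []
    for disc in sorted(discs, key=lambda d: d.radius_km):
        if not any(overlaps(disc, other) for other in kept):
            kept.append(disc)
    return kept


discs = [
    Disc("Frankfurt", 50.1109, 8.6821, 400.0),
    Disc("Paris", 48.8566, 2.3522, 300.0),
    Disc("Tokyo", 35.6762, 139.6503, 500.0),
    Disc("New York", 40.7128, -74.0060, 200.0),
]
replicas = greedy_independent_set(discs)
print([d.name for d in replicas])
```

Each kept disc must contain a distinct replica, so the size of the set is a lower bound on the number of anycast instances; small discs are tried first because they constrain the replica location most tightly.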

34 Geolocate
Classification task
– Map each disk Dp to the most likely city
– Compute the likelihood (p) of each city in the disk based on:
  ci: population of city i
  Ai: location of the IATA airport of city i
  d(x,y): geodesic distance
  α: city vs distance weighting
Rationale
– Users live in densely populated areas; to serve users, servers are placed close to cities
– Airports: simplify validation against ground truth (DNS)
Output policy
– Proportional: return all cities in Dp with their respective likelihoods
– Argmax: pick the city with the highest likelihood
Example: Frankfurt p=0.30, Zurich p=0.10, Munich p=0.60; argmax returns Munich
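
The two output policies can be illustrated with the likelihoods from the slide's example. The sketch takes the per-city likelihoods as given inputs and does not attempt to reproduce the likelihood formula itself; the function names are illustrative.

```python
from typing import Dict, List, Tuple

# Likelihoods for the cities inside one disk, as in the slide's example.
likelihoods: Dict[str, float] = {"Frankfurt": 0.30, "Zurich": 0.10, "Munich": 0.60}


def proportional_policy(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return every candidate city with its likelihood, most likely first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def argmax_policy(scores: Dict[str, float]) -> str:
    """Return only the single most likely city."""
    return max(scores, key=scores.get)


ranked = proportional_policy(likelihoods)
best = argmax_policy(likelihoods)
print(ranked)
print(best)
```

The proportional policy preserves uncertainty for downstream analysis, while argmax gives one answer per replica, which is what validation against ground truth needs.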

35 Iterate
– Collapse: geolocated disks to city area
– Rerun: enumeration on the modified input set
– Increases recall of anycast replica enumeration

36 Performance at a glance
– Protocol agnostic and lightweight
  Based on a handful of delay measurements
  O(100) VPs: 1000x fewer VPs than the state of the art
– Enumeration
  iGreedy uses 75% of the probes (25% discarded due to overlap)
  Overall 50% recall (depends on VPs; stratification is promising)
– Geolocation
  Correct geolocation for 78% of enumerated replicas
  361 km mean geolocation error for all enumerated replicas (271 km median for erroneous classification)

