RandPing: A Randomized Algorithm for IP Mapping

Michelle Liu, Yuhan Cai
11/16/2018

Outline
- Introduction
- Related Work
- Background
- Algorithm Overview
- Experimental Evaluation
- Conclusions and Future Work

Introduction
- Motivations
  - Collection of personalized information
  - Authorization of transactions
- Problem statement
  - Given an IP address p, IP mapping is the problem of finding the geographic location of the Internet host with address p.
- Challenges
  - No authoritative database
  - IP addresses do not contain geographic information

Related Work
- DNS-based approach
  - Using DNS records from databases
  - IP2LL, NetGeo, and GeoTrack
  - DNS records might not be related to locations
- Delay-based approach
  - Exploiting the relationship between distances and network delays
  - GeoPing and CBG
- Clustering-based approach
  - Splitting the IP address space into clusters
  - Assumption: all hosts within the same cluster are co-located

Background
- Best line bound
  - Above the baseline
  - Below all data points
  - Closest to all data points
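The three best-line conditions above can be read as a small two-variable linear program. The sketch below is one way to solve it, not necessarily the authors' method. It assumes the formulation used by CBG (cited under Related Work): the slope is at least the baseline slope, the intercept is non-negative, and "closest" means the smallest total vertical gap to the points. The function name `best_line`, the sample measurements, and the baseline slope of 0.01 ms/km are illustrative assumptions.

```python
from itertools import combinations


def best_line(points, baseline_slope, eps=1e-9):
    """Return (slope, intercept) of the best line for (distance, rtt) points.

    Assumed constraints (CBG-style, not stated in the slides):
      slope >= baseline_slope, intercept >= 0,
      slope * d + intercept <= rtt for every point.
    Objective: minimise the summed vertical gap between points and line.
    """
    # The optimum of a two-variable linear program lies where two
    # constraints meet, so enumerate those intersections as candidates.
    candidates = [(baseline_slope, 0.0)]
    for d, t in points:
        # Line with the baseline slope passing through this point.
        candidates.append((baseline_slope, t - baseline_slope * d))
        if d > 0:
            # Line through the origin and this point.
            candidates.append((t / d, 0.0))
    for (d1, t1), (d2, t2) in combinations(points, 2):
        if d1 != d2:
            m = (t2 - t1) / (d2 - d1)
            candidates.append((m, t1 - m * d1))

    def feasible(m, b):
        return (m >= baseline_slope - eps and b >= -eps
                and all(m * d + b <= t + eps for d, t in points))

    def total_gap(line):
        m, b = line
        return sum(t - (m * d + b) for d, t in points)

    valid = [c for c in candidates if feasible(*c)]
    if not valid:
        raise ValueError("no line satisfies the constraints")
    return min(valid, key=total_gap)


# Hypothetical (distance in km, RTT in ms) samples for one probing machine.
samples = [(100, 3.0), (200, 4.5), (400, 8.0)]
slope, intercept = best_line(samples, baseline_slope=0.01)
```

Enumerating constraint intersections avoids a solver dependency and is adequate for the handful of points a single probing machine contributes; a general LP solver would be the natural replacement for larger inputs.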

Background (cont.)
- Clustering
  - Partitioning Around Medoids (PAM)
  - Quality of a clustering = average distance from each object to the medoid of its cluster
- Outlier detection
  - An object O in a dataset T is a DB(p, D)-outlier if at least a fraction p of the objects in T lie at a distance greater than D from O.
- Scriptroute system
  - A system that allows network measurements to be conducted from remote vantage points
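The two definitions above translate almost directly into code. This is a minimal sketch assuming Euclidean distance on 2-D points and assuming each object belongs to its nearest medoid, as in PAM. The function names and sample data are illustrative and do not come from the slides.

```python
import math


def clustering_quality(points, medoids):
    """Average distance from each point to its nearest medoid (lower is better)."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points) / len(points)


def is_db_outlier(o, dataset, p, D):
    """True if at least a fraction p of dataset lies farther than D from o."""
    far = sum(1 for q in dataset if math.dist(o, q) > D)
    return far / len(dataset) >= p


# Illustrative data: four nearby points and one distant point.
data = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10)]
quality = clustering_quality([(0, 0), (2, 0), (10, 0), (12, 0)],
                             medoids=[(0, 0), (10, 0)])
far_point_is_outlier = is_db_outlier((10, 10), data, p=0.75, D=5)
near_point_is_outlier = is_db_outlier((0, 0), data, p=0.75, D=5)
```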

Algorithm Overview
- Overall idea
  - Clustering probing machines
  - Random selection of a small set of probing machines
  - Reduction of search space by pruning
- Major steps
  - Preprocessing stage
  - Randomized pinging
  - Location estimation

Preprocessing Stage
- Construction of an RTT table and a distance table for the probing machines
- Computation of the best line for each probing machine, subject to the constraint: [equation not preserved in the transcript]

Preprocessing (cont.)
- Clustering of probing machines based on their geographic locations
- Transformation of the geographic system to a Cartesian coordinate system
  - x = 2πR cos(T0) (G - G0) / 360
  - y = 2πR (T - T0) / 360
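The coordinate transformation can be sketched as follows. The slides do not define the symbols, so this sketch assumes G is longitude and T is latitude in degrees, (G0, T0) is a reference point, R is the Earth's radius, and the scale factor is 2πR / 360 (kilometres per degree of arc). The function name and the radius value of 6371 km are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def to_cartesian(lon, lat, lon0, lat0, radius=EARTH_RADIUS_KM):
    """Project (lon, lat) in degrees to planar (x, y) in km around (lon0, lat0)."""
    km_per_degree = 2 * math.pi * radius / 360
    # East-west distances shrink with the cosine of the reference latitude.
    x = km_per_degree * math.cos(math.radians(lat0)) * (lon - lon0)
    y = km_per_degree * (lat - lat0)
    return x, y


# One degree east and one degree north of a reference point at 40 degrees N.
x, y = to_cartesian(-75.0, 41.0, lon0=-76.0, lat0=40.0)
```

This flat approximation is reasonable over a limited region, and its error grows as points move away from the reference latitude.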

Randomized Pinging
- Random selection of m clusters
- Random selection of k probing machines within each cluster
- Pinging the target machine to get n = m*k RTT measurements
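The two-level random selection above can be sketched as follows. The cluster names, host names, and the `select_probes` function are hypothetical. The pinging itself is omitted because it depends on the measurement infrastructure.

```python
import random


def select_probes(clusters, m, k, rng=None):
    """Pick m clusters at random, then up to k machines from each.

    clusters maps a cluster id to the list of probing machines in it.
    Returns a list of (cluster_id, machine) pairs. If a chosen cluster has
    fewer than k machines, all of them are used, so fewer than m*k probes
    may be returned.
    """
    rng = rng or random.Random()
    chosen = rng.sample(sorted(clusters), m)
    probes = []
    for cluster_id in chosen:
        machines = clusters[cluster_id]
        for machine in rng.sample(machines, min(k, len(machines))):
            probes.append((cluster_id, machine))
    return probes


# Hypothetical clusters of probing machines.
clusters = {
    "east": ["e1.example.org", "e2.example.org", "e3.example.org"],
    "central": ["c1.example.org", "c2.example.org", "c3.example.org"],
    "west": ["w1.example.org", "w2.example.org", "w3.example.org"],
}
probes = select_probes(clusters, m=2, k=2, rng=random.Random(0))
```

Passing a seeded `random.Random` makes a trial reproducible, which helps when comparing runs with different m and k.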

Location Estimation
- Computation of estimated distances
- Determination of the best group of circles by dynamic programming
  - Keep track of groups of circles
  - Incrementally build up each group
  - Pick the biggest group

Location Estimation (cont.)
- Locating the target machine by non-linear programming, subject to the constraints: [equations not preserved in the transcript]

Location Estimation (cont.)
- Repeat the process r times
- Computation of the centroid for the r estimated locations
  - Prune out distance-based outliers
  - Compute the centroid of the points left
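The final aggregation step can be sketched as below. It assumes the pruning uses the DB(p, D) outlier definition from the Background slide with Euclidean distance on planar coordinates. The function names, the sample estimates, and the values of p and D are illustrative.

```python
import math


def centroid(points):
    """Arithmetic mean of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def robust_centroid(estimates, p, D):
    """Drop DB(p, D)-outliers among the estimates, then average the rest."""
    def is_outlier(o):
        far = sum(1 for q in estimates if math.dist(o, q) > D)
        return far / len(estimates) >= p

    kept = [o for o in estimates if not is_outlier(o)]
    # Fall back to all estimates if every point was pruned.
    return centroid(kept or estimates)


# Five hypothetical per-trial estimates (km); the last one is far off.
trials = [(0, 0), (2, 0), (0, 2), (2, 2), (50, 50)]
location = robust_centroid(trials, p=0.7, D=10)
```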

Experimental Results
- Setup
  - Machines selected from PlanetLab in the US
  - A small set of machines serves as target machines; the rest serve as probing machines
- Results
  - Error distance: the distance between the real location of the target machine and the estimated one

Experimental Results (cont.)

Site              Actual Location (lon, lat)   Estimated Location (lon, lat)   Error Distance (km)
Cornell (NY)      (-76.476, 42.4478)           (-72.3764, 43.2691)             345.9
Duke              (-78.9427, 36.0088)          (-73.9713, 39.6992)             633
Intel (Seattle)   (-122.316, 47.6614)          (-122.2084, 45.5088)            250.1
Northwestern      (-87.69, 42.05)              (-89.9477, 40.2735)             272.2
Stanford          (-122.172, 37.4294)          (-114.5750, 35.8964)            663
Dartmouth         (-70.9667, 41.6167)          (-77.3431, 40.9380)             496.3
UCSC              (-122.06, 37.0)              (-119.1027, 37.4213)            270.2
UGA               (-83.36, 33.98)              (-76.7415, 33.5117)             591.1
UMASS             (-72.5249, 42.3881)          (-68.5706, 41.5383)             333.7
UOregon           (-123.06, 44.04)             (-111.4846, 39.1779)            1075
Uvirginia         (-78.4749, 38.0613)          (-72.8606, 39.9402)             536.2
CalTech           (-118.15, 34.1358)           (-114.4350, 35.5736)            373.3
Pittsburgh        (-79.9486, 40.4451)          (-80.2406, 39.7665)             53.88
Rutgers           (-70.4313, 40.5228)          (-74.4294, 40.5492)             336.7
Umich             (-83.7126, 42.2944)          (-82.4517, 42.6130)             131.2
Wisc              (-89.3867, 43.0757)          (-88.2059, 42.1028)             150.6

Experimental Analysis
- Limited number of probing machines
- Effect of randomization is not obvious
- The best line estimation is too conservative
- Intersection region of the circles is too big

Conclusions
- A randomized approach for IP mapping using clustering and outlier detection
- Location estimation based on dynamic programming and non-linear programming

Future Work
- Adjusting the algorithm parameters: number of clusters, number of trials, and number of picked machines
- Proving a lower bound for the difference between the accuracy of the randomized algorithm and that of a deterministic algorithm