WhoWas: A Platform for Measuring Web Deployments on IaaS Clouds Liang Wang *, Antonio Nappa +, Juan Caballero +, Thomas Ristenpart *, Aditya Akella *

Presentation transcript:

1 WhoWas: A Platform for Measuring Web Deployments on IaaS Clouds
Liang Wang *, Antonio Nappa +, Juan Caballero +, Thomas Ristenpart *, Aditya Akella *
* University of Wisconsin-Madison
+ IMDEA Software Institute

2 Motivation
- An increasing number of services are using clouds
- Understanding cloud usage patterns is important
  - Design new services & applications
  - Design provisioning & scaling algorithms
- What is the usage pattern of a website?
- How many instances are used by a website?
- Do tenants leverage elasticity?
- Is piratebay using EC2?
- Are there OpenVPN servers in EC2?

3 Motivation
Little research about how tenants use public clouds
- Deepfield, 2012: 1/3 of daily users and 1% of Internet traffic are associated with AWS
- He et al., IMC 2013: 4% of the Alexa top million are in EC2/Azure
  - Answers the question: Who is using public clouds?
  - Technique: investigate DNS entries for Alexa top websites and network packet capture data
  - No insight into changes to deployment patterns over time
- Bermudez et al., INFOCOM 2013: Exploring the cloud from passive measurements: The Amazon AWS case

4 Contributions
We develop a new measurement platform, WhoWas, to facilitate measurement studies of public cloud services.
WhoWas:
- High churn rates of IPs used by services each day
- Most web services use a single IP
- New software is adopted slowly; outdated software is popular
- Quantify growth in usage of EC2 & Azure
- Small number of malicious websites in clouds

5 The WhoWas Platform
Lightweight probing to associate content to IPs over time
- TCP SYN probes: at most 3 probes for an IP per day
- HTTP GET: http(s):// / IP= (at most two GET requests for an IP per day)
Diagram labels: Analysis, Clustering Engine, VPC Map, Feature Generator, IP ranges, TCP SYN Probes, HTTP GET, WhoWas DB, Analysis APIs
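
The per-IP daily limits on this slide can be illustrated with a small budget tracker. This is only a sketch, not the WhoWas implementation: the names `ProbeBudget`, `port_is_open`, and `probe_ip` are hypothetical, the port list (80, 443) is an assumption, and a full TCP connect stands in for a raw SYN probe because raw sockets need elevated privileges.

```python
import socket
from collections import defaultdict

MAX_TCP_PROBES_PER_DAY = 3  # limit stated on the slide
MAX_HTTP_GETS_PER_DAY = 2   # limit stated on the slide


class ProbeBudget:
    """Counts probes per (day, IP, kind) and refuses any beyond the limit."""

    def __init__(self, limits):
        self.limits = dict(limits)
        self.counts = defaultdict(int)

    def try_consume(self, day, ip, kind):
        """Record one probe and return True, or return False if the limit is hit."""
        key = (day, ip, kind)
        if self.counts[key] >= self.limits[kind]:
            return False
        self.counts[key] += 1
        return True


def port_is_open(ip, port, timeout=2.0):
    """Full TCP connect, used here as a stand-in for a raw SYN probe."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe_ip(budget, day, ip, ports=(80, 443), connect=port_is_open):
    """Try ports in order; stop at the first open one or when the budget runs out."""
    for port in ports:
        if not budget.try_consume(day, ip, "tcp"):
            break
        if connect(ip, port):
            return port
    return None
```

Passing `connect` as a parameter keeps the budget logic testable without touching the network; the same `ProbeBudget` can gate GET requests with the `"http"` kind.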

6 Ethical Measurement Design
Concerns:
- Servers are not designed to be public (many tenants didn't realize their servers are public)
- Providers charge tenants based on traffic
- Privacy issues
Measures:
- Lightweight, low-frequency probing
- Robots.txt checking
- Note in the User-Agent
- IP exclusion list
- Collected data kept private
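
Two of the safeguards listed on this slide, the IP exclusion list and robots.txt checking, can be sketched with the Python standard library. The `USER_AGENT` string and the function names below are hypothetical placeholders, not values taken from WhoWas.

```python
import ipaddress
from urllib.robotparser import RobotFileParser

# Hypothetical User-Agent carrying a note about the scan and a contact address.
USER_AGENT = "example-measurement-bot (research scan; contact: ops@example.org)"


def is_excluded(ip, exclusion_list):
    """True if the IP falls inside any network on the opt-out list."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(net) for net in exclusion_list)


def may_fetch(robots_txt, url, user_agent=USER_AGENT):
    """Apply the rules in an already-downloaded robots.txt body to one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A scanner would call `is_excluded` before sending any probe and `may_fetch` before issuing a GET, skipping the IP when either check fails.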

7 Data Collection & DataSets
- EC2: 4,702,208 IPs, Oct 2013 – Dec rounds
- Azure: 495,872 IPs, Nov 2013 – Dec rounds
- About 900 GB data in total
- Overall growth in the number of IPs responding to probes: 4.9% in EC2 and 7.7% in Azure
Figure labels: No. of clusters; 24.4% of all IPs; 22.6% of all IPs; 24.3% of all IPs

8 WhoWas Engines--Clustering
- How to find IPs being operated by the same website?
- Webpage clustering
- WhoWas offers a new clustering heuristic

9 WhoWas Engines--Clustering
HTML contents -> Feature Extractor -> Fingerprint (six-item tuple):
- Title
- Keywords
- Template
- Google Analytics ID
- Server version
- Simhash of HTML textual content
For two fingerprints, check whether title1=title2 & keyword1=keyword2 & template1=template2 & server1=server2 & GID1=GID2:
- No: different clusters
- Yes: same top-level cluster
Top-level clusters -> unsupervised clustering + elbow method (using simhash) -> clusters
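
The first clustering stage on this slide, exact matching on the five non-simhash fields, can be sketched as below, together with a generic simhash so that the sixth field has a concrete form. The tokenisation, the MD5-based token hash, and all names are assumptions made for illustration; the second stage (unsupervised clustering with the elbow method) is not shown.

```python
import hashlib
import re
from collections import defaultdict, namedtuple

# Field order follows the six-item tuple on the slide; the names are illustrative.
Fingerprint = namedtuple(
    "Fingerprint", "title keywords template analytics_id server simhash"
)


def simhash(text, bits=64):
    """Bag-of-words simhash: similar texts tend to differ in only a few bits."""
    weights = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)


def hamming(a, b):
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def top_level_clusters(fingerprints_by_ip):
    """Group IPs whose first five fingerprint fields match exactly."""
    clusters = defaultdict(list)
    for ip, fp in fingerprints_by_ip.items():
        clusters[tuple(fp[:5])].append(ip)
    return dict(clusters)
```

Within each top-level cluster, `hamming` distances between simhashes would be the natural input to the finer-grained clustering step.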

10 WhoWas Engines--Clustering
- EC2: 1,767,072 simhashes, 243,164 clusters
- Azure: 210,418 simhashes, 31,728 clusters
- The number of clusters increased by 3.3% in EC2 and 6.2% in Azure

11 WhoWas Engines--Clustering
- About 80% use 1 IP; 0.1% use more than 50 IPs
- Large clusters tend to leverage cloud elasticity
Top 5 clusters by average number of IP addresses used per round (EC2):
Total #IP    Mean #IP/Round    Min #IP    Max #IP
51,211       33,145            30,624     34,509
15,283       5,597             5,435      5,785
3,869        2,029             1,724      2,228
22,226       1,…               …          …,501
8,…          …                 …          …,837
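
The four columns of the table can be computed from the set of IPs a cluster used in each measurement round. The sketch below assumes that "Total #IP" means distinct IPs seen across all rounds, which the slide does not state; `cluster_ip_stats` is a hypothetical helper name.

```python
def cluster_ip_stats(ips_per_round):
    """Summarise one cluster from a non-empty list of per-round IP sets."""
    sizes = [len(ips) for ips in ips_per_round]
    return {
        # Assumption: "total" counts distinct IPs over the whole period.
        "total": len(set().union(*ips_per_round)),
        "mean_per_round": sum(sizes) / len(sizes),
        "min": min(sizes),
        "max": max(sizes),
    }
```

A total far above the per-round maximum, as in the first table row, indicates that the cluster keeps replacing its IPs from round to round.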

12 More Results from WhoWas
1. Feature Adoption
2. Malicious Activity
3. Cloud Availability
4. Software Adoption


14 Virtual Private Cloud Mapping
- Default DNS hostname = region-specific string + IP
- Host A has public IP = a; Host B has public IP = b
- Resolve Host A through DNS: get a private IP != a
- Resolve Host B through DNS: always get public IP b
Diagram labels: VPC networks, Classic network, EC2 Data Center
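
The DNS check on this slide can be sketched as follows. The hostname layout (`ec2-` plus the dashed IP, then a region label such as `compute-1`, then `amazonaws.com`) is an assumption about the default EC2 naming scheme, and the function names are hypothetical. The code only reports whether the answer is a private address that differs from the public IP; assigning that outcome to VPC or classic networking is left to the reader of the slide.

```python
import ipaddress
import socket


def default_ec2_hostname(public_ip, region_label):
    """Build the default public hostname: dashed IP plus a region-specific label."""
    dashed = public_ip.replace(".", "-")
    return "ec2-{}.{}.amazonaws.com".format(dashed, region_label)


def resolves_to_private(public_ip, region_label, resolve=socket.gethostbyname):
    """Resolve the default hostname and report whether the answer is a private
    address different from the public IP."""
    answer = resolve(default_ec2_hostname(public_ip, region_label))
    return answer != public_ip and ipaddress.ip_address(answer).is_private
```

The lookup is only meaningful when run from a vantage point inside the provider's network; `resolve` is injectable so the logic can be exercised offline.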

15 In EC2, VPC usage increases whereas classic usage decreases
Figure: change over time in classic-only, VPC-only, and mixed clusters in EC2

16 More Results from WhoWas
1. Feature Adoption
2. Malicious Activity
3. Cloud Availability
4. Software Adoption

17 Lifetime of malicious IPs is long
- WhoWas DB -> webpage from an IP -> URLs in webpage -> Safe Browsing API -> IP is malicious / IP is benign
- EC2: 1,393 malicious URLs, 196 malicious IPs
- Azure: 14 malicious URLs, 13 malicious IPs
- 60% up for 7+ days
- 90+ days!
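
The pipeline on this slide (page from an IP, URLs in the page, blacklist lookup, verdict for the IP) can be sketched with the standard library. The lookup itself is passed in as `is_flagged`, a hypothetical callable that would wrap whichever reputation service is used; no real Safe Browsing client call is shown, and treating one flagged URL as enough to mark the IP is an assumption.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects absolute http(s) URLs from href and src attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value and value.startswith(
                ("http://", "https://")
            ):
                self.urls.append(value)


def extract_urls(html):
    """Return absolute URLs in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.urls


def classify_ip(html, is_flagged):
    """Label the IP that served `html` as malicious if any linked URL is flagged."""
    flagged = [url for url in extract_urls(html) if is_flagged(url)]
    return ("malicious" if flagged else "benign"), flagged
```

Returning the flagged URLs alongside the verdict makes it possible to count malicious URLs and malicious IPs separately, as the slide does.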

18 File hosting services are used for distributing malicious content
- IP ranges -> malicious activity history -> VirusTotal API
- EC2: 2,070 malicious IPs, 13,752 malicious URLs
- Azure: No malicious IPs!
Domain                       # of URLs flagged as malicious
dl.dropboxusercontent.com    993
dl.dropbox.com               936
download-instantly.com       295
tr.im                        268

19 Cloud Measurement Challenge and Future
- Only see a portion of web servers
- Only see a portion of web sites' pages
- Lower bound on number of IPs used by web services
Diagram labels: VPC; Frontend VM (Public IP = ); Backend VM (No public IP); VM (No default HTTP(S) Port); Firewall; VM (Default website, Other websites); VM (Website); VM (Website: deny IP access)
Legend: Able to find / Fail to find

20 Other results are in the paper! Visit our website: to get more information!

21 Conclusion
- WhoWas: new measurement platform
- Lightweight probing to associate content to IPs over time
- Used WhoWas for several first-of-their-kind measurements:
  - Growth rates of IP usage
  - Identification of malicious websites
  - Software adoption rate in clouds
  - …
Questions?
