Rake: Semantics Assisted Network- based Tracing Framework Yao Zhao (Bell Labs), Yinzhi Cao, Yan Chen, Ming Zhang (MSR) and Anup Goyal (Yahoo! Inc.) Presenter:

Slides:



Advertisements
Similar presentations
Evaluation of a Scalable P2P Lookup Protocol for Internet Applications
Advertisements

Digital Library Service – An overview Introduction System Architecture Components and their functionalities Experimental Results.
RPC Robert Grimm New York University Remote Procedure Calls.
NDN in Local Area Networks Junxiao Shi The University of Arizona
CTO Office Reliability & Security Distinctions and Interactions Hal Lockhart BEA Systems.
1 Internet Networking and Application Troubleshooting Yao Zhao EECS Department Northwestern University.
Look Who’s Talking: Discovering Dependencies between Virtual Machines Using CPU Utilization HotCloud 10 Presented by Xin.
Yale LANS ShadowStream: Performance Evaluation as a Capability in Production Internet Live Streaming Networks Chen Tian Richard Alimi Yang Richard Yang.
Identifying Performance Bottlenecks in CDNs through TCP-Level Monitoring Peng Sun Minlan Yu, Michael J. Freedman, Jennifer Rexford Princeton University.
Rheeve: A Plug-n-Play Peer- to-Peer Computing Platform Wang-kee Poon and Jiannong Cao Department of Computing, The Hong Kong Polytechnic University ICDCSW.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
Web Server Administration
1 A Suite of Schemes for User-level Network Diagnosis without Infrastructure Yao Zhao, Yan Chen Lab for Internet and Security Technology, Northwestern.
Author: Rodrigo Fonseca, George Porter, Randy H. Katz, Scott Shenker, Ion Stoica Presenter :Yinzhi Cao.
Student Projects in Computer Networking: Simulation versus Coding Leann M. Christianson Kevin A. Brown Cal State East Bay.
User-level Internet Path Diagnosis R. Mahajan, N. Spring, D. Wetherall and T. Anderson.
Polaris Financial Technologies Welcomes the members of Hyderabad chapter for the 2nd event on 4 th July 14 held by PACE (The Testing Practice)
Maintaining and Updating Windows Server 2008
Client/Server Architecture
Lesson 19 Internet Basics.
 Distributed Software Chapter 18 - Distributed Software1.
Presenter: Chi-Hung Lu 1. Problems Distributed applications are hard to validate Distribution of application state across many distinct execution environments.
Lucent Technologies – Proprietary Use pursuant to company instruction Learning Sequential Models for Detecting Anomalous Protocol Usage (work in progress)
INTRODUCTION TO WEB DATABASE PROGRAMMING
IT 210 The Internet & World Wide Web introduction.
Introduction to the Enterprise Library. Sounds familiar? Writing a component to encapsulate data access Building a component that allows you to log errors.
 Zhichun Li  The Robust and Secure Systems group at NEC Research Labs  Northwestern University  Tsinghua University 2.
1 3 Web Proxies Web Protocols and Practice. 2 Topics Web Protocols and Practice WEB PROXIES  Web Proxy Definition  Three of the Most Common Intermediaries.
Towards Highly Reliable Enterprise Network Services via Inference of Multi-level Dependencies Paramvir Bahl, Ranveer Chandra, Albert Greenberg, Srikanth.
2013Dr. Ali Rodan 1 Handout 1 Fundamentals of the Internet.
SCAN: a Scalable, Adaptive, Secure and Network-aware Content Distribution Network Yan Chen CS Department Northwestern University.
Crawlers and Spiders The Web Web crawler Indexer Search User Indexes Query Engine 1.
Chapter 1: Introduction to Web Applications. This chapter gives an overview of the Internet, and where the World Wide Web fits in. It then outlines the.
Copyright © 2002 Pearson Education, Inc. Slide 3-1 CHAPTER 3 Created by, David Zolzer, Northwestern State University—Louisiana The Internet and World Wide.
Enabling Embedded Systems to access Internet Resources.
Web Overview The birth of Web: 1989 Now Web is about everything – Business (HR systems, e.g. NUHR) – Online Shopping (Amazon), Banking (Citibank, Chase)
Configuring, Diagnosing, and Securing Data Center Networks and Systems Yan Chen Lab for Internet and Security Technology (LIST) Department of Electrical.
HOW WEB SERVER WORKS? By- PUSHPENDU MONDAL RAJAT CHAUHAN RAHUL YADAV RANJIT MEENA RAHUL TYAGI.
Module 10: Monitoring ISA Server Overview Monitoring Overview Configuring Alerts Configuring Session Monitoring Configuring Logging Configuring.
2004/12/02Slide Number 1 of 15 Exposure Time Calculator (ETC) as a Web Service Donald McLean 2004 Technology Open House.
©NEC Laboratories America 1 Huadong Liu (U. of Tennessee) Hui Zhang, Rauf Izmailov, Guofei Jiang, Xiaoqiao Meng (NEC Labs America) Presented by: Hui Zhang.
RELATIONAL FAULT TOLERANT INTERFACE TO HETEROGENEOUS DISTRIBUTED DATABASES Prof. Osama Abulnaja Afraa Khalifah
MapReduce M/R slides adapted from those of Jeff Dean’s.
The Inter-network is a big network of networks.. The five-layer networking model for the internet.
Database Design and Management CPTG /23/2015Chapter 12 of 38 Functions of a Database Store data Store data School: student records, class schedules,
1 Welcome to CSC 301 Web Programming Charles Frank.
Module 10 Administering and Configuring SharePoint Search.
CS 6401 The World Wide Web Outline Background Structure Protocols.
MapReduce Kristof Bamps Wouter Deroey. Outline Problem overview MapReduce o overview o implementation o refinements o conclusion.
Quality of System requirements 1 Performance The performance of a Web service and therefore Solution 2 involves the speed that a request can be processed.
1 Internet Networking and Application Troubleshooting Yao Zhao EECS Department Northwestern University.
ECEN “Internet Protocols and Modeling”, Spring 2012 Course Materials: Papers, Reference Texts: Bertsekas/Gallager, Stuber, Stallings, etc Class.
Distribution and components. 2 What is the problem? Enterprise computing is Large scale & complex: It supports large scale and complex organisations Spanning.
World Wide Web “WWW”, "Web" or "W3". World Wide Web “WWW”, "Web" or "W3"
CHAPTER 7 THE INTERNET AND INTRANETS 1/11. What is the Internet? 2/11 Large computer network ARPANET (Dept of Defense) It is international and growing.
Search Worms, ACM Workshop on Recurring Malcode (WORM) 2006 N Provos, J McClain, K Wang Dhruv Sharma
An Introduction to Web Services Web Services using Java / Session 1 / 2 of 21 Objectives Discuss distributed computing Explain web services and their.
CS 6401 The World Wide Web Outline Background Structure Protocols.
A Runtime Verification Based Trace-Oriented Monitoring Framework for Cloud Systems Jingwen Zhou 1, Zhenbang Chen 1, Ji Wang 1, Zibin Zheng 2, and Wei Dong.
Lecture 4 Mechanisms & Kernel for NOSs. Mechanisms for Network Operating Systems  Network operating systems provide three basic mechanisms that support.
Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms Zhichun Li 1, Lanjia Wang 2, Yan Chen 1 and Judy Fu 3 1 Lab.
Large Scale Sharing Marco F. Duarte COMP 520: Distributed Systems September 19, 2004.
Pinpoint: Problem Determination in Large, Dynamic Internet Services Mike Chen, Emre Kıcıman, Eugene Fratkin {emrek,
The Internet Technological Background. Topic Objectives At the end of this topic, you should be able to do the following: Able to define the Internet.
System Architecture CS 560. Project Design The requirements describe the function of a system as seen by the client. The software team must design a system.
Map Reduce.
Yan Chen Department of Electrical Engineering and Computer Science
Presentation transcript:

Rake: Semantics Assisted Network- based Tracing Framework Yao Zhao (Bell Labs), Yinzhi Cao, Yan Chen, Ming Zhang (MSR) and Anup Goyal (Yahoo! Inc.) Presenter: Yinzhi Cao Lab for Internet and Security Technology (LIST) Northwestern Univ.

22 Rake: Semantic Assisted Large Distributed System Diagnosis Motivation Related Work Rake Evaluation Conclusions

33 Motivation Large distributed systems involve hundreds or thousands of nodes –E.g. search system, CDN Host-based monitoring cannot infer the performance or detect bugs –Hard to translate OS-level info (such as CPU load) into application performance –Application log may not be enough Task-based approach is adopted in many diagnosis systems –WAP5, Magpie, Sherlock

44 Example of Message Linking in Search System URL Search keyword URL Search keyword Doc ID

55 Task-based Approaches The Critical Problem – Message Linking –Link the messages in a task together into a path or tree Black-box approaches –Do not need to instrument the application or to understand its internal structure or semantics –Time correlation to link messages Project 5, WAP5, Sherlock White-box approaches –Extracts application-level data and requires instrumenting the application and possibly understanding the application's source codes –Insert a unique ID into messages in a task X-Trace, Pinpoint

66 Problems of White-box and Black-Box White-box –Invasive due to source code modification Black-box –Rely on time Correlation –Accuracy affected by cross traffic

77 Rake Key Observations –Generally no unique ID linking the messages associated with the same request –Exist polymorphic IDs in different stages of the request Semantic Assisted –Use the semantics of the system to identify polymorphic IDs and link messages

8 Architecture of Rake

99 Message Linking Example URL Search keyword URL Search keyword Doc ID

10 Necessary Semantics Intra-node linking –The system semantics Inter-node linking –The protocol semantics Node P Q R S

11 Intra-Node Linking Follow_IDs: The IDs will be in the triggered messages by this message –One message may have multiple Follow_IDs for triggering multiple messages Link_ID: The ID of the current message –Match with Follow_ID previously seen 11 P Q R S Link_IDFollow_ID = Query_ID = Response_ID

12 Inter-Node Linking Query_IDs: The IDs will be in the response messages to this message –The communication is in Query/Response style, e.g. RPC call and DNS query/response. Response_ID: The ID of the current message to match Query_ID previously seen –By default requires the query and response to use the same socket 12 P Q R S Link_IDFollow_ID = Query_ID = Response_ID

Example of Rake Language (IRC) TCP 6667 Regular expression PRIVMSG\s+(.*) Same as Link ID No Return ID

14 Complicated Semantics The process of generating IDs may be complicated –XML or regular expression is not good at complex computations –So let user provide own functions User provide share/dynamic libraries Specify the functions for IDs in XML Implementation using Libtool to load user defined function in runtime

15 Example for DNS UDP 53 udp[10] & 128 == 0 User Function dns.so Link_ID Link_ID Link_ID …………………………….. Extract the queried host

16 Accuracy Analysis One-to-one ID Transforming –Examples In search, URL -> Keywords -> Canonical format In CoralCDN, URL -> Sha1 hash value –Ideally no error if requests are distinct Request ambiguousness –Search keywords Microsoft search data Less than 1% messages with duplication in 1s –Web URL Two real http traces Less than 1% messages with duplication in 1s –Chat messages No duplication with timestamps

17 Potential Applications Search –Verified by a Microsoft guy CDN –CoralCDN is studied and evaluated Chat System –IRC is tested Distributed File System –Hadoop DFS is tested

18 Evaluation Application –CoralCDN –Hadoop Experiment –Employ PlanetLab hosts as web clients –Retrieve URLs from real traces with different frequency Metrics –Linking accuracy (false positive, false negative) –Diagnosis ability Compared Approach –WAP5

19 CoralCDN Semantics

20 Message Linking Accuracy Use Log-Based Approach to Evaluate WAP5 and Rake Linking in CoralCDN

21 Diagnosis Ability Controlled Experiments –Inject junk CPU-intensive processes –Calculated the packet processing time using WAP5 and Rake Obviously Rake can identify the slow machine, while WAP5 fails.

22 Semantics of Hadoop Get operation

23 Abused IPC Call in Hadoop It is a problem that we found in Hadoop source code. Four “getFileInfo”s are used here, while only one is enough.

24 Running time of Hadoop steps

25 Discussion Implementation Experience –How hard for user to provide semantics CoralCDN – 1 week source code study DNS – a couple of hours Hadoop DFS – 1 week source code study

26 Conclusions of Rake Feasibility –Rake works for many popular applications in different categories Easiness –Rake allows user to write semantics via XML –Necessary semantics are easy to obtained given our experience Accuracy –Much more accurate than black-box approaches and probably matches white-box approaches

27 Q & A? Thanks!

28 Backup

29 Utilize Semantics in Rake Implement Different Rakes for Different Application is time consuming –Lesson learnt for implementing two versions of Rake for CoralCDN and IRC Design Rake to take general semantics –A unified infrastructure –Provide simple language for user to supply semantics

30 Questions on Semantics What Are the Necessary Semantics? –In worst case, re-implement the application How Does Rake Use the Semantics? –Naïve design is to implement Rake for each application with specific application semantics How Efficient Is the Rake with Semantics –Can message linking to accurate? –What’s the computational complexity of Rake?

31 Related Work Non-InvasiveInvasive Network Sniffing Interpo- sition App or OS Logs Source code modification Black-box Project 5, Sherlock WAP5Footprint Grey-boxRakeMagpie White-box X-Trace, Pinpoint Invasiveness Application Knowledge

32 Semantics of Hadoop Grep operation