1 A Framework for Network Monitoring and Performance Based Routing in Distributed Middleware Systems Gurhan Gunduz Advisor: Professor.

Presentation transcript:

1 A Framework for Network Monitoring and Performance Based Routing in Distributed Middleware Systems Gurhan Gunduz Advisor: Professor Geoffrey Fox

2 Outline
• Motivation
• Research issues
• Overview of the performance-based dynamic routing system
• Network monitoring framework
• Aggregation framework
• Dynamic routing framework
• Related work
• Conclusion and future work

3 Motivation I
Tango experience in Syracuse
• Suffered from bad network performance: packets traveled from Syracuse to the West Coast and then back to Mississippi.
• It was impossible to re-route traffic using hardware routers.

4 Motivation II

5 Research Issues
We investigate how to develop a performance-based dynamic routing system that is:
• De-centralized
• Scalable
• Hardware independent
We identify four core research issues for a complete performance-based dynamic routing system and investigate related issues:
• Performance monitoring
• Aggregation of measured metrics
• Distribution of performance metrics to relevant locations
• Dynamic routing

6 Performance Based Dynamic Routing
Our framework has the following components:
• Monitoring scheme: monitors the network and gathers the performance metrics.
• Aggregation scheme: aggregates the performance measurements from the monitoring services and stores them in a database.
• Distribution of performance metrics to relevant locations: updates the link costs in the system with the ones calculated by the performance monitoring system.
• Dynamic routing scheme: uses the new costs to dynamically update routes in the system.
We chose the NaradaBrokering distributed broker system to implement our ideas.
• Open source
The ideas developed here are applicable to other networking and messaging infrastructures.

7 Well Known Systems
Network Weather Service (NWS) is a well-known performance monitoring tool.
• Monitors TCP/IP, CPU load, and available memory
• Predicts future performance
• Does not have a dynamic routing feature
• No support for protocols other than TCP/IP
• Does not use a messaging infrastructure
Monitoring Agents using a Large Integrated Services Architecture (MonALISA)
• Distributed monitoring service
• Gets performance metrics from SNMP (Simple Network Management Protocol)

8 NaradaBrokering
A distributed event brokering system designed to run on a large network of cooperating broker nodes.
NB uses enterprise-service-bus-style network overlay technology.
• It constructs a logical overlay network on top of the underlying network.
• It organizes nodes into clusters, super-clusters, and super-super-clusters to achieve efficient routing/dissemination schemes.
Communication in NaradaBrokering is asynchronous.
NaradaBrokering provides support for JMS, P2P interactions, grid services, and A/V conferencing, while supporting communication through firewalls and proxies.

9 Performance Monitoring System I
Measures the performance of the links originating from a node.
Every node incorporates the performance monitoring system.
• De-centralized algorithm
• Improves scalability
• New nodes can be added to the system without interacting with a centralized coordinating unit

10 Monitoring Service

11 Performance Monitoring System II
Each node can have several links that use different transport protocols for communication.
The design therefore needs to be transport independent.
• New transports can be added easily
Nodes can start or stop performance measurements either for a given link or for the entire set of links at that node.

12 Performance Monitoring System III
The measurement initiator module controls all performance monitoring activities at a given node.
A monitored link structure enables transport protocols to perform performance measurements.
• Abstract class
Supported transports are:
• SSL, HTTP, HTTPS, TCP, NIO TCP, UDP
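
The monitored link structure and the measurement initiator described on this slide can be pictured roughly as follows. This is a hypothetical Python sketch, not NaradaBrokering's actual (Java) API: the names MonitoredLink, send_probe, LoopbackLink, and MeasurementInitiator are assumptions made for illustration, and the loopback transport performs no real network I/O.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional
import time


class MonitoredLink(ABC):
    """Hypothetical transport-independent link abstraction."""

    def __init__(self, link_id: str) -> None:
        self.link_id = link_id
        self.monitoring = False

    @abstractmethod
    def send_probe(self, payload: bytes) -> float:
        """Send one probe and return the round-trip time in seconds."""


class LoopbackLink(MonitoredLink):
    """Stand-in transport that handles the probe locally (no network I/O)."""

    def send_probe(self, payload: bytes) -> float:
        start = time.perf_counter()
        _ = bytes(payload)  # pretend the payload travelled and came back
        return time.perf_counter() - start


class MeasurementInitiator:
    """Starts or stops measurements for one link or for every link at a node."""

    def __init__(self) -> None:
        self.links: Dict[str, MonitoredLink] = {}

    def add_link(self, link: MonitoredLink) -> None:
        self.links[link.link_id] = link

    def start(self, link_id: Optional[str] = None) -> None:
        self._set_state(link_id, True)

    def stop(self, link_id: Optional[str] = None) -> None:
        self._set_state(link_id, False)

    def _set_state(self, link_id: Optional[str], state: bool) -> None:
        # None means "all links at this node".
        targets = self.links.values() if link_id is None else [self.links[link_id]]
        for link in targets:
            link.monitoring = state
```

Under this design, adding a new transport only means adding another subclass that implements send_probe; the initiator does not need to change.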

13 What are the Measured Network Metrics
Latency
• Transit delay from a source to a destination
Jitter
• Measures how the spacing between successive messages over a given link varies
Loss rates
• Number of messages that are lost in transit between the source and the destination

14 Performance Packet

15 Measuring Performance Metrics
Loss rates are computed from the responses received and the responses that never arrive.
Latencies are computed from time stamps.
• Incorporates support for outlier removal
Jitter is computed from the delays between successive messages.
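
The three computations on this slide might look like the sketch below. The slides do not give the exact formulas, so everything here is an assumption for illustration: latency is taken as half the mean round-trip time, outliers are samples above three times the median, and jitter is the mean absolute difference between successive delays.

```python
from statistics import mean, median


def loss_rate(sent, received):
    """Fraction of probes for which no response arrived."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return (sent - received) / sent


def remove_outliers(samples, factor=3.0):
    """Drop samples larger than `factor` times the median (assumed rule)."""
    if not samples:
        return []
    cutoff = factor * median(samples)
    return [s for s in samples if s <= cutoff]


def latency(round_trips):
    """One-way latency estimated as half the mean round-trip time."""
    kept = remove_outliers(round_trips)
    return mean(kept) / 2.0


def jitter(delays):
    """Mean absolute difference between successive delays."""
    if len(delays) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return mean(diffs)
```

For example, with round-trip samples of 10, 12, 11, and 200 ms, the 200 ms sample exceeds three times the median and is discarded before the latency is averaged.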

16 Frequency of Measurements
The frequency of measurements is controlled by the performance monitoring system.
High-frequency measurements could corrupt the metrics being measured.
• They generate unnecessary traffic
Low-frequency measurements could miss short bursts.

17 Link Cost vs. Frequency I

18 Link Cost vs. Frequency II

19 Performance Aggregation System
Aggregates the performance metrics from nodes that incorporate the performance monitoring system.
The performance monitoring system reports performance data to a performance aggregation node.
Performance metrics are encapsulated in XML and sent to the aggregation nodes.

20 Performance Aggregation System

21 Encapsulating Performance Data
The performance monitoring system encapsulates performance data in an XML format.
Why XML:
• Easy access to relevant fields in the performance data
• The descriptive capability of the content supports intelligent data mining through XPath queries
• Thanks to the XML structure, it is easy to incorporate results gathered from other network monitoring services such as NWS, and easy for other systems to use our performance metrics
Disadvantage
• Causes overhead
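
As a rough illustration of encapsulating metrics in XML and reading a field back with an XPath-style query, consider the sketch below. The element and attribute names (performance, link, latency, jitter, lossRate) are a hypothetical schema, not the one used in the presentation. Python's xml.etree.ElementTree supports only a subset of XPath, which is enough for this lookup.

```python
import xml.etree.ElementTree as ET


def encode_metrics(link_id, latency_ms, jitter_ms, loss_rate):
    """Wrap one link's metrics in a small XML document (hypothetical schema)."""
    root = ET.Element("performance")
    link = ET.SubElement(root, "link", id=link_id)
    ET.SubElement(link, "latency", unit="ms").text = str(latency_ms)
    ET.SubElement(link, "jitter", unit="ms").text = str(jitter_ms)
    ET.SubElement(link, "lossRate").text = str(loss_rate)
    return ET.tostring(root, encoding="unicode")


def latency_of(xml_text, link_id):
    """Look up the latency of one link with an XPath-style query."""
    root = ET.fromstring(xml_text)
    node = root.find(f".//link[@id='{link_id}']/latency")
    return None if node is None else float(node.text)


document = encode_metrics("link-1", 12.5, 1.5, 0.02)
```

Here latency_of(document, "link-1") reads back the stored latency, and a query for an unknown link ID returns None.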

22 Document Construction Time From an XML File

23 XPATH Evaluation Time

24 XSL Transformation Time

25 Storing Performance Data
Flat files
• No additional database required
• Slow for large data
• Easy to display on portals using XSLT, which converts a given XML file into HTML using the given XSL style sheet
Database
• A relational database program is needed (MySQL)
• The performance data is stored in non-XML format
• Fast searches on data using SQL queries
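
The relational option can be sketched as below. The presentation uses MySQL; this example swaps in Python's built-in sqlite3 module purely so that it runs without a database server, and the table and column names are assumptions.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the store and create the (hypothetical) metrics table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS link_metrics (
               link_id     TEXT NOT NULL,
               measured_at REAL NOT NULL,
               latency_ms  REAL,
               jitter_ms   REAL,
               loss_rate   REAL
           )"""
    )
    return conn


def store(conn, link_id, measured_at, latency_ms, jitter_ms, loss_rate):
    """Insert one measurement in non-XML (row) form."""
    conn.execute(
        "INSERT INTO link_metrics VALUES (?, ?, ?, ?, ?)",
        (link_id, measured_at, latency_ms, jitter_ms, loss_rate),
    )
    conn.commit()


def slow_links(conn, latency_threshold_ms):
    """Return (link_id, average latency) for links above the threshold."""
    rows = conn.execute(
        """SELECT link_id, AVG(latency_ms)
           FROM link_metrics
           GROUP BY link_id
           HAVING AVG(latency_ms) > ?
           ORDER BY link_id""",
        (latency_threshold_ms,),
    )
    return rows.fetchall()
```

A single SQL query such as slow_links finds every link whose average latency exceeds a threshold, which is the kind of fast search the slide contrasts with scanning flat XML files.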

26 Flat File

27 HTML Representation

28 Data Mining
Stored data can be mined to identify, circumvent, project, and prevent system bottlenecks.
Check metrics against thresholds and inform nodes to take corrective actions:
• The frequency of performance measurements can be lowered or increased
• Measurements can be stopped
• The number of links can be reduced

29 DYNAMIC ROUTING

30 Finding the Best Routes
NaradaBrokering organizes nodes into clusters, super-clusters, and super-super-clusters to achieve efficient routing/dissemination schemes.
Broker Network Map (BNM)
• Each broker maintains its own BNM
• Abstract view of the broker network
• Provides information regarding the interconnections between brokers in the cluster
• Ensures the calculation of optimal paths

31

32 Broker Network Map

33 Link Cost
Link costs are computed from the metrics measured by the performance monitoring system:
LinkCost = Overall_coeff + Latency_Coeff * Latency + PKT_LOSS_COEF * lossrate + JITTER_COEF * jitter
The link cost formula can be modified to favor specific metrics.
• Audio and video applications require low jitter, so the jitter coefficient can be increased
These link costs should be disseminated within the system to update existing link costs.
Updated costs are used to find the best routes in the system.
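
The link cost formula on this slide translates directly into a small function. The slides do not state the coefficient values, so the defaults below are placeholders chosen only for illustration.

```python
def link_cost(latency, loss_rate, jitter,
              overall_coef=0.0, latency_coef=1.0,
              pkt_loss_coef=1.0, jitter_coef=1.0):
    """Weighted sum of the measured metrics.

    The coefficient defaults are assumed values, not those used in the
    presentation.
    """
    return (overall_coef
            + latency_coef * latency
            + pkt_loss_coef * loss_rate
            + jitter_coef * jitter)


# Default weighting versus a jitter-sensitive weighting for A/V traffic.
default_cost = link_cost(10.0, 0.5, 2.0)
av_cost = link_cost(10.0, 0.5, 2.0, jitter_coef=4.0)
```

Raising jitter_coef makes links with high jitter more expensive, so a route computation that minimizes total cost steers audio and video traffic away from them.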

34 Dissemination of Link Costs I
Dissemination should be done carefully, since the number of links can be very high.
• Threshold values are used to check whether new link costs are worth propagating
Each link in the system has a unique ID.
• A Universally Unique Identifier (UUID) is used to generate unique IDs
• Prevents conflicts
A topic-based publish/subscribe scheme is used for dissemination.
• The link ID, the new link cost, and the topic name are put into the message before it is published

35 Dissemination of New Link Costs II
Interested nodes subscribe to a specific topic to get the measured costs for the links.
It is a loosely coupled system.
• Publishers and subscribers do not know each other
• Increases scalability
• New nodes can be added easily
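
The dissemination scheme (UUID link IDs, a threshold check, and topic-based publish/subscribe) can be sketched with a tiny in-process broker. This is an illustration only, not NaradaBrokering code: the class names, the message layout, and the rule "publish when the cost changes by more than the threshold" are assumptions.

```python
import uuid
from collections import defaultdict


def new_link_id():
    """Generate a unique link ID from a random UUID."""
    return str(uuid.uuid4())


class Broker:
    """Minimal in-process topic-based publish/subscribe hub."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Publishers never see who (if anyone) receives the message.
        for callback in self._subscribers[topic]:
            callback(message)


class LinkCostPublisher:
    """Publishes a link cost only when it moved by more than the threshold."""

    def __init__(self, broker, topic, threshold):
        self.broker = broker
        self.topic = topic
        self.threshold = threshold
        self.last_published = {}

    def report(self, link_id, new_cost):
        old = self.last_published.get(link_id)
        if old is not None and abs(new_cost - old) <= self.threshold:
            return False  # change too small to be worth propagating
        self.last_published[link_id] = new_cost
        message = {"link_id": link_id, "cost": new_cost, "topic": self.topic}
        self.broker.publish(self.topic, message)
        return True
```

With a threshold of 5, reporting costs of 30, 32, and 40 for the same link publishes only the first and the last value, since the move from 30 to 32 is below the threshold.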

36 Testing the System
There are two routes from node 2 to node 3:
• 2 → 1 → 3: cost is 86
• 2 → 4 → 3: cost is 3

37 Cost Values

38 Testing the System 2 → 6 → 5 → 4 → Cost is → 3 → 4 → Cost is 32

39 Cost Values
Connection          Cost
CGL1 → Korea        102
CGL1 → CGL3         1.2
CGL1 → UK           52
CGL2 → Korea        99
CGL2 → San Diego    30
CGL3 → San Diego    29
CGL3 → UK           50
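
To show how such link costs could feed a best-route computation, the sketch below runs Dijkstra's shortest-path algorithm over the cost values on this slide. The slides do not say which algorithm the system uses, so Dijkstra is a stand-in, and treating every link as bidirectional with the same cost in both directions is an assumption.

```python
import heapq

# Link costs copied from the slide.
LINK_COSTS = {
    ("CGL1", "Korea"): 102,
    ("CGL1", "CGL3"): 1.2,
    ("CGL1", "UK"): 52,
    ("CGL2", "Korea"): 99,
    ("CGL2", "San Diego"): 30,
    ("CGL3", "San Diego"): 29,
    ("CGL3", "UK"): 50,
}


def build_graph(costs):
    """Build an adjacency map; links are assumed to be bidirectional."""
    graph = {}
    for (a, b), cost in costs.items():
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def best_route(graph, source, target):
    """Return (total cost, path) of the cheapest route using Dijkstra."""
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return float("inf"), []


cost, path = best_route(build_graph(LINK_COSTS), "Korea", "UK")
```

Under these assumptions the cheapest route from Korea to the UK goes through CGL1 and CGL3, because the very cheap CGL1 to CGL3 link makes that detour slightly cheaper than the direct CGL1 to UK link.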

40 Related Work I
There are several disjoint activities on network performance and characteristic monitoring for the grid.
Existing network monitoring systems tend to use well-known measurement engines (PingER, IPERF, UDP throughput, FTP throughput).
• Each implements a context-specific framework and visualization
• All speak different languages

41 Related Work II
Network Weather Service
• Monitors network, CPU, and memory performance
• Only the TCP protocol
• Makes forecasts
The Self Configuring Network Monitor project has a hardware infrastructure to monitor the network.
DataGrid EDG project: site-to-site monitoring and publication to the Relational Grid Monitoring Architecture.
UK e-Science monitoring infrastructure: aggregate traffic statistics (available on an ad hoc basis from core providers).
There are peer-to-peer applications that implement dynamic routing.
• Skype uses intelligent routing to route calls through the best possible paths

42 Conclusions and Future work

43 Summary of Contributions
• Designing and implementing a complete framework for a scalable, de-centralized, and hardware-independent performance-based dynamic routing system, consisting of a performance monitoring system, an aggregation system, and a dynamic routing system
• Proposing an architecture for a transport-protocol-independent monitoring framework
• Designing an efficient and scalable way of disseminating new costs within the system
• Investigating the issues related to the frequency of measurements and the overhead caused by the performance-based dynamic routing system

44 Future Work
• More sophisticated deployment of statistical and data mining techniques
• XML and object databases could be investigated to see how they work with our system
• A user interface that increases interaction with administrators could be developed