A Data Stream Management System for Network Traffic Management Shivnath Babu Stanford University Lakshminarayanan Subramanian Univ. California, Berkeley.

Presentation transcript:

A Data Stream Management System for Network Traffic Management
Shivnath Babu, Stanford University
Lakshminarayanan Subramanian, Univ. California, Berkeley
Jennifer Widom, Stanford University
NRDM, Santa Barbara, CA, May 25, 2001

Network Traffic Management
Large networks are growing complex and difficult to manage
  – Increasing demands, overprovisioning, hardware changes, manual configuration
  – Lack of information to configure network for effective usage
  – Collect data
      E.g., packet traces, network-flow data, SNMP data
  – Process data
      E.g., compute link utilization, per-hop delays, traffic demands
  – Deploy mechanisms to control traffic
      E.g., change routing parameters
Network traffic management is becoming an important part of the Internet infrastructure
Data management forms a core part of traffic management

Traffic Management: Data Collection
Many data sources
  – Packet and flow traces
  – Router forwarding tables and configuration data
  – SNMP data
  – Active measurements of packet delay, link utilization
Data is collected continuously
  – Networks need to run 24x7
  – Huge and fast-growing databases
Many current traffic management systems store collected data in file systems or data warehouses

Traffic Management: Data Processing
Sophisticated data processing is required
Measuring link utilization
  – Aggregate packet traces
Maintaining network topology
  – Join SNMP data from different network elements
Deriving traffic demands
  – Join network flow traces, router forwarding tables and configuration data, and SNMP data
Anomaly detection, traffic modeling, traffic prediction, and many others
Most current traffic management systems process data using ad-hoc scripts or software toolkits

Challenge in Data Management: Online Data Processing
Most current traffic management applications process data offline
  – Huge volume of data
  – Complex processing involved
Offline processing is indeed appropriate for some applications
  – E.g., capacity planning, determining pricing plans
Many traffic management applications need online processing
  – E.g., congestion cause detection, resource allocation for guaranteed QoS, detecting denial-of-service attacks, detecting Service-Level Agreement violations, admission control and traffic policing

Online Processing
What’s wrong with using a file system and procedural processing?
  – Difficult to maintain and reuse (not a long-term solution)
What’s wrong with using a Database Management System (DBMS)?
  – DBMS expects all data to be managed as persistent data sets
  – DBMS assumes “one-time” queries against stored and finite data

A Data Stream Management System (DSMS) for Online Processing
Data streams are the appropriate model for online processing
  – Data is changing frequently (often exclusively through insertions)
  – It is impractical to operate on the same data multiple times
Continuous queries -- issued once and run “forever”
Performance
  – Need continuous-query optimization
  – Need adaptive query optimization
A Data Stream Management System for traffic management
  – Idea: Support online processing with continuous queries over data streams

A Data Stream Management System for Online Processing (cont’d)
[Figure labels: Packet traces; Flow traces; Router forwarding tables; SNMP data; Active measurements; Streams; Data Stream Management System; Continuous Queries; Applications based on online processing]

Continuous Query over a Single Data Stream
[Figure labels: Data Stream; Q; A?]
Many options with different ramifications
Stream is infinite, append-only (e.g., packet traces)
  – Size of A is unbounded for a filter query -- cannot store A
  – Stream out A -- but self-join query requires unbounded intermediate state to compute A
  – Updates to tuples in A -- e.g., aggregation query
Stream has updates, deletions (e.g., SNMP data)
  – Often require more intermediate state to compute A
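The two answer styles on this slide can be sketched in a few lines of Python. This is only an illustration under assumed names: the `(src, dst, length)` packet schema and the functions `filter_query` and `aggregate_query` are hypothetical and do not come from the talk. The filter query streams each result out because the full answer A cannot be stored; the aggregation query keeps a bounded answer whose tuples are updated as packets arrive.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical packet schema: (source, destination, length in bytes).
Packet = Tuple[str, str, int]


def filter_query(stream: Iterable[Packet], min_len: int) -> Iterator[Packet]:
    """Filter query: the answer grows without bound, so each result is
    streamed out as soon as it is known instead of being stored."""
    for pkt in stream:
        if pkt[2] >= min_len:
            yield pkt


def aggregate_query(stream: Iterable[Packet]) -> Iterator[Dict[str, int]]:
    """Aggregation query: the answer holds one tuple per source, and
    existing tuples are updated as new packets arrive."""
    bytes_by_src: Dict[str, int] = {}
    for src, _dst, length in stream:
        bytes_by_src[src] = bytes_by_src.get(src, 0) + length
        # Emit a snapshot of the current answer after every update.
        yield dict(bytes_by_src)
```

Both functions are generators, so they consume the stream one tuple at a time and never need the whole input in memory.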

Operator Architecture in a DSMS
Stream
  – Append-only semantics: Result tuples that won’t change later
  – Update semantics: Updates to current result
Store: Result tuples that could change later
Scratch: Intermediate state to compute future results
Throw: Unneeded data

Example Queries from Traffic Management
Single packet trace input data stream (IP headers over a link)
Continuous query 1: Link utilization (total #bytes sent over the link)
  – Store -- sum of packet lengths
  – Stream -- empty
  – Scratch -- empty
Continuous query 2: Number of flows per protocol
[Figure labels: Packet Trace; Flow Identifier; Scratch; Stream; Per-Protocol #flows counter; Store]
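A minimal sketch of continuous queries 1 and 2, written in terms of the Store and Scratch areas named earlier in the talk. The `Packet` fields, the class names, and the choice of a 5-tuple as the flow identifier are assumptions made for illustration; the slides do not define a flow or give an implementation.

```python
from typing import Dict, NamedTuple, Set, Tuple


class Packet(NamedTuple):
    """Hypothetical IP-header record taken from a packet trace."""
    src: str
    dst: str
    sport: int
    dport: int
    proto: str
    length: int


class LinkUtilization:
    """Query 1: total bytes sent over the link. Store holds one running sum."""

    def __init__(self) -> None:
        self.total_bytes = 0  # Store

    def on_packet(self, pkt: Packet) -> None:
        self.total_bytes += pkt.length


class FlowsPerProtocol:
    """Query 2: number of distinct flows per protocol."""

    def __init__(self) -> None:
        # Scratch: flow identifiers seen so far (assumed to be the 5-tuple).
        self.seen_flows: Set[Tuple[str, str, int, int, str]] = set()
        # Store: per-protocol flow counters, updated in place.
        self.flows_by_proto: Dict[str, int] = {}

    def on_packet(self, pkt: Packet) -> None:
        flow_id = (pkt.src, pkt.dst, pkt.sport, pkt.dport, pkt.proto)
        if flow_id not in self.seen_flows:
            self.seen_flows.add(flow_id)
            self.flows_by_proto[pkt.proto] = self.flows_by_proto.get(pkt.proto, 0) + 1
```

Query 1 needs constant state, while the Scratch set in query 2 grows with the number of distinct flows, which is the kind of intermediate state the later slides try to bound.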

Example Queries from Traffic Management (cont’d)
Continuous query 3: Join packet traces collected from different points in the network to measure packet delays (or identify routes)
[Figure labels: Packet trace 1; Packet trace 2; Symmetric Hash-Join; HT 1; HT 2; Scratch; Stream]
Efficient intermediate state management
  – Intermediate state is unbounded theoretically
  – Use of constraints can reduce intermediate state
      Can reclaim memory after each match
  – Approximate answers can further reduce intermediate state
      Can you trade precision for state?
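The symmetric hash join for query 3 might look like the sketch below. It relies on an assumed constraint that each packet identifier appears at most once in each trace, which is what allows an entry to be dropped as soon as it matches; the event layout `(trace_no, packet_id, timestamp)` and the function name are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event layout: (trace number 1 or 2, packet id, timestamp),
# with events from both traces interleaved in arrival order.
Event = Tuple[int, str, float]


def symmetric_hash_join(events: Iterable[Event]) -> Iterator[Tuple[str, float]]:
    """Join two packet traces on packet id and stream out (packet_id, delay),
    where delay is the trace-2 timestamp minus the trace-1 timestamp.

    Assumes each packet id occurs at most once per trace, so a matched
    entry can be reclaimed immediately.
    """
    # Scratch: one hash table per input (HT 1 and HT 2 in the figure).
    tables: Dict[int, Dict[str, float]] = {1: {}, 2: {}}
    for trace, pkt_id, ts in events:
        other = 2 if trace == 1 else 1
        if pkt_id in tables[other]:
            # Probe hit: emit the result and reclaim the matched entry.
            other_ts = tables[other].pop(pkt_id)
            t1, t2 = (ts, other_ts) if trace == 1 else (other_ts, ts)
            yield pkt_id, t2 - t1
        else:
            # No match yet: remember this observation for a later probe.
            tables[trace][pkt_id] = ts
```

Without the uniqueness constraint, every tuple would have to stay in its hash table indefinitely, which is the theoretically unbounded state the slide refers to. Packets seen at only one point still accumulate here, so a real system would also need some form of expiry.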

Example Queries from Traffic Management (cont’d)
Continuous query 4: Identify top 5% of (source IP address, destination IP address) pairs with maximum bandwidth consumption over a link
Non-trivial query over a stream
  – Number of distinct pairs can vary
  – Bandwidth consumption of each pair can vary
  – How much intermediate state is needed?
[Figure labels: Packet trace; Count Distinct Pairs; Scratch; Bandwidth Consumption of Pairs; Store; Scratch; Top 5% Pairs; Stream]
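To make the state question concrete, here is an exact but memory-hungry baseline for query 4. It keeps one counter per distinct pair, so its state grows with the number of pairs. The class and method names are hypothetical, and this is not presented as the approach the authors have in mind.

```python
import heapq
import math
from collections import Counter
from typing import List, Tuple

Pair = Tuple[str, str]  # (source IP address, destination IP address)


class TopPairs:
    """Exact baseline: track bytes for every distinct pair and report the
    top fraction on demand."""

    def __init__(self, fraction: float = 0.05) -> None:
        self.fraction = fraction
        # Intermediate state: one byte counter per distinct pair.
        self.bytes_by_pair: Counter = Counter()

    def on_packet(self, src: str, dst: str, length: int) -> None:
        self.bytes_by_pair[(src, dst)] += length

    def current_answer(self) -> List[Tuple[Pair, int]]:
        # The size of the answer depends on how many distinct pairs
        # have been seen so far, so it changes as the stream evolves.
        k = math.ceil(self.fraction * len(self.bytes_by_pair))
        return heapq.nlargest(k, self.bytes_by_pair.items(), key=lambda kv: kv[1])
```

Both the pair count and each pair's byte total keep changing, so the answer has to be recomputed or incrementally maintained. Bounding the state would require approximation, for example the summarization techniques listed later in the talk.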

Further Challenges in Data Management: Distributed Stream Processing
Data is collected from different points in a network
Structure of an Internet Service Provider imposes restrictions
  – Core routers are sensitive (so are the network operators)
Sending collected data to a central processing site is harmful
  – Additional load on the network
  – Hinders real-time processing
  – Won’t scale with the network and traffic
Truly distributed processing is infeasible for many queries
  – Goal: minimize communication traffic
  – Trade communication traffic for precision

Example Queries from Traffic Management (cont’d)
Continuous query 5: Identify top 5% of destination IP addresses with maximum bandwidth consumption (to detect denial-of-service attacks)
[Figure labels: CQ 5 local (x3); CQ 5 global; Stream]
Hierarchical processing structure could also be useful
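One way to picture the local/global split for query 5 is sketched below. Each collection point summarizes its own trace and ships only its k heaviest destinations, and a global operator merges those summaries. The function names and the use of a fixed k in place of "top 5%" are simplifying assumptions. Because each site truncates its summary, the merged result is approximate, which is the communication-for-precision trade mentioned on the previous slide.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def local_summary(packets: Iterable[Tuple[str, int]], k: int) -> Dict[str, int]:
    """Local part: total bytes per destination at one collection point,
    truncated to the k heaviest destinations before being sent on."""
    counts: Counter = Counter()
    for dst, length in packets:
        counts[dst] += length
    return dict(counts.most_common(k))


def global_merge(summaries: Iterable[Dict[str, int]], k: int) -> List[Tuple[str, int]]:
    """Global part: add up the local summaries and report the k heaviest
    destinations. Destinations that a site dropped from its summary are
    undercounted here, so the answer can differ from the exact one."""
    total: Counter = Counter()
    for summary in summaries:
        total.update(summary)  # Counter.update adds counts per key
    return total.most_common(k)
```

Only k entries per site cross the network instead of the raw trace. The same merge step could be applied at intermediate levels to form the hierarchical structure the slide suggests.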

Summary of Basic Problems and Techniques
Continuous queries over data streams is a unique combination of:
  – Online processing
  – Storage constraints -- amount of memory available is bounded
      Query result size may be unbounded
      Intermediate state may be unbounded
Relevant techniques
  – Online data structures (not build-and-throw)
  – Summarization: samples, histograms, wavelets, fractals
  – Adaptivity
      Data characteristics
      Flow rates
      Amount of memory
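As one concrete instance of "samples" under a fixed memory budget, the sketch below uses reservoir sampling (Algorithm R). The talk does not name a specific sampling method, so this choice and the function name `reservoir_sample` are assumptions made for illustration.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], size: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Keep a uniform random sample of at most `size` items from a stream
    of unknown length, using memory proportional to `size` only."""
    rng = rng or random.Random()
    sample: List[T] = []
    for i, item in enumerate(stream):
        if i < size:
            # Fill the reservoir with the first `size` items.
            sample.append(item)
        else:
            # Replace a random slot with probability size / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < size:
                sample[j] = item
    return sample
```

The sample is maintained online in a single pass, so it fits the "not build-and-throw" requirement and can answer approximate queries at any point in the stream.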

Some Simplifying Assumptions
In talk, but not necessarily in work
Traffic management data is clean
  – Data is dirty: incomplete, inconsistent
  – Temporal uncertainties
  – Could be reduced as the importance of traffic management is realized
Traffic management data is tuple-oriented
  – Often true
  – Implications for query language

Conclusions
Traffic management requires efficient data management
Many traffic management applications benefit from online data processing
Case for a Data Stream Management System (DSMS)
  – Provides continuous queries over data streams for online processing
  – Many interesting research issues
  – Work is in progress
Additional references
  – S. Babu and J. Widom. Continuous queries over data streams
  – STREAM project homepage