Ken Birman Cornell University. CS5410 Fall 2008..

Slides:



Advertisements
Similar presentations
Reliable Multicast for Time-Critical Systems Mahesh Balakrishnan Ken Birman Cornell University.
Advertisements

P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Multicast Fundamentals n The communication ways of the hosts n IP multicast n Application level multicast.
Dynamic Routing Scalable Infrastructure Workshop, AfNOG2008.
1 Improving the Performance of Distributed Applications Using Active Networks Mohamed M. Hefeeda 4/28/1999.
CS514: Intermediate Course in Operating Systems Professor Ken Birman Vivek Vishnumurthy: TA.
Internetworking Different networks –Different bit rates –Frame lengths –Protocols.
M ERCURY : A Scalable Publish-Subscribe System for Internet Games Ashwin R. Bharambe, Sanjay Rao & Srinivasan Seshan Carnegie Mellon University.
Carnegie Mellon University Complex queries in distributed publish- subscribe systems Ashwin R. Bharambe, Justin Weisz and Srinivasan Seshan.
588 Section 6 Neil Spring May 11, Schedule Notes – (1 slide) Multicast review –(3slides) RLM (the paper you didn’t read) –(3 slides) ALF & SRM –(8.
Matching Patterns Servers assemble sequences of notifications from smaller subsequences or from single notifications.This technique requires an advertisement.
Lecture 5A: Communication and Storage IT 202—Internet Applications Based on notes developed by Morgan Benton.
Oct 21, 2004CS573: Network Protocols and Standards1 IP: Addressing, ARP, Routing Network Protocols and Standards Autumn
An Overlay Multicast Infrastructure for Live/Stored Video Streaming Visual Communication Laboratory Department of Computer Science National Tsing Hua University.
Hermes: A Distributed Event- Based Middleware Architecture Peter Pietzuch and Jean Bacon 1st DEBS Workshop, Vienna,
Sage Insights 2015 Using the mobile and social benefits of Sage CRM to enhance your business. Ocean Helberg. Senior CRM Consultant.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
(part 3).  Switches, also known as switching hubs, have become an increasingly important part of our networking today, because when working with hubs,
SYSTEMS SUPPORT FOR GRAPHICAL LEARNING Ken Birman 1 CS6410 Fall /18/2014.
© 2009 AT&T Intellectual Property. All rights reserved. Multimedia content growth: From IP networks to Medianets Cisco-IEEE ComSoc Webinar. Sept. 23, 2009.
Achieving fast (approximate) event matching in large-scale content- based publish/subscribe networks Yaxiong Zhao and Jie Wu The speaker will be graduating.
1 Internet Protocol: Forwarding IP Datagrams Chapter 7.
1 Distributed Monitoring of Peer-to-Peer Systems By Serge Abiteboul, Bogdan Marinoiu Docflow meeting, Bordeaux.
Lecture 8 Page 1 Advanced Network Security Review of Networking Basics: Internet Architecture, Routing, and Naming Advanced Network Security Peter Reiher.
P.1Service Control Technologies for Peer-to-peer Traffic in Next Generation Networks Part2: An Approach of Passive Peer based Caching to Mitigate P2P Inter-domain.
HERO: Online Real-time Vehicle Tracking in Shanghai Xuejia Lu 11/17/2008.
Data Structures Week 5 Further Data Structures The story so far  We understand the notion of an abstract data type.  Saw some fundamental operations.
SYSTEMS SUPPORT FOR GRAPHICAL LEARNING Ken Birman 1 CS6410 Fall /18/2014.
Higashino Lab. Maximizing User Gain in Multi-flow Multicast Streaming on Overlay Networks Y.Nakamura, H.Yamaguchi and T.Higashino Graduate School of Information.
Content-Based Routing in Mobile Ad Hoc Networks Milenko Petrovic, Vinod Muthusamy, Hans-Arno Jacobsen University of Toronto July 18, 2005 MobiQuitous 2005.
Sharing Information across Congestion Windows CSE222A Project Presentation March 15, 2005 Apurva Sharma.
Streaming over Subscription Overlay Networks Department of Computer Science Iowa State University.
Cayuga: A General Purpose Event Monitoring System Mirek Riedewald Joint work with Alan Demers, Johannes Gehrke, Biswanath Panda, Varun Sharma (IIT Delhi),
Aditya Akella The Performance Benefits of Multihoming Aditya Akella CMU With Bruce Maggs, Srini Seshan, Anees Shaikh and Ramesh Sitaraman.
1 Emergency Alerts as RSS Feeds with Interdomain Authorization Filippo Gioachin 1, Ravinder Shankesi 1, Michael J. May 1,2, Carl A. Gunter 1, Wook Shin.
TOMA: A Viable Solution for Large- Scale Multicast Service Support Li Lao, Jun-Hong Cui, and Mario Gerla UCLA and University of Connecticut Networking.
Data Oriented Network Architecture (DONA) Andrey Ermolinskiy Mohit Chawla CS 262 A Project Poster December 14.
Bob Knowledge Plane -- Scaling of the WHY App Bob Braden, ISI 24 Sept 03.
Searching the web Enormous amount of information –In 1994, 100 thousand pages indexed –In 1997, 100 million pages indexed –In June, 2000, 500 million pages.
Enterprise Integration Patterns CS3300 Fall 2015.
Actualog Social PIM Helps Companies to Manage and Share Product Information Using Secure, Scalable Ease of Microsoft Azure MICROSOFT AZURE ISV PROFILE:
Accumulus Delivers Enterprise Class Subscription Billing and Automation Solutions for Gaming, Retail, and More on the Scalable Microsoft Azure Platform.
VLDB2005 CMS-ToPSS: Efficient Dissemination of RSS Documents Milenko Petrovic Haifeng Liu Hans-Arno Jacobsen University of Toronto.
Uni Innsbruck Informatik - 1 Network Support for Grid Computing... a new research direction! Michael Welzl DPS NSG Team
Data Integration Hanna Zhong Department of Computer Science University of Illinois, Urbana-Champaign 11/12/2009.
SocialVoD: a Social Feature-based P2P System Wei Chang, and Jie Wu Presenter: En Wang Temple University, PA, USA IEEE ICPP, September, Beijing, China1.
Information-Centric Networks10b-1 Week 10 / Paper 2 Hermes: a distributed event-based middleware architecture –P.R. Pietzuch, J.M. Bacon –ICDCS 2002 Workshops.
Peer-to-Peer Result Dissemination in High-Volume Data Filtering Shariq Rizvi and Paul Burstein CS 294-4: Peer-to-Peer Systems.
4: Network Layer4-1 Chapter 4: Network Layer Last time: r Internet routing protocols m RIP m OSPF m IGRP m BGP r Router architectures r IPv6 Today: r IPv6.
CS470 Computer Networking Protocols
Integration Patterns in BizTalk Server 2004 Integration Patterns Explained What are integration patterns? What patterns does BizTalk Server 2004 provide.
Ad Hoc On-Demand Distance Vector Routing (AODV) ietf
Avantida is Helping Ocean Carriers to Optimize Their Empty Shipping Container Flows, Based on the Highly Scalable Microsoft Azure Platform MICROSOFT AZURE.
Q and A, Ch. 21 IS333, Spring 2016 Victor Norman.
ETERE NUNZIO The ultimate end-to-end solution for your NewsRoom.
BIG DATA/ Hadoop Interview Questions.
Mobile IP THE 12 TH MEETING. Mobile IP  Incorporation of mobile users in the network.  Cellular system (e.g., GSM) started with mobility in mind. 
WHY VIDEO SURVELLIANCE
IP: Addressing, ARP, Routing
Optimising Streaming Systems with SDN/P4/NetFPGA
Recommender Systems & Collaborative Filtering
Introduction to Wireless Sensor Networks
CS4470 Computer Networking Protocols
Implementation of GPU based CCN Router
Net 323 D: Networks Protocols
Software Defined Networking (SDN)
Data-Centric Networking
WHY VIDEO SURVELLIANCE
CS5412: Using Gossip to Build Overlay Networks
EE 122: Lecture 22 (Overlay Networks)
Presentation transcript:

Ken Birman Cornell University. CS5410 Fall 2008.

Content filtering Two kinds of publish-subscribe Topic-based: A topic defines the group of receivers. Some systems allow you to subscribe to a pattern that matches sets of topics, by having a special “topics” meta- topic, but this is still topic-oriented For scaling, typically must map topics to a smaller set of multicast groups or overlays Content-based: A query determines the messages that each receiver will accept Can implement in a database or in an overlay

Challenges… Each approach has substantial challenges For topic-based systems, the “channelization” problem (mapping many topics to a small number of multicast channels or overlays) is very hard In the most general cases, channelization is NP-complete! Yet some form of channelization may be critical because few multicast mechanisms scale well if huge numbers of groups are needed Today we won’t look closely at the channelization problem, but may revisit it later if time permits Under some conditions, may be solvable

Challenges… What about content-based solutions? We need to ask how to express queries “on content” Could use Xquery, the new XML query language Or could define a special-purpose packet inspection solution, a so-called “deep packet inspector” Then would ideally want to build a smart overlay Any given packet routes towards its destinations… … and any given router optimizes so that it doesn’t have an amount of work proportional to the number of pending content queries

Scenarios When would content routing be helpful? In cloud systems, often want to route a request to some system that processed prior work of a related nature For example, if I interact with Premier Cru to purchase 2007 Rhone red wines, as I query their data center it could build up a cache of data. If my queries revisit the same nodes, they perform far better In (unpublished) work at Amazon.com, the company found that almost every service has “opinions” about how to route messages within service clusters!

Scenarios What about out in the wild? Here, imagine using content filtering as a way to query huge sets of RSS feeds User expresses “interests” and these map to content queries… which route exactly the right stuff to him/her IBM Gryphon project: used this model, assumed that clients would be corporate users (often stock traders) Siena: similar model but assumes more of a P2P community in the Internet WAN

Things known about settings? All of these settings are very different Amazon’s world is dominated by machine-controlled layout algorithms that selectively place services on clusters. Produces all sorts of “regularities” E.g. clones of aservice often subscribe to the same data And if A 0 and B 0 are collocated on node X, probably representatives of A and B will always be collocated IBM’s world is dominated by heavy-tailed interest behaviors: Traders specialize in various ways Siena world is more like a web search stream

Examples of issues raised Early work on IBM’s Gryphon platform focused on in- network aggregation of the queries They assumed that each message has an associated set of tags (attached by sender for efficiency) Subscription was a predicate over these tags Their focus was on combining the predicates, in the network, to avoid redundant work They got good results and even sold Gryphon as a product. But…

Thought question How often would you “expect” to have an opportunity to do in-network query combinations? Would you prefer to do an in-network solution, like Gryphon, or build a database solution like Cornell’s Cayuga, where events can also be stored?

… and the answer is For IBM’s corporate clients, there turned out to most often be just a single Gryphon router per data center, with WAN links between them In effect: Broadcast every event to all data centers This happens because “most” topics were of interest to at least someone in “most” data centers. Then filter at the last hop before delivery to client nodes Turns out that the router was fast enough for this model So all that in-network query combination work was unneeded in most client settings!

… and the rest of the answer? The majority of users had some form of archival storage unit in each data center It subscribes to everything and keeps copies So in effect, the average user “turned Gryphon into something much like Cayuga” Given this insight, Cayuga assumes full broadcast for event streams, focuses on a database model with rapid update rates. A more natural solution… Benefit? A single integrated story versus one network story coupled to a distinct database solution

What about Amazon? Amazon has lots of packet-inspection routers that peek inside data quickly and forward as appropriate Customized on a per-service basis Many packet formats… hence little commonality between these inspection “applets” Motivates Cornell’s current work on “featherweight processes” to inspect packets at line speeds and exploit properties of multicore machines for scalability

Taking us to… Siena Relatively popular Claimed user community of a few hundred thousand downloads Perhaps a few thousand of whom actually use the system Little known about the actual users Today we’ll look at a slide set generously provided by the development team

Remainder of today’s talk We’ll dive down to look closely at Siena Covering all three scenarios is just more than we have time to do Siena