SCAN: a Scalable, Adaptive, Secure and Network-aware Content Distribution Network Yan Chen CS Department Northwestern University.



Motivation
The Internet has evolved to become a commercial infrastructure for service delivery
  – Web delivery, VoIP, streaming media …
Challenges for Internet-scale services
  – Scalability: 600M users, 35M Web sites, 2.1 Tb/s
  – Efficiency: bandwidth, storage, management
  – Agility: dynamic clients/network/servers
  – Security: proliferating attacks/viruses/worms
E.g., content delivery: Content Distribution Network (CDN)
  – Web delivery
  – Grid computing

How CDN Works

Challenges for CDN
Content Location
  – Find nearby replicas with good DoS attack resilience
  – Dynamic, scalable semantic search
Replica Deployment
  – Dynamics, efficiency
  – Client QoS (latency, coherence) and server capacity constraints
Replica Management
  – Replica index state maintenance scalability
Adaptation to Network Congestion/Failures
  – Overlay monitoring scalability and accuracy
Security
  – Proactive anomaly/intrusion detection on high-speed network

SCAN: Scalable Content Access Network
  – Provision: Dynamic Replication + Update Multicast Tree Building
  – Replica Management: (Incremental) Content Clustering
  – Network End-to-End Distance Monitoring (latency & loss rate)
  – DHT-based Replica Location: Network DoS Attack Resilient & Semantic Search Support
  – Proactive Anomaly/Intrusion Detection on High-speed Network

Replica Location (security)
Existing Work and Problems
  – Centralized, Replicated and Distributed Directory Services
  – No security benchmarking: which one has the best DoS attack resilience?
Solution
  – Proposed the first simulation-based network DoS resilience benchmark
  – Applied it to compare three directory services
  – The DHT-based Distributed Directory Service has the best resilience in practice
Publication
  – 3rd Int. Conf. on Info. and Comm. Security (ICICS), 2001

Replica Location (semantic search)
Existing Work and Problems
  – Mostly keyword/title based search
  – Emerging semantic search systems, but static, unscalable
Solution
  – Apply DHT to distribute the indices
  – Use “concept indexing” to incrementally grow the semantic space => incrementally add new concepts & documents
  – Group the indices based on semantic locality => semantic routing, better query accuracy and efficiency
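As a rough illustration of "apply DHT to distribute the indices", the sketch below places index entries on nodes with a plain consistent-hashing ring. It is not the scheme from the talk: the node names, the `concept:` keys, the `IndexRing` class and the SHA-1 ring hash are all assumptions made for the example, and hashing concept names this way does not preserve the semantic locality the slide calls for.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a point on a 2**32 identifier ring (SHA-1 is an assumed choice)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest()[:4], "big")


class IndexRing:
    """Toy consistent-hashing ring: a key is owned by its clockwise successor node."""

    def __init__(self, nodes):
        self._ring = sorted((ring_position(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def owner(self, key: str) -> str:
        # First node at or after the key's position, wrapping around the ring.
        i = bisect.bisect_left(self._positions, ring_position(key))
        return self._ring[i % len(self._ring)][1]


# Hypothetical nodes and concept keys, for illustration only.
nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = IndexRing(nodes)
concepts = [f"concept:{i}" for i in range(200)]
placement = {c: ring.owner(c) for c in concepts}

# Adding a node only moves the keys that the new node takes over.
bigger = IndexRing(nodes + ["node-e"])
moved = [c for c in concepts if bigger.owner(c) != placement[c]]
```

The point of the ring is the last two lines: when the set of nodes grows, index entries move only to the newly added node, which is what makes incremental growth of the index cheap.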

Replica Placement & Coherence Support
Existing Work and Problems
  – Static placement
  – Dynamic but inefficient placement
  – No coherence support
Solution
  – Dynamically place a close-to-optimal number of replicas under client QoS (latency) and server capacity constraints
  – Self-organize replicas into a scalable application-level multicast tree for disseminating updates
  – Uses overlay network topology only
Publication
  – IPTPS 2002, Pervasive Computing 2002
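To make the two constraints concrete (a client latency bound and a per-server capacity), here is a minimal greedy sketch. It is a generic covering heuristic, not the placement algorithm from the cited papers, and it assumes a full latency table is available, whereas the talk's approach works from overlay topology only. The function name, the latency table and the limits are hypothetical.

```python
def place_replicas(latency, max_latency, capacity):
    """Greedy heuristic: repeatedly open the server that can take the most unserved clients.

    latency[server][client] is a latency in ms. A replica serves at most `capacity`
    clients, and only clients whose latency to it is within `max_latency`.
    Returns {server: [clients]}; raises ValueError if some client cannot be served.
    """
    unserved = {c for row in latency.values() for c in row}
    assignment = {}
    while unserved:
        best_server, best_clients = None, []
        for server, row in latency.items():
            if server in assignment:
                continue  # this server already hosts a replica
            eligible = sorted(
                (c for c in unserved if row.get(c, float("inf")) <= max_latency),
                key=lambda c: (row[c], c),
            )[:capacity]
            if len(eligible) > len(best_clients):
                best_server, best_clients = server, eligible
        if best_server is None:
            raise ValueError(f"cannot serve within {max_latency} ms: {sorted(unserved)}")
        assignment[best_server] = best_clients
        unserved -= set(best_clients)
    return assignment


# Hypothetical measurements (ms) for three candidate servers and four clients.
latency = {
    "s1": {"c1": 10, "c2": 20, "c3": 80, "c4": 90},
    "s2": {"c1": 70, "c2": 60, "c3": 15, "c4": 25},
    "s3": {"c1": 30, "c2": 30, "c3": 30, "c4": 30},
}
placement = place_replicas(latency, max_latency=50, capacity=2)
```

With these numbers two replicas (on `s1` and `s2`) cover all four clients, so `s3` is never opened. A greedy choice like this gives no optimality guarantee; it only shows how the latency and capacity constraints interact.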

Replica Management
Existing Work and Problems
  – Cooperative access for good efficiency requires maintaining replica indices
  – Per Website replication, scalable, but poor performance
  – Per URL replication, good performance, but unscalable
Solution
  – Clustering-based replication reduces the overhead significantly without sacrificing much performance
  – Proposed a unique online Web object popularity prediction scheme based on hyperlink structures
  – Online incremental clustering and replication to push replicas before accessed
Publication
  – ICNP 2002, IEEE J-SAC 2003

Adaptation to Network Congestion/Failures
Existing Work and Problems
  – Latency estimation systems are scalable, but cannot monitor congestion/failures, which requires n^2 measurements for n end hosts
Solution
  – Tomography-based Overlay Monitoring (TOM): selectively monitor a basis set of O(n log n) paths to infer the loss rates of other paths
  – Works in real-time, adapts to topology changes, has good load balancing and tolerates topology errors
  – Built an adaptive overlay streaming media system on top of TOM
  – Root-cause diagnosis in progress
Publication
  – Modeling: SIGCOMM IMC 2003 (extended abstract)
  – Full version under submission
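The idea of monitoring only a basis set of paths can be sketched with a little linear algebra, under the assumption that link losses are independent, so that log(1 - loss) adds up along a path. Each path is a 0/1 row over links; a maximal linearly independent subset of rows is measured, and every other path is expressed as a combination of those rows. The function names, the three-link topology and the loss values below are made up for illustration; this is not the TOM implementation.

```python
from fractions import Fraction
import math


def select_basis(rows):
    """Return (basis, coeffs): indices of linearly independent rows, and for every row
    a dict {basis_index: coefficient} that reproduces it from the basis rows."""
    reduced = []  # (pivot column, reduced vector, its expression over basis rows)
    basis, coeffs = [], []
    for idx, row in enumerate(rows):
        vec = [Fraction(x) for x in row]
        used = {}  # invariant: vec == row - sum(used[b] * rows[b])
        for pivot, rvec, rexpr in reduced:
            if vec[pivot] != 0:
                factor = vec[pivot] / rvec[pivot]
                vec = [a - factor * b for a, b in zip(vec, rvec)]
                for b_idx, c in rexpr.items():
                    used[b_idx] = used.get(b_idx, Fraction(0)) + factor * c
        pivot = next((j for j, x in enumerate(vec) if x != 0), None)
        if pivot is None:
            coeffs.append(used)  # dependent path: can be inferred, no probing needed
        else:
            expr = {b_idx: -c for b_idx, c in used.items()}
            expr[idx] = Fraction(1)
            reduced.append((pivot, vec, expr))
            basis.append(idx)
            coeffs.append({idx: Fraction(1)})
    return basis, coeffs


def infer_loss(coeff, measured_loss):
    """Infer a path's loss rate from the measured loss rates of basis paths."""
    log_survival = sum(float(c) * math.log(1.0 - measured_loss[b]) for b, c in coeff.items())
    return 1.0 - math.exp(log_survival)


# Hypothetical topology: 3 links, 4 overlay paths (rows are path-by-link incidence).
paths = [[1, 1, 0], [0, 1, 1], [1, 0, 0], [0, 0, 1]]
basis, coeffs = select_basis(paths)
# Made-up measurements on the three basis paths (link losses 0.10, 0.20, 0.05).
measured = {0: 1 - 0.9 * 0.8, 1: 1 - 0.8 * 0.95, 2: 0.10}
inferred = infer_loss(coeffs[3], measured)
```

Here the first three paths form the basis and the fourth path's loss rate (0.05) is recovered without probing it. Exact `Fraction` arithmetic keeps the independence test free of rounding error; a real system would use a numerical factorization instead.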

Proactive Anomaly/Intrusion Detection on High-speed Network
Existing Work and Problems
  – A/I detection requires flow-level traffic monitoring, unscalable for high-speed network
  – Most IDS are signature-based, only for known attacks
Solution
  – Leverage “K-ary sketch”, a compact probabilistic summary of flow-level traffic, constant update/query cost, linearity
  – Use statistical methods, like Hidden Markov Model (HMM) and time series analysis for proactive detection
  – Profile characteristics of new apps to reduce false positive
Publication
  – K-ary sketch: SIGCOMM IMC 2003
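The properties listed for the K-ary sketch (constant update/query cost, linearity) can be seen in a small sketch of the data structure: several hash rows of K counters each, an estimate that subtracts the expected collision mass and takes the median across rows, and a linear combination of two sketches. This is a simplified illustration; salted SHA-256 stands in for the 4-universal hash functions a real implementation would use, and the class name, default sizes, flow keys and byte counts are hypothetical.

```python
import hashlib
import statistics


class KarySketch:
    """Simplified k-ary sketch: `depth` hash rows, each with `width` (K) counters."""

    def __init__(self, depth=5, width=1024):
        self.depth, self.width = depth, width
        self.table = [[0.0] * width for _ in range(depth)]
        self.total = 0.0

    def _bucket(self, row, key):
        # Assumption: salted SHA-256 replaces the 4-universal hashes of the original design.
        digest = hashlib.sha256(f"{row}:{key}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, key, value):
        """Add `value` to the key's counter in every row (cost independent of #flows)."""
        for row in range(self.depth):
            self.table[row][self._bucket(row, key)] += value
        self.total += value

    def estimate(self, key):
        """Median over rows of the collision-corrected counter for `key`."""
        k = self.width
        per_row = [
            (self.table[row][self._bucket(row, key)] - self.total / k) / (1 - 1 / k)
            for row in range(self.depth)
        ]
        return statistics.median(per_row)

    def combine(self, other, a=1.0, b=1.0):
        """Linearity: return the sketch of a*self + b*other (same depth and width)."""
        out = KarySketch(self.depth, self.width)
        out.table = [
            [a * x + b * y for x, y in zip(r1, r2)]
            for r1, r2 in zip(self.table, other.table)
        ]
        out.total = a * self.total + b * other.total
        return out


# Hypothetical traffic: per-flow byte counts in two intervals; one flow spikes.
prev, now = KarySketch(), KarySketch()
for i in range(20):
    prev.update(f"10.0.0.{i}", 1000.0)
    now.update(f"10.0.0.{i}", 10000.0 if i == 9 else 1000.0)
change = now.combine(prev, 1.0, -1.0)  # sketch of the per-flow change
spike = change.estimate("10.0.0.9")
```

Because sketches combine linearly, the difference (or a time-series forecast error) can be computed on the summaries themselves and only then queried per key, without keeping per-flow state. Here the estimated change for the spiking flow is about 9000, while unchanged flows estimate to values near zero.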