Architecture of Software Systems – Lecture 8
Massively Distributed Architectures: Reliability, Failover … and failures
Martin Rehák

Motivation
Internet-based business models imposed new requirements on computational architectures and inspired new solutions:
– Extremely low cost – the service is free
– Extremely large user numbers
– Versatility of queries
– New collaboration patterns

Overview
Three approaches to massively distributed systems:
– Google: Map-Reduce, GFS, …
– Yahoo/Apache: Hadoop (Map-Reduce, HDFS, …)
– KaZaA (P2P example: filesharing, Skype, …)

Distributed Map-Reduce
Inspired and originally used by Google – patent (?) on the technology.
A massively parallel approach based on functional programming primitives.
Used to perform data storage, manipulation and search in high-volume, simple-structure databases.
Dean and Ghemawat: MapReduce: Simplified Data Processing on Large Clusters, 2004

Map/Reduce primitives
Originally inspired by Lisp/functional programming primitives.
Operations on key/value pairs, in two phases:
– Map: applies a function (filter) to a set of records/elements in the list
– Reduce: shortens the list by applying an aggregation function
Following example from: http://ayende.com/Blog/archive/2010/03/14/map-reduce-ndash-a-visual-explanation.aspx

Map/Reduce Example { "type": "post", "name": "Raven's Map/Reduce functionality", "blog_id": 1342, "post_id": , "tags": ["raven", "nosql"], "post_content": "... ", "comments": [ { "source_ip": ' ', "author": "martin", "txt": "..." }]} Set of blog posts Count the number of comments per blog … distributed on cluster … millions of blogs

The query (C#)
Map:
  from post in docs.posts
  select new { post.blog_id, comments_length = post.comments.Length };
Reduce:
  from agg in results
  group agg by agg.blog_id into g
  select new { blog_id = g.Key, comments_length = g.Sum(x => x.comments_length) };
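The index definition above is RavenDB-style and not standalone code. A minimal, self-contained LINQ sketch of the same map and reduce steps over hypothetical in-memory posts (field and type names are assumptions):

using System;
using System.Collections.Generic;
using System.Linq;

class CommentsPerBlog
{
    // Hypothetical post record: a blog id and the list of comments on that post.
    record Post(int BlogId, string[] Comments);

    static void Main()
    {
        var docs = new List<Post>
        {
            new Post(1342, new[] { "nice", "thanks" }),
            new Post(1342, new[] { "+1" }),
            new Post(7,    new[] { "first!", "second", "third" })
        };

        // Map: emit (blog_id, number of comments) for every post.
        var mapped = docs.Select(p => new { p.BlogId, CommentsLength = p.Comments.Length });

        // Reduce: group by blog_id and sum the per-post counts.
        var reduced = mapped
            .GroupBy(m => m.BlogId)
            .Select(g => new { BlogId = g.Key, CommentsLength = g.Sum(m => m.CommentsLength) });

        foreach (var r in reduced)
            Console.WriteLine($"blog {r.BlogId}: {r.CommentsLength} comments");
    }
}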

Map

Deployment
Google: large farms of commodity (now custom-built) hardware.
Distributed storage, asynchronous interface.
Map: parallelized by splitting the data and assigning the splits to different nodes to process.
Reduce: partition the keys into R pieces using a pseudo-random partition with uniform distribution – Hash(key) mod R (see the sketch below).
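A minimal sketch of the reduce-side partitioning step, Hash(key) mod R, with a hypothetical key format and reducer count; this is an illustration, not Google's actual hash function:

using System;

class Partitioner
{
    // Assign a key to one of R reduce partitions: Hash(key) mod R.
    static int PartitionFor(string key, int r)
    {
        // Mask off the sign bit so the modulo result is always non-negative.
        int hash = key.GetHashCode() & int.MaxValue;
        return hash % r;
    }

    static void Main()
    {
        const int R = 4; // hypothetical number of reduce workers
        foreach (var key in new[] { "blog:1342", "blog:7", "blog:99" })
            Console.WriteLine($"{key} -> partition {PartitionFor(key, R)}");
    }
}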

Runtime

1. Split the input files into M pieces.
2. Select a master node; the other nodes are workers.
3. Workers read their assigned tasks/data and apply map; results are stored in memory as (key, value) pairs.
4. Results are periodically written to local disk storage, already partitioned into R regions.
5. Reduce workers are notified by the master about the data location.
6. Each reduce worker reads the buffered data via RPC.
7. The data is sorted by key.
8. The worker iterates over the sorted data and applies reduce to all subsets with the same key.
9. Reduce results are appended to the final results.
10. Another Map-Reduce pass may follow.
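A single-process sketch of this flow for word counting, assuming M in-memory input splits and R reduce partitions; there is no real master, worker pool, local disk or RPC here:

using System;
using System.Collections.Generic;
using System.Linq;

class MiniMapReduce
{
    static void Main()
    {
        const int R = 2; // hypothetical number of reduce partitions
        string[] splits = { "the cat sat", "the dog sat", "the cat ran" }; // M = 3 input splits

        // Map phase: each split produces (word, 1) pairs, partitioned by Hash(key) mod R.
        var partitions = new List<(string Key, int Value)>[R];
        for (int i = 0; i < R; i++) partitions[i] = new List<(string, int)>();

        foreach (var split in splits)
            foreach (var word in split.Split(' '))
                partitions[(word.GetHashCode() & int.MaxValue) % R].Add((word, 1));

        // Reduce phase: within each partition, sort by key and sum the values per key.
        for (int r = 0; r < R; r++)
        {
            var reduced = partitions[r]
                .OrderBy(kv => kv.Key)
                .GroupBy(kv => kv.Key)
                .Select(g => (Word: g.Key, Count: g.Sum(kv => kv.Value)));

            foreach (var (word, count) in reduced)
                Console.WriteLine($"partition {r}: {word} = {count}");
        }
    }
}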

Applications (from the original Google paper)
– Distributed grep
– URL access/reference counting
– Reverse web-link graph (behind Google search)
– Semantic search – term vector per host
– Inverted index (see the sketch below)
– Distributed sort
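As an illustration of the inverted-index case, a minimal in-memory sketch that maps each word occurrence to a (word, docId) pair and reduces by grouping on the word; the document ids and contents are hypothetical:

using System;
using System.Collections.Generic;
using System.Linq;

class InvertedIndex
{
    static void Main()
    {
        // Hypothetical corpus: document id -> contents.
        var docs = new Dictionary<int, string>
        {
            [1] = "distributed systems scale",
            [2] = "distributed hash tables",
            [3] = "hash functions scale"
        };

        // Map: emit (word, docId) for every word occurrence.
        var pairs = docs.SelectMany(d => d.Value.Split(' ').Select(w => (Word: w, DocId: d.Key)));

        // Reduce: group by word and collect the sorted, distinct posting list.
        var index = pairs
            .GroupBy(p => p.Word)
            .ToDictionary(g => g.Key, g => g.Select(p => p.DocId).Distinct().OrderBy(id => id).ToList());

        foreach (var entry in index.OrderBy(e => e.Key))
            Console.WriteLine($"{entry.Key}: [{string.Join(", ", entry.Value)}]");
    }
}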

KaZaA and Other Peer-to-Peer Systems
A family of protocols and technologies – the “most typical” peer-to-peer variant.
Used in a variety of products and applications:
– Filesharing
– Skype
– Botnet Command & Control
– …
Reading & Resources:
– Liang, Kumar, Ross: Understanding KaZaA
– Barabási/Bonabeau: Scale-Free Networks, SciAm, May 2003

Napster and Gnutella
Napster: a peer-to-peer network with a centralized directory element
– Obvious security (and legal) implications
Gnutella-like flat peer-to-peer:
– Peers (nodes) are equal
– Bootstrapping upon startup – looking for other peers: horizontal scanning is a bad design choice, so a local cache, web/UDP caches or IRC are used instead
– The system creates random connections to a specific number of currently active peers
– Peers exchange info about other peers

Search (Gnutella)
The user specifies search terms; the search is sent to and performed on actively connected peer nodes.
– A low number of connections (fewer than 10) implies the need for request forwarding
– High overhead of search communication, even with hop limits (see the flooding sketch after this list)
Introduction of “ultrapeers” in recent versions:
– Nodes (hubs) with a high number of connections
– Dense interconnection between ultrapeers
– Design influenced by more recent architectures
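A minimal sketch of Gnutella-style flooded search with a hop (TTL) limit over a small in-memory overlay; the peer ids, topology, shared files and TTL value are all hypothetical:

using System;
using System.Collections.Generic;
using System.Linq;

class GnutellaFlood
{
    // Adjacency list of the overlay: peer id -> directly connected peers.
    static readonly Dictionary<int, int[]> Neighbors = new()
    {
        [1] = new[] { 2, 3 },
        [2] = new[] { 1, 4 },
        [3] = new[] { 1, 4, 5 },
        [4] = new[] { 2, 3 },
        [5] = new[] { 3 }
    };

    // Files shared by each peer.
    static readonly Dictionary<int, string[]> Shared = new()
    {
        [4] = new[] { "song.mp3" },
        [5] = new[] { "talk.avi" }
    };

    // Forward the query to all neighbors, decrementing the TTL at each hop.
    static IEnumerable<int> Search(int origin, string file, int ttl)
    {
        var visited = new HashSet<int>();
        var frontier = new Queue<(int Peer, int Ttl)>();
        frontier.Enqueue((origin, ttl));

        while (frontier.Count > 0)
        {
            var (peer, t) = frontier.Dequeue();
            if (!visited.Add(peer)) continue;
            if (Shared.TryGetValue(peer, out var files) && files.Contains(file))
                yield return peer;
            if (t == 0) continue; // hop limit reached, stop forwarding
            foreach (var n in Neighbors[peer])
                frontier.Enqueue((n, t - 1));
        }
    }

    static void Main() =>
        Console.WriteLine("hits: " + string.Join(", ", Search(origin: 1, file: "song.mp3", ttl: 3)));
}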

Random Graphs A.-L. Barabási, Scale-Free Networks, SciAm 288

FastTrack Protocol
Respects the scale-free nature of real-world networks.
Introduces two populations of nodes: Ordinary Nodes (ON) and Supernodes (SN).
Functionally separates:
– network maintenance – hub selection, connection, etc.
– lookup (equivalent to service discovery/directory)
– business logic (high-volume P2P transmissions)

Initial Protocol Stages/Operations
An ON connects to an SN upon startup:
– The SN is selected randomly from the list of available SNs
– The ON shares its service list (file names/hashes) with the selected SN
The SN maintains its database based on updates from connected ONs:
– SN database record: file name, file size, content hash, metadata (descriptor), node (IP)
Peers are heterogeneous – bandwidth, CPU, latency, NAT, … influence selection for the SN role.
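A minimal sketch of the supernode-side index implied by this record format; the type and field names are assumptions, not FastTrack's wire format:

using System;
using System.Collections.Generic;

// Hypothetical record stored by a supernode for every file advertised by a connected ON.
record FileEntry(string FileName, long FileSize, string ContentHash, string Metadata, string NodeIp);

class SupernodeIndex
{
    // ContentHash -> all ordinary nodes currently advertising that content.
    private readonly Dictionary<string, List<FileEntry>> byHash = new();

    // Called when a connected ON uploads (or refreshes) its shared-file list.
    public void Register(FileEntry entry)
    {
        if (!byHash.TryGetValue(entry.ContentHash, out var list))
            byHash[entry.ContentHash] = list = new List<FileEntry>();
        list.Add(entry);
    }

    public IReadOnlyList<FileEntry> Lookup(string contentHash) =>
        byHash.TryGetValue(contentHash, out var list) ? (IReadOnlyList<FileEntry>)list : Array.Empty<FileEntry>();
}

class SupernodeIndexDemo
{
    static void Main()
    {
        var sn = new SupernodeIndex();
        sn.Register(new FileEntry("song.mp3", 4_200_000, "abc123", "bitrate=192kbps", "10.0.0.5"));
        foreach (var e in sn.Lookup("abc123"))
            Console.WriteLine($"{e.FileName} @ {e.NodeIp}");
    }
}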

Search
In most implementations, supernodes only hold information about their directly connected ONs.
This is due to caching, scalability and explosive size-growth problems.
Search and downloads are de-coupled in the architecture; both operations are associated through the content hash – an (assumed) unique resource ID.

Search Operation
The user specifies the content.
The ON sends the query to its SN (through a TCP connection); the SN returns the IP addresses and metadata (as a list).
SNs maintain an overlay core network between themselves – its topology changes very frequently.
Queries can be forwarded to other SNs; queries cover a small subset of the SNs (long-term stable due to topology changes).

Connections/Traffic
– Signaling: handshakes, metadata exchange, supernode-list exchanges, queries, replies
– File transfer traffic (phone calls): ON to ON directly; a reflector may be used to get through firewalls (Skype)
– Advertisement (HTTP, original KaZaA only)
– Instant messaging, Base64-encoded (original KaZaA only)
– Skype encrypts all traffic and maintains persistent UDP connections between ON and SN

Search vs. Connectivity
Some peer implementations may re-connect to a different SN during the search for a particular file/resource.
This is CPU- and communication-intensive:
– File lists are exchanged with the new SN
– (Remember that ON state is not cached by the SN)
User activity can therefore create second-order effects in the network, e.g. massive searches for a scarce resource.

Search & Content Hash User (ON) performs metadata search ON receives the list of DB records User selects the best resource – Resource is described by ContentHash The system uses the ContentHash to identify the best ON(s) with the desired resource – Transparent download optimization – Transparent content management

Nodes and Supernodes

Random Node Failure, Random Networks

Random Node Failure, Scale-Free Networks

Attacks on Hubs

Reliability
“We have determined that, as part of the signalling traffic, KaZaA nodes frequently exchange with each other lists of supernodes. ONs keep a list of up to 200 SNs whereas SNs appear to maintain lists of thousands of SNs. When a peer A (ON or SN) receives a supernode list from another peer B, peer A will typically purge some of the entries from its local list and add entries sent by peer B.”
– Liang et al., Understanding KaZaA
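A minimal sketch of the list-merging behaviour described in the quote, using the roughly 200-entry ON cap it mentions; the purge policy (drop oldest first) and the address format are purely assumptions:

using System;
using System.Collections.Generic;
using System.Linq;

class SupernodeListCache
{
    private readonly int capacity;
    private readonly List<string> supernodes = new();

    public SupernodeListCache(int capacity) => this.capacity = capacity;

    // Merge a supernode list received from another peer: add unknown entries,
    // then purge down to capacity (ONs keep ~200 entries per the quote above).
    public void Merge(IEnumerable<string> received)
    {
        foreach (var sn in received)
            if (!supernodes.Contains(sn))
                supernodes.Add(sn);

        while (supernodes.Count > capacity)
            supernodes.RemoveAt(0); // drop the oldest entries first (assumed policy)
    }

    public int Count => supernodes.Count;
}

class SupernodeListDemo
{
    static void Main()
    {
        var cache = new SupernodeListCache(capacity: 200);
        cache.Merge(Enumerable.Range(0, 250).Select(i => $"10.0.{i / 256}.{i % 256}:1214"));
        Console.WriteLine(cache.Count); // prints 200
    }
}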

Importance