Sketching, Sampling and other Sublinear Algorithms: Algorithms for parallel models Alex Andoni (MSR SVC)

Slides:



Advertisements
Similar presentations
Network Design with Degree Constraints Guy Kortsarz Joint work with Rohit Khandekar and Zeev Nutov.
Advertisements

Optimal PRAM algorithms: Efficiency of concurrent writing “Computer science is no more about computers than astronomy is about telescopes.” Edsger Dijkstra.
A Model of Computation for MapReduce
Great Theoretical Ideas in Computer Science for Some.
1 Discrete Structures & Algorithms Graphs and Trees: III EECE 320.
CSCI-455/552 Introduction to High Performance Computing Lecture 11.
CIS December '99 Introduction to Parallel Architectures Dr. Laurence Boxer Niagara University.
Parallel Algorithms for Geometric Graph Problems Grigory Yaroslavtsev 361 Levine STOC 2014, joint work with Alexandr Andoni, Krzysztof.
Parallel Algorithms for Geometric Graph Problems Alex Andoni (Microsoft Research) Joint with: Aleksandar Nikolov (Rutgers), Krzysztof Onak (IBM), Grigory.
On Sketching Quadratic Forms Robert Krauthgamer, Weizmann Institute of Science Joint with: Alex Andoni, Jiecao Chen, Bo Qin, David Woodruff and Qin Zhang.
1 University of Freiburg Computer Networks and Telematics Prof. Christian Schindelhauer Wireless Sensor Networks 21st Lecture Christian Schindelhauer.
1 Distributed Computing Algorithms CSCI Distributed Computing: everything not centralized many processors.
7.3 Kruskal’s Algorithm. Kruskal’s Algorithm was developed by JOSEPH KRUSKAL.
1 Algorithms for Large Data Sets Ziv Bar-Yossef Lecture 8 May 4, 2005
Sublinear Algorithms for Approximating Graph Parameters Dana Ron Tel-Aviv University.
Lecture 12 Minimum Spanning Tree. Motivating Example: Point to Multipoint Communication Single source, Multiple Destinations Broadcast – All nodes in.
Minimum Spanning Trees CIS 606 Spring Problem A town has a set of houses and a set of roads. A road connects 2 and only 2 houses. A road connecting.
An Interval MST Procedure Rebecca Nugent w/ Werner Stuetzle November 16, 2004.
CSE 421 Algorithms Richard Anderson Lecture 10 Minimum Spanning Trees.
1 CSE 417: Algorithms and Computational Complexity Winter 2001 Lecture 11 Instructor: Paul Beame.
Design and Analysis of Algorithms Minimum Spanning trees
Lecture 27 CSE 331 Nov 6, Homework related stuff Solutions to HW 7 and HW 8 at the END of the lecture Turn in HW 7.
Minimum Spanning Trees. Subgraph A graph G is a subgraph of graph H if –The vertices of G are a subset of the vertices of H, and –The edges of G are a.
Sketching, Sampling and other Sublinear Algorithms: Streaming Alex Andoni (MSR SVC)
CSCI 3160 Design and Analysis of Algorithms Tutorial 12
Data Structures and Algorithms Graphs Minimum Spanning Tree PLSD210.
Nirmalya Roy School of Electrical Engineering and Computer Science Washington State University Cpt S 223 – Advanced Data Structures Graph Algorithms: Minimum.
0 Course Outline n Introduction and Algorithm Analysis (Ch. 2) n Hash Tables: dictionary data structure (Ch. 5) n Heaps: priority queue data structures.
UNC Chapel Hill Lin/Foskey/Manocha Minimum Spanning Trees Problem: Connect a set of nodes by a network of minimal total length Some applications: –Communication.
RESOURCES, TRADE-OFFS, AND LIMITATIONS Group 5 8/27/2014.
7.1 and 7.2: Spanning Trees. A network is a graph that is connected –The network must be a sub-graph of the original graph (its edges must come from the.
Module 5 – Networks and Decision Mathematics Chapter 23 – Undirected Graphs.
Data Structures and Algorithms A. G. Malamos
Minimum Spanning Trees Easy. Terms Node Node Edge Edge Cut Cut Cut respects a set of edges Cut respects a set of edges Light Edge Light Edge Minimum Spanning.
Frank Dehnewww.dehne.net Parallel Data Cube Data Mining OLAP (On-line analytical processing) cube / group-by operator in SQL.
Data Structures and Algorithms in Parallel Computing Lecture 2.
Sketching, Sampling and other Sublinear Algorithms: Euclidean space: dimension reduction and NNS Alex Andoni (MSR SVC)
11 Algorithmic Techniques for Massive Data (COMS ) Alex Andoni.
Sorting 1. Insertion Sort
11 Lecture 24: MapReduce Algorithms Wrap-up. Admin PS2-4 solutions Project presentations next week – 20min presentation/team – 10 teams => 3 days – 3.
Prims Algorithm for finding a minimum spanning tree
Tree Diagrams A tree is a connected graph in which every edge is a bridge. There can NEVER be a circuit in a tree diagram!
ETH Zurich – Distributed Computing Group Stephan HolzerSODA Stephan Holzer Silvio Frischknecht Roger Wattenhofer Networks Cannot Compute Their Diameter.
COMP7330/7336 Advanced Parallel and Distributed Computing Task Partitioning Dr. Xiao Qin Auburn University
COMP7330/7336 Advanced Parallel and Distributed Computing Task Partitioning Dynamic Mapping Dr. Xiao Qin Auburn University
Minimum Spanning Tree. p2. Minimum Spanning Tree G=(V,E): connected and undirected w: E  R, weight function a b g h i c f d e
Kruskal’s Algorithm for Computing MSTs Section 9.2.
Divide-and-Conquer MST
Minimum Spanning Tree Chapter 13.6.
Lecture 22 Minimum Spanning Tree
Data Structures and Algorithms in Parallel Computing
Sublinear Algorithmic Tools 2
MST in Log-Star Rounds of Congested Clique
Graph Algorithm.
Parallel Algorithms for Geometric Graph Problems
CIS 700: “algorithms for Big Data”
Connected Components Minimum Spanning Tree
Minimum Spanning Tree.
Autumn 2015 Lecture 11 Minimum Spanning Trees (Part II)
كلية المجتمع الخرج البرمجة - المستوى الثاني
4-4 Graph Theory Trees.
Sublinear Algorihms for Big Data
CSCI B609: “Foundations of Data Science”
Richard Anderson Lecture 10 Minimum Spanning Trees
CSE 373: Data Structures and Algorithms
CS 584 Project Write up Poster session for final Due on day of final
Minimum spanning trees
Minimum Spanning Trees (MSTs)
Clustering The process of grouping samples so that the samples are similar within each group.
Prim’s Minimum Spanning Tree Algorithm Neil Tang 4/1/2008
Presentation transcript:

Sketching, Sampling and other Sublinear Algorithms: Algorithms for parallel models Alex Andoni (MSR SVC)

Parallel Models  Data cannot be seen by one machine  Distributed across many machines  MapReduce, Hadoop, Dryad,…  Algorithmic tools for the models?  very incipient!

Types of problems  0. Statistics: 2 nd moment of the frequency  1. Sort n numbers  2. s-t connectivity in a graph  3. Minimum Spanning Tree on a graph  … many more!

Computational Model

Model Constraints

PRAMs

Problem 0: Statistics IP

Problem 1: sorting

Problem 2: graph connectivity VS

Problems 3: geometric graphs

Problem: Geometric MST [A-Nikolov-Onak-Yaroslavtsev’??]

General Approach  Partition the space hierarchically in a “nice way”  In each part  Compute a pseudo-solution to the problem  Sketch the pseudo-solution with small space  Send the sketch to be used in the next level/round

MST algorithm: attempt 1  Partition the space hierarchically in a “nice way”  In each part  Compute a pseudo-solution to the problem  Sketch the pseudo-solution with small space  Send the sketch to be used in the next level/round quad trees! compute MST send any point as a representative

Troubles  Quad tree can cut MST edges  forcing irrevocable decisions  Choose a wrong representative

MST algorithm: final

MST algorithm: Glimpse of analysis

Finale  Gotta love your models:  Streaming:  sub-linear space  see all data sequentially  Parallel computing:  sub-linear space per machine  data distributed over many machines  communication (rounds) expensive  Algorithmic tools in development!