Large-scale Machine Learning using DryadLINQ Mihai Budiu Microsoft Research, Silicon Valley Ambient Intelligence: From Sensor Networks to Smart Environments.




Presentation transcript:

Large-scale Machine Learning using DryadLINQ Mihai Budiu Microsoft Research, Silicon Valley Ambient Intelligence: From Sensor Networks to Smart Environments and Social Media Workshop Stanford, June 11, 2019

Goal of DryadLINQ

Software Stack: Windows Server, Cluster services, Cluster storage, Dryad, DryadLINQ, Applications, .Net + LINQ

Dryad = Execution Layer: Job (application) ≈ Pipeline; Dryad ≈ Unix Shell; Cluster ≈ Machine

LINQ Data Model: a collection of .NET objects of type T

LINQ Language Summary: Where (filter), Select (map), GroupBy, OrderBy (sort), Aggregate (fold), Join
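The operators named on this slide can be tried on an ordinary in-memory collection. The following is a minimal sketch using standard LINQ to Objects, not DryadLINQ itself; the sample data and the names `LinqSummary`, `Accesses`, and `Teams` are hypothetical and do not come from the talk.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqSummary
{
    // Hypothetical sample data: (user, bytes) pairs, not taken from the talk.
    public static readonly (string User, int Bytes)[] Accesses =
    {
        ("ann", 120), ("bob", 40), ("ann", 30), ("cy", 75), ("bob", 10),
    };

    // Hypothetical lookup table used to demonstrate Join.
    public static readonly (string User, string Team)[] Teams =
    {
        ("ann", "red"), ("bob", "blue"), ("cy", "red"),
    };

    // Where (filter) + Select (map): keep large accesses, project the user name.
    public static List<string> LargeUsers() =>
        Accesses.Where(a => a.Bytes >= 50).Select(a => a.User).ToList();

    // GroupBy + OrderBy (sort): total bytes per user, smallest total first.
    public static List<(string User, int Total)> TotalsPerUser() =>
        Accesses.GroupBy(a => a.User)
                .Select(g => (User: g.Key, Total: g.Sum(a => a.Bytes)))
                .OrderBy(t => t.Total)
                .ToList();

    // Aggregate (fold): add up all byte counts, starting from 0.
    public static int TotalBytes() =>
        Accesses.Aggregate(0, (sum, a) => sum + a.Bytes);

    // Join: attach a team to every access by matching on the user name.
    public static List<string> TeamsOfAccesses() =>
        Accesses.Join(Teams, a => a.User, t => t.User, (a, t) => t.Team).ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", LargeUsers()));
        Console.WriteLine(string.Join(", ", TotalsPerUser()));
        Console.WriteLine(TotalBytes());
        Console.WriteLine(string.Join(", ", TeamsOfAccesses()));
    }
}
```

Each method is a pure function of the input collection, which is the property that makes the same operators candidates for data-parallel execution.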

LINQ + Dryad => DryadLINQ

DryadLINQ Data Model: a collection of .Net objects, split into partitions
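The idea of a partitioned collection can be imitated on a single machine. The sketch below is only an illustration under stated assumptions: the hash-based scheme and the names `PartitionSketch`, `HashPartition`, and `CountEven` are hypothetical, and the slide does not say how DryadLINQ actually assigns objects to partitions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionSketch
{
    // Split a collection into `count` partitions by hashing a key.
    // This hash scheme is an assumption made for illustration only.
    public static List<List<T>> HashPartition<T>(
        IEnumerable<T> items, Func<T, int> key, int count)
    {
        var parts = Enumerable.Range(0, count).Select(_ => new List<T>()).ToList();
        foreach (var item in items)
        {
            // Clear the sign bit so the index is never negative.
            int index = (key(item) & int.MaxValue) % count;
            parts[index].Add(item);
        }
        return parts;
    }

    // Run the same query on every partition, then merge the partial results.
    public static int CountEven(List<List<int>> partitions) =>
        partitions.Select(p => p.Count(x => x % 2 == 0)).Sum();

    public static void Main()
    {
        var parts = HashPartition(Enumerable.Range(1, 10), x => x, 3);
        Console.WriteLine(parts.Count);
        Console.WriteLine(CountEven(parts));
    }
}
```

In a real cluster each partition would live on a different machine; here the partitions are just lists in one process.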

DryadLINQ = LINQ + Dryad

    Collection<T> collection;
    static bool IsLegal(Key c);
    var results = from c in collection
                  where IsLegal(c.key)
                  select new { hash = Hash(c.key), c.value };

Diagram labels: C#, collection, results, C# code, Dryad job, data.
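The query on this slide can be run locally with plain LINQ to Objects. The sketch below is a hedged reconstruction, not the speaker's code: the `Item` type, the body of `IsLegal`, and the `Hash` function are placeholders, since the slide shows only their names, and a named tuple stands in for the anonymous type so the result can be returned from a method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SlideQuery
{
    // Hypothetical element type: the slide shows only the fields `key` and `value`.
    public sealed class Item
    {
        public string key;
        public int value;
        public Item(string key, int value) { this.key = key; this.value = value; }
    }

    // Placeholder predicate: the slide declares IsLegal without a body.
    public static bool IsLegal(string key) => !string.IsNullOrEmpty(key);

    // Placeholder deterministic hash: the slide does not say which hash is used.
    public static int Hash(string key) =>
        key.Aggregate(17, (h, ch) => unchecked(h * 31 + ch));

    // Same shape as the slide's query: filter with IsLegal, project (hash, value).
    public static List<(int hash, int value)> Run(IEnumerable<Item> collection)
    {
        var results = from c in collection
                      where IsLegal(c.key)
                      select (hash: Hash(c.key), c.value);
        return results.ToList();
    }

    public static void Main()
    {
        var input = new[] { new Item("a", 1), new Item("", 2), new Item("bc", 3) };
        foreach (var r in Run(input))
            Console.WriteLine($"{r.hash} {r.value}");
    }
}
```

Because the query only uses a filter and a projection, it can be evaluated on each element independently.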

Example: Natal Training

Natal Problem: recognize players from a depth map, at frame rate, with low resource usage

Learn from Data: motion capture (ground truth) is rasterized into training examples; machine learning turns the training examples into a classifier

Running on Xbox

Cluster-based training: the machine learning step that turns training examples into a classifier runs on DryadLINQ and Dryad

You can have it!
Dryad+DryadLINQ available for download
  – Academic license
  – Commercial evaluation license
Runs on Windows HPC platform
Dryad is in binary form, DryadLINQ in source
Requires signing a 3-page licensing agreement

Conclusions