Machine Learning in DryadLINQ Kannan Achan Mihai Budiu MSR-SVC, 1/30/2008 1.

Slides:



Advertisements
Similar presentations
TWO STEP EQUATIONS 1. SOLVE FOR X 2. DO THE ADDITION STEP FIRST
Advertisements

You have been given a mission and a code. Use the code to complete the mission and you will save the world from obliteration…
Using Matrices in Real Life
Advanced Piloting Cruise Plot.
© 2008 Pearson Addison Wesley. All rights reserved Chapter Seven Costs.
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Chapter 1 The Study of Body Function Image PowerPoint
Distributed Data-Parallel Programming using Dryad Andrew Birrell, Mihai Budiu, Dennis Fetterly, Michael Isard, Yuan Yu Microsoft Research Silicon Valley.
Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 5 Author: Julia Richards and R. Scott Hawley.
1 Copyright © 2010, Elsevier Inc. All rights Reserved Fig 2.1 Chapter 2.
By D. Fisher Geometric Transformations. Reflection, Rotation, or Translation 1.
Business Transaction Management Software for Application Coordination 1 Business Processes and Coordination.
Math Vocabulary Review Part 1.
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Title Subtitle.
Properties of Real Numbers CommutativeAssociativeDistributive Identity + × Inverse + ×
My Alphabet Book abcdefghijklm nopqrstuvwxyz.
Multiplying binomials You will have 20 seconds to answer each of the following multiplication problems. If you get hung up, go to the next problem when.
Reducing Fractions. Factor A number that is multiplied by another number to find a product. Factors of 24 are (1,2, 3, 4, 6, 8, 12, 24).
0 - 0.
ALGEBRAIC EXPRESSIONS
DIVIDING INTEGERS 1. IF THE SIGNS ARE THE SAME THE ANSWER IS POSITIVE 2. IF THE SIGNS ARE DIFFERENT THE ANSWER IS NEGATIVE.
MULTIPLYING MONOMIALS TIMES POLYNOMIALS (DISTRIBUTIVE PROPERTY)
ADDING INTEGERS 1. POS. + POS. = POS. 2. NEG. + NEG. = NEG. 3. POS. + NEG. OR NEG. + POS. SUBTRACT TAKE SIGN OF BIGGER ABSOLUTE VALUE.
SUBTRACTING INTEGERS 1. CHANGE THE SUBTRACTION SIGN TO ADDITION
MULT. INTEGERS 1. IF THE SIGNS ARE THE SAME THE ANSWER IS POSITIVE 2. IF THE SIGNS ARE DIFFERENT THE ANSWER IS NEGATIVE.
FACTORING ax2 + bx + c Think “unfoil” Work down, Show all steps.
Addition Facts
ALGEBRAIC EXPRESSIONS
Year 6 mental test 5 second questions
ZMQS ZMQS
Cluster Computing with Dryad Mihai Budiu, MSR-SVC LiveLabs, March 2008.
Solve Multi-step Equations
BT Wholesale October Creating your own telephone network WHOLESALE CALLS LINE ASSOCIATED.
Slide 1 Copyright © 2004 Glenna R. Shaw & FTC Publishing Background Courtesy of Awesome BackgroundsAwesome BackgroundsDeliberation!Deliberation!
ABC Technology Project
© S Haughton more than 3?
© Charles van Marrewijk, An Introduction to Geographical Economics Brakman, Garretsen, and Van Marrewijk.
VOORBLAD.
1 Breadth First Search s s Undiscovered Discovered Finished Queue: s Top of queue 2 1 Shortest path from s.
Copyright © 2013, 2009, 2006 Pearson Education, Inc.
Factor P 16 8(8-5ab) 4(d² + 4) 3rs(2r – s) 15cd(1 + 2cd) 8(4a² + 3b²)
Squares and Square Root WALK. Solve each problem REVIEW:
© 2012 National Heart Foundation of Australia. Slide 2.
Sets Sets © 2005 Richard A. Medeiros next Patterns.
Understanding Generalist Practice, 5e, Kirst-Ashman/Hull
Chapter 5 Test Review Sections 5-1 through 5-4.
SIMOCODE-DP Software.
GG Consulting, LLC I-SUITE. Source: TEA SHARS Frequently asked questions 2.
Addition 1’s to 20.
25 seconds left…...
Test B, 100 Subtraction Facts
Januar MDMDFSSMDMDFSSS
Week 1.
We will resume in: 25 Minutes.
©Brooks/Cole, 2001 Chapter 12 Derived Types-- Enumerated, Structure and Union.
Bottoms Up Factoring. Start with the X-box 3-9 Product Sum
A SMALL TRUTH TO MAKE LIFE 100%
1 Unit 1 Kinematics Chapter 1 Day
PSSA Preparation.
TASK: Skill Development A proportional relationship is a set of equivalent ratios. Equivalent ratios have equal values using different numbers. Creating.
1 PART 1 ILLUSTRATION OF DOCUMENTS  Brief introduction to the documents contained in the envelope  Detailed clarification of the documents content.
How Cells Obtain Energy from Food
Chapter 30 Induction and Inductance In this chapter we will study the following topics: -Faraday’s law of induction -Lenz’s rule -Electric field induced.
Large-scale Machine Learning using DryadLINQ Mihai Budiu Microsoft Research, Silicon Valley Ambient Intelligence: From Sensor Networks to Smart Environments.
Presentation transcript:

Machine Learning in DryadLINQ Kannan Achan Mihai Budiu MSR-SVC, 1/30/2008 1

2 Goal

The Software Stack Windows Server Cluster Services Distributed Filesystem: Cosmos Dryad DryadLINQ Windows Server Large Vector Machine learning Data analysis 3

Dryad 4

Dryad Jobs RR XXX MMM XX M M Vertices (processes) Channels Output files Input files Stage M RR X 5

6 LINQ and C#

LINQ Collection collection; bool IsLegal(Key); string Hash(Key); var results = from c in collection where IsLegal(c.key) select new { Hash(c.key), c.value}; 7

Collection collection; bool IsLegal(Key k); string Hash(Key); var results = from c in collection where IsLegal(c.key) select new { Hash(c.key), c.value}; DryadLINQ = LINQ + Dryad C# collection results C# 8 Vertex code Query plan (Dryad job) Data

Recall: The Software Stack Windows Server Cluster Services Distributed Filesystem: Cosmos Dryad DryadLINQ Windows Server Large Vector Machine learning Data analysis 9

Very Large Vector Library PartitionedVector 10 T Scalar TT T

Operations on Large Vectors: Map 1 11 U T T U f f f preserves partitioning

V Map 2 (Pairwise) 12 T U f V U T f

Map 3 (Vector-Scalar) 13 T U f V V U T f

Reduce (Fold) 14 UUU U f fff f UUU U

Linear Algebra 15 T U V =,, T

Linear Regression Data Find S.t. 16

Analytic Solution 17 X×X T Y×X T Σ X[0]X[1]X[2]Y[0]Y[1]Y[2] Σ [ ] -1 * A Map Reduce

Linear Regression Code 18 Matrices xx = x.PairwiseOuterProduct(x); OneMatrix xxs = xx.Sum(); Matrices yx = y.PairwiseOuterProduct(x); OneMatrix yxs = yx.Sum(); OneMatrix xxinv = xxs.Map(a => a.Inverse()); OneMatrix A = yxs.Map( xxinv, (a, b) => a.Multiply(b));

Expectation Maximization lines 3 iterations shown

Understanding Botnet Traffic using EM 20 3 GB data 15 clusters 60 computers 50 iterations 9000 processes 50 minutes

Conclusions Dryad simplifies programming large clusters DryadLINQ = declarative programming for Dryad jobs The Large Vector library provides simple mathematical primitives on top of DryadLINQ Matlab-style coding for writing distributed numeric computations 21 Win Cluster Services Distributed Filesystem Dryad DryadLINQ Win Large Vector ML Data analysis

Backup Slides 22

Chaining 23 X×X T Y×X T Σ X[0]X[1]X[2]Y[0]Y[1]Y[2] Σ [ ] -1 * A ΣΣΣΣΣΣ

EM Structure 24 E stage Input size π σ μ All parameters