RIOT: I/O-Efficient Numerical Computing in R
Yi Zhang, Herodotos Herodotou, Jun Yang

What is R?
R: an open-source language/environment
– Statistical computing, graphics
– Comprehensive R Archive Network (CRAN): 1639 packages as of Dec 08
– Interpreted execution
– High-level constructs: arrays, matrices (common to languages for numerical/statistical computing)
Code example:
    a <- 1:100
    …
    d <- a+b^2+c

Big-Data Challenge
R assumes all data fits in main memory
– If not, the virtual memory (VM) system starts swapping data from/to disk
– Excessive I/O, poor performance
– Example:
    # n points with coordinates stored in x[1:n], y[1:n]
    d <- sqrt((x-xs)^2+(y-ys)^2)+sqrt((x-xe)^2+(y-ye)^2)
    s <- sample(n, 100)   # draw 100 samples from 1:n
    z <- d[s]             # extract elements of d whose indices are in s
[Figure: points S(xs,ys) and E(xe,ye); memory holding x, y and intermediate results such as x-xs, (x-xs)^2, the 1st sqrt, (x-xe)^2, y-ye, …, with x, y pushed out to the swap/paging file]

Opportunities
Avoiding intermediate results
– Multiple large intermediate results are generated
– Can we avoid them without hand-coding loops?
    for (i in 1:n) { d[i] <- sqrt((x[i]-xs)^2+…)+… }
Deferred and selective evaluation
– Each expression is evaluated in full immediately
– Can we defer evaluation until really necessary? Just compute the 100 elements from d picked by s
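
The gap between eager and selective evaluation can be sketched in plain R. This is only an illustration of the idea, not how RIOT implements it: the helper d_deferred and the concrete values of n, xs, ys, xe, ye are assumptions made for the example.

```r
# Illustrative sketch; d_deferred and all constants are hypothetical.
set.seed(42)
n  <- 100000
x  <- runif(n)
y  <- runif(n)
xs <- 0; ys <- 0; xe <- 1; ye <- 1

# Eager: every sub-expression materializes a length-n temporary.
d_eager <- sqrt((x - xs)^2 + (y - ys)^2) + sqrt((x - xe)^2 + (y - ye)^2)

# Deferred: d is described as a function of the requested indices,
# so nothing is computed until elements are actually asked for.
d_deferred <- function(idx) {
  sqrt((x[idx] - xs)^2 + (y[idx] - ys)^2) +
    sqrt((x[idx] - xe)^2 + (y[idx] - ye)^2)
}

s <- sample(n, 100)
z_eager     <- d_eager[s]     # computed all n elements, kept 100
z_selective <- d_deferred(s)  # temporaries have length 100, not n
```

Both paths yield the same 100 values; the selective one never allocates a length-n intermediate.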

Existing Solutions
Rewrite and hand-optimize code
– Tedious, not quite reusable
Use I/O-efficient libraries
– SOLAR [Toledo’96], DRA [Nieplocha’96], etc.
– But efficient individual operations are not enough
Build/extend a DB
– RasDaMan [Baumann’99], AML [Marathe’02], ASAP [Stonebraker’07], …
– Must rewrite using a new language (often SQL)
– Explicit boundary between DB and host language

R with I/O Transparency
Attain I/O efficiency without explicit user intervention
Run legacy code with no or minimal modification
No need to learn new languages/libraries
No boundary between host language and backend processing

RIOT
Implemented as an R package
– New types, same interfaces: dbvector, dbmatrix, …
– Uses R’s generics mechanism for transparency
1. New class definition:
    setClass("dbvector", representation(size="numeric",…))
2. Method overloading:
    setMethod("+", signature(e1="dbvector",e2="dbvector"),
              function(e1,e2) { .Call("add_dbvectors",e1,e2) })
3. Implementation (C):
    SEXP add_dbvectors(SEXP e1, SEXP e2){ … }
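
The three steps above can be tried without any C code. The following is a minimal pure-R sketch, assuming a hypothetical class lazyvec that simply wraps a numeric vector; it stands in for dbvector only to show how S4 generics let unmodified code operate on a new type.

```r
library(methods)

# Hypothetical stand-in for dbvector: wraps a plain numeric vector.
setClass("lazyvec", representation(data = "numeric"))

# Overload "+" for two lazyvec operands (RIOT would call into C here).
setMethod("+", signature(e1 = "lazyvec", e2 = "lazyvec"),
          function(e1, e2) {
            new("lazyvec", data = e1@data + e2@data)
          })

# Same interface as ordinary vectors: length() also dispatches on the class.
setMethod("length", "lazyvec", function(x) length(x@data))

# "Legacy" code that knows nothing about lazyvec.
add_all <- function(u, v) u + v

a <- new("lazyvec", data = c(1, 2, 3))
b <- new("lazyvec", data = c(10, 20, 30))
r <- add_all(a, b)
```

Because dispatch happens on the argument classes, add_all works unchanged for both plain vectors and lazyvec objects.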

RIOT-DB: Hidden DB Backend
A strawman solution: map large arrays to DB tables
– e.g. vector: V(i,v); matrix: M(i,j,v)
– Computation → query: a+b → SELECT A.I, A.V+B.V FROM A,B WHERE A.I=B.I
– Leverages power of DB only at intra-operation level!
Key: translate operations to view definitions
– Build up larger and larger views a step at a time
– Evaluate only when needed → deferred evaluation
– Query optimization → selective evaluation + more
– Iterator-style execution → no intermediate results
d <- sqrt((x-xs)^2+(y-ys)^2)+…
    CREATE VIEW T1(I,V) AS SELECT X.I, X.V-xs FROM X;
    CREATE VIEW T2(I,V) AS SELECT T1.I, POW(T1.V,2) FROM T1;
    …
    CREATE VIEW D(I,V) AS SELECT T6.I, T6.V+T12.V FROM T6,T12 WHERE T6.I=T12.I;
…
z <- d[s]
    CREATE VIEW Z(I,V) AS SELECT S.I, D.V FROM D,S WHERE D.I=S.V;
Expanded query for z:
    SELECT S.I, SQRT(POW(X.V-xs,2)+POW(Y.V-ys,2)) + SQRT(POW(X.V-xe,2)+POW(Y.V-ye,2))
    FROM X,Y,S WHERE X.I=Y.I AND X.I=S.V
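
The "one view per operation" idea can be sketched as string generation in R. Everything here is an assumption for illustration (make_catalog, sub_scalar, pow_scalar, add_views are hypothetical helpers, not RIOT-DB functions): each operation only records a view definition and returns the view's name, and no query is executed.

```r
# Hypothetical catalog that records view definitions instead of running them.
make_catalog <- function() {
  defs <- character(0)
  define <- function(select_sql) {
    name <- paste0("T", length(defs) + 1)
    defs[name] <<- sprintf("CREATE VIEW %s(I,V) AS %s;", name, select_sql)
    name
  }
  list(define = define, definitions = function() defs)
}

catalog <- make_catalog()

# Each operator maps to a view over its input(s); inputs are table/view names.
sub_scalar <- function(v, c) {
  catalog$define(sprintf("SELECT %s.I, %s.V-%s FROM %s", v, v, c, v))
}
pow_scalar <- function(v, p) {
  catalog$define(sprintf("SELECT %s.I, POW(%s.V,%s) FROM %s", v, v, p, v))
}
add_views <- function(a, b) {
  catalog$define(sprintf("SELECT %s.I, %s.V+%s.V FROM %s,%s WHERE %s.I=%s.I",
                         a, a, b, a, b, a, b))
}

# (x-xs)^2 + (y-ys)^2, built one step at a time; nothing is evaluated.
t1 <- sub_scalar("X", "xs")
t2 <- pow_scalar(t1, 2)
t3 <- sub_scalar("Y", "ys")
t4 <- pow_scalar(t3, 2)
t5 <- add_views(t2, t4)

defs <- catalog$definitions()
```

Printing defs with writeLines(defs) shows the chain of CREATE VIEW statements; a real backend would submit them to the database and let its optimizer expand the views when a result is finally requested.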

RIOT-DB Demo
RIOT-DB built using MySQL with the MyISAM engine

Performance of RIOT-DB
Plain R
RIOT-DB variants
– RIOT-DB/Strawman: use DB to store arrays and execute individual ops; no use of views to defer evaluation
– RIOT-DB/MatNamed: use views, but compute/materialize every named object
– RIOT-DB: full version; defer/optimize across statements

Lessons Learned
DB-style inter-operation optimization is really the key!
Can we do better?
– DB arrays carry too much overhead (ASAP [Stonebraker’07]): extra columns in V(i, v), M(i, j, v), …; more for higher dims
– SQL & relational algebra may not be the right abstraction: advanced data layouts and complex ops are awkward
→ RIOT: The Next Generation
– A new expression algebra closer to numerical computation
– Flexible array storage/layout options
– Optimizations better tailored for numerical computation
– … and more

RIOT Expression Algebra
Analogous to the view mechanism, but more flexible
Operators
– +, -, *, /, [, …
– A[idxRange]<-newVals: turn updates into functional ops. Instead of in-place updates, log them & define A_new over (A_old, log)
– X%*%Y (matrix multiply) etc.: built-in, for high-level opt. E.g. matrix chain multiplication: (XY)Z or X(YZ)?
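
Why the choice between (XY)Z and X(YZ) matters can be shown with a small cost calculation. The matrix shapes below are made up for the example, and mult_cost counts scalar multiplications only (not I/O), so this is a rough sketch of the kind of decision a high-level optimizer could make.

```r
# Scalar multiplications for an (r x k) times (k x c) product.
mult_cost <- function(r, k, c) r * k * c

# Hypothetical shapes: X is 1000 x 10, Y is 10 x 1000, Z is 1000 x 10.
dx <- c(1000, 10)
dy <- c(10, 1000)
dz <- c(1000, 10)

# (XY)Z: XY is 1000 x 1000, then times Z.
cost_left <- mult_cost(dx[1], dx[2], dy[2]) +
  mult_cost(dx[1], dy[2], dz[2])

# X(YZ): YZ is 10 x 10, then X times it.
cost_right <- mult_cost(dy[1], dy[2], dz[2]) +
  mult_cost(dx[1], dx[2], dz[2])

# Both orders give the same product, up to floating-point rounding.
set.seed(1)
X <- matrix(rnorm(prod(dx)), dx[1], dx[2])
Y <- matrix(rnorm(prod(dy)), dy[1], dy[2])
Z <- matrix(rnorm(prod(dz)), dz[1], dz[2])
same_result <- isTRUE(all.equal((X %*% Y) %*% Z, X %*% (Y %*% Z)))
```

With these shapes the right-associated order needs 100 times fewer multiplications and avoids the 1000 x 1000 intermediate entirely.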

Processing/Layout Optimization
Matrix multiplication $T = A_{n_1 \times n_2} B_{n_2 \times n_3}$, with fixed memory size $M$
R: Plain algorithm
    For each row i of A:
        For each column j of B:
            T[i,j] <- sum(A[i,] * B[,j])
BNLJ-inspired algorithm
    Read as many rows of A as possible:
        Use one block to scan B in column-major order:
            Update elements in T
Blocked algorithm
    Divide memory into 3 equal parts
    Divide each matrix into square blocks
    For each chunk (i,j) in T:
        For k=1…p:
            Read chunk (i,k) from A and chunk (k,j) from B
            chunk T(i,j) += A(i,k) %*% B(k,j)
        Write chunk T(i,j)
RIOT-DB: Hashjoin-sort-aggregate
[Figures: T = A x B access pattern, one diagram per algorithm]
Optimal I/O cost: $n_1 n_2 n_3 / (B M^{1/2})$
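
The blocked algorithm's loop structure can be sketched in R as follows. This version works entirely in memory, so it shows only the chunk traversal order, not the disk reads and writes; the function name blocked_matmul and the block-size parameter bs are assumptions for the example.

```r
# In-memory sketch of the blocked algorithm; bs is the chunk side length.
blocked_matmul <- function(A, B, bs) {
  n1 <- nrow(A); n2 <- ncol(A); n3 <- ncol(B)
  stopifnot(n2 == nrow(B))
  result <- matrix(0, n1, n3)
  for (i0 in seq(1, n1, by = bs)) {
    rows <- i0:min(i0 + bs - 1, n1)
    for (j0 in seq(1, n3, by = bs)) {
      cols <- j0:min(j0 + bs - 1, n3)
      # Accumulator for chunk T(i,j); stays "in memory" for all k.
      acc <- matrix(0, length(rows), length(cols))
      for (k0 in seq(1, n2, by = bs)) {
        ks <- k0:min(k0 + bs - 1, n2)
        # "Read" chunk (i,k) of A and chunk (k,j) of B, then accumulate.
        acc <- acc + A[rows, ks, drop = FALSE] %*% B[ks, cols, drop = FALSE]
      }
      # "Write" chunk T(i,j) exactly once.
      result[rows, cols] <- acc
    }
  }
  result
}

set.seed(7)
A <- matrix(rnorm(7 * 5), 7, 5)
B <- matrix(rnorm(5 * 6), 5, 6)
T_blocked <- blocked_matmul(A, B, bs = 3)
```

Each output chunk is accumulated across all k before being stored, which is what lets an out-of-core version write every chunk of T only once.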

Conclusion
I/O efficiency can be added transparently
– Ditch SQL at user level for broader impact!
DB-style inter-operation optimization is critical
– Need to go beyond developing I/O-efficient algorithms and libraries
Integration of DB and programming languages
– Lots of interesting analogies and new opportunities

Q&A
RIOT photos by Zack Gold