A Cloud System for Machine Learning Exploiting a Parallel Array DBMS


1 A Cloud System for Machine Learning Exploiting a Parallel Array DBMS
Carlos Ordonez, Department of Computer Science, University of Houston, USA

2 Our contribution
A cloud analytic system for machine learning
Shared-nothing architecture backed by a parallel array DBMS; no HDFS, Hadoop, Spark, etc.
In-DBMS data summarization
Orders of magnitude performance improvement
Even wider gap with GPU acceleration

3 System components and data flow

4 2-phase algorithm

5 System components and data flow
Summarization

6 Defining the input data set X

7 Data summarization
Finding a compact description of the data set
Very useful technique in machine learning
Saves space
Saves I/O
Saves execution time
No sacrifice in accuracy

8 What to summarize? Introducing sufficient statistics
Matrix product => Summation of vector outer products.
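The identity on this slide can be checked with a minimal sketch. It assumes each point x_i is extended to z_i = [1, x_i], so that the summary (the matrix a later slide calls Gamma) holds the count n, the linear sums, and the quadratic sums; that layout and all function names below are illustrative assumptions, not code from the talk.

```python
def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]


def summarize(points):
    """One pass over the data: accumulate the sum of z_i z_i^T, z_i = [1, x_i]."""
    k = len(points[0]) + 1
    gamma = [[0.0] * k for _ in range(k)]
    for x in points:
        z = [1.0] + list(x)
        for a, row in enumerate(outer(z, z)):
            for b, value in enumerate(row):
                gamma[a][b] += value
    return gamma


def matrix_product(points):
    """Same quantity computed as the matrix product Z Z^T, entry by entry."""
    zs = [[1.0] + list(x) for x in points]
    k = len(zs[0])
    return [[sum(z[a] * z[b] for z in zs) for b in range(k)] for a in range(k)]


points = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
gamma = summarize(points)
# gamma[0][0] is n, gamma[0][1:] are the per-dimension sums,
# and gamma[1:][1:] holds the sums of pairwise products.
print(gamma)
```

Because the summary is a plain sum over points, it can be accumulated in a single scan and its size depends on d only, not on n.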

9

10 Dense matrix algorithm: O(d^2 n)

11 Sparse matrix algorithm: O(d n) for a hyper-sparse matrix
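The gap between the dense and sparse costs can be illustrated with a small sketch. It assumes each point is stored as a dictionary of its nonzero entries (a hypothetical representation, not SciDB's chunk format), so the work per point grows with the square of its nonzero count rather than with d^2.

```python
def summarize_sparse(points, d):
    """Sum of outer products, visiting only nonzero entries of each point."""
    gamma = [[0.0] * d for _ in range(d)]
    multiplications = 0
    for p in points:
        nonzeros = list(p.items())
        for a, va in nonzeros:
            for b, vb in nonzeros:
                gamma[a][b] += va * vb
                multiplications += 1
    return gamma, multiplications


def summarize_dense(points, d):
    """Reference version: d * d multiplications per point."""
    gamma = [[0.0] * d for _ in range(d)]
    for p in points:
        x = [p.get(i, 0.0) for i in range(d)]
        for a in range(d):
            for b in range(d):
                gamma[a][b] += x[a] * x[b]
    return gamma


d = 6
points = [{0: 2.0}, {3: 1.0, 5: 4.0}, {1: 3.0}]  # made-up hyper-sparse points
sparse_gamma, mults = summarize_sparse(points, d)
print(mults, "multiplications instead of", len(points) * d * d)
```

Both functions produce the same matrix; only the number of multiplications differs.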

12 Array Storage in SciDB: by Chunks

13 Parallel computation
[Diagram: Coordinator, Worker 1 and Worker 2, each drawn with indices 1, 2, ..., d; "send" arrows between the workers and the coordinator]
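One way to read this diagram is sketched below, under the assumption that each worker summarizes only its own partition and sends a small partial matrix to the coordinator, which adds the partials. Threads stand in for worker nodes here, and every name is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_summary(partition):
    """Worker side: sum of z z^T over the local partition, z = [1, x]."""
    k = len(partition[0]) + 1
    gamma = [[0.0] * k for _ in range(k)]
    for x in partition:
        z = [1.0] + list(x)
        for a in range(k):
            for b in range(k):
                gamma[a][b] += z[a] * z[b]
    return gamma


def merge(partials):
    """Coordinator side: element-wise sum of the partial summaries."""
    k = len(partials[0])
    total = [[0.0] * k for _ in range(k)]
    for part in partials:
        for a in range(k):
            for b in range(k):
                total[a][b] += part[a][b]
    return total


points = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
partitions = [points[0::2], points[1::2]]  # two simulated workers

with ThreadPoolExecutor(max_workers=2) as pool:
    partials = list(pool.map(partial_summary, partitions))

gamma = merge(partials)
print(gamma)
```

In this sketch only the small partial matrices cross the network; the data points themselves stay on the workers.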

14 [Diagram: two Coordinator / Worker 1 configurations with indices 1, 2, ..., d, one marked "OK" and the other marked "NO!"]

15 Linear speedup
Let T_j be the processing time using j nodes, where 1 ≤ j ≤ N. Under our main assumption, and provided Θ fits in main memory, our optimized algorithm gets close to the optimal speedup: T_1/T_N ≈ O(N).
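The speedup ratio defined on this slide is simple to compute from measured times. The sketch below uses made-up timings purely to show the arithmetic; they are not measurements from the talk.

```python
def speedup(t1, tj):
    """Speedup of a j-node run relative to the single-node time t1."""
    return t1 / tj


def efficiency(t1, tj, j):
    """Fraction of the ideal (linear) speedup achieved with j nodes."""
    return speedup(t1, tj) / j


# Hypothetical timings in seconds, keyed by node count j.
timings = {1: 80.0, 2: 41.0, 4: 21.0, 8: 11.0}

for j, tj in sorted(timings.items()):
    s = speedup(timings[1], tj)
    e = efficiency(timings[1], tj, j)
    print(f"j={j}: speedup={s:.2f}, efficiency={e:.2f}")
```

An efficiency close to 1 for every j corresponds to the near-linear behavior the slide describes.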

16 Space complexity and parallel speedup

17 Benchmark

18

19 Optimize summarization with GPU
[Diagram: Transfer, Summarize, Transfer; GPU]

20 Optimize summarization with GPU
The C++ operator code is annotated with OpenACC directives to work with the GPU.
In the current implementation, the CPU only does the I/O part.
Data is transferred from host memory to device (GPU) memory.
The vector outer products are evaluated and aggregated on the GPU; the result is then transferred back.

21

22 Time saved by summarizing on GPU
n = 1M, d = 400

23 System components and data flow
Summarization Model

24 Linear regression
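As a rough illustration of the second phase, the sketch below fits a linear regression from the summary alone, without a second pass over the data. It assumes the summary is the sum of z z^T with z = [1, x, y], so that the normal equations can be read off its blocks; that layout, the tiny made-up data set, and the helper names are assumptions, not the system's actual code.

```python
def summarize_xy(rows):
    """Phase 1: sum of z z^T over all rows, with z = [1, x_1..x_d, y]."""
    k = len(rows[0][0]) + 2
    gamma = [[0.0] * k for _ in range(k)]
    for x, y in rows:
        z = [1.0] + list(x) + [y]
        for a in range(k):
            for b in range(k):
                gamma[a][b] += z[a] * z[b]
    return gamma


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_from_gamma(gamma):
    """Phase 2: solve the normal equations using only entries of gamma."""
    k = len(gamma) - 1                      # intercept plus d coefficients
    lhs = [row[:k] for row in gamma[:k]]    # sums over [1, x][1, x]^T
    rhs = [gamma[i][k] for i in range(k)]   # sums over [1, x] * y
    return solve(lhs, rhs)


rows = [([0.0], 2.0), ([1.0], 5.0), ([2.0], 8.0), ([3.0], 11.0)]  # y = 2 + 3x
beta = fit_from_gamma(summarize_xy(rows))
print(beta)
```

The model step works on a matrix whose size depends only on d, which is why it can run outside the DBMS after the summarization pass.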

25 Computing LR, SciDB vs. Spark

26 Future work
Approach applicable in any parallel DBMS
Square matrices
Low-level GPU instructions to parallelize the vector outer products on GPU for Gamma
Improve fault tolerance during computation; avoid restarts

