1
Translating Chapel to Use FREERIDE: A Case Study in Using an HPC Language for Data-Intensive Computing
Bin Ren, Gagan Agrawal, Brad Chamberlain, Steve Deitz
2
Outline
- Background
- Chapel and Reduction Support
- FREERIDE Middleware
- Transformation Issues and Implementation
- Experiments
- Conclusion
3
Background
- Data-Intensive SuperComputing
  - Increasingly large data sizes
  - Large-scale data-analysis computations
  - Multi-core and many-core applications
- New programming paradigms
  - Map-Reduce and similar programming models
  - High-level languages
4
Map-Reduce Programming Model
- A set of high-level APIs
  - Hides the low-level communication details
  - Easy to write parallel programs: map and reduce
  - Suitable for large-scale data processing
- FREERIDE
  - Shares a similar processing structure
  - Uses an explicit, user-defined reduction-object data structure
  - Has outperformed Map-Reduce for a sub-class of data-intensive applications
5
High Level Programming Languages
- Data-intensive computing languages
  - Sawzall from Google
  - Pig Latin from Yahoo
  - ...
  - Based on Map-Reduce or similar programming models; provide higher-level programming logic
- General HPC languages
  - Chapel from Cray
  - X10 from IBM
  - A separate effort from the above, with more general usage
6
Motivation
- Questions
  - Are HPC languages suitable for expressing data-intensive computations?
  - Can we exploit the high productivity of general HPC languages without a large performance degradation for data-intensive applications?
- Our method
  - Start from Chapel: its high-level abstractions improve productivity
  - Invoke FREERIDE through a compilation framework: its C libraries ensure good performance
7
Chapel & Reduction Support
Selected features of Chapel
- A high-level programming language that compiles to C
- Supports calls to C via extern declarations (see the illustration below)
- Supports built-in and user-defined reduction operations through multi-level abstractions
  - Local-view abstraction
    - Straightforward and flexible
    - Users must handle the low-level communication details
  - Global-view abstraction
    - Built-in reduction model
    - Users only need to implement the exposed functions
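The deck does not reproduce an interop example, so the following is only an illustration of the extern mechanism; the routine and its name are hypothetical, not taken from the slides. A Chapel program could declare it with an extern proc and then call it like any Chapel procedure.

    /* Hypothetical C routine; a Chapel program can bind to it with an extern
     * declaration and call it directly from Chapel code. */
    double dot_product(const double *x, const double *y, long n) {
        double s = 0.0;
        for (long i = 0; i < n; i++)
            s += x[i] * y[i];
        return s;
    }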
8
Chapel Reduction Example
(Chapel reduction code listing not preserved in this export.)
- accumulate: local reduction
- combine: global reduction
- generate: post-processing
9
FREERIDE Middleware
- Both FREERIDE and Map-Reduce have two stages: local reduction / map and global reduction / reduce
- FREERIDE maintains an explicit, user-defined reduction object to represent the intermediate state
- Map-Reduce maintains (key, value) pairs as the intermediate result; sorting, grouping, and shuffling them introduces a large amount of overhead
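FREERIDE's actual interface is not shown in the slides; the following minimal C sketch, with hypothetical names and a fixed number of keys, only illustrates the reduction-object idea and why it avoids the sorting, grouping, and shuffling of (key, value) pairs.

    #define NUM_KEYS 100   /* assumed number of reduction keys (e.g., clusters) */

    /* The reduction object holds the entire intermediate state explicitly;
     * one such object is kept per thread/node instead of emitting
     * (key, value) pairs that later need sorting, grouping, and shuffling. */
    typedef struct {
        double sum[NUM_KEYS];
        long   count[NUM_KEYS];
    } reduction_object_t;

    /* Local reduction (map-stage analogue): fold one element into the object. */
    static void local_reduce(reduction_object_t *ro, int key, double value) {
        ro->sum[key]   += value;
        ro->count[key] += 1;
    }

    /* Global reduction (reduce-stage analogue): merge another object into ours. */
    static void combine(reduction_object_t *dst, const reduction_object_t *src) {
        for (int k = 0; k < NUM_KEYS; k++) {
            dst->sum[k]   += src->sum[k];
            dst->count[k] += src->count[k];
        }
    }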
10
Chapel and FREERIDE User Code
(Side-by-side Chapel and FREERIDE user-code listings not preserved in this export.)
11
Transformation Issues and Implementation
Three transformations (see the sketch below)
- Invoke the split function
  - Transforms the hierarchical data structures in Chapel into dense memory buffers in C
- Call the reduction function to update the reduction object
  - Maps the operations on Chapel data to FREERIDE data
- Call the combine function
  - The default combine function is used
Two algorithms are emphasized: linearization and mapping
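To make the control flow concrete, here is a minimal sketch of what the generated local-reduction code could look like; the fr_* entry points, the file name, and the chunk size are all hypothetical stand-ins, since the slides do not reproduce FREERIDE's real interface.

    #include <stddef.h>

    #define CHUNK_SIZE (1 << 20)   /* assumed chunk size */

    /* Hypothetical FREERIDE-style entry points. */
    extern void  fr_split(const char *filename, long chunk_size);
    extern char *fr_next_chunk(long *num_elems);
    extern void  fr_reduce(void *reduction_object, const char *chunk, long num_elems);

    void generated_local_reduction(void *reduction_object) {
        long n;
        fr_split("input.dat", CHUNK_SIZE);              /* 1. invoke the split function    */
        for (char *chunk = fr_next_chunk(&n); chunk != NULL;
             chunk = fr_next_chunk(&n)) {
            /* 2. each chunk already holds linearized (dense) data                         */
            fr_reduce(reduction_object, chunk, n);      /* 3. update the reduction object  */
        }
        /* 4. the default combine function later merges the per-thread reduction objects   */
    }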
12
Transformation Issues and Implementation
Illustration of the data layout
- Chapel-side hierarchical array data[l]: each of its l elements holds b1[0..n-1] and b2, and each b1[i] in turn holds a1[0..m-1] and a2
- Linearizing algorithm: produces a dense buffer linear_data[ ] that stores a1[0], ..., a1[m-1], a2 for every b1[i], followed by b2, repeated for all l elements of data
- Mapping algorithm: translates accesses on the hierarchical structure into offsets in linear_data[ ]
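A possible C rendering of this layout; the field names b1, b2, a1, a2 and the bounds m, n, l come from the diagram, while the concrete sizes and the assumption that every scalar is a double are mine.

    #define M 4   /* assumed value of m (length of a1) */
    #define N 3   /* assumed value of n (length of b1) */

    typedef struct {        /* one "A" unit: a1[0..m-1] followed by a2 */
        double a1[M];
        double a2;
    } A;

    typedef struct {        /* one "B" unit: b1[0..n-1] followed by b2 */
        A      b1[N];
        double b2;
    } B;

    /* The Chapel-side array corresponds to B data[l]; linearization copies it
     * into one dense buffer laid out like the diagram's linear_data[]. */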
13
Transformation Issues and Implementation
Linearization Algorithm (sketched below)
- Two-stage recursive algorithm
  - Compute the size of the whole memory buffer
  - Copy the actual data from the high-level data structure into the memory buffer
- Different strategies for different types
  - Primitive types
  - Iterative types and record types
- Collect the information needed by the mapping algorithm
  - Data unit size at each level
  - Starting offset of each member
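A minimal hand-written sketch of the two-stage idea for the A/B layout above (an illustration under those assumptions, not the compiler-generated code):

    #include <stdlib.h>
    #include <string.h>

    /* Stage 1: sizes of one linearized A unit and one linearized B unit. */
    static size_t linear_size_A(void) { return (M + 1) * sizeof(double); }             /* a1[] + a2 */
    static size_t linear_size_B(void) { return N * linear_size_A() + sizeof(double); } /* b1[] + b2 */

    /* Stage 2: copy one unit into the dense buffer, field by field. */
    static char *copy_A(const A *a, char *out) {
        memcpy(out, a->a1, M * sizeof(double));  out += M * sizeof(double);
        memcpy(out, &a->a2, sizeof(double));     out += sizeof(double);
        return out;
    }

    static char *copy_B(const B *b, char *out) {
        for (int i = 0; i < N; i++)              /* recurse into each nested A unit */
            out = copy_A(&b->b1[i], out);
        memcpy(out, &b->b2, sizeof(double));     out += sizeof(double);
        return out;
    }

    static char *linearize(const B *data, long l) {
        char *buf = malloc((size_t)l * linear_size_B());  /* stage 1: total buffer size */
        char *p = buf;
        for (long i = 0; i < l; i++)                      /* stage 2: copy the data     */
            p = copy_B(&data[i], p);
        return buf;
    }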
14
Transformation Issues and Implementation
Illustration: Information Collected During Linearization

    levels = 3;
    unitSize[levels] = {unitSize_B, unitSize_A, sizeof(real)};
    unitOffset[levels - 1][2] = {{unitOffset_B[], unitOffset_A[]}};
    unitOffset_B[2] = {0, unitSize_A * n};
    unitOffset_A[2] = {0, sizeof(real) * m};
15
Transformation Issues and Implementation
Mapping Algorithm (sketched below)
- Recursive algorithm
- Basic idea
  - Start from the outer-most level and terminate at the inner-most level
  - At each level, compute the offset contributed by the index together with the position information collected during the linearization stage
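A minimal sketch of the mapping idea for the same layout, reusing the unitSize/unitOffset names from the previous slide; the helper itself is hypothetical. It translates an access such as data[i].b1[j].a1[k] into a byte offset within the dense buffer.

    #include <stddef.h>

    /* Filled in while linearizing the A/B layout above. */
    static size_t unitSize_A, unitSize_B;
    static size_t unitOffset_A[2], unitOffset_B[2];

    /* Offset of data[i].b1[j].a1[k] inside linear_data[], accumulated level by
     * level from the outer-most index down to the inner-most one. */
    static size_t map_offset(long i, long j, long k) {
        size_t off = 0;
        off += (size_t)i * unitSize_B;                        /* level 0: data[i]         */
        off += unitOffset_B[0] + (size_t)j * unitSize_A;      /* level 1: member b1, j-th */
        off += unitOffset_A[0] + (size_t)k * sizeof(double);  /* level 2: member a1, k-th */
        return off;
    }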
16
Transformation Issues and Implementation
Two-level optimization
- Classic optimizations
- Adaptive optimization
  - For instance, in K-Means the cluster centers are themselves a frequently accessed hierarchical data set, so they can be linearized as well (see the sketch below)
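As a concrete illustration of the adaptive optimization (the sizes and the flat layout are assumptions, not taken from the slides): if the k cluster centers are kept in one dense buffer, the distance computation in the inner loop touches contiguous memory instead of walking a nested structure.

    #define K   100   /* assumed number of clusters   */
    #define DIM 3     /* assumed point dimensionality */

    /* Linearized centers: centers[c * DIM + d] is coordinate d of cluster c. */
    static double centers[K * DIM];

    /* Squared distance from a point to cluster c over the dense center buffer. */
    static double dist2(const double *pt, int c) {
        double s = 0.0;
        for (int d = 0; d < DIM; d++) {
            double diff = pt[d] - centers[c * DIM + d];
            s += diff * diff;
        }
        return s;
    }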
17
Experiments
- Configuration
  - CPU: Intel Xeon E quad-core, 2.33 GHz
  - Memory: 6 GB
  - OS: 64-bit Linux
- Terms
  - generated: compiler-generated C code using FREERIDE
  - opt-1: version with the classic optimizations
  - opt-2: version with the adaptive optimization
  - manual FR: hand-written FREERIDE user code
18
Experiments: K-Means (data size = 12 MB, k = 100, iter = 10)
- Scalability is good
- The benefit of the adaptive optimization is more visible
- Overhead is within 20%
19
Experiments: K-Means
- Data size = 1.2 GB, k = 10, iter = 10 (left chart)
- Data size = 1.2 GB, k = 100, iter = 1 (right chart)
(Charts not preserved in this export.)
20
Experiments: PCA
- rows = 1000, columns = 10,000 (left chart)
- rows = 1000, columns = 100,000 (right chart)
(Charts not preserved in this export.)
21
Conclusion
- Presented a case study on the possible use of a new HPC language for data-intensive computations
- Showed how to map the reduction features of Chapel down to the FREERIDE middleware
- Combined the productivity of a high-level language with the performance of a specialized runtime system
22
Thank you for your attention!
Any questions?