1
Frédéric Gava Bulk-Synchronous Parallel ML
Semantics and Implementation of the Parallel Juxtaposition
2
BSML Background
Parallel programming, from implicit to explicit: automatic parallelization, skeletons, data-parallelism, parallel extensions, concurrent programming
3
Projects 2002-2004
ACI Grid (LIFO, LACL, PPS, INRIA): design of parallel and Grid libraries for OCaml.
ACI « Young researchers » (LIFO, LACL): production of a programming environment in which certified parallel programs can be written and safely executed.
4
Outline
The BSML language
Parallel compositions
Superposition: types and semantics
Juxtaposition: types and semantics
Implementation of the juxtaposition
Conclusion and future work
5
The BSML language
6
The BSP model
BSP architecture: processor/memory pairs (P/M), a network, and a unit of synchronization
Characterized by:
p: number of processors
r: processor speed
L: cost of a global synchronization
g: cost of a communication phase in which each processor sends or receives at most 1 word
7
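Model of execution
$T(s) = \max_{0 \le i < p} w_i + h \cdot g + L$
where $w_i$ is the local computation time of processor $i$ during superstep $s$ and $h$ is the largest number of words sent or received by any processor during that superstep.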
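As a minimal sketch of the cost formula above, the following OCaml function computes the cost of one superstep. The record type `bsp_params` and the function `superstep_cost` are hypothetical names introduced for illustration; they are not part of BSML.

```ocaml
(* Hypothetical record holding the BSP parameters g and L of a machine. *)
type bsp_params = { g : float; l : float }

(* Cost of one superstep: T(s) = max_i w_i + h * g + L.
   [work] holds the local computation time w_i of each processor;
   [h] is the largest number of words sent or received by any processor. *)
let superstep_cost params work h =
  let w_max = Array.fold_left max 0.0 work in
  w_max +. float_of_int h *. params.g +. params.l
```

With g = 2, L = 10, local work times 3, 7 and 5, and h = 4, the function returns 7 + 8 + 10 = 25.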
8
Example: broadcast
Direct broadcast: cost $= p \cdot n \cdot g + L$
Broadcast with 2 phases: cost $= 2 \cdot n \cdot g + 2L$
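The two formulas above can be compared directly. The sketch below encodes them as given on the slide; the function names and labelled arguments are illustrative assumptions. Subtracting one formula from the other shows that the two-phase broadcast is cheaper exactly when (p - 2)·n·g > L.

```ocaml
(* Cost of broadcasting n words on p processors, following the two
   formulas on the slide. g and l are the BSP parameters g and L. *)
let direct_cost ~p ~n ~g ~l = float_of_int (p * n) *. g +. l

let two_phase_cost ~n ~g ~l = 2.0 *. float_of_int n *. g +. 2.0 *. l

(* The two-phase broadcast wins when (p - 2) * n * g > L. *)
let two_phase_is_cheaper ~p ~n ~g ~l =
  two_phase_cost ~n ~g ~l < direct_cost ~p ~n ~g ~l
```

For small messages or few processors the extra synchronization barrier dominates and the direct broadcast remains preferable.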
9
The BSML language
λ-calculus + parallel constructions = BSλ-calculus
ML + parallel primitives = BSML
Structured parallelism as an explicit parallel extension of ML
Functional language with BSP cost predictions
Allows the implementation of skeletons
Implemented as a parallel library for the "Objective Caml" language
Uses a parallel data structure called parallel vector
10
A BSML program
[Figure: structure of a BSML program, with parallel vectors f0, f1, …, fp-1 and g0, g1, …, gp-1; labels: Sequential part, Replicated part, Parallel part]
11
Parallel primitives of BSML
Asynchronous primitives:
Creation of a vector: mkpar : (int → α) → α par
Parallel pointwise application: apply : (α → β) par → α par → β par
Synchronous and communication primitives:
Communications: put : (int → α option) par → (int → α option) par
Projection of values: proj : α option par → (int → α option)
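To make the meaning of the four primitives concrete, here is a purely sequential model in plain OCaml. It is a sketch under strong assumptions: a parallel vector is modelled as an array with one cell per processor, the machine size `p` is fixed to 4, and nothing is actually distributed. The real BSML library keeps the vector type abstract and runs on a parallel machine.

```ocaml
(* Sequential model, for illustration only: a parallel vector is an
   array with one cell per processor. *)
type 'a par = 'a array

let p = 4                        (* assumed number of processors *)
let bsp_p () = p

(* mkpar f = < f 0, ..., f (p-1) > *)
let mkpar (f : int -> 'a) : 'a par = Array.init p f

(* apply < f_0, ..., f_(p-1) > < v_0, ..., v_(p-1) > = < f_0 v_0, ... > *)
let apply (fs : ('a -> 'b) par) (vs : 'a par) : 'b par =
  Array.init p (fun i -> fs.(i) vs.(i))

(* put: f_i j is the message that processor i sends to processor j.
   In the result, the function held by processor j maps a sender i
   to the message received from i. *)
let put (fs : (int -> 'a option) par) : (int -> 'a option) par =
  Array.init p (fun j -> fun i ->
    if 0 <= i && i < p then fs.(i) j else None)

(* proj: gives access to the value held by each processor. *)
let proj (v : 'a option par) : int -> 'a option =
  fun i -> if 0 <= i && i < p then v.(i) else None

(* Usage: double the processor identifiers pointwise. *)
let pids = mkpar (fun i -> i)
let doubled = apply (mkpar (fun _ -> fun x -> 2 * x)) pids
```

In this model `mkpar` and `apply` need no data exchange between cells, which is why the slide classifies them as asynchronous, whereas `put` and `proj` read cells belonging to other processors.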
12
Semantics
Natural semantics: programming model, easy for proofs (Coq)
Small-step semantics: easy for costs
Distributed semantics: execution model, makes asynchronous steps appear, close to a real implementation
13
Parallel compositions
14
Multi-programming
Several programs on the same machine
New primitives of parallel composition:
Superposition
Juxtaposition (implemented using the superposition)
Divide-and-conquer BSP algorithms
15
Parallel Superposition
super : (unit → α) → (unit → β) → α × β
super E1 E2 = (E1 (), E2 ())
Fusion of communications/synchronisations using super-threads
Keeps the BSP model
Pure functional semantics
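The value computed by the superposition can be modelled sequentially in one line. This sketch only captures the result of `super`; the fusion of the communication and synchronisation phases of the two super-threads, which is the point of the real primitive, is not modelled.

```ocaml
(* Sequential model of the result of super: both delayed expressions are
   evaluated and their results are paired. The explicit lets fix the
   evaluation order, which OCaml leaves unspecified for tuples. *)
let super (e1 : unit -> 'a) (e2 : unit -> 'b) : 'a * 'b =
  let a = e1 () in
  let b = e2 () in
  (a, b)
```

The arguments are functions from unit so that neither expression is evaluated before `super` takes control of both.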
16
Parallel Superposition
17
Parallel juxtaposition
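juxta : int → (unit → α par) → (unit → α par) → α par
Fusion of communications/synchronisations on each sub-machine
Keeps the BSP model
Side-effect on the number of processors
juxta m E1 E2 = < v0, …, vi, …, vm-1, v'0, …, v'j, …, v'p-1-m >
where E1 evaluates to < v0, …, vi, …, vm-1 > on the first m processors and E2 evaluates to < v'0, …, v'j, …, v'p-1-m > on the remaining p-m processors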
18
Parallel juxtaposition
[Figure: communication and synchronisation phases of E1, E2 and E3 = (juxta 3 E1 E2)]
19
Distributed semantics
Semantics = set of parallel rewriting rules
SPMD style: a parallel vector is represented by the parts of the parallel vector
[Figure: Prog under natural evaluation and copies of Prog under distributed evaluation; labels: Confluent, Equivalent]
Prog:
    let scan op vec =
      let rec scan' fst lst op vec =
        if fst >= lst then vec
        else
          let mid = (fst + lst) / 2 in
          let vec' = mix mid (super (fun () -> scan' fst mid op vec)
                                    (fun () -> scan' (mid+1) lst op vec)) in
          let com = ... (* send wm to processes m+1…p+1 *) in
          let op' = ... (* applies op to wm and wi, m<i<p *) in
          parfun2 op' com vec'
      in scan' 0 (bsp_p () - 1) op vec
20
Implementation of the juxtaposition
21
Use of the superposition
2 references that contain the number of processors of a sub-machine and the real PID of virtual processor 0 of that sub-machine
Creation of incomplete vectors
Each sub-machine runs in a super-thread
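The role of the two references can be illustrated with a sequential model. Everything below is an assumption made for the sketch: the names `real_p`, `sub_p` and `sub_first`, the fixed machine size of 4, and the representation of an incomplete vector as an array of options. The two sub-computations run one after the other here, whereas the slide states that each sub-machine runs in a super-thread.

```ocaml
(* Sequential model for illustration; this is not the BSML source. *)
let real_p = 4                   (* assumed size of the whole machine *)
let sub_p = ref real_p           (* processors in the current sub-machine *)
let sub_first = ref 0            (* real pid of virtual processor 0 *)

let bsp_p () = !sub_p

(* An incomplete vector: cells outside the current sub-machine are None. *)
type 'a par = 'a option array

(* mkpar sees virtual pids 0 .. bsp_p () - 1 and fills only the cells of
   the current sub-machine. *)
let mkpar (f : int -> 'a) : 'a par =
  let first = !sub_first and size = !sub_p in
  Array.init real_p (fun pid ->
    if first <= pid && pid < first + size then Some (f (pid - first))
    else None)

(* juxta m e1 e2: e1 runs on the first m processors of the current
   sub-machine and e2 on the remaining ones; the two incomplete vectors
   are then merged cell by cell. *)
let juxta m (e1 : unit -> 'a par) (e2 : unit -> 'a par) : 'a par =
  let first = !sub_first and size = !sub_p in
  if m < 0 || m > size then invalid_arg "juxta";
  let run f p e =
    sub_first := f;
    sub_p := p;
    (* Restore the enclosing sub-machine even if e raises. *)
    Fun.protect ~finally:(fun () -> sub_first := first; sub_p := size) e
  in
  let v1 = run first m e1 in
  let v2 = run (first + m) (size - m) e2 in
  Array.init real_p (fun pid ->
    match v1.(pid) with
    | Some _ as x -> x
    | None -> v2.(pid))

(* Usage: each side records its virtual pid and the bsp_p () it observes. *)
let v =
  juxta 3
    (fun () -> mkpar (fun i -> (i, bsp_p ())))
    (fun () -> mkpar (fun i -> (i, bsp_p ())))
```

In the usage, the first three cells observe a machine of 3 processors and the last cell observes a machine of 1 processor with virtual pid 0, which is the side effect on the number of processors mentioned for juxta.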
22
Example, parallel prefixes
scan : (α → α → α) → α par → α par
scan (+) < v0, …, vp-1 > = < v0, v0+v1, …, v0+v1+…+vp-1 >
[Figure: parallel prefix computed over the processors by combining values with op]
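The specification of scan and its divide-and-conquer structure can be checked on a sequential model. In this sketch a parallel vector is modelled as a plain array (an assumption), the two recursive calls stand in for the two juxtaposed or superposed sub-computations, and the loop stands in for the communication step in which the last value of the left half is sent to every processor of the right half.

```ocaml
(* Sequential divide-and-conquer prefix on an array standing in for a
   parallel vector. Assumes op is associative. *)
let scan (op : 'a -> 'a -> 'a) (vec : 'a array) : 'a array =
  let res = Array.copy vec in
  let rec go fst lst =
    if fst < lst then begin
      let mid = (fst + lst) / 2 in
      go fst mid;                (* prefixes of the left half *)
      go (mid + 1) lst;          (* prefixes of the right half *)
      let w = res.(mid) in       (* last value of the left half *)
      for i = mid + 1 to lst do
        res.(i) <- op w res.(i)  (* combine it into the right half *)
      done
    end
  in
  go 0 (Array.length vec - 1);
  res
```

The left operand of `op` is always the value coming from the lower-numbered cells, so the model also works for non-commutative operators such as string concatenation.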
23
Juxta versus Super
Code of a direct method: 12 lines
Code with superposition: 8 lines
Code with juxtaposition: 6 lines
24
Performances
[Figure: time (s) versus size of the polynomials for the direct method (BSML+MPI), the D-a-C method with superposition, and the D-a-C method with juxtaposition]
25
Conclusion and future work
26
Conclusion
BSML = BSP + ML
Superposition = primitive of parallel composition
Juxtaposition is easier for divide-and-conquer algorithms
Distributed semantics of the juxtaposition
Juxtaposition implemented using superposition
Similar performances
27
Future work
Proofs of the implementation using the semantics
Implementation of bigger algorithms
BSP model-checking of high-level Petri nets (M-nets)
28
Thanks for your attention