A Programming Model of Hybrid Distributed/Shared Memory System in Data Mining Field
Chen Jin (HT016952H), Zhao Xu Ying (HT016907B)
Outline
Introduction
Mixed-mode Parallel Programming
Fuzzy c-Medoids Algorithm (FCMdd)
Serial Version of FCMdd
MPI Version of FCMdd
Mixed-mode Version of FCMdd
Comparisons between Pure MPI and Mixed-mode
Result Analysis
Conclusion
Introduction
This report discusses the benefits of developing mixed-mode MPI/OpenMP applications on clustered SMPs in the data mining field, focusing in particular on web log clustering.
We start with a realistic serial C++ FCMdd program. Next we show some of the modifications that were made to the program to enable it to run using MPI.
Introduction (Contd)
Then we show the simple modifications that were made to the source to take advantage of OpenMP.
We show that using a combination of MPI and OpenMP can be an effective method of programming hybrid systems.
Mixed-mode Parallel Programming
Shared-memory architectures are gradually becoming more prominent: advances in technology have allowed larger numbers of CPUs to access a single memory space.
Message-passing codes written in MPI are clearly portable and should transfer easily to clustered SMP systems, but it is not immediately clear that message passing is the most efficient parallelization technique within an SMP box.
In theory, a shared-memory model such as OpenMP should be preferable within a node.
Mixed-mode Cluster Hierarchical Model
Mixed-mode Parallel Programming (Contd)
Hypothesis: A combination of shared-memory and message-passing parallelization paradigms within the same application may provide a more efficient parallelization strategy than pure MPI.
FCMdd Algorithm Description
Web mining can be viewed as the extraction of structure from unlabeled, semi-structured data. Three operations are of particular interest:
Clustering: finding natural groupings of users or pages.
Associations: finding which URLs tend to be requested together.
Sequential analysis: finding the order in which URLs tend to be accessed.
Fuzzy c-Medoids Algorithm (FCMdd)
Dissimilarity: let r(x_j, v_i) denote the dissimilarity between session x_j and medoid v_i.
Objective function of FCMdd:
J_m(V; X) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \, r(x_j, v_i)    (1)
where the fuzzy membership of object x_j in cluster i is
u_{ij} = \frac{(1 / r(x_j, v_i))^{1/(m-1)}}{\sum_{k=1}^{c} (1 / r(x_j, v_k))^{1/(m-1)}}    (2)
and m > 1 is the fuzzifier.
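A minimal C++ sketch of the membership update in (2); the r[j][i] matrix layout and the function name are illustrative assumptions, not taken from the original program:

```cpp
#include <cmath>
#include <vector>

// Fuzzy membership u_ij of session x_j in cluster i, per equation (2).
// r[j][i] is the dissimilarity between session x_j and medoid v_i,
// and m > 1 is the fuzzifier. A real implementation must also guard
// against r[j][k] == 0 (x_j coincides with a medoid).
double membership(const std::vector<std::vector<double>>& r,
                  int j, int i, int c, double m) {
    const double e = 1.0 / (m - 1.0);
    double den = 0.0;
    for (int k = 0; k < c; ++k)
        den += std::pow(1.0 / r[j][k], e);
    return std::pow(1.0 / r[j][i], e) / den;
}
```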
The Role of Fuzzy Granularity in Web Mining
The categories and associations in web mining do not have crisp boundaries; they overlap considerably and are best described by fuzzy sets.
Bad exemplars (outliers) and incomplete data can easily occur in the data set.
Serial Version of FCMdd
Fix the number of clusters c; set iter = 0;
Pick the initial medoids V = {v_1, …, v_c} from X;
Repeat:
  Compute memberships u_{ij} for 1 ≤ i ≤ c, 1 ≤ j ≤ n by using (2);
  Store the current medoids: V_old = V;
  Compute the new medoid v_i for each cluster: v_i = x_q, where q = argmin_{1 ≤ k ≤ n} \sum_{j=1}^{n} u_{ij}^{m} r(x_j, x_k);
  iter = iter + 1;
Until (V_old = V or iter = MAX_ITER)
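The new-medoid step above can be sketched as follows (a naive O(n^2) search per cluster; the names and data layout are our own illustrative assumptions):

```cpp
#include <cmath>
#include <limits>
#include <vector>

// New-medoid step of the loop above: the new medoid of cluster i is
// the object x_q that minimizes the membership-weighted total
// dissimilarity. u[i][j] are memberships from (2); r[j][k] is the
// dissimilarity between objects x_j and x_k.
int newMedoid(const std::vector<std::vector<double>>& u,
              const std::vector<std::vector<double>>& r,
              int i, int n, double m) {
    int q = 0;
    double best = std::numeric_limits<double>::max();
    for (int k = 0; k < n; ++k) {        // candidate medoid x_k
        double cost = 0.0;
        for (int j = 0; j < n; ++j)      // weighted sum over all objects
            cost += std::pow(u[i][j], m) * r[j][k];
        if (cost < best) { best = cost; q = k; }
    }
    return q;                            // index of the new medoid v_i
}
```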
Properties of FCMdd
Designed to deal with data sets whose size is extremely large.
Time complexity is O(n·c·p), where n is the data size and c is the number of clusters.
MPI Version of FCMdd: Reading Data from Disk
Three alternatives for getting the data to the nodes:
Different nodes read different parts of the file, then broadcast them to the others.
Each node reads the same file separately.
A designated node reads all the data, then broadcasts it to the other nodes (sketched below).
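A minimal sketch of the third alternative, assuming a hypothetical file name sessions.dat and a flat array of doubles (neither is from the original code):

```cpp
#include <mpi.h>
#include <fstream>
#include <vector>

// Third alternative: a designated node (rank 0) reads the whole
// session file and broadcasts it to all other nodes.
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    long n = 0;
    std::vector<double> data;
    if (rank == 0) {                     // only the root touches disk
        std::ifstream in("sessions.dat");
        for (double v; in >> v; ) data.push_back(v);
        n = static_cast<long>(data.size());
    }
    MPI_Bcast(&n, 1, MPI_LONG, 0, MPI_COMM_WORLD);   // ship the size first
    data.resize(n);
    MPI_Bcast(data.data(), static_cast<int>(n), MPI_DOUBLE, 0, MPI_COMM_WORLD);

    // ... run FCMdd on the replicated data ...
    MPI_Finalize();
    return 0;
}
```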
MPI Version of FCMdd (Contd)
Choosing medoids from the sessions:
The kernel of FCMdd is the replacement of the original medoids by appropriate candidates.
The medoids can be considered separately in each iteration, so the work can be divided among the processes.
Combine the per-process medoid parts to obtain the complete array (sketched below).
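A sketch of the combining step using MPI_Allgather, assuming c is divisible by the number of processes; the function and variable names are illustrative:

```cpp
#include <mpi.h>
#include <vector>

// Combine the per-process medoid parts into the complete array.
// Each of the 'size' processes updates c/size medoid indices
// (assumes c % size == 0).
std::vector<int> combineMedoids(const std::vector<int>& localPart,
                                int c, MPI_Comm comm) {
    int size;
    MPI_Comm_size(comm, &size);
    std::vector<int> medoids(c);
    MPI_Allgather(localPart.data(), c / size, MPI_INT,
                  medoids.data(),   c / size, MPI_INT, comm);
    return medoids;   // every rank now holds all c medoid indices
}
```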
Performance Analysis of MPI Mode
Record count: 138,384
1 node: … s
2 nodes: … s
4 nodes: … s
8 nodes: … s
Performance Analysis of MPI Mode (Contd)
Record count: 1,882,384
1 node: … s
2 nodes: … s
4 nodes: … s
8 nodes: … s
Mixed-mode Version of FCMdd
Two kinds of mixed-mode programming models:
1. MPI parallelization occurs across hosts at the top level, with OpenMP parallelization below it, within each node. The environment variable OMP_NUM_THREADS can be set for each host, either to the same value everywhere or to a different value per host.
2. MPI and OpenMP parallelization both occur within a host. The environment variable sets the same number of OpenMP threads for all MPI processes; if each MPI process needs a different number of threads, omp_set_num_threads must be called in each MPI process. (A skeleton is sketched below.)
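A minimal hybrid skeleton illustrating the second model; the MPI_THREAD_FUNNELED level and the per-rank thread counts are illustrative assumptions:

```cpp
#include <mpi.h>
#include <omp.h>
#include <cstdio>

// Hybrid skeleton: every MPI process spawns its own OpenMP threads.
int main(int argc, char** argv) {
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Override OMP_NUM_THREADS when ranks need different thread counts.
    omp_set_num_threads(rank == 0 ? 4 : 2);

    #pragma omp parallel
    std::printf("rank %d: thread %d of %d\n",
                rank, omp_get_thread_num(), omp_get_num_threads());

    MPI_Finalize();
    return 0;
}
```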
Mixed-mode Version of FCMdd (Contd)
The left figure corresponds to the parallelization for a cluster of uniprocessor nodes, where an MPI process is allocated on each node.
Mixed-mode Version of FCMdd (Contd)
The right figure shows how the computation part of the process is split into threads using OpenMP directives. P is the part of the code that cannot be parallelized with OpenMP, and Pn is the OpenMP-parallel part (see the sketch below).
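A sketch of how the Pn part might look: a single OpenMP directive splits the membership loop across threads, while everything around it remains serial (P). The data layout and names are illustrative assumptions:

```cpp
#include <cmath>
#include <vector>

// Pn: the per-session membership computation of equation (2), split
// across OpenMP threads; sessions are independent, so the loop over j
// parallelizes cleanly. The surrounding serial code is P.
void computeMemberships(std::vector<std::vector<double>>& u,
                        const std::vector<std::vector<double>>& r,
                        int c, int n, double m) {
    const double e = 1.0 / (m - 1.0);
    #pragma omp parallel for
    for (int j = 0; j < n; ++j) {            // one session per iteration
        double den = 0.0;
        for (int k = 0; k < c; ++k)
            den += std::pow(1.0 / r[j][k], e);
        for (int i = 0; i < c; ++i)
            u[i][j] = std::pow(1.0 / r[j][i], e) / den;
    }
}
```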
Comparisons Between MPI and Mixed-mode Programming
We tested on hydra within one node, varying the database size.
Comparisons Between MPI and Mixed-mode Programming
Data size is 610K on Beowulf:
Comparisons Between MPI and Mixed-mode Programming
Data size is 8M on Beowulf:
Comparisons Between MPI and Mixed-mode Programming (Contd)
The two modes have similar performance; however, there are still differences.
Comparisons Between MPI and Mixed-mode Programming (Contd)
Within one host, mixed mode shows more potential to improve efficiency than pure MPI programming: our test results on hydra show that, inside one host, mixed mode beats pure MPI. In theory, OpenMP makes better use of the shared-memory architecture in a purely shared-memory setting.
Comparisons Between MPI and Mixed-mode Programming (Contd)
In a hybrid shared/distributed memory system, a mixed-mode implementation can be more efficient for large problems that scale poorly under pure MPI on large numbers of processors.
Comparisons Between MPI and Mixed-mode Programming (Contd)
In theory, a mixed-mode code, with MPI parallelization across the SMP hosts and OpenMP parallelization within each host, should be more efficient on an SMP cluster, since the model matches the architecture more closely than a pure MPI model does. The test results on Beowulf also support this conclusion.
Conclusion
Hypothesis: a combination of shared-memory and message-passing parallelization paradigms within the same application may provide a more efficient parallelization strategy than pure MPI.
Our results support this hypothesis.
Contribution
Chen Jin and Zhao XuYing: data preprocessing, serial version, MPI version, mixed-mode version, comparison.
END