1
Star-P and the Knowledge Discovery Suite Steve Reinhardt, spr@InteractiveSupercomputing.com Viral Shah, vshah@InteractiveSupercomputing.com John Gilbert, gilbert@cs.ucsb.edu Stefan Karpinski, sgk@cs.ucsb.edu
2
Context for Knowledge Discovery
3
Goal for Large-scale Knowledge Discovery via Star-P/KDS
Enable the domain expert to explore big unstructured data interactively:
– Domain expert: a scientist or analyst, not a math or graph expert
– Explore: human-guided characterization, from simple statistics to complex clustering or factoring, even when the best algorithm is not known
– Big: 20 GB+ of data commonplace, >1 TB largest
– Unstructured data: e.g., arising from metabolic networks, climate change, social interactions, and Internet traffic
– Interactively: common queries take O(10 seconds) on a 30-128P Altix
Allow the analyst to discern structure in the data via exploration
KDS implements many key algorithms; extensible for other algorithms
– Depends on algorithms being general-purpose, reusable, and usable by non-experts
4
Star-P Basics
MATLAB® is a registered trademark of The MathWorks, Inc. ISC's products are not sponsored or endorsed by The MathWorks, Inc. or by any other trademark owner referred to in this document.
5
A Comprehensive (?) Picture of Knowledge Discovery
[Block diagram of the knowledge-discovery stack; components shown:]
– Input: HDF5, OPeNDAP, Hadoop, live data feeds
– Data structures (sparse matrices, ...) and parallel constructs
– Utility (sort, indexing)
– Graph primitives (BFS, MIS, conncomp, ...)
– Linear algebra (spmat*vec, ...)
– Solvers (MUMPS, SuperLU, ...)
– Dimensionality reduction / factorization: SVD, eigen, NMF, PCA
– Clustering: K-means, ...
– Classification: SVM, Bayesian, HMMs, ...
– Visualization: Renoir, In-Spire, ...
– Etc.
Legend: items such as SVD are implemented by ISC; items such as Hadoop are implemented by others
6
Knowledge Discovery Suite
Simple data analysis operations at very large scale
– Sorting, set operations, statistical operations
Graph operations on very large graphs
– Simple queries, breadth-first search, connected components, independent sets
Visualization with desktop tools
– Distributed image generation for large graphs and datasets
Clustering and decomposition
– K-means clustering, non-negative matrix factorization, principal component analysis
Bayesian network modeling
– Expectation maximization (Baum-Welch), hidden Markov models
What would you like to see?
7
Ways to Work Together
– You use Star-P infrastructure for parallel algorithm development/deployment
– You develop serial algorithms, we develop parallel versions
– We jointly develop parallel algorithms
– We develop or integrate missing functionality you need
– We target a joint client with an integrated package
8
A Brief Demo
9
Sample Kernels, Algorithms, and Workflows
10
Simplest KDS operation: Parallel Sorting
Simple, widely used combinatorial primitive
[W, perm] = sort (V)
Used in many sparse matrix and array algorithms: sparse (), indexing, concatenation, transpose, reshape, repmat, etc.
Communication efficient
Example (from the slide figure): input 3 6 8 1 5 4 7 2 9 → sorted 1 2 3 4 5 6 7 8 9
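From the MATLAB client the call looks exactly like serial sort; the minimal sketch below is illustrative only, with the *p suffix for creating a server-side distributed vector following Star-P's usual convention and the vector length made up:

    % Assumes a Star-P session is already attached to a parallel server.
    n = 1e8;                      % illustrative vector length
    V = rand(n*p, 1);             % *p distributes the vector across the server
    [W, perm] = sort(V);          % parallel sort runs server-side
    U = rand(n*p, 1);
    Usorted = U(perm);            % the permutation can reorder other distributed data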
11
Sorting performance
12
A complex workflow (SSCA#2, kernel 3)

function subgraphs = kernel3 (G, pathlen, starts)
% KERNEL3 : SSCA#2 Kernel 3 -- Graph Extraction
starts = starts(:,2);
nstarts = length(starts);
A = grsparse (G);
nv = nverts (G);

% Use sparse matrix multiplication to do several BFS searches at once.
s = sparse (starts, 1:nstarts, 1, nv, nstarts);
for k = 1:pathlen
    s = A * s;
    % Ideally reach should support this. Not yet.
    s = (s ~= 0);
end

for i = 1:nstarts
    x = s(:,i);
    vtxmap = find(x);
    S.graph = subgraph (G, vtxmap);
    S.vtxmap = vtxmap;
    subgraphs{i} = S;
end
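The heart of kernel 3 is running many breadth-first searches at once as sparse matrix products, one frontier per column. A tiny standalone illustration in ordinary MATLAB (the graph and start vertices below are made up; the accumulation of earlier levels is a slight simplification of the kernel above):

    % 6-vertex undirected graph as a sparse adjacency matrix.
    A = sparse([1 2 2 3 4 5], [2 3 4 5 5 6], 1, 6, 6);
    A = A + A';                         % symmetrize
    starts = [1 4];                     % two BFS sources
    s = sparse(starts, 1:2, 1, 6, 2);   % one frontier column per source
    for k = 1:3                         % three BFS levels
        s = spones(s + A * s);          % expand each frontier one hop, keep 0/1 reachability
    end
    disp(full(s))                       % column i marks vertices within 3 hops of starts(i)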
13
A complex workflow (SSCA#2, kernel 4)

function leader = kernel4f (G)
% KERNEL4F : SSCA#2 Kernel 4 -- Graph Clustering
n = nverts (G);    % number of vertices, as in kernel 3

% Find a Maximal Independent Set in G
[IS, misrounds] = mis (G);
fprintf ('MIS rounds: %d. MIS nodes: %d\n', misrounds, length(IS));

% Find neighbours of each node from the IS
neighFromIS = G * sparse(IS, IS, 1, n, n);

% Pick one of the neighbouring IS nodes as a leader
[ign, leader] = max (neighFromIS, [], 2);

% Collect votes from neighbours
[I, J] = find (G);
S = sparse (I, leader(J), 1, n, n);

% Pick the most popular leader among neighbours and join that cluster
[ign, leader] = max (S, [], 2);
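The last two statements are the voting step: each vertex tallies its neighbours' current leaders by scattering ones into a sparse matrix (sparse sums duplicate entries) and then joins the most popular cluster. A standalone toy run in plain MATLAB, with made-up graph and leader values:

    n = 5;
    G = sparse([1 2 2 3 4], [2 3 4 5 5], 1, n, n);
    G = G + G';                         % small undirected graph
    leader = [1 1 3 3 3]';              % current leader of each vertex
    [I, J] = find(G);                   % each edge lets vertex I hear J's vote
    S = sparse(I, leader(J), 1, n, n);  % S(v,c) = number of v's neighbours led by c
    [ign, leader] = max(S, [], 2);      % every vertex joins its most popular leader
    disp(leader')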
14
Scaling Performance: cSSCA#2 on 128P

scale   #vertices (== 2^scale)   #edges (~= 10 * #vertices)   graph size (bytes)   time (seconds)
22      4.194E+06                4.194E+07                    7.550E+08            122.51
24      1.678E+07                1.678E+08                    3.020E+09            402.31
26      6.711E+07                6.711E+08                    1.208E+10            1237.1

Timings scale well – for large graphs,
– 2x problem size → 2x time
– 2x problem size & 2x processors → same time
(Each row above quadruples the problem size, and the times grow by factors of roughly 3.3 and 3.1 – close to linear.)
15
Distributed visualization
16
App#1: Computational Ecology
– Modeling dispersal of species within a habitat (to maximize range)
– Large geographic areas, linked with GIS data
– Blend of numerical and combinatorial algorithms
Brad McRae and Paul Beier, "Circuit theory predicts gene flow in plant and animal populations", PNAS, Vol. 104, no. 50, December 11, 2007
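McRae and Beier's circuit-theory model treats the landscape as a resistor network, so the numerical core is solving sparse graph-Laplacian systems for node voltages and effective resistances. A minimal serial MATLAB sketch with a made-up 4-node conductance graph gives the flavour (on distributed data, the same backslash solve would go to the parallel sparse solvers, e.g. MUMPS or SuperLU, mentioned earlier):

    C = sparse([1 1 2 3], [2 3 3 4], [1.0 0.5 2.0 1.0], 4, 4);
    C = C + C';                        % symmetric conductance matrix
    L = diag(sum(C, 2)) - C;           % weighted graph Laplacian
    src = 1;  gnd = 4;                 % inject 1 A at node 1, ground node 4
    b = zeros(4, 1);  b(src) = 1;
    keep = setdiff(1:4, gnd);          % grounding one node makes the system nonsingular
    v = zeros(4, 1);
    v(keep) = L(keep, keep) \ b(keep); % sparse direct solve for node voltages
    Reff = v(src) - v(gnd)             % effective resistance between source and ground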
17
Results
– Solution time reduced from 3 days (desktop) to 5 minutes (14p) for typical problems
– Aiming for much larger problems: Yellowstone-to-Yukon (Y2Y)
18
App#2: Factoring network flow behavior [Karpinski, Almeroth, Belding]
19
Algorithmic exploration
Many NMF variants exist in the literature
– Not clear how useful on large data
– Not clear how to calibrate (i.e., number of iterations to converge)
NMF algorithms combine linear algebra and optimization methods
Basic and "improved" NMF factorization algorithms implemented:
– euclidean (Lee & Seung 2000)
– K-L divergence (Lee & Seung 2000)
– semi-nonnegative (Ding et al. 2006)
– left/right-orthogonal (Ding et al. 2006)
– bi-orthogonal tri-factorization (Ding et al. 2006)
– sparse euclidean (Hoyer et al. 2002)
– sparse divergence (Liu et al. 2003)
– non-smooth (Pascual-Montano et al. 2006)
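For reference, the basic euclidean variant (Lee & Seung 2000) comes down to two multiplicative updates per iteration. The minimal dense MATLAB sketch below is illustrative only: the random data, fixed iteration count, and epsilon guard are assumptions, not the tuned implementations listed above:

    m = 200;  n = 100;  r = 5;         % made-up sizes; A stands in for a traffic matrix
    A = rand(m, n);
    W = rand(m, r);  H = rand(r, n);   % random nonnegative initialization
    guard = 1e-9;                      % avoid division by zero
    for it = 1:100                     % fixed iteration count; no convergence test
        H = H .* (W' * A) ./ (W' * W * H + guard);
        W = W .* (A * H') ./ (W * (H * H') + guard);
    end
    err = norm(A - W * H, 'fro')       % reconstruction error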
20
NMF traffic analysis results
– NMF identifies essential components of the traffic
– Analyst labels different types of external behavior
21
Future Directions
What should KDS contain:
– More algorithms?
– Other classes of algorithms?
– Visualization?
– Easy use of hardware accelerators (GPU, Cell)?