A Performance Prediction Framework for Grid-Based Data Mining Applications
Leonid Glimcher, Gagan Agrawal
Computer Science and Engineering
IPDPS'07

Motivating Scenario
(Figure: user, data repository clusters, compute clusters.)
Three stages: disk I/O, network transfer, compute.

Remote Data Analysis
Remote data analysis:
– The grid is a good fit
– The details can be very tedious
Middleware abstracts away many of the development details.
Resource selection is crucial to performance.
Performance prediction facilitates resource selection.

Presentation Road Map
– Problem statement and motivation
– Middleware background
– Our performance prediction approach
– Experimental evaluation
– Related work
– Conclusions

Problem Statement
Given:
– a parallel data processing application
– its execution time breakdown (profile)
– configurations of available computing resources
– dataset replicas in repositories of different sizes
Predict the application's execution time in order to select the right dataset replica and resource configuration.

FREERIDE-G Design
(Figure: FREERIDE-G architecture.)

FREERIDE-G Processing
Key observation: most data mining algorithms follow a canonical loop.
Middleware API:
– subset of data to be processed
– reduction object
– local and global reduction operations
– iterator
Canonical loop:

    while ( ) {
        forall (data instances d) {
            I = process(d)
            R(I) = R(I) op d
        }
        .......
    }

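As an illustration only (not the actual FREERIDE-G API), here is a minimal Python sketch of this generalized reduction structure, using k-means-style accumulation as the process step; the function names and the dictionary-based reduction object are assumptions made for this example.

```python
import numpy as np

def local_reduction(points, centroids):
    """Local reduction on one compute node: the canonical loop above,
    instantiated for k-means-style cluster accumulation."""
    k, dim = centroids.shape
    # Reduction object: per-cluster running sums and counts.
    red_obj = {"sums": np.zeros((k, dim)), "counts": np.zeros(k)}
    for d in points:                                             # forall(data instances d)
        i = int(np.argmin(((centroids - d) ** 2).sum(axis=1)))  # I = process(d)
        red_obj["sums"][i] += d                                  # R(I) = R(I) op d
        red_obj["counts"][i] += 1
    return red_obj

def global_reduction(local_objects):
    """Global reduction: merge the reduction objects from all compute nodes."""
    merged = local_objects[0]
    for obj in local_objects[1:]:
        merged["sums"] += obj["sums"]
        merged["counts"] += obj["counts"]
    return merged
```
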
Performance Prediction Approach
Three phases of execution:
– retrieval at the data server
– data delivery to the compute node
– parallel processing at the compute node
Special processing structure: generalized reduction.

T_exec = T_disk + T_network + T_compute

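This decomposition can be read directly as a predictor that sums the three modeled phase times; a trivial sketch (the function name is ours, not the paper's):

```python
def predict_execution_time(t_disk, t_network, t_compute):
    """T_exec = T_disk + T_network + T_compute: the total predicted time is
    the sum of the retrieval, delivery, and processing predictions."""
    return t_disk + t_network + t_compute
```
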
Needed Profile Information
– numbers of storage nodes (n) and compute nodes (c)
– available bandwidth between them (b) in the profile configuration
– execution time breakdown: data retrieval (t_d), network communication (t_n), and data processing (t_c) components
– dataset size (s)
– reduction object information: maximum size, communication time, global reduction time

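These profile inputs could be gathered into a single record; a possible container, with illustrative field names that are not from the paper:

```python
from dataclasses import dataclass

@dataclass
class BaseProfile:
    """Profile information gathered from one base run (illustrative)."""
    n: int            # number of storage (data host) nodes
    c: int            # number of compute nodes
    b: float          # data-host to compute-node bandwidth
    s: float          # dataset size
    t_d: float        # measured data retrieval time
    t_n: float        # measured network communication time
    t_c: float        # measured data processing time
    r_max: float      # maximum reduction object size
    t_ro: float       # measured reduction object communication time
    t_g: float        # measured global reduction time
```
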
Data Retrieval and Communication Time
Data retrieval: the dataset size (s) and number of data hosts (n) for the base profile and for the predicted configuration (s', n') are used to scale t_d.
Data communication: also needs the dataset size and number of data hosts, as well as the bandwidth (b and b'); these are used to scale t_n.

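The slide states which quantities t_d and t_n are scaled by, but not the exact formulas; one plausible linear-scaling reading, offered purely as an assumption, is:

```python
def predict_retrieval_time(t_d, s, n, s_new, n_new):
    """Assumed model: retrieval time is proportional to the data read
    per storage node, i.e. to (s / n)."""
    return t_d * (s_new / s) * (n / n_new)

def predict_network_time(t_n, s, n, b, s_new, n_new, b_new):
    """Assumed model: delivery time is proportional to the data sent per
    data host and inversely proportional to the available bandwidth."""
    return t_n * (s_new / s) * (n / n_new) * (b / b_new)
```
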
Initial Data Processing Time Prediction
The dataset size (s) and number of compute nodes (c) in the base profile (s, c) and in the predicted configuration (s', c') are used to scale t_c.
Limitations: this initial model does not account for
– inter-processor communication time
– global reduction time

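In the same spirit, the initial processing-time prediction could scale t_c by the data processed per compute node; again an assumed form, with the limitations listed above addressed by the refined models on the next slides:

```python
def predict_compute_time(t_c, s, c, s_new, c_new):
    """Assumed initial model: processing time is proportional to the data
    processed per compute node (s / c); reduction-object communication and
    global reduction are modeled separately."""
    return t_c * (s_new / s) * (c / c_new)
```
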
Modeling Interprocessor Communication
Parallel computation involves communication of the reduction object.
The communication time (T_ro) depends on:
– the reduction object size (r)
– the interprocessor bandwidth (w)
– the latency (l)
The reduction object size either remains constant or scales linearly.

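One simple way to combine these quantities, offered only as an assumption consistent with the serialized exchange described on the next slide, is a latency-plus-transfer term per participating node:

```python
def reduction_comm_time(r, w, l, c):
    """Assumed model: each of the other (c - 1) compute nodes sends its
    reduction object of size r over a link with bandwidth w and latency l,
    and the sends are serialized."""
    return (c - 1) * (l + r / w)
```
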
Modeling Global Reduction
The global reduction (time T_g) is also serialized.
Depending on the application, the global reduction time either:
– scales linearly with the number of nodes but is constant with respect to dataset size, or
– stays constant with respect to the number of nodes but scales linearly with dataset size.

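The two cases can be written as a branch; the specific linear-scaling forms below are assumptions consistent with the slide's description, not the paper's exact formulas:

```python
def predict_global_reduction_time(t_g, c, c_new, s, s_new, scales_with_nodes):
    """Two application-dependent cases:
    - scales_with_nodes=True:  T_g grows linearly with the node count and
      is independent of dataset size.
    - scales_with_nodes=False: T_g grows linearly with dataset size and is
      independent of the node count."""
    if scales_with_nodes:
        return t_g * (c_new / c)
    return t_g * (s_new / s)
```
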
Modeling Across Heterogeneous Clusters
Scaling factors are needed for all three stages of the computation; they are obtained from a set of representative applications.

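A sketch of how per-stage scaling factors might be measured and applied; the multiplicative-correction form and the names are assumptions, not the paper's exact procedure:

```python
def measure_scaling_factor(time_on_base_cluster, time_on_target_cluster):
    """Per-stage factor obtained by running a representative application
    on both clusters and comparing the measured stage times."""
    return time_on_target_cluster / time_on_base_cluster

def predict_on_target_cluster(t_disk, t_network, t_compute,
                              f_disk, f_network, f_compute):
    """Apply the per-stage factors to the times predicted for the base
    cluster, then sum the corrected stage times."""
    return t_disk * f_disk + t_network * f_network + t_compute * f_compute
```
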
FREERIDE-G Applications
Data mining:
– k-means clustering
– KNN search
– EM clustering
Scientific data processing:
– vortex extraction
– molecular defect detection and categorization

Experimental Setup
Base configuration: 700 MHz Pentiums connected through Myrinet LANai 7.0.
Heterogeneous prediction: 2.4 GHz Opteron 250s connected through InfiniBand (1 Gb).
Goal: to correctly model changes in
1. parallel configuration
2. dataset size
3. network bandwidth
4. underlying resources

Modeling Parallel Performance
Prediction errors for the three approaches on:
1. vortex detection; base: 1-1 configuration, 710 MB dataset
2. defect detection; base: 1-1 configuration, 130 MB dataset
Results:
– modeling the reduction pays off
– predictions are accurate

Modeling Dataset Size
Prediction errors for the best approach on:
1. EM clustering (1.4 GB); base: 1-1 configuration, 350 MB dataset
2. defect detection (1.8 GB); base: 1-1 configuration, 130 MB dataset
Results:
– the biggest error occurs when the number of data nodes equals the number of compute nodes
– predictions are accurate overall

Impact of Network Bandwidth
Prediction errors for the best approach on:
1. EM clustering (250 Kbps); base: 1-1 configuration, 500 Kbps
2. defect detection (250 Kbps); base: 1-1 configuration, 500 Kbps
Results:
– the biggest error occurs when the number of data nodes equals the number of compute nodes
– modeling the reduction is most accurate

Predictions for a Different Type of Cluster
Prediction errors for the best approach on:
1. defect detection (1.8 GB); base: 1-1 configuration, 710 MB dataset
2. EM clustering (700 MB); base: 8-8 configuration, 350 MB dataset
Results:
– the scaling factors are different
– the largest error occurs when the predicted configuration has the same number of compute nodes as the base

Existing Work
Three broad categories of approaches to resource allocation:
– heuristic approaches to mapping
– prediction through modeling:
  – statistical estimation/prediction
  – analytical modeling of the parallel application
– simulation-based performance prediction

Summary
Our performance prediction approach exploits similarities in application processing structure to produce very accurate predictions.
The approach accurately models changes in:
– computing configuration
– dataset size
– network bandwidth
– underlying compute resources