1 Supporting a Volume Rendering Application on a Grid-Middleware For Streaming Data Liang Chen Gagan Agrawal Computer Science & Engineering Ohio State University
2 Introduction - Motivation What is a data stream? –Data stream: data arrive continuously –Enormous volume; must be processed online –Need to be processed in real time –Data sources could be distributed Data Stream Applications: –Online network intrusion detection –Sensor networks –Network Fault Management system for telecommunication network elements
3 Introduction - Motivation Network Fault Management (NFM) system analyzing distributed alarm streams [Figure labels: Switch Network; X; NFM (Network Fault Management) System]
4 Introduction - Motivation Challenges –Data and/or computation intensive –System can be easily overloaded [Figure labels: Switch Network; X]
5 Introduction - Motivation Possible solutions –Grid computing technologies –Automatically adjust processing rate [Figure label: Switch Network]
6 Introduction - Our Approach We implemented a middleware to meet these needs. Previous work:
1. Utilizing existing grid standards: Liang Chen, K. Reddy and G. Agrawal, “GATES: A Grid-Based Middleware for Processing Distributed Data Streams”. HPDC.
2. Providing self-adaptation functionality: Liang Chen and G. Agrawal, “Supporting Self-Adaptation in Streaming Data Mining Applications”. IPDPS.
3. Supporting automatic resource allocation: Liang Chen and G. Agrawal, “A Static Resource Allocation Framework for Grid-Based Streaming Applications”. Concurrency and Computation: Practice and Experience, Volume 18, Issue 6.
4. Supporting efficient dynamic migration: Liang Chen, Q. Zhu and G. Agrawal, “Supporting Dynamic Migration in Tightly Coupled Grid Applications”. SC 2006.
7 Roadmap Introduction GATES Overview Adaptive Volume Rendering Conclusions
8 GATES Architecture and Design –Uses the Globus Toolkit, built on OGSA –Allows users to specify their algorithms, implemented in Java –Takes care of plugging user-defined algorithms into the system and running them on the Grid –Applications need to be broken down into a number of pipelined stages
9 System Architecture and Design (Architecture) [Figure: an application split into Stage A, Stage B and Stage C. Legend: GATES services; stages of an application; queues between Grid services; buffers for applications]
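10 System Architecture and Design (GATES API Functions)
public class SecondStage implements StreamProcessing {
    …
    public void work(buffer in, buffer out) {
        …
        while (true) {
            // Fetch the next item from this stage's input buffer
            Object data = GATES.getFromInputBuffer(in);
            // Apply the user-defined processing for this stage
            Object interResults = processing(data);
            // Hand the intermediate results to the next stage
            GATES.putToOutputBuffer(out, interResults);
        }
    }
}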
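The read, process, write pattern of `work(in, out)` can be tried locally without a grid. The following is a minimal sketch under stated assumptions: `LocalStage` and `LocalPipelineDemo` are hypothetical names, plain `java.util.Queue` objects stand in for the GATES buffers, and the stages run one after another instead of concurrently on separate nodes.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Function;

// Hypothetical local stand-in for one pipelined stage; not part of GATES.
final class LocalStage<I, O> {
    private final Function<I, O> processing;

    LocalStage(Function<I, O> processing) {
        this.processing = processing;
    }

    // Drain the input buffer, process each item, and append it to the output buffer.
    void work(Queue<I> in, Queue<O> out) {
        I data;
        while ((data = in.poll()) != null) {
            out.add(processing.apply(data));
        }
    }
}

final class LocalPipelineDemo {
    // Runs two stages back to back: square each value, then format it as text.
    static Queue<String> run(int... values) {
        Queue<Integer> source = new ArrayDeque<>();
        for (int v : values) {
            source.add(v);
        }
        Queue<Integer> middle = new ArrayDeque<>();
        Queue<String> sink = new ArrayDeque<>();
        new LocalStage<Integer, Integer>(x -> x * x).work(source, middle);
        new LocalStage<Integer, String>(x -> "v=" + x).work(middle, sink);
        return sink;
    }
}
```

Unlike the `while(true)` loop on the slide, this sketch stops when its input queue is empty, which keeps it finite for experimentation.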
11 Adaptation Parameter Definition: –A parameter in an application –Changing the parameter’s value changes the processing rate of the application and also affects the accuracy of the processing Two kinds of adaptation parameters: –Performance parameter: increasing its value raises the processing rate and lowers accuracy –Accuracy parameter: increasing its value raises accuracy and lowers the processing rate –Example: sampling rate is an accuracy parameter
12 Pseudocode Again, with Self-Adaptation API Functions
public class SecondStage implements StreamProcessing {
    …
    // Initialize the sampling rate to the midpoint of its range
    double samplingRate = (Max + Min) / 2;

    public void work(buffer in, buffer out) {
        // Register the sampling rate as an accuracy parameter with its bounds
        GATES.specifyAccuracyPara(samplingRate, Max, Min);
        while (true) {
            Object data = GATES.getFromInputBuffer(in);
            Object interResults = processing(data, samplingRate);
            GATES.putToOutputBuffer(out, interResults);
            // Adopt the value the middleware suggests for the next iteration
            samplingRate = GATES.getSuggestedValue();
        }
    }
}
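The slides do not say how the middleware arrives at the value returned by `getSuggestedValue()`. Purely as an illustration, the sketch below assumes a simple rule driven by how full the stage's input buffer is. The class `SamplingRateAdvisor`, the `queueFill` input, and the 0.25 and 0.75 thresholds are all hypothetical and are not the GATES algorithm.

```java
// Hypothetical controller; the slides do not describe how GATES computes its suggestion.
final class SamplingRateAdvisor {
    private final double min;
    private final double max;
    private final double step;
    private double current;

    SamplingRateAdvisor(double min, double max, double step) {
        if (min > max || step <= 0) {
            throw new IllegalArgumentException("need min <= max and step > 0");
        }
        this.min = min;
        this.max = max;
        this.step = step;
        this.current = (max + min) / 2; // same starting point as the slide's pseudocode
    }

    // queueFill is the fraction of the input buffer in use, in [0, 1] (assumed input).
    double suggest(double queueFill) {
        if (queueFill > 0.75) {
            current -= step; // overloaded: sample less, trading accuracy for speed
        } else if (queueFill < 0.25) {
            current += step; // mostly idle: sample more to regain accuracy
        }
        // Keep the suggestion inside the range the application registered.
        current = Math.max(min, Math.min(max, current));
        return current;
    }
}
```

Because the sampling rate is an accuracy parameter, lowering it is the move that speeds processing up; a performance parameter would be adjusted in the opposite direction.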
13 Adaptive Volume Rendering Motivation – Grid computing is needed Visualization involves large volumes of data; we focus on streaming volume data. Interactively visualizing volume data in real time is needed, but it is: –Computationally intensive –Resource consuming –Not guaranteed to run in real time The places where data are generated are distributed. A typical client-server architecture is not scalable: –Network bandwidths of wide-area networks are low –Computing capability of a normal desktop is not enough Grid techniques would be a good solution: –Divide the procedure into stages organized in a pipeline –Allocate nodes close to the data source to pre-process volume data –The size of intermediate results is much smaller than the volume data
14 Adaptive Volume Rendering Motivation – GATES is desirable –Automatic adaptation is desirable: volume rendering algorithms running on a grid need to be highly adaptive; adaptation is usually achieved by manually adjusting adaptation parameters; such manual parameter adaptation is very challenging in a grid environment –Automatic resource allocation is desirable: a grid environment is highly changeable –The GATES middleware could fulfill these needs: it is grid-based, provides self-adaptation functionality to applications, and automatically allocates Grid resources
15 Adaptive Volume Rendering Overall design –Two pipelined steps – the first step: build octrees from volume data –An octree is a tree data structure in which each internal node has up to 8 children –Here, we use an octree to represent multiresolution information for a volume –The procedure to build an octree for a volume is as follows: »Divide the volume space into 8 subvolumes and create 8 child nodes »For each subvolume, calculate the standard deviation of all voxels in the subvolume and store it in the corresponding child node »If the deviation is larger than a pre-defined value, divide the subvolume and repeat the above procedure; otherwise, stop
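The build procedure can be sketched in a few lines of Java. This is an illustrative sketch, not the authors' implementation: `OctreeNode` and its fields are hypothetical names, the volume is assumed to be a cube whose edge length is a power of two, and the root's deviation is also computed and tested, whereas the slide always splits the root.

```java
// Hypothetical octree node; names are illustrative and not taken from GATES.
final class OctreeNode {
    final int x, y, z, size;      // origin and edge length of the subvolume, in voxels
    final double deviation;       // standard deviation of the voxels in the subvolume
    final OctreeNode[] children;  // null for a leaf, otherwise exactly 8 entries

    private OctreeNode(int x, int y, int z, int size, double deviation, OctreeNode[] children) {
        this.x = x;
        this.y = y;
        this.z = z;
        this.size = size;
        this.deviation = deviation;
        this.children = children;
    }

    // Assumes a cubic volume whose edge length is a power of two.
    static OctreeNode build(float[][][] volume, double threshold) {
        return build(volume, 0, 0, 0, volume.length, threshold);
    }

    private static OctreeNode build(float[][][] v, int x, int y, int z, int size, double threshold) {
        double dev = stdDev(v, x, y, z, size);
        if (dev <= threshold || size == 1) {
            return new OctreeNode(x, y, z, size, dev, null); // uniform enough: stop
        }
        int half = size / 2;
        OctreeNode[] kids = new OctreeNode[8];
        for (int i = 0; i < 8; i++) {
            // Bits 0, 1, 2 of i select the low or high half along x, y, z.
            kids[i] = build(v, x + (i & 1) * half, y + ((i >> 1) & 1) * half,
                            z + ((i >> 2) & 1) * half, half, threshold);
        }
        return new OctreeNode(x, y, z, size, dev, kids);
    }

    // Population standard deviation over the cube starting at (x, y, z) with the given edge.
    static double stdDev(float[][][] v, int x, int y, int z, int size) {
        double sum = 0;
        double sumSq = 0;
        for (int i = x; i < x + size; i++) {
            for (int j = y; j < y + size; j++) {
                for (int k = z; k < z + size; k++) {
                    double s = v[i][j][k];
                    sum += s;
                    sumSq += s * s;
                }
            }
        }
        double n = (double) size * size * size;
        double mean = sum / n;
        return Math.sqrt(Math.max(0.0, sumSq / n - mean * mean));
    }
}
```

Calling `OctreeNode.build(volume, threshold)` on a uniform volume yields a single leaf, while a volume with a localized feature is refined only around that feature, which is what makes the tree a multiresolution representation.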
16 Adaptive Volume Rendering Overall design –Two pipelined steps – the second step: use an octree and its corresponding volume to render images –Given an error tolerance (or user-defined resolution), use depth-first search (DFS) to traverse the octree, stopping at nodes where the deviation is less than the error tolerance –Project the corresponding 3D subvolumes onto an image
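The traversal in the second step can be sketched as follows. This is a minimal illustration under stated assumptions: `VolumeNode` and `OctreeCut` are hypothetical names, each node carries only a label and its stored deviation, and the projection itself is left out, so the method just returns the nodes whose subvolumes would be projected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal octree node for this sketch; names are illustrative.
final class VolumeNode {
    final String label;      // identifies the subvolume this node covers
    final double deviation;  // standard deviation stored when the octree was built
    final List<VolumeNode> children = new ArrayList<>(); // empty for a leaf

    VolumeNode(String label, double deviation, VolumeNode... kids) {
        this.label = label;
        this.deviation = deviation;
        for (VolumeNode k : kids) {
            children.add(k);
        }
    }
}

final class OctreeCut {
    // Depth-first traversal that stops descending once a node is within tolerance.
    static List<VolumeNode> select(VolumeNode root, double errorTolerance) {
        List<VolumeNode> selected = new ArrayList<>();
        visit(root, errorTolerance, selected);
        return selected;
    }

    private static void visit(VolumeNode node, double tolerance, List<VolumeNode> out) {
        // A leaf is always selected, because there is no finer level to descend to.
        if (node.deviation < tolerance || node.children.isEmpty()) {
            out.add(node); // this subvolume would be projected onto the image
            return;
        }
        for (VolumeNode child : node.children) {
            visit(child, tolerance, out);
        }
    }
}
```

A larger error tolerance cuts the tree closer to the root, so fewer and coarser subvolumes are projected: rendering gets faster and less accurate.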
17 Adaptive Volume Rendering
18 Adaptive Volume Rendering Make the rendering self-adaptive –Two adaptation parameters are used in the third stage: Error Tolerance (a performance parameter) and Image Size (an accuracy parameter) –Only one adaptation parameter can be adjusted by GATES, so we fix one and adjust the other
20 Adaptive Volume Rendering Experiment 1
21 Adaptive Volume Rendering Experiment 2
22 Adaptive Volume Rendering Experiment 3: compare the performance of two implementations –Java implementation –C implementation
23 Conclusion Grid computing could be an effective solution for distributed data stream processing. GATES provides: –Distributed processing –Use of grid web services –Self-adaptation to meet real-time constraints –Grid resource allocation schemes and dynamic migration