Self-Correlating Predictive Information Tracking for Large-Scale Production Systems Zhao, Tan, Gong, Gu, Wambolt Presented by: Andrew Hahn.


Introduction
- Information tracking
  - Captures time-varying system information
  - Available via query
  - Fundamental to autonomic computing
- Distributed computing structure
  - Worker nodes: execute application tasks
  - Management nodes: monitor worker node conditions, system management
- Worker nodes report metric values, via sensors, to the management nodes

Introduction
- Problem: provide scalable and precise continuous system monitoring
- Management nodes need
  - Up-to-date, precise knowledge of the system
  - Global information
- High cost of system monitoring
  - Geographically dispersed nodes
  - Size of the system
  - Number of metrics

Goal
- Design and implement InfoTrack, a predictive information tracking system
  - Reduce monitoring cost
  - Minimize loss of coverage or precision
- Suppress remote information updates
  - Reduce network traffic
  - Lower resource consumption
- Temporal correlation within one node: self-similarity
- Spatial correlation among distributed nodes: group similarity

Correlations
- Temporal correlations
  - Metric value predictor $P_i$ installed at both the monitoring node and the management node
  - If an attribute value can be predicted, its update is suppressed
- Spatial correlations
  - Values for a group are inferred from one reporting node
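The temporal suppression idea above can be sketched as follows. This is a minimal illustration, not the InfoTrack implementation: it assumes the simplest possible predictor (the last reported value), and the function name `suppress_with_last_value` and the sample data are hypothetical.

```python
def suppress_with_last_value(samples, error_bound):
    """Simulate one worker node and the management node's view of it.

    Both sides run the same predictor ("same as the last reported value"),
    so the worker can tell locally whether the manager's prediction is
    within the error bound and skip the update when it is.
    """
    sent = []             # time steps at which the worker reports
    view = []             # value the management node holds at each step
    last_reported = None
    for t, value in enumerate(samples):
        if last_reported is None or abs(value - last_reported) > error_bound:
            last_reported = value   # prediction missed: send an update
            sent.append(t)
        view.append(last_reported)  # otherwise the manager keeps its prediction
    return sent, view


samples = [10.0, 10.2, 10.4, 11.5, 11.6, 11.4, 9.0]
sent, view = suppress_with_last_value(samples, error_bound=0.5)
print(f"reported {len(sent)} of {len(samples)} samples at steps {sent}")
```

Every suppressed step is, by construction, within the error bound of the value held by the management node.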

System Model

Approach Overview
- Temporal correlation
  - $a_{i,t}$ can be inferred from the previous $m$ values within a user-defined error bound
- Spatial correlation
  - Two nodes $i$ and $j$ have correlated attributes if $a_{i,t} = f(a_{j,t})$, for example:
    - $f(a_{j,t}) = a_{j,t}$
    - $f(a_{j,t}) = a_{j,t} + C$
    - $f(a_{j,t}) = a_{j,t} \cdot K$
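The three spatial relations on this slide (identity, constant offset C, constant factor K) can be illustrated with a short sketch. The function names, the `kind` strings, and the sample series are hypothetical; the slide does not say how C or K are chosen, so they are passed in as given parameters here.

```python
def infer_from_peer(peer_value, kind, param=0.0):
    """Infer one node's attribute value from a correlated peer's value."""
    if kind == "identity":
        return peer_value
    if kind == "offset":      # f(a) = a + C
        return peer_value + param
    if kind == "scale":       # f(a) = a * K
        return peer_value * param
    raise ValueError(f"unknown relation: {kind}")


def is_correlated(node_series, peer_series, kind, param, error_bound):
    """True if the relation reproduces every sample within the error bound."""
    return all(
        abs(a - infer_from_peer(b, kind, param)) <= error_bound
        for a, b in zip(node_series, peer_series)
    )


peer = [2.0, 2.5, 3.0, 3.5]
node = [4.0, 5.0, 6.0, 7.0]   # exactly twice the peer's values
print(is_correlated(node, peer, "scale", 2.0, error_bound=0.01))
print(is_correlated(node, peer, "identity", 0.0, error_bound=0.01))
```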

Information Tracking Cost Analysis
- Cost without suppression
  - $T$ = update interval
  - $N$ = number of nodes
  - $S_i$ = message size
  - $a_i$ = attributes

Information Tracking Cost Analysis
- Temporal correlation
  - $T$ = update interval
  - $N$ = number of nodes
  - $S_i$ = message size
  - $a_i$ = attributes
  - $p_{i,1}$ = percentage of nodes whose attributes can be inferred

Information Tracking Cost Analysis
- Spatial correlation
  - $T$ = update interval
  - $N$ = number of nodes
  - $S_i$ = message size
  - $a_i$ = attributes
  - $p_{i,2}$ = percentage of nodes whose attributes can be inferred
  - $l_i$ = number of groups

Information Tracking Cost Analysis
- Integrated correlation
  - $T$ = update interval
  - $N$ = number of nodes
  - $S_i$ = message size
  - $a_i$ = attributes
  - $p_{i,2}$ = percentage of nodes whose attributes can be inferred
  - $l_i$ = number of groups
  - $p'_{i,2}$ = number of nodes inferred based on cluster heads

Compression Ratio
- $C_{to}$ = overhead of updating temporal predictors
- $C_{so}$ = dynamic cluster update cost

Exploring Temporal Correlation
- Predictor installed at both the worker node and the management node
- Need to keep prediction overhead low
- Last-value predictor
  - Uses the last value of $a_{i,t}$
  - Simple, with no overhead
- Kalman filter
  - $n$ internal states
  - $m$ observable measurements
  - $x$ = state vector
  - $z$ = measurement vector

Exploring Temporal Correlation
- The Kalman filter predicts and corrects
  - $x_t$ is predicted, then corrected if a true measurement is available
- Filters at the monitoring site and at the management node both make predictions, and correct them when a sensor measurement is received
- Tradeoff between update cost and accuracy
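The predict-and-correct loop described above can be sketched with a one-dimensional Kalman filter. This is an illustration under stated assumptions, not the deck's configuration: the state is a single scalar with a random-walk model, the noise parameters `q` and `r` are arbitrary illustrative values, and the names `ScalarKalman` and `track` are hypothetical.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model (assumed)."""

    def __init__(self, x0, p0=1.0, q=1e-3, r=1e-2):
        self.x = x0   # state estimate
        self.p = p0   # estimate variance
        self.q = q    # process noise variance (illustrative value)
        self.r = r    # measurement noise variance (illustrative value)

    def predict(self):
        self.p += self.q          # uncertainty grows between measurements
        return self.x             # random walk: predicted value is unchanged

    def correct(self, z):
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)
        self.p *= 1.0 - k


def track(samples, error_bound):
    """Run identical filters on the worker and the manager; count updates sent."""
    worker = ScalarKalman(samples[0])
    manager = ScalarKalman(samples[0])
    updates, view = 0, []
    for z in samples:
        predicted = worker.predict()
        manager.predict()
        if abs(z - predicted) > error_bound:
            # Prediction missed: the worker sends z and both filters correct,
            # so their states stay identical.
            worker.correct(z)
            manager.correct(z)
            updates += 1
            view.append(z)          # the manager holds the reported value
        else:
            view.append(predicted)  # update suppressed
    return updates, view


updates, view = track([1.0, 1.0, 2.0, 3.0, 3.0], error_bound=0.1)
print(updates, view)
```

Applying the same operations to both filters is what lets the worker decide locally whether the management node's prediction is still acceptable.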

Exploring Spatial Correlation
- Monitored nodes are clustered into groups based on $a_i$
- Only the head node of each group reports
- Nodes are grouped by similarity
  - $V$ and $U$ are vectors of attributes of worker nodes
- All values are pushed to the management node to start clustering

Exploring Spatial Correlation
- Two ways to cluster objects
- k-means
  - k objects are selected to act as seeds
  - Each node is assigned to the cluster whose seed it is most similar to
  - Low computational complexity
  - May suffer from bad seeds
- Unweighted Pair Group Method with Arithmetic Mean (UPGMA)
  - Each node starts out as its own cluster
  - Similar clusters are merged together
  - Forms a hierarchical tree to create natural clusters
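Both clustering schemes can be sketched in a few lines of plain Python. The sketch makes several assumptions that are not in the slides: Euclidean distance stands in for the similarity measure, the k-means seeds are passed in explicitly, and UPGMA stops merging at a distance `threshold` rather than building the full tree. All names and data points are hypothetical.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def kmeans(points, seeds, iterations=20):
    """Assign each point to its nearest center, then recompute the centers."""
    centers = [list(s) for s in seeds]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(len(centers)), key=lambda c: euclidean(p, centers[c]))
            for p in points
        ]
        for c in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centers


def upgma(points, threshold):
    """Merge the two closest clusters until none are closer than threshold.

    Cluster distance is the arithmetic mean of all pairwise point distances.
    """
    clusters = [[i] for i in range(len(points))]

    def avg_dist(a, b):
        total = sum(euclidean(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        d, i, j = min(
            (avg_dist(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Each point is one node's attribute vector, e.g. (cpu, mem) utilization.
points = [(0.10, 0.20), (0.12, 0.22), (0.90, 0.80), (0.88, 0.82)]
assignment, centers = kmeans(points, seeds=[points[0], points[2]])
groups = upgma(points, threshold=0.1)
print(assignment, groups)
```

The contrast matches the slide: k-means needs k and seeds up front, and a poor choice of seeds can give poor clusters, while UPGMA lets the number of clusters fall out of the data.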

Integrated Approach
- Combines both temporal and spatial correlations
- A sensor reports the value of an attribute only when it cannot be inferred by either method within the error bound
- If nothing is reported to the management node, both inferred values are accurate
- Sometimes only one inferred value will be accurate
  - A flag is sent to the management node to indicate which is correct
- Updates are sent periodically at certain intervals
- After an interval:
  - A non-cluster head is removed
  - A cluster head will be replaced
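The reporting rule above can be written as a small decision function. This is a hedged sketch: the function name, the returned labels, and the example numbers are hypothetical, and the periodic cluster maintenance step is not modeled.

```python
def decide_report(true_value, temporal_estimate, spatial_estimate, error_bound):
    """Decide what a sensor sends for one attribute at one time step."""
    temporal_ok = abs(true_value - temporal_estimate) <= error_bound
    spatial_ok = abs(true_value - spatial_estimate) <= error_bound
    if temporal_ok and spatial_ok:
        return "suppress"        # both inferences are accurate: send nothing
    if temporal_ok:
        return "flag_temporal"   # tell the manager to use the temporal estimate
    if spatial_ok:
        return "flag_spatial"    # tell the manager to use the spatial estimate
    return "report_value"        # neither works: send the measured value


print(decide_report(10.0, 10.02, 10.03, error_bound=0.1))
print(decide_report(10.0, 10.02, 12.00, error_bound=0.1))
print(decide_report(10.0, 13.00, 12.00, error_bound=0.1))
```

A flag is cheaper to send than a full value, which is why the one-sided cases are worth distinguishing from a complete miss.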

Prototype Implementation
- Prototype of InfoTrack deployed to PlanetLab and VCL
- Monitoring sensor collects 66 attributes
- Temporal correlation
  - Last-value approach
  - Kalman filter
- Spatial correlation
  - k-means
  - UPGMA

Traces and Their Characteristics
- Experiment ran for several months on 300 PlanetLab nodes
- Most attributes collected without interrupting the normal workload
- Trace data for more than a week collected for difficult attributes

Evaluation Methodology
- Randomly select a starting point in the trace
  - Evaluate the next 9000 samples for CPU-10 and MEM-10
  - Evaluate the next 3000 samples for CPU-30 and MEM-30
  - One day
- UPGMA produces between 10 and 20 clusters
- k-means generates the same number of clusters as UPGMA
- Error bound ranges from 0.001 to 0.1
- Models are evaluated based on compression ratio

Results and Analysis - Temporal
- Compression ratio achieved with the Kalman and last-value models

Results and Analysis - Spatial
- Compression ratio achieved with UPGMA and k-means

Overhead
- Memory had large variance

Results and Analysis - Integrated

Related Work
- Most large-scale systems are statically configured with long update intervals
- Decentralized architectures scale better
- InfoTrack explores correlation patterns to achieve continuous information monitoring
- Sensor network monitoring
- Distributed event tracking
- Resource discovery

Conclusion
- InfoTrack
  - Self-correlating predictive information tracking system
  - Temporal and spatial correlations integrated
  - Reduces tracking costs
- Correlation patterns exist in production systems
  - Discoverable using lightweight schemes
  - Reduce information updates by up to 95%
- Deploy to more complicated systems for further research

Questions
