Towards Scalable Critical Alert Mining Bo Zong 1 with Yinghui Wu 1, Jie Song 2, Ambuj K. Singh 1, Hasan Cam 3, Jiawei Han 4, and Xifeng Yan 1 1 UCSB, 2.


Towards Scalable Critical Alert Mining
Bo Zong (1), with Yinghui Wu (1), Jie Song (2), Ambuj K. Singh (1), Hasan Cam (3), Jiawei Han (4), and Xifeng Yan (1)
(1) UCSB, (2) LogicMonitor, (3) Army Research Lab, (4) UIUC

Big Data Analytics in Automated System Management
- Complex systems are ubiquitous: aircraft systems, nuclear power plants, computer networks, software systems, social media, chemical production systems
- Complex systems generate large volumes of monitoring data
- Big data analytics is needed to extract knowledge from this massive data and to automate complex system management

Massive Monitoring Data in Complex Systems
- Example: monitoring data in computer networks
- Metrics monitored in a data center: #MongoDB backup jobs, Apache response lag, Mysql-Innodb buffer pool, SDA write-time, …
- A 120-server data center can generate 40GB of monitoring data per day

System Malfunction Detection via Alerts
- Example: alerts in computer networks
- Complex systems can have many issues
- From the 40GB/day of data generated by the 120-server data center, we collect 20k+ alerts/day
- Sample alerts derived from monitoring data:
  01:20am: #MongoDB backup jobs ≥ 30
  01:30am: Memory usage ≥ 90%
  01:31am: Apache response lag ≥ 2 seconds
  01:43am: SDA write-time ≥ 10 times slower than average performance
  …
  09:32pm: #MySQL full join ≥ 10
  09:47pm: CPU usage ≥ 85%
  09:48pm: HTTP-80 no response
  10:04pm: Storage used ≥ 90%
  …

Mining Critical Alerts
- Example: critical alerts in computer networks
[Figure: alert graph with nodes labeled Disk Read, #MongoDB backup, CPU cores, MongoDB, Mcollective reg; one alert is marked "Critical!"]
- How can critical alerts be mined efficiently from massive monitoring data?

Pipeline
- Offline dependency rule mining
- Online alert graph maintenance
- On-demand critical alert mining
[Figure: pipeline diagram with three stages. Labels: history alert log (binary vectors [0, 1, …, 1, 1], [1, 1, …, 1, 0], [0, 0, …, 1, 1], …), dependency rules, incoming alerts on a time axis (t1, t2, t3), alert graph, user. One stage is marked "Our focus".]

Alert Graph
- Alert graphs are directed acyclic graphs (DAGs)
- Nodes: alerts derived from monitoring data
- Edges
  - Indicate the probabilistic dependency between two alerts
  - Direction: from an older alert to a younger alert
  - Weight: the probability that the dependency holds
- Example
[Figure: alert graph G with nodes including A and C]
- How do we measure whether an alert is critical?
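The structure described on this slide can be held in a small adjacency map. The following is a minimal sketch, not the authors' implementation: the class name `AlertGraph`, its field names, and the sample alerts and timestamps are all assumptions made for illustration. Because every edge must run from an older alert to a strictly younger one, no cycle can form.

```python
from dataclasses import dataclass, field


@dataclass
class AlertGraph:
    """Hypothetical container for an alert graph (names are illustrative)."""

    # alert id -> time at which the alert fired
    time: dict = field(default_factory=dict)
    # parents[u][v] = probability that younger alert u depends on older alert v
    parents: dict = field(default_factory=dict)

    def add_alert(self, alert_id, timestamp):
        self.time[alert_id] = timestamp
        self.parents.setdefault(alert_id, {})

    def add_dependency(self, older, younger, prob):
        if not 0.0 <= prob <= 1.0:
            raise ValueError("edge weight must be a probability")
        # Edges only go forward in time, which keeps the graph acyclic.
        if self.time[older] >= self.time[younger]:
            raise ValueError("edge must go from an older to a younger alert")
        self.parents[younger][older] = prob


g = AlertGraph()
g.add_alert("memory_usage", 90)  # minutes after midnight, made-up values
g.add_alert("apache_lag", 91)
g.add_dependency("memory_usage", "apache_lag", 0.8)
```

Storing incoming edges per node (rather than outgoing ones) is a design choice that suits computations that walk from causes to effects in time order.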

Gain of Addressing Alerts
- Annotation on the gain definition: the cause of alert u disappears given that the alert set S is addressed
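The gain formula itself did not survive in this transcript. As a hedged illustration only, the sketch below assumes a noisy-OR model: an addressed alert is fixed with probability 1, and any other alert u is fixed if at least one parent v is fixed and the dependency v → u holds, with parents treated as independent. The function name `gain`, the argument layout, and the example weights are assumptions, not the authors' definition.

```python
def gain(parents, order, addressed):
    """Expected number of alerts fixed when `addressed` is handled.

    parents[u][v] is the weight of edge v -> u; `order` lists alerts from
    oldest to youngest, so every parent is processed before its children.
    Assumes a noisy-OR model with independent parents (an approximation).
    """
    fixed = {}
    for u in order:
        if u in addressed:
            fixed[u] = 1.0
            continue
        still_caused = 1.0
        for v, w in parents.get(u, {}).items():
            # Parent v fails to explain u away with probability 1 - p(v) * w.
            still_caused *= 1.0 - fixed[v] * w
        fixed[u] = 1.0 - still_caused
    return sum(fixed.values())


# Made-up three-alert example: A -> B, A -> C, B -> C, all with weight 0.5.
example_parents = {"A": {}, "B": {"A": 0.5}, "C": {"A": 0.5, "B": 0.5}}
example_order = ["A", "B", "C"]
example_gain = gain(example_parents, example_order, {"A"})
```

In this example, addressing A fixes B with probability 0.5 and C with probability 1 - (1 - 0.5)(1 - 0.25) = 0.625, so the gain is 2.125.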

Critical Alert Mining
- The problem is NP-hard

Naive Greedy Algorithm
[Figure: alert graph G with nodes A and B, and the selected set S = { }]
- How can greedy algorithms be sped up?
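The slide's own pseudocode is not in the transcript, so the following is a sketch of the standard greedy scheme for this kind of selection problem: repeatedly add the alert with the largest marginal gain until k alerts are chosen. The names `greedy_select`, `k`, and `gain`, and the toy coverage function used to exercise it, are assumptions for illustration.

```python
def greedy_select(candidates, k, gain):
    """Pick up to k alerts, each time adding the one with the best marginal gain.

    `gain` is any function mapping a set of alerts to a number.
    """
    selected = set()
    for _ in range(k):
        base = gain(selected)
        best, best_delta = None, 0.0
        for c in candidates:
            if c in selected:
                continue
            delta = gain(selected | {c}) - base
            if delta > best_delta:
                best, best_delta = c, delta
        if best is None:  # no remaining alert improves the gain
            break
        selected.add(best)
    return selected


# Toy gain (made up): number of alerts explained by the selected set.
covers = {"A": {"A", "B", "C"}, "B": {"B"}, "C": {"C", "D"}}


def toy_gain(selected):
    return len(set().union(*(covers[s] for s in selected)))


picked = greedy_select(["A", "B", "C"], 2, toy_gain)
```

Every round re-evaluates the gain of every remaining candidate, which is what makes the naive version expensive on graphs with tens of thousands of alerts.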

Bound and Pruning Algorithm (BnP)
- Prunes unpromising alerts using upper and lower bounds
- Drawback: pruning might not always work
[Figure: bound estimation on alert graph G (nodes A, C), with labels Upper, Lower, Unpromising, LocalGain, SumGain]
- Can we trade a little approximation quality for better efficiency?
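The transcript names LocalGain and SumGain but does not define them, so the sketch below leaves the bound computation abstract and shows only the pruning step: a candidate whose upper bound falls below the best lower bound among all candidates cannot be the best choice in this round. The function name `prune_unpromising` and the bound values are hypothetical.

```python
def prune_unpromising(candidates, lower, upper):
    """Keep only candidates whose upper bound reaches the best lower bound.

    `lower` and `upper` map a candidate to bounds on its marginal gain.
    """
    if not candidates:
        return []
    best_lower = max(lower(c) for c in candidates)
    return [c for c in candidates if upper(c) >= best_lower]


# Made-up (lower, upper) bounds on the marginal gain of three alerts.
bounds = {"A": (2.0, 3.0), "B": (0.5, 1.5), "C": (1.0, 2.5)}
survivors = prune_unpromising(
    list(bounds),
    lower=lambda c: bounds[c][0],
    upper=lambda c: bounds[c][1],
)
```

This also illustrates the drawback noted on the slide: when the bounds are loose, every upper bound exceeds the best lower bound and nothing is pruned.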

Single-Tree Approximation

Single-Tree Approximation (cont.)
- Linear-time algorithm to search for the maximum directed spanning tree
- Drawback: accuracy loss in Gain estimation
  - The edge of the highest weight is always selected
  - Edges of similar weight never get selected
[Figure: tree sparsification turns G into T*, which is then used for Gain estimation]
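One way to read the sparsification step, sketched here under stated assumptions: in a DAG, keeping only the highest-weight incoming edge of each node cannot create a cycle, so the result is a maximum-weight forest found in one pass over the edges. The function name `max_incoming_forest` and the example weights are hypothetical; the authors' exact procedure is not given in the transcript.

```python
def max_incoming_forest(parents):
    """Keep, for every alert, only its highest-weight incoming edge.

    parents[u][v] is the weight of edge v -> u. Returns a map from each
    alert to (parent, weight), or None for alerts with no incoming edge.
    Runs in time linear in the number of edges.
    """
    tree = {}
    for u, incoming in parents.items():
        if incoming:
            v = max(incoming, key=incoming.get)
            tree[u] = (v, incoming[v])
        else:
            tree[u] = None
    return tree


# Made-up example: C has two incoming edges of nearly equal weight.
dag = {"A": {}, "B": {"A": 0.9}, "C": {"A": 0.5, "B": 0.49}}
t_star = max_incoming_forest(dag)
```

Here the edge B → C (weight 0.49) is dropped even though it is almost as strong as A → C (weight 0.5), which is the accuracy loss listed as a drawback.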

Multi-Tree Approximation
- Sample multiple trees from an alert graph
[Figure: tree sampling draws trees T1, …, TL from G; Gain is estimated on each tree and the results are averaged (Average Gain)]
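The transcript does not say how trees are sampled or how Gain is computed on a tree, so the sketch below makes two labeled assumptions: each alert keeps one incoming edge drawn with probability proportional to its weight, and on a tree an alert is fixed with probability equal to its parent's probability times the edge weight. All function names and example values are hypothetical.

```python
import random


def sample_tree(parents, rng):
    """Keep one incoming edge per alert, drawn in proportion to edge weight.

    Assumes every alert appears as a key of `parents` and weights are positive.
    """
    tree = {}
    for u, incoming in parents.items():
        if incoming:
            nodes = list(incoming)
            weights = [incoming[v] for v in nodes]
            v = rng.choices(nodes, weights=weights, k=1)[0]
            tree[u] = (v, incoming[v])
        else:
            tree[u] = None
    return tree


def tree_gain(tree, addressed):
    """Expected number of fixed alerts on a tree (single parent per alert)."""
    fixed = {}

    def prob(u):
        if u not in fixed:
            if u in addressed:
                fixed[u] = 1.0
            elif tree[u] is None:
                fixed[u] = 0.0
            else:
                v, w = tree[u]
                fixed[u] = prob(v) * w
        return fixed[u]

    return sum(prob(u) for u in tree)


def multi_tree_gain(parents, addressed, num_trees, seed=0):
    """Average the tree-based gain over `num_trees` sampled trees."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_trees):
        total += tree_gain(sample_tree(parents, rng), addressed)
    return total / num_trees


dag = {"A": {}, "B": {"A": 0.9}, "C": {"A": 0.5, "B": 0.5}}
estimate = multi_tree_gain(dag, {"A"}, num_trees=50)
```

In this example C hangs off A in some samples (gain 2.4) and off B in others (gain 2.35), so the average lies between the two; a single tree would commit to one of them.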

Experimental Results
- Efficiency comparison on LogicMonitor alert graphs
- BnP is 30 times faster than the baseline
- Multi-tree approximation is 80 times faster with 0.1 quality loss
- Single-tree approximation is 5000 times faster with 0.2 quality loss

Conclusion
- Critical alert mining is an important topic for automated system management in complex systems
- A pipeline is proposed to enable critical alert mining
- Tree approximation works well in practice for critical alert mining
- Future work
  - Critical alert mining with domain knowledge
  - Alert pattern mining: if two groups of alerts follow the same dependency pattern, they might result from the same problem
  - Alert pattern querying: if we have a solution to a problem, we can apply the same solution when we meet the problem again

Questions? Thank you!

Experiment Setup
- Real-life data from LogicMonitor
  - 50k performance metrics from 122 servers
  - Spans 53 days
- Offline dependency rule mining
  - Training data: the latest 7 consecutive days
  - Mined 46 sets of rules (starting from the 8th day)
  - Learning algorithm: Granger causality
- Alert graphs
  - Constructed 46 alert graphs
  - #nodes: 20k ~ 25k
  - #edges: 162k ~ 270k

Case study