Event Summarization for System Management
Wei Peng†, Chang-shing Perng§, Tao Li†, Haixun Wang§
†Florida International University §IBM T.J. Watson Research Center

Presentation transcript:

Event Summarization for System Management
Wei Peng†, Chang-shing Perng§, Tao Li†, Haixun Wang§
†Florida International University §IBM T.J. Watson Research Center
Presented by: Wei Peng

Introduction
Why Event Summarization?
– traditional approaches are cumbersome, labor intensive, and error prone
– focus on discovering frequent or interesting patterns, scalability, and efficiency
– understanding and interpreting patterns
A divide-and-conquer method

A Motivating Example

Steps for Event Summarization
Preprocess log data and generate events
Discover temporal correlation between events (dependency)
Rank dependencies
Construct Event Relationship Networks (ERNs)
Derive Action Rules from Event Summary

Preprocess Log Data and Generate Events
Preprocess the brief log messages
Categorize them into common situations/states
– Incorporate time information
An event is a pair (e, t), where e is the situation/state and t is the time stamp of e
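The preprocessing step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the keyword table, the `Event` class, the function names, and the assumed log layout (an ISO timestamp followed by free text) are all hypothetical. Only the state names are taken from the case study later in the talk.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical keyword table. The state names come from the case study,
# but the keywords chosen for each state are illustrative assumptions.
STATE_KEYWORDS = {
    "start": ("start",),
    "stop": ("stop", "shutdown"),
    "connection": ("connect",),
    "request": ("request",),
    "configuration": ("config",),
}


@dataclass(frozen=True)
class Event:
    state: str       # e: the situation/state
    time: datetime   # t: the time stamp of e


def categorize(message: str) -> str:
    """Map a brief log message to a state by substring match (assumed scheme)."""
    text = message.lower()
    for state, keywords in STATE_KEYWORDS.items():
        if any(k in text for k in keywords):
            return state
    return "other"


def to_event(line: str) -> Event:
    """Turn one log line into an event pair (e, t).

    Assumed layout: "<ISO timestamp> <free-text message>".
    """
    stamp, _, message = line.partition(" ")
    return Event(categorize(message), datetime.fromisoformat(stamp))
```

A real categorizer would likely need more than substring matching (for example, "restart" matches "start" here), so treat the table as a placeholder for whatever classification the log format calls for.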

Discover Temporal Correlation between Events (Dependency)
b depends on a
– If the occurrence of b is predictable from the occurrence of a, then the conditional distribution, which models the waiting time of event type b given the presence of event type a, differs from the unconditional one
Estimate the two distributions
Dependency test: the pair is judged either independent or dependent
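The idea of comparing the two waiting-time distributions can be sketched as below. This is an assumption-laden illustration: the slide does not say which statistical test is used, so a two-sample Kolmogorov-Smirnov statistic with its asymptotic critical value is swapped in here. The function names, the evenly spaced sampling grid for the unconditional distribution, and the `horizon`, `samples`, and `alpha` parameters are all hypothetical.

```python
from bisect import bisect_right
from math import log, sqrt


def waits_to_next(points, b_times):
    """Waiting time from each point to the next occurrence of b.

    b_times must be sorted; points with no later b are skipped.
    """
    waits = []
    for p in points:
        i = bisect_right(b_times, p)  # index of first b strictly after p
        if i < len(b_times):
            waits.append(b_times[i] - p)
    return waits


def ks_statistic(xs, ys):
    """Largest gap between the empirical CDFs of two samples."""
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def depends_on(a_times, b_times, horizon, samples=200, alpha=0.05):
    """Return True if b appears to depend on a (assumed KS-based test)."""
    # Conditional: waiting time for b measured from each occurrence of a.
    cond = waits_to_next(a_times, b_times)
    # Unconditional: waiting time for b measured from evenly spaced points.
    grid = [horizon * k / samples for k in range(samples)]
    uncond = waits_to_next(grid, b_times)
    if not cond or not uncond:
        return False
    n, m = len(cond), len(uncond)
    critical = sqrt(-log(alpha / 2) / 2) * sqrt((n + m) / (n * m))
    return ks_statistic(cond, uncond) > critical
```

For example, if b always fires one time unit after a, the conditional waiting times are all equal to 1, while the unconditional ones are spread out, so the test reports a dependency. Waiting times drawn from one event stream are not independent samples, so the critical value should be read as a rough threshold only.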

Rank Dependencies
– Forward Entropy
– Backward Entropy

Event Relationship Networks (ERNs)
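An ERN can be represented as a directed graph whose nodes are event types and whose edges are the discovered dependencies. The sketch below is only one possible representation, not the structure used in the talk: the `build_ern` name, the `(a, b, score)` input format, and the `min_score` cutoff are assumptions, and `score` stands in for whatever ranking value the previous step produces (higher is treated as stronger here).

```python
from collections import defaultdict


def build_ern(dependencies, min_score=0.0):
    """Build an adjacency-list ERN from ranked dependencies.

    dependencies: iterable of (a, b, score), meaning "b depends on a".
    Edges below min_score are dropped; the remaining edges of each node
    are listed from strongest to weakest.
    """
    ern = defaultdict(list)
    for a, b, score in sorted(dependencies, key=lambda d: -d[2]):
        if score >= min_score:
            ern[a].append((b, score))
    return dict(ern)
```

Filtering by a threshold keeps the network small enough to read, which matters when the goal is a summary an operator can interpret.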

Derive Action Rules from Event Summary
If condition is true, take action
– Event reduction rules
– Event correlation rules
– Problem avoidance rules
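As an illustration of the "if condition is true, take action" form, an event reduction rule might look like the sketch below. The rule format, the `reduce_events` name, and the time-window condition are assumptions made for this example; the slides do not specify how rules are encoded.

```python
def reduce_events(events, rules):
    """Apply hypothetical event reduction rules.

    events: list of (state, time) pairs sorted by time.
    rules:  {(a, b): window}; condition: b follows a within `window`
            time units; action: drop b as redundant.
    """
    last_seen = {}  # most recent time each state was observed
    kept = []
    for state, t in events:
        redundant = any(
            b == state and a in last_seen and t - last_seen[a] <= window
            for (a, b), window in rules.items()
        )
        if not redundant:
            kept.append((state, t))
        last_seen[state] = t
    return kept
```

For instance, with the rule `{("stop", "connection"): 5}`, a connection event reported within 5 time units of a stop event is suppressed, while a later one is kept.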

A Case Study
State: start, stop, dependency, create, connection, report, request, configuration, other

Decomposition Process in the Case Study

ERN in the Case Study

Thank You!