EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org Fault Detection and Diagnosis in the EGEE grid C. Germain-Renaud, X. Zhang, M. Sebag.


EGEE-II INFSO-RI Enabling Grids for E-sciencE
Fault Detection and Diagnosis in the EGEE grid
C. Germain-Renaud, X. Zhang, M. Sebag – LRI
C. Loomis – LAL
CNRS & Paris-Sud University
Manchester 9-11 May 2007
Des Données Massives Aux Interprétations (From Massive Data to Interpretations)

Outline
Mining the Logging and Bookkeeping data
– Dataset: L&B logs of the LAL site October October 2005
Blackhole (and other failures) detection in Torque logs
– Dataset: Torque logs of the LAL site March 2006-February 2007

Pre-processing
Circa 300K jobs, 3M events, 2GB
Operational log: much of the information sits in blobs, i.e. the incremental verbatim reports from the various services
LB2F: a software suite for filtering and conversion towards a propositional vector
– Flatten compound attributes, e.g. requirements
– Tag with the job outcome
– Prune attribute values
– Normalize numerical attributes (dates)
– Automatic identification of functional dependencies and trivial correlations
– Anonymization
– 408 attributes
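The flattening and pruning steps listed above can be illustrated with a minimal sketch. This is not LB2F itself: the record layout, the field names (`user`, `requirements`, `os`, `mem`), the dotted-key convention and the `min_count` pruning rule are all assumptions made for illustration.

```python
from collections import Counter

def flatten(record, sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict):
            for sub_key, sub_value in flatten(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat

def to_boolean_vectors(records, min_count=2):
    """Turn records into boolean vectors over attribute=value features.

    Values seen fewer than `min_count` times are pruned (hypothetical rule).
    """
    flats = [flatten(r) for r in records]
    counts = Counter((k, v) for f in flats for k, v in f.items())
    features = sorted(f"{k}={v}" for (k, v), c in counts.items() if c >= min_count)
    index = {name: i for i, name in enumerate(features)}
    vectors = []
    for f in flats:
        row = [0] * len(features)
        for k, v in f.items():
            name = f"{k}={v}"
            if name in index:
                row[index[name]] = 1
        vectors.append(row)
    return features, vectors

# Hypothetical job records with a compound "requirements" attribute.
records = [
    {"user": "u1", "requirements": {"os": "linux", "mem": 512}},
    {"user": "u2", "requirements": {"os": "linux", "mem": 1024}},
    {"user": "u1", "requirements": {"os": "linux", "mem": 512}},
]
features, vectors = to_boolean_vectors(records)
```

Each retained attribute=value pair becomes one boolean column, which is one simple way to reach a propositional representation.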

Issues
Simple classifiers fail
– Feature construction
– Integration of weak learners may produce good results
No gold standard
Probably not linear
– Unsupervised clustering
High variability across users and dates
– Aggressive subsampling

Constructive feature induction
36 user-consistent slices and 47 week-consistent slices
Each slice has a lower variability, so something can be learned
Here we use the linear learner ROGER: ROC-based genetic learner
Technically, optimization of the Area Under the ROC Curve, equivalent to the Wilcoxon-Mann-Whitney statistic
Maps the boolean features onto the real-valued learned hypothesis
$x = (x_1, x_2, \dots, x_n) \mapsto h(x) = w \cdot x$
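To make the objective concrete, here is a minimal sketch of the AUC computed as a Wilcoxon-Mann-Whitney statistic for a linear hypothesis h(x) = w.x. It shows only the quantity being optimized, not ROGER's genetic search; the function names and the toy data are illustrative assumptions.

```python
import numpy as np

def linear_hypothesis(w, X):
    """Score each boolean example x with h(x) = w . x."""
    return np.asarray(X, dtype=float) @ np.asarray(w, dtype=float)

def auc_wmw(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count for one half, as in the Wilcoxon-Mann-Whitney statistic.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one example of each class")
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (len(pos) * len(neg))

# Toy data: the first feature separates the classes, the second does not.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, 0, 0]
auc_good = auc_wmw(linear_hypothesis([1.0, 0.0], X), y)  # perfect ranking
auc_poor = auc_wmw(linear_hypothesis([0.0, 1.0], X), y)  # chance-level ranking
```

A stochastic optimizer would search over w to maximize `auc_wmw`; the pairwise form used here is quadratic in the number of examples and is meant only for small illustrations.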

Constructive feature induction (cont’d)
Because the optimization is stochastic, multiple hypotheses must be kept: l = 50
U-representation: $h_{i,u}(x)$ with $i = 1 \dots l$ and $u$ ranging over the set of users: 36 × 50 = 1800 features
W-representation: $h_{i,w}(x)$ with $i = 1 \dots l$ and $w$ ranging over the set of weeks: 47 × 50 = 2350 features
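The feature counts follow from stacking l = 50 hypotheses per slice. The sketch below builds such a representation with random weight vectors standing in for the learned hypotheses; the function name, the dictionary layout and the small feature count are assumptions for illustration.

```python
import numpy as np

def build_representation(X, hypotheses_by_slice):
    """Map examples onto the values of every hypothesis of every slice.

    hypotheses_by_slice: dict slice_id -> array of shape (l, n_features).
    Returns an array of shape (n_examples, n_slices * l).
    """
    X = np.asarray(X, dtype=float)
    blocks = []
    for slice_id in sorted(hypotheses_by_slice):
        W = np.asarray(hypotheses_by_slice[slice_id], dtype=float)
        blocks.append(X @ W.T)  # one column per hypothesis h_i of this slice
    return np.hstack(blocks)

rng = np.random.default_rng(0)
n_features, l = 8, 50  # n_features kept small here; the slides report 408 attributes
X = rng.integers(0, 2, size=(5, n_features))
# Random weights stand in for the hypotheses learned on each slice.
user_hyps = {u: rng.normal(size=(l, n_features)) for u in range(36)}
week_hyps = {w: rng.normal(size=(l, n_features)) for w in range(47)}
U = build_representation(X, user_hyps)
W = build_representation(X, week_hyps)
print(U.shape, W.shape)  # (5, 1800) (5, 2350)
```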

Clustering
Meaningful features, but useless redundancy must be eliminated and the useful features kept
Double clustering (Slonim & Tishby 2000):
– first clustering: “compress” the features along the examples
– second clustering: cluster the examples along the synthetic features
K-means algorithm: discover a pre-defined number of clusters
– T feature clusters, K example clusters
– Empirical optimization of K and T
T=30, K=29, W-rep
Mostly pure clusters
Natural use for detection
Diagnostic?
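The two clustering passes can be sketched with a plain K-means (Lloyd's algorithm). This is a minimal illustration under stated assumptions: the synthetic feature of a cluster is taken to be the mean of its member features, which is a choice made here and not stated in the slides, and the data are synthetic.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels, centers

def double_clustering(H, T, K, seed=0):
    """H has shape (n_examples, n_features)."""
    H = np.asarray(H, dtype=float)
    # First pass: cluster the features, each described by its values on the examples.
    feature_labels, _ = kmeans(H.T, T, seed=seed)
    # Assumed aggregation: one synthetic feature per cluster, the mean of its members.
    synthetic = np.column_stack([
        H[:, feature_labels == t].mean(axis=1) if np.any(feature_labels == t)
        else np.zeros(len(H))
        for t in range(T)
    ])
    # Second pass: cluster the examples along the synthetic features.
    example_labels, _ = kmeans(synthetic, K, seed=seed)
    return feature_labels, example_labels

# Synthetic data: two well-separated groups of ten examples each.
rng = np.random.default_rng(1)
base = np.vstack([np.zeros((10, 6)), np.full((10, 6), 10.0)])
H = base + rng.normal(scale=0.01, size=base.shape)
feature_labels, example_labels = double_clustering(H, T=3, K=2)
```

In practice T and K would be tuned empirically, as the slide indicates, and a library implementation of K-means would normally replace the hand-written one.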

Classification
[Figure: classification results, panels U-rep and W-rep]

Blackhole detection
What is a blackhole? A site fault which results in an ultra-fast (erroneous) execution
Goal: on-line detection of blackholes, raising an alarm
Quantitative measurements
– Anomalous job arrival rate and job service rate
– And regular users and queues distributions?

Changepoint detection
Page-Hinkley statistics
Time-sequential version of Wald’s statistics, also known as CUSUM
Provides an “intelligent threshold” test
Minimizes the expected time before a change detection for a fixed false positive rate
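A common textbook form of the Page-Hinkley test for an upward shift of the mean can be sketched as follows. The tolerance `delta`, the alarm `threshold` and the toy stream are assumed values for illustration, and the exact variant used on the Torque logs may differ.

```python
def page_hinkley(stream, delta=0.0, threshold=None):
    """Page-Hinkley statistics for detecting an increase of the mean.

    Returns (ph_values, alarm_index); alarm_index is the 0-based position of
    the first value whose statistic exceeds `threshold`, or None.
    """
    mean = 0.0     # running mean of the stream
    cum = 0.0      # cumulated deviation from the running mean, minus delta
    cum_min = 0.0  # smallest cumulated deviation seen so far
    ph_values = []
    alarm = None
    for t, x in enumerate(stream, start=1):
        mean += (x - mean) / t
        cum += x - mean - delta
        cum_min = min(cum_min, cum)
        ph = cum - cum_min
        ph_values.append(ph)
        if threshold is not None and alarm is None and ph > threshold:
            alarm = t - 1
    return ph_values, alarm

# Toy stream: the mean jumps from 1.0 to 5.0 at index 50.
stream = [1.0] * 50 + [5.0] * 50
ph_values, alarm = page_hinkley(stream, delta=0.1, threshold=10.0)
```

The statistic stays at zero while the stream is stable and grows quickly after the shift, so the alarm fires a few samples after index 50.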

Changepoint detection (cont’d)
First event: VO software bug

Changepoint detection with Page-Hinkley
Second event: blackhole

Details (unscaled)

Robustness
mean E7 std 28 minutes
mean E7 std 33 minutes
Assume everything is OK until 10e4; threshold = max(phstats) on this interval
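The calibration rule above (take the maximum of the statistics over an interval assumed to be fault-free, then alarm when a later value exceeds it) can be sketched as below. The helper names, the parameter `n_ok` and the toy stream are illustrative assumptions; the Page-Hinkley formulation is a common textbook one.

```python
def ph_statistics(stream, delta=0.0):
    """Page-Hinkley statistics for an upward mean shift."""
    mean = cum = cum_min = 0.0
    out = []
    for t, x in enumerate(stream, start=1):
        mean += (x - mean) / t
        cum += x - mean - delta
        cum_min = min(cum_min, cum)
        out.append(cum - cum_min)
    return out

def calibrate_and_detect(stream, n_ok, delta=0.0):
    """Set the threshold on the first n_ok values, then look for an alarm.

    Returns (threshold, alarm_index), with alarm_index None if nothing fires.
    """
    stats = ph_statistics(stream, delta)
    threshold = max(stats[:n_ok])  # largest value seen while everything is assumed OK
    for t in range(n_ok, len(stats)):
        if stats[t] > threshold:
            return threshold, t
    return threshold, None

# Toy stream: 100 fault-free values oscillating around 1.1, then a jump to 5.0.
stream = [1.0, 1.2] * 50 + [5.0] * 20
threshold, alarm = calibrate_and_detect(stream, n_ok=100)
```

This removes the need to pick the threshold by hand, at the cost of trusting that the calibration interval really contains no fault.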

Conclusion
Mining the Logging and Bookkeeping data
– Exemplifies the effectiveness and the issues of applying advanced machine learning workflows to grid data
Blackhole (and other failures) detection in Torque logs
– Classical statistical quality-control methods provide efficient on-line detection