InteMon: Intelligent monitoring system for large clusters
Evan Hoke, Jimeng Sun and Christos Faloutsos

What is InteMon?
Monitoring software for large clusters, specifically targeted at the Data Storage System Center. It monitors, analyzes and displays time value streams. It detects abnormalities in the data and calls attention to them. This greatly reduces the amount of human monitoring required.

How it Works
Three parts: monitoring, data analysis, and presentation.

Monitoring
A list of host names and signals (MIBs) is stored in a database. A daemon running on the server queries all hosts for all signals in the database via SNMP. Querying is staggered over the course of a minute to reduce load on the network. Returned values are stored in the database, indexed by time, machine and signal type.
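The staggering of queries over a minute could be sketched as follows. This is a minimal illustration, not the InteMon daemon: the function name `staggered_schedule`, the host names and the signal names are all hypothetical, and the actual SNMP request is left out.

```python
from itertools import product


def staggered_schedule(hosts, signals, period_s=60.0):
    """Spread one query per (host, signal) pair evenly over one period.

    Returns a list of (offset_seconds, host, signal) triples. The even
    spacing is an assumption; the slides only say querying is staggered.
    """
    queries = list(product(hosts, signals))
    if not queries:
        return []
    step = period_s / len(queries)
    return [(i * step, host, sig) for i, (host, sig) in enumerate(queries)]


if __name__ == "__main__":
    # Hypothetical hosts and signals, for illustration only.
    for offset, host, sig in staggered_schedule(["node1", "node2"],
                                                ["sig_a", "sig_b", "sig_c"]):
        print(f"t+{offset:5.1f}s  query {host} for {sig}")
```

A daemon would sleep until each offset, issue the query, and store the returned value keyed by time, machine and signal type.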

Database Design
The database holds an entry for each stream to be monitored and the machine it belongs to. Entries are grouped into “SPIRIT instances”, i.e. sets that are analysed together. Each “SPIRIT instance” is associated with a normalization function. Each set of hidden variables and reconstructed data is associated with a “SPIRIT instance”.
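One possible layout for such a database, sketched with SQLite from the Python standard library. All table and column names are assumptions made for illustration; the slides do not give the actual schema or the database engine used.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from InteMon.
SCHEMA = """
CREATE TABLE spirit_instance (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    normalization TEXT NOT NULL            -- e.g. 'raw' or 'log'
);
CREATE TABLE stream (
    id     INTEGER PRIMARY KEY,
    host   TEXT NOT NULL,
    signal TEXT NOT NULL,
    UNIQUE (host, signal)
);
-- A stream may belong to several instances (per machine, per signal type).
CREATE TABLE instance_stream (
    instance_id INTEGER NOT NULL REFERENCES spirit_instance(id),
    stream_id   INTEGER NOT NULL REFERENCES stream(id),
    PRIMARY KEY (instance_id, stream_id)
);
CREATE TABLE measurement (
    stream_id INTEGER NOT NULL REFERENCES stream(id),
    ts        INTEGER NOT NULL,            -- sample time, seconds
    value     REAL NOT NULL,
    PRIMARY KEY (stream_id, ts)
);
CREATE TABLE hidden_variable (
    instance_id INTEGER NOT NULL REFERENCES spirit_instance(id),
    ts          INTEGER NOT NULL,
    idx         INTEGER NOT NULL,          -- which hidden variable
    value       REAL NOT NULL,
    PRIMARY KEY (instance_id, ts, idx)
);
"""


def create_db(path=":memory:"):
    """Create the sketch schema and return the open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

The many-to-many `instance_stream` table reflects the assumption that the same stream is analysed in more than one group.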

Data Analysis / Abnormality Detection
Uses the SPIRIT algorithm [Papadimitriou05]. Data is analysed every minute. Correlations are searched for across all signals on a given machine and across all signals of the same type on all machines. Both the raw data and the data normalized by logarithms are analysed.
[Figure: raw data]
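The two kinds of groupings described above (per machine and per signal type) can be illustrated with a short sketch. The function names and the use of `log1p` are assumptions: the slides only say that data is normalized by logarithms, and `log1p` is chosen here so that zero readings stay defined.

```python
import math
from collections import defaultdict


def build_groups(streams):
    """Group (host, signal) pairs by machine and by signal type.

    Each stream lands in two groups: all signals on its host, and the
    same signal across all hosts.
    """
    groups = defaultdict(list)
    for host, signal in streams:
        groups[("host", host)].append((host, signal))
        groups[("signal", signal)].append((host, signal))
    return dict(groups)


def log_normalize(value):
    """Assumed logarithmic normalization; maps 0 to 0."""
    return math.log1p(value)
```

With two hosts and two signals this yields four groups of two streams each, and every group would be analysed once on raw values and once on normalized values.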

Data Analysis (Cont.)
Hidden variables that represent correlations are calculated and stored in the database. The data is reconstructed from the hidden variables. A change in the number of hidden variables signifies that correlations have broken down, i.e. an abnormality. The weights of the streams that contribute to a new hidden variable are stored.
[Figures: hidden variables; reconstruction]
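Flagging an abnormality when the number of hidden variables changes could look roughly like this. Both helper names are hypothetical, and ranking contributing streams by absolute weight is an assumption about how the stored weights might be used.

```python
def find_abnormalities(hidden_counts):
    """Return (timestamp, old_m, new_m) for every change in m.

    hidden_counts is a time-ordered list of (timestamp, m) pairs, where
    m is the number of hidden variables at that time.
    """
    events = []
    for (_, m_prev), (t, m) in zip(hidden_counts, hidden_counts[1:]):
        if m != m_prev:
            events.append((t, m_prev, m))
    return events


def top_contributors(weights, names, k=3):
    """Rank streams by the absolute weight they give a hidden variable."""
    ranked = sorted(zip(names, weights), key=lambda p: abs(p[1]), reverse=True)
    return ranked[:k]
```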

How SPIRIT Works
At any given time, the stream values form a vector in n-dimensional space. SPIRIT calculates m-dimensional projections (m < n) of these vectors, the hidden variables, such that the squared residual is minimized. The squared residuals are bounded by a minimum and a maximum energy; crossing either bound causes m to grow or shrink.
[Figure: axes Temperature T1 and Temperature T2, each marked at 20 °C and 30 °C, with the projection error labeled]
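The quantities above can be written out as follows. The symbols W, f_E and F_E do not appear in the slides; they are assumed here, following the usual presentation of SPIRIT, with W taken to have orthonormal columns and the energies stated up to normalization and forgetting.

```latex
\begin{align*}
x_t &\in \mathbb{R}^n && \text{stream values at time } t \\
y_t &= W^{\top} x_t \in \mathbb{R}^m, \quad m < n && \text{hidden variables} \\
\tilde{x}_t &= W y_t && \text{reconstruction} \\
\mathrm{err}_t &= \lVert x_t - \tilde{x}_t \rVert^2 && \text{squared residual} \\
E_t &= \sum_{\tau \le t} \lVert x_\tau \rVert^2, \qquad
\tilde{E}_t = \sum_{\tau \le t} \lVert y_\tau \rVert^2 && \text{total and retained energy} \\
\tilde{E}_t &< f_E \, E_t \;\Rightarrow\; m \leftarrow m + 1, \qquad
\tilde{E}_t > F_E \, E_t \;\Rightarrow\; m \leftarrow m - 1
\end{align*}
```

In this form, too little retained energy (a large residual) adds a hidden variable, and more retained energy than needed removes one.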

How SPIRIT Works (Cont.)
Each new vector is projected onto the hidden variable space and the error is calculated. The projection matrix is updated by averaging in the error vector, scaled so that the effect of old data decreases exponentially. The algorithm runs in O(n), where n is the number of streams, with no need to access old data.
[Figure: projection error, with the Temperature T1 axis marked at 20 °C and 30 °C]
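A simplified, pure-Python sketch of this update loop is given below. It follows the general shape of the published SPIRIT update but is not the InteMon code: the class name, the default forgetting factor `lam`, the energy thresholds `f_low` and `f_high`, and the initial values are all assumptions chosen for illustration.

```python
class Spirit:
    """Simplified streaming sketch of a SPIRIT-style tracker (illustrative)."""

    def __init__(self, n, lam=0.96, f_low=0.95, f_high=0.98):
        self.n = n                      # number of streams
        self.lam = lam                  # exponential forgetting factor
        self.f_low, self.f_high = f_low, f_high
        self.W = [self._unit(0)]        # one weight vector per hidden variable
        self.d = [1e-3]                 # per-variable energy estimates
        self.E = 0.0                    # energy of the input
        self.E_hid = 0.0                # energy kept by the hidden variables

    def _unit(self, i):
        return [1.0 if j == i else 0.0 for j in range(self.n)]

    def update(self, x):
        """Process one new vector; return (hidden variables, residual)."""
        r = list(x)
        y = []
        for i, w in enumerate(self.W):
            yi = sum(wj * rj for wj, rj in zip(w, r))           # project
            self.d[i] = self.lam * self.d[i] + yi * yi          # forget old data
            err = [rj - yi * wj for rj, wj in zip(r, w)]        # error vector
            w = [wj + yi * ej / self.d[i] for wj, ej in zip(w, err)]
            self.W[i] = w                                       # average in error
            r = [rj - yi * wj for rj, wj in zip(r, w)]          # deflate
            y.append(yi)
        self.E = self.lam * self.E + sum(v * v for v in x)
        self.E_hid = self.lam * self.E_hid + sum(v * v for v in y)
        m = len(self.W)
        if self.E_hid < self.f_low * self.E and m < self.n:
            self.W.append(self._unit(m))    # too much error: grow m
            self.d.append(1e-3)
        elif self.E_hid > self.f_high * self.E and m > 1:
            self.W.pop()                    # more than enough energy: shrink m
            self.d.pop()
        return y, r
```

Each call touches only the current vector and the stored weights, so no old data is accessed; the cost per call is proportional to n times the current number of hidden variables. A change in `len(tracker.W)` between calls is the event the monitoring layer would report as an abnormality.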

Front End
The main page displays abnormalities that have occurred in the last 24 hours, with links to pertinent graphs. Graph pages display raw data, normalized data, hidden variables and reconstructed data. Abnormalities are marked with red boxes. Links on the side lead to an analysis of each abnormality. The abnormality analysis page displays the weights contributing to the abnormality.

Screen Shots

Acknowledgments
Spiros Papadimitriou, Jimeng Sun and Christos Faloutsos for their work on SPIRIT. John Strunk and Greg Ganger for advice from the PDL side. National Science Foundation, Pennsylvania Infrastructure Technology Alliance, Intel, NTT and Hewlett-Packard for funding.