Presentation Title August 8, 2019

Slides:



Advertisements
Similar presentations
Pat Langley Computational Learning Laboratory Center for the Study of Language and Information Stanford University, Stanford, California
Advertisements

Jorge Muñoz-Gama Josep Carmona
Han-na Yang Trace Clustering in Process Mining M. Song, C.W. Gunther, and W.M.P. van der Aalst.
A Case-based Approach to Business Process Monitoring S. Montani 1, G. Leonardi 1 1 Dipartimento di Informatica, University of Piemonte Orientale, Alessandria,
Lecture 8: Three-Level Architectures CS 344R: Robotics Benjamin Kuipers.
Yasuhiro Fujiwara (NTT Cyber Space Labs)
Software Quality Assurance Plan
Net-Centric Software and Systems I/UCRC Copyright © 2011 NSF Net-Centric I/UCRC. All Rights Reserved. High-Confidence SLA Assurance for Cloud Computing.
Models vs. Reality dr.ir. B.F. van Dongen Assistant Professor Eindhoven University of Technology
Dynamic Service Composition with QoS Assurance Feb , 2009 Jing Dong UTD Farokh Bastani UTD I-Ling Yen UTD.
Gavin Russell-Rockliff BI Technical Specialist Microsoft BIN305.
Intrusion and Anomaly Detection in Network Traffic Streams: Checking and Machine Learning Approaches ONR MURI area: High Confidence Real-Time Misuse and.
Scientific Workflows Within the Process Mining Domain Martina Caccavale 17 April 2014.
Introduction to RUP Spring Sharif Univ. of Tech.2 Outlines What is RUP? RUP Phases –Inception –Elaboration –Construction –Transition.
Linear and Branching Time Safety, Liveness, and Fairness
RUP Fundamentals - Instructor Notes
Chapter 13 Genetic Algorithms. 2 Data Mining Techniques So Far… Chapter 5 – Statistics Chapter 6 – Decision Trees Chapter 7 – Neural Networks Chapter.
Data Profiling
Jorge Muñoz-Gama Universitat Politècnica de Catalunya (Barcelona, Spain) Algorithms for Process Conformance and Process Refinement.
Scalable Analysis of Distributed Workflow Traces Daniel K. Gunter and Brian Tierney Distributed Systems Department Lawrence Berkeley National Laboratory.
Slide 1 Mixed Model Production lines  2000 C.S.Kanagaraj Mixed Model Production Lines C.S.Kanagaraj ( Kana + Garage ) IEM 5303.
By Ritesh Reddy Nagaram.  Organizations which are developing software processes are facing many problems regarding the need for change of already existing.
Copyright © 2007 OSIsoft, Inc. All rights reserved. Ekho - MES Applications that leverages AF 2.0 Yannick Galipeau Inexcon Technologies Patrick Ramsey.
Copyright 2000, Odyssey Research Associates, Inc. SL Semantic Data Integrity DARPA Program Review Cornell Business & Technology Park 33 Thornwood.
April 2004 At A Glance CAT is a highly portable exception monitoring and action agent that automates a set of ground system functions. Benefits Automates.
Enhancing Interactive Visual Data Analysis by Statistical Functionality Jürgen Platzer VRVis Research Center Vienna, Austria.
KS3 Phase4 Client Server Monitoring System October 1, 2008 by Stephen, Seema, Kam, Shpetim.
Net-Centric Software and Systems I/UCRC A Framework for QoS and Power Management for Mobile Devices in Service Clouds Project Lead: I-Ling Yen, Farokh.
MTBC Cloud Computing Initiative  Applications of cloud computing  Overview of the NSF Net-Centric Software and Systems (NCSS) I/UCRC  MTBC and NCSS.
1 STAT 5814 Statistical Data Mining. 2 Use of SAS Data Mining.
Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University July 21, 2008WODA.
Teaching material for a course in Software Project Management & Software Engineering – part V.
Process 3a 1 A Spiral Model of Software Development and Enhancement Barry Boehm Computer, May 1988 text pp34-45.
Allen D. Malony Department of Computer and Information Science TAU Performance Research Laboratory University of Oregon Discussion:
Copyright © 2001, SAS Institute Inc. All rights reserved. Data Mining Methods: Applications, Problems and Opportunities in the Public Sector John Stultz,
Geospatial Semantic Web Interim (2.5 months) Progress Review for Phase 1 Dr. Bhavani Thuraisingham (Computer Science) Dr. Latifur Khan (Computer Science)
Copyright © 2015 NSF Net-Centric I/UCRC. All Rights Reserved. Rev 4 Net-Centric and Cloud Software and Systems I/UCRC Net-Centric and Cloud Software and.
03:31:41Service Oriented Cyberinfrastructure Lab, Xgrid Calendar Elvis Montero
© 2005 Cisco Systems, Inc. All rights reserved. BGP v3.2—7-1 Optimizing BGP Scalability Improving BGP Convergence.
Anomaly Detection. Network Intrusion Detection Techniques. Ştefan-Iulian Handra Dept. of Computer Science Polytechnic University of Timișoara June 2010.
Net-Centric Software and Systems I/UCRC Self-Detection of Abnormal Event Sequences Project Lead: Farokh Bastani, I-Ling Yen, Latifur Khan Date: April 1,
Net-Centric Software and Systems I/UCRC A Framework for QoS and Power Management for Mobile Devices in Service Clouds Project Lead: I-Ling Yen, Farokh.
1. WELCOME Project Management Cycle (P.M.C.) What is a project? : What is project management?: Project management life cycle : Phase 1 st : Phase 2 nd.
Introduction to Machine Learning, its potential usage in network area,
Experience Report: System Log Analysis for Anomaly Detection
Project Lead: Joe Engineer, UNT Date: October 2015
Automate Does Not Always Mean Optimize
Chapter 11: Software Configuration Management
Lesson Objectives Aims From the spec:
CS4311 Spring 2011 Process Improvement Dr
Project Management Lifecycle Phases
CS 325: Software Engineering
Wireless Sensor Network Architectures
The value of a project-oriented approach to IT and how we do it in IBM
Prediction as Data Mining Task
BIM - Building Information Modeling A well-structured BIM is a key to achieve success in the Building Design and Construction process. Qatar Green Leaders.
CPSC 531: System Modeling and Simulation
Research Areas Christoph F. Eick
To learn more, visit The Neural Engineering Data Consortium Mission: To focus the research community on a progression of research questions.
Microsoft Project Past, Present and Future
Software Engineering: A Practitioner’s Approach, 6/e Chapter 2 Process: A Generic View copyright © 1996, 2001, 2005 R.S. Pressman & Associates, Inc.
Chapter 11: Software Configuration Management
ENGG*6140 Optimization for Engineering
Kostas Kolomvatsos, Christos Anagnostopoulos
Neo4j for Process Mining
Evaluating Classifiers for Disease Gene Discovery
Communication Driven Remapping of Processing Element (PE) in Fault-tolerant NoC-based MPSoCs Chia-Ling Chen, Yen-Hao Chen and TingTing Hwang Department.
Presentation Title September 22, 2019
Presentation transcript:

Presentation Title August 8, 2019 Net-Centric Software and Systems I/UCRC Self-Detection of Abnormal Event Sequences Project Lead: Farokh Bastani, I-Ling Yen, Latifur Khan Date: October 21, 2010 Copyright © 2010 NSF Net-Centric I/UCRC. All Rights Reserved.. Speaker Name

Presentation Title August 8, 2019 2010/Current Project Overview Self-Detection of Abnormal Event Sequences Project Schedule: Project Scope: Given a set of event sequences, determine the normal and abnormal transitions using data mining and automata techniques Develop techniques for problem-specific anomaly detection, including data collection and extraction, a suite of techniques for detecting abnormal event sequences The industry members can share the techniques for abnormal event sequence detection to achieve high quality systems Task 4. Visualization Task 3. Experiment and refinement Task 5. Additional datasets Task 1. On the fly processing Task 2. Prefix tree integration A M J J A S O N D J F M A 10 11 Tasks: Modify the anomaly detections tools for on-the-fly anomaly detection Enhance the anomaly detection techniques using knowledge in prefix tree Continue to Refine the preprocessor Apply the techniques to the datasets Compare the results (time/precision) Develop visualization tool for PFSA Adapt the tools for different datasets Deliverables: Anomaly detection algorithms with real-time on-the-fly anomaly detection capability Anomaly detection results Success Criteria: Identify injected anomalies with high precision and recall 8/8/2019 Speaker Name

Tools have detected 100% of all injected anomalies! Presentation Title August 8, 2019 2010 Project Results Significant Finding/Accomplishment! Complete Partially Complete Not Started TASK STAT PROGRESS and ACCOMPLISHMENT 1. Modify the anomaly detection tools to enable real-time on-the-fly anomaly detection  Completed MDI (minimal divergence inference) approach to detect anomalies on-the-fly. 2. Enhance the anomaly detection techniques based on the knowledge in the prefix tree Completed the program. Need to apply the technique to Cisco dataset. 3. Continue to refine the program and apply the techniques to the datasets and compare the results Need to explore different parameter settings in the approaches (such as alpha in MDI) and consider further improvements. 4. Develop visualization tools 5. Adapt the tools for different datasets Preparing datasets from “Software-artifact repository” for testing. Tools have detected 100% of all injected anomalies! 8/8/2019 Speaker Name

Major Accomplishments, Discoveries and Surprises Presentation Title Presentation Title August 8, 2019 August 8, 2019 Major Accomplishments, Discoveries and Surprises Use prefix tree to greatly enhance the efficiency of the algorithms Event sequences can be built into a prefix tree Prefix tree can be used to group event sequences at different levels of granularity (this is especially the case for datasets containing execution traces) Prefix tree can provide some distance information On-the-fly anomaly detection Collect data in time T to build the anomaly detection model, Detect anomalies as soon as an event is generated 2nd closest neighbor 8/8/2019 Speaker Name Speaker Name 4

Presentation Title Presentation Title August 8, 2019 August 8, 2019 Our Solution Enhance existing tools using information provided by prefix tree Clustering-based approaches: Use prefix tree to determine the sequence groups at different granularity levels (object level, method level, exact sequence level); clustering algorithms can then be used to merge these groups into clusters Density-based approaches: Use prefix tree to help determine the k-th nearest neighbor PFSA-based approaches: Always start from prefix tree Enhance existing tools for on-the-fly anomaly detection Collect data Dt in (t, t+T], use Dt to build the anomaly detection model At in (t+T, t+2T], use At for anomaly detection in (t+2T, t+3T] Experimentally determine an optimal T … … Collect Dt Build At–T Apply At–2T Collect Dt+T Build At Apply At–T Collect Dt+2T Build At+T Apply At t t+T t+2T t+3T 8/8/2019 Speaker Name Speaker Name 5

New Problems How to detect anomalies as soon as an event is generated? Presentation Title Presentation Title August 8, 2019 August 8, 2019 New Problems How to detect anomalies as soon as an event is generated? After data is collected, use MDI (minimal divergence inference) algorithm to build PFSA (probabilistic finite state automata) Has transition probability for each event Each new sequence: Start from the root of the tree Each new event in the sequence: Check whether the transition is anomalous Mark the new location for the sequence No “end of sequence” mark Will never know whether a sequence ends  Need to keep track of too many marks in the tree for all concurrent sequences Solution: Keep the sequences in a priority queue in the order of the timestamp of the last event of the sequence, delete the mark when a sequence has an outdated timestamp A sequence is considered to be terminated if no new events come after a specified time period 8/8/2019 Speaker Name Speaker Name 6