DAPA-V10: Discovery and Analysis of Patterns and Anomalies in Volatile Time-Evolving Networks

Brian Thompson, Rutgers University
Tina Eliassi-Rad, Lawrence Livermore Lab

Problem Statement

Given a volatile time-evolving network:
(1) Find persistent patterns.
(2) Detect local and global anomalous activity.

Challenges:
Volatility: The network changes drastically and frequently.
Sparsity: A single snapshot is extremely sparse.
Scalability: Algorithms must be efficient for large networks of potentially millions of members.

Persistent Patterns

A persistent pattern is a collection of vertices that (1) form a connected component and (2) communicate regularly.

Anomaly Detection

Algorithm: For all e ∈ E_t, define X_e = w_e if e ∈ G_t, and X_e = 1 − w_e otherwise. Measure the likelihood of substructure S by [...]. Flag S as anomalous if [...], where [...] is an anomalicity threshold. Substructures are then analyzed in decreasing order of anomalicity as resources and time allow. DAPA-V10 is efficient: its run-time complexity is O(|E_T|).

Experimental Results

Dataset: a collection of correspondence between 672 Enron employees from [...]. Found 6 persistent patterns that represent connected components of employees with regular communication. Substructures of edges within and between persistent patterns are monitored over time for anomalous behavior. Anomalies found by DAPA-V10 correspond with events surrounding the Enron scandal. The close correspondence illustrates the effectiveness of our approach.

Conclusions

We discover persistent patterns in volatile time-evolving networks and use them to find and rank anomalous events. Previous work focuses on identifying times of higher activity overall. DAPA-V10 detects local anomalies, pinpointing sources of unusual behavior for further analysis. Our approach is scalable to very large networks.

Network Representation

Model a network as a dynamic graph G = (V, E_T). To capture temporal information, we construct a weighted cumulative graph G' = (V, E_t').
Edge weights are defined by a decay function f: [...]

Source    Dest.     t_start   t_end
v49273    v71192    t = 5     t = 9
v83492    v12987    t = 12    t = 14
v40927    v62198    t = 13    t = 16
v98364    v39872    t = 20    t = 21
v18964    v38719    t = 20    t = [...]

Our Algorithm: DAPA-V10

1. Timestamped edges are used to construct a dynamic graph.
2. A cumulative graph is used to measure the average strengths of relationships.
3. Persistent patterns are identified. Substructures are selected to track activity both within and between components.
4. Substructures are monitored, flagging abnormal activity for investigation and analysis.

Algorithm (persistent patterns): Consider only edges with weight above threshold θ. Decrease θ until a component of size [...] appears. Remove edges and iterate on the remaining graph.

Goal (anomaly detection): Identify anomalies on a local and global scale. Monitor substructures: sets of edges (1) within each persistent pattern, and (2) between each pair of patterns. A substructure is anomalous if recent activity across its edges differs significantly from what is expected.

Timeline of Enron Scandal

Time     Event
2/01     Executives get $1M bonuses; stock is soaring
4/01     Q1 profit $536M; Wall St. analyst suspicious
7/01     Reported earnings $50B; share price dropping
8/01     Public criticism of Enron accounting practices
9/01     9/11 attacks; Enron director sells 500K shares
10/01    Q3 loss of $618M; SEC begins investigation
11/01    Acquisition offer, revoked; 'junk' credit rating
12/01    Enron files for bankruptcy, lays off employees

Figure: Distribution of edge weights and threshold points for Enron data. Resulting persistent patterns shown in figure at center.

Future work:
Conduct experiments on a variety of domains (e.g. cyber).
Use the Enron dataset to evaluate the effectiveness of DAPA-V10 as an early predictor of high-impact events.
Normalize at each time step to find local anomalies independent of global trends in network activity.
Incorporate semantic information from complex networks.
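The weighted cumulative graph from the network representation can be pictured with a short sketch. The poster's decay function f is not recoverable from this transcript, so the code below assumes an exponential moving average with a hypothetical parameter alpha. The function name cumulative_weights, the integer time steps, and the rule that an edge is active from t_start through t_end are also assumptions for illustration, not the authors' implementation.

```python
from collections import defaultdict


def cumulative_weights(edges, t_now, alpha=0.5):
    """Return decayed edge weights in [0, 1] at time t_now.

    edges: iterable of (source, dest, t_start, t_end) with integer times.
    alpha: ASSUMED decay parameter; the poster's decay function f is unknown.
    """
    edges = list(edges)
    pairs = {(s, d) for s, d, _, _ in edges}
    weights = defaultdict(float)
    for t in range(t_now + 1):
        # Edges observed in the snapshot at time t (assumed: t_start <= t <= t_end).
        active = {(s, d) for s, d, ts, te in edges if ts <= t <= te}
        for e in pairs:
            seen = 1.0 if e in active else 0.0
            # Exponential moving average: old evidence decays, new evidence is blended in.
            weights[e] = alpha * weights[e] + (1.0 - alpha) * seen
    return dict(weights)


# The four complete rows of the timestamped-edge table above.
table = [
    ("v49273", "v71192", 5, 9),
    ("v83492", "v12987", 12, 14),
    ("v40927", "v62198", 13, 16),
    ("v98364", "v39872", 20, 21),
]
for edge, w in sorted(cumulative_weights(table, t_now=21).items()):
    print(edge, round(w, 4))
```

Keeping the weights in [0, 1] lets each weight be read as an estimate of how likely the edge is to appear in a snapshot, which is how the definition of X_e uses w_e.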
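The threshold-lowering search for persistent patterns can be sketched in the same spirit. The required component size is missing from the transcript, so min_size is a hypothetical parameter. Reading "remove edges" as dropping every edge that touches the discovered component is also an assumption, as are the function names.

```python
def components(vertices, edges):
    """Connected components of an undirected graph, found by depth-first search."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for v in vertices:
        if v in seen:
            continue
        seen.add(v)
        stack, comp = [v], set()
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps


def persistent_patterns(weights, min_size=3):
    """weights: {(u, v): w}. min_size is an ASSUMED bound (missing in the source)."""
    remaining = dict(weights)
    patterns = []
    while remaining:
        found = None
        # Lower theta through the observed weights, strongest first.
        # Using w >= theta here matches "weight above a threshold just below theta".
        for theta in sorted(set(remaining.values()), reverse=True):
            strong = [e for e, w in remaining.items() if w >= theta]
            verts = {v for e in strong for v in e}
            big = [c for c in components(verts, strong) if len(c) >= min_size]
            if big:
                found = max(big, key=len)
                break
        if found is None:
            break  # no component of the required size is left
        patterns.append(found)
        # Assumed reading of "remove edges": drop edges touching the pattern.
        remaining = {e: w for e, w in remaining.items()
                     if e[0] not in found and e[1] not in found}
    return patterns
```

Each pass recomputes components from scratch for clarity; a union-find structure that adds edges in decreasing weight order would avoid the repeated work on large graphs.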
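Substructure monitoring can be illustrated as well, with a caveat: the likelihood formula and the flagging condition are missing from this transcript. The sketch below assumes that the likelihood of a substructure S is the product of X_e over its edges (an independence assumption) and that S is flagged when this likelihood falls below a threshold. Both choices, along with the names substructure_log_likelihood, rank_anomalies and log_tau, are hypothetical and may differ from what DAPA-V10 actually computes.

```python
import math


def substructure_log_likelihood(substructure, weights, active_edges):
    """Sum of log X_e over the edges of one substructure.

    X_e = w_e if edge e is present in the current snapshot, else 1 - w_e.
    ASSUMPTION: edges are treated as independent, so likelihoods multiply.
    """
    total = 0.0
    for e in substructure:
        w = weights[e]
        x = w if e in active_edges else 1.0 - w
        total += math.log(max(x, 1e-12))  # guard against log(0)
    return total


def rank_anomalies(substructures, weights, active_edges, log_tau):
    """Return names of flagged substructures, most anomalous first.

    substructures: {name: iterable of edges}
    log_tau: ASSUMED threshold on the log-likelihood.
    """
    scored = [
        (substructure_log_likelihood(edges, weights, active_edges), name)
        for name, edges in substructures.items()
    ]
    flagged = [(ll, name) for ll, name in scored if ll < log_tau]
    # Lowest likelihood first, i.e. decreasing order of anomalicity.
    return [name for ll, name in sorted(flagged)]
```

Each edge of each monitored substructure is visited once per time step, so the cost of scoring stays linear in the number of monitored edges.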