Published by Sharyl Doyle. Modified over 8 years ago.
1
ADAPTIVE EVENT DETECTION USING TIME-VARYING POISSON PROCESSES
KDD '06
University of California, Irvine
2
ABSTRACT
Time series of count data: aggregated behavior of individual people; periodic; bursty periods of unusual behavior.
In this paper: statistical estimation techniques; a time-varying Poisson process model; unsupervised learning.
Two data sets with ground truth: freeway traffic data and building access data.
The approach performs better than a non-probabilistic, threshold-based technique.
3
CONTENT
Introduction
Related work
Data set characteristics
A baseline model and its limitations
Probabilistic modeling
Learning and inference
Adaptive event detection
Estimating event attendance
Conclusion
4
INTRODUCTION
5
Focus on time-series data where time is discrete and N(t) is a measurement of the number of individuals or objects recorded over the time-interval [t-1, t].
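As a small illustration of how such a count series might be built, the sketch below bins raw event timestamps into fixed-width intervals. The function name `count_events` and its parameters are hypothetical and not from the slides; the list is 0-indexed, so `counts[t-1]` plays the role of N(t) for the interval [t-1, t].

```python
def count_events(timestamps, width, num_intervals):
    """Bin event timestamps into counts per interval of length `width`.

    counts[i] is the number of events in [i*width, (i+1)*width),
    i.e. N(i+1) in the slide's notation. Timestamps outside the
    covered range are ignored.
    """
    counts = [0] * num_intervals
    for ts in timestamps:
        idx = int(ts // width)
        if 0 <= idx < num_intervals:
            counts[idx] += 1
    return counts


if __name__ == "__main__":
    # Four hypothetical door-sensor readings, binned into unit intervals.
    print(count_events([0.5, 1.2, 1.9, 3.1], width=1.0, num_intervals=4))
```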
7
DEFINITION OF EVENT
Events: sustained (bursty) periods of anomalous behavior. The term sometimes refers to individual measurements; here it means a large-scale activity that is unusual relative to normal patterns, such as a large meeting in a building, a malicious attack on a Web server, or a traffic accident on a freeway.
Chicken-and-egg problem: detecting events requires some knowledge of what constitutes normal behavior, but the historical data consists of both normal and anomalous (event) data mixed together.
8
Goal
Define a model of uncertainty (how unusual is the measurement?) and additionally incorporate a notion of event persistence.
Learn a model that reflects the bimodal nature of such data, namely the normal traffic patterns to which additional counts caused by aperiodic events are occasionally added.
9
RELATED WORK
10
Techniques: Markov models; likelihood-based methods; a combination of Poisson models and Bayesian estimation methods; infinite automata.
Common goal: detect novel and unusual data points or segments in time series.
11
DATA SET CHARACTERISTICS
12
BUILDING DATA
13
FREEWAY TRAFFIC DATA
14
Holiday data should be removed before modeling, because holidays show substantially different behavior.
15
A BASELINE MODEL AND ITS LIMITATIONS
16
Threshold test based on a Poisson model
Estimate the Poisson rate λ for a particular time and day by averaging the observed counts on similar days at the same time; this average is the maximum likelihood estimate.
Candidate events are counts above the estimated rate (λ < N).
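A minimal sketch of such a threshold test follows. It assumes an event is flagged when the count exceeds the estimated rate and the Poisson upper-tail probability falls below a cutoff; the cutoff `eps`, the slot labels, and both function names are hypothetical choices for illustration, not taken from the slides.

```python
import math
from collections import defaultdict


def poisson_upper_tail(n, lam):
    """P(X >= n) for X ~ Poisson(lam), computed as 1 - P(X <= n - 1)."""
    if n <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for k in range(1, n):
        term *= lam / k    # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def baseline_events(history, today, eps=1e-3):
    """Flag slots whose count is improbably high under the averaged rate.

    history: iterable of (slot, count) pairs from past similar days.
    today:   dict mapping slot -> observed count N.
    eps:     assumed tail-probability threshold (hypothetical default).
    """
    totals = defaultdict(lambda: [0, 0])  # slot -> [sum of counts, number of days]
    for slot, count in history:
        totals[slot][0] += count
        totals[slot][1] += 1

    flagged = []
    for slot, n in today.items():
        total, days = totals[slot]
        if days == 0:
            continue  # no history for this slot, so no rate estimate
        lam = total / days  # maximum likelihood estimate of the rate
        if n > lam and poisson_upper_tail(n, lam) < eps:
            flagged.append(slot)
    return flagged
```

Because the rate is averaged over all historical days, past events inflate λ, which is one way the chicken-and-egg problem shows up in this baseline.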
18
Limitations
Adequate when events cause a large increase in the count data.
Fails when facing the chicken-and-egg problem.
The choice of threshold affects the number of false alarms.
19
PROBABILISTIC MODELING
20
Model N(t) as two components:
Normal behavior: N_0(t)
Event-caused: N_E(t)
21
MODELING PERIODIC COUNT DATA
Poisson distribution with time-varying rate λ(t).
d(t) indicates the weekday on which time t falls.
h(t) indicates the interval of the day in which time t falls.
δ and η are the day-of-week and time-of-day effects.
22
MODELING PERIODIC COUNT DATA
The effect of δ_{d(t)}
23
MODELING PERIODIC COUNT DATA
The effect of η_{d(t),h(t)}
24
MODELING PERIODIC COUNT DATA
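One way to read the periodic rate on these slides is as a product of a base rate, a day-of-week effect, and a time-of-day effect: λ(t) = λ_0 · δ_{d(t)} · η_{d(t),h(t)}. The sketch below assumes that factorization, along with the normalizations sum(δ) = 7 and sum(η[d]) = D for every day d, so that λ_0 is the average rate over a week. The slides name only the symbols, so the exact form and the function name `rate` are assumptions.

```python
def rate(t, lam0, delta, eta, D):
    """Periodic Poisson rate lambda(t) under the assumed factorization.

    t:     integer time index, with t = 0 the first interval of weekday 0.
    lam0:  base (average) rate.
    delta: 7 day-of-week multipliers, assumed to sum to 7.
    eta:   7 lists of D time-of-day multipliers, each assumed to sum to D.
    D:     number of intervals per day.
    """
    day = (t // D) % 7   # d(t): weekday on which t falls
    slot = t % D         # h(t): interval of the day in which t falls
    return lam0 * delta[day] * eta[day][slot]


if __name__ == "__main__":
    D = 2
    delta = [0.5, 1.5, 1, 1, 1, 1, 1]          # sums to 7
    eta = [[0.5, 1.5] for _ in range(7)]       # each row sums to D
    week = [rate(t, 10.0, delta, eta, D) for t in range(7 * D)]
    print(sum(week) / len(week))  # average weekly rate equals lam0
```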
25
MODELING RARE, PERSISTENT EVENTS
Use a binary process z(t) to indicate the presence of an event.
Transition probability matrix with parameters z_0 and z_1.
The length of each period between events is geometrically distributed with expected value 1/z_0.
The length of each event is geometrically distributed with expected value 1/z_1.
z_0 and z_1 are given priors.
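The expected durations can be checked with a short simulation. This sketch assumes z_0 is the probability of leaving the no-event state and z_1 the probability of leaving the event state at each step, which is what gives mean run lengths of 1/z_0 and 1/z_1; the function names and the numeric values are hypothetical.

```python
import random


def simulate_z(T, z0, z1, seed=0):
    """Simulate a two-state Markov chain z(t) for T steps, starting at 0."""
    rng = random.Random(seed)
    z, path = 0, []
    for _ in range(T):
        path.append(z)
        leave = z0 if z == 0 else z1  # probability of switching state
        if rng.random() < leave:
            z = 1 - z
    return path


def mean_run_lengths(path):
    """Mean length of completed runs of 0s and of 1s.

    The final (possibly unfinished) run is ignored. Assumes the path
    contains at least one completed run of each kind.
    """
    runs = {0: [], 1: []}
    length = 1
    for prev, cur in zip(path, path[1:]):
        if cur == prev:
            length += 1
        else:
            runs[prev].append(length)
            length = 1
    return sum(runs[0]) / len(runs[0]), sum(runs[1]) / len(runs[1])


if __name__ == "__main__":
    gap, event = mean_run_lengths(simulate_z(200_000, z0=0.1, z1=0.25))
    print(gap, event)  # should be close to 1/z0 = 10 and 1/z1 = 4
```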
26
N_E(t): the event count, with rate γ(t)
γ(t) is independent at each time t
27
Markov-modulated Poisson model
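To make the generative story concrete, the following sketch simulates a Markov-modulated Poisson series: a normal count N_0(t) at every step, plus an event count N_E(t) only while z(t) = 1. It assumes the observed count is the sum of the two and that γ(t) is drawn independently from a Gamma distribution at each event step; the Gamma parameters `a_e` and `b_e`, the function names, and the use of Knuth's sampler are illustrative assumptions.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method.

    Adequate for modest rates; exp(-lam) underflows for very large lam.
    """
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_counts(rates, z0, z1, a_e, b_e, seed=0):
    """Return a list of (z, n0, n_e, n) tuples, one per time step.

    rates:    normal-behavior rate lambda(t) for each step.
    z0, z1:   probabilities of entering and leaving an event.
    a_e, b_e: assumed Gamma shape and rate for the event rate gamma(t).
    """
    rng = random.Random(seed)
    z, out = 0, []
    for lam in rates:
        n0 = sample_poisson(lam, rng)
        n_e = 0
        if z == 1:
            gamma_t = rng.gammavariate(a_e, 1.0 / b_e)  # fresh draw each step
            n_e = sample_poisson(gamma_t, rng)
        out.append((z, n0, n_e, n0 + n_e))
        if rng.random() < (z0 if z == 0 else z1):
            z = 1 - z
    return out
```

Only the last element of each tuple would be observed in practice; the other three are the hidden variables that inference has to recover.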
28
LEARNING AND INFERENCE
29
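MCMC: Markov chain Monte Carlo methods
The basic idea of the Monte Carlo method: to solve a problem, build an appropriate probabilistic model or stochastic process whose parameters (such as the probability of an event or the expected value of a random variable) equal the solution of the problem; then repeatedly draw random samples from the model or process, analyze the results statistically, and finally compute the parameters of interest to obtain an approximate solution.
Hidden variables: {z(t), N_0(t), N_E(t)}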
30
SAMPLING THE HIDDEN VARIABLES GIVEN PARAMETERS
Likelihood functions
Sample:
If z(t) = 0: N_0(t) = N(t)
If z(t) = 1
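The z(t) = 1 case is not preserved in this transcript. As a hedged illustration only, the sketch below splits an observed count N(t) into N_0(t) and N_E(t) by sampling N_0 with weight proportional to Poisson(N_0; λ) times the probability of the remaining N − N_0 event counts. It swaps in a fixed-rate Poisson for the event counts to stay short, which is a simplifying assumption rather than the model on the slides; all function names are hypothetical.

```python
import math
import random


def log_poisson_pmf(k, lam):
    """Log of the Poisson(lam) probability of k, for lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def sample_split(n, lam, gamma, rng):
    """Sample (n0, n_e) with n0 + n_e == n, given an event is present.

    Weight for each candidate n0 is Poisson(n0; lam) * Poisson(n - n0; gamma),
    where the fixed event rate gamma is an assumption of this sketch.
    """
    logw = [log_poisson_pmf(k, lam) + log_poisson_pmf(n - k, gamma)
            for k in range(n + 1)]
    top = max(logw)
    weights = [math.exp(w - top) for w in logw]  # rescaled to avoid underflow
    n0 = rng.choices(range(n + 1), weights=weights)[0]
    return n0, n - n0


def sample_counts(n, z, lam, gamma, rng):
    """If z == 0 the whole count is normal; otherwise split it."""
    if z == 0:
        return n, 0
    return sample_split(n, lam, gamma, rng)
```

With two fixed-rate Poissons the split is binomial with success probability λ / (λ + γ), which gives a simple sanity check on the sampler.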
31
SAMPLING THE PARAMETERS GIVEN THE COMPLETE DATA
Integral number of weeks: T = 7 * D * W
The complete-data likelihood in this case involves only λ_0, δ and η.
Sufficient statistics of the data
32
Posterior distributions
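As one example of such a posterior, the sketch below updates the base rate λ_0 under an assumed conjugate Gamma(a, b) prior (shape a, rate b). If the day and time-of-day multipliers average to one over each week and T covers an integral number of weeks, the rates sum to λ_0 · T, so the sufficient statistic is the total normal count S and the posterior is Gamma(a + S, b + T). The prior form, the normalization, and the function names are assumptions; the slides do not show the formulas.

```python
import random


def lambda0_posterior(n0_counts, a, b):
    """Posterior Gamma (shape, rate) for lambda_0 given the normal counts.

    n0_counts: N_0(t) for t = 1..T, with T an integral number of weeks.
    a, b:      assumed Gamma prior shape and rate.
    """
    total = sum(n0_counts)   # sufficient statistic S
    steps = len(n0_counts)   # T = 7 * D * W
    return a + total, b + steps


def sample_lambda0(n0_counts, a, b, rng):
    """Draw one Gibbs sample of lambda_0 from its conditional posterior."""
    shape, rate = lambda0_posterior(n0_counts, a, b)
    # random.gammavariate takes a scale parameter, i.e. 1 / rate.
    return rng.gammavariate(shape, 1.0 / rate)
```

In a full sampler this draw would alternate with resampling the hidden variables, so that the counts passed in are the current N_0(t) rather than the raw observations.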
33
ADAPTIVE EVENT DETECTION
38
ESTIMATING EVENT ATTENDANCE
40
CONCLUSION
41
Described a framework for building a probabilistic model of time-varying counting processes.
The observations are a superposition of time-varying but regular (periodic) processes and aperiodic processes.
Applied this model to two different time series of counts, both spanning several months.
Described how the parameters of the model may be estimated using MCMC sampling.