Andrea Bertozzi, University of California, Los Angeles. Thanks to contributions from Martin Short, George Mohler, Jeff Brantingham, and Erik Lewis.


Repeat crime is much more likely to happen within a short interval of time after the first event (Short et al., J. Quant. Criminol., 2009).

• Burglars return to places to replicate the successes of, and/or exploit vulnerabilities identified during, previous offenses: “I always go back [to the same places] because, once you been there, you know just about when you been there before and when you can go back. An every time I hit a house, it’s always on the same day [of the week] I done been before cause I know there ain’t nobody there.” (Subject No. 51; Wright and Decker, Burglars on the Job, 1996: 69)

On the right, a histogram of times between pairs of burglaries separated by 200 m or less. On the left, a similar histogram for pairs of Southern California earthquakes (magnitude 3.0 or greater) separated by 220 km or less.

• Events occur entirely at random, defining a stochastic process in which each event occurs independently of prior events.
• Mathematically, such a phenomenon can be modeled as a Poisson process characterized by a rate parameter $\lambda$, representing the expected number of events per unit time.
• The probability that one burglary occurs within a time interval $t$ to $t + dt$ is given by $\lambda\,dt$.
• The probability that $k$ burglaries occur within a time $t$ is given by the general Poisson distribution $P(k) = \frac{(\lambda t)^k e^{-\lambda t}}{k!}$.
• The probability that no events occur within a time interval $dt$, then, is given by $P(0) = e^{-\lambda\,dt}$.
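The Poisson counting probabilities described on this slide can be checked numerically. The sketch below assumes the standard Poisson form $P(k) = (\lambda t)^k e^{-\lambda t} / k!$ with rate parameter $\lambda$; the function name `poisson_pmf` and the numerical rate are illustrative assumptions, not values from the talk.

```python
import math

def poisson_pmf(k: int, rate: float, t: float) -> float:
    """Probability of exactly k events in a window of length t
    for a Poisson process with the given rate."""
    mean = rate * t
    return mean ** k * math.exp(-mean) / math.factorial(k)

# Hypothetical burglary rate (events per day) and window length (days).
rate = 0.01
window = 30.0

for k in range(4):
    print(f"P({k} events in {window:.0f} days) = {poisson_pmf(k, rate, window):.4f}")

# For a very short interval dt, P(1 event) is close to rate * dt.
dt = 1e-6
print(poisson_pmf(1, 1.0, dt), "vs", 1.0 * dt)
```

The last line illustrates why the one-event probability over an infinitesimal interval reduces to rate times interval length.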

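• The time $T_1$ until the first event occurs, for a Poisson process with rate parameter $\lambda$, satisfies $P(T_1 > t) = e^{-\lambda t}$.
• The probability that the first event occurs between times $t$ and $t + dt$ is $\lambda e^{-\lambda t}\,dt$.
• The Poisson process probability density function for the time interval $\tau$ between events is $p(\tau) = \lambda e^{-\lambda \tau}$.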
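A quick simulation makes the waiting-time claim concrete. This is a minimal sketch, assuming a homogeneous Poisson process whose inter-event times are exponential with mean $1/\lambda$; the helper `simulate_event_times`, the seed, and the rate are hypothetical choices for illustration.

```python
import math
import random

def simulate_event_times(rate, horizon, rng):
    """Event times of a homogeneous Poisson process on [0, horizon),
    built by summing independent exponential waiting times."""
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times

rng = random.Random(0)   # fixed seed so the sketch is reproducible
rate = 2.0               # illustrative rate (events per unit time)
times = simulate_event_times(rate, 5000.0, rng)

gaps = [b - a for a, b in zip(times, times[1:])]
mean_gap = sum(gaps) / len(gaps)
frac_long = sum(g > 1.0 for g in gaps) / len(gaps)

print("mean gap:", mean_gap, "expected:", 1.0 / rate)
print("P(gap > 1):", frac_long, "expected:", math.exp(-rate * 1.0))
```

Both empirical quantities should land close to their exponential-distribution counterparts.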

• Suppose we have different types of events associated with different locations, e.g. residential burglaries whose rates vary by spatial location. Then the composite probability is $p(\tau) = \sum_{i=1}^{N} w_i \lambda_i e^{-\lambda_i \tau}$, where $w_i$ is the fraction of homes exhibiting rate constant $\lambda_i$.
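The composite inter-event distribution can be sketched as a weighted sum of exponential densities, one per rate class. The form $\sum_i w_i \lambda_i e^{-\lambda_i \tau}$ is assumed here, and the weights and rates below are made-up illustrative numbers, not fitted values.

```python
import math

def mixture_density(tau, weights, rates):
    """Density of the waiting time tau when a fraction weights[i]
    of homes is burgled at Poisson rate rates[i]."""
    return sum(w * lam * math.exp(-lam * tau) for w, lam in zip(weights, rates))

def mixture_survival(tau, weights, rates):
    """Probability that the waiting time exceeds tau."""
    return sum(w * math.exp(-lam * tau) for w, lam in zip(weights, rates))

# Hypothetical three-class example: weights sum to 1, rates in events per day.
weights = [0.70, 0.25, 0.05]
rates = [0.002, 0.02, 0.2]

for tau in (0.0, 10.0, 100.0):
    print(tau, mixture_density(tau, weights, rates), mixture_survival(tau, weights, rates))
```

Because a few high-rate homes dominate at short waiting times, such a mixture can mimic an excess of quick repeats even when every home follows its own memoryless process.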

• Fit to the composite probability with $N = 3$.

• At first glance, the good fit with $N = 3$ suggests that the Long Beach data satisfy the random event hypothesis (REH).
• However, it turns out that only a fraction of the total number of houses fit into the $N = 1$, $N = 2$, $N = 3$ bins as determined by "house order", the total number of times a house was burgled during the period of evaluation.
• This suggests that we need another method for measuring repeat victimization.

• Parameter-free method.
• Pick a fixed window time period $D$.
• Compute the probability distribution of time intervals between victimizations for order-2 homes (homes that have exactly two events during this window period), assuming the REH.
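The slide's formula for this distribution did not survive in the transcript. As a hedged illustration, the sketch below uses a continuous-time idealization: if the two events of an order-2 home fall independently and uniformly in a window of length $D$, the gap $\tau$ between them has density $2(D - \tau)/D^2$. The uniform-placement assumption and the helper names are not taken from the talk.

```python
import random

def order2_gap_density(tau, window):
    """Density of the gap between two independent uniform event times
    in [0, window] (continuous-time idealization)."""
    if 0.0 <= tau <= window:
        return 2.0 * (window - tau) / window ** 2
    return 0.0

rng = random.Random(1)
window = 364.0  # same window length D as quoted on the following slide

# Monte Carlo check of the triangular density.
gaps = [abs(rng.uniform(0, window) - rng.uniform(0, window)) for _ in range(100_000)]
mean_gap = sum(gaps) / len(gaps)
frac_short = sum(g <= 30.0 for g in gaps) / len(gaps)
expected_short = 1.0 - (1.0 - 30.0 / window) ** 2  # integral of the density from 0 to 30

print("mean gap:", mean_gap, "expected:", window / 3.0)
print("P(gap <= 30):", frac_short, "expected:", expected_short)
```

An empirical histogram with noticeably more short gaps than this baseline would point away from purely random victimization.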

• Comparison to REH shown as black line.
• $D = 364$.

On the right, a histogram of times between pairs of burglaries separated by 200 m or less. On the left, a similar histogram for pairs of Southern California earthquakes (magnitude 3.0 or greater) separated by 220 km or less.

• A space-time point process is characterized by its conditional intensity $\lambda(t, x, y \mid H_t)$ given a history $H_t$.
• Epidemic Type Aftershock Sequence (ETAS) models divide earthquakes into two categories: background events and aftershock events.

• Background events occur according to a stationary process $\mu(x, y)$, with magnitudes distributed independently of $\mu$ with probability $j(M)$.
• Each of these earthquakes then elevates the risk of aftershocks, and the elevated risk spreads in space and time according to the kernel $g(t, x, y, M)$.

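• Parameter selection for ETAS models is most commonly accomplished through maximum likelihood estimation, where the log-likelihood function (Daley and Vere-Jones, 2003), $l(\theta) = \sum_k \log \lambda(t_k, x_k, y_k) - \int_0^T \iint_S \lambda(t, x, y)\,dy\,dx\,dt$ for events $(t_k, x_k, y_k)$ observed in the space-time window $[0, T] \times S$, is maximized over all parameter sets $\theta$.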
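To show what maximizing a point-process log-likelihood involves, here is a deliberately simplified, purely temporal sketch. It assumes an intensity $\lambda(t) = \mu + k_0 \sum_{t_i < t} w e^{-w (t - t_i)}$ rather than the full space-time ETAS kernel, so the function `hawkes_log_likelihood` and its parameters are illustrative, not the model fitted in the talk.

```python
import math

def hawkes_log_likelihood(times, horizon, mu, k0, w):
    """Log-likelihood of sorted event times on [0, horizon] under the
    assumed intensity mu + k0 * sum_{t_i < t} w * exp(-w * (t - t_i))."""
    log_sum = 0.0
    decay = 0.0   # running sum of exp(-w * (t - t_j)) over earlier events t_j
    prev = None
    for t in times:
        if prev is not None:
            decay = math.exp(-w * (t - prev)) * (1.0 + decay)
        log_sum += math.log(mu + k0 * w * decay)
        prev = t
    # Integral of the intensity over the observation window.
    compensator = mu * horizon + k0 * sum(
        1.0 - math.exp(-w * (horizon - t)) for t in times
    )
    return log_sum - compensator

# Hypothetical event times, for illustration only.
example_times = [1.0, 1.5, 4.2, 4.3, 4.9, 9.0]
print(hawkes_log_likelihood(example_times, 10.0, mu=0.3, k0=0.5, w=1.0))
```

In practice this function would be handed to a numerical optimizer over `mu`, `k0`, and `w`; the recursion for `decay` keeps the evaluation linear in the number of events.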

• Akaike information criterion (AIC): a measure of goodness of fit of a statistical model, used for model selection.
• $\mathrm{AIC} = 2K - 2\ln(L)$, where $K$ is the number of parameters in the model and $L$ is the maximized value of the likelihood function of the model.
• The AIC methodology attempts to find the model that best explains the data with a minimum of free parameters.
• If model errors are normally and independently distributed, then, up to an additive constant that depends only on $n$, AIC is equivalent to $2K + n\ln(\mathrm{RSS})$, where RSS is the residual sum of squares (the sum of squared differences between data and model prediction) and $n$ is the number of observations.
• The preferred model has the lowest AIC value.
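The AIC comparison can be sketched in a few lines. The formula $2K - 2\ln(L)$ is taken from the slide; the candidate model names and their log-likelihood values below are hypothetical placeholders, chosen only to show the selection step.

```python
import math

def aic_from_log_likelihood(log_likelihood, num_params):
    """AIC = 2K - 2 ln(L), with ln(L) the maximized log-likelihood."""
    return 2 * num_params - 2 * log_likelihood

def aic_from_rss(rss, num_obs, num_params):
    """Least-squares form 2K + n ln(RSS); comparable across models
    only when they are fitted to the same n observations."""
    return 2 * num_params + num_obs * math.log(rss)

# Hypothetical (maximized log-likelihood, number of parameters) pairs.
candidates = {
    "constant": (-120.4, 1),
    "linear": (-112.9, 2),
    "self-exciting": (-104.1, 4),
}

scores = {name: aic_from_log_likelihood(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: AIC = {score:.1f}")
print("preferred model:", best)
```

The extra parameters of a richer model are only justified when they raise the log-likelihood by more than one unit per parameter.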

Rivalry network among 29 street gangs in Hollenbeck, Los Angeles (Tita et al., 2003).

• Event dependence is a common process driving repeat victimization across all crime types.
• The specific behavioral mechanism (street smarts/street justice) may differ in detail, but the outcome is the same.
• The Hawkes process is a flexible representation of self-excitation.

Equation labels: background rate of violence; retaliation strength; retaliation duration; rivalry intensity; self-excitation; time since the most recent incident.

Figure panels: simulated; actual. Mike Egesdal, Chris Fathauer, Kym Louie, and Jeremy Neuman, Statistical Modeling of Gang Violence in Los Angeles, submitted to SIURO.

Here $k_0$ is the expected number of retaliations per attack and $1/w$ is the expected waiting time for a retaliation (in days).
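These two interpretations suggest a simple way to simulate such a process. The sketch below assumes a Hawkes intensity of the form $\lambda(t) = \mu + k_0 \sum_{t_i < t} w e^{-w (t - t_i)}$ and uses its branching representation: background attacks arrive at rate $\mu$, and every attack triggers a Poisson number of retaliations with mean $k_0$, each after an exponential delay with mean $1/w$. The intensity form, the helper names, and the parameter values are assumptions for illustration.

```python
import math
import random

def sample_poisson(mean, rng):
    """Poisson sample by multiplying uniforms (suitable for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_hawkes(mu, k0, w, horizon, rng):
    """Simulate event times on [0, horizon) via the branching structure.
    Requires k0 < 1 so that retaliation cascades die out."""
    events = []
    t = rng.expovariate(mu)
    while t < horizon:            # background events
        events.append(t)
        t += rng.expovariate(mu)
    queue = list(events)
    while queue:                  # retaliations triggered by each event
        parent = queue.pop()
        for _ in range(sample_poisson(k0, rng)):
            child = parent + rng.expovariate(w)
            if child < horizon:
                events.append(child)
                queue.append(child)
    return sorted(events)

rng = random.Random(42)
events = simulate_hawkes(mu=0.5, k0=0.5, w=1.0, horizon=2000.0, rng=rng)
print(len(events), "events; long-run expectation is about", 0.5 * 2000.0 / (1 - 0.5))
```

With $k_0 < 1$ the long-run event rate is roughly $\mu / (1 - k_0)$, so retaliation here doubles the background level of violence.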

Percentage of crimes predicted vs. percentage of cells flagged for 2005 burglary (left) and 2007 robbery (right). The curve for CHM is the pointwise maximum over a variety of hotspot map prediction methods discussed in the criminological literature.
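One point on such a curve can be computed as follows. This is a minimal sketch under the assumption that each grid cell has a predicted risk score and an observed crime count; the function name `fraction_predicted` and the four-cell data are hypothetical.

```python
def fraction_predicted(risk, crimes, fraction_flagged):
    """Flag the highest-risk cells and return the share of observed
    crimes that fall inside the flagged cells."""
    n_flag = round(fraction_flagged * len(risk))
    ranked = sorted(range(len(risk)), key=lambda i: risk[i], reverse=True)
    hits = sum(crimes[i] for i in ranked[:n_flag])
    total = sum(crimes)
    return hits / total if total else 0.0

# Hypothetical four-cell grid: predicted risk and observed crime counts.
risk = [0.9, 0.1, 0.5, 0.3]
crimes = [5, 0, 3, 2]

for frac in (0.25, 0.5, 1.0):
    print(f"{frac:.0%} of cells flagged -> {fraction_predicted(risk, crimes, frac):.0%} of crimes")
```

Sweeping `fraction_flagged` from 0 to 1 traces out the full curve for one prediction method.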

Figure panels: inter-event times (Najaf, Iraq); n events (Najaf, Iraq). Data from Iraq Body Count; analysis by Erik Lewis, UCLA.

• The Iraqi data show a clear temporal dependence in the background rate, likely linked to troop presence.
• We consider several models for the change in background rate:
• (a) step model,
• (b) linear increase,
• (c) variable bandwidth kernel smoothing.

• Example: linear background rate.
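The example equation on this slide is not preserved in the transcript. As a hedged sketch, a self-exciting intensity with a linearly increasing background could take the form $\lambda(t) = a + b\,t + \sum_{t_i < t} k_0 w e^{-w (t - t_i)}$; the parameter names `a`, `b`, `k0`, `w` and the exponential kernel are assumptions made here for illustration.

```python
import math

def intensity_linear_background(t, history, a, b, k0, w):
    """Assumed intensity: linear background a + b*t plus an exponentially
    decaying contribution from every earlier event in history."""
    background = a + b * t
    excitation = sum(k0 * w * math.exp(-w * (t - s)) for s in history if s < t)
    return background + excitation

# Hypothetical parameters and event history (times in days).
history = [3.0, 3.5, 10.0]
for t in (1.0, 4.0, 12.0):
    print(t, intensity_linear_background(t, history, a=0.05, b=0.001, k0=0.4, w=0.5))
```

Setting `k0 = 0` recovers the purely linear model without self-excitation, which is the natural baseline for an AIC comparison.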

• Time period: March 20, 2003 to Dec. 31, 2007.
• 15,977 events.
• Recorded fields: start date, end date, minimum and maximum number of deaths, town and/or district.
• In the analysis, no distinction is made between events with different numbers of deaths.
• No distinction is made between types of event (e.g. IED or gunfire).
• Only the start date is considered (93% of events have the same start and end date).

A histogram of all 149 events in Najaf, with 30 bins, is plotted on the left. The estimated fit with a linear background rate is plotted on the right (the jagged curve). The linear fit without self-excitation is shown as well.

• M.B. Short, M.R. D'Orsogna, P.J. Brantingham, and G.E. Tita, Measuring and modeling repeat and near-repeat burglary effects, J. Quant. Criminol. 25 (2009).
• G.O. Mohler, M.B. Short, P.J. Brantingham, F.P. Schoenberg, and G.E. Tita, Self-exciting point process modeling of crime, preprint (2010).
• W. Feller, An Introduction to Probability Theory and Its Applications, 3rd edn., vol. 1, Wiley, New York (1968).
• D. Daley and D. Vere-Jones, An Introduction to the Theory of Point Processes, 2nd edn., Springer, New York (2003).
• Mike Egesdal, Chris Fathauer, Kym Louie, and Jeremy Neuman, Statistical Modeling of Gang Violence in Los Angeles, SIAM J. Undergraduate Research Online.
• Mark Allenby, Kym Louie, and Marina Masaki (Tim Lucas, mentor), A Point Process Model for Simulating Gang-on-Gang Violence, project report, 2010 REU program at UCLA.
• E. Lewis, G. Mohler, P.J. Brantingham, and A.L. Bertozzi, Self-Exciting Point Process Models of Civilian Deaths in Iraq, preprint (2010).

• Johnson, S. (2008). Repeat burglary victimisation: a tale of two theories. Journal of Experimental Criminology, 4.
• Townsley, M., Johnson, S. D., & Ratcliffe, J. H. (2008). Space time dynamics of insurgent activity in Iraq. Security Journal, 21.
• Iraq Body Count. (2008). Iraq body count.
• Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans. Automatic Control, AC-19.
• Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. Budapest: Akademiai Kiado.