Leveraging the Absence of Observations: Pattern Recognition in Spatiotemporal Behavioral Data for Site and Purpose Discovery


Michael J. Haass, Mark H. Van Benthem, Edward M. Ochoa, Hyrum Anderson, Kristina R. Czuchlewski

ABSTRACT

Denied access environments play a major role in shaping ISR operations and analysis in conflicts. A common strategy for analysis in denied access environments is to develop statistical models of geospatial activity patterns. To date, less effort has been devoted to identifying latent information contained in patterns of missing observations. This poster describes a case study of site discovery based on patterns of life (POL) identified through computational analysis of gaps in observations. First, significant sites are identified; then a GIS database is queried, and this contextual information is used to cue an analyst to associate purpose with significant patterns of activity at detected sites. This work provides a basis of experience for development of predictive algorithms robust to missing observations in denied access environments.

CASE STUDY – SITE DISCOVERY

Purpose: Provide a basis of experience for further development of predictive algorithms that are robust to limited observation environments.

Problem: Given GPS tracks from taxi cabs in the San Francisco area, determine which cab company participated in the study.*

Data set: Approximately 500 taxis over 25 days, with the following fields:
• Latitude
• Longitude
• Occupancy
• Time

*M. Piorkowski, N. Sarafijanovic-Djukic, M. Grossglauser, A Parsimonious Model of Mobile Partitioned Networks with Clustering, The First International Conference on COMmunication Systems and NETworkS (COMSNETS), Bangalore, India,

[Figure: GPS Tracks from One Cab]

METHOD

Density-Based Spatial Clustering of Applications with Noise (DBSCAN*) is well suited to POL analysis:
• The total number of clusters is not needed a priori
• A single, intuitive parameter, the neighborhood radius (Eps), influences cluster identification
• We chose Eps equal to the diagonal of a typical San Francisco business district block (~200 m)

*Ester, M., Kriegel, H., Sander, J., Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. Proceedings of Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA,

CHALLENGES & APPROACH

CHALLENGES:
• A simple heat map of activity is too busy to search by eye
• Cab fare state is not likely to change at the depot
• At least eight cab services operate in the San Francisco area
• It is laborious to search for activity at each address

[Figure: Activity Level on Lat-Lon Grid]

APPROACH:
1. Query for gaps in observations (i.e., when sensors are turned off, as determined by the time gap between consecutive observations)
2. Identify patterns of life using the geolocation of observation gaps
3. Correlate coordinates with a GIS database
4. Rank results to aid the analyst's final assessment

RESULTS

[Figure: Clusters with Most Members]

CONCLUSIONS

• Site discovery through analysis of patterns of life obtained from gaps in observations may reduce the dependency on preexisting models, which may not adapt to, or account for, new variances introduced as conflicts evolve.
• Integration of GIS query and temporal analysis provides contextual cues to aid the analyst's interpretation of purpose.
• Future work should explore the impact of uncertainties characteristic of denied access environments, such as aperiodic sampling rates with large time gaps.

ACKNOWLEDGEMENTS

This work was supported by the Laboratory Directed Research and Development (LDRD) Program at Sandia National Laboratories. Sandia National Laboratories is a multiprogram laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL SAND #

Contact for reprints, questions, or more
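ILLUSTRATIVE SKETCH (NOT FROM THE POSTER)

Steps 1 and 2 of the approach, followed by the DBSCAN clustering and a simple ranking, could be sketched roughly as below. This is a minimal pure-Python illustration, not the authors' implementation. The function names, the one-hour gap threshold (min_gap_s), the minimum cluster size (min_pts), and the choice to place each gap at the last fix before the sensor went silent are all assumptions made for the example. Only the ~200 m neighborhood radius comes from the poster. The neighbor search is a naive O(n^2) scan, which is fine for a sketch but would likely need a spatial index for the full data set.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def find_gaps(track, min_gap_s=3600):
    """Step 1: locate observation gaps in one cab's track.

    track is a list of (lat, lon, unix_time) tuples sorted by time.
    Returns the (lat, lon) of the last fix before each gap longer than
    min_gap_s (threshold and placement rule are assumptions).
    """
    return [(prev[0], prev[1])
            for prev, cur in zip(track, track[1:])
            if cur[2] - prev[2] > min_gap_s]


def dbscan(points, eps_m=200.0, min_pts=3):
    """Step 2: minimal DBSCAN. Returns one label per point (-1 means noise)."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)

    def neighbours(i):
        # Naive scan; the point itself counts toward min_pts.
        return [j for j, q in enumerate(points)
                if haversine_m(points[i], q) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster      # former noise becomes a border point
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


def rank_clusters(labels):
    """Step 4 (simplified): cluster ids ordered by member count, noise excluded."""
    return Counter(l for l in labels if l != -1).most_common()
```

A caller would run find_gaps on each cab's track, pool the resulting gap locations, cluster them with dbscan, and pass the labels to rank_clusters to get candidate sites ordered by how many gaps they attract. Step 3, the GIS lookup for each cluster, depends on the database in use and is left out here.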