1 John Walker An Attempt to Replicate the Shnoll et al. Effect with Algorithmic Classification of Histogram Similarity.


1 John Walker An Attempt to Replicate the Shnoll et al. Effect with Algorithmic Classification of Histogram Similarity

2 Principal Goal
Development of an easily replicated stochastic source and an accompanying computer-based toolkit for exploring time dependence in histogram structure, together with automated techniques for histogram similarity ranking.

3 Stochastic Source
• Oxford Nuclear 5.0 µCi ¹³⁷Cs 661.6 keV gamma source (US$40)
• Aware Electronics RM-80 Geiger-Müller detector with serial port interface (US$319)
• Generic PC with MS-DOS and a serial port
• Modified HotBits generator software (public domain)
• Event rate ≈ 200,000 counts/min
• Background ≈ 60 counts/min

4 Generator Software
• MS-DOS (not Windows) 16-bit program
• Direct port access from assembly language
• Interval timing from PC ROM BIOS clock
• Time of day synchronised with Network Time Protocol
• Small footprint: “retired” PCs suitable as generators
Design Goals:
• Consistent hardware-based interval timing
• Accurate detection and accumulation of counts
• Measurements precisely labeled with date and time

5 Raw Data Format
• One Measurement Record per minute, beginning at the start of the minute
• 100 consecutive Count Windows per minute, each consisting of nine ticks of the 18.2 Hz PC hardware clock
• Mean counts per Count Window ≈ 1900
• CSV output record written in “housekeeping time” between the end of the 100th Count Window and the start of the next minute: Unix time() at start of minute, followed by 100 comma-separated count values
• File size: 735 KB/day
,1890,1964,1898,1902,1840,1901,1842,1916,1886,1901,1838,1932,1880,1985,1910,1883,
1919,1903,1895,1913,1899,1902,1870,1914,1897,1858,1854,1855,1893,1860,1948,1837,1887,1865,
1888,1882,1914,1914,1905,1903,1898,1930,1892,1883,1926,1903,1861,1899,1951,1900,1856,1877,
1861,1861,1865,1882,1850,1882,1910,1874,1870,1893,1926,1923,1880,1889,1911,1885,1913,1863,
1883,1918,1910,1933,1945,1891,1873,1910,1861,1850,1888,1948,1902,1881,1939,1948,1861,1870,
1897,1938,1895,1896,1889,1912,1919,1867,1847,1899,1937,1890
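A minimal sketch of reading one Measurement Record in the format described on this slide. The function name `parse_record` and the sample record (timestamp and counts) are hypothetical; only the field layout, the 100-window count, and the nine-tick window at 18.2 Hz come from the slide.

```python
WINDOWS_PER_RECORD = 100   # Count Windows per minute (from the slide)
TICKS_PER_WINDOW = 9       # clock ticks per Count Window (from the slide)
CLOCK_HZ = 18.2            # nominal PC hardware clock rate (from the slide)


def parse_record(line):
    """Split one CSV record into (Unix time at start of minute, list of counts)."""
    fields = line.strip().split(",")
    unix_time = int(fields[0])
    counts = [int(f) for f in fields[1:]]
    if len(counts) != WINDOWS_PER_RECORD:
        raise ValueError(f"expected {WINDOWS_PER_RECORD} counts, got {len(counts)}")
    return unix_time, counts


# Hypothetical record: the timestamp and the repeated count value are made up.
sample = "1000000020," + ",".join(["1900"] * WINDOWS_PER_RECORD)
start, counts = parse_record(sample)

# Length of one Count Window and total counting time per minute; the rest of
# the minute is the "housekeeping time" in which the record is written.
window_seconds = TICKS_PER_WINDOW / CLOCK_HZ
counting_seconds = WINDOWS_PER_RECORD * window_seconds
print(f"window = {window_seconds:.4f} s, counting = {counting_seconds:.2f} s per minute")
```

With these figures each Count Window lasts a little under half a second, which leaves roughly ten seconds per minute for housekeeping.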

6 Analysis Software
• Reads one or more days’ Raw Data CSV files
• Assembles count histograms into Experiments of 10 minutes each
Processing pipeline: Raw Data → Histogram Compilation → Transformation Modules → Histogram Pair Assembly → Matching Modules → Closeness Sort → Ranking Table

7 Histogram Compilation
• Arranges raw data into 10 minute Experiments, each beginning at a round 10 minute interval. (Intervals with missing data are discarded.)
• Builds in-memory raw histograms (number of occurrences of a given count in interval)
• Computes exponentially smoothed moving average (P = 0.2) of histogram, symmetrically from the mean
• Creates histogram CSV files (raw and smoothed) for each experiment for subsequent analysis
• Plots each experiment’s histogram as a GIF file
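The compilation step can be sketched as follows. The slide does not spell out what “symmetrically from the mean” means, so this sketch assumes one plausible reading: the bin at the mean is left unchanged and the exponential smoothing is swept outward from it in both directions. The function names and the toy data are hypothetical.

```python
from collections import Counter


def raw_histogram(counts):
    """Return (lowest count value, list of occurrences for each value up to the highest)."""
    lo, hi = min(counts), max(counts)
    freq = Counter(counts)
    return lo, [freq.get(v, 0) for v in range(lo, hi + 1)]


def smooth_from_mean(hist, lo, mean, p=0.2):
    """Exponentially smooth a histogram outward from the bin holding the mean.

    Assumed recurrence: out[i] = p * hist[i] + (1 - p) * out[neighbour nearer the mean].
    """
    centre = min(max(round(mean) - lo, 0), len(hist) - 1)
    out = [float(h) for h in hist]
    for i in range(centre + 1, len(hist)):      # sweep right of the mean
        out[i] = p * hist[i] + (1 - p) * out[i - 1]
    for i in range(centre - 1, -1, -1):         # sweep left of the mean
        out[i] = p * hist[i] + (1 - p) * out[i + 1]
    return out


# Toy data standing in for the 1,000 counts of one 10 minute Experiment.
toy_counts = [5, 6, 6, 7, 7, 7, 8, 8, 9]
lo, hist = raw_histogram(toy_counts)
mean = sum(toy_counts) / len(toy_counts)
smoothed = smooth_from_mean(hist, lo, mean)
print(hist, smoothed)
```

Sweeping outward from the mean keeps a symmetric input symmetric, which a single left-to-right pass would not.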

8 Transformation Modules
Open-ended plug-in modules transform experiment histograms in place:
• NORMALISE: Scales histogram values so that maximum value is 1
• FOURIER: Replaces histogram with its Fourier transform
• WAVELET: Replaces histogram with its discrete wavelet transform using the Daubechies 4-coefficient filter coefficients
Multiple transforms can be enabled; new transforms can be added. Transformed histograms and their inverses can be plotted for debugging.
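A rough sketch of how such in-place plug-ins might be organised. The registry `TRANSFORMS` and the function names are hypothetical, the Fourier module is a naive discrete Fourier transform, and the wavelet module performs only a single level of the Daubechies 4-coefficient transform with periodic wrap-around (a full discrete wavelet transform would recurse on the smooth half).

```python
import cmath
import math


def normalise(hist):
    """Scale values in place so the maximum is 1."""
    peak = max(hist)
    if peak:
        hist[:] = [h / peak for h in hist]


def fourier(hist):
    """Replace the histogram in place with its (naive, O(n^2)) DFT."""
    n = len(hist)
    hist[:] = [sum(hist[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
               for j in range(n)]


def daub4_level(hist):
    """One level of the Daubechies-4 transform: smooth half, then detail half."""
    n = len(hist)
    if n < 4 or n % 2:
        raise ValueError("length must be even and at least 4")
    s3, norm = math.sqrt(3.0), 4.0 * math.sqrt(2.0)
    h = [(1 + s3) / norm, (3 + s3) / norm, (3 - s3) / norm, (1 - s3) / norm]
    g = [h[3], -h[2], h[1], -h[0]]
    smooth, detail = [], []
    for i in range(0, n, 2):
        window = [hist[(i + k) % n] for k in range(4)]   # periodic wrap-around
        smooth.append(sum(c * x for c, x in zip(h, window)))
        detail.append(sum(c * x for c, x in zip(g, window)))
    hist[:] = smooth + detail


# Hypothetical plug-in registry: new transforms are added by registering a function.
TRANSFORMS = {"NORMALISE": normalise, "FOURIER": fourier, "WAVELET": daub4_level}


def apply_transforms(hist, names):
    """Apply the enabled transforms, in order, to one histogram in place."""
    for name in names:
        TRANSFORMS[name](hist)


example = [1.0, 2.0, 4.0, 2.0]
apply_transforms(example, ["NORMALISE"])
print(example)
```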

9 Histogram Pair Assembly
• All pairs of histograms are tabulated in memory
• Assumes matching algorithm is commutative (but this can be changed)
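As an illustration of the commutativity assumption, the sketch below enumerates pairs both ways. The function name `assemble_pairs` and the experiment labels are hypothetical.

```python
from itertools import combinations, permutations


def assemble_pairs(experiments, commutative=True):
    """List all histogram pairs, excluding self/self comparisons.

    With a commutative matcher (a, b) and (b, a) score the same, so each
    unordered pair is listed once; otherwise both orderings are kept.
    """
    if commutative:
        return list(combinations(experiments, 2))
    return list(permutations(experiments, 2))


labels = ["exp0", "exp1", "exp2", "exp3"]      # made-up experiment labels
unordered = assemble_pairs(labels)
ordered = assemble_pairs(labels, commutative=False)
print(len(unordered), len(ordered))
```

Dropping the commutativity assumption doubles the number of comparisons, which matters because the pair count grows quadratically with the number of experiments.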

10 Matching Modules (1)
Plug-in modules which, given a pair of experiment histograms, return a floating-point metric of how “close” they are in morphology.
• MEAN-ALIGNED χ²: Histograms are shifted so their mean values align, then the χ² distance between the curves is computed.
• SLIDING, MIRRORED χ²: Histograms are initially aligned at their mean value; then both the histogram pair and the pair with one histogram mirrored about the mean are shifted along the X axis, and the minimum χ² distance is reported.
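The two matching modules above can be sketched as follows. The slides do not give the exact χ² formula, so this sketch assumes the common symmetric form, the sum of (a − b)² / (a + b) over bins; the dictionary representation (count value → frequency), the function names, and the `max_shift` range are likewise assumptions.

```python
def hist_mean(h):
    """Frequency-weighted mean of a histogram stored as {count value: frequency}."""
    return sum(x * v for x, v in h.items()) / sum(h.values())


def shifted(h, dx):
    """Slide a histogram dx bins along the X axis."""
    return {x + dx: v for x, v in h.items()}


def mirrored(h):
    """Reflect a histogram about its (rounded) mean."""
    centre = round(hist_mean(h))
    return {2 * centre - x: v for x, v in h.items()}


def chi2(a, b):
    """Assumed symmetric chi-squared distance between two histograms."""
    total = 0.0
    for x in set(a) | set(b):
        p, q = a.get(x, 0.0), b.get(x, 0.0)
        if p + q > 0:
            total += (p - q) ** 2 / (p + q)
    return total


def mean_aligned_chi2(a, b):
    """Shift b so the means align, then measure the distance."""
    return chi2(a, shifted(b, round(hist_mean(a) - hist_mean(b))))


def sliding_mirrored_chi2(a, b, max_shift=3):
    """Return (minimum distance, mirror flag, shift) over plain and mirrored b."""
    best = None
    for mirror, candidate in ((1, b), (-1, mirrored(b))):
        base = round(hist_mean(a) - hist_mean(candidate))
        for s in range(-max_shift, max_shift + 1):
            d = chi2(a, shifted(candidate, base + s))
            if best is None or d < best[0]:
                best = (d, mirror, s)
    return best


a = {10: 1, 11: 3, 12: 2}                  # toy histogram
b = shifted(a, 5)                          # same shape, displaced mean
print(mean_aligned_chi2(a, b))
print(sliding_mirrored_chi2(a, mirrored(a)))
```

Returning the mirror flag and shift alongside the distance lets later stages record which free parameters produced the best match.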

11 Matching Modules (2)
• SLIDING, MIRRORED, STRETCHED χ²: Histograms are initially aligned at their mean value; then the histogram pair, the pair with one histogram mirrored about the mean, and histograms scaled along the X axis within a defined range are shifted along the X axis, and the minimum χ² distance is reported. (Work in progress.)
• HUMAN-DIRECTED: It would be possible to input the ranking table from similarity measures made by human judges.

12 Closeness Sort
• Sorts histogram pairs by closeness as determined by the Matching Module.
• Produces aligned plots of best and worst matches to evaluate effectiveness of Matching Module.

13 Ranking Table
• CSV format file lists histogram pairs in descending order of closeness as evaluated by the Matching Module.
• Closeness metric and free matching parameters are included for downstream analysis programs.
[Sample ranking table records: values not recoverable]
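A minimal sketch of the sort and table output, assuming that a smaller χ² distance means a closer match, so the closest pairs are written first. The column layout used here (distance, start time of each experiment, mirror flag, shift) and all row values are hypothetical; the actual field layout of the ranking table is not recoverable from the slide.

```python
import csv
import io


def write_ranking_table(scored_pairs, out):
    """Sort pairs closest-first (smallest distance first) and write them as CSV."""
    ranked = sorted(scored_pairs, key=lambda row: row[0])
    csv.writer(out).writerows(ranked)
    return ranked


# Made-up rows: (distance, start_a, start_b, mirror, shift).
scored = [(3.2, 0, 600, 1, 4), (1.5, 600, 1800, -1, 0), (2.7, 0, 1800, 1, 2)]
buffer = io.StringIO()
ranked = write_ranking_table(scored, buffer)
print(buffer.getvalue())
```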

14 Time Binning
• Reads Ranking Table and bins into deciles by closeness metric, creating a histogram of time difference between histograms for each decile.
• Creates expectation value table for null hypothesis.
• Normalises decile histograms vs. null hypothesis and plots results by decile.
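The binning step might look like the sketch below. How the expectation table is actually built is not stated on the slide; this sketch assumes gap-free, equally spaced experiments, for which the number of pairs separated by k experiment intervals is N − k out of N(N − 1)/2 pairs. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

EXPERIMENT_SECONDS = 600   # ten-minute experiments (from the slides)


def decile_lag_histograms(ranked):
    """Split closest-first rows (metric, t_a, t_b) into deciles of lag histograms."""
    n = len(ranked)
    deciles = []
    for d in range(10):
        chunk = ranked[d * n // 10:(d + 1) * n // 10]
        deciles.append(Counter(abs(tb - ta) // EXPERIMENT_SECONDS for _, ta, tb in chunk))
    return deciles


def null_expectation(n_experiments):
    """Expected fraction of pairs at each lag if closeness is unrelated to time."""
    total = n_experiments * (n_experiments - 1) // 2
    return {k: (n_experiments - k) / total for k in range(1, n_experiments)}


def normalise_vs_null(lag_hist, expectation):
    """Ratio of observed to expected fraction at each lag (1.0 means no excess)."""
    size = sum(lag_hist.values())
    return {k: (c / size) / expectation[k] for k, c in lag_hist.items()}


# Toy data: five experiments, every pair given a made-up closeness metric.
times = [i * EXPERIMENT_SECONDS for i in range(5)]
ranked = [(float(i), a, b) for i, (a, b) in enumerate(combinations(times, 2))]
deciles = decile_lag_histograms(ranked)
expected = null_expectation(len(times))

all_lags = Counter()
for hist in deciles:
    all_lags.update(hist)
ratios = normalise_vs_null(all_lags, expected)
print(ratios)
```

Under this assumption, a decile whose normalised ratios stay near 1.0 at every lag shows no preference for any particular time separation.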

15 Ranking Table Randomiser
• Shuffles lines in the ranking table produced by the Closeness Sort.
• Time binning randomised ranking provides null hypothesis control for closeness matching.
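A sketch of the control, with a hypothetical function name and made-up table lines: shuffling destroys any relation between a pair's position in the table and its closeness, while keeping exactly the same set of pairs.

```python
import random


def randomise_ranking(lines, seed=None):
    """Return the ranking table lines in random order, leaving the input untouched."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Made-up ranking table lines (distance, start_a, start_b).
table = ["1.5,600,1800", "2.7,0,1800", "3.2,0,600", "4.1,600,1200"]
control = randomise_ranking(table, seed=1)
print(control)
```

Passing a fixed seed makes a control run repeatable; omitting it gives a fresh shuffle each time.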

16 Pilot Experiment
• Data collected continuously from … through …; no gaps in data set.
• Data set contains:
  12,960 one-minute measurement records
  1,296,000 equal-duration count windows
  1,296 ten-minute experiments
  839,160 histogram pairs, excluding self/self and assuming commutative comparison
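The data set figures follow from the record count and the per-minute and per-experiment sizes given on earlier slides, as this short check shows. The variable names are illustrative only.

```python
import math

records = 12_960                       # one-minute measurement records
days = records / (24 * 60)             # length of the run in days
windows = records * 100                # 100 Count Windows per record
experiments = records // 10            # ten records per experiment
pairs = math.comb(experiments, 2)      # unordered pairs, no self/self

print(days, windows, experiments, pairs)
```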

17 Complete Data Set Histogram (µ = …, σ = 26.4)

18 Representative Experiment Histograms

19 Closely Matching Histograms

20 Closely Matching Histograms

21 Closely Matching Histograms

22 Closely Matching Histograms

23 Poorly Matching Histograms

24 Poorly Matching Histograms

25 Null Hypothesis Time Distribution Expectation

26 Closeness Ranking: Closest 2000

27 Closeness Ranking: Decile 1

28 Closeness Ranking: Decile 2

29 Closeness Ranking: Decile 9

30 Closeness Ranking: Decile 10

31 Control Ranking: Decile 1

32 Control Ranking: Decile 2

33 Control Ranking: Decile 9

34 Control Ranking: Decile 10

35 Conclusions From Pilot Experiment
• No evidence found for time dependence in fine structure of smoothed histograms.
• Not a refutation, owing to the very small data set, a single generator at one location, limitations in automated histogram similarity scoring, and the inability to correlate automated scoring with the human judging reported by Shnoll et al.

36 Toolkit Availability
• All software developed for this project is in the public domain and all ancillary software is free software included in a standard Linux distribution.
• Hardware cost for the stochastic generator is less than US$500, plus a generic MS-DOS PC.
• Analysis source code and pilot experiment data set available to all investigators.
• Open framework for exploring automated histogram similarity ranking.

37 References
Shnoll, S. et al., “Realization of discrete states during fluctuations in macroscopic processes”, Physics–Uspekhi 41 (10) 1025–1035 (1998).
Shnoll, S. et al., “Regular variation of the fine structure of statistical distributions as a consequence of cosmophysical agents”, Physics–Uspekhi 43 (2) 205–209 (2000).
