Forged Handwriting Detection Hung-Chun Chen M.S. Thesis in Computer Science Advisors: Drs. Cha and Tappert.

Motivation
Important documents require signatures to verify the identity of the writer.
Experts are required to differentiate between authentic and forged signatures.
It is important to develop an objective system that identifies forged handwriting, or at least flags handwriting that is likely to be forged.

Key Idea
It seems reasonable that successful forgers often reproduce the shape and size of handwriting by carefully copying or tracing the authentic handwriting.
The forensic literature indicates that this is true.

Hypotheses
Good forgeries (those that retain the shape and size of the authentic writing) tend to be written more slowly and carefully than authentic writing.
Good forgeries are likely to be wrinklier (less smooth) than authentic handwriting.

Methodology
Handwriting sample collection
Measurement (feature) extraction
–Speed
–Wrinkliness
Statistical analysis

IBM Thinkpad Transnote

Database Construction
Record format for the handwriting samples:
1. ID of subject
2. online or offline
3. ID of copied subject
4. word written
5. first/second/third try
6. sampling rate (online) or resolution (offline)
7. file extension

[Figure: annotated record name. Labels: subject ID; ON (online) or OFF (offline); ID of copied subject; word written; first, second, or third try; rate (100 Hz) or resolution (300 dpi or 600 dpi); file extension]

Handwriting Samples

Feature Extraction Speed Wrinkliness

Speed
The digitizer records the x-y coordinates of the pen movement at a sampling rate of 100 Hz.
This information is used to calculate the average speed of each handwriting sample.
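The slides do not give the exact speed formula, so the following is only a minimal sketch of one plausible calculation. It assumes that speed is the path length of a stroke divided by its duration, that consecutive points are exactly 1/100 s apart, and that pen-up time between strokes is ignored. The function name `average_speed` and the sample stroke are hypothetical.

```python
import math


def average_speed(points, sampling_rate_hz=100.0):
    """Average pen speed over one stroke, in coordinate units per second.

    Assumption: consecutive points are 1 / sampling_rate_hz seconds apart.
    """
    if len(points) < 2:
        return 0.0
    # Sum the straight-line distances between consecutive samples.
    path_length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    # n points span n - 1 sampling intervals.
    duration = (len(points) - 1) / sampling_rate_hz
    return path_length / duration


# Hypothetical stroke: two steps of 0.05 units each, taken in 0.02 s.
stroke = [(0.0, 0.0), (0.03, 0.04), (0.06, 0.08)]
print(average_speed(stroke))
```

A per-word speed could then be taken as total path length over total pen-down time across all strokes, but that aggregation is also an assumption.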

Speed
The original file of the points:
** Page 10 has 4 scribbles: PageSize is cm wide by cm high.
Scribble 0: time 2002/12/11 23:37
Stroke has 93 points:
Point ( 4.73, 5.02 )
Point ( 4.73, 5 )
Point ( 4.73, 4.99 )
Point ( 4.73, 4.97 )
....
Scribble 1: time 2002/12/11 23:37
Stroke has 113 points:
Point ( 5.82, 5.26 )
Point ( 5.83, 5.26 )
Point ( 5.85, 5.25 )
Point ( 5.88, 5.24 )
...
Scribble 2: time 2002/12/11 23:37
Stroke has 7 points:
Point ( 7.93, 4.61 )
Point ( 7.94, 4.61 )
Point ( 7.96, 4.61 )
Point ( 7.99, 4.62 )
...
Scribble 3: time 2002/12/11 23:37
Stroke has 47 points:
Point ( 8.26, 5.75 )
Point ( 8.27, 5.75 )
....
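A file in the layout shown above could be read into per-scribble point lists roughly as follows. This is a sketch under assumptions: the regular expressions are inferred only from the excerpt on the slide, the real export format may differ, and `parse_scribbles` and the shortened `sample` text are hypothetical.

```python
import re

# Patterns inferred from the slide excerpt; the actual file may differ.
SCRIBBLE_RE = re.compile(r"Scribble\s+\d+:")
POINT_RE = re.compile(
    r"Point\s*\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)"
)


def parse_scribbles(text):
    """Return one list of (x, y) floats per scribble found in text."""
    # Text before the first "Scribble N:" marker is the page header; drop it.
    chunks = SCRIBBLE_RE.split(text)[1:]
    return [
        [(float(x), float(y)) for x, y in POINT_RE.findall(chunk)]
        for chunk in chunks
    ]


# Shortened, hypothetical sample in the same layout as the slide.
sample = """** Page 10 has 2 scribbles:
Scribble 0: time 2002/12/11 23:37
Stroke has 2 points:
Point ( 4.73, 5.02 )
Point ( 4.73, 5 )
Scribble 1: time 2002/12/11 23:37
Stroke has 2 points:
Point ( 5.82, 5.26 )
Point ( 5.83, 5.26 )
"""

scribbles = parse_scribbles(sample)
print(len(scribbles), scribbles[0])
```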

Wrinkliness
Wrinkliness = log(high_resolution / low_resolution) / log(2)
high_resolution: the number of pixels on the boundary of the high-resolution handwriting sample
low_resolution: the number of pixels on the boundary of the low-resolution handwriting sample
Note that the wrinkliness of a straight line is 1.0.
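The formula translates directly into code. The sketch below assumes, as the slides on 300 and 600 dpi suggest, that the high resolution is twice the low resolution, which is why the ratio is divided by log(2). The function name and the pixel counts are illustrative only.

```python
import math


def wrinkliness(high_resolution, low_resolution):
    """Fractal-style wrinkliness estimate from two boundary pixel counts.

    Assumes the high resolution is twice the low resolution.
    """
    if high_resolution <= 0 or low_resolution <= 0:
        raise ValueError("boundary pixel counts must be positive")
    return math.log(high_resolution / low_resolution) / math.log(2)


# A straight line has twice as many boundary pixels at twice the resolution.
print(wrinkliness(2000, 1000))
```

A boundary whose pixel count more than doubles when the resolution doubles yields a value above 1.0, which is the sense in which a wrinklier stroke scores higher.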

Original handwriting sample

Find the edge of the handwriting

Edges of 300 and 600 dpi

Number of pixels on the boundary
Convert the scanned images to color (RGB) images.
At each of the two resolutions, count the pixels with Red < 50, Green < 50, and Blue < 50.
Compute the wrinkliness value from the two counts.
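The counting step can be sketched as below. To stay dependency-free, the images are represented as nested lists of (R, G, B) tuples rather than loaded from scan files; this representation, the function name `count_dark_pixels`, and the toy images are assumptions, not the thesis code.

```python
import math


def count_dark_pixels(pixels, threshold=50):
    """Count pixels whose red, green, and blue values are all below threshold."""
    return sum(
        1
        for row in pixels
        for (r, g, b) in row
        if r < threshold and g < threshold and b < threshold
    )


BLACK, WHITE = (0, 0, 0), (255, 255, 255)

# Hypothetical toy edge images: the same straight edge at two resolutions.
low_res = [
    [WHITE, WHITE],
    [BLACK, BLACK],
]
high_res = [
    [WHITE, WHITE, WHITE, WHITE],
    [WHITE, WHITE, WHITE, WHITE],
    [BLACK, BLACK, BLACK, BLACK],
    [WHITE, WHITE, WHITE, WHITE],
]

low_count = count_dark_pixels(low_res)
high_count = count_dark_pixels(high_res)
# Same formula as on the Wrinkliness slide.
value = math.log(high_count / low_count) / math.log(2)
print(low_count, high_count, value)
```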

Sample Results
[Table: Filename, 300 dpi, 600 dpi, Wrinkliness, Speed]

Information on the ten subjects
UserID  Age  Ethnicity  Education    Gender  Schooling  Handedness
1       30   Caucasian  Master       F       English    R
2       30   Asian      Master       F       Foreign    R
3       20   Asian      Bachelor     F       Foreign    R
4       27   Asian      Master       M       Foreign    R
5       28   Asian      Master       F       Foreign    R
6       35   Caucasian  Bachelor     M       English    R
7       60   Caucasian  Master       M       English    R
8       67   Asian      Beyond H.S.  F       Foreign    L
9       35   Caucasian  PhD          F       English    R
10      70   Asian      Beyond H.S.  M       Foreign    L

Summary of handwriting samples
10 subjects
Each subject wrote
–3 authentic handwriting samples
–3 forgeries of each of the other 9 subjects
Total 300 handwriting samples
–30 authentic
–270 forgeries
Total 900 database records
–One online and two offline resolutions for each handwriting sample
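The totals on this slide follow from the study design, as the short check below shows. The variable names are illustrative.

```python
subjects = 10
tries = 3                # each word was written three times
records_per_sample = 3   # one online record plus two offline resolutions

authentic = subjects * tries                     # each subject's own writing
forgeries = subjects * (subjects - 1) * tries    # each subject copies the other 9
samples = authentic + forgeries
records = samples * records_per_sample

print(authentic, forgeries, samples, records)
```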

Speed Hypothesis Test
H0 (null hypothesis): the mean speeds of the authentic and forged handwriting are equal
Ha (alternative hypothesis): the mean speed of the authentic handwriting is greater than that of the forged handwriting

Mean equality test output
Alpha (level of significance) = 5%
                              Authentic   Forged
Mean
Variance
Observations                  n_a = 30    n_f = 270
Pooled Variance
Hypothesized Mean Difference  0
df                            298
t Stat                        5.87
P(T<=t) one-tail              5.90E-09
t Critical one-tail           1.65

Reject the null hypothesis
Alpha (level of significance) = 0.05
The p-value is 5.90E-09, which is much less than alpha.
The data support the speed hypothesis.
Reject the null hypothesis at the 5% significance level (95% confidence level).
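The test output (pooled variance, df = 298 for 30 + 270 observations) resembles a two-sample t-test assuming equal variances. The sketch below shows how such a t statistic is computed. It is not the thesis code, the function name `pooled_t_test` is hypothetical, and the two small samples are made-up numbers used only to exercise the function.

```python
import math
from statistics import mean, variance


def pooled_t_test(sample_a, sample_b):
    """Two-sample t statistic assuming equal variances.

    Returns (t_stat, df, pooled_variance) for H0: mean_a - mean_b = 0.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    var_a, var_b = variance(sample_a), variance(sample_b)  # sample variances
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (mean(sample_a) - mean(sample_b)) / std_err
    return t_stat, df, pooled_var


# Made-up speeds, not data from the study.
authentic_speeds = [5.0, 6.0, 7.0]
forged_speeds = [2.0, 3.0, 4.0]

t_stat, df, pooled_var = pooled_t_test(authentic_speeds, forged_speeds)
print(t_stat, df, pooled_var)
```

For the one-tailed test on the slides, the null hypothesis is rejected when the t statistic exceeds the one-tail critical value (1.65 at alpha = 0.05 with df = 298).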

Wrinkliness Hypothesis Test
H0 (null hypothesis): log2(600dpi_f / 300dpi_f) ~ log2(600dpi_a / 300dpi_a), where f = forged and a = authentic
Ha (alternative hypothesis): the mean wrinkliness of the authentic handwriting is less than the mean wrinkliness of the forged handwriting

Mean equality test output
Alpha (level of significance) = 5%
                              Forged   Authentic
Mean
Variance
Observations                  270      30
Pooled Variance
Hypothesized Mean Difference  0
df                            298
t Stat                        1.52
P(T<=t) one-tail
t Critical one-tail           1.65

Fail to reject the null hypothesis
Alpha (level of significance) = 0.05
The p-value is greater than alpha (t Stat 1.52 is below t Critical 1.65).
The data do not support the wrinkliness hypothesis.
The null hypothesis cannot be rejected at the 5% significance level.

The first possible reason for failure
Different writing styles among the three tries of the authentic handwriting
First try   Second try   Third try

The second possible reason for failure
Some subjects didn’t forge other subjects’ handwriting carefully
Authentic   Forged

Revised hypothesis test
Remove the authentic samples written in a different style and the poorly forged samples.
Run the hypothesis test again.

Mean equality test output
Alpha (level of significance) = 5%
                              Forged   Authentic
Mean
Variance
Observations                  190      23
Pooled Variance
Hypothesized Mean Difference  0
df                            211
t Stat                        2.06
P(T<=t) one-tail
t Critical one-tail           1.65

Reject the null hypothesis
Alpha (level of significance) = 0.05
The p-value is less than alpha (t Stat 2.06 exceeds t Critical 1.65).
The data support the wrinkliness hypothesis for the filtered samples.
Reject the null hypothesis at the 5% significance level (95% confidence level).

Conclusion
The average writing speed of forged handwriting tends to be slower than that of authentic handwriting.
“Good” (well-formed) forged handwriting tends to be wrinklier (less smooth) than authentic handwriting.

Future Extensions
Redo the study using signatures rather than arbitrary words, since writing a signature is a highly learned, automatic process.
Investigate using different resolutions to improve the estimate of wrinkliness.
Devise pattern recognition algorithms to filter out the “bad” forged samples automatically.
Compute features over portions of the writing rather than over the whole word or signature.

The End