Benford’s Law Keyang He Probability & Statistics
History 1881: Simon Newcomb noticed that the early pages of log table books were more grubby than the later pages
History If the first digit is d, then the probability that it occurs is log10(1 + 1/d).
First Digit | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Probability | 30% | 18% | 12% | 10% | 8% | 7% | 6% | 5% | <5%
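The formula on this slide can be checked with a few lines of Python. This is a minimal sketch; the function name `benford_probability` and the `table` dictionary are illustrative names, not part of the original slides.

```python
import math

def benford_probability(d: int) -> float:
    """Probability that the first significant digit equals d (1 to 9)."""
    if not 1 <= d <= 9:
        raise ValueError("first digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Expected share of each leading digit under Benford's Law.
table = {d: benford_probability(d) for d in range(1, 10)}

for d, p in table.items():
    print(f"{d}: {p:.1%}")
```

The nine probabilities add up to 1, because the product of the terms (1 + 1/d) for d = 1 to 9 telescopes to 10 and log10(10) = 1.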
History 1938: Physicist Frank Benford rediscovered Newcomb’s formula: log10(1 + 1/d)
History 1995: Benford's Law unquestionably applies to many real-world situations, but a satisfactory explanation came only with the work of Theodore Hill.
History 1992: Mark Nigrini published a thesis noting that Benford’s Law could be used to detect fraud.
Caution Because human choices are not random, invented numbers are unlikely to follow Benford’s Law. When people invent numbers, their digit patterns make the data set appear unnatural.
Types of Data That Conform
When Benford Analysis Is Likely Used | Examples
Sets of numbers that result from mathematical combination of numbers | Accounts receivable (number sold × price), accounts payable (number bought × price)
Transaction-level data | Disbursements, sales, expenses
Large data sets | Full year’s transactions
Accounts | Most sets of accounting numbers
Types of Data That Do Not Conform
When Benford Analysis Is Not Likely Used | Examples
Data set is comprised of assigned numbers | Check numbers, zip codes
Numbers that are influenced by human thought | Prices set at psychological thresholds ($1.99, $499)
Accounts with a large number of firm-specific numbers | An account specifically set up to record $100 refunds
Accounts with a built-in minimum or maximum | Set of numbers that must meet a threshold to be recorded
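The contrast between conforming and non-conforming data can be sketched as a first-digit test. The code below is an illustration under stated assumptions: both data sets are hypothetical, the helper names `first_digit` and `benford_chi_square` are made up for this sketch, and the chi-square statistic is one common way to measure the gap between observed and expected digit counts, not a method taken from the slides.

```python
import math
from collections import Counter

def first_digit(x) -> int:
    """First significant digit of a nonzero number."""
    # Drop the sign, then strip leading zeros and the decimal point.
    return int(str(abs(x)).lstrip("0.")[0])

def benford_chi_square(values) -> float:
    """Chi-square distance between observed first digits and Benford's Law."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    n = len(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        chi2 += (counts[d] - expected) ** 2 / expected
    return chi2

# Hypothetical data sets, chosen only for illustration.
multiplicative = [2 ** k for k in range(1, 201)]  # repeated multiplication
assigned = list(range(1000, 1200))                # sequential check numbers

# Chi-square critical value at the 5% level with 8 degrees of freedom.
CRITICAL_5PCT = 15.51

for name, data in [("powers of 2", multiplicative), ("check numbers", assigned)]:
    stat = benford_chi_square(data)
    verdict = "consistent with" if stat < CRITICAL_5PCT else "deviates from"
    print(f"{name}: chi-square = {stat:.1f}, {verdict} Benford's Law")
```

A statistic above the critical value only flags a data set for a closer look. As the table above shows, assigned numbers fail the test for entirely innocent reasons.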
Summary Benford’s Law provides a data analysis method that can help alert us to possible errors, biases, potential fraud, costly processing inefficiencies or other irregularities.
Resources html Issues/2011/Volume-3/Pages/Understanding-and-Applying-Benfords-Law.aspx