Data Mining Journal Entries for Fraud Detection: A Pilot Study by Roger S. Debreceny & Glen L. Gray Discussed by Severin Grabski.


Objective
- Explore research issues related to the application of statistical data mining to fraud detection in journal entries
  - Is this important?
  - YES! Most significant frauds are not conducted by the users of the ERP systems; they are done “outside” of these well-controlled systems.
- Was this accomplished?
  - Maybe

Accomplished?
- Used Benford’s Law in examining journal entries
- Statistically significant differences in first digit distributions were found (Chi Square test). Should these be investigated?
  - A 0% difference (Omicron) gives a statistically significant p < [value missing]. What does this tell me?
  - Is a 1% difference between observed and predicted indicative of a problem?
  - Could use Mean Absolute Deviation
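The concern on this slide, that a chi-square test can flag very small deviations once the number of journal entries is large while the Mean Absolute Deviation (MAD) does not grow with sample size, can be illustrated with a minimal sketch. The data here are hypothetical: Benford proportions with half a percentage point moved from digit 1 to digit 9. The function names are illustrative and are not taken from the paper.

```python
import math

# Chi-square critical value for 8 degrees of freedom at the 5% level.
CHI2_CRIT_DF8_05 = 15.507

def benford_first_digit_probs():
    """Expected first-digit proportions under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def chi_square_and_mad(counts):
    """Return (chi-square statistic, mean absolute deviation) for digit counts 1..9."""
    n = sum(counts.values())
    probs = benford_first_digit_probs()
    chi2 = sum((counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in probs.items())
    mad = sum(abs(counts.get(d, 0) / n - p) for d, p in probs.items()) / len(probs)
    return chi2, mad

def shifted_counts(n, shift=0.005):
    """Hypothetical data: Benford proportions with `shift` moved from digit 1 to digit 9."""
    props = benford_first_digit_probs()
    props[1] -= shift
    props[9] += shift
    return {d: n * p for d, p in props.items()}

# Same proportional deviation, two very different sample sizes.
chi_small, mad_small = chi_square_and_mad(shifted_counts(1_000))
chi_large, mad_large = chi_square_and_mad(shifted_counts(1_000_000))
print(f"n=1,000:     chi2={chi_small:.2f}, MAD={mad_small:.5f}")
print(f"n=1,000,000: chi2={chi_large:.2f}, MAD={mad_large:.5f}")
```

With the same proportional deviation, the chi-square statistic scales linearly with the number of entries and crosses the critical value only for the larger sample, while the MAD is identical in both cases. That is one reason a size-independent measure may be easier to interpret for large journal entry populations.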

[Table: Total Dev and MAD by entity. Entities listed: Beta, Chi, ChiEta, ChiNu, ChiPi, Delta, Eta, EtaNu, EtaPi, Nu. Values not preserved in the transcript.]

Benford’s Law & First 5 Firms

Accomplished?
- Identification of “violations” of Benford’s First Digit Law only provides a preliminary indication
  - Nigrini and Mittermaier (1997) recommend using the first digit as an initial test of reasonableness

Other “Benford’s Law” Digit Tests
- Second Digit Test
  - This also only gives a preliminary indication
- First Two Digits Test
  - Provides more direction
- Number Duplication
  - Identify and rank order duplicate numbers
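The three tests listed on this slide can be sketched as follows. This is a rough illustration rather than the procedure used in the paper: the helper names are hypothetical, amounts are assumed to be already rounded to cents, and the expected frequencies are the standard Benford formulas.

```python
import math
from collections import Counter

def first_two_digits(amount):
    """First two significant digits of an amount (e.g. 1234.50 -> 12, 0.05 -> 50)."""
    s = f"{abs(amount):.2f}".replace(".", "").lstrip("0")
    return int(s[:2].ljust(2, "0")) if s else None  # None for a zero amount

def first_two_prob(dd):
    """Benford expected proportion for first-two-digit pair dd in 10..99."""
    return math.log10(1 + 1 / dd)

def second_digit_prob(d2):
    """Benford expected proportion for second digit d2 in 0..9."""
    return sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))

def rank_duplicates(amounts, top=10):
    """Number duplication: amounts that occur more than once, most frequent first."""
    ranked = Counter(amounts).most_common()
    return [(amt, n) for amt, n in ranked if n > 1][:top]

# Hypothetical journal entry amounts.
amounts = [1234.50, 1234.50, 99.00, 0.05, 1234.50, 99.00, 7.25]
observed = Counter(first_two_digits(a) for a in amounts)
print(observed.most_common(3))
print(rank_duplicates(amounts))
```

The first-two-digits test gives 90 cells instead of 9, which is why it tends to point at specific amounts rather than only signalling that something is off.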

Other Benford’s Law Research
- Carslaw (1988) found support for rounding up of income figures using the expected second digit frequencies (more 0s, fewer 9s than expected).
- Thomas (1989), again using second digits, found support for rounding up of income and rounding down of losses.
- Nigrini
  - (1994) used first two digit frequencies to analyze payroll fraud, and
  - (1996) used first two digit frequencies to examine tax compliance

Fourth Digit Test
- Chi Square to test for distributional difference of fourth digit
  - “…distribution of the fourth digit for each organization for all dollar amounts over $999.”
  - Was this the fourth digit from the left or from the right?
  - What if the transaction was for $100,000?
- While statistically significant differences were found, should these be investigated?
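The ambiguity raised on this slide can be made concrete with a small sketch. It assumes the digit is taken from the whole-dollar part of the amount, which is an assumption of this example and not something the paper states; both function names are hypothetical.

```python
def fourth_from_left(amount):
    """Fourth digit counting from the most significant dollar digit."""
    digits = str(int(abs(amount)))
    return int(digits[3]) if len(digits) >= 4 else None

def fourth_from_right(amount):
    """Fourth digit counting from the units position (the thousands digit)."""
    digits = str(int(abs(amount)))
    return int(digits[-4]) if len(digits) >= 4 else None

for amount in (1234.00, 123456.78, 100000.00):
    print(amount, fourth_from_left(amount), fourth_from_right(amount))
```

For a four-digit amount such as $1,234, the fourth digit from the right is the leading digit, which follows the first-digit Benford distribution rather than a near-uniform one. For a six-digit amount the two readings pick out different positions (hundreds versus thousands). The expected distribution in a chi-square test therefore depends on which reading was intended.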

Three Digit Test
- Examined last (three) digits in dollar amounts
  - Used the “top 5” of the last three digit patterns
  - Found that 4 of 29 entities had 30-60% of their transactions consisting of the top 5 last three digit patterns
- Would be interesting to note if these were the entities that “failed” Benford’s Law
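A minimal sketch of the “top 5” share described on this slide, assuming the last three digits are read from the amount expressed in cents (so $1,234.56 gives 456). The paper may define the pattern differently, and the sample amounts are hypothetical.

```python
from collections import Counter

def last_three_digits(amount):
    """Last three digits of the amount in cents, as a zero-padded string."""
    cents = round(abs(amount) * 100)
    return f"{cents:03d}"[-3:]

def top5_share(amounts):
    """Share of transactions whose last-three-digit pattern is among the five most common."""
    patterns = Counter(last_three_digits(a) for a in amounts)
    top5 = sum(count for _pattern, count in patterns.most_common(5))
    return top5 / len(amounts)

# Hypothetical entity with many round and .99 amounts.
amounts = [100.00] * 5 + [19.99] * 3 + [1.23, 4.56, 7.89, 0.12]
print(f"top-5 share: {top5_share(amounts):.1%}")
```

Comparing this share across entities, and against the entities flagged by the first-digit test, would address the question in the last bullet.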

Data Mining J/E Questions
- Would have liked a more reasoned/theoretical approach in specifying where and why data mining techniques should be applied
- Sources of J/E?
  - Influence on data mining
- Unusual patterns between classes of J/Es?
- Does the class of J/E influence the nature of the J/E (i.e., do any types of J/E have a higher probability of fraud)?
- Evidence from Benford’s Law or right-most digits?
- Underlying issues that will guide effective and efficient data mining of J/Es

Descriptive Statistics
- Any way to group the firms by industry?
- What can be found based upon grouping and analyzing by size?

Other Questions
- What other approaches (than Benford’s Law) can be applied to mining journal entries?
- What is currently done by audit teams for computerized analysis of journal entries?
- The analysis expects to see a “large enough” number of journal entries in order to highlight that fraud might be occurring.
  - What if only a few J/Es are made?
  - What is the sensitivity of this approach?

Confusion
- Number of organizations?
  - 36 organizations, minus 8 data sets with less than 1 year of data, minus 1 incomplete data set, leaves 27. Why 29 observations?
- Did you count each year for the 2 organizations that provided 2 years of data as separate observations?
  - What is the justification?
  - Why not do a year-to-year comparison for those organizations?

What’s Missing?
- Interpretation and more detailed analysis of the data
  - Know that there are “violations” but never know if there is really fraudulent activity
- What are the other data mining techniques that are planned?
- Analytical reasoning as to what tests should be done or what is revealed by certain tests

Data Mining Extensions
- Compare the entities with “larger” average line items per journal entry (e.g. >10) in one pool?
- Alternatively, look at those in which the maximum number of line items is large (e.g. >100)
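The pooling suggested on this slide could be sketched as below. The record layout (one `(entity, journal_entry_id)` pair per line item) and the entity labels A, B, and C are hypothetical; the cutoffs of 10 and 100 are the examples given on the slide.

```python
from collections import Counter, defaultdict

def line_item_stats(records):
    """Map entity -> (average, maximum) line items per journal entry.

    `records` holds one (entity, journal_entry_id) pair per line item.
    """
    lines_per_je = Counter(records)
    per_entity = defaultdict(list)
    for (entity, _je_id), n_lines in lines_per_je.items():
        per_entity[entity].append(n_lines)
    return {e: (sum(c) / len(c), max(c)) for e, c in per_entity.items()}

def pool_entities(stats, avg_cutoff=10, max_cutoff=100):
    """Return (entities with a large average, entities with a large maximum)."""
    large_avg = sorted(e for e, (avg, _mx) in stats.items() if avg > avg_cutoff)
    large_max = sorted(e for e, (_avg, mx) in stats.items() if mx > max_cutoff)
    return large_avg, large_max

# Hypothetical line items for three entities.
records = (
    [("A", "je1")] * 2 + [("A", "je2")] * 4
    + [("B", "je1")] * 150 + [("B", "je2")] * 2
    + [("C", "je1")] * 12
)
stats = line_item_stats(records)
large_avg, large_max = pool_entities(stats)
print(large_avg, large_max)
```

The digit tests could then be run separately on each pool to see whether entities with unusually long journal entries behave differently.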

Summary
- Objective: explore research issues related to the application of statistical data mining to fraud detection in journal entries
- Good first step, and this is a pilot study
- Would like more theoretical motivation for tests & research issues
- Would have liked more data analysis
- Could I apply this in an audit? I’m not sure; more research is needed

Thank You