Predictive Analytics in Customs Administration Duncan Cleary Fiscal Affairs Department – Revenue Administration International Monetary Fund WCO IT Conference.


Predictive Analytics in Customs Administration Duncan Cleary Fiscal Affairs Department – Revenue Administration International Monetary Fund WCO IT Conference & Exhibition, 6-8 May 2015, Bahamas

Talk Outline
- What are Predictive Analytics (PA)?
- What are the performance benefits from applying PA in Customs Administration?
- How to apply PA in Customs Administration
  - People, Processes and Technology
- RA-FIT: Revenue Administration’s Fiscal Information Tool – Customs Module

Who Uses Analytics…?

What are Predictive Analytics?
Predictive Analytics - A Definition: The application of the Scientific Method to solve business problems.
- Analysis of current and historic data to make predictions of events of (future) interest
- Uses statistical techniques including data mining and machine learning
- Needs training data to create models and score new cases
- Leverages computing power now available
Most important thing: The Target Definition!
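The idea of using training data to create a model and then score new cases can be sketched in a few lines. The example below is a deliberately simple stand-in for a statistical model, not the method used in the presentation: it learns the historic hit rate per commodity group and uses it as the score. The commodity groups, the data and the function names `train` and `score` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical historic declarations: (commodity group, target), where
# target = 1 means an inspection found a discrepancy.
training = [
    ("textiles", 1), ("textiles", 0), ("textiles", 1), ("textiles", 1),
    ("machinery", 0), ("machinery", 0), ("machinery", 1), ("machinery", 0),
]

def train(cases):
    """Learn the historic target rate for each commodity group."""
    hits, totals = defaultdict(int), defaultdict(int)
    for group, target in cases:
        hits[group] += target
        totals[group] += 1
    overall = sum(hits.values()) / sum(totals.values())
    rates = {group: hits[group] / totals[group] for group in totals}
    return rates, overall

def score(model, group):
    """Score a new case; unseen groups fall back to the overall rate."""
    rates, overall = model
    return rates.get(group, overall)

model = train(training)
print(score(model, "textiles"))   # 0.75
print(score(model, "machinery"))  # 0.25
print(score(model, "footwear"))   # 0.5 (group not seen in training)
```

Everything here depends on how the target is defined: changing what counts as a "discrepancy" changes every rate the model learns.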

What is the Value of Using Analytics?
- Data = Valuable Assets that should be leveraged
- Data Driven Decisions – Objective & Scientific
- Able to handle ‘Big Data’
- ‘Compete with Analytics’
- Complements other risk management methods
- Fewer false positives – leave the ‘good guys’ alone
- Success is measurable
- Robust and scalable
[Diagram labels: Strategic, Tactical, Operational; Goals, Outcomes, Measures, Outputs, Specific Targets]

What is Needed to Apply Analytics?
[Diagram labels: Process, Technology, People; Results: Intel, Rules, Models]

Applying Analytics – People & Processes

Applying Analytics – Technology
- Targets for Analytics in Customs
- Data and Data Quality
- Software & Hardware
- Analytical Methodologies
- Performance and Evaluation

"Shipping containers at Clyde" by Steve Gibson from Airlie Beach, Australia - shipping containers. Licensed under CC BY 2.0 via Wikimedia Commons - Targets…

Opportunities for using Analytics in Customs
Origin - Valuation - Misclassification
- ID theft, anomaly detection, importer risk, manifests, pattern recognition, GIS
- Prevent: Manifests, bills of lading, open source
- Red-Yellow-Green: Predictive models with targets; supplement to risk rules
- Prevent/ Detect: Customs declarations/ profiles
- Predictive models, segmentation/ sector based risk, re-audit programs
- Detect: Post clearance audit/ checks
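A Red-Yellow-Green assignment driven by a predictive model, used as a supplement to existing risk rules, might look like the sketch below. The cut-off values and the choice to send any rule hit straight to red are assumptions made for illustration. The function name `assign_channel` is hypothetical, and the slides do not say how the channels are actually assigned.

```python
def assign_channel(p_score, rule_hit=False, yellow_cutoff=0.3, red_cutoff=0.7):
    """Map a model score (and an optional risk-rule hit) to a clearance channel.

    The cut-offs are illustrative assumptions, not values from the slides.
    """
    if not 0.0 <= p_score <= 1.0:
        raise ValueError("p_score must be a probability between 0 and 1")
    # Assumption: an existing risk rule always takes precedence over the model.
    if rule_hit or p_score >= red_cutoff:
        return "red"
    if p_score >= yellow_cutoff:
        return "yellow"
    return "green"

for p_score, hit in [(0.05, False), (0.45, False), (0.85, False), (0.10, True)]:
    print(p_score, hit, assign_channel(p_score, rule_hit=hit))
```

Keeping the rule check separate from the score makes it easy to compare how many cases each source sends to the red channel.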

Data and Data Quality
Some Data Management Issues
- Governance and Accessibility
- Structure, or lack of it
- Matching records – data integration
- Spreadsheets and unformatted records
- Storage
- Data entry errors – staff and customers
- Misclassification – e.g. goods, sectors
- Completeness/ missing data
- Versions/ Changes
- Outliers/ anomaly detection
- Fraud & intentional or unintentional error
- Timeliness/ Real Time
- Recording results
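Several of these issues (completeness, duplicate records, data entry errors) can be checked mechanically before any modelling starts. The sketch below shows one way to do this, assuming a hypothetical record layout with fields `declaration_id`, `importer_id`, `hs_code` and `declared_value`; none of these names come from the presentation.

```python
REQUIRED = ("declaration_id", "importer_id", "hs_code", "declared_value")

# Hypothetical declaration records with typical quality problems.
records = [
    {"declaration_id": "D1", "importer_id": "I1", "hs_code": "6109", "declared_value": 1200.0},
    {"declaration_id": "D2", "importer_id": "", "hs_code": "6109", "declared_value": 900.0},
    {"declaration_id": "D2", "importer_id": "I2", "hs_code": "8471", "declared_value": 5000.0},
    {"declaration_id": "D3", "importer_id": "I3", "hs_code": None, "declared_value": -50.0},
]

def quality_report(rows):
    """Count missing fields, repeated identifiers and negative values."""
    missing = {field: 0 for field in REQUIRED}
    seen, duplicates, invalid_values = set(), [], []
    for row in rows:
        for field in REQUIRED:
            if row.get(field) in (None, ""):
                missing[field] += 1
        key = row.get("declaration_id")
        if key in seen:
            duplicates.append(key)      # identifier already used by an earlier row
        seen.add(key)
        value = row.get("declared_value")
        if isinstance(value, (int, float)) and value < 0:
            invalid_values.append(key)  # a declared value should not be negative
    return {"missing": missing, "duplicates": duplicates, "invalid_values": invalid_values}

report = quality_report(records)
print(report["missing"])         # importer_id and hs_code each missing once
print(report["duplicates"])      # ['D2']
print(report["invalid_values"])  # ['D3']
```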

Software and Hardware
- Open Source or Off-the-Shelf...?
- Integrated Tax & Customs Systems
- Data Integration – Common Identifier
- Data Quality: both source and end product
- Hardware: Cloud or Stand-alone?

Analytical Methodologies
- Standard Statistical Univariate/ Bivariate Exploratory Techniques
- Unsupervised Techniques: without a target
  - Principal Components Analysis
  - Association Analysis
  - Cluster Analysis/ Segmentation
- Anomaly/ Outlier Detection, sector based approaches
- Text Mining/ ‘Unstructured’ data analytics
- Network Analysis, linkage analysis
- GIS - Geographic Information Systems
- Supervised Techniques: with a target
  - Predictive Models
    - Decision Trees
    - Neural Networks
    - Regression (Linear, Logistic, Stepwise etc.)
    - Ensemble models (combination of those above and others)
- Semi-supervised, rare target data, hybrid approach (more later)...
- Many Others!
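As an illustration of anomaly detection with a sector based approach (a technique that needs no target), the sketch below compares each declaration's unit value with the other declarations under the same HS code. It uses the modified z-score based on the median and the median absolute deviation, with the conventional 0.6745 constant and 3.5 threshold. That statistic is chosen here for illustration; the slides do not name a specific method. The data and the name `flag_outliers` are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical declarations: (declaration id, HS code, declared unit value).
declarations = [
    ("D1", "6109", 4.8), ("D2", "6109", 5.1), ("D3", "6109", 5.0),
    ("D4", "6109", 4.9), ("D5", "6109", 0.6),
    ("D6", "8471", 410.0), ("D7", "8471", 395.0), ("D8", "8471", 402.0),
    ("D9", "8471", 405.0),
]

def flag_outliers(rows, threshold=3.5):
    """Flag declarations whose unit value is unusual within their own HS code."""
    by_sector = defaultdict(list)
    for _, sector, value in rows:
        by_sector[sector].append(value)

    flagged = []
    for decl_id, sector, value in rows:
        values = by_sector[sector]
        med = median(values)
        mad = median(abs(v - med) for v in values)
        if mad == 0:
            continue  # no spread in this sector, so nothing can be judged
        robust_z = 0.6745 * (value - med) / mad
        if abs(robust_z) > threshold:
            flagged.append(decl_id)
    return flagged

print(flag_outliers(declarations))  # ['D5']
```

Grouping by HS code matters: a unit value of 0.6 is only suspicious when compared with similar goods, not with the whole population.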

Applying Predictive Analytics – Process
- Extract data from source
- Review, integrate, transform
- Load ABT (Analytical Base Table) into modelling environment

Input Data: ABT, Training data, Scoring data
SAMPLE: Data partition: Training/ Validation/ Test; Cross validation
EXPLORE: Summary Stats; Visualization; Univariate; Bivariate; Scatterplots; Crosstabs
MODIFY: Transform; Filter cases (rows); Filter variables (columns); Select variables
MODEL: Regression; Decision Trees; Neural Networks; Ensemble
ASSESS: Compare models’ performance using validation and test data, lift charts, residuals/ error, misclassification, ROC charts
SCORE: Use selected model code to score full population ABT of cases with p score, decide on cut off for target cases.
E.g. see Sarma, Kattamuri S., Predictive Modelling with SAS Enterprise Miner: Practical Solutions for Business Applications. Cary, NC: SAS Institute Inc.
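The SAMPLE, MODEL, ASSESS and SCORE steps can be traced end to end in a toy example. The sketch below is not SAS Enterprise Miner code. It uses a synthetic ABT with one hypothetical input (`ratio`, the declared value relative to a reference value), a deterministic interleaved partition so the numbers are reproducible (a real project would normally partition at random), and a one-split decision tree ("stump") as the model. All names and data are assumptions.

```python
# SAMPLE: build a synthetic ABT and partition it 60/20/20.
cases = []
for i in range(100):
    ratio = i / 100                 # hypothetical input variable
    target = 1 if i < 40 else 0     # low ratios are mostly non-compliant
    if i % 17 == 0:                 # a few noisy labels
        target = 1 - target
    cases.append((ratio, target))

train = [c for i, c in enumerate(cases) if i % 5 in (0, 1, 2)]
valid = [c for i, c in enumerate(cases) if i % 5 == 3]
test = [c for i, c in enumerate(cases) if i % 5 == 4]

def error_rate(rows, threshold):
    """Misclassification rate when predicting target = 1 for ratio < threshold."""
    wrong = sum((ratio < threshold) != bool(target) for ratio, target in rows)
    return wrong / len(rows)

# MODEL: pick the split point with the lowest training error.
def fit_stump(rows):
    candidates = sorted({ratio for ratio, _ in rows})
    return min(candidates, key=lambda t: error_rate(rows, t))

def leaf_rates(rows, threshold):
    """Target rate on each side of the split; these serve as the p scores."""
    below = [t for r, t in rows if r < threshold]
    above = [t for r, t in rows if r >= threshold]
    return sum(below) / max(len(below), 1), sum(above) / max(len(above), 1)

threshold = fit_stump(train)
p_below, p_above = leaf_rates(train, threshold)

# ASSESS: compare performance on data the model has not seen.
print("split at", threshold)                              # 0.4
print("validation error", error_rate(valid, threshold))   # 0.05
print("test error", error_rate(test, threshold))          # 0.05

# SCORE: score the full population and apply a cut-off (0.5 is an assumption).
cut_off = 0.5
scored = [(ratio, p_below if ratio < threshold else p_above) for ratio, _ in cases]
flagged = [ratio for ratio, p in scored if p >= cut_off]
print(len(flagged), "cases above cut-off")                 # 40
```

The validation and test partitions play no part in fitting the split, which is what makes their error rates a fair basis for comparing candidate models.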

Cases above cut-off are at least twice as risky. Scored population cases & cut-off points
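One way to check a claim such as "at least twice as risky" is to compute the lift at the chosen cut-off: the hit rate among cases scored above the cut-off divided by the hit rate in the whole scored population. Reading "twice as risky" as a lift of 2 or more is an interpretation, not something the slide states, and the scores and outcomes below are invented.

```python
def lift_at_cutoff(scored, cutoff):
    """Hit rate above the cut-off relative to the overall hit rate.

    scored: list of (p_score, outcome) pairs with outcome 1 for a confirmed hit.
    """
    above = [outcome for p, outcome in scored if p >= cutoff]
    if not above:
        raise ValueError("no cases at or above the cut-off")
    base_rate = sum(outcome for _, outcome in scored) / len(scored)
    if base_rate == 0:
        raise ValueError("no hits in the scored population")
    return (sum(above) / len(above)) / base_rate

# Hypothetical scored cases with known inspection outcomes.
scored_cases = [
    (0.90, 1), (0.80, 1), (0.70, 0), (0.60, 1), (0.40, 0),
    (0.30, 0), (0.20, 0), (0.10, 0), (0.10, 0), (0.05, 0),
]

lift = lift_at_cutoff(scored_cases, cutoff=0.5)
print(round(lift, 2))  # 2.5: hit rate 0.75 above the cut-off vs 0.3 overall
```

Repeating the calculation for a range of cut-offs gives the points of a lift chart.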

Evaluation: Testing, Dashboards & Feedback
- Within Modelling
- Start small and test samples
- Feedback of good quality
- Back Validation

Interaction/ Benefits of Hybrid Methods
Data + Results + New Data = Better Models and Rules
[Diagram labels: Predictive Models; Existing Business Rules; New Risk Rules; Raw Data & Information; Intelligence; Network Analysis]

From micro to macro…
[Diagram labels: Micro, Macro]

Revenue Administration’s Fiscal Information Tool
Purpose and Benefits
- RA-FIT provides the platform for a single international revenue administration (tax and customs) data gathering tool
- Encourages and supports performance measurement in developing countries
- Establishes key baselines and identifies key risk areas for revenue (tax and customs) administration
- Makes aggregated data & analysis available to member countries
- Improves the quality of Technical Assistance delivery
- Customs Module 2015 now live

RA-FIT – Cost of Collections – Customs
Customs respondents provided total annual expenditure information from the 63 countries completing customs operations

Income group                     Total Cost of Collection   Operating Cost %   Cap Ex Cost %
LOW INCOME COUNTRIES             2.60%                      95%                5%
LOWER MIDDLE INCOME COUNTRIES    2.79%                      95%                5%
UPPER MIDDLE INCOME COUNTRIES    4.08%                      83%                17%
HIGH INCOME COUNTRIES            5.53%                      83%                17%
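The arithmetic behind figures of this kind can be sketched as follows, on the assumption that cost of collection is total expenditure divided by revenue collected and that the operating and capital percentages are shares of total expenditure. The slides do not spell out these definitions. The input amounts are invented and chosen only so that the result matches the shape of the first row.

```python
def cost_of_collection(operating_cost, capital_cost, revenue_collected):
    """Return cost of collection and the expenditure split, all in percent.

    Assumed definitions: cost of collection = total expenditure / revenue;
    operating and capital shares are relative to total expenditure.
    """
    if revenue_collected <= 0:
        raise ValueError("revenue_collected must be positive")
    total = operating_cost + capital_cost
    if total <= 0:
        raise ValueError("total expenditure must be positive")
    return {
        "cost_of_collection_pct": 100 * total / revenue_collected,
        "operating_share_pct": 100 * operating_cost / total,
        "capex_share_pct": 100 * capital_cost / total,
    }

# Hypothetical administration, amounts in thousands of a local currency.
result = cost_of_collection(operating_cost=2470, capital_cost=130,
                            revenue_collected=100000)
print(result)  # cost of collection 2.6%, operating 95%, capital 5%
```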

RA-FIT – Cost of Collections – Customs
[Chart: Ratio of cost of collection and customs revenue (17)]

RA-FIT – Customs: Traffic by Channel

RA-FIT – Customs: Release Times on Imports

Full Circle: Potential for getting predictive…
Attributes of tax and customs administrations that have been captured in RA-FIT (hundreds of dimensions) can be correlated with targets of interest…
[Diagram labels: A, B, 1; A, B, 0]
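Correlating an attribute with a binary target can be done with an ordinary Pearson correlation, which for a 0/1 target is the point-biserial correlation. The sketch below uses invented figures: the attribute (share of declarations lodged electronically) and the target (whether an administration meets a release-time benchmark) are hypothetical and are not RA-FIT data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical administrations: attribute value and binary target.
electronic_share = [0.20, 0.40, 0.50, 0.70, 0.90, 0.95]
meets_benchmark = [0, 0, 0, 1, 1, 1]

r = pearson(electronic_share, meets_benchmark)
print(round(r, 2))  # roughly 0.9 for these invented figures
```

With hundreds of attributes, the same calculation can be repeated per attribute to shortlist candidates for a predictive model, bearing in mind that a strong correlation alone does not show cause.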

Thanks for your attention!
Duncan Cleary
Fiscal Affairs Department - Revenue Administration 2
International Monetary Fund
1900 Pennsylvania Ave., N.W. | HQ | Washington, DC
Skype: duncancleary