Special Challenges With Large Data Mining Projects CAS PREDICTIVE MODELING SEMINAR Beth Fitzgerald ISO October 2006.



Agenda
  Project Overview
  Prior to Modeling
  Modeling
  Business Issues

Development of a Model - Project Overview
  Data
  Statistical Tools
  Computer Capacity
  Team Skills
    – Data management
    – Analytical/statistical
    – Technology
    – Business Knowledge

Prior to Modeling
  Formulate the Problem
  Evaluate Possible Data Sources
  Prepare the Data
  Develop Understanding of Modeling Procedures and Diagnostics
  Explore the Data with Simple Modeling Techniques

What percent of a model-building project is data preparation and data management?
  25%
  50%
  75%
  85%

Formulate the Problem
  What problem are you trying to solve?
  What results do you expect to see?
  How will you know if the results are reasonable?

Prepare the Data
  Do quality checks at the level of detail needed for the project
  Understand how to prepare individual variables for use in models
  Be practical about the number of classification categories models can handle
  Decide on truncation and bucketing of continuous variables
  Create new variables
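The truncation and bucketing step can be sketched as follows. This is a minimal illustration only: the variable (building age), the cap of 100 years, and the cut points are hypothetical choices, not values from the presentation.

```python
from bisect import bisect_right


def cap(value, lower, upper):
    """Truncate (cap) a continuous value to the range [lower, upper]."""
    return max(lower, min(value, upper))


def bucket(value, edges, labels):
    """Assign a value to a bucket.

    `edges` are ascending cut points; a value equal to an edge falls in the
    bucket that starts at that edge. There must be one more label than edges.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly len(edges) + 1 labels")
    return labels[bisect_right(edges, value)]


# Hypothetical building ages in years, including one extreme value.
building_ages = [3, 12, 47, 150, 0]

AGE_CAP = 100                      # assumed truncation point
AGE_EDGES = [10, 25, 50]           # assumed cut points
AGE_LABELS = ["0-9", "10-24", "25-49", "50+"]

capped = [cap(age, 0, AGE_CAP) for age in building_ages]
bucketed = [bucket(age, AGE_EDGES, AGE_LABELS) for age in capped]

for raw, c, b in zip(building_ages, capped, bucketed):
    print(f"raw={raw:>3}  capped={c:>3}  bucket={b}")
```

Keeping the cut points in one place makes it easy to try coarser or finer groupings when a model cannot support many categories.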

Develop Understanding of Modeling Procedures and Diagnostics
  Basic modeling training – GLM, Data Mining
  What software is available?
  What software/models work for my data investigation, modeling problem, etc.?
  What computer capacity do I need?
  Learn how to use the software
  Learn how to interpret the diagnostics

Development of a Model
  Analyze historical policy and loss data
    – Policy level detail
    – Location level detail
  Link policy and loss data with external and/or internal data:
    – Specific business risk data – operational, financial
    – Specific location data – demographic, weather
    – Other data – building, vehicle, agency
  Need link between policy detail and other data
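Linking policy, loss, and external data might look like the sketch below. The field names (`policy_id`, `zip`, `pop_density`) and the sample records are hypothetical; real projects would typically do this in a database or data-management tool rather than in plain Python.

```python
from collections import defaultdict

# Hypothetical policy-level records.
policies = [
    {"policy_id": "P1", "zip": "07310", "premium": 1200.0},
    {"policy_id": "P2", "zip": "10001", "premium": 800.0},
    {"policy_id": "P3", "zip": "99999", "premium": 500.0},
]

# Hypothetical claim-level loss records; a policy may have several.
losses = [
    {"policy_id": "P1", "incurred": 300.0},
    {"policy_id": "P1", "incurred": 150.0},
    {"policy_id": "P2", "incurred": 900.0},
]

# Hypothetical external location data keyed by ZIP code.
location_data = {
    "07310": {"pop_density": "high"},
    "10001": {"pop_density": "high"},
}


def link(policies, losses, location_data):
    """Attach total incurred loss and location attributes to each policy."""
    loss_by_policy = defaultdict(float)
    for loss in losses:
        loss_by_policy[loss["policy_id"]] += loss["incurred"]

    linked = []
    for policy in policies:
        row = dict(policy)
        # Policies with no claims get zero loss, not a missing value.
        row["incurred"] = loss_by_policy.get(policy["policy_id"], 0.0)
        # Flag unmatched locations explicitly instead of dropping the policy.
        location = location_data.get(policy["zip"])
        row["location_matched"] = location is not None
        row["pop_density"] = location["pop_density"] if location else None
        linked.append(row)
    return linked


linked = link(policies, losses, location_data)
for row in linked:
    print(row)
```

Keeping an explicit match flag lets the match rate be tracked as a data-quality measure.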

Explore the Data with Simple Modeling Techniques
  Start with a sample of the data
  Try different classical analyses on the sample, such as:
    – regression
    – linear models
    – correlation matrices
  Make use of graphical options to explore the data
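As an illustration of starting with a sample and a correlation matrix, here is a small sketch using only the standard library. The column names and the synthetic rows are hypothetical, and the fixed random seed is an assumption made so the sample is reproducible.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when a column is constant
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return a nested dict of pairwise correlations for named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def sample_rows(rows, k, seed=0):
    """Draw a reproducible simple random sample of up to k rows."""
    rng = random.Random(seed)
    return rng.sample(rows, min(k, len(rows)))


# Synthetic rows: (building_age, insured_value, territory_code).
rows = [(i, 2 * i + 1, (i * 7) % 5) for i in range(50)]
sample = sample_rows(rows, 20)

columns = {
    "building_age": [r[0] for r in sample],
    "insured_value": [r[1] for r in sample],
    "territory_code": [r[2] for r in sample],
}
corr = correlation_matrix(columns)

for name, row in corr.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

Highly correlated predictors found this way are candidates for elimination or combination before fitting more complex models.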

Data Management Issues
  Matching additional internal policy information to premium/loss data
    – Different points in time
    – Tracking & balancing audited exposures
  Different summarization keys
    – Handling of mid-term endorsements
  Address scrubbing
  Matching to external data for the correct point in time
  Significance of missing values within a variable
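Matching external data for the correct point in time can be sketched as an "as of" lookup: for each policy, take the most recent external value that was already in effect on the policy's effective date. The score history and dates below are hypothetical.

```python
from bisect import bisect_right
from datetime import date


def as_of(history, when):
    """Return the value in effect on `when`.

    `history` is a list of (effective_date, value) pairs sorted by date.
    Returns None when no value was in effect yet, so later information is
    never leaked into an earlier policy period.
    """
    dates = [effective for effective, _ in history]
    index = bisect_right(dates, when)
    return history[index - 1][1] if index else None


# Hypothetical history of an external business score for one insured.
score_history = [
    (date(2004, 1, 1), 610),
    (date(2005, 6, 1), 655),
    (date(2006, 3, 1), 700),
]

print(as_of(score_history, date(2005, 7, 15)))   # value in effect mid-2005
print(as_of(score_history, date(2003, 12, 31)))  # nothing in effect yet
```

A `None` result is itself informative: it is a missing value whose frequency and relationship to losses can be examined like any other variable.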

Modeling Activities
  Selection of Predictors – variable elimination, variable transformation
  Start with classical models prior to evaluating more complex models
  Methodology Understanding and Evaluation
  Evaluation of Model Performance

Data Mining Techniques
  Balance good fit with explanatory power
  Generalized Linear Models
  Classification Trees
  Regression Trees
  Multivariate Adaptive Regression Splines
  Neural Networks

Data Mining Process
  Business Knowledge
  Data Linking
  Data Cleansing
  Analyze Variables
  Determine Predictive Variables
  Evaluation
  Data Gathering
  Data Mining

Model Performance
  Lift Curve Analysis
    – Score all risks in sample
    – Rank risks by score from Bad to Good
    – Compare loss ratio of risks in each decile to loss ratio for all risks
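The three lift-curve steps above can be sketched as follows. Two assumptions are made here that the slide does not state: a higher score means a worse expected risk, and deciles contain equal numbers of risks (equal-premium deciles are another common choice). The sample risks are synthetic.

```python
def lift_by_decile(risks, n_groups=10):
    """Compute loss-ratio lift by score group.

    `risks` is a list of (score, premium, loss) tuples. Assumes a higher
    score indicates a worse risk and that groups hold equal risk counts.
    Returns (overall_loss_ratio, rows), with one row per group ordered
    from worst to best.
    """
    if len(risks) < n_groups:
        raise ValueError("need at least one risk per group")

    # Rank risks by score from bad to good.
    ranked = sorted(risks, key=lambda r: r[0], reverse=True)
    overall = sum(r[2] for r in ranked) / sum(r[1] for r in ranked)

    n = len(ranked)
    rows = []
    for g in range(n_groups):
        group = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        premium = sum(r[1] for r in group)
        loss = sum(r[2] for r in group)
        loss_ratio = loss / premium
        rows.append({
            "decile": g + 1,
            "loss_ratio": loss_ratio,
            # Loss ratio of the group relative to all risks.
            "relativity": loss_ratio / overall,
        })
    return overall, rows


# Synthetic sample: 20 risks, equal premium, loss rising with score.
risks = [(score, 100.0, 10.0 * score) for score in range(1, 21)]
overall_lr, table = lift_by_decile(risks)

print(f"overall loss ratio: {overall_lr:.3f}")
for row in table:
    print(f"decile {row['decile']:>2}: "
          f"loss ratio {row['loss_ratio']:.3f}, "
          f"relativity {row['relativity']:.3f}")
```

A model with useful lift shows relativities well above 1 in the worst deciles and well below 1 in the best; a flat pattern near 1 suggests the score does not separate risks.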

Sample Lift Curve Analysis

Business Issues
  Model uses information from a third-party vendor
  Model needs to be accessible electronically
  Technology Issues
  Implementation Decisions

Technology Issues
  Develop/Modify Systems
  Integrate into underwriting/rating workflow
    – Decision process
    – Agency system
  Decide on technology
    – Web-based interface
    – API, FTP, MQ, TCP/IP, HTTPS web services

Implementation of Model
  Solution focus/usage:
    – Suitability of risk for underwriting decision
    – Source for additional pricing factors
    – Consistency in underwriting/pricing decisions
    – Compliance with regulations based on implementation decision
    – Consider model alone or model with other information available from application

Implementation of Model
  Workflows:
  Underwriting
    – New Business
    – Renewal Business
  Rating
    – Pricing
    – Coverage Adjustment

Business Implementation of Model
  Strategic Plan - need management involvement
  Prepare Announcement/Training Material for Internal & External Customers
  Coordinate Implementation
  Monitor Feedback/Adjust Implementation

Future Plans
  Determine Process for Updates to Model
    – Use of Updated Data
    – Use of New Data Variables
    – Use of New Techniques