Subrogation Prediction Through Text Mining and Data Modeling Sergei Ananyan, Ph.D. Megaputer Intelligence www.megaputer.com
Why Subrogate?
- Only a few percent of cases have subrogation potential, but significant amounts of money can be recovered
- Estimate: missed subro opportunities in the USA total about $15 billion annually
- Efficient subrogation helps keep insurance premiums low, providing an extra competitive edge
Challenges of Subrogation
- Overwhelming volume of claims:
  - Over 5 million reported workplace injuries in the USA annually
  - Over 6 million auto insurance claims in the USA annually
- Subrogation opportunities comprise only a few percent of all claims
- Subro decisions involve manual analysis of textual notes in claims
- Thorough investigations can be lengthy and costly
- Missed subrogation opportunities can be even more costly
- Subro decisions should be made soon after the accident: relevant evidence may disappear quickly
Who makes a subro decision?
Traditional Way: Adjusters
Individual adjusters determine subrogation cases
Pros:
- Subro decisions can be made at early stages of claim handling
- Investigation can be conducted on the spot
Cons:
- Subrogation determination is at the bottom of a long list of actions:
  - Verifying coverage
  - Determining compensation
  - Approving payments
  - Reporting
- Adjusters differ in experience: no consistency across the organization
- Either formal rules are lacking, or the rule set is too rigid to determine the subrogation potential of many cases
- Looking for “a needle in a haystack”: opportunities are easily overlooked
Traditional Way: Recovery Teams
Specialized recovery teams determine subrogation opportunities
Pros:
- Highly trained professionals: better determination of opportunities
- Consistency across the organization
Cons:
- Small group of investigators: overloaded with large numbers of claims
- Located remotely: need to coordinate efforts with local adjusters
- Delays in starting investigations
Recovery Teams are Overloaded
Subrogation Prediction Objectives
A perfect solution for subrogation prediction should be:
- Accurate
- Automated
- Objective
- Consistent
- Fast
New Way: Automated Modeling
New predictive modeling tools can identify subro opportunities
They provide many benefits:
- Detect promising new candidate claims for subrogation in a timely manner
- Capture missed opportunities throughout closed cases
- Focus attention of investigators on cases with high potential
- Eliminate wasted time and effort
- Standardize subrogation prediction practice across the enterprise
- Enhance customer satisfaction
Modeling and Text Mining
- Knowledge discovery tools for business users
- Easy-to-understand actionable results
- From data overload to useful knowledge
What is Data Modeling?
- Computer models learn from historical data and predict outcomes of future situations
- Models are developed through training on data with known outcomes
- Training is based on machine learning and statistical algorithms
- The Megaputer solution PolyAnalyst™ for Subrogation Prediction offers a selection of modeling algorithms:
  - Decision Trees
  - Neural Networks
  - CHAID
  - Bayesian Networks
  - Random Forest
- The best model can be selected automatically
- Developed models are used for scoring new data to predict:
  - Probability of subrogation success
  - Potential recovery amount
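The idea of training on claims with known outcomes and then scoring new ones can be sketched in a few lines. This is an illustrative sketch only, not PolyAnalyst's implementation: it uses a hand-written naive Bayes classifier (a deliberately simple stand-in for the algorithms listed on this slide), and the feature names and the six historical examples are invented for the demonstration.

```python
import math
from collections import defaultdict

def train_naive_bayes(examples):
    """Count how often each binary feature occurs per outcome.

    examples: list of (set_of_feature_names, label), where label is True
    for a past claim that was subrogated successfully.
    """
    class_counts = {True: 0, False: 0}
    feature_counts = {True: defaultdict(int), False: defaultdict(int)}
    vocabulary = set()
    for features, label in examples:
        class_counts[label] += 1
        for name in features:
            feature_counts[label][name] += 1
            vocabulary.add(name)
    return class_counts, feature_counts, vocabulary

def subro_probability(model, features):
    """Score a new claim: estimated probability of the positive class."""
    class_counts, feature_counts, vocabulary = model
    total = sum(class_counts.values())
    log_scores = {}
    for label in (True, False):
        # Class prior with add-one smoothing.
        log_p = math.log((class_counts[label] + 1) / (total + 2))
        for name in vocabulary:
            # Smoothed probability that the feature is present given the class.
            p = (feature_counts[label][name] + 1) / (class_counts[label] + 2)
            log_p += math.log(p if name in features else 1 - p)
        log_scores[label] = log_p
    # Normalize the two log scores into a probability.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights[True] / (weights[True] + weights[False])

# Hypothetical historical claims with known outcomes.
history = [
    ({"rear_ended", "third_party_involved"}, True),
    ({"third_party_involved"}, True),
    ({"rear_ended"}, True),
    ({"slipped"}, False),
    ({"slipped", "own_equipment"}, False),
    ({"own_equipment"}, False),
]
model = train_naive_bayes(history)
likely = subro_probability(model, {"rear_ended", "third_party_involved"})
unlikely = subro_probability(model, {"slipped"})
print(f"likely={likely:.2f} unlikely={unlikely:.2f}")
```

With real data the training set would hold many thousands of claims, and a production tool would compare several algorithms on held-out data before choosing one.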
Training and Applying the Model
Model Training:
- Modeling is carried out on data collected from claim forms and notes
- Successful past subrogation cases serve as positive examples
- “No subrogation” cases serve as negative examples
- A model learns combinations of features determining positive cases
- Another model predicts the amount of possible subrogation
- The developed models are stored for future use
Model Application:
- Models are applied to new data to produce scores:
  - Subrogation probability
  - Subrogation amount
- Claims with the highest scores on these two attributes are presented for investigation by a human
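One possible way to turn the two scores into a shortlist for human investigators is shown below. The slide only says that claims scoring highest on both attributes are presented for investigation; combining the scores as probability times amount, the probability floor, and all names and figures here are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class ScoredClaim:
    claim_id: str             # hypothetical identifier
    subro_probability: float  # output of the probability model
    predicted_amount: float   # output of the amount model, in dollars

def rank_for_investigation(claims, min_probability=0.5, top_n=10):
    """Return the top_n claims by expected recovery (assumed ranking rule).

    Claims below min_probability are dropped so that one very large but
    unlikely recovery does not crowd out realistic candidates.
    """
    candidates = [c for c in claims if c.subro_probability >= min_probability]
    candidates.sort(
        key=lambda c: c.subro_probability * c.predicted_amount,
        reverse=True,
    )
    return candidates[:top_n]

# Invented scores for four new claims.
scored = [
    ScoredClaim("C-101", 0.90, 2_000.0),   # expected recovery 1,800
    ScoredClaim("C-102", 0.60, 10_000.0),  # expected recovery 6,000
    ScoredClaim("C-103", 0.20, 50_000.0),  # filtered out: probability too low
    ScoredClaim("C-104", 0.70, 500.0),     # expected recovery 350
]
shortlist = rank_for_investigation(scored, top_n=2)
print([c.claim_id for c in shortlist])
```

Other combinations are equally defensible, for example sorting by probability first and using the amount only to break ties; the right rule depends on investigator capacity and the cost of a pursuit.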
Investigations involve data analysis
- Decision Maker: interactive up-to-date reports
- Data Analyst: visual analytic scenario
Behind the Scenes
Output: Subrogation Prediction
- Probability of subrogation success
- Estimated recovery amount
Data Integration
Data Cleansing
Aggregation – keys and attributes
Aggregation – measures
Derivative Attributes
Complications of Text Analysis
- The need to analyze free text notes further complicates things
- Statistical tools are good at processing structured data, but not text
- Human analysts had to read text notes to extract relevant features
Text Mining Technology
- Text Mining is an automated process of analyzing text to extract information from it for particular purposes
- Text Mining is different from traditional search technology:
  - In search, the user is typically looking for something that is already known and has been written by someone else
  - Text Mining involves pushing aside irrelevant material in order to extract relevant information
- Text Mining extracts relevant features from natural language notes. These features are included in modeling.
Typical Text Mining Tasks
- Categorization
- Feature and entity extraction
- Summarization
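A rough sketch of how features might be pulled out of free-text claim notes so that a model can use them. Real text mining engines rely on linguistic analysis and domain dictionaries; the hand-written regular expressions and feature names below are assumptions chosen only to make the idea concrete.

```python
import re

# Hypothetical pattern dictionary: feature name -> regular expression.
PATTERNS = {
    "rear_ended": re.compile(r"\brear[\s-]?ended\b", re.IGNORECASE),
    "slip_or_trip": re.compile(r"\b(slipped|tripped)\b", re.IGNORECASE),
    "animal_involved": re.compile(r"\bdogs?\b", re.IGNORECASE),
    "struck_by_object": re.compile(r"\bstruck\b", re.IGNORECASE),
}

def extract_features(note):
    """Return the set of feature names whose pattern occurs in the note."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(note)}

notes = [
    "V2 rear ended V1",
    "EMP WAS WALKING BACK TO PKG CAR WHEN 2 DOGS BEGAN TO CHASE HIM, "
    "HE RAN & SLIPPED ON STEPS OF PKG CAR",
    "CLT WAS STRUCK ON HEAD WITH ICE IN THE FREEZER",
]
for note in notes:
    print(sorted(extract_features(note)))
```

Each extracted feature becomes one more column alongside the structured claim fields, which is what lets a statistical model make use of the notes at all.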
Complications of Text Analysis
Typical textual descriptions:
- SLIPPED OFF BACK OFVAN LOADING TOOLS
- PUSHED WHILE CONFRONTING AN ALLEGED SHOPLIFTER
- TRIPPED ON A SHEET OF WIRE MESH & FELL ON PAKRING LOT
- REACHING FOR PAKAGES ON BELT WHEN HE TRIPPED OVER PAKAGES THAT WERE IN FRONT OF BELT AND FELL
- EE WAS CUTTING ONIONS ON THE SLICER AND HE CUT OFF THE TIP OF HIS RIGHT THUM
- CLT WAS STRUCK ON HEAD WITH ICE IN THE FREEZER
- EMP WAS WALKING BACK TO PKG CAR WHEN 2 DOGS BEGAN TO CHASE HIM, HE RAN & SLIPPED ON STEPS OF PKG CAR
- EE WAS USING A BAND SAW TO CUT IRON FOREIGN BODY ENTERED LT EYE
Intelligent Spell-Checking
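A minimal sketch of dictionary-based spelling correction for claim notes, using fuzzy string matching from Python's standard library. It is not the method used by PolyAnalyst; the tiny vocabulary and the similarity cutoff are assumptions, and a real system would use a large domain dictionary and surrounding context.

```python
from difflib import get_close_matches

# Hypothetical domain vocabulary; a real one would hold thousands of terms.
DOMAIN_VOCABULARY = ["PARKING", "PACKAGES", "THUMB", "TRIPPED", "SLIPPED"]

def correct_token(token, vocabulary=DOMAIN_VOCABULARY, cutoff=0.8):
    """Replace a token with its closest vocabulary word, if one is close enough."""
    if token in vocabulary:
        return token
    matches = get_close_matches(token, vocabulary, n=1, cutoff=cutoff)
    # Leave the token untouched when nothing in the vocabulary is similar.
    return matches[0] if matches else token

def correct_note(note):
    """Correct a note word by word; whitespace is normalized to single spaces."""
    return " ".join(correct_token(token) for token in note.split())

print(correct_note("FELL ON PAKRING LOT"))
print(correct_note("REACHING FOR PAKAGES ON BELT"))
print(correct_note("CUT OFF THE TIP OF HIS RIGHT THUM"))
```

Word-by-word matching of this kind cannot repair run-together words such as OFVAN, and it knows nothing about abbreviations like EE or CLT; those need splitting rules and abbreviation dictionaries on top.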
Categorization: V2 rear ended V1 Key points of the claim
Categorization: policy holder arrested Key points of the claim
Domain-specific Dictionaries
Patterns related to Pain
Predicted Subro Probability for a Claim
Predicted Subro Amount for a Claim
PolyAnalyst Subro Prediction Flow
[Flow diagram elements: New claim; Text Mining; Extracted Features; Historical claims data; Modeling; Subrogation Model; Subrogation prediction]
Touch Points for Modeling
First Report of Incident:
- Detect subro opportunities while evidence is still available
- Focus efforts only on claims that have good subro potential
- Perform timely and thorough investigations
Retrospective Analysis of Claims:
- Check closed and still open claims
- Identify missed subro opportunities
- Pursue recovery whenever still possible
First Report of Incident (work comp)
Available data:
- Date
- Injury type
- Body part injured
- Textual description of the incident
Build models based on historical data
Use a pre-built model to score new claims
Retrospective Claims Analysis
Extra data (new):
- Claim notes
- Financial results
- Applicable legislation, arbitration notices, etc.
Build models based on historical examples
Discover missed subrogation opportunities
PolyAnalyst Benefits
- Dramatic time and cost reduction
- Increase in quality and speed of the analysis
- Objective and uniform data-driven analysis
- Discovery of even unexpected issues suggested by data
- Automated monitoring of known problems
- Timely discovery of newly developing issues
- Utilization of 100% of available data: structured and text
- Up-to-date reports for executives
- A solution that is easy to use and maintain
Data and Text Mining in Insurance
- Fraud Detection
- Subrogation Prediction
- Database Marketing:
  - Response Prediction
  - Cross-sell Analysis
  - Market Segmentation
- Text Analysis:
  - Call Center transcripts analysis
  - Survey analysis
  - Competitive intelligence
  - Compliance analysis
Select Customers
- Government
- Insurance
- Financial
- High Tech
- Pharmaceutical
- Marketing
- Manufacturing
Contacting Megaputer
Call or email: (812) 330-0110, info@megaputer.com
120 W Seventh Street, Suite 314
Bloomington, IN 47404 USA