Ensuring big data is supporting financial analytics
– Gaining a thorough understanding of big data in order to understand analytics
– Transforming unstructured data into structured intelligence
– Using big data to predict client behaviour
Bhavani Pacific Brands
Agenda
– Big data
– Unstructured data
– Framework for embedding financial analytics: data analysis leading to decisions that impact company financials
Big Data
“Big data” is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. -- Gartner Inc
– Volume: number of entries (rows), NOT attributes (columns)
– Velocity: rate of change, usually the rate of arrival of rows; impacts volume
– Variety: attributes (columns) of entries
  – Structured:
    Transaction data: value, timestamp, type, location, …
    Customer data: gender, age, occupation, …
  – Unstructured:
    Text: customer interactions, blogs and tweets about the company, …
    Image: customer signature, photo, …
Predictive Analytics to Make Decisions
Example: predict fraudulent credit card transactions
1. Learn a model from historical data
   – Known categories: classification
   – No known categories: clustering, then classification
   – Back-end process, time consuming
   – Built from sampled data, so volume is NOT an issue
2. Categorise new data to make decisions
   – Models are simple, so this is a quick process
   – Velocity and volume are NOT an issue
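The two steps on this slide can be sketched in a few lines of Python. This is a minimal illustration only: the transactions, the two features (amount and hour of day) and the nearest-centroid model are all assumptions chosen for brevity, not the method used in the talk.

```python
from statistics import mean

# Hypothetical sampled history: ((amount in dollars, hour of day), known category).
# All values are invented for illustration.
HISTORY = [
    ((25.0, 14), "genuine"),
    ((40.0, 10), "genuine"),
    ((18.5, 19), "genuine"),
    ((950.0, 3), "fraud"),
    ((1200.0, 2), "fraud"),
    ((800.0, 4), "fraud"),
]


def learn_model(history):
    """Step 1 (back-end): learn one centroid per known category."""
    rows_by_label = {}
    for features, label in history:
        rows_by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in rows_by_label.items()
    }


def categorise(model, features):
    """Step 2 (quick): assign a new transaction to the nearest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


model = learn_model(HISTORY)
print(categorise(model, (1000.0, 3)))  # fraud
print(categorise(model, (30.0, 12)))   # genuine
```

The model is learned once from a small sample, and scoring a new transaction is a single distance comparison per category, which is why neither volume nor velocity is the bottleneck here. The features are left unscaled, so the amount dominates the distance; a real model would normalise them.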
Impact of Variety on Analytics
Industry | Structured Data | Unstructured Data | Analytics Maturity
Finance | Transactions; Customer & Product | Call centre; Twitter/blogs/complaints |
Wholesale / Retail | Inventory, sales, POS; Customer & Product | Sourcing & supply chain; Twitter/blogs/complaints |
Utilities (gas, electricity, …) | Usage data; Customer & Product | Sensor data; Twitter/blogs/complaints |
Telecom | Call records; Customer & Product | Call centre; Twitter/blogs/complaints; Inbound s/SMS; GPS data |
Web services | Usage; Product & subscriber | Call centre; Twitter/blogs/complaints; Inbound s; Click stream transactions |
Analytics Maturity: Benchmark
Structuring Unstructured Data
1. Sentiment Analysis
   – Get freeform source text
   – Load sentiment word list: love, best, … (positive); hate, worst, … (negative)
   – Score sentiment: +1 for positive words, -1 for negative words
   – Normalise if needed
   Use in business:
   – For model building
   – Competitor comparisons
   – Mood change over time
   – Addressing negative scores
2. Topic Detection
   – Get freeform source text
   – Load stop words list: the, on, of, a, an, by, …
   – Create clusters with term frequency matrix
   – Determine clusters of interest and label (business input)
   – Learn cluster models for classification
   Use in business:
   – For model building
   – Major customer issues
   – Impact of initiatives
Sentiment Analysis Output
– No sarcasm detection
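The word-list scoring described for sentiment analysis can be sketched as follows. The word lists are a tiny assumed sample ("love", "best", "hate" and "worst" come from the slide, the rest are added for illustration), and normalising by word count is one possible reading of "normalise if needed".

```python
import re

# Tiny illustrative word lists; a production list would be far longer.
POSITIVE = {"love", "best", "great", "good"}
NEGATIVE = {"hate", "worst", "bad", "poor"}


def sentiment_score(text, normalise=False):
    """Score +1 per positive word and -1 per negative word."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum((word in POSITIVE) - (word in NEGATIVE) for word in words)
    if normalise and words:
        return score / len(words)  # assumed normalisation: score per word
    return score


print(sentiment_score("I love this brand, best purchase ever"))         # 2
print(sentiment_score("Worst service ever, I hate the queue"))          # -2
print(sentiment_score("Oh great, I just love waiting an hour on hold"))  # 2
```

The last sentence is sarcastic but still scores +2, which illustrates the "no sarcasm detection" limitation of a pure word-count approach.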
Topic Detection Output
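The topic detection steps (remove stop words, build term frequencies, cluster, then label) can be sketched as below. The complaint texts are invented, the stop word list extends the slide's examples with a few assumed extras, and the greedy cosine-similarity clustering is a simple stand-in for whatever clustering method was actually used. The top terms per cluster are only a starting point; choosing and naming the clusters of interest remains a business input.

```python
import math
import re
from collections import Counter

# "the, on, of, a, an, by" come from the slide; the rest are assumed additions.
STOP_WORDS = {"the", "on", "of", "a", "an", "by",
              "and", "was", "is", "for", "too", "again"}

# Hypothetical customer complaints, invented for illustration.
DOCS = [
    "The delivery was late and the parcel arrived damaged",
    "Late delivery again, parcel left on the street",
    "The price of the shirt is too high",
    "High price for a basic shirt",
]


def term_frequencies(text):
    """One row of the term frequency matrix, stored sparsely as a Counter."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(word for word in words if word not in STOP_WORDS)


def cosine(a, b):
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster(docs, threshold=0.3):
    """Greedy clustering: join the most similar cluster or start a new one."""
    clusters = []
    for index, text in enumerate(docs):
        tf = term_frequencies(text)
        best = max(clusters, key=lambda c: cosine(tf, c["terms"]), default=None)
        if best is not None and cosine(tf, best["terms"]) >= threshold:
            best["members"].append(index)
            best["terms"] += tf
        else:
            clusters.append({"members": [index], "terms": Counter(tf)})
    return clusters


def top_terms(cluster_info, n=3):
    """Most frequent terms, offered to the business as candidate labels."""
    ranked = sorted(cluster_info["terms"].items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


clusters = cluster(DOCS)
for c in clusters:
    print(c["members"], top_terms(c))
```

On this toy data the delivery complaints and the pricing complaints fall into separate clusters, which a business user might label "late delivery" and "pricing".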
Impact of Variety on Analytics
– Variety is NOT an issue
– Financial analytics is NOT impacted by big data
– Do BIG analytics with enough data!
Framework for Embedding Analytics
Stages: Decision, Insight, Data, Pilot, Deploy, supported by the Right People
Example initiative: $ Decision Support Initiative for Wholesale Sales
Decision
– Big $ impact
– Many people
– Timely
Insight
– Just the facts needed for decision making
– Prioritise entries
– Support actioning
– Wholesale sales example: sell to whom, what & $$, ordered by $$
Example of Additional Support for Actioning
[Plot: Sell rate vs Consumer Demand; axis labels Demand, Sell Rate, In-stock %; points R1, R2]
– Each point is a store
– R1 & R2 are comparable retailers
– Values are for the same product
Possible reasons for the difference:
– Competing product at R2
– Pricing at R2 vs R1
– Lack of stock at R2
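A comparison like the one in this plot can be approximated numerically. The sketch below is only an assumption about how such a check might look: the store rows, the demand index and the "sell rate per unit of demand" ratio are all hypothetical, and the in-stock average covers only one of the three possible reasons listed.

```python
from statistics import mean

# Hypothetical store-level rows for one product:
# (retailer, demand index, sell rate, in-stock %). Values are invented.
STORES = [
    ("R1", 100, 9.0, 96),
    ("R1", 80, 7.2, 95),
    ("R2", 100, 5.0, 70),
    ("R2", 80, 4.4, 72),
]


def summarise(stores):
    """Average sell rate per unit of demand and in-stock % for each retailer."""
    summary = {}
    for retailer in sorted({row[0] for row in stores}):
        rows = [row for row in stores if row[0] == retailer]
        summary[retailer] = {
            "sell_per_demand": mean(row[2] / row[1] for row in rows),
            "in_stock_pct": mean(row[3] for row in rows),
        }
    return summary


summary = summarise(STORES)
for retailer, stats in summary.items():
    print(retailer, round(stats["sell_per_demand"], 4), stats["in_stock_pct"])
```

In this made-up data R2 sells less per unit of demand than R1 and also has a lower in-stock percentage, which would point to lack of stock as the first reason to investigate. Competing products and pricing would need separate data.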
Framework for Embedding Analytics
Example initiative: $ Decision Support Initiative for Wholesale Sales
Decision
– Big $ impact
– Many people
– Timely
Insight
– Just the facts needed for decision making
– Prioritise entries
– Support actioning
– Wholesale sales example: sell to whom, what & $$, ordered by $$
Data
– Automated feed
– Objective
– Most specific
– Wholesale sales example: complete POS feed from retailers, SKU & store master
Pilot
– Pick champions early in the process
– Develop pilot
– Validate outputs from pilot
– Iterate pilot with champions until it is accepted
Deploy
– Automate
– Helpdesk
– Training
Right People
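The "sell to whom, what & $$, ordered by $$" insight amounts to a ranked list. A minimal sketch, assuming a hypothetical `Opportunity` record whose fields and values are invented for illustration:

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    retailer: str   # sell to whom
    sku: str        # sell what
    dollars: float  # estimated $ value (hypothetical figures)


def prioritise(opportunities, top_n=None):
    """Order entries by $ value, largest first, optionally keeping the top few."""
    ranked = sorted(opportunities, key=lambda o: o.dollars, reverse=True)
    return ranked if top_n is None else ranked[:top_n]


OPPORTUNITIES = [
    Opportunity("R1", "SKU-A", 12000.0),
    Opportunity("R2", "SKU-B", 45000.0),
    Opportunity("R2", "SKU-A", 8000.0),
]

for opportunity in prioritise(OPPORTUNITIES, top_n=2):
    print(opportunity.retailer, opportunity.sku, opportunity.dollars)
```

Keeping the output to just these fields, largest value first, matches the slide's point that the insight should carry only the facts needed for the decision.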
Conclusion
– Big data is not an impediment to analytics
– Unstructured data can be structured using sentiment analysis & topic detection
– The key success factor for doing BIG analytics is having the right people to choose the
  – Right decisions
  – Right insights
  – Right data
  – Right process to pilot & deploy
Questions?