Operations, BI, and Analytics Stefano Grazioli
Critical Thinking Easy meter
Using the SmallBank DB for Business Operations, BI & Analytics
Reading the Data Model. Primary key: a unique identifier used to retrieve the record. (ER diagram; labels: One, Many, manages, has)
Different types of business information needs lead to different queries: business transactions, business intelligence questions, analytics questions
Enrolling a new customer: Bruce Wayne, Gotham, NY
The SQL Insert into Customer (c_id, f_name, l_name, city, state) values (7759, 'Bruce', 'Wayne', 'Gotham', 'NY')
Selling insurance to a customer: Bruce Wayne, Gotham, NY (c_id = 7759); coverage $100K, premium $500
The SQL Insert into insurance_plan (c_id, coverage, premium) values (7759, 100000, 500)
Changing an address: c_id 7759, Bruce Wayne, new address Cville, VA
The SQL Update customer set city = 'Cville', state = 'VA' where c_id = 7759
Granting a new loan: Bruce Wayne (c_id 7759); loan l_id = 1070, $10,000,000 at 5%, due Dec 31, 2020; loan officer Barbara Goodhue (lo_id 16)
The SQL Insert into loan (l_id, principal, rate, date_due, lo_id) values (1070, 10000000, 0.05, '12/31/2020', 16); Insert into customer_in_loan (c_id, l_id) values (7759, 1070)
The previous queries reflect business transactions: directly related to business operations; single customer, single contract, deal, service…; often INSERTs, sometimes UPDATEs; "small" amount of data; large numbers of fast, "simple" queries; "real time".
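As an illustration only, the four transactions above can be replayed against a throwaway in-memory SQLite database. The schema below is an assumption reconstructed from the column names in the slides' SQL (the real SmallBank DB may have more columns and constraints), and the function name `run_transactions` is hypothetical. Parameterized queries are used instead of literal values, which is the usual practice in application code.

```python
import sqlite3

# Assumed minimal schema: only the columns that appear in the slides' SQL.
SCHEMA = """
CREATE TABLE customer (c_id INTEGER PRIMARY KEY, f_name TEXT, l_name TEXT,
                       city TEXT, state TEXT);
CREATE TABLE insurance_plan (c_id INTEGER REFERENCES customer(c_id),
                             coverage REAL, premium REAL);
CREATE TABLE loan (l_id INTEGER PRIMARY KEY, principal REAL, rate REAL,
                   date_due TEXT, lo_id INTEGER);
CREATE TABLE customer_in_loan (c_id INTEGER REFERENCES customer(c_id),
                               l_id INTEGER REFERENCES loan(l_id));
"""


def run_transactions(conn):
    """Replay the four operational statements as small parameterized queries."""
    with conn:  # commits if all statements succeed, rolls back on error
        # Enroll a new customer.
        conn.execute(
            "INSERT INTO customer (c_id, f_name, l_name, city, state) "
            "VALUES (?, ?, ?, ?, ?)",
            (7759, "Bruce", "Wayne", "Gotham", "NY"),
        )
        # Sell an insurance plan ($100K coverage, $500 premium, per the slide text).
        conn.execute(
            "INSERT INTO insurance_plan (c_id, coverage, premium) VALUES (?, ?, ?)",
            (7759, 100000, 500),
        )
        # Change the customer's address.
        conn.execute(
            "UPDATE customer SET city = ?, state = ? WHERE c_id = ?",
            ("Cville", "VA", 7759),
        )
        # Grant a loan and link it to the customer.
        conn.execute(
            "INSERT INTO loan (l_id, principal, rate, date_due, lo_id) "
            "VALUES (?, ?, ?, ?, ?)",
            (1070, 10000000, 0.05, "12/31/2020", 16),
        )
        conn.execute(
            "INSERT INTO customer_in_loan (c_id, l_id) VALUES (?, ?)",
            (7759, 1070),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
run_transactions(conn)

customer_row = conn.execute(
    "SELECT f_name, l_name, city, state FROM customer WHERE c_id = ?", (7759,)
).fetchone()
loan_links = conn.execute("SELECT COUNT(*) FROM customer_in_loan").fetchone()[0]
print(customer_row, loan_links)
```

Each statement touches one customer and a handful of rows, which is what makes this kind of workload fast and "real time".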
Homework Demo
WINIT: What Is New In Technology?
Different types of business information needs lead to different queries: business transactions, business intelligence questions, analytics questions
Finding our insurance plan (IP) exposure by state
The SQL select customer.state, sum(coverage) from customer, insurance_plan where customer.c_id = insurance_plan.c_id group by customer.state
Finding our top three customers
The SQL select top 3 customer.c_id, customer.l_name, sum(loan.principal) from customer, customer_in_loan, loan where customer.c_id = customer_in_loan.c_id and customer_in_loan.l_id = loan.l_id group by customer.c_id, customer.l_name order by sum(loan.principal) desc
Finding the average interest rate by city
The SQL select customer.city, avg(loan.rate) from customer, customer_in_loan, loan where customer.c_id = customer_in_loan.c_id and customer_in_loan.l_id = loan.l_id group by customer.city order by avg(loan.rate) desc
The previous queries generate reports and answer aggregate questions (BI): they relate to decision making more than business operations; aggregate customers, contracts, deals, services…; mostly SELECTs, often joins; larger amount of data; small number of larger, complex queries.
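To try the three BI queries without a database server, here is a minimal sketch using Python's built-in `sqlite3`. The table layout and every sample row are hypothetical, chosen only to make the queries return something. Two dialect differences from the slides are deliberate: SQLite has no `TOP`, so the top-three query uses `LIMIT 3`, and the joins are written with explicit `JOIN ... ON`, which is equivalent to the comma-and-`WHERE` form on the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (c_id INTEGER PRIMARY KEY, l_name TEXT, city TEXT, state TEXT);
CREATE TABLE insurance_plan (c_id INTEGER, coverage REAL, premium REAL);
CREATE TABLE loan (l_id INTEGER PRIMARY KEY, principal REAL, rate REAL);
CREATE TABLE customer_in_loan (c_id INTEGER, l_id INTEGER);

-- Hypothetical sample rows, for illustration only.
INSERT INTO customer VALUES
  (1, 'Wayne', 'Cville', 'VA'), (2, 'Kent', 'Metropolis', 'NY'),
  (3, 'Prince', 'Cville', 'VA'), (4, 'Allen', 'Central City', 'MO');
INSERT INTO insurance_plan VALUES (1, 100000, 500), (2, 250000, 900), (3, 50000, 300);
INSERT INTO loan VALUES (10, 10000000, 0.05), (11, 200000, 0.07),
                        (12, 300000, 0.03), (13, 50000, 0.09);
INSERT INTO customer_in_loan VALUES (1, 10), (2, 11), (3, 12), (4, 13);
""")

# Insurance exposure by state: total coverage per state.
exposure = conn.execute("""
    SELECT customer.state, SUM(insurance_plan.coverage)
    FROM customer
    JOIN insurance_plan ON customer.c_id = insurance_plan.c_id
    GROUP BY customer.state
    ORDER BY customer.state
""").fetchall()

# Top three customers by total loan principal (LIMIT replaces T-SQL's TOP).
top_three = conn.execute("""
    SELECT customer.c_id, customer.l_name, SUM(loan.principal) AS total_principal
    FROM customer
    JOIN customer_in_loan ON customer.c_id = customer_in_loan.c_id
    JOIN loan ON customer_in_loan.l_id = loan.l_id
    GROUP BY customer.c_id, customer.l_name
    ORDER BY total_principal DESC
    LIMIT 3
""").fetchall()

# Average interest rate by city, highest first.
avg_rate_by_city = conn.execute("""
    SELECT customer.city, AVG(loan.rate) AS avg_rate
    FROM customer
    JOIN customer_in_loan ON customer.c_id = customer_in_loan.c_id
    JOIN loan ON customer_in_loan.l_id = loan.l_id
    GROUP BY customer.city
    ORDER BY avg_rate DESC
""").fetchall()

for label, rows in [("exposure", exposure), ("top three", top_three),
                    ("avg rate", avg_rate_by_city)]:
    print(label, rows)
```

Unlike the operational statements, each of these queries reads many rows across several tables and returns a handful of aggregated ones.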
Different types of business information needs lead to different queries: business transactions, business intelligence questions, analytics questions
Assess the correlation between loan rate and loan size
The SQL n/a
Analytics requires more sophisticated statistical tools (typically non-SQL): questions relate to decision making more than business operations; more similar to BI queries than operational queries; SQL provides the input data but is not sufficient; analytics requires additional software (SPSS, SAS, R, Data miner…).
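A small sketch of that division of labor, assuming Python stands in for the statistics package: SQL supplies the (principal, rate) pairs, and the Pearson correlation coefficient is computed outside the database. The `loan` rows are hypothetical and deliberately perfectly linear, and `pearson` is a helper written for this example, not part of any library named on the slide.

```python
import math
import sqlite3


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan (l_id INTEGER PRIMARY KEY, principal REAL, rate REAL)")
# Hypothetical loans: the rate falls steadily as the principal grows.
conn.executemany(
    "INSERT INTO loan VALUES (?, ?, ?)",
    [(1, 100000, 0.08), (2, 200000, 0.07), (3, 300000, 0.06), (4, 400000, 0.05)],
)

# Step 1: SQL provides the input data.
rows = conn.execute("SELECT principal, rate FROM loan").fetchall()
principals = [principal for principal, _ in rows]
rates = [rate for _, rate in rows]

# Step 2: the statistics happen outside SQL.
r = pearson(principals, rates)
print(f"correlation between loan size and rate: {r:.3f}")
```

With real data the coefficient would fall somewhere between -1 and 1; the made-up rows here sit on a straight line, so the result is at the negative extreme.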
The Big Picture…
Diagram labels: Orders, Customers, Products; Transactions / Operations (real time, individual, action); Extract, Clean, Transform, Load; Data Warehouse; Query, Report, Analyze, Visualize; Business intelligence and Analytics (historical, aggregate, decision); Managers & Decision makers; Technical consultants, Data scientists, Business consultants.
Recommended reading: TDWI Smart Companies Report 2003, available at www.tdwi.org
Data warehousing includes two parts: getting data in and getting data out. Getting data in is the hard part: it includes taking data from source systems, transforming the data, and loading it into an integrated data store. Getting data in takes 80% of time and resources and accounts for 50% of unexpected costs. Getting data out is the fun part: it includes the BI tools that casual and power users use to access the data warehouse data. When users use the data, they can deliver value to the organization. The data store in the middle can be an enterprise data warehouse, a data warehouse with dependent data marts, independent data marts, or a federated database environment. Typically, the independent data mart approach is least effective. The focus of today is on designing the data structures for a dependent or independent data mart that is tuned for on-line analytical processing (OLAP).
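The extract, clean, transform, and load steps named above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the two in-memory SQLite databases standing in for the operational system and the warehouse, the sample rows, the `exposure_by_state` summary table, and the four function names. Real pipelines typically use dedicated ETL tooling.

```python
import sqlite3
from collections import defaultdict


def extract(source):
    """Extract: pull raw (state, coverage) pairs from the operational database."""
    return source.execute(
        "SELECT c.state, p.coverage "
        "FROM customer c JOIN insurance_plan p ON c.c_id = p.c_id"
    ).fetchall()


def clean(rows):
    """Clean: normalize state codes and drop rows with missing values."""
    return [
        (state.strip().upper(), coverage)
        for state, coverage in rows
        if state and coverage is not None
    ]


def transform(rows):
    """Transform: aggregate coverage per state, the shape a report needs."""
    totals = defaultdict(float)
    for state, coverage in rows:
        totals[state] += coverage
    return sorted(totals.items())


def load(warehouse, totals):
    """Load: write the aggregated rows into a warehouse summary table."""
    with warehouse:
        warehouse.execute(
            "CREATE TABLE IF NOT EXISTS exposure_by_state "
            "(state TEXT PRIMARY KEY, total_coverage REAL)"
        )
        warehouse.executemany(
            "INSERT OR REPLACE INTO exposure_by_state VALUES (?, ?)", totals
        )


# Hypothetical operational data, including messy values to clean up.
source = sqlite3.connect(":memory:")
source.executescript("""
CREATE TABLE customer (c_id INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE insurance_plan (c_id INTEGER, coverage REAL);
INSERT INTO customer VALUES (1, ' va'), (2, 'VA'), (3, 'NY'), (4, NULL);
INSERT INTO insurance_plan VALUES (1, 100000), (2, 50000), (3, 250000), (4, 75000);
""")

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(clean(extract(source))))

# Getting data out: a BI tool would now query the summary table.
report = warehouse.execute(
    "SELECT state, total_coverage FROM exposure_by_state ORDER BY state"
).fetchall()
print(report)
```

Even in this toy version most of the code is about getting data in; the query at the end, the part users see, is a single statement.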
You do the talking: name, major; learning objectives; things you like about the class; things that can be improved; attitude towards the Tournament.