Published by Mitchell Nichols. Modified over 9 years ago.
1
Fall 2015 BUSA 3110 - Statistics for Business Set 1: Orientation and Context Library Readings Chapters 24 and 1.1 (Current Role of Statistics) Kim I. Melton, Ph.D.
2
Module 1, Day 1: Orientation
Logistics
Who are we: Roll and Introduction
Resources: Physical and Virtual
Location(s): Physical and Virtual
Expectations
Course Description
Content and Organization
Homework
3
Logistics Roll Brief Syllabus (note HW assignment at the end) Homework for the first two weeks posted at: faculty.ung.edu/kmelton/busa3110.html During the second week, all of my sections will be combined in D2L under Section E (There will be no change to your official Banner registration.) After the second week a master list of all assignments will be posted in D2L. 3
4
4 System of Profound Knowledge About me
5
Materials That You Need to Purchase
Custom packet (available at Dahlonega and Gainesville Bookstores)
Includes:
  Selected chapters from Business Statistics, 3rd Edition by Sharpe, De Veaux, and Velleman
  Access to MyStatLab with:
    Homework management system
    Chapter quizzes
    Complete text of Business Statistics, 3rd Edition
    Answers to odd problems
    Data for problems in the text
    And more
6
Setting up MyStatLab.com
Ignore: I will not be linking MyStatLab to D2L this semester.
Ignore (for now): You will be able to use temporary access until Drop/Add ends.
7
Other Support Materials
D2L
Software.ung.edu
UNG Library
All of my sections will be combined into Section E after Drop/Add.
8
Software for Calculations
JMP
Pros:
  Available for Windows and Macs (born on the Mac)
  Statistical software with an interface that is similar to a spreadsheet
  Dynamically linked analysis
  Graphics first; computations follow
  Wide array of statistical analyses available in a menu-driven approach
Cons:
  Not as widely available
  Less familiar
  Requires new thought processes
Excel
Pros:
  Spreadsheet software that is widely available
  Familiarity
  Quick basic insight
Cons:
  The Analysis ToolPak, needed for statistical analysis beyond the most basic, is not available on all platforms (e.g., Mac and Office 365)
  Analyses are not linked (especially graphical and inferential)
  Requires the user to have more statistical knowledge to use effectively and appropriately
  Limited in terms of analyses available
Think “and” not “or”
9
Learning Expectations for Class
Attendance
  Arrive on time
  Stay the entire time
Preparation
  Spend time before class reading (text and homework)
  Take notes while in class
  Work homework after class
Professionalism
  Take responsibility for learning
  Believe you can learn statistics
  Ask questions
  Try to answer questions
  Seek help EARLY when you are struggling
  Be ethical
  Put phones away
INVEST
10
What is Statistics (as a field of study)?
Videos:
  BA and Optimization [2010] (1:15 minutes)
  Think (1 minute)
  Turning Data into Insight (3:26 minutes)
How (and why) is the field of Statistics changing?
Source: http://www.datasciencecentral.com/profiles/blogs/data-veracity
11
Data/Information/Knowledge/Wisdom
Doing things right (Efficiency)
Doing the right things (Effectiveness)
DATA: Symbols (raw values) that represent properties of objects/events
INFORMATION: Describes; provides answers to who, what, where, and when questions
KNOWLEDGE/UNDERSTANDING: Explains; provides answers to how-to and why questions
WISDOM: Evaluates knowledge/understanding; deals with values; uses judgment; answers what is best and why
Based on the work of Russell Ackoff. See “From Data to Wisdom” in Ackoff’s Best, pp. 170-174, 1999.
12
The Course 12 Applied Statistics Uses data From situations where variation exists As input to quantitative models That provide information To help develop the knowledge For making decisions That inform action For use in a practical setting Where theoretical assumptions may not apply perfectly and results and limitations need to be communicated in the language of the situation
13
Models (and theories) must be viewed in a context 13
14
Course Description
A second course in statistical methods with special orientation to applications in business. Emphasis will be placed on application of statistical techniques, assessing their appropriateness, and communicating results to various audiences. Topics include: data collection, sampling, data visualization, data analysis, model building using regression, and other statistical techniques. Statistical software is used extensively in the course. This course should be taken as soon as the prerequisite is satisfied.
Prerequisite: MATH 2400 with a grade of C or higher.
15
Learning Outcomes (Course Level)
Upon completion of this course, students should be able to:
  select appropriate statistical methods to guide decision-making
  generate and use statistical output to analyze data
  identify the limitations of the statistical methods covered
  communicate how statistical studies were conducted and the results of those studies
  recognize ethical issues related to the collection and analysis of data and the communication of the results of the analysis
16
Day 1 Homework
Details of your homework assignment (to be completed prior to coming to class on Wednesday) are posted at faculty.ung.edu/kmelton/busa3110.html
This assignment includes:
  reading the entire syllabus
  accessing and reading an article available through the UNG library
  bringing written material with you to class
  obtaining your book and access to MyStatLab
NOTE: We meet in the lab next class period (Dahlonega: NOC 109; Gainesville: MTN 5100)
17
Day 2 Syllabus questions Analytics 3.0 Article Analytics 1.0, 2.0, and 3.0 Role that statistics plays in each And how the field of statistics has changed (and continues to change) 17
18
Syllabus Text, MyStatLab, other resources Accessing material D2L and MyStatLab Software: JMP and MS Office Course Format Grading General expectations (especially deadlines, make-ups, extra credit, academic integrity, phones) 18
19
Content
Eight “Modules” (Sets of Slides)
1. Orientation and Context
2. Basics of Data Collection
3. Data Visualization and Quantitative Summary
4. Data Collection in Practice
5. Inference Involving One Variable
6. Simple Linear Regression
7. Multiple Regression Basics
8. Model Building
Data, Information, Knowledge, Wisdom
20
Format
Each Module will include:
  Reading material (from the text and/or supplemental assignments)
  Daily graded homework (approx. 30 minutes in MyStatLab)
  Chapter quizzes (approx. 15 minutes in MyStatLab)
Most Modules will include:
  A short video related to the topic
  Computer assignments using JMP
  In-class or out-of-class graded assignments developed by the instructor
Monday classes meet in the traditional classroom (NOC 016 or MTN 3214). Wednesday classes meet in the computer lab (NOC 109 or MTN 5100).
21
Grading
MyStatLab Homework (16 points)
MyStatLab Quizzes (16 points)
Instructor Supplied Assignments (66 points)
Preparation and Participation (10 points)
Final (0-12 points)
Grade scale:
  90 and above: A
  80 – 89: B
  70 – 79: C
  60 – 69: D
  Below 60: F
22
General Expectations Learning is not a divided responsibility (I teach, you learn)—learning is a joint responsibility (we learn together) My “hot buttons” Timeliness Ethical behavior Professional orientation toward learning Recognition that “true” learning involves more than getting the right answer. 22
23
23 Analytics
24
What role does statistics play in the developing field of Analytics? OR How does Analytics imply a need for increased understanding of statistics? 24
25
Impact of Analytics on the Way We Think about ___ (Statistics?)
Evolution vs. revolution
Improvement vs. innovation
1st-order change vs. 2nd-order change: change in how we do something (1st order) vs. change in what we do (2nd order)
Paradigm shift…makes us go back to the most basic assumptions
26
The Changing Face of Statistics
Methods
  Calculations and Graphs
  Calculators and Tables
  Mainframe Computers and Specialized Software
  Personal Computers and Spreadsheets
  Personal Computers and Specialized Software
Users/Creators
  From: Limited individuals with specialized training/education
  To: Democratization of Statistics
Major Areas of Focus
  Descriptive (no attempt to go beyond the current data set)
  Inferential (uses a sample to draw conclusions about a population)
  Process oriented (uses past data collected over time to make decisions for the future)
  Big Data and Analytics
    Predictive
    Prescriptive
27
Your List? (Role of Statistics in Analytics)
1. ___     7. ___
2. ___     8. ___
3. ___     9. ___
4. ___    10. ___
5. ___    11. ___
6. ___    12. ___
28
Day 2 Homework Posted at: faculty.ung.edu/kmelton/BUSA3110.html Includes: Reading assignment MyStatLab Homework One due Sunday, 8/23/15 11:30pm One available (due Wed., 8/26/15 7:30am) 28
29
Week 2 (Chapters 24 and 1.1 and Intro to JMP)
CONTEXT
Why start with the LAST chapter in the book?
This is a second course in statistics. This chapter lets you reflect on the tools/techniques from the first course…and sets the stage for this course.
30
Copyright © 2015 Pearson Education. All rights reserved.
24.1 The Big Data Revolution
Data mining is the process that uses a variety of data analysis tools to discover patterns and relationships in data to help build useful models and make predictions.
The more general term business analytics (or sometimes simply analytics) describes any use of statistical analysis to drive business decisions from data, whether the purpose is predictive or simply descriptive.
Big data is a term for data sets so large and complex that it becomes difficult to use traditional methods to capture, store, visualize, and analyze them.
31
24.1 The Big Data Revolution
Most of the models used in analytics and data mining are based on the regression ideas that you’ve studied.
The skills you’ve acquired by learning to fit models, diagnose them, and assess their limitations already prepare you to start entering the world of big data.
32
24.3 The Goals of Data Mining
Differences in Data Mining compared to Traditional Statistical Analysis:
  The size of the databases. Data mining examples are enormous compared to traditional analysis.
  The exploratory nature of data mining. The outcome of data mining is a predictive model, not a hypothesis test or a confidence interval.
  The data are “happenstance”. Data used in data mining have not been collected in a systematic way (designed experiment or survey).
33
24.8 The Data Mining Process
Data mining should be a team effort, because projects require a number of different skills.
A group of data mining experts has created the following graphic, called the Cross-Industry Standard Process for Data Mining:
34
24.8 The Data Mining Process
The process must start with the Business Understanding phase.
Data Understanding is central to the entire data mining project: it is crucial to understand the data warehouse, what it contains, and what limitations are present.
Once variables are selected and the response variable has been agreed upon, the Data Preparation phase begins.
Following preparation is the Data Modeling phase. The more knowledge of the data and the variables that goes into the model, the higher the chances of success for the entire project.
Finally, if the model seems to give business insight, it’s time for the Deployment phase. Just keep in mind that the business environment changes rapidly, so models can become stale quickly.
35
24.4 Data Mining Myths
Myth 1: Find answers to unasked questions.
Myth 2: Automatically monitor a database for interesting patterns.
Myth 3: Eliminate the need to understand the business.
Myth 4: Eliminate the need to collect good data.
Myth 5: Eliminate the need for good data analysis skills.
36
24.5 Successful Data Mining
The first step is to have a well-defined business problem, which can help you avoid going down a lot of blind paths.
Typically, 65% to 90% of the time is spent in data preparation: investigating missing values, correcting wrong entries, reconciling data definitions, or creating new variables from old ones.
37
24.5 Successful Data Mining
Example: Data Preparation
The PVA dataset includes 470 potential predictor variables for modeling whether a donor will donate to the next campaign.
One variable, Age, may need corrections or adjustments so that it can be a more effective predictor in a regression-based model.
Graphs are provided on the next slide for all 94,649 potential donors:
38
24.5 Successful Data Mining
The group of cases with ages below 20 should immediately draw our attention: are people that young likely to be donors?
39
24.5 Successful Data Mining
Example: Data Preparation
Closer examination reveals 23,479 missing values. The potential donors with ages below 15, and possibly those above 100 or 110, warrant further investigation before proceeding with the model build phase.
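The screening step described above might be sketched as follows. This is only an illustration: the function name `screen_ages`, the default cutoffs (15 and 100), and the sample values are assumptions made here, not part of the text or the PVA dataset.

```python
from typing import Optional


def screen_ages(ages: list[Optional[float]], low: float = 15, high: float = 100) -> dict[str, int]:
    """Count missing, suspiciously low, suspiciously high, and plausible ages."""
    counts = {"missing": 0, "below_low": 0, "above_high": 0, "plausible": 0}
    for age in ages:
        if age is None:
            counts["missing"] += 1          # no value recorded for this case
        elif age < low:
            counts["below_low"] += 1        # younger than we expect a donor to be
        elif age > high:
            counts["above_high"] += 1       # older than seems believable
        else:
            counts["plausible"] += 1
    return counts


# Hypothetical values for illustration only (not taken from the PVA dataset).
sample_ages = [62, None, 7, 45, 112, None, 38]
print(screen_ages(sample_ages))
```

Counting the flagged cases first, rather than deleting them, leaves the decision about how to handle them (correct, exclude, or keep) to someone who understands the business context.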
40
24.6 Data Mining Problems
When the goal is to predict a quantitative response variable, the problem is generically called a regression problem.
When the response variable is categorical, the problem is referred to as a classification problem.
Collectively, both types of problems are called supervised problems, because we are given a set of data for which we know the value of the response.
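As a rough illustration of the distinction, the sketch below labels a supervised problem by looking at the response values. The function name `problem_type` is hypothetical, and the rule is a simplification: categories that happen to be coded as numbers (for example, 0/1 for donor/non-donor) would need to be identified by the analyst, not by the code.

```python
from numbers import Number


def problem_type(response_values):
    """Label a supervised problem by the kind of response variable it has."""
    if not response_values:
        raise ValueError("a supervised problem needs known response values")
    # Treat the response as quantitative only if every value is a true number
    # (booleans are excluded because they represent categories).
    is_quantitative = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in response_values
    )
    return "regression" if is_quantitative else "classification"


print(problem_type([120.5, 80.0, 15]))                  # quantitative response
print(problem_type(["donor", "non-donor", "donor"]))    # categorical response
```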
41
24.6 Data Mining Problems
Often, the data miner constructs a model based on a portion of the original data, called the training set.
To assess how well the model will work on future data, the modeler uses the data not used in the training set as a test set.
A final type of data mining problem is an unsupervised problem, one in which there is no particular response variable. The goal may be to build clusters of data with similar attributes.
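A minimal sketch of holding out a test set, using only the Python standard library. The function name `train_test_split`, the 30% test fraction, and the fixed seed are assumptions chosen for illustration; the text does not prescribe a particular split.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and return (training set, test set)."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                   # copy so the original order is untouched
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Ten hypothetical cases, identified here only by a case number.
cases = list(range(10))
training_set, test_set = train_test_split(cases)
print(len(training_set), "training cases;", len(test_set), "test cases")
```

The model would be fit on `training_set` only; `test_set` is kept aside to estimate how well the model works on data it has not seen.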
42
Be sure that the question to be answered is specific. A goal as vague as “improving the business” is not likely to be successful.
Be sure that the data have the potential to answer the question. Check the variables to see whether a model can reasonably be built to predict the response.
Be aware of overfitting the data. Make sure you validate the model on a test set.
43
Make sure that the data are ready to use in the data mining model. Missing values, incorrect entries, and different time scales are all challenges that need to be overcome.
Don’t try it alone. Data mining projects require a variety of skills and a lot of work. Assemble the right team of people.
44
1.1 What Are Data?
Data collected for recording the companies’ transactions are called transactional data.
The process of using transactional data to make other decisions and predictions is sometimes called data mining or predictive analytics.
Business analytics describes any use of statistical analysis to drive business decisions from data.
45
1.1 What Are Data?
All data have a context.
Data values or observations are information collected regarding some subject.
The “Five W’s”: who, what, when, where, and (if possible) why. Often we add how to the list.
Data can be numbers, names, etc., and tell us the “Who and What”.
Data are often organized into a data table like that below.
46
1.1 What Are Data?
The rows of a data table correspond to individual cases about Whom we record some characteristics.
These characteristics may be collected on or about:
  respondents: individuals who answer a survey
  subjects or participants: people in an experiment
  experimental units: animals, plants, websites, or other inanimate objects
47
1.1 What Are Data?
The characteristics recorded about each individual or case are called variables.
These are usually shown as the columns of a data table and identify What has been measured.
48
1.1 What Are Data?
Metadata typically contains information about how, when, and where (and possibly why) the data were collected; who each case represents; and the definitions of all the variables.
Data are typically saved in a spreadsheet, where the rows represent cases and the columns represent variables.
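The cases-as-rows, variables-as-columns layout, along with its metadata, could be represented as in the sketch below. Every name and value here (`data_table`, `metadata`, `column`, the customers and amounts) is made up for illustration and does not come from the text.

```python
# Hypothetical data table: each dict is one case (row); each key is a variable (column).
data_table = [
    {"customer_id": "C001", "region": "North", "purchase_amount": 24.50},
    {"customer_id": "C002", "region": "South", "purchase_amount": 13.00},
    {"customer_id": "C003", "region": "North", "purchase_amount": 42.25},
]

# Hypothetical metadata: the context of the data plus a definition of each variable.
metadata = {
    "who": "customers of a made-up retailer",
    "how": "recorded at checkout",
    "variables": {
        "customer_id": "identifier assigned to each customer",
        "region": "sales region where the purchase was made",
        "purchase_amount": "purchase total in US dollars",
    },
}


def column(table, variable):
    """Return the values of one variable across all cases."""
    return [case[variable] for case in table]


print(len(data_table), "cases;", len(data_table[0]), "variables")
print(column(data_table, "purchase_amount"))
```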
49
Day 3 Homework
Posted at: faculty.ung.edu/kmelton/BUSA3110.html
Includes:
  Watching JMP videos
  Installing JMP on your computer
Graded:
  One MyStatLab HW due Wed., 8/26/15 7:30am
  Three available:
    P&P Roll Verification/Logistics-D2L (due 8/27/15 1:00pm)
    MyStatLab HW (due 8/30/15 7:30am)
    MyStatLab Chapter 1 quiz (due 8/30/15 11:30pm)