STATS 330: Lecture 1 (9/15/2015)

Presentation transcript:

STATS 330: Lecture 1

Today’s agenda:
Introductory Comments:
– Housekeeping
– Computer details
– Plan of the course
Statistical Modelling: an overview
– Our Analysis strategy
– Goals for the course
Role of Graphics
Data cleaning

Housekeeping
Contact details….
Plus much else on course web page
Or via Cecil

Assignments
There will be five assignments (20% of total grade)
The due dates are listed in the course summary
Assignment 1 is due on August 2.
No paper will be issued in class – download assignments and data from the Web or Cecil

Tutorials
These will cover computing details
Held in basement tutorial lab, 303S Extension (Rm 303S-B75)
– Tutorial 1: Wed
– Tutorial 2: Fri
– Tutorial 3: 2-3 Fri
Start second week of semester

Computer details
All analyses will be done using R
I require homework to be typed: use Word, and cut and paste R text and graphics into Word
Use home/laptop computer or basement lab (303S)
See web page for info on downloading R330 package
URL is

Course Plan
There will be 33 lectures, divided into chapters as follows
Chapter 1: Introduction (1 lecture, this one!)
Chapter 2: Graphics (3 lectures)
Chapter 3: Multiple Regression Model (12 lectures)
Chapter 4: Factors (4 lectures)
Chapter 5: Models for categorical and count responses (12 lectures)
Revision (1 lecture)

Course book
This covers most of the material we will discuss in the lectures
Has 5 chapters corresponding to the division on the previous slide
On-line version available on the course web site
Paper copies available from the Statistics Department, Commerce A

Overview of Statistical Modelling
Statistical models summarize relationships between variables
Regression models focus on one variable (the response) and how its distribution can be modelled by one or more explanatory variables

Example: Response is BPD
BPD: Bi-Parietal Diameter
For an unborn baby, depends on several factors, including Gestational Age.

[Figure: ultrasound image]

Points to take into account:
Not all babies of the same age have the same BPD.
In general, the older the baby, the bigger the BPD.

Model relating BPD and GA
[Figure: BPD against Gestational Age, with the axis marked at 30 and 40 weeks]
Mean BPD for 30 weeks = a + b × 30
Mean BPD for 40 weeks = a + b × 40
Mean BPD for GA weeks = a + b × GA
Line is BPD = a + b GA
Each bell curve has sd 10 mm

Regression model features
Model specifies the distribution of the response
Also says how the distribution of the response is affected by the covariates (explanatory variables)
In this case the covariate GA determines the mean response: mean BPD = a + b GA, i.e. a straight-line relationship.
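
The model on the last two slides can be sketched in a few lines. The course itself uses R; this sketch is in Python purely for illustration. The values of a and b below are hypothetical (the slides do not give them); only the sd of 10 mm comes from the slide. It draws BPD values from a normal distribution whose mean lies on the line a + b GA.

```python
import random

# Hypothetical coefficients, chosen only for illustration (not from the slides).
A = 10.0   # intercept a, in mm (assumed)
B = 2.0    # slope b, in mm per week of gestational age (assumed)
SD = 10.0  # standard deviation of each bell curve, in mm (from the slide)


def mean_bpd(ga_weeks):
    """Mean response under the straight-line model: a + b * GA."""
    return A + B * ga_weeks


def simulate_bpd(ga_weeks, rng):
    """Draw one BPD value from the normal curve centred on the line."""
    return rng.gauss(mean_bpd(ga_weeks), SD)


rng = random.Random(330)  # fixed seed so the sketch is reproducible
sample_30 = [simulate_bpd(30, rng) for _ in range(5)]
sample_40 = [simulate_bpd(40, rng) for _ in range(5)]

print("mean BPD at 30 weeks:", mean_bpd(30))
print("mean BPD at 40 weeks:", mean_bpd(40))
print("simulated BPDs at 30 weeks:", [round(v, 1) for v in sample_30])
print("simulated BPDs at 40 weeks:", [round(v, 1) for v in sample_40])
```

Babies of the same age get different simulated BPDs, while the mean rises with age, which matches the two "points to take into account".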

In general….
Response has a distribution depending on (conditional on) the explanatory variables
Mean of the distribution given by some function of the explanatory variables
Need to
– describe the function
– describe the variability
Balance accuracy with simplicity.

Our Analysis strategy:
Explore the data using graphics and summary statistics
Construct a useful model (trial and error)
Use the model to gain knowledge about the system under study (How big should a 40 week baby’s BPD be?)
Communicate findings (be able to write a report!!)
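
As a rough illustration of the "construct a model, then use it" steps, the sketch below fits a straight line by ordinary least squares and estimates the mean BPD at 40 weeks. It is written in Python for illustration only (the course uses R), and the data values are invented, not the course's BPD data. The helper name `fit_line` is hypothetical.

```python
def fit_line(ga, bpd):
    """Least-squares estimates (a, b) for the line bpd = a + b * ga."""
    n = len(ga)
    mean_x = sum(ga) / n
    mean_y = sum(bpd) / n
    sxx = sum((x - mean_x) ** 2 for x in ga)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ga, bpd))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


# Invented illustrative values (NOT real data): GA in weeks, BPD in mm.
ga_weeks = [20, 24, 28, 32, 36, 40]
bpd_mm = [48, 61, 69, 82, 88, 97]

a_hat, b_hat = fit_line(ga_weeks, bpd_mm)
predicted_40 = a_hat + b_hat * 40

print(f"fitted line: BPD = {a_hat:.1f} + {b_hat:.2f} * GA")
print(f"estimated mean BPD at 40 weeks: {predicted_40:.1f} mm")
```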

Goals for 330
Get more practice in exploring data
Expand knowledge of regression
Get better at fitting models
Improve your diagnostic skills (how to recognize when a model doesn’t describe the data properly)
Improve your interpretation and communication skills, in a more flexible way.

Role of Graphics
Data Cleaning: Are there gross errors, outliers, special codings (e.g. 999 for missing), missing data, absolute rubbish in the data?
Exploratory analysis: what sort of model might be appropriate?
Diagnostics: having tentatively selected a model, is it any good? (Residual plots, etc.)
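
Graphics are the main tool here, but a simple scan can complement them. The sketch below (Python, for illustration only; the course uses R) flags missing entries, special codes such as 999, and values outside a plausible range. The function name, the readings, and the range of 20 to 120 mm are all assumptions made for the example.

```python
def find_suspect_values(values, missing_codes=(999,), valid_range=None):
    """Return (index, value, reason) for entries that need a closer look."""
    problems = []
    for i, v in enumerate(values):
        if v is None:
            problems.append((i, v, "missing"))
        elif v in missing_codes:
            problems.append((i, v, "special missing-value code"))
        elif valid_range is not None and not (valid_range[0] <= v <= valid_range[1]):
            problems.append((i, v, "outside plausible range"))
    return problems


# Invented readings in mm, for illustration only.
readings = [71, 84, 999, None, 92, 8.8, 79]

# The plausible range (20 to 120 mm) is an assumption, not a clinical limit.
suspects = find_suspect_values(readings, valid_range=(20, 120))
for index, value, reason in suspects:
    print(index, value, reason)
```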

Exploratory analysis for BPD data
[Figure: plot of the BPD data, with one point labelled "Outlier"]
Linear relationship (straight line) appropriate?

Diagnostic Graphics: residuals versus fitted values
Smoothing enhances interpretation.
Conclusion? Outlier stands out
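
The numbers behind a residuals-versus-fitted plot can be sketched as follows. This is Python for illustration only (the course uses R), the data are invented with one gross error planted on purpose, and the helper names are hypothetical. Smoothing is not shown; the sketch only computes fitted values and residuals and picks out the largest residual.

```python
def fit_line(x, y):
    """Least-squares estimates (a, b) for the line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def fitted_and_residuals(x, y):
    """Fitted values and residuals (observed minus fitted) for the fitted line."""
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, residuals


def largest_residual_index(residuals):
    """Index of the observation with the largest absolute residual."""
    return max(range(len(residuals)), key=lambda i: abs(residuals[i]))


# Invented data: an exact straight line, then one planted gross error.
x = list(range(20, 40, 2))      # ten GA values, in weeks
y = [10 + 2 * xi for xi in x]   # BPD in mm on a hypothetical line
y[4] += 40                      # planted error at index 4

fitted, residuals = fitted_and_residuals(x, y)
worst = largest_residual_index(residuals)
print("fitted, residual pairs:")
for f, r in zip(fitted, residuals):
    print(f"{f:7.2f} {r:7.2f}")
print("largest absolute residual at index", worst)
```

Plotting the residuals against the fitted values would show the planted point well away from the rest, which is the pattern the slide describes.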

Data Cleaning
All real-life data sets are likely to contain errors
These will usually be revealed by suitable plots, such as the one on the previous slide
Before starting any analysis, data should always be carefully checked
To encourage this practice, I will occasionally introduce errors into the assignment data supplied on the web.

Data Cleaning (2)
It is your responsibility to check all data supplied on the web, against the assignment sheets.
Failure to do so will cost marks.
YOU HAVE BEEN WARNED!!!