© Willett, Harvard University Graduate School of Education, 1/28/2016, S010Y/C09. S010Y: Answering Questions with Quantitative Data, Class 9/III.2: Displaying Relationships Between Continuous Variables

Presentation transcript:

© Willett, Harvard University Graduate School of Education, 1/28/2016, S010Y/C09, Slide 1
S010Y: Answering Questions with Quantitative Data, Class 9/III.2: Displaying Relationships Between Continuous Variables

Research Is A Partnership Of Questions And Data

What types of data are collected? “Categorical” data and “continuous” data.

What kinds of question can be asked of those data?

Questions that require us to describe single features of the participants:
  Categorical data: How many members of the class are women? What proportion of the class is full-time? …
  Continuous data: How tall are class members, on average? How many hours a week do class members report that they study? …

Questions that require us to examine relationships between features of the participants:
  Categorical data: Are men more likely to study part-time? Are women more likely to enroll in USP? …
  Continuous data: Do people who say they study for more hours think they’ll finish their doctorate earlier? Are computer literates less anxious about statistics? …

Slide 2

Here’s the codebook for the data we’ll use in this part of the module …

Dataset:      WALLCHT.txt
Overview:     Summary information on selected aspects of state educational performance outcomes, resource inputs, and population characteristics.
Source:       US Department of Education and the National Center for Education Statistics
Sample size:  50 states
Updated:      December 5, 2003

Col  Var Name   Description                                        Metric
1    STATE      State postal abbreviation                          Alphabetic
2    TCHRSAL    Average teacher salary in the State                1988$
3    STRATIO    Average number of students per teacher statewide   ratio
4    PPEXPEND   Average expenditure per pupil in the State         1988$
5    HSGRADRT   Average high-school graduation rate statewide      %

Slide 3

We can use these data to address a variety of interesting research questions, including this one …

Research Question: “Are high school graduation rates higher in states where there are fewer students per teacher?”

This is a question about a potential relationship between two continuous variables:
  • Statewide high-school graduation rates (HSGRADRT),
  • Student/teacher ratio (STRATIO).

So, in other words, I’m really asking: Are HSGRADRT and STRATIO related?

How do we answer this question?

Slide 4

I begin the analysis in Class9/Handout1. Here’s the start of the PC-SAS program …

    OPTIONS Nodate Pageno=1;
    TITLE1 'S010Y: Answering Questions with Quantitative Data';
    TITLE2 'Class 9/Handout 1: Displaying Relationships Between Continuous Variables';
    TITLE3 'The Infamous Wallchart Data';
    TITLE4 'Data in WALLCHT.txt';

    * Input data, name and label variables in the dataset;
    DATA WALLCHT;
        INFILE 'C:\DATA\A010Y\WALLCHT.txt';
        INPUT STATE $ TCHRSAL STRATIO PPEXPEND HSGRADRT;
        LABEL TCHRSAL  = '1988 Average Teacher Salary'
              STRATIO  = '1988 Student/Teacher Ratio'
              PPEXPEND = '1988 Expenditure/Student'
              HSGRADRT = '1988 Statewide H.S. Graduation Rate';

    * Data listing, with the States ranked in descending order by values of HSGRADRT;
    PROC SORT DATA=WALLCHT;
        BY DESCENDING HSGRADRT;
    PROC PRINT LABEL DATA=WALLCHT;
        TITLE5 'Listing of Data, in Descending Order of H.S. Graduation Rates';
        VAR STATE HSGRADRT STRATIO TCHRSAL PPEXPEND;

Notes on the program:
  • DATA WALLCHT: Regular data input paragraph.
  • STATE is a “string” variable: its values are alphabetic characters (that is, the postal abbreviations of the states). We tell PC-SAS by putting a “$” symbol after the variable name in the INPUT statement.
  • PROC SORT: This paragraph sorts the data in descending order of high-school graduation rate, HSGRADRT, to facilitate comparisons across states.
  • PROC PRINT: Prints out the data for inspection.
  • LABEL (in PROC PRINT): Names the columns in the print listing with the variable labels, rather than the variable names.
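For readers who do not have PC-SAS to hand, the input-and-sort step can be sketched in Python. This is only a rough analogue of the DATA, PROC SORT and PROC PRINT paragraphs, not part of the handout: the names `StateRecord`, `parse_wallchart` and `sort_by_graduation_rate` are hypothetical, and the three sample rows (states "AA", "BB", "CC") are made-up placeholders, not values from WALLCHT.txt. It assumes the file is whitespace-delimited with the five fields in the order given on the INPUT statement.

```python
from typing import NamedTuple


class StateRecord(NamedTuple):
    """One row of the wallchart data, in INPUT-statement order."""
    state: str
    tchrsal: float
    stratio: float
    ppexpend: float
    hsgradrt: float


def parse_wallchart(text: str) -> list[StateRecord]:
    """Parse whitespace-delimited rows; skip lines that do not have 5 fields."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue
        state, *numbers = fields
        records.append(StateRecord(state, *map(float, numbers)))
    return records


def sort_by_graduation_rate(records: list[StateRecord]) -> list[StateRecord]:
    """Analogue of PROC SORT ... BY DESCENDING HSGRADRT."""
    return sorted(records, key=lambda r: r.hsgradrt, reverse=True)


# Made-up placeholder rows (NOT real wallchart values).
SAMPLE = """\
AA 25000 17.5 3500 72.0
BB 31000 14.0 5200 85.5
CC 28000 20.1 4100 64.3
"""

ranked = sort_by_graduation_rate(parse_wallchart(SAMPLE))

# Analogue of PROC PRINT with VAR STATE HSGRADRT STRATIO TCHRSAL PPEXPEND.
for r in ranked:
    print(f"{r.state}  {r.hsgradrt:5.1f}  {r.stratio:5.1f}  {r.tchrsal:8.0f}  {r.ppexpend:8.0f}")
```

Keeping the state code as a string and converting only the other four fields mirrors the role of the “$” on the INPUT statement.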

Slide 5

The data-listing produced by PC-SAS … demonstrates considerable heterogeneity on all four variables!!!

[Data listing. Columns: STATE, 1988 Statewide H.S. Graduation Rate, 1988 Student/Teacher Ratio, 1988 Average Teacher Salary, 1988 Expenditure/Student. States in descending order of graduation rate: MN ND WY MT IA NE CT WI KS OH SD UT VT PE NJ WV AR WA IN NV IL ID AL CO ME MA MD NH MO MI OR NM DL OK VA RI TN HI KY MS NC CA AK TX SC NY LA AZ GA FL]

Slide 6

Then, I asked PC-SAS to provide univariate descriptive statistics on the HSGRADRT and STRATIO variables …

    * Descriptive statistics on graduation rates and student/teacher ratios;
    PROC UNIVARIATE PLOT DATA=WALLCHT;
        TITLE5 'Distribution of H.S. Graduation Rates and Student/Teacher Ratios';
        VAR HSGRADRT STRATIO;
        ID STATE;

Notes on the program:
  • PROC UNIVARIATE PLOT: Here are the usual PROC UNIVARIATE commands to obtain univariate summary statistics, stem-and-leaf displays and boxplots on the WALLCHT data.
  • VAR: Specifies the variables for which descriptive statistics are required. Notice that you can list both HSGRADRT and STRATIO.
  • ID: Implementing the ID command ensures that the cases are identified by the (alphabetic) value of the STATE variable.
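As a rough illustration of the kind of summary PROC UNIVARIATE reports, here is a small Python sketch using only the standard library. It is not the SAS procedure: the function name `describe` is hypothetical, the four input values are made up for illustration, and `statistics.quantiles` uses its own default interpolation rule, which may not match the quartile definition PC-SAS applies.

```python
import statistics


def describe(values):
    """Return a few univariate summary statistics for a list of numbers."""
    ordered = sorted(values)
    # Quartiles via the standard library's default ("exclusive") method.
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return {
        "n": len(ordered),
        "mean": statistics.mean(ordered),
        "std_dev": statistics.stdev(ordered),  # sample standard deviation (n - 1)
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "range": ordered[-1] - ordered[0],
    }


# Made-up illustrative values (NOT the wallchart graduation rates).
rates = [60.0, 70.0, 80.0, 90.0]
summary = describe(rates)

for name, value in summary.items():
    print(f"{name:>8}: {value}")
```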

Slide 7

Here are the univariate descriptive statistics for continuous variable HSGRADRT …

[PROC UNIVARIATE output for HSGRADRT (1988 Statewide H.S. Graduation Rate), N = 50: moments, basic statistical measures, quantiles, extreme observations, stem-and-leaf display and boxplot. Lowest five states: FL (58.0), GA, AZ, LA, NY. Highest five states: IA, MT, ND, WY, MN.]

Can you interpret these univariate descriptive statistics?

Slide 8

Here are the univariate descriptive statistics on continuous variable STRATIO …

[PROC UNIVARIATE output for STRATIO (1988 Student/Teacher Ratio), N = 50: moments, basic statistical measures, quantiles, extreme observations, stem-and-leaf display and boxplot. Lowest five states: CT (13.3), MA, VT, NJ, WY. Highest five states: NV, ID, HI, CA, UT.]

Can you interpret these univariate descriptive statistics?

Slide 9

But, are HSGRADRT and STRATIO related? To address this question, we must display HSGRADRT and STRATIO simultaneously in a bivariate scatterplot …

    * Displaying the relationship between HSGRADRT and STRATIO;
    PROC PLOT DATA=WALLCHT;
        TITLE5 'Plot of H.S. Graduation Rates against Student/Teacher Ratios';
        PLOT HSGRADRT*STRATIO / HAXIS = 10 TO 25 BY 5
                                VAXIS = 50 TO 100 BY 10;
    RUN;

Notes on the program:
  • PROC PLOT is a PC-SAS routine that produces bivariate scatterplots of continuous variables.
  • PLOT HSGRADRT*STRATIO: Plot HSGRADRT on the vertical axis versus STRATIO on the horizontal axis.
  • HAXIS: Choose an appropriate scaling for the horizontal axis.
  • VAXIS: Choose an appropriate scaling for the vertical axis.

Slide 10

Here’s a bivariate plot of HSGRADRT versus STRATIO …

[Scatterplot: 1988 Statewide H.S. Graduation Rate (vertical axis, 50 to 100) against Student/Teacher Ratio (horizontal axis). The point for Ohio is labeled.]

Points on the scatterplot, like symbol “A”, represent each State, and display values of outcome HSGRADRT and predictor STRATIO simultaneously. In Ohio, HSGRADRT=79.6, STRATIO=18.0.

The vertical axis (or ordinate) displays the value of the “outcome,” HSGRADRT.

The horizontal axis (or abscissa) displays the value of the “predictor,” STRATIO.

Slide 11

[Scatterplot: 1988 Statewide H.S. Graduation Rate (vertical axis) against Student/Teacher Ratio (horizontal axis)]

And, how can we tell if HSGRADRT and STRATIO are related? Is this the case here?

Two variables are related if…

Slide 12

[Scatterplot: 1988 Statewide H.S. Graduation Rate (vertical axis) against Student/Teacher Ratio (horizontal axis)]

What kind of line, curve or other construction best summarizes the observed relationship between HSGRADRT and STRATIO? You be the judge!

© Willett, Harvard University Graduate School of Education, 1/28/2016, S010Y/C10, Slide 13
S010Y: Answering Questions with Quantitative Data, Class 10&11/III.3: Summarizing Relationships Between Continuous Variables

[Scatterplot: 1988 Statewide H.S. Graduation Rate (vertical axis) against Student/Teacher Ratio (horizontal axis), with a fitted trend line]

What kind of line, curve or other construction best summarizes the observed relationship between HSGRADRT and STRATIO?

Here’s My Best Guess! It was obtained by a mystery process called “Ordinary Least-Squares (OLS) Regression Analysis.”

Slide 14

Of course, you can also get PC-SAS to tell you where the OLS-fitted regression line is …

    OPTIONS Nodate Pageno=1;
    TITLE1 'S010Y: Answering Questions with Quantitative Data';
    TITLE2 'Class 10/Handout 1: Summarizing Relationships Between Continuous Variables';
    TITLE3 'The Infamous Wallchart Data';
    TITLE4 'Data in WALLCHT.txt';

    * Input data, name and label variables in the dataset;
    DATA WALLCHT;
        INFILE 'C:\DATA\A010Y\WALLCHT.txt';
        INPUT STATE $ TCHRSAL STRATIO PPEXPEND HSGRADRT;
        LABEL TCHRSAL  = '1988 Average Teacher Salary'
              STRATIO  = '1988 Student/Teacher Ratio'
              PPEXPEND = '1988 Expenditure/Student'
              HSGRADRT = '1988 Statewide H.S. Graduation Rate';

    * Using regression analysis to summarize the relationship of HSGRADRT and STRATIO;
    PROC REG DATA=WALLCHT;
        TITLE5 'OLS Regression of H.S. Graduation Rate on Student/Teacher Ratio';
        MODEL HSGRADRT = STRATIO;

    * Plotting the relationship between HSGRADRT and STRATIO;
    PROC PLOT DATA=WALLCHT;
        TITLE5 'Plot of H.S. Graduation Rates against Student/Teacher Ratios';
        PLOT HSGRADRT*STRATIO / HAXIS = 10 TO 25 BY 5
                                VAXIS = 50 TO 100 BY 10;
    RUN;

Notes on the program:
  • DATA WALLCHT: Here are the usual data input statements.
  • PROC REG: Here are the PC-SAS regression analysis commands. We dissect them in detail on the next slide.
  • PROC PLOT: Creates another scatterplot of the data for use later.

Slide 15

Here’s the part of the PC-SAS program that deals specifically with the OLS Regression Analysis of the HSGRADRT versus STRATIO relationship …

    * Using regression analysis to summarize the relationship of HSGRADRT and STRATIO;
    PROC REG DATA=WALLCHT;
        TITLE5 'OLS Regression of H.S. Graduation Rate on Student/Teacher Ratio';
        MODEL HSGRADRT = STRATIO;

Notes on the program:
  • PROC REG is the command in PC-SAS that requests an OLS Regression Analysis.
  • You request an OLS Regression Analysis by specifying a “Regression Model” that identifies the “Outcome” and the “Predictor(s)” to include in the analysis: MODEL HSGRADRT = STRATIO
  • You identify the outcome variable (HSGRADRT) by placing it to the left of the “equals” sign in the MODEL statement.
  • You identify the predictor variable (STRATIO) by placing it to the right of the “equals” sign in the MODEL statement.
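To take a little of the mystery out of what a simple OLS fit with one outcome and one predictor computes, here is a minimal Python sketch of the textbook least-squares formulas for a straight line. It is not PROC REG and makes no claim about how PC-SAS carries out the calculation internally. The function name `ols_fit` is hypothetical, and the toy data are made up so that they lie exactly on the line y = 100 - 2x; they are not the wallchart values.

```python
def ols_fit(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    slope     = sum((x - x_bar) * (y - y_bar)) / sum((x - x_bar) ** 2)
    intercept = y_bar - slope * x_bar
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variation; slope is undefined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up toy data lying exactly on y = 100 - 2x (NOT the wallchart data).
predictor = [10.0, 15.0, 20.0, 25.0]  # plays the role of STRATIO (right of "=")
outcome = [80.0, 70.0, 60.0, 50.0]    # plays the role of HSGRADRT (left of "=")

intercept, slope = ols_fit(predictor, outcome)
print(f"fitted line: outcome = {intercept:.2f} + ({slope:.2f}) * predictor")
```

As in the MODEL statement, the outcome and the predictor are not interchangeable: swapping the two arguments fits a different line.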

Slide 16

Here’s output from the OLS Regression Analysis of Outcome HSGRADRT on Predictor STRATIO …

[PROC REG output. Model: MODEL1. Dependent Variable: HSGRADRT (1988 Statewide H.S. Graduation Rate). Analysis of Variance table (Model, Error, Corrected Total), Root MSE, R-Square, Dependent Mean, Adj R-Sq, Coeff Var, and Parameter Estimates for Intercept and STRATIO (DF, parameter estimate, standard error, t value, Pr > |t|). For the Intercept, Pr > |t| is <.0001.]

Notes on the output:
  • Parameter Estimates: This is the major part of the “regression analysis” output. I unpack it on the next several slides.
  • Another part of the output is marked: Ignore this part of the output. When you go on to S030, you’ll learn what it all means.

Slide 17

The core part of the OLS Regression Output describes the fitted regression line. How do you work with this “Fitted Model”?

[PROC REG Parameter Estimates for Dependent Variable HSGRADRT (1988 Statewide H.S. Graduation Rate): rows for Intercept and STRATIO (1988 Student/Teacher Ratio), with DF, parameter estimate, standard error, t value and Pr > |t|.]

These “Parameter Estimates” tell you where PROC REG thinks that the fitted trend line should be drawn … by listing them, it’s telling you that the fitted trend line has the following algebraic equation:

    Predicted value of HSGRADRT = (93.69) + (-1.12)(STRATIO)

Slide 18

Remember that the fitted equation is telling us PROC REG’s best prediction for HSGRADRT at each value of STRATIO.

You can substitute reasonable values for the predictor, STRATIO, into the fitted equation and then use it to compute the best predictions, or predicted values, for HSGRADRT. Let’s try a couple. For instance …

1. When STRATIO = 13.3 (the minimum value of STRATIO),
   Predicted value of HSGRADRT = (93.69) + (-1.12)(13.3) = 93.69 - 14.90 = 78.8

2. When STRATIO = 24.7 (the maximum value of STRATIO),
   Predicted value of HSGRADRT = (93.69) + (-1.12)(24.7) = 93.69 - 27.66 = 66.0

Recognize these values?
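The two substitutions can be checked with a few lines of Python. This is just the slide's arithmetic written as code; the intercept (93.69) and slope (-1.12) are the rounded estimates quoted in the substitutions, and the function name `predict_hsgradrt` is a hypothetical label chosen for this sketch.

```python
# Rounded parameter estimates quoted on the slide.
INTERCEPT = 93.69
SLOPE = -1.12


def predict_hsgradrt(stratio: float) -> float:
    """Predicted graduation rate from the fitted line at a given STRATIO."""
    return INTERCEPT + SLOPE * stratio


# Minimum and maximum observed values of STRATIO, as given on the slide.
for stratio in (13.3, 24.7):
    predicted = predict_hsgradrt(stratio)
    print(f"STRATIO = {stratio:4.1f}  ->  predicted HSGRADRT = {predicted:.1f}")
```

Because the two STRATIO values are the observed minimum and maximum, the two predictions mark the ends of the fitted line over the range of the data.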