Sang-Hyop Lee & Comfort Sumida 3rd NTA Workshop January 21, 2006

Presentation transcript:

Smoothing profiles
Sang-Hyop Lee & Comfort Sumida, 3rd NTA Workshop, January 21, 2006

Lowess
Lowess carries out a locally weighted regression of the dependent variable on the independent variable, displays the graph, and optionally saves the smoothed variable. The commands in Stata are
. lowess yl age75 if age75>14, nograph gen(sm_yl1)
. lowess yl age75 if age75>14, nograph bwidth(0.1) gen(sm_yl2)
where yl is the key variable to be smoothed, bwidth(*) specifies the bandwidth, and gen(*) saves the smoothed values. The first command omits bwidth() and therefore uses Stata's default bandwidth of 0.8.

Methodology
Cleveland, William S. 1979. Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 74:829-36.
Let $y_i$ and $x_i$ be the two variables, and assume that the data are ordered so that $x_i \le x_{i+1}$ for $i = 1, \ldots, N - 1$. For each $y_i$, a smoothed (predicted) value $y_i^P$ is calculated. The subset used in calculating $y_i^P$ is indices $i_- = \max(1, i - k)$ through $i_+ = \min(i + k, N)$, where $k = \lfloor (N \cdot \text{bandwidth} - 0.5)/2 \rfloor \approx (N \cdot \text{bandwidth})/2$.
Local weight
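The procedure above can be sketched in a few lines of Python. This is an illustration only, not Stata's implementation. The slide's weight formula did not survive in the transcript, so the sketch assumes the tricube weight $w_j = \{1 - (|x_j - x_i|/\Delta)^3\}^3$ with $\Delta = 1.0001 \max(x_{i_+} - x_i,\ x_i - x_{i_-})$, which is the form usually documented for lowess. The function names `window_half_width` and `lowess_sketch` are hypothetical, and the indices are 0-based rather than 1-based as on the slide.

```python
import math


def window_half_width(n, bandwidth):
    """k = floor((N * bandwidth - 0.5) / 2), as defined on the slide."""
    return max(0, math.floor((n * bandwidth - 0.5) / 2))


def lowess_sketch(x, y, bandwidth=0.8):
    """Locally weighted linear fit at each point; x must be sorted ascending."""
    n = len(x)
    k = window_half_width(n, bandwidth)
    smoothed = []
    for i in range(n):
        lo = max(0, i - k)        # i- on the slide (0-based here)
        hi = min(i + k, n - 1)    # i+ on the slide (0-based here)
        idx = list(range(lo, hi + 1))
        # Assumed tricube weights; delta slightly exceeds the farthest distance
        # so that the farthest neighbour keeps a small positive weight.
        delta = 1.0001 * max(x[hi] - x[i], x[i] - x[lo])
        if delta == 0:
            w = [1.0] * len(idx)
        else:
            w = [(1 - (abs(x[j] - x[i]) / delta) ** 3) ** 3 for j in idx]
        # Weighted least-squares line through the window, evaluated at x[i].
        sw = sum(w)
        mx = sum(wj * x[j] for wj, j in zip(w, idx)) / sw
        my = sum(wj * y[j] for wj, j in zip(w, idx)) / sw
        sxx = sum(wj * (x[j] - mx) ** 2 for wj, j in zip(w, idx))
        sxy = sum(wj * (x[j] - mx) * (y[j] - my) for wj, j in zip(w, idx))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(my + slope * (x[i] - mx))
    return smoothed


# Hypothetical data: an exactly linear profile over ages 15 to 90.
ages = [float(a) for a in range(15, 91)]
profile = [2.0 * a + 1.0 for a in ages]
fitted = lowess_sketch(ages, profile, bandwidth=0.1)
```

Because each local fit is a weighted straight line, an exactly linear input is reproduced unchanged, which is a convenient sanity check on the sketch.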

Bandwidth and Smoothing
The optimal bandwidth depends on the variable and the dataset, and is determined by examining smoothed profiles plotted against unsmoothed ones. A bandwidth that is too narrow leaves the smoothed estimates noisy. A bandwidth that is too wide smooths away real features, so the profile no longer represents the unsmoothed data accurately.
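To get a feel for what a bandwidth means in practice, it can help to convert it into the number of observations used in each local regression. The sketch below applies the half-width formula $k = \lfloor (N \cdot \text{bandwidth} - 0.5)/2 \rfloor$ from the Methodology slide. The sample size of 1,000 and the function name `window_size` are hypothetical, chosen only for illustration.

```python
import math


def window_size(n_obs, bandwidth):
    """Observations in the local window at an interior point: 2k + 1, capped at N."""
    k = max(0, math.floor((n_obs * bandwidth - 0.5) / 2))
    return min(2 * k + 1, n_obs)


n_obs = 1000  # hypothetical number of observations
for bw in (0.1, 0.8):
    print(f"bandwidth {bw}: {window_size(n_obs, bw)} observations per local fit")
```

Windows near the ends of the age range are smaller, because the indices are truncated at 1 and N.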

Too narrow bandwidth

Too wide bandwidth

Proper bandwidth

Warning!

Warning
“lowess” can produce smoothed values that are consistently larger (or smaller) than the unsmoothed values. The high smoothed values are due to the frequency weight in the data: lowess does not use the weight, while the tabulation applies it.
. table age75 [w=weight], content(mean yl mean sm_yl1 mean sm_yl2)
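A toy calculation shows how the discrepancy can arise. The two records below are invented for illustration (they are not from the workshop data): within one age, a high earner with a small frequency weight and a low earner with a large one. An unweighted smoother treats the two records equally, whereas a weighted tabulation does not.

```python
# Hypothetical (yl, weight) records for a single age group.
records = [(100.0, 1), (10.0, 9)]

# What an unweighted procedure effectively sees: each record counts once.
unweighted_mean = sum(y for y, _ in records) / len(records)

# What a weighted tabulation reports: each record counts `weight` times.
total_weight = sum(w for _, w in records)
weighted_mean = sum(y * w for y, w in records) / total_weight

print(unweighted_mean, weighted_mean)
```

Here the unweighted value is well above the weighted one, the same direction of bias the slide describes.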

Solution
In principle, smoothing should be done simultaneously with the sample weight, rather than applying the weight only after smoothing.
. lowess yl age75 if age75>14, nograph bwidth(0.1) gen(sm_yl2)
. table age75 [w=weight], content(mean yl mean sm_yl1 mean sm_yl2)
Expand (duplicate) the data using the sample weight, smooth the data, and tabulate the non-smoothed and smoothed values without the w=weight option. This takes a long time to execute.
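The expansion step can be sketched as follows. This is a minimal illustration that assumes integer frequency weights. The records and the helper name `expand_by_weight` are hypothetical. After expansion, every copy counts once, so unweighted statistics (and an unweighted smoother) reflect the weights automatically.

```python
def expand_by_weight(records):
    """Duplicate each (value, weight) record `weight` times; weights assumed integer."""
    expanded = []
    for value, weight in records:
        expanded.extend([value] * int(weight))
    return expanded


# Hypothetical (yl, weight) records for a single age group.
records = [(100.0, 1), (10.0, 9)]
expanded = expand_by_weight(records)

# The plain mean of the expanded data equals the weighted mean of the original.
expanded_mean = sum(expanded) / len(expanded)
weighted_mean = sum(y * w for y, w in records) / sum(w for _, w in records)
print(len(expanded), expanded_mean, weighted_mean)
```

The cost is visible even here: the dataset grows from 2 rows to 10, which is why the approach is slow on survey data with large weights.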