Stratification Matters: Analysis of 3 Variables

1 Stratification Matters: Analysis of 3 Variables
Thursday, August 04, 2016. Farrokh Alemi, PhD. Based on the work of CF Jeff Lin, PhD. This lecture focuses on stratification. The lecture is based on slides prepared by Dr. Lin and modified by Dr. Alemi.

2 Stratification Ceteris Paribus Divide into subgroups
Stratification is the process of dividing members of the population into homogeneous subgroups before sampling. Within a stratum, members share the same features, so the impact of a variable can be assessed without the influence of the shared features. Since the shared features are held constant within the strata, this is sometimes referred to as ceteris paribus, or holding all other things constant.

3 Natural Stratification
Subgroups that are observed in the sample. If the data have fallen into several subgroups without pre-planning, these subgroups are called natural strata.

4 Natural Stratification
Mutually exclusive and exhaustive. The strata should be mutually exclusive: every element in the population must be assigned to only one stratum. The strata should also be collectively exhaustive: no population element can be excluded.
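The two conditions above can be checked mechanically. The following is a minimal sketch in Python, not part of the original lecture; the patient IDs, the stratum assignments, and the helper name check_strata are hypothetical and chosen only to show one overlap and one omission.

```python
from collections import Counter


def check_strata(population, strata):
    """Return (overlapping, unassigned) members for a proposed stratification.

    overlapping: members that appear in more than one stratum
                 (violates mutual exclusivity).
    unassigned:  members that appear in no stratum
                 (violates collective exhaustiveness).
    """
    counts = Counter(member for stratum in strata.values() for member in stratum)
    overlapping = {member for member, n in counts.items() if n > 1}
    unassigned = set(population) - set(counts)
    return overlapping, unassigned


# Hypothetical patient IDs and strata, for illustration only.
population = {1, 2, 3, 4, 5}
strata = {"Jim, RN": {1, 2, 3}, "Jill, RN": {3, 4}}

overlapping, unassigned = check_strata(population, strata)
print("In more than one stratum:", overlapping)  # patient 3
print("In no stratum:", unassigned)              # patient 5
```

A valid stratification returns two empty sets.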

5 Holding the 3rd variable constant
Three-variable data: the impact of one variable on another, ceteris paribus, holding the third variable constant. This lecture focuses on how three-variable data can be analyzed using stratification. This is the simplest model for looking at the impact of one variable on another while holding a third one constant.

6 Remove Impact of Other Variables
An important part of any research study is the ability to remove competing explanations so that the impact can reasonably be attributed to the explanatory variable. In studying the relationship between a response variable and an explanatory variable, we should control for covariates that can influence that relationship by comparing cases at the same level of the covariates. © Jeff Lin, MD., PhD. Three-Way Table

7 Control of Alternative Explanations
Why are patients dissatisfied? For example, we may want to examine whether patient dissatisfaction is caused by the physician or by the various nurses who work with him. In doing so, we should control for the impact of the nurse on the team. We would stratify the data by nurse and examine the impact of the MD within each stratum. Then we can report the impact of the physician on patient experience.

8 Table 1: Satisfaction Across Teams
Satisfaction Example

Table 1: Satisfaction Across Teams

Physician     Nurse       Complained: Yes   Complained: No   Percent Complained
George, MD    Jim, RN     53                424              11.11%
George, MD    Jill, RN    11                37               22.92%
Smith, MD     Jim, RN     0                 16               0.00%
Smith, MD     Jill, RN    4                 139              2.80%
Total         Jim, RN     53                440              10.75%
Total         Jill, RN    15                176              7.85%

This table is a 2 × 2 × 2 contingency table: two rows, two columns, and two layers. In this table we see how two physicians work with two nurses and whether their patients complained. The data are hypothetical, but similar data are easily available in complaint registries within most hospitals. The 684 patients classified in Table 1 were patients at this hypothetical clinic.

9 Table 1: Satisfaction Across Teams
The variables in Table 1 are as follows. Y is whether the patient complained, with the categories yes and no. X is the physician on the team and Z is the nurse, each with two possible levels: George and Smith for the physicians, and Jim and Jill for the nurses. We study the effect of physicians on complaints, treating nurses as a control variable. We want to estimate the impact of physicians after removing the contribution of the nurses. Table 1 contains 2 × 2 partial tables relating each physician to complaints when working with each nurse. These 2 × 2 tables, shown in color on the slide, are called partial tables because they show part of the larger table. The whole table lists the percent of complaints for each combination of physician and nurse.

10 Table 1: Satisfaction Across Teams
In the partial table for George, the complaint rate was 11.81 percentage points higher when Jill was the nurse (22.92%) than when Jim was the nurse (11.11%). When Jill and George team up, they do worse than when Jim and George team up.

11 Preferential Dependence
When Smith was the doctor, the complaint rate was 2.8 percentage points higher when Jill was the nurse (2.80% versus 0.00%), so Jim and Smith are better as a team than Jill and Smith. The impact of the nurses seems to depend on which physician they are working with: the gap between Jill and Jim is 11.81 points with George but only 2.8 points with Smith. The idea that the difference between Jim and Jill depends on whom they team up with is called preferential dependence. In preferential dependence, the fixed level of one variable affects preferences for other variables. Violations of preferential independence are rare, but they do occur and point to an unusual data set. When this happens, a separate analysis is needed for each of the partial tables.
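The comparison within each partial table can be reproduced with a few lines of Python. This is a sketch added for illustration, not part of the original slides; the dictionary layout and the name complaint_rate are assumptions, and the count of 0 complaints for Smith working with Jim is inferred from the 0.00% rate in Table 1.

```python
# Counts from Table 1: (physician, nurse) -> (complained, did not complain).
# The 0 for Smith with Jim is inferred from the 0.00% complaint rate.
counts = {
    ("George", "Jim"): (53, 424),
    ("George", "Jill"): (11, 37),
    ("Smith", "Jim"): (0, 16),
    ("Smith", "Jill"): (4, 139),
}


def complaint_rate(yes, no):
    """Percent of patients who complained."""
    return 100.0 * yes / (yes + no)


# Hold the physician constant and compare the two nurses within that stratum.
partial_diffs = {}
for physician in ("George", "Smith"):
    jim = complaint_rate(*counts[(physician, "Jim")])
    jill = complaint_rate(*counts[(physician, "Jill")])
    partial_diffs[physician] = jill - jim
    print(f"{physician}: Jim {jim:.2f}%, Jill {jill:.2f}%, "
          f"Jill minus Jim = {jill - jim:.2f} points")
```

With the counts above, the difference is 11.81 points in George's partial table and 2.80 points in Smith's.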

12 Real Example Adding a disease reduces mortality rate
Examples of violations of preferential independence are given in the literature. While rare, they do occur.

13 Table 1: Satisfaction Across Teams
The two rows at the bottom of Table 1 display the marginal table. It results from summing the cell counts in Table 1 over physicians, thus combining the two partial tables. For example, we get the 15 in the marginal table by adding 11 and 4. Overall, 10.75% of patients seen by Jim and 7.85% of Jill's patients complained. If we ignore the physician working on the team, the complaint rate was 2.9 percentage points higher when Jim was the nurse. So overall Jim seems to have a higher percentage of complaints, even though Jim and Smith are the best team, with no complaints.
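The marginal table is just the partial tables added together. The sketch below, added for illustration and not taken from the slides, sums the Table 1 counts over physicians; the variable names are assumptions, and the 0 complaints for Smith with Jim is inferred from the 0.00% rate.

```python
# Counts from Table 1: (physician, nurse) -> (complained, did not complain).
# The 0 for Smith with Jim is inferred from the 0.00% complaint rate.
counts = {
    ("George", "Jim"): (53, 424),
    ("George", "Jill"): (11, 37),
    ("Smith", "Jim"): (0, 16),
    ("Smith", "Jill"): (4, 139),
}

# Sum over physicians, keeping one (yes, no) pair per nurse.
marginal = {}
for (physician, nurse), (yes, no) in counts.items():
    total_yes, total_no = marginal.get(nurse, (0, 0))
    marginal[nurse] = (total_yes + yes, total_no + no)

# Percent of each nurse's patients who complained, ignoring the physician.
rates = {nurse: 100.0 * yes / (yes + no) for nurse, (yes, no) in marginal.items()}

for nurse in ("Jim", "Jill"):
    yes, no = marginal[nurse]
    print(f"{nurse}: {yes} yes, {no} no, {rates[nurse]:.2f}% complained")
```

This reproduces the Total rows: 53 and 440 for Jim (10.75%), 15 and 176 for Jill (7.85%).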

14 Table 1: Satisfaction Across Teams
Simpson's Paradox. The fact that the association in the marginal table can have a different direction than the association in the subgroups is called Simpson's paradox. In Table 1, Jim's complaint rate is 2.8 percentage points lower than Jill's when both work with Smith, yet it is 2.9 points higher in the marginal table. The paradox can occur and should caution against blanket statements based on marginal tables without examining subgroups. The true impact is understood only in the stratified subgroups, where we can control for team differences.
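One way to screen for this reversal is to compare the sign of the difference inside every stratum with the sign in the marginal table. The following is a minimal sketch under that assumption, not a method given in the lecture; the helper names are hypothetical, and the 0 complaints for Smith with Jim is inferred from the 0.00% rate in Table 1.

```python
# Counts from Table 1: (physician, nurse) -> (complained, did not complain).
# The 0 for Smith with Jim is inferred from the 0.00% complaint rate.
counts = {
    ("George", "Jim"): (53, 424),
    ("George", "Jill"): (11, 37),
    ("Smith", "Jim"): (0, 16),
    ("Smith", "Jill"): (4, 139),
}


def complaint_rate(yes, no):
    """Percent of patients who complained."""
    return 100.0 * yes / (yes + no)


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def jim_minus_jill(cells):
    """Difference in complaint rates (Jim minus Jill) for {nurse: (yes, no)}."""
    return complaint_rate(*cells["Jim"]) - complaint_rate(*cells["Jill"])


# Build one partial table per physician and the marginal table over physicians.
strata = {}
marginal = {"Jim": (0, 0), "Jill": (0, 0)}
for (physician, nurse), (yes, no) in counts.items():
    strata.setdefault(physician, {})[nurse] = (yes, no)
    m_yes, m_no = marginal[nurse]
    marginal[nurse] = (m_yes + yes, m_no + no)

stratum_signs = {p: sign(jim_minus_jill(cells)) for p, cells in strata.items()}
marginal_sign = sign(jim_minus_jill(marginal))

# Flag a reversal when every stratum points the opposite way from the marginal table.
reversal = marginal_sign != 0 and all(
    s == -marginal_sign for s in stratum_signs.values()
)
print("Stratum signs:", stratum_signs)
print("Marginal sign:", marginal_sign)
print("Direction reverses:", reversal)
```

For Table 1, Jim has the lower complaint rate with both physicians but the higher rate in the marginal table, so the check reports a reversal.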

15 Stratification Compares impact in like situations
In stratified analysis, the impact of a variable is assessed by comparing its presence and absence in like situations. Then apples cancel apples and oranges cancel oranges, so the impact of the variable is assessed without the influence of covariates.

