1
Week Four
3
The basic decision is whether to use: New data, collected specifically for research purposes, or Existing data ◦ Records (e.g., patient charts) ◦ Historical data ◦ Existing data sets (secondary analysis)
4
Hospital records (e.g., nurses’ shift reports) School records (e.g., student absenteeism) Corporate records (e.g., health insurance choices) Letters, diaries, minutes of meetings, etc. Photographs
5
Self-reports Observation Biophysiologic measures
6
Structure Quantifiability Researcher obtrusiveness Objectivity
7
Data are collected with a formal instrument. ◦ Interview schedule Questions are prespecified but asked orally. Either face-to-face or by telephone ◦ Questionnaire Questions prespecified in written form, to be self-administered by respondents
8
Closed-ended (fixed alternative) questions ◦ e.g., “Within the past 6 months, were you ever a member of a fitness center or gym?” (yes/no) Open-ended questions ◦ e.g., “Why did you decide to join a fitness center or gym?”
9
Dichotomous questions Multiple-choice questions Cafeteria questions Rank-order questions Forced-choice questions Rating questions
10
Lower costs Possibility of anonymity, greater privacy Lack of interviewer bias
11
Higher response rates Appropriate for more diverse audiences Opportunities to clarify questions or to determine comprehension Opportunity to collect supplementary data through observation
12
Scales—used to make fine quantitative discriminations among people with different attitudes, perceptions, traits Likert scales—summated rating scales Semantic differential scales
13
Consist of several declarative statements (items) expressing viewpoints Responses are on an agree/disagree continuum (usually 5 or 7 response options). Responses to items are summed to compute a total scale score.
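As an illustration (not part of the original slides), a minimal Python sketch of summated Likert scoring; the item names, responses, and the reverse-scored item are hypothetical:

# Minimal sketch of summated Likert scoring (hypothetical data and item names).
# Each item is rated on a 5-point agree/disagree continuum (1 = strongly disagree,
# 5 = strongly agree); negatively worded items are reverse-scored before summing.

responses = {"item1": 4, "item2": 2, "item3": 5, "item4": 1}  # one respondent
reverse_scored = {"item4"}  # items worded in the opposite direction
n_options = 5

def total_score(responses, reverse_scored, n_options):
    total = 0
    for item, value in responses.items():
        if item in reverse_scored:
            value = (n_options + 1) - value   # e.g., 1 -> 5, 5 -> 1
        total += value
    return total

print(total_score(responses, reverse_scored, n_options))  # 4 + 2 + 5 + 5 = 16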
14
Require ratings of various concepts Rating scales involve bipolar adjective pairs, with 7-point ratings. Ratings for each dimension are summed to compute a total score for each concept.
16
Used to measure subjective experiences (e.g., pain, nausea) Measurements are on a straight line measuring 100 mm End points labeled as extreme limits of sensation
18
Biases reflecting the tendency of some people to respond to items in characteristic ways, independently of item content Examples: ◦ Social desirability response set bias ◦ Extreme response set ◦ Acquiescence response set (yea-sayers) ◦ Nay-sayers response set
19
Participants sort a deck of cards into piles according to specific criteria. Cards contain statements to be sorted on a bipolar continuum (e.g., most like me/least like me). Usually 50 to 100 cards; usually 9 or 11 piles
20
Brief descriptions of situations to which respondents are asked to react Descriptions are usually written “stories.” Respondents can be asked open-ended or closed-ended questions about their reactions. Aspects of the vignettes can be experimentally manipulated.
21
Strong on directness Allows access to information otherwise not available to researchers But can we be sure participants actually feel or act the way they say they do?
22
Activities and behavior Characteristics and conditions of individuals Skill attainment and performance Verbal and nonverbal communication Environmental characteristics
23
Time-sampling—sampling of time intervals for observation Examples: Random sampling of intervals of a given length Systematic sampling of intervals of a given length Event sampling—observation of integral events
24
Excellent method for capturing many clinical phenomena and behaviors Potential problem of reactivity when people are aware that they are being observed Risk of observational biases—factors that can interfere with objective observation
25
In vivo measurements ◦ Performed directly within or on living organisms (e.g., blood pressure measures) In vitro measurements ◦ Performed outside the organism’s body (e.g., urinalysis)
26
Strong on accuracy, objectivity, validity, and precision May be cost-effective for nurse researchers But caution may be required for their use, and advanced skills may be needed for interpretation.
28
The assignment of numbers to represent the amount of an attribute present in an object or person, using specific rules Advantages: ◦ Removes guesswork ◦ Provides precise information ◦ Less vague than words
29
There are four levels (classes) of measurement: ◦ Nominal (assigning numbers to classify characteristics into categories) ◦ Ordinal (ranking objects based on their relative standing on an attribute) ◦ Interval (objects ordered on a scale that has equal distances between points on the scale) ◦ Ratio (equal distances between score units; there is a rational, meaningful zero) A variable’s level of measurement determines what mathematic operations can be performed in a statistical analysis.
30
Obtained Score = True score ± Error ◦ Obtained score: An actual data value for a participant (e.g., anxiety scale score) ◦ True score: The score that would be obtained with an infallible measure ◦ Error: The error of measurement, caused by factors that distort measurement
31
Situational contaminants Transitory personal factors (e.g., fatigue) Response-set biases Administration variations Item sampling
32
A psychometric assessment is an evaluation of the quality of a measuring instrument. Key criteria in a psychometric assessment: ◦ Reliability ◦ Validity
33
The consistency and accuracy with which an instrument measures the target attribute Reliability assessments involve computing a reliability coefficient. ◦ Reliability coefficients can range from .00 to 1.00. ◦ Coefficients below .70 are considered unsatisfactory. ◦ Coefficients of .80 or higher are desirable.
34
Stability Internal consistency Equivalence
35
The extent to which scores are similar on two separate administrations of an instrument Evaluated by test–retest reliability ◦ Requires participants to complete the same instrument on two occasions ◦ Appropriate for relatively enduring attributes (e.g., creativity)
36
The extent to which all the items on an instrument are measuring the same unitary attribute Evaluated by administering instrument on one occasion Appropriate for most multi-item instruments The most widely used approach to assessing reliability Assessed by computing coefficient alpha (Cronbach’s alpha) Alphas ≥ .80 are highly desirable.
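As an illustration (not from the slides), a short Python sketch of coefficient alpha computed from a hypothetical matrix of item responses (rows = respondents, columns = items), using the standard formula based on item and total-score variances:

# Sketch of computing coefficient alpha (Cronbach's alpha) with NumPy; the
# response matrix is hypothetical and kept small for illustration.
import numpy as np

items = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 5, 4],
])

def cronbach_alpha(items):
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scale scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(round(cronbach_alpha(items), 2))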
37
Low reliability can undermine adequate testing of hypotheses. Reliability estimates vary depending on procedure used to obtain them. Reliability is lower in homogeneous than heterogeneous samples. Reliability is lower in shorter than longer multi-item scales.
38
The degree to which an instrument measures what it is supposed to measure Four aspects of validity: ◦ Face validity ◦ Content validity ◦ Criterion-related validity ◦ Construct validity
39
Refers to whether the instrument looks as though it is an appropriate measure of the construct Based on judgment; no objective criteria for assessment
40
The degree to which an instrument has an adequate sample of items for the construct being measured Evaluated by expert evaluation, often via a quantitative measure—the content validity index (CVI)
41
The degree to which the instrument is related to an external criterion Validity coefficient is calculated by analyzing the relationship between scores on the instrument and the criterion. Two types: Predictive validity: the instrument’s ability to distinguish people whose performance differs on a future criterion Concurrent validity: the instrument’s ability to distinguish individuals who differ on a present criterion
42
Concerned with these questions: ◦ What is this instrument really measuring? ◦ Does it adequately measure the construct of interest?
43
Known-groups technique Testing relationships based on theoretical predictions Factor analysis
45
Descriptive statistics ◦ Used to describe and synthesize data Inferential statistics ◦ Used to make inferences about the population based on sample data
46
Parameter ◦ A descriptor for a population (e.g., the average age of menses for Canadian females) Statistic ◦ A descriptor for a sample (e.g., the average age of menses for female students at McGill University)
47
A systematic arrangement of numeric values on a variable from lowest to highest, and a count of the number of times (and/or percentage) each value was obtained Frequency distributions can be described in terms of: ◦ Shape ◦ Central tendency ◦ Variability Can be presented in a table (Ns and percentages) or graphically (e.g., frequency polygons)
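As an illustration (not part of the slides), a minimal Python sketch of a frequency distribution (values, counts, and percentages) for a small set of hypothetical scores:

# Sketch of a simple frequency distribution using the standard library.
from collections import Counter

scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]
counts = Counter(scores)
n = len(scores)

print("Value  N  %")
for value in sorted(counts):
    pct = 100 * counts[value] / n
    print(f"{value:>5}  {counts[value]}  {pct:.1f}")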
48
Symmetry ◦ Symmetric ◦ Skewed (asymmetric) Positive skew (long tail points to the right) Negative skew (long tail points to the left)
51
Peakedness (how sharp the peak is) Modality (number of peaks) ◦ Unimodal (1 peak) ◦ Bimodal (2 peaks) ◦ Multimodal (2+ peaks)
52
Characteristics: ◦ Symmetric ◦ Unimodal ◦ Not too peaked, not too flat More popularly referred to as a bell-shaped curve Important distribution in inferential statistics
53
Index of “typicalness” of a set of scores, drawn from the center of the distribution
Mode—the most frequently occurring score in a distribution
◦ Ex: 2, 3, 3, 3, 4, 5, 6, 7, 8, 9; Mode = 3
Median—the point in a distribution above which and below which 50% of cases fall
◦ Ex: 2, 3, 3, 3, 4 | 5, 6, 7, 8, 9; Median = 4.5
Mean—equals the sum of all scores divided by the total number of scores
◦ Ex: 2, 3, 3, 3, 4, 5, 6, 7, 8, 9; Mean = 5.0
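These example values can be reproduced with Python's statistics module (a small sketch, not part of the original slides):

# Mode, median, and mean of the slide's example scores.
import statistics

scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]

print(statistics.mode(scores))       # 3
print(statistics.median(scores))     # 4.5 (midpoint of 4 and 5)
print(sum(scores) / len(scores))     # 5.0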
54
Mode, useful mainly as gross descriptor, especially of nominal measures Median, useful mainly as descriptor of typical value when distribution is skewed (e.g., household income) Mean, most stable and widely used indicator of central tendency
55
The degree to which scores in a distribution are spread out or dispersed Homogeneity—little variability Heterogeneity—great variability
57
Range: highest value minus lowest value Standard deviation (SD): indicates, on average, how much scores deviate from the mean of the distribution
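As an illustration, a short Python sketch computing both indexes for the same hypothetical scores; note that statistics.stdev uses the sample formula (n − 1 denominator):

# Range and sample standard deviation for a small set of scores.
import statistics

scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]

value_range = max(scores) - min(scores)   # 9 - 2 = 7
sd = statistics.stdev(scores)             # typical spread around the mean of 5.0

print(value_range, round(sd, 2))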
59
Used for describing the relationship between two variables Two common approaches: ◦ Contingency tables (Crosstabs) ◦ Correlation coefficients
60
A two-dimensional frequency distribution; frequencies of two variables are cross-tabulated “Cells” at intersection of rows and columns display counts and percentages Variables usually nominal or ordinal
61
Indicate direction and magnitude of relationship between two variables The most widely used correlation coefficient is Pearson’s r. Pearson’s r is used when both variables are interval- or ratio-level measures.
62
Correlation coefficients can range from -1.00 to +1.00 ◦ Negative relationship (0.00 to -1.00)—one variable increases in value as the other decreases, e.g., amount of exercise and weight ◦ Positive relationship (0.00 to +1.00)—the two variables increase or decrease together, e.g., calorie consumption and weight
63
The greater the absolute value of the coefficient, the stronger the relationship: Ex: r = -.45 is stronger than r = +.40 With multiple variables, a correlation matrix can be displayed to show all pairs of correlations.
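As an illustration (not from the slides), a Python sketch of Pearson's r and a correlation matrix with NumPy; the exercise, weight, and calorie values are hypothetical, chosen only to show direction and strength:

# Pearson's r for one pair of interval/ratio variables, then a correlation
# matrix for all pairs of three variables (hypothetical data).
import numpy as np

exercise_hours = np.array([1, 2, 3, 4, 5, 6])
weight_kg      = np.array([92, 88, 85, 80, 78, 74])
calories       = np.array([1800, 2000, 2200, 2100, 2500, 2600])

r = np.corrcoef(exercise_hours, weight_kg)[0, 1]
print(round(r, 2))   # strong negative relationship (close to -1.00)

matrix = np.corrcoef([exercise_hours, weight_kg, calories])
print(matrix.round(2))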
64
Used to make objective decisions about population parameters using sample data Based on laws of probability Uses the concept of theoretical distributions ◦ e.g., the sampling distribution of the mean
65
A theoretical distribution of means for an infinite number of samples drawn from the same population Is normally distributed (at least approximately, for reasonably large samples, per the central limit theorem) Its mean equals the population mean. Its standard deviation is called the standard error of the mean (SEM). SEM is estimated from a sample SD and the sample size.
66
Parameter estimation Hypothesis testing (more common among nurse researchers than among medical researchers)
67
CIs indicate the upper and lower confidence limits and the probability that the population value is between those limits. ◦ For example, a 95% CI of 40–50 for a sample mean of 45 indicates there is a 95% probability that the population mean is between 40 and 50.
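As an illustration (an assumption about tooling, not part of the slides), a Python sketch of a 95% CI for a sample mean using the t distribution in SciPy, with hypothetical data:

# 95% confidence interval for a mean, based on the SEM and the t distribution.
import numpy as np
from scipy import stats

sample = np.array([42, 47, 44, 49, 41, 46, 45, 43, 48, 45])

mean = sample.mean()
sem = stats.sem(sample)   # standard error of the mean (sample SD / sqrt(n))
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)

print(round(mean, 1), (round(ci_low, 1), round(ci_high, 1)))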
68
Based on rules of negative inference: research hypotheses are supported if null hypotheses can be rejected. Involves statistical decision-making to either: ◦ accept (retain) the null hypothesis or ◦ reject the null hypothesis Researchers compute a test statistic with their data and then determine whether the statistic falls in the critical (rejection) region of the relevant theoretical distribution—that is, beyond the critical value. ◦ Values in the critical region indicate that the null hypothesis is improbable, at a specified probability level.
69
If the value of the test statistic indicates that the null hypothesis is improbable, then the result is statistically significant. A nonsignificant result means that any observed difference or relationship could have happened by chance. Statistical decisions are either correct or incorrect.
70
Type I error: rejection of a null hypothesis when it should not be rejected; a false-positive result ◦ Risk of error is controlled by the level of significance (alpha), e.g., α = .05 or .01. Type II error: failure to reject a null hypothesis when it should be rejected; a false-negative result ◦ The risk of this error is beta (β). ◦ Power is the ability of a test to detect true relationships; power = 1 – β. ◦ By convention, power should be at least .80. ◦ Larger samples = greater power
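As an illustration, a Python sketch of a power analysis using statsmodels (one possible tool; the slides do not prescribe software): the sample size per group needed to detect a moderate effect (d = .50) at α = .05 with power = .80:

# Power analysis for an independent-groups t-test.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(round(n_per_group))   # roughly 64 participants per group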
72
Parametric Statistics: ◦ Use involves estimation of a parameter; assumes variables are normally distributed in the population; measurements are on interval/ratio scale Nonparametric Statistics: ◦ Use does not involve estimation of a parameter; measurements typically on nominal or ordinal scale; doesn’t assume normal distribution in the population
73
Select an appropriate test statistic. Establish the significance criterion (e.g., α = .05). Compute the test statistic with the actual data. Calculate degrees of freedom (df) for the test statistic. Obtain a critical value for the statistical test (e.g., from a table). Compare the computed test statistic to the tabled value. Make the decision to accept (retain) or reject the null hypothesis.
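As an illustration (not from the slides), a Python sketch that walks through these steps for a two-tailed, one-sample t-test on hypothetical scores, comparing the computed statistic to the critical value:

# Does the sample mean differ from a hypothesized population mean of 50?
import numpy as np
from scipy import stats

scores = np.array([52, 55, 49, 58, 53, 56, 51, 54])
alpha = 0.05                                             # significance criterion

t_stat, p_value = stats.ttest_1samp(scores, popmean=50)  # compute test statistic
df = len(scores) - 1                                      # degrees of freedom
critical = stats.t.ppf(1 - alpha / 2, df)                 # critical value (two-tailed)

# Compare and decide: reject the null if the statistic exceeds the critical value.
print(round(t_stat, 2), round(critical, 2), abs(t_stat) > critical)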
74
t-Test Analysis of variance (ANOVA) Pearson’s r Chi-squared test
75
Tests the difference between two means t-test for independent groups: between-subjects test ◦ e.g., means for men vs. women t-test for dependent (paired) groups: within-subjects test ◦ e.g., means for patients before and after surgery
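As an illustration, a Python sketch of both t-test variants with SciPy; the scores are hypothetical:

# Independent groups (between subjects): e.g., men vs. women
import numpy as np
from scipy import stats

men   = np.array([6, 7, 5, 8, 7])
women = np.array([8, 9, 7, 9, 8])
print(stats.ttest_ind(men, women))

# Dependent (paired) groups (within subjects): e.g., the same patients before and after surgery
before = np.array([7, 6, 8, 5, 7])
after  = np.array([5, 5, 6, 4, 6])
print(stats.ttest_rel(before, after))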
76
Tests the difference between more than 2 means ◦ One-way ANOVA (e.g., 3 groups) ◦ Multifactor (e.g., two-way) ANOVA ◦ Repeated measures ANOVA (RM-ANOVA): within subjects
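As an illustration, a Python sketch of a one-way ANOVA across three hypothetical groups with SciPy:

# One-way ANOVA: do the three group means differ?
from scipy import stats

group1 = [4, 5, 6, 5, 4]
group2 = [6, 7, 7, 8, 6]
group3 = [9, 8, 9, 10, 9]

f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(round(f_stat, 2), round(p_value, 4))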
77
Tests the difference in proportions in categories within a contingency table Compares observed frequencies in each cell with expected frequencies—the frequencies expected if there were no relationship
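As an illustration, a Python sketch of a chi-square test on a hypothetical 2 × 2 contingency table with SciPy; the expected frequencies it returns are those that would occur if there were no relationship:

# Chi-square test of the relationship between two nominal variables
# (hypothetical counts, e.g., gym membership by sex).
import numpy as np
from scipy import stats

observed = np.array([[30, 20],    # rows: categories of one variable
                     [15, 35]])   # columns: categories of the other variable

chi2, p_value, df, expected = stats.chi2_contingency(observed)
print(round(chi2, 2), round(p_value, 4))
print(expected)   # expected cell frequencies under no relationship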
78
Pearson’s r is both a descriptive and an inferential statistic. Tests that the relationship between two variables is not zero.
79
Effect size is an important concept in power analysis. Effect size indexes summarize the magnitude of the effect of the independent variable on the dependent variable. In a comparison of two group means (i.e., in a t-test situation), the effect size index is d. By convention: d ≤ .20, small effect d = .50, moderate effect d ≥ .80, large effect
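As an illustration, a Python sketch of the effect size index d for two hypothetical groups, using the pooled standard deviation:

# Cohen's d for a two-group mean comparison (hypothetical data).
import numpy as np

group1 = np.array([6, 7, 5, 8, 7, 6])
group2 = np.array([4, 5, 5, 6, 4, 5])

def cohens_d(a, b):
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

print(round(cohens_d(group1, group2), 2))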
80
Statistical procedures for analyzing relationships among 3 or more variables Two commonly used procedures in nursing research: ◦ Multiple regression ◦ Analysis of covariance (ANCOVA)
81
The correlation index for a dependent variable and 2+ independent (predictor) variables: R Does not have negative values: shows strength of relationships, not direction R² is an estimate of the proportion of variability in the dependent variable accounted for by all predictors.
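As an illustration, a Python sketch of multiple regression with two hypothetical predictors, using scikit-learn (one possible tool) to obtain R²:

# Multiple regression: proportion of variance in the dependent variable
# accounted for by two predictors (hypothetical data).
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[2, 30], [4, 25], [6, 20], [8, 15], [10, 12], [12, 10]])  # predictors
y = np.array([60, 58, 52, 50, 44, 40])                                  # dependent variable

model = LinearRegression().fit(X, y)
r_squared = model.score(X, y)   # R-squared: ranges from 0 to 1, no negative values
print(round(r_squared, 2))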
82
Extends ANOVA by removing the effect of confounding variables (covariates) before testing whether mean group differences are statistically significant Levels of measurement of variables: ◦ Dependent variable is continuous—ratio or interval level ◦ Independent variable is nominal (group status) ◦ Covariates are continuous or dichotomous
83
Used to reduce a large set of variables into a smaller set of underlying dimensions (factors) Used primarily in developing scales and complex instruments
84
The extension of ANOVA to more than one dependent variable Abbreviated as MANOVA Can be used with covariates: Multivariate analysis of covariance (MANCOVA)