GS/PPAL Research Methods and Information Systems

GS/PPAL 6200 3.00 Research Methods and Information Systems
A QUANTITATIVE RESEARCH PROJECT - DATA COLLECTION, DATA DESCRIPTION, DATA ANALYSIS

A Quantitative Research Project: Generic Overview
Research Topic: What is the main issue?
Research Questions: Descriptive? Relational? Causal? (How will we measure key variables?)
Quantitative Research Design: Experimental? Cross-sectional? Longitudinal?
Data Collection Method: Survey? Secondary Data? Census or Sample?
Data Analysis: Descriptive Statistics? Regression Analysis?
Experimental: Can we create a control group? What would that look like?

A Quantitative Research Project: An Example
Research Topic: Academic Performance
Research Questions: How well do graduating students perform academically? What explains that performance?
Measure “academic performance” by graduating CGPA
Research Design: Cross-sectional analysis of graduating students in a given year
Data Collection: Survey (a random sample of) students graduating in 2014
Data Description: Describe the data with basic statistics
Data Analysis: Reasons for attending university and performance; Total hours studied and CGPA

DATA COLLECTION INSTRUMENT: Survey Questionnaire
To obtain the data from graduating students in 2014, we need a survey instrument and a code book
To develop the survey questions, we might first:
Conduct a focus group with a small convenience sample to get a better sense of the key factors influencing academic performance
Conduct a small pilot study to test our survey instrument and to practice the analysis

Whom to Survey? Census or Sample?
What is the total theoretical population in which one is interested?
What is the accessible population?
Is it feasible or practicable to conduct a census on the accessible population?
A census would tell us the actual information for all graduating students in a given year, which is itself a sample of all students graduating for all time
A sample of all the students graduating in a given year is then a sample of a sample

DATA COLLECTION PROCESS - OVERVIEW
WHAT: We will measure “academic performance” by the student’s CGPA on graduation
WHO: It is not practicable to survey all graduating students, so we will choose a sample of students
HOW: We want to conduct a statistical analysis, so we will collect data from a sample of students
WHY: To test a hypothesis about the factor(s) that influence CGPA
but…
We know that the CGPA mean we observe in our sample will be imprecise as a measure of the true mean
More accurate information is costly
So… we choose our sample size guided by this tradeoff

Strategies for selecting cases to study (i.e., a sample)
Probability sampling:
Simple Random Sampling: easy to do and explain, but not the most statistically efficient, and may not represent sub-groups well
Stratified Random Sampling: take a simple random sample from each subgroup of the population
Systematic Random Sampling: take every kth unit, where k = N/n (N = population size, n = sample size)
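The three probability sampling strategies above can be sketched in a few lines of Python. This is a minimal illustration, not a procedure taken from the slides: the function names, the `strata` dictionary, and the use of a random starting offset for systematic sampling are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n units without replacement; every unit is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_random_sample(strata, fraction, seed=0):
    """Take a simple random sample of the same fraction from each subgroup.

    `strata` is a hypothetical mapping: subgroup name -> list of units.
    """
    rng = random.Random(seed)
    sample = []
    for units in strata.values():
        k = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, k))
    return sample

def systematic_sample(population, n, seed=0):
    """Take every k-th unit with k = N // n, from a random starting offset."""
    rng = random.Random(seed)
    k = len(population) // n          # sampling interval (requires n <= N)
    start = rng.randrange(k)          # random start within the first interval
    return population[start::k][:n]

# Hypothetical population of 1,000 unit IDs split into two subgroups.
population = list(range(1000))
strata = {"group_a": population[:600], "group_b": population[600:]}

print(len(simple_random_sample(population, 100)))
print(len(stratified_random_sample(strata, 0.10)))
print(systematic_sample(population, 100)[:5])
```

Stratified sampling guarantees each subgroup appears in proportion to its size, which is the weakness of simple random sampling noted above.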

Strategies for selecting cases to study (i.e., sample)
Non-probability sampling:
Convenience sampling: easily accessible but not necessarily representative of the population
Purposive sampling: reaches a target population
Expert sampling: convenes a panel of experts
Snowball sampling: first respondents recommend others to be included

How do we Sample? Whom do we Sample?
If the relevant population is all York University students graduating in 2014 with undergraduate degrees, then the census population is approximately 10,000 students
To collect information from all 10,000 students is not practicable; therefore consider sampling
Sampling Technique: Probability Sampling - Simple Random Sampling
Sampling Frame: How to select participants?
Once Ethics Approval obtained… University database contains student contact information and CGPAs of all graduating students; a random number generator can perform the randomization for selection; …

How many do we sample? Sample Size
For inferential statistics, “small” is n < 30
The decision is guided by two competing goals: maximize the probability that we obtain correct information on the relevant variables, and minimize the cost of our study

How will we know when we know?
Understanding the information we have is complicated by the uncertainty inherent in the data we collect
Construct Validity / Measurement Issues:
Is CGPA a good indicator of academic performance?
Is the mean (average) CGPA we observe for the census population equal to the “true” mean?
Is the mean CGPA we observe for a sample equal to the census population mean?

The Challenge
We can never know if we are observing the “true” population mean (average) of CGPA, since any observed population mean may deviate from it by plus or minus σ (a “standard deviation”)
Any census of a graduating class will only be a sample of the “true” population of all graduating classes
A sample of the census population, as a sample of the sample, introduces more uncertainty

Uncertainty complicates “knowing”
Uncertainty Source #1: If we are seeking to explain the key factors determining the CGPA of graduates, we have to account for the fact that the observed CGPA might deviate from the true population mean by an amount sigma (σ)
Uncertainty Source #2: If it is infeasible to conduct a census of all graduating students in even one year, and all we can do is sample the sample, then we have additional uncertainty related to the size of the sample

Uncertainty from Sampling
We know there is one inescapable source of uncertainty (Uncertainty Source #1)
The sampling error (Uncertainty Source #2) complicates this uncertainty… but in a predictable way
We know the larger (smaller) the sample size, the smaller (greater) the uncertainty from any sampling error, IF we use a simple random sampling method
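The predictable link between sample size and sampling error can be illustrated with a small simulation. The sketch below assumes a simulated census of 10,000 CGPA-like values (generated numbers, not real student data, on a hypothetical grade scale) and measures how much the sample mean varies across repeated simple random samples of different sizes. The function name and all parameter values are chosen only for illustration.

```python
import random
import statistics

def spread_of_sample_means(population, n, repeats=500, seed=1):
    """Standard deviation of the sample mean across repeated simple random samples."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

# Simulated stand-in for a census population (assumption: NOT real CGPA data).
rng = random.Random(42)
census = [rng.gauss(6.0, 1.2) for _ in range(10_000)]

for n in (25, 100, 400):
    # The spread should shrink roughly in proportion to 1 / sqrt(n).
    print(n, round(spread_of_sample_means(census, n), 3))
```

Quadrupling the sample size roughly halves the spread of the sample mean, which is why extra precision becomes increasingly costly.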

Some Vocabulary for this Uncertainty
We can never know if we are observing the true value of CGPA or some value plus (minus) deviations due to some unexplainable shock (Uncertainty Source #1), so we talk about a “Confidence Interval”
We know that sampling error (Uncertainty Source #2) is possible, so we talk about our analysis of the sample in terms of its “Margin of Error”

What we know in the face of this Uncertainty
For a 95% Confidence Interval, we know our census population mean would be close to the true population mean 95% of the time
and… we can have the confidence that the interval formed by the census population mean plus or minus a random error will contain the true population mean 95% of the time

What we know (cont’d)
When sampling error is possible, and we have only sample statistics to estimate census population values, we must adjust our understanding of the 95% CI for this additional uncertainty
If we have a Margin of Error of 10% for a 95% Confidence Interval…
Then in repeated random sampling of the census population, about 95% of the sample estimates will fall within 10 percentage points of the census population value; equivalently, about 95% of the intervals “sample estimate plus or minus 10 points” will contain it
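One way to build intuition for the “95% of the time” statement is to simulate repeated random sampling from a known census population and count how often the sample-based interval contains the census mean. The sketch below is an illustration under stated assumptions: the census values are simulated (not real CGPAs), the interval uses the normal approximation (sample mean plus or minus 1.96 standard errors), and the function name is hypothetical.

```python
import math
import random
import statistics

def ci_coverage(population, n, repeats=1000, z=1.96, seed=7):
    """Share of repeated samples whose 95% interval contains the census mean."""
    rng = random.Random(seed)
    target = statistics.fmean(population)
    hits = 0
    for _ in range(repeats):
        sample = rng.sample(population, n)
        mean = statistics.fmean(sample)
        # Margin of error for this sample: z times the estimated standard error.
        margin = z * statistics.stdev(sample) / math.sqrt(n)
        if mean - margin <= target <= mean + margin:
            hits += 1
    return hits / repeats

# Simulated census population (assumption: generated values, not real data).
rng = random.Random(2014)
census = [rng.gauss(6.0, 1.2) for _ in range(10_000)]

print(ci_coverage(census, n=100))  # expected to land near 0.95
```

Any single interval either contains the census mean or does not; the 95% describes the long-run behaviour of the procedure across many samples.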

Sample Size Guide (meaningful for studies adopting random sampling)
For a 95% confidence interval:
Margin of Error of 5%: a sample of n = 400 is needed
Margin of Error of 10%: n = 100
Margin of Error of 3%: n = 1000
Margin of Error of 1%: n = 10,000
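The figures in this guide are close to the common rule of thumb n ≈ 1/E², where E is the margin of error expressed as a proportion. The slides do not say which formula produced the guide, so treating it as this rule of thumb is an assumption. The sketch below compares the rule with the standard large-population formula for estimating a proportion, n = z²·p(1−p)/E², using the most conservative case p = 0.5 and z = 1.96. Function names are illustrative.

```python
import math

def sample_size_rule_of_thumb(margin):
    """Approximate n for a 95% CI using n = 1 / E^2 (assumed rule of thumb)."""
    return round(1 / margin ** 2)

def sample_size_proportion(margin, z=1.96, p=0.5):
    """n = z^2 * p * (1 - p) / E^2, rounded up; assumes a large population
    and simple random sampling."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

for margin in (0.10, 0.05, 0.03, 0.01):
    print(f"E = {margin:.0%}: rule of thumb n = {sample_size_rule_of_thumb(margin)}, "
          f"formula n = {sample_size_proportion(margin)}")
```

Under these assumptions the 3% row of the guide (n = 1000) is a rounded value: the rule of thumb gives about 1,111 and the formula gives 1,068.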
