1
Data Collection
2
What is Data?
Data is facts and statistics that are collected together.
We use data to gather information for reference or analysis. We translate the information we find into a form that is easier to understand, using charts and graphs.
3
Types of Data
Numerical data
Categorical data
4
Numerical Data
Numerical data is quantitative.
This means it can be measured using numbers, and these numbers can be placed in ascending or descending order. We use scatter plots and line graphs to represent numerical data. There are two types of numerical data: discrete and continuous.
5
Discrete Numerical Data
Discrete means the data can only take distinct, separate values, usually whole numbers. Examples of discrete numerical data include age in whole years, number of kittens, and number of people.
6
Continuous Numerical Data
Continuous means the numbers used to measure the data can be any number, including decimals. Examples of continuous data include temperature, time, and height.
7
Categorical Data
Categorical data is data that can be sorted into groups or categories. Categorical data is qualitative, meaning it describes something. We use bar graphs and pie charts to display categorical data. There are two types of categorical data: nominal and ordinal.
9
Nominal Categorical Data
Nominal data can be counted but cannot be put in a meaningful ascending or descending order. Nominal data makes sense regardless of the order in which it is presented. Examples of nominal data include gender, eye colour, and hair colour.
10
Ordinal Categorical Data
Values or observations that are ordinal can be ranked or have a scale attached. You can count and order ordinal data, but it cannot be measured like numerical data. Examples of ordinal data include house numbers, dates, and swimming level.
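The difference between nominal and ordinal data can be illustrated with a short Python sketch. The eye colours, the swimming level names, and their ranking below are made up for illustration; the point is only that nominal values can be counted, while ordinal values can also be sorted by rank.

```python
from collections import Counter

# Nominal data: categories can be counted, but have no natural order.
eye_colours = ["brown", "blue", "brown", "green", "blue", "brown"]
eye_colour_counts = Counter(eye_colours)

# Ordinal data: categories can be counted AND ranked.
# The level names and their order are assumptions made for this example.
LEVEL_ORDER = ["beginner", "intermediate", "advanced"]
swimming_levels = ["advanced", "beginner", "intermediate", "beginner"]

# Sort by position in the ranking, not alphabetically.
sorted_levels = sorted(swimming_levels, key=LEVEL_ORDER.index)

print(eye_colour_counts)
print(sorted_levels)
```

Sorting the eye colours alphabetically would be possible, but the resulting order would carry no meaning, which is what makes that data nominal.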
11
Data Collection
Data collection is separated into two types: primary data and secondary data.
Primary data is collected first hand.
Secondary data is data that was collected by somebody else.
12
Primary Data - Examples
Surveys
Focus groups
Questionnaires
Personal interviews
Experiments and observational studies
13
Primary Data - Limitations
Do you have the time and money for:
Designing your collection instrument?
Selecting your population or sample?
Pretesting or piloting the instrument to work out sources of bias?
Administering the instrument?
Entering and collating the data?
Other limitations:
Uniqueness: the results may not be comparable to other populations
Researcher error
Sample bias
Other confounding factors
14
Secondary Data – Examples of Sources
County health departments
Vital statistics: birth and death certificates
Hospital, clinic, and school nurse records
City and county governments
Surveillance data from state government programs
Federal agency statistics: Census, NIH, etc.
15
Secondary Data – Limitations
When was it collected? For how long?
It may be out of date for what you want to analyze.
It may not have been collected for a long enough time.
Is the data set complete?
There may be missing information on some observations.
Unless such missing information is caught and corrected for, the analysis will be biased.
Is the data consistent and reliable?
Did variables drop out over time?
Did variables change in definition over time? For example, number of years of education versus highest degree obtained.
16
Secondary Data – Advantages
No need to reinvent the wheel. If someone has already collected the data, take advantage of it.
It will save you money. Even if you have to pay for access, it is often cheaper than collecting your own data. (More on this later.)
It will save you time. Primary data collection is very time consuming. (More on this later, too!)
It may be very accurate. Especially when a government agency has collected the data, incredible amounts of time and money went into it. It's probably highly accurate.
17
Data Collection
When collecting data from a group, we can do it in two ways: as observational data or as experimental data.
18
Observational Data
Observational data is collected by grouping people into existing categories and observing how something affects them, without imposing any treatment ourselves. An example of observational data collection would be to separate a group into adults and children and compare the effects of sunlight on them.
19
Experimental Data
Experimental data is collected by creating our own groups and imposing our own treatment on the groups to see the effects. An example of experimental data collection would be administering a placebo to one group and the real drug to another.
20
Data Collection
We use data collection to obtain information on a smaller group and extend it to a larger population. The most important thing to remember is that the group we select must represent the population as a whole. It is very difficult to ensure this happens.
21
Population vs Sample
Population: the entire group being studied. Example: How many families in Canada have internet?
Sample: the part of the population that is actually studied. Example: We would not be able to ask every family in Canada if they have internet, but we could select smaller groups from each province and territory and extend the results to the entire country.
We select a sample from the population so that it is easier to get the information we need.
We use various sampling techniques to select our sample. Example: Our survey would not be very valid if we selected only families in southern parts of Canada, where internet is more easily accessible.
22
Characteristics of a Good Sample
Each person must have an equal chance of being selected into the sample. The sample must be large enough to represent the population. We use various sampling techniques to ensure this happens.
23
Simple Random Sample
Every member of the population has an equal chance of being picked. Example: Putting names in a hat and drawing at random.
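Drawing names from a hat can be sketched in Python with the standard library's random.sample. The list of 100 people, the function name simple_random_sample, and the seed are assumptions made for illustration, not part of the original slides.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct members; each member is equally likely to be picked."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, sample_size)

# Hypothetical population of 100 names "in the hat".
names_in_hat = [f"Person {i}" for i in range(1, 101)]
drawn_names = simple_random_sample(names_in_hat, 10, seed=1)
print(drawn_names)
```

Because random.sample draws without replacement, nobody can be picked twice, just as a name drawn from the hat is not put back.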
24
Systematic Random Sample
Going through a population sequentially and selecting members at even intervals. Example: Going through a phone book and selecting every 50th person.
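A systematic sample can be sketched with Python list slicing. The phone book here is a stand-in list of 1000 numbered entries, and choosing a random starting point within the first interval is a common convention assumed for this example rather than something stated in the slides.

```python
import random

def systematic_sample(population, interval, seed=None):
    """Pick a random start within the first interval, then take every interval-th member."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # index from 0 to interval - 1
    return population[start::interval]

# Hypothetical phone book with 1000 entries, numbered 1 to 1000.
phone_book = list(range(1, 1001))
every_50th = systematic_sample(phone_book, 50, seed=2)
print(every_50th)
```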
25
Stratified Sample
A stratum is a group of subjects that share a common characteristic. A stratified sample keeps each stratum in the same proportion as in the population. Example: If the population has both men and women, you ensure both are included in the sample in proportion to their share of the population.
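A proportionate stratified sample can be sketched as follows. The population of 60 "M" and 40 "F" records, the 10% sampling fraction, and the function name stratified_sample are all invented for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Sample the same fraction from every stratum so proportions are preserved."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)  # group members by stratum
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)  # stratum's share of the sample
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 people labelled "M" and 40 labelled "F".
people = [("M", i) for i in range(60)] + [("F", i) for i in range(40)]
stratified = stratified_sample(people, lambda person: person[0], 0.1, seed=3)
print(stratified)
```

With these numbers the sample contains 6 "M" and 4 "F" records, matching the 60/40 split of the population.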
26
Cluster Sample
One representative group of the population is chosen at random, and everyone in that group is surveyed. Example: Picking one floor of an office building and surveying everyone on it.
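The office building example can be sketched in Python. The floors and names below are hypothetical; the sketch picks one cluster at random and surveys every member of it.

```python
import random

def cluster_sample(clusters, seed=None):
    """Choose one cluster at random and return it with all of its members."""
    rng = random.Random(seed)
    chosen = rng.choice(sorted(clusters))  # sorted() gives a stable list of cluster names
    return chosen, clusters[chosen]

# Hypothetical office building: each floor is one cluster.
floors = {
    "Floor 1": ["Ana", "Ben"],
    "Floor 2": ["Cho", "Dev", "Eli"],
    "Floor 3": ["Fay"],
}
chosen_floor, surveyed = cluster_sample(floors, seed=4)
print(chosen_floor, surveyed)
```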
27
Multi-Stage Sampling
Using a combination of stages to obtain the sample.
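One possible combination of stages, shown here only as a sketch, is to pick a few clusters at random and then take a simple random sample inside each chosen cluster. The schools, student labels, and function name two_stage_sample are assumptions made for this example.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: choose clusters at random. Stage 2: randomly sample within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen:
        members = clusters[name]
        k = min(n_per_cluster, len(members))  # never ask for more than the cluster holds
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 5 schools with 30 students each.
schools = {
    f"School {s}": [f"{s}-student-{i}" for i in range(30)]
    for s in "ABCDE"
}
picked = two_stage_sample(schools, n_clusters=2, n_per_cluster=5, seed=5)
print(picked)
```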
28
Convenience Sample
A sampling technique based on how easy responses are to obtain. Example: Surveying people stranded at an airport during a snowstorm about air travel.
29
Voluntary Response Sampling
Inviting subjects to voluntarily be a part of the sample. Examples: Receiving a survey in the mail and being asked to complete it, or random phone surveys from businesses.
30
Problems with Data Collection
Questions must be simple, clear, specific, ethical, and free from bias. They must allow for an honest response and not infringe on anyone's privacy.
Questions must not contain slang, abbreviations, negatives, leading wording, or insensitive language.
Good surveys are often anonymous and ask the subject to select from a list of possible responses.
Survey bias can be unintentional, but it can cause the data collected to be invalid. There are many different types of bias.
31
Sampling Bias
The chosen sample does not accurately reflect the population. Example: Asking only basketball players about issues with the math curriculum.
32
Non-Response Bias
Particular groups are under-represented in the sample because they choose not to participate. When people selected for the survey don't respond, the surveyor is forced to draw their own conclusions about the sample.
33
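Measurement Bias
Measurement bias occurs when the data collection method consistently under- or overestimates a characteristic of the population. Leading questions can also cause measurement bias. Example: A police radar gun used to measure the average speed on a particular road; drivers who notice it may slow down, so the recorded speeds can underestimate typical speeds.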
34
Response Bias
Response bias occurs when participants in a survey give false or misleading answers. The quality or topic of a question might lead to response bias. Example: A teacher asks the class to raise their hands if they completed their homework.
35
Tally Charts
A tally chart is a table used to record values by hand as the data is collected. One tally mark is used for each occurrence of a value. Tally marks are usually grouped into sets of five to allow for easier counting.
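The idea of grouping tally marks into sets of five can be sketched in Python. The observed colours are made-up data, and the text notation "||||/" for a completed group of five is an assumption chosen for plain-text display.

```python
from collections import Counter

def tally_marks(count):
    """Render a count as tally marks, with each full group of five shown as '||||/'."""
    groups, rest = divmod(count, 5)
    parts = ["||||/"] * groups
    if rest:
        parts.append("|" * rest)
    return " ".join(parts)

# Hypothetical observations recorded one at a time.
observations = ["red", "blue", "red", "red", "green", "red", "red", "red", "blue"]
tally = Counter(observations)  # one count per distinct value

for value, count in tally.items():
    print(f"{value:<6} {tally_marks(count):<12} {count}")
```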
36
Frequency Tables
Tally charts are helpful during the collection of data. Once the data is collected, it is more useful to summarize the data into what we call a frequency table. A frequency table shows the data numerically.
The example table has two columns, Number of days with rain and Number of weeks, with a total of 52 weeks.
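A frequency table like the one described can be built with Python's collections.Counter. The weekly values below are invented for illustration and are not the figures from the original slide; they were chosen only so that the total comes to 52 weeks.

```python
from collections import Counter

# Hypothetical data: number of days with rain in each of 52 weeks (invented values).
rainy_days_per_week = (
    [0] * 6 + [1] * 10 + [2] * 14 + [3] * 11 + [4] * 6 + [5] * 3 + [6] * 1 + [7] * 1
)

# Count how many weeks had each number of rainy days.
frequency = Counter(rainy_days_per_week)

print(f"{'Days with rain':<16}{'Number of weeks':>16}")
for days in sorted(frequency):
    print(f"{days:<16}{frequency[days]:>16}")

total_weeks = sum(frequency.values())
print(f"{'Total':<16}{total_weeks:>16}")
```

The total row is a useful check: it should equal the number of observations collected, here one per week of the year.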