Statistics for the Social Sciences Section 1.1: Statistics and the Social Sciences This Revision: 02/21/2019 Prepared by Adam J. McKee
Is It Evil? For many students, statistics is a dirty word, second only in evil to college algebra. Its purpose is to make life hard. This could not be further from the truth! The actual purpose of statistics is to make life easy by simplifying and organizing information.
Defining Statistics Most of us think of statistics as facts and figures—things like RBIs in baseball, the average temperature in July, and the number of deaths caused by drunk driving last year. Statistics is that and much more. We will define statistics as a set of rules and techniques for organizing, summarizing, and interpreting empirical information.
Answering Questions Statistics also allows us to test our ideas about how the social world works. In fact, social scientists could not search for answers to many of our most compelling questions without statistical tools. To better understand this statement, it will be helpful to recall how social scientists answer questions about the social world.
The Scientific Approach: It’s Empirical One of the most important things to remember about science is that the scientific method is empirical. Empirical means based on observation. In addition to observing, scientists make systematic observations. Empirical means based on observation rather than other methods of knowing, such as reliance on authority or “common sense.”
The Scientific Approach: It’s Systematic By systematic I mean that the scientist observes with a plan, and the plan is designed to ensure objectivity.
The Scientific Approach: It’s Objective Objectivity means that the scientist takes steps to record facts that are not colored by emotion and personal prejudice. Recording observations as numerical measurements greatly aids the researcher in maintaining objectivity.
It’s Too Complicated! All of these observations result in huge amounts of numbers. We can organize those numbers into a table form such as a spreadsheet, but there will still be pages and pages of them. Information (data) presented in this way is meaningless! The human mind simply cannot wrap itself around large amounts of numbers like that and draw anything meaningful from them.
Statistics to the Rescue We need to organize and simplify the data. Organizing, simplifying, and summarizing data is a primary function of descriptive statistics. Descriptive statistics are a family of statistical methods that organize, simplify, and summarize data.
Kinds of Research Within the scope of empirical research, there are several strategies that the social scientist can use to answer questions. An important distinction is between whether the research is descriptive relational experimental
Descriptive Research Descriptive research seeks to describe the characteristics of a particular social phenomenon. Researchers doing descriptive research ask, “What is the state of things?” Public opinion polls are an example of descriptive research. They answer questions about how people feel about particular issues.
Limitations of Descriptive Research To be merely descriptive, the researcher treats these opinions independently and does not consider how these relate to other variables and does not offer up any potential causes. In other words, variables are considered by themselves, and relationships between variables are not examined.
Relational Research Relational research seeks to understand how two or more variables are related. If our pollster in the above example examined how religious preference was related to whether a voter was going to vote Democrat or Republican, the study would be relational. This is because the researcher is examining the relationship between religious preference and voting preference.
Experimental Research Experimental research seeks to make causal statements about how the social world works. That is, the researcher wants to make cause and effect statements. Example: Did a university’s alcohol awareness program (cause) reduce incidents of alcohol related crime on campus (effect)?
The Importance of Design Note that the confidence with which a researcher can make such statements relies heavily on good experimental design, which is beyond the scope of this text. We will focus strictly on the statistical concerns of good research design.
The Lines are Blurry Note that these types of research are not mutually exclusive. In the practical world of research, they tend to work in concert, building on each other. A researcher conducting an experiment to demonstrate that one variable causes another will no doubt want to describe both variables and explain how they are related as well. In other words, these divisions are somewhat arbitrary, but they are useful because each has a set of statistical techniques that goes along with it.
Matching Statistics with Research Design Descriptive statistics are heavily used in descriptive research, and inferential statistics are used heavily in experimental research. Relational research uses a family of correlational statistics that can be both descriptive and inferential depending on how they are treated.
Goal of Research: Generalization Descriptive statics, then, seek to summarize and explain the characteristics of a variable. The other major branch of statistics, inferential statistics, lets the researcher make generalizations about a population given information from a sample. Generalizations are general statements obtained by inference from specific cases.
What is an Inference? An inference is a conclusion that is based on facts and reasoning. We can gather from these definitions that the purpose of inferential statistics is to let us make general statements about populations based on information that we have gathered from samples.
Populations To understand the jargon of inferential statistics, it is helpful to understand how social scientists answer most questions. It is important to understand that scientists are generally interested in populations. A population is the entire group of people that a researcher is interested in making statements about. A population can be very small, such as female Supreme Court justices, or it can be very large, such as every registered voter in the United States.
What is a Census? When a researcher collects information about an entire population, it is called a census. Numbers that describe characteristics of populations are known as parameters. For example, when the bureau of the census reports the Median household income of Americans, that is a parameter because it is a number that came from data collected on the entire population.
What’s the Deal with Samples? When populations are large, it becomes unfeasible to collect data for every person. There simply are not enough resources (human resources, time, money, etc.) to conduct a census. When this is the case, researchers use samples. A sample is a subset of a population that is used to answer questions about the population from which it was drawn. Numbers that describe characteristics of samples are called statistics.
Hypothesis: The Simple Definition Social scientists using inferential statistics start with a hypothesis and look to see if data gathered from systematic observations are consistent with the hypothesis. Many of these techniques are complex, and are best left to computers. For now, we will define a hypothesis as an educated guess as to how some social phenomenon occurs. (We will delve more deeply into hypotheses when we get to the sections on inferential statistics).
Why Do I Care?! Even if you do not plan to conduct research, you must be an intelligent consumer of research to assume a leadership role in today’s data rich world. The social science professions, even the applied ones such as social work and criminal justice, depend on social scientific research to determine best practices, prepare grant applications, conduct program evaluations, and many other management tasks.
Accountability In every arena of public management, there is a growing demand for accountability. Accountability in a narrow but important sense means that tax dollars are spent on programs that work and work well. The public demands objective evaluations of programs and policies. Such evaluations are performed using the same techniques that social scientists use to answer other questions.