STAT 250.3: Introduction to Biostatistics Instructor: Efi Antoniou Introduction.

1 STAT 250.3: Introduction to Biostatistics Instructor: Efi Antoniou Introduction

2 Highlights from the Syllabus
The course website will contain important information, so plan to access it frequently: http://www.stat.psu.edu/~antoniou/stat250.3
Help is available! Don’t be afraid to ask for it.

3 Breakdown of Grades
Homework: Assigned every two weeks and due at the beginning of Wednesday’s lecture. Approximately 7 homework assignments. [20%]
Quizzes: One 15-minute multiple-choice (MC) quiz every two weeks. Approximately 6 quizzes. [25%]
Midterm Exams: 2 midterm exams. [30%]
Final: Comprehensive final. [25%]
Dates of quizzes and exams are on the course outline on the website.
NOTE: Homework problems/solutions and study guides will be provided on the course website.

4 Contact Information
Instructor: Efi Antoniou
330b Thomas Bldg.
Office Hours: 10:00-11:00 MT
email: antoniou@stat.psu.edu
TA: Shu-Min Liao
301 Thomas Bldg.
Office Hours: 1:00-3:00 R
email: sxl340@psu.edu

5 What is Statistics?
Statistics is a collection of procedures and principles for gathering data and analyzing information when faced with uncertainty.
When we have a question that needs answering, we use statistics as a method to find the answer. Statistics helps us ask the right question, collect the right data, and draw the correct conclusion.
Statistics is not just about number crunching! Critical thinking skills and common sense are far more important than mathematical ability.

6 Statistics in a Nutshell
We use statistics to make conclusions about populations from samples.
1. Draw a representative SAMPLE from the POPULATION.
2. Describe the SAMPLE.
3. Use rules of probability and statistics to make conclusions about the POPULATION from the SAMPLE.
[Slide figure: a sample data table and example formulas]
Var 1   Var 2   Var 3
459     East    28
657     West    43
321     West    46
213     North   47
536     East    53
T = (x – μ)/σ
P(x) = p^x (1 – p)^(n – x)
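The two formulas in the slide figure can be tried out numerically. The sketch below is only an illustration: the function names and the input numbers are hypothetical, and the binomial function uses the standard probability mass function, which includes the binomial coefficient C(n, x) that is not visible in the formula as it appears in this transcript.

```python
from math import comb


def standardize(x, mu, sigma):
    """Return (x - mu) / sigma, the standardized distance of x from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def binomial_pmf(x, n, p):
    """Standard binomial probability of exactly x successes in n trials.

    Assumption: includes the coefficient C(n, x), which the transcript's
    formula does not show.
    """
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical inputs, chosen only for illustration.
print(standardize(110, mu=100, sigma=15))
print(binomial_pmf(1, n=2, p=0.5))
```

The first call measures how far a value lies from the mean in units of the standard deviation; the second gives the chance of exactly one success in two fair trials.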

7 Population, Sample & Data
Consider the following hypothesis of interest:
Smoking, weight and parents’ disease status are associated with heart disease in people 18 years of age and older.
Population: All observations that are of interest to the researcher.
e.g. Persons 18 years of age and older with heart disease.
Sample: The observations that are actually obtained by the researcher, a subset of the population.
e.g. 50 people 18+ years old with heart disease.
Sample size: Usually denoted by n.
e.g. n = 50.
Raw data: A term used for numbers and category labels that have been collected but have not yet been processed.
e.g. 50 sets of values for “smoking”, “weight” and “parents’ disease status”, one for each individual.
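As a rough illustration of drawing a sample of size n = 50 from a population, here is a minimal sketch of a simple random sample. The population of 10,000 numbered IDs, the function name, and the seed are all hypothetical; the slides do not say how the sample was actually drawn.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units from the population, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical population: ID numbers standing in for individual people.
population_ids = list(range(1, 10_001))

sample_ids = simple_random_sample(population_ids, n=50, seed=42)
print(len(sample_ids))
```

Fixing the seed makes the draw repeatable, which is convenient for checking work; omitting it gives a fresh sample each run.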

8 Why do we use Samples?
Because populations are usually too large to measure every unit conveniently.
Because the measuring process might destroy the unit.
Consider the following research questions:
1. How much do PSU students spend on books per semester?
2. What percentage of firecrackers produced by ACME Fire Cracker Company are defective?

9 Choosing the Sample
We want a sample that is representative of the population of interest!
Considering the previous two questions, would the following be appropriate samples?
1. 300 full-time PSU students answered this question at the bookstore.
2. 200 firecrackers taken from one batch on June 15th, 2002.

10 How to Describe the Sample…
It is often difficult to draw conclusions from raw data. It is helpful to summarize raw data in the form of tables or graphs for easier interpretation.
Consider data from the survey given to STAT 200 students that asked their sex, preference for Coke or Pepsi, and the fastest speed they had ever driven. The raw data would look something like this:
Sex   Cola    Fast. Speed
M     Coke    120
F     Pepsi   85
F     Coke    95
F     Pepsi   105
M     Coke    100
F     Coke    110
M     Coke    90
M     Pepsi   115
…     …       …
Based on this data sheet it would be difficult to compare males and females in their choice of cola and fastest speed ever driven.
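To make the idea of summarizing raw data concrete, the sketch below tabulates only the eight rows visible on the slide. It reads the doubled value in the second row as a single 85, which is an assumption, and the variable names are made up for the example. The results describe these eight rows only, not the full survey.

```python
from collections import Counter, defaultdict

# The eight rows visible on the slide: (sex, cola, fastest speed).
raw_rows = [
    ("M", "Coke", 120),
    ("F", "Pepsi", 85),
    ("F", "Coke", 95),
    ("F", "Pepsi", 105),
    ("M", "Coke", 100),
    ("F", "Coke", 110),
    ("M", "Coke", 90),
    ("M", "Pepsi", 115),
]

# Count how many respondents fall into each (sex, cola) combination.
cola_counts = Counter((sex, cola) for sex, cola, _ in raw_rows)

# Collect the speeds by sex, then average them.
speeds_by_sex = defaultdict(list)
for sex, _, speed in raw_rows:
    speeds_by_sex[sex].append(speed)
mean_speed = {sex: sum(v) / len(v) for sex, v in speeds_by_sex.items()}

print(dict(cola_counts))
print(mean_speed)
```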

11 Descriptive Statistics
We use descriptive statistics to summarize raw data into tables, graphs, and numerical summaries.
Consider the previous data when we have summarized the variables using descriptive statistics. It is now easier to see differences between the sexes.
Variables: Cola by Sex
Sex    Coke    Pepsi   All
F      49      61      110
F%     44.55   55.45   100
M      39      21      60
M%     65.00   35.00   100
All    88      82      170
All%   51.76   48.24   100
Variables: Fastest Speed by Sex
[summary not preserved in this transcript]
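The percentage rows in the Cola by Sex table follow directly from the counts. A minimal sketch, using the counts from the slide and hypothetical function names, is:

```python
# Counts from the Cola by Sex table on the slide.
counts = {
    "F": {"Coke": 49, "Pepsi": 61},
    "M": {"Coke": 39, "Pepsi": 21},
}


def add_totals(counts):
    """Return a copy of the table with an 'All' row summing each column."""
    all_row = {}
    for row in counts.values():
        for cola, c in row.items():
            all_row[cola] = all_row.get(cola, 0) + c
    return {**counts, "All": all_row}


def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for label, row in table.items():
        total = sum(row.values())
        result[label] = {cola: round(100 * c / total, 2) for cola, c in row.items()}
    return result


percentages = row_percentages(add_totals(counts))
for label, row in percentages.items():
    print(label, row)
```

Each percentage is the cell count divided by its row total, so every row sums to 100 and the sexes can be compared even though the groups differ in size (110 females, 60 males).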

12 Inferential Statistics: Making Conclusions about the Population
Now that we have obtained an appropriate sample and described the sample data, we want to be able to make conclusions about the population.
Consider the following example: Sarah conducts an experiment to determine whether a new type of high fat diet is better than the standard low fat diet. In her sample of 100 subjects (50 new, 50 old diet) she finds that those on the new diet lose an average of 10 lbs and those on the old diet lose an average of 8 lbs.
Clearly in the sample the new diet was more effective than the old diet, but can we make this conclusion about the population? Is there enough evidence in the sample to suggest that the new diet is better for everyone, or could Sarah’s results have just been by chance (a lucky result)?
In this course we will learn statistical methods to answer Sarah’s question, and rules that help us draw conclusions about populations (what we are really interested in!) based on data from the sample.
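One way to ask whether a 2 lb difference could be "just by chance" is a permutation test, sketched below. This is not necessarily the method the course teaches; it is one illustrative technique. The individual weight losses are simulated under assumed spreads (a standard deviation of 5 lbs is an assumption, since the slide reports only the two group averages), and all names and seeds are hypothetical.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(new_diet, old_diet, n_permutations=2000, seed=0):
    """One-sided permutation test for mean(new_diet) - mean(old_diet) > 0.

    Repeatedly shuffles the group labels and counts how often a random
    relabeling gives a difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = _mean(new_diet) - _mean(old_diet)
    pooled = list(new_diet) + list(old_diet)
    n_new = len(new_diet)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = _mean(pooled[:n_new]) - _mean(pooled[n_new:])
        if diff >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_large + 1) / (n_permutations + 1)


# Simulated (not real) weight losses in lbs: 50 subjects per diet,
# centered on the slide's averages with an assumed spread of 5 lbs.
data_rng = random.Random(1)
new_diet = [data_rng.gauss(10, 5) for _ in range(50)]
old_diet = [data_rng.gauss(8, 5) for _ in range(50)]

print(permutation_p_value(new_diet, old_diet))
```

A small result would suggest the observed difference is hard to explain by random group assignment alone; a large one would mean a "lucky result" cannot be ruled out. The answer depends heavily on the assumed spread, which is exactly why averages alone do not settle Sarah's question.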

13 Variables & Data
Variable: A characteristic that varies from one individual to the next.
Raw Data (Data): All of the information we gather on the subjects. It includes one or more variables.

14 Quantitative and Categorical data
Raw data from quantitative variables consist of numerical values taken on each individual.
Examples: height, number of siblings, IQ score.
Raw data from categorical variables consist of group or category names that don’t necessarily have a logical ordering.
Examples: eye color, country of residence, t-shirt size.
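The distinction matters in practice because the two kinds of data are summarized differently. The sketch below, with hypothetical values and a made-up helper name, averages a quantitative variable and counts the categories of a categorical one.

```python
from collections import Counter
from statistics import mean, median


def summarize(values):
    """Summarize numbers with mean/median and labels with category counts."""
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if is_numeric:
        return {"type": "quantitative", "mean": mean(values), "median": median(values)}
    return {"type": "categorical", "counts": dict(Counter(values))}


# Hypothetical values for illustration only.
heights_cm = [160, 172, 181, 168]
eye_colors = ["brown", "blue", "brown", "green"]

print(summarize(heights_cm))
print(summarize(eye_colors))
```

The check used here is deliberately naive: categories stored as numeric codes (for example, 1 = brown, 2 = blue) would be treated as quantitative, so the variable's meaning, not its storage type, should decide how it is summarized.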

15 For tomorrow…
Skim over Chapter 1 and Sections 2.1, 2.2, 2.3, 6.1 and 6.5.

