Chapter 2 Data Collection


Before any data are collected, you need to carefully define the question and develop operational definitions! Explicitly define the scope of the inferences including limitations.

Example: stopping distances of a bike with smooth vs. treaded tires.
– On asphalt? Dry? Wet?
– Which brands of smooth tires? Which type of brake?

There is a tradeoff between more precise answers to narrower questions and less precise answers to more general questions.

2.1 General Principles in the Collection of Engineering Data
Measurement
"An engineer planning a study ought to ensure that data on relevant variables will be collected by well-trained people using measurement equipment of known and adequate quality."
"Training technicians has to be taken seriously."

Biases, intentional or unintentional, are to be avoided.
Measurements can be made blind, that is, without personnel knowing which condition is being tested.
– Medical experiments often have patients and doctors blind to the medication given.
Other techniques for ensuring fair play (such as randomization and blocking) are discussed later.

2.1.3 Recording
Develop:
- Documented protocols
- Recording forms
Include documentation explicitly on the recording forms:
- Ambient temperature, unusual events
- Put documentation into a permanent computer database as 'meta-data'

2.2 Sampling in Enumerative Studies
Simple random sample
Put all part numbers in a box and draw n of them. In Excel:
– List each Part # in one column and put a Random # beside it in a second column.
– Sort by Random #.
– Pick the first n rows as the sampled parts.
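
The spreadsheet recipe above can be sketched in Python. This is a minimal illustration, not the course's own code: the part numbers, the sample size and the helper name simple_random_sample are all made up for the example.

```python
import random

def simple_random_sample(part_numbers, n, seed=None):
    """Attach a random key to each part, sort by the key, keep the first n."""
    rng = random.Random(seed)
    keyed = [(rng.random(), part) for part in part_numbers]  # the "Random #" column
    keyed.sort()                                             # sort by Random #
    return [part for _, part in keyed[:n]]                   # first n rows

# Hypothetical inventory of 100 part numbers (assumption for illustration).
parts = [f"P{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(parts, n=10, seed=1)
print(sample)
```

Fixing the seed only makes the draw repeatable for documentation; any seed gives every set of 10 parts the same chance of being chosen.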

Stratified random sample
Split the parts to be inventoried into strata:
– Big expense parts
– Small expense parts
Stratification assures adequate sampling of subcategories and potentially gives more precise estimates.
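
A stratified random sample can be sketched as a separate simple random sample inside each stratum. The stratum sizes and per-stratum sample sizes below are assumptions chosen only for illustration, as is the helper name stratified_sample.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of sizes[name] units within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

# Hypothetical inventory: few big expense parts, many small expense parts.
strata = {
    "big_expense": [f"B{i:02d}" for i in range(1, 21)],
    "small_expense": [f"S{i:03d}" for i in range(1, 181)],
}
# Assumed allocation: sample the big expense stratum more heavily.
sizes = {"big_expense": 10, "small_expense": 20}

picked = stratified_sample(strata, sizes, seed=2)
for name, units in picked.items():
    print(name, units)
```

Because each stratum is sampled on its own, the big expense parts cannot be missed by chance, which is the "adequate sampling of subcategories" point made on the slide.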

Advantages of random sampling
– Assures objectivity
– Insurance against biases, intentional or unintentional
– Allows quantification of potential error via probability

2.3 Principles for Effective Experimentation
Taxonomy of Variables
Response variable
- System output of interest
- Compression strength of taconite
- Strength of glued boards
Managed variable
- Set by the experimenters
- Experimental variable: set at different levels
- Example: three levels of temperature for gluing
- Controlled variable: held at one level
- Example: use 3 glues but all at the same temperature
Example: freezing effect on glue bond
- Experimental variable: freezing temperature
- Controlled variables: drying time, wood type, drying temp

2.3.2 Handling Extraneous Variables
An extraneous variable is one that can influence the response but is not of primary interest.
– Stopping times of bicycles with treaded and smooth tires: the particular rider affects stopping times.
– Strength of glued wood: the moisture content of the wood can affect the strength.

Sometimes the extraneous variable is observed, like rider, and sometimes it's unobserved, like moisture. Sometimes the extraneous variable is even unanticipated.

Inattention to extraneous variables can add noise to the comparisons or confuse (confound) the experimental results.
– We are interested in comparing types of golf clubs. If we use golf balls in varying condition, the variability due to ball condition adds noise to the system and makes it harder to measure effects precisely. Other extraneous variables include golfer, temperature, wind speed, golfer fatigue, etc.
– If glue 1 is set on a humid day and glue 2 is set on a dry day, observed differences could be due to glue type or to humidity. Here glue and humidity effects are completely confounded, that is, confused with each other.
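
A small simulation can make the glue and humidity confounding concrete. Everything numeric here is an assumption for illustration: the two glues are given identical true strength, humidity is given a 5-unit penalty, and the noise level is arbitrary.

```python
import random
import statistics

rng = random.Random(0)

# Assumed true effects (illustration only): the glues do not differ at all.
GLUE_EFFECT = {1: 0.0, 2: 0.0}
HUMIDITY_EFFECT = {"humid": -5.0, "dry": 0.0}

def strength(glue, day):
    """Simulated bond strength: baseline + glue + humidity + noise."""
    return 100.0 + GLUE_EFFECT[glue] + HUMIDITY_EFFECT[day] + rng.gauss(0, 1)

# Confounded plan: all glue 1 boards on the humid day, all glue 2 on the dry day.
glue1 = [strength(1, "humid") for _ in range(200)]
glue2 = [strength(2, "dry") for _ in range(200)]
confounded_diff = statistics.mean(glue2) - statistics.mean(glue1)

# Balanced plan: each glue is used equally often on both days.
glue1_b = [strength(1, day) for day in ("humid", "dry") for _ in range(100)]
glue2_b = [strength(2, day) for day in ("humid", "dry") for _ in range(100)]
balanced_diff = statistics.mean(glue2_b) - statistics.mean(glue1_b)

print("confounded difference:", round(confounded_diff, 2))
print("balanced difference:  ", round(balanced_diff, 2))
```

In the confounded plan the apparent "glue difference" is really the humidity effect, and nothing in the data can separate the two. In the balanced plan the humidity effect hits both glues equally and cancels from the comparison.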

Strategies for reducing effects of extraneous variables – Controlling variables – Blocking – Randomization

Controlling a variable means keeping it at the same level. – Glue all boards at a nearly fixed temperature. – Have one rider for all runs of smooth and treaded tires. – Use new golf balls of the same type.

A block (of experimental units, experimental times, experimental conditions, etc.) is a homogeneous group within which different levels of the primary experimental variables can be applied and compared in a relatively uniform environment.

Blocking is a very important concept. There will be exam questions about this concept.
– Have each rider ride both a treaded-tire bike and a smooth-tire bike. Each rider is a 'block'. A block with 2 treatment levels is a paired design.
– For comparing 3 glues, take 10 boards and cut each board into thirds. Use each glue on one part of each board. The boards are blocks.
– Most often each treatment is replicated once in each block.
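
Why the paired (rider as block) design helps can be sketched with simulated stopping distances. All numbers are assumptions for illustration: riders differ a lot from each other, the tire effect is small, and run-to-run noise is small.

```python
import random
import statistics

rng = random.Random(3)

TIRE_EFFECT = 0.5  # assumed extra stopping distance for smooth tires
riders = range(10)
# Assumed large rider-to-rider differences (the extraneous variable).
rider_effect = {r: rng.gauss(0, 3.0) for r in riders}

def stop(rider, tire):
    """Simulated stopping distance for one run."""
    tire_part = TIRE_EFFECT if tire == "smooth" else 0.0
    return 12.0 + rider_effect[rider] + tire_part + rng.gauss(0, 0.3)

# Each rider (block) rides both tire types.
smooth = [stop(r, "smooth") for r in riders]
treaded = [stop(r, "treaded") for r in riders]
diffs = [s - t for s, t in zip(smooth, treaded)]  # within-rider differences

sd_raw = statistics.stdev(smooth)   # dominated by rider differences
sd_diff = statistics.stdev(diffs)   # rider effect cancels within each pair
print("mean within-rider difference:", round(statistics.mean(diffs), 2))
print("sd of raw smooth-tire runs:  ", round(sd_raw, 2))
print("sd of within-rider diffs:    ", round(sd_diff, 2))
```

The rider effect appears in both runs of a pair, so it drops out of each difference. The tire comparison is then made against the small run-to-run noise rather than against the large rider-to-rider spread.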

Randomization is insurance against biases that might otherwise occur.
– Each rider will ride twice, once with treaded tires and once with smooth tires. We don't want all smooth-tire runs done first. We could randomize (flip a coin) to decide the order.
– If we have 30 small boards for gluing, we could randomly assign 10 boards to each glue. This is a completely randomized design. If there are some obvious differences between the boards, it may help to divide the boards into 10 blocks of 3 boards. Within each block assign one board to each glue. This is a randomized block design.

– The order of gluing the 30 boards would also be randomized (and possibly blocked) to guard against having one glue done earlier in the day. Blocking often provides better insurance. Unblocked randomizing can end up with more of one glue earlier in the day.
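
One way to block the run order by time of day is to treat each consecutive group of 3 runs as a block and randomize the glue order inside it. This is only a sketch; the helper name blocked_run_order and the glue labels are hypothetical.

```python
import random

def blocked_run_order(treatments, n_blocks, seed=None):
    """Return (block, treatment) pairs; each block holds every treatment once,
    in an independently shuffled order."""
    rng = random.Random(seed)
    order = []
    for block in range(1, n_blocks + 1):
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # random order within this time block
        order.extend((block, t) for t in shuffled)
    return order

glues = ["glue 1", "glue 2", "glue 3"]
run_order = blocked_run_order(glues, n_blocks=10, seed=5)
for run, (block, glue) in enumerate(run_order, start=1):
    print(run, block, glue)
```

With this plan no glue can pile up early in the day: after every third run each glue has been used equally often. A plain shuffle of all 30 runs gives no such guarantee.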

Blocks are set up BEFORE units are assigned to treatments. If we hit 10 golf balls with a titanium driver, these 10 balls are not a block; they are just the units that received one treatment. This is a common mistake by students on exams.

Both randomization and blocking are like insurance policies. In some cases not having the insurance won’t hurt. Other times not having the insurance can hurt big time. The cost and hassle of randomization, and potentially blocking, usually aren’t very big.
– Usually randomization is worth the cost.
– Infrequently randomization is not worth the cost. But think carefully about whether there are potential pitfalls to not randomizing.
Example: bouncing balls on wood and cement surfaces.

2.3.3 Comparative Study
A comparative study compares treatments, for example comparing 2 glues. Even when investigating a single new treatment, such as a new glue, it's best to do a comparative study against the old glue. If we only use the new glue on a batch of boards and compare the strengths to historical board strengths, it could be that the new boards are different from the historical boards. Any observed difference could be due to glue effects or due to changes in the boards.
In medical studies it's standard to include some patients who receive the old drug or no drug for a head-to-head comparison with the new drug. The patients getting no drug are a 'control' group. This is another use of the term 'control'.

2.3.4 Replication
Replication means carrying through the whole process of adjusting values for the supervised (managed) variables, making an experimental 'run', and observing the results of that run, more than once.

"Simply re-measuring an experimental unit does not amount to real replication." Likewise, not resetting the entire process means not having true replicates. See example 9, page 45.
Example 10: Making one of each of 2 designs of paper planes and retesting the 2 planes does not accomplish independent replications of the designs. If we only make 2 planes, we don’t know whether the two planes are more different than 2 planes made from the same design would be.

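The paper plane point about re-measurement versus replication can be illustrated with a small simulation. The standard deviations below are assumptions for illustration: planes built from the same design are taken to differ more from each other than repeated flights of one plane do.

```python
import random
import statistics

rng = random.Random(6)

DESIGN_MEAN = 10.0  # assumed average flight distance for the design
BUILD_SD = 1.0      # assumed plane-to-plane variation within one design
MEASURE_SD = 0.2    # assumed flight-to-flight variation for a single plane

def build_plane():
    """Build a new plane: its own true distance differs from the design mean."""
    return DESIGN_MEAN + rng.gauss(0, BUILD_SD)

def fly(plane):
    """One measured flight of an existing plane."""
    return plane + rng.gauss(0, MEASURE_SD)

# Re-measurement: build ONE plane and fly it 20 times.
one_plane = build_plane()
remeasured = [fly(one_plane) for _ in range(20)]

# True replication: build a NEW plane for each of 20 flights.
replicated = [fly(build_plane()) for _ in range(20)]

sd_remeasured = statistics.stdev(remeasured)
sd_replicated = statistics.stdev(replicated)
print("sd, one plane re-flown:  ", round(sd_remeasured, 2))
print("sd, new plane each time: ", round(sd_replicated, 2))
```

Re-flying a single plane only shows flight-to-flight noise, so it understates how much two planes of the same design can differ. Only rebuilding the plane each time captures the variation needed to judge whether two designs really differ.
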
2.4 Some Common Experimental Plans
Completely Randomized Designs
In a completely randomized design all units or runs are put into a single hat and randomly assigned to the treatments.
Number the 30 boards. Pick 10 numbers (boards) for each glue.
– Put the board numbers 1-30 in column 1 of Excel.
– Put random numbers into column 2.
– Sort by the random column 2.
– Assign the board numbers in rows 1-10 to glue 1, rows 11-20 to glue 2, and rows 21-30 to glue 3.
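
The same assignment can be sketched in Python, where shuffling the board numbers plays the role of sorting by the random column. The helper name completely_randomized is hypothetical, and the sketch assumes the number of units divides evenly among the treatments.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle all units, then deal equal-sized groups to the treatments."""
    units = list(units)
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)
    rng.shuffle(units)                      # same effect as sorting by a random column
    per = len(units) // len(treatments)
    return {t: sorted(units[i * per:(i + 1) * per])
            for i, t in enumerate(treatments)}

glues = ["glue 1", "glue 2", "glue 3"]
assignment = completely_randomized(range(1, 31), glues, seed=7)
for glue, boards in assignment.items():
    print(glue, boards)
```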

Randomized Complete Block Design
Units are broken into hopefully homogeneous blocks, and treatments are randomized to units within each block.
– Form 10 sets (blocks) of 3 similar boards each.
– Within each set (block) assign 1 board randomly to each glue.
Most commonly each treatment is replicated once in each block. Example 12 is unusual in this regard.
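
A randomized complete block assignment can be sketched as one independent shuffle of the glues per block. The grouping of boards into blocks below is a placeholder: in practice the blocks would be formed from boards judged to be similar, before any glue is assigned. The helper name is hypothetical.

```python
import random

def randomized_complete_block(blocks, treatments, seed=None):
    """Within each block, assign each treatment to exactly one unit at random."""
    rng = random.Random(seed)
    plan = {}
    for block_id, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block needs one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)                  # separate randomization per block
        plan[block_id] = dict(zip(units, order))
    return plan

glues = ["glue 1", "glue 2", "glue 3"]
# Placeholder blocks: boards 1-3, 4-6, ..., 28-30 (assumed to be similar sets).
blocks = {b: [f"board {3 * (b - 1) + k}" for k in (1, 2, 3)] for b in range(1, 11)}

plan = randomized_complete_block(blocks, glues, seed=8)
for block_id, mapping in plan.items():
    print(block_id, mapping)
```

Note the order of operations: the blocks exist first, and the randomization happens inside them.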

2.5 Preparing to Collect Engineering Data
Read the book.
Problem Definition
Step 1: Identify the problem.
Step 2: Understand the context of the problem.
Step 3: State in precise terms the objective and scope of the study.

Study Definition
Step 4: Identify the response variable(s) and appropriate instrumentation.
Step 5: Identify possible factors influencing responses.
Step 6: Decide whether (and if so how) to manage factors likely to affect the responses.
Step 7: Develop a detailed data collection protocol and timetable for the first phase.

Physical Preparation
Step 8: Assign responsibility for careful supervision.
Step 9: Identify technicians and provide necessary instruction in objectives and methods.
Step 10: Prepare data collection forms and/or equipment.
Step 11: Do a dry run of analysis on fictitious data.
Step 12: Write up a 'best guess' prediction of results.
See the text for more details.

Some Study Questions
What advantages does an experimental study have compared to an observational study?
What is the difference between a population and a sample?
Give an example of multivariate data.
Managed variables are either experimental or controlled variables. What is a controlled variable?
What is an extraneous variable?
What are the 3 strategies for reducing effects of extraneous variables?

What is a “block”?
Blocks are set up B_____ units are assigned to treatments. Fill in the blank.
What is the potential advantage of the randomized block design versus a completely randomized design?
Give an example where 2 measurements are not separate, independent replicates.