Published by Philippa Mitchell. Modified over 9 years ago.
1
Chapter 2 Data Collection
2
Before any data are collected, you need to carefully define the question and develop operational definitions! Explicitly define the scope of the inferences including limitations.
3
Stopping distances of bike - smooth vs tread tires – On asphalt? dry? wet? – Which brands of smooth tires? Which type of brake?
4
There is a tradeoff between more precise answers to narrower questions and less precise answers to more general questions.
5
2.1 General Principles in the Collection of Engineering Data 2.1.1 Measurement "An engineer planning a study ought to ensure that data on relevant variables will be collected by well-trained people using measurement equipment of known and adequate quality." "Training technicians has to be taken seriously."
6
Biases, intentional or unintentional, are to be avoided. Measurements can be made blind without personnel knowing what condition is being tested. – Medical experiments often have patients and doctors blind to medication given. Other techniques for ensuring fair play (such as randomization, blocking) are discussed later.
7
2.1.3 Recording
Develop documented protocols and recording forms.
Include documentation explicitly on the recording forms:
– Ambient temperature, unusual events
– Put documentation into a permanent computer database as 'meta-data'
8
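2.2 Sampling in Enumerative Studies
Simple random sample: put all part numbers in a box and draw n of them.
In Excel:
– Put the part numbers in one column (Part #) and a random number next to each (Random #).
– Sort by Random #.
– Pick the first n rows for the sampled parts.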
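The spreadsheet recipe on this slide can also be sketched in Python. This is only an illustration under assumptions: the part numbers, the sample size, and the function name `simple_random_sample` are hypothetical and do not come from the slides.

```python
import random


def simple_random_sample(part_numbers, n, seed=None):
    """Draw n parts by giving each part a random key and sorting on that key."""
    rng = random.Random(seed)
    # Pair every part with its own random number (the "Random #" column).
    keyed = [(rng.random(), part) for part in part_numbers]
    # Sort by the random number only, then keep the first n rows.
    keyed.sort(key=lambda pair: pair[0])
    return [part for _, part in keyed[:n]]


# Hypothetical inventory of 100 part numbers.
parts = [f"P{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(parts, n=10, seed=1)
print(sample)
```

Fixing the seed makes the draw reproducible, which helps when the sampling step has to be documented in the study protocol.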
9
Stratified random sample Split the parts to be inventoried into strata – Big expense parts – Small expense parts Stratification assures adequate sampling of subcategories and potentially gives more precise estimates.
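A stratified draw can be sketched as a separate simple random sample inside each stratum. The stratum names, part labels, and per-stratum sample sizes below are assumptions made up for the example.

```python
import random


def stratified_sample(strata, n_per_stratum, seed=None):
    """Take an independent simple random sample from each stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(units, n_per_stratum[name])
        for name, units in strata.items()
    }


# Hypothetical strata: a few big expense parts and many small expense parts.
strata = {
    "big_expense": [f"B{i:02d}" for i in range(1, 21)],
    "small_expense": [f"S{i:03d}" for i in range(1, 201)],
}
picked = stratified_sample(
    strata, {"big_expense": 5, "small_expense": 10}, seed=2
)
for name, units in picked.items():
    print(name, units)
```

Choosing the sample size per stratum is what guarantees that a small but important subcategory, such as the big expense parts, is adequately represented.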
10
Advantages of random sampling – Assures objectivity – Insurance against biases, intentional or unintentional – Allows quantification of potential error via probability
11
2.3 Principles for Effective Experimentation
2.3.1 Taxonomy of Variables
Response variable
– System output of interest
– Compression strength of taconite
– Strength of glued boards
Managed variable
– Set by experimenters.
– Experimental variable: set at different levels
  – Three levels of temperature for gluing
– Controlled variable
  – Use 3 glues but all at the same temperature
Freezing effect on glue bond
– Experimental variable: freezing temperature
– Controlled variable: drying time, wood type, drying temp
12
2.3.2 Handling Extraneous Variables An extraneous variable is one that can influence the response but is not of primary interest. – Stopping times of bicycles with treaded and smooth tires. The particular rider affects stopping times. – Strength of glued wood. The moisture content of the wood can affect the strength
13
Sometimes the extraneous variable is observed, like rider, and sometimes it's unobserved, like moisture. Sometimes the extraneous variable is even unanticipated.
14
Inattention to extraneous variables can add noise to the comparisons or confuse (confound) the experimental results. – We are interested in comparing types of golf clubs. If we use golf balls of various condition, the variability due to golf ball conditions makes it harder to measure effects precisely, adds noise to the system. Other extraneous variables include golfer, temperature, wind speed, golfer fatigue, etc. – If the glue 1 is set on a humid day and glue 2 is set on a dry day, observed differences could be due to glue type or humidity effects. Here glue and humidity effects are completely confounded, confused with each other.
15
Strategies for reducing effects of extraneous variables – Controlling variables – Blocking – Randomization
16
Controlling a variable means keeping it at the same level. – Glue all boards at a nearly fixed temperature. – Have one rider for all runs of smooth and treaded tires. – Use new golf balls of the same type.
17
A block of experimental units, experimental times, experimental conditions, etc. is a homogeneous group of experimental units within which different levels of primary experimental variables can be applied and compared in a relatively uniform environment.
18
Blocking is a very important concept. There will be exam questions about this concept. – Have each rider use a treaded and smooth tire bike. Each rider is a 'block'. A block with 2 treatment levels is a paired design. – For comparing 3 glues, take 10 boards and cut each board into thirds. Use each glue on one part of each board. The boards are blocks. – Most often each treatment is replicated once in each block
19
Randomization is insurance against biases that might otherwise occur. – Each rider will ride bikes twice, once with treaded and once with smooth tires. We don't want all smooth-tire runs done first. We could randomize (flip a coin) to decide the order. – If we have 30 small boards for gluing, we could randomly assign 10 boards to each glue. This is a completely randomized design. If there are some obvious differences between the boards, it may help to divide the boards into 10 blocks of 3 boards. Within each block, assign one board to each glue. This is a randomized block design.
20
– The order of gluing the 30 boards would also be randomized (and possibly blocked) to guard against having one glue done earlier in the day. Blocking often provides better insurance. Unblocked randomizing can end up with more of one glue earlier in the day.
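One way to block the gluing order in time is sketched here. It assumes the day is split into 10 consecutive time blocks and that each glue is run once per block in a random order; the block count and the function name `blocked_run_order` are hypothetical.

```python
import random


def blocked_run_order(glues, n_blocks, seed=None):
    """Return a run order in which every time block contains each glue once."""
    rng = random.Random(seed)
    order = []
    for block in range(1, n_blocks + 1):
        within = list(glues)
        rng.shuffle(within)  # randomize the glue order inside this time block
        order.extend((block, glue) for glue in within)
    return order


run_order = blocked_run_order(["glue 1", "glue 2", "glue 3"], n_blocks=10, seed=3)
for run, (block, glue) in enumerate(run_order, start=1):
    print(f"run {run:2d}: time block {block:2d}, {glue}")
```

Because every time block holds all three glues, no glue can end up concentrated early in the day, which is the risk with an unblocked shuffle of all 30 runs.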
21
Blocks are set up BEFORE units are assigned to treatments! If we hit 10 golf balls with a titanium driver, these 10 balls are not a block. This is a common mistake by students on exams.
22
Both randomization and blocking are like insurance policies. In some cases not having the insurance won’t hurt. Other times not having the insurance can hurt big time. The cost, hassle of randomization and potentially blocking isn’t very big. – Usually randomization is worth the cost. – Infrequently randomization is not worth the cost. But think carefully about whether there are potential pitfalls to not randomizing. Bouncing balls on wood and cement surfaces.
23
2.3.3 Comparative Study A comparative study compares treatments, for example comparing 2 glues. Even when investigating a particular new treatment, such as a new glue, it's best to do a comparative study with the old glue. If we only use the new glue on a batch of boards and compare the strengths to historical board strengths, it could be that the new boards are different from the historical boards. Any observed difference could be due to glue effects or due to changes in the boards. In medical studies it's standard to include some patients who receive the old drug or no drug for a head-to-head comparison with the new drug. The patients getting no drug are a 'control' group. This is another use of the term 'control'.
24
2.3.4 Replication Replication means carrying through the whole process of adjusting values for the supervised variables, making an experimental 'run', and observing the results of that run – more than once.
25
"Simply re-measuring an experimental unit does not amount to real replication." Likewise, not resetting the entire process means not having true replicates. See example 9, page 45. Example 10: Making one of each of 2 designs of paper planes and retesting the 2 planes does not accomplish independent replications of the designs. If we only make 2 planes, we don’t know if the two planes are more different than two planes made from the same design would be.
26
2.4 Some Common Experimental Plans 2.4.1 Completely Randomized Designs In a completely randomized design all units or runs are put into a hat and randomly assigned to the treatments. Number the 30 boards. Pick 10 numbers (boards) for each glue. – Put the board numbers 1-30 in column 1 of Excel. – Put random numbers into column 2. – Sort by the random column 2. – Assign the board numbers in rows 1-10 to glue 1, rows 11-20 to glue 2, and rows 21-30 to glue 3.
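The 30-board assignment can be sketched in Python as a shuffle followed by cutting the shuffled list into equal groups. The function name and the seed are assumptions for illustration; the board numbers and glue labels follow the slide.

```python
import random


def completely_randomized_design(units, treatments, seed=None):
    """Randomly split the units into equal-sized groups, one per treatment."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # plays the role of sorting by the random column
    per_group = len(units) // len(treatments)
    return {
        treatment: sorted(shuffled[i * per_group:(i + 1) * per_group])
        for i, treatment in enumerate(treatments)
    }


boards = list(range(1, 31))  # board numbers 1-30
assignment = completely_randomized_design(
    boards, ["glue 1", "glue 2", "glue 3"], seed=4
)
for glue, assigned in assignment.items():
    print(glue, assigned)
```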
27
2.4.2 Randomized Complete Block Design Units are grouped into blocks that are, we hope, homogeneous, and treatments are randomized to units within each block. – Form 10 sets (blocks) of 3 similar boards each. – Within each set (block) assign 1 board randomly to each glue. Most commonly each treatment is replicated once in each block. Example 12 is unusual in this regard.
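A randomized complete block assignment for the glue example might look like the following sketch. It assumes the 10 blocks of 3 similar boards have already been formed; the board labels and the function name are hypothetical.

```python
import random


def randomized_complete_block_design(blocks, treatments, seed=None):
    """Within each block, assign each treatment to exactly one unit at random."""
    rng = random.Random(seed)
    plan = {}
    for block_id, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block needs one unit per treatment")
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # fresh random order for every block
        plan[block_id] = dict(zip(units, shuffled))
    return plan


# Hypothetical blocks: block b holds three similar boards.
blocks = {
    b: [f"board {3 * (b - 1) + k}" for k in (1, 2, 3)] for b in range(1, 11)
}
plan = randomized_complete_block_design(
    blocks, ["glue 1", "glue 2", "glue 3"], seed=5
)
for block_id, mapping in plan.items():
    print(block_id, mapping)
```

Note that the blocks are an input to the randomization, which matches the rule that blocks are set up before units are assigned to treatments.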
28
2.5 Preparing to Collect Engineering Data Read the book. Problem Definition Step 1: Identify the problem. Step 2: Understand the context of the problem. Step 3: State in precise terms the objective and scope of the study.
29
Study Definition Step 4: Identify the response variable(s) and appropriate instrumentation. Step 5: Identify possible factors influencing responses. Step 6: Decide whether (and if so how) to manage factors likely to affect the responses. Step 7: Develop a detailed data collection protocol and timetable for the first phase.
30
Physical Preparation Step 8: Assign responsibility for careful supervision. Step 9: Identify technicians and provide necessary instruction in objectives and methods. Step 10: Prepare data collection forms and/or equipment. Step 11: Do a dry run of analysis on fictitious data. Step 12: Write up a 'best guess' prediction of results. See the text for more details.
31
Some Study Questions What advantages does an experimental study have compared to an observational study? What is the difference between a population and a sample? Give an example of multivariate data. Managed variables are either experimental or controlled variables. What is a controlled variable? What is an extraneous variable? What are the 3 strategies for reducing effects of extraneous variables?
32
What is a “block”? Blocks are set up B_____ units are assigned to treatments. Fill in the blank. What is the potential advantage to the randomized block design versus a completely randomized design? Give an example where 2 measurements are not separate, independent replicates.