Published by Esmond Randall. Modified over 9 years ago.
1
CHAPTER 3 Analysis of Data
2
Data Analysis
The tasks in connection with the analysis of data include the following:
1. Reduction of raw data
2. Summary of data
3. Study of relations between variables
3
1. Reduction of Raw Data
The units in which data are recorded differ by measurement method, e.g., kN for loads or mm for deformations. Because most data have meaning only in comparison with similar data, they should be reduced to comparable values; e.g., loads are reduced to stresses in MPa and deformations to strains. In reducing data, corrections have to be applied for systematic errors.
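As a minimal sketch of this reduction step, the Python below converts a load in kN and a deformation in mm into a stress in MPa and a dimensionless strain. The function names, specimen dimensions, and readings are hypothetical and chosen only for illustration; corrections for systematic errors are not modelled.

```python
def stress_mpa(load_kn: float, area_mm2: float) -> float:
    """Convert a load in kN acting on an area in mm^2 to a stress in MPa (N/mm^2)."""
    return load_kn * 1000.0 / area_mm2


def strain(deformation_mm: float, gauge_length_mm: float) -> float:
    """Convert a deformation in mm over a gauge length in mm to a dimensionless strain."""
    return deformation_mm / gauge_length_mm


# Hypothetical readings, not taken from the presentation.
load_kn = 50.0
area_mm2 = 200.0
deformation_mm = 0.10
gauge_length_mm = 50.0

print(stress_mpa(load_kn, area_mm2))            # stress in MPa
print(strain(deformation_mm, gauge_length_mm))  # strain, dimensionless
```

Once reduced this way, results from specimens of different sizes can be compared directly.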
4
2. Summary of Data
It is important to assemble and evaluate the accumulated masses of data in large-scale experiments. Statistical procedures are advantageous for summarizing the data.
5
3. Study of Relations between Variables
The final step is to develop relations between the data obtained from the test and previously obtained data or some theory. The skill with which this is done depends on the capacity and background of the analyst. Common devices employed in studying such relations are tabulations, graphs, bar charts, and correlation diagrams; the procedure is usually to hold constant all variables except two, whose relation is investigated.
6
Statistical Methods
Descriptive methods help us to present data in a comprehensible form. Inference methods help us generalize from the properties of a limited sample to those of the whole population, thus making testing more efficient.
7
Random Variables
A random variable may be either discrete or continuous. If the set of all possible values of the random variable is either finite or countably infinite, then the random variable is discrete. If the set of all possible values of the random variable is an interval, then the random variable is continuous.
8
3.1 Variations in Data
All data derived from tests are subject to variation. After the measurements have been corrected for the effects of systematic errors, it is usually found that the variations in corrected measurements follow a chance distribution. For large numbers of data, variations in measurements and measures of properties have been found to coincide closely with variations computed from theoretical considerations. When the data are few, the coincidence is often not so good, but the concepts developed from the theory of probability are applied and afford a fairly workable means of summarizing and utilizing data.
9
Raw Data
Raw data are the data collected in original form, or the results listed in order of testing. It is hard to analyse raw data. Chart 3.1 shows the net mass of the galvanized iron sheets before and after the galvanization process.
10
Ungrouped frequency distribution
An ungrouped frequency distribution (u.f.d.) is obtained by arranging the items according to magnitude, usually in ascending order. The minimum and maximum values may be selected, and the mean, median, and range may be calculated from the u.f.d. It is also possible to study the array by dividing it into equal parts, such as quartiles (four parts), deciles (10 parts), or percentiles (100 parts). Chart 3.2 shows the previous data in this form; each of the columns in the table represents one quartile.
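The steps above can be sketched in Python with the standard `statistics` module. The readings below are hypothetical and are not the data of Chart 3.1; `statistics.quantiles` uses its default interpolation method, which is only one of several conventions for locating quartiles.

```python
import statistics

# Hypothetical net-mass readings in g, listed in order of testing.
raw = [43.1, 41.8, 44.0, 42.5, 40.9, 42.7, 45.2, 41.5]

array = sorted(raw)                     # ungrouped frequency distribution (ascending order)
minimum, maximum = array[0], array[-1]  # smallest and largest items
data_range = maximum - minimum
mean = statistics.mean(array)
median = statistics.median(array)

# Three cut points that divide the ordered array into four parts (quartiles).
quartile_cuts = statistics.quantiles(array, n=4)

print(array)
print(minimum, maximum, data_range, mean, median)
print(quartile_cuts)
```

Passing `n=10` or `n=100` to `statistics.quantiles` gives the cut points for deciles or percentiles in the same way.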
11
3.2 Data Grouping
Analyzing the data is important so that the results may be presented in tabular or graphical form. Most data in materials testing are grouped according to magnitude. The arrangement of data according to magnitude results in a frequency distribution series. When the time of occurrence (time of testing) is important, a chronological sequence is sometimes used and the data are presented as time series, e.g., the amount of concrete placed on a project each day, determination of creep, or deterioration of materials after alternate freeze-thaw cycles. Some data, such as results of test borings, may require geographical grouping.
12
Frequency Distribution
It is often useful to group the data according to subdivisions called cells, classes, or step intervals. After the length of the interval has been decided, the number of items in each interval, called the class frequency (or frequency), is determined. When there is a large number of items, 13 to 20 class intervals are recommended. Too many intervals may give an irregular distribution; in this case 10 class intervals are chosen. When the total number of items is less than 25, such a presentation is of little value. Chart 3.3 shows the frequency histogram of the example.
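A minimal sketch of counting class frequencies follows. The function name, the interval bounds, and the readings are hypothetical; only eight readings are used to keep the example short, although, as noted above, so few items would make a grouped presentation of little value in practice.

```python
def class_frequencies(values, lower, width, n_classes):
    """Count the values falling in each class interval
    [lower + i*width, lower + (i+1)*width) for i = 0 .. n_classes-1."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - lower) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"value {v} lies outside the class intervals")
        counts[index] += 1
    return counts


# Hypothetical readings in g and hypothetical class intervals of width 1 g from 40 g.
values = [40.9, 41.5, 41.8, 42.5, 42.7, 43.1, 44.0, 45.2]
frequencies = class_frequencies(values, lower=40.0, width=1.0, n_classes=6)
relative = [f / len(values) for f in frequencies]  # relative class frequencies

print(frequencies)
print(relative)
```

Plotting `frequencies` or `relative` as bar heights against the class intervals gives the frequency histogram.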
13
Frequency Histogram
Graphical illustrations usually help us to visualize the nature of data. The x axis shows the variable studied. The frequencies, actual or relative, are plotted as ordinates.
14
Cumulative Frequency Diagram
Sometimes it is of interest to know the number of data that fall below (or above) a certain value. For this reason the cumulative frequency or the relative cumulative frequency may be shown. Chart 3.4 shows the cumulative frequency diagram of the example.
15
Cumulative Frequency Diagram
The variable under consideration is plotted on the x axis. When both the x axis and the y axis are arithmetic, the cumulative distribution takes a characteristic form called an ogive curve.
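As an illustration, cumulative and relative cumulative frequencies can be obtained from class frequencies by a running sum. The class bounds and frequencies below are hypothetical, not those of Chart 3.4.

```python
from itertools import accumulate

# Hypothetical upper bounds of the class intervals (g) and their class frequencies.
class_upper_bounds = [41.0, 42.0, 43.0, 44.0, 45.0, 46.0]
frequencies = [1, 2, 2, 1, 1, 1]

# Number of items falling below each upper bound.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Fraction of all items falling below each upper bound.
relative_cumulative = [c / total for c in cumulative]

for bound, c, rc in zip(class_upper_bounds, cumulative, relative_cumulative):
    print(f"below {bound} g: {c} items ({rc:.3f})")
```

Plotting `cumulative` against `class_upper_bounds` on arithmetic axes gives the ogive described above.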
16
3.3 Sampling and Statistical Errors
Samples should be taken in a random manner, so that each specimen has an equal chance of being selected every time a choice is made. Sampling may be done with or without replacement: the chosen specimen may be returned to the population before the next choice is made, or discarded. For destructive tests the latter method must be used, and it is usually more efficient in any case.
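The two sampling schemes can be sketched with Python's standard `random` module. The population of 20 labelled specimens and the fixed seed are assumptions made only so the example is small and repeatable.

```python
import random

rng = random.Random(0)           # fixed seed so the sketch is repeatable
population = list(range(1, 21))  # hypothetical specimen labels 1..20

# Without replacement: a chosen specimen is discarded, so it appears at most once.
without_replacement = rng.sample(population, k=5)

# With replacement: a chosen specimen is returned, so it may be drawn again.
with_replacement = rng.choices(population, k=5)

print(without_replacement)
print(with_replacement)
```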
17
Sample Size-1
The size of the sample is important, as the mean of one sample is likely to differ from that of another. If in the example problem we had made only 4 observations instead of 80, we feel that we would have obtained a less accurate representation of the population, but we don’t know how much less accurate.
18
Sample Size-2
If we have a population of size N, the number of possible samples of size n is N!/[n!(N-n)!]. The mean of all the individual sample means equals the mean of the population.
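Both statements can be checked by brute force on a very small population. The five values below are hypothetical; the sketch enumerates every sample of size n drawn without replacement and compares the mean of the sample means with the population mean.

```python
from itertools import combinations
from math import comb
from statistics import mean

population = [40.0, 42.0, 43.0, 45.0, 47.0]  # hypothetical population, N = 5
n = 2                                        # sample size

# Every possible sample of size n, drawn without replacement.
samples = list(combinations(population, n))
n_samples = comb(len(population), n)         # N! / [n! (N - n)!]

sample_means = [mean(s) for s in samples]
mean_of_means = mean(sample_means)
population_mean = mean(population)

print(len(samples), n_samples)
print(mean_of_means, population_mean)
```

Enumeration is only practical for tiny populations, since the number of possible samples grows very quickly with N.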
19
Sample Size-3
If N is very large compared to n, the standard deviation of the sample means about the population mean, $\sigma_s$, is:
$\sigma_s = \sigma_p / \sqrt{n}$
where $\sigma_p$ is the standard deviation of the population. $\sigma_s$ is called the standard error of the mean. If $\sigma_p$ is unknown, as is usually the case, it may be estimated, for example, by using the standard deviation of the sample as an approximation.
20
Sample Size-4
In our example of 80 galvanized sheet specimens, the mean is calculated to be 42.69 g and the standard deviation to be 2.089 g. If we assume the standard deviation of the entire population to be equal to this value, then the standard error of the mean is $2.089/\sqrt{80} = 0.234$ g. If we had chosen only four specimens, the corresponding value would be $2.089/\sqrt{4} = 1.045$ g.
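The arithmetic of this example can be reproduced with a short Python sketch. The function name `standard_error` is hypothetical; the standard deviation of 2.089 g and the sample sizes 80 and 4 are the values quoted above.

```python
from math import sqrt


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n), valid when N is very large compared to n."""
    return sigma / sqrt(n)


sigma = 2.089  # g, sample standard deviation used as an estimate of the population value

se_80 = standard_error(sigma, 80)  # sample of 80 specimens
se_4 = standard_error(sigma, 4)    # sample of only 4 specimens

print(se_80)
print(se_4)
```

Cutting the sample from 80 to 4 specimens makes the standard error about 4.5 times larger, since it scales with the inverse square root of the sample size.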
21
Errors vs Residuals
An error is the amount by which an observation differs from its expected value (the population average); errors are unobservable. A residual, on the other hand, is an observable estimate of the unobservable error. The sample average is used as an estimate of the population average. The difference between the tensile strength of each reinforcement in the sample and the unobservable population average is an error, and the difference between the tensile strength of each reinforcement in the sample and the observable sample average is a residual.
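A short sketch of computing residuals follows. The tensile strengths are hypothetical values in MPa; the errors themselves cannot be computed here because the population average is unknown.

```python
from statistics import mean

# Hypothetical tensile strengths of reinforcement specimens, in MPa.
tensile = [512.0, 498.0, 505.0, 520.0, 495.0]

sample_mean = mean(tensile)  # observable estimate of the population average

# Residual: observation minus the sample average.
residuals = [x - sample_mean for x in tensile]

print(sample_mean)
print(residuals)
print(sum(residuals))  # residuals about the sample mean sum to zero
```

Unlike errors, residuals about the sample mean always sum to zero, so they are not independent of one another.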
22
3.4 Correlation
Correlation indicates the strength and direction of a linear relationship between two random variables. In order to study the relation within a group of paired measurements, the obvious procedure is to construct a scatter diagram.
23
Correlation
The line representing the best fit is the regression line. If the line is straight, its general form is y = mx + b, where m and b are the regression coefficients. If all points were on the regression line, the correlation would be perfect and the coefficient of correlation would be +1 or -1, the sign depending on the slope of the line. For a straight regression line, a wide scatter would decrease the coefficient of correlation (r).
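As a hedged sketch, the regression coefficients and the coefficient of correlation can be computed as below. The source does not say how the best-fit line is obtained; ordinary least squares and the Pearson coefficient are assumed here, and the function name and the hardness and strength values are hypothetical.

```python
from math import sqrt
from statistics import mean


def fit_line(x, y):
    """Least-squares slope m and intercept b of y = m*x + b,
    together with the Pearson coefficient of correlation r."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_bar - m * x_bar
    r = sxy / sqrt(sxx * syy)
    return m, b, r


# Hypothetical paired measurements: hardness number and tensile strength in MPa.
hardness = [150.0, 170.0, 190.0, 210.0, 230.0]
strength = [500.0, 560.0, 640.0, 690.0, 770.0]

m, b, r = fit_line(hardness, strength)
print(m, b, r)
```

Points lying exactly on a rising line give r = +1 and points exactly on a falling line give r = -1; scatter about the line moves r toward zero.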
24
Example: Tensile Strength vs Hardness Scatter Diagram
25
The heavy dashed lines equally spaced on both sides of the regression line can be placed so as to indicate any desired probability limits. The frequency polygon shows that the most likely or probable strength for a hardness H is the central value S. For the example given, a hardness of H indicates that the chances are even (1 to 1) that the tensile strength will be between $s_1$ and $s_2$, because the limits are placed on each side of the central value S. In the frequency distribution shown to the right, the open area is equal to that shown cross-hatched, each being one-half the total.
26
3.5 Quality control charts
It is practically impossible to attain a given value of quality in each successive manufactured article, because the quality is a variable and the change in its magnitude follows a frequency distribution. The variation in the magnitude of some statistic of a measurable property such as tensile strength can be used as a criterion of quality. Values of a given function of quality, such as the arithmetic mean of the tensile strength of samples, each containing an equal number of items, say five, are plotted as ordinates against a scale of abscissas that gives a numerical sequence of samples increasing in the customary way from left to right.
27
Example: Quality Control Chart
28
The control chart presents the data so that their consistency and regularity can be seen at a glance. The limits of variability, the lines parallel to the abscissas, are commonly set at three standard deviations on both sides of the central value. With a normal distribution, 99.73% of the samples will then satisfy the criterion.
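A minimal sketch of setting three-standard-deviation limits follows. The sample means are hypothetical, and the standard deviation is taken directly from the plotted means as a simplification; production control charts often estimate it from the spread within each sample instead. The last lines check the 99.73% figure for a normal distribution.

```python
from math import erf, sqrt
from statistics import mean, pstdev


def control_limits(values, k=3.0):
    """Return (lower, center, upper) with limits k standard deviations from the central value."""
    center = mean(values)
    sigma = pstdev(values)
    return center - k * sigma, center, center + k * sigma


def outside_limits(values, lower, upper):
    """1-based sequence numbers of the samples that fall outside the limits."""
    return [i for i, v in enumerate(values, start=1) if not lower <= v <= upper]


# Hypothetical means of tensile strength (MPa), each from a sample of five items.
sample_means = [505.0, 498.0, 502.0, 510.0, 496.0, 503.0, 499.0, 507.0]
lower, center, upper = control_limits(sample_means)
print(lower, center, upper)

# Later hypothetical samples checked against the established limits.
new_means = [504.0, 530.0, 500.0]
print(outside_limits(new_means, lower, upper))

# Fraction of a normal distribution lying within three standard deviations of the mean.
within_3_sigma = erf(3 / sqrt(2))
print(within_3_sigma)
```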
29
When the control chart is used in connection with a standard, the limits are established with respect to the specified value, but if no standards are given, the limits are determined on the basis of the data themselves as they are accumulated.