

1 CHAPTER 3 Analysis of Data

2 Data Analysis The tasks in connection with the analysis of data include the following: 1. Reduction of raw data 2. Summary of data 3. Study of relations between variables

3 1. Reduction of Raw Data The units in which data are recorded differ with the measurement method, e.g., kN for loads or mm for deformations. Because most data have meaning only in comparison with similar data, they should be reduced to comparable values; e.g., loads are reduced to stresses in MPa and deformations to strains. In reducing data, corrections must be applied for systematic errors.
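
The reduction step can be sketched in a few lines. This is only an illustration: the function names, the specimen dimensions, and the zero-offset correction are assumptions, not values from the slides.

```python
def correct_reading(reading: float, zero_offset: float) -> float:
    """Remove a known systematic error (here assumed to be a constant zero offset)."""
    return reading - zero_offset


def stress_mpa(load_kn: float, area_mm2: float) -> float:
    """Reduce a load in kN on an area in mm^2 to a stress in MPa (1 MPa = 1 N/mm^2)."""
    return load_kn * 1000.0 / area_mm2


def strain(deformation_mm: float, gauge_length_mm: float) -> float:
    """Reduce a deformation in mm to a dimensionless strain."""
    return deformation_mm / gauge_length_mm


# Hypothetical specimen: 100 mm^2 cross-section, 100 mm gauge length.
load = correct_reading(50.5, zero_offset=0.5)        # 50.0 kN after correction
specimen_stress = stress_mpa(load, area_mm2=100.0)   # 500.0 MPa
specimen_strain = strain(0.2, gauge_length_mm=100.0)
```

Once reduced this way, results from specimens of different sizes can be compared directly.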

4 2. Summary of Data It is important to assemble and evaluate the accumulated masses of data in large-scale experiments. Statistical procedures are advantageous for summarizing the data.

5 3. Study of Relations between Variables The final step is to develop relations between the data obtained from the test and previously obtained data or some theory. The skill with which this is done depends on the capacity and background of the analyst. Common devices employed in studying such relations are tabulations, graphs, bar charts, and correlation diagrams; the procedure is usually to hold constant all variables except two, whose relation is investigated.

6 Statistical Methods Descriptive methods help us to present data in a comprehensible form. Inference methods help us generalize from the properties of a limited sample to those of the whole population, thus making testing more efficient.

7 Random Variables A random variable may be either discrete or continuous. If the set of all possible values of the random variable is finite or countably infinite, the random variable is discrete. If the set of all possible values of the random variable is an interval, the random variable is continuous.

8 3.1 Variations in Data All data derived from tests are subject to variation. After the measurements have been corrected for the effects of systematic errors, it is usually found that the variations in corrected measurements follow a chance distribution. For large numbers of data, variations in measurements and measures of properties have been found to coincide closely with variations computed from theoretical considerations. When the data are few, the coincidence is often not so good, but the concepts developed from the theory of probability are applied and afford a fairly workable means of summarizing and utilizing data.

9 Raw Data Raw data: the data collected in original form, or the results listed in order of testing. It is hard to analyse raw data. Chart 3.1 shows the net mass of the galvanized iron sheets before and after the galvanization process.

10 Ungrouped frequency distribution Ungrouped frequency distribution (u.f.d.): arranging the items according to magnitude, usually in ascending order. The minimum and maximum values may be selected, and the mean, median and range may be calculated on the u.f.d. It is also possible to study the array by dividing it into equal parts, such as quartiles (four parts), deciles (10 parts), or percentiles (100 parts). Chart 3.2 shows the previous data in this form; each of the columns in the table represents one quartile.
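
As a minimal sketch of these steps, the snippet below arranges a small set of readings in ascending order and computes the summary values mentioned. The readings are hypothetical (they are not the data of Chart 3.1), and `statistics.quantiles` uses one of several common quartile conventions.

```python
import statistics

# Hypothetical raw readings, listed in order of testing.
masses = [12.4, 10.1, 11.7, 13.0, 10.9, 12.2, 11.3, 12.8]

array = sorted(masses)                 # ungrouped frequency distribution
minimum, maximum = array[0], array[-1]
data_range = maximum - minimum
mean = statistics.mean(array)
median = statistics.median(array)

# Three cut points that divide the array into four quartiles.
q1, q2, q3 = statistics.quantiles(array, n=4)
```

The second cut point `q2` coincides with the median; asking for `n=10` or `n=100` gives deciles or percentiles in the same way.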

11 Data Grouping Analyzing the data is important so that the results may be presented in tabular or graphical form. Most data in materials testing are grouped according to magnitude. Arranging data according to magnitude results in a frequency distribution series. When the time of occurrence (time of testing) is important, a chronological sequence is sometimes used and the data are presented as a time series, e.g., the amount of concrete placed on a project each day, the determination of creep, or the deterioration of materials after alternate freeze-thaw cycles. Some data, such as the results of test borings, may require geographical grouping.

12 Frequency Distribution It is often useful to group the data according to subdivisions called cells, classes or step intervals. After the length of the interval has been decided, the number of items in each interval, called the class frequency (or frequency), is determined. When there is a large number of items, 13 to 20 class intervals are recommended. Too many intervals may give an irregular distribution; in that case 10 class intervals are chosen. When the total number of items is less than 25, such a presentation is of little value. Chart 3.3 shows the frequency histogram of the example.
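
The grouping into class intervals can be sketched as follows. The helper name `frequency_distribution`, the equal-width intervals, and the tiny data set are assumptions chosen for illustration; a real data set would have far more than 25 items.

```python
def frequency_distribution(data, n_classes):
    """Count the items falling in each of n_classes equal-width class intervals.

    Assumes the data are not all identical, so the interval width is positive.
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_classes
    counts = [0] * n_classes
    for x in data:
        # The maximum value is assigned to the last interval.
        k = min(int((x - lo) / width), n_classes - 1)
        counts[k] += 1
    edges = [lo + i * width for i in range(n_classes + 1)]
    return edges, counts


# Hypothetical data, grouped into four class intervals of width 2.
edges, counts = frequency_distribution([1, 2, 2, 3, 3, 3, 4, 4, 5, 9], n_classes=4)
```

Plotting `counts` as ordinates against the intervals in `edges` gives the frequency histogram.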

13 Frequency Histogram Graphical illustrations usually help us to visualize the nature of data. The x axis shows the variable studied. The frequencies, actual or relative, are plotted as ordinates.

14 Cumulative Frequency Diagram Sometimes it is of interest to know the number of data that fall below (or above) a certain value. For this reason the cumulative frequency or the relative cumulative frequency may be shown. Chart 3.4 shows the cumulative frequency diagram of the example.

15 Cumulative Frequency Diagram The variable under consideration is plotted on the x axis. When both the x axis and the y axis are arithmetic, the cumulative distribution takes a characteristic form called an ogive curve.
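
A minimal sketch of how the cumulative and relative cumulative frequencies follow from the class frequencies. The frequencies used here are hypothetical.

```python
from itertools import accumulate

# Hypothetical class frequencies, ordered from the lowest class interval upward.
class_frequencies = [3, 5, 1, 1]

# Number of items at or below the upper limit of each class interval.
cumulative = list(accumulate(class_frequencies))

# The same values as fractions of the total number of items.
total = cumulative[-1]
relative_cumulative = [c / total for c in cumulative]
```

Plotting `cumulative` (or `relative_cumulative`) against the upper class limits on arithmetic axes gives the ogive.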

16 Sampling and Statistical Errors Samples should be taken in a random manner, so that each specimen has an equal chance of being selected every time a choice is made. Sampling may be done with or without replacement: the chosen specimen may be returned to the population before the next choice is made, or discarded. For destructive tests the latter method must be used, and it is usually more efficient in any case.
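
The two sampling schemes can be illustrated with Python's standard `random` module. The population of 20 specimen labels and the fixed seed are assumptions made only to keep the sketch reproducible.

```python
import random

rng = random.Random(0)             # fixed seed, for reproducibility only
population = list(range(1, 21))    # hypothetical specimen labels 1..20

# Without replacement: a chosen specimen is discarded, so it cannot recur.
without_replacement = rng.sample(population, k=5)

# With replacement: a chosen specimen is returned, so repeats are possible.
with_replacement = rng.choices(population, k=5)
```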

17 Sample Size-1 The size of the sample is important, as the mean of one sample is likely to differ from that of another. If in the example problem we had made only 4 observations instead of 80, we feel that we would have obtained a less accurate representation of the population, but we don’t know how much less accurate.

18 Sample Size-2 If we have a population size of N, the number of possible samples of size n is N!/[n!(N-n)!]. The mean of all the individual sample means equals the mean of the population.
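
Both statements can be checked by brute force on a small population. The five values below are hypothetical, and the check enumerates every possible sample of size n = 2 drawn without replacement.

```python
from itertools import combinations
from math import comb
from statistics import mean

population = [2.0, 4.0, 6.0, 8.0, 10.0]   # hypothetical population, N = 5
n = 2

# Every possible sample of size n, drawn without replacement.
samples = list(combinations(population, n))
number_of_samples = len(samples)           # equals N! / [n! (N - n)!]

# The mean of all the individual sample means.
sample_means = [mean(s) for s in samples]
mean_of_sample_means = mean(sample_means)
```

Here `number_of_samples` agrees with `comb(5, 2)`, and `mean_of_sample_means` equals the population mean.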

19 Sample Size-3 If N is very large compared to n, the standard deviation σ_s of the sample means about the population mean is related to the standard deviation σ_p of the population by σ_s = σ_p/√n. σ_s is called the standard error of the mean. If σ_p is unknown, as is usually the case, it may be estimated, for example, by using the standard deviation of the sample as an approximation.

20 Sample Size-4 In our example of 80 galvanized sheet specimens, the mean and the standard deviation are calculated from the sample; the standard deviation is 2.089 g. If we assume the standard deviation of the entire population to be equal to this value, then the standard error of the mean is 2.089/√80 ≈ 0.234 g. If we had chosen only four specimens, the corresponding value would be 2.089/√4 ≈ 1.04 g.
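
The arithmetic of this slide can be reproduced with the usual expression for the standard error of the mean when N is much larger than n, namely the population standard deviation divided by √n. The function name is an assumption; the value 2.089 g and the sample sizes 80 and 4 are taken from the slide.

```python
from math import sqrt


def standard_error(sigma_p: float, n: int) -> float:
    """Standard error of the mean for a sample of size n (N assumed much larger than n)."""
    return sigma_p / sqrt(n)


se_80 = standard_error(2.089, 80)   # about 0.234 g
se_4 = standard_error(2.089, 4)     # about 1.04 g
```

Going from 80 specimens down to 4 increases the standard error by a factor of √20, roughly 4.5.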

21 Errors vs Residuals An error is the amount by which an observation differs from its expected value (the average of the population); errors are unobservable. A residual, on the other hand, is an observable estimate of the unobservable error. The sample average is used as an estimate of the population average. The difference between the tensile strength of each reinforcement in the sample and the unobservable population average is an error, and the difference between the tensile strength of each reinforcement in the sample and the observable sample average is a residual.
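
The distinction can be made concrete with a small sketch. All numbers are hypothetical, and the population mean is given a value only for illustration, since in practice it is unobservable.

```python
from statistics import mean

population_mean = 500.0                  # hypothetical; unknown in a real test
sample = [492.0, 505.0, 498.0, 511.0]    # hypothetical tensile strengths, MPa

sample_mean = mean(sample)               # observable estimate of the population mean

# Errors are measured from the population mean (unobservable in practice).
errors = [x - population_mean for x in sample]

# Residuals are measured from the sample mean (always computable).
residuals = [x - sample_mean for x in sample]
```

The residuals always sum to zero because they are taken about the sample mean; the errors generally do not.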

22 Correlation Correlation indicates the strength and direction of a linear relationship between two random variables. To study the relation within a group of paired measurements, the obvious procedure is to construct a scatter diagram.

23 Correlation The line representing the best fit is the regression line. If the line is straight, its general form is y = mx + b, where m and b are the regression coefficients. If all points were on the regression line, the correlation would be perfect and the coefficient of correlation would be +1 or -1, the sign depending on the slope of the line. For a straight regression line, a wide scatter decreases the magnitude of the coefficient of correlation (r).
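
A minimal sketch of fitting a straight regression line y = mx + b and computing the coefficient of correlation r. It assumes an ordinary least-squares fit, which the slides do not name explicitly, and the hardness and strength values are hypothetical.

```python
from math import sqrt


def linear_fit(xs, ys):
    """Return slope m, intercept b and correlation coefficient r of a least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    r = sxy / sqrt(sxx * syy)
    return m, b, r


# Hypothetical paired measurements: hardness vs tensile strength (MPa).
hardness = [120.0, 140.0, 160.0, 180.0, 200.0]
strength = [410.0, 480.0, 530.0, 620.0, 670.0]
m, b, r = linear_fit(hardness, strength)
```

If every point lies exactly on a line, r comes out as +1 or -1; the scatter in the data above gives an r slightly below 1.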

24 Example: Tensile Strength vs Hardness Scatter Diagram

25 The heavy dashed lines equally spaced on both sides of the regression line can be placed so as to indicate any desired probability limits. The frequency polygon shows that the most probable strength for a hardness H is the central value S. For the example given, a hardness of H indicates that the chances are even (1 to 1) that the tensile strength will be between s_1 and s_2, because the limits are placed on each side of the central value S. In the frequency distribution shown to the right, the open area is equal to that shown cross-hatched, each being one-half the total.

26 Quality control charts It is practically impossible to attain a given value of quality in each successive manufactured article, because quality is a variable and the change in its magnitude follows a frequency distribution. The variation in the magnitude of some statistic of a measurable property, such as tensile strength, can be used as a criterion of quality. Values of a given function of quality, such as the arithmetic mean of the tensile strength of samples, each containing an equal number of items, say five, are plotted as ordinates against a scale of abscissas that gives a numerical sequence of samples increasing in the customary way from left to right.

27 Example: Quality Control Chart

28 The control chart presents the data so that their consistency and regularity can be seen at a glance. The limits of variability, the lines parallel to the abscissas, are commonly set at three standard deviations on both sides of the central value. With a normal distribution, 99.73% of the samples will then satisfy the criterion.
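
A rough sketch of how such limits might be computed when no standard is given. The six samples of five tensile strengths are hypothetical, and estimating the standard deviation directly from the plotted sample means is a simplifying assumption, not necessarily the procedure behind the chart in the slides.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical tensile strengths (MPa): six samples of five items each.
samples = [
    [502.0, 498.0, 505.0, 499.0, 501.0],
    [497.0, 503.0, 500.0, 496.0, 504.0],
    [501.0, 499.0, 502.0, 500.0, 498.0],
    [505.0, 503.0, 499.0, 502.0, 501.0],
    [498.0, 496.0, 501.0, 499.0, 500.0],
    [500.0, 504.0, 502.0, 497.0, 503.0],
]

sample_means = [mean(s) for s in samples]   # the plotted ordinates
center = mean(sample_means)                 # central value
sigma = stdev(sample_means)                 # assumption: estimated from the plotted means

upper_limit = center + 3 * sigma
lower_limit = center - 3 * sigma


def in_control(value: float) -> bool:
    """True if a plotted value lies within the three-standard-deviation limits."""
    return lower_limit <= value <= upper_limit


# Fraction of a normal distribution lying within three standard deviations.
coverage = NormalDist().cdf(3) - NormalDist().cdf(-3)   # about 0.9973
```

With only six plotted points, none of them can fall outside limits computed from the same points, so in practice the limits are refined as more samples accumulate.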

29 When the control chart is used in connection with a standard, the limits are established with respect to the specified value, but if no standards are given, the limits are determined on the basis of the data themselves as they are accumulated.