
1 Physics 114: Lecture 13
Confidence Levels, Probability Tests & Chi Square
John Federici, NJIT Physics Department

2 Physics Cartoons

3 Reminder of Previous Results
Last time we showed that rather than considering a single set of measurements, one can join multiple sets of measurements to refine both the estimated value and the precision of the mean. The rule for finding the standard deviation of such a combination of M sets of measurements, for the case of all statistically identical data sets (i.e. same errors σ), is σ'μ = σμ/√M. Likewise, the rule for combining data sets with different errors σi is 1/σ'μ² = Σi 1/σμi². That led us to the concept of weighting, where perhaps the errors themselves are not known, but the relative weighting of the measurements is known. In that case, the rule for an individual set of data is the weighted mean, x̄ = Σi wi·xi / Σi wi, and then one combines the N sets as usual.
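To make the weighted combination concrete, here is a minimal MATLAB sketch (the measurement values and errors below are made up purely for illustration):
xm  = [10.2, 9.8, 10.5];     % hypothetical measurements of the same quantity
sig = [0.3, 0.2, 0.5];       % their individual 1-sigma errors
w   = 1./sig.^2;             % weights: w_i = 1/sigma_i^2
xbar   = sum(w.*xm)/sum(w)   % weighted mean
sigbar = 1/sqrt(sum(w))      % uncertainty of the weighted mean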

4 Probability Tests We sometimes need to know more than just the mean and standard deviation (uncertainty) of a set of measurements. For many cases, we also want to assess how likely our result is to be "true." One way to do this is to relate the uncertainty to the Gaussian probability. For example, we have learned that approximately 68% of measurements in a Gaussian distribution fall within 1σ of the mean μ. In other words, 68% of our measurements should fall in the range (μ − σ) < x < (μ + σ). If we repeat our measurement many times to determine the mean more precisely (σμ = σ/√N), then again 68% of the repeated measurements should average in the range (μ − σμ) < x̄ < (μ + σμ). In science, it is expected that errors are given in terms of ±1σ. Thus, stating a result as 3.4±0.2 means that 68% of values fall between 3.2 and 3.6. In some disciplines, it is common instead to state 90% confidence intervals (1.64σ), in which case the same measurements would be stated as 3.4±0.33. To avoid confusion, one should say 3.4±0.33 (90% confidence level). The OpenStax textbook will typically use a 95% confidence interval, which is equivalent to about 2σ (more precisely 1.96σ).
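These coverage numbers are easy to check numerically. A short sketch, assuming the Statistics Toolbox function normcdf is available:
z = [1, 1.64, 1.96, 2];               % multiples of sigma
coverage = normcdf(z) - normcdf(-z)   % fraction of a Gaussian within +/- z*sigma
% gives approximately 0.68, 0.90, 0.95 and 0.954, matching the quoted levels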

5 Confidence Intervals A comment concerning notation:
In the sciences, as already noted, we typically quote a number as in the following example: 25.4±1.2. This means that 68% of all the measured values will be between 25.4 − 1.2 = 24.2 and 25.4 + 1.2 = 26.6. This is a 68% or 1σ confidence level. The OpenStax textbook uses the notation (value − margin of error, value + margin of error), so for this example the confidence interval is written as (24.2, 26.6) for a 68% confidence level, or (23.0, 27.8) for a 95% (2σ) confidence level. In the OpenStax notation: EBM = Error Bound for the population Mean (the margin of error), CL = Confidence Level, and α = probability that the measurement lies outside of the confidence interval (α = 1 − CL).
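As a sketch of how such an interval can be generated directly (assuming the Statistics Toolbox function norminv, and using the numbers from this example):
value = 25.4; sigma = 1.2;                % quoted value and its 1-sigma error
CL = 0.95;                                % desired confidence level
z  = norminv(1 - (1-CL)/2);               % two-sided z-score, about 1.96 for 95%
CI = [value - z*sigma, value + z*sigma]   % about (23.05, 27.75); roughly (23.0, 27.8) if z is rounded to 2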

6 Class Exercise Professor Federici measures the weight of ball bearings to be 21.04±0.25 grams. What is the confidence interval? What is the percent confidence for this interval? What is the 95% confidence interval?

7 Probability Tests, cont’d
A problem, however, occurs when we want to assign a probability estimate to measurements that are based on only a few samples. Although the samples are governed by the same parent mean μ and width σ, the sample width s (and hence the estimated error of the mean, s/√N) is so poorly determined with only a few measurements that we should take that into account. William S. Gosset (1876–1937) of the Guinness brewery in Dublin, Ireland ran into this problem. His experiments with hops and barley produced very few samples. Just replacing σ with s did not produce accurate results when he tried to calculate a confidence interval. He realized that he could not use a normal distribution for the calculation; he found that the actual distribution depends on the sample size. This problem led him to "discover" what is called the Student's t-distribution. The name comes from the fact that Gosset wrote under the pen name "Student." (OpenStax textbook) In this distribution, the parameter t is the deviation of the sample mean from the parent mean in units of the estimated standard deviation of the mean, t = (x̄ − μ)/(s/√N). Its probability density is a complicated function:

8 Probability Tests, cont’d
It is a complicated function:
pt(t; ν) = Γ((ν+1)/2) / (√(νπ) Γ(ν/2)) × (1 + t²/ν)^(−(ν+1)/2)
where Γ is the gamma function, and ν is the number of degrees of freedom (N − 1 in this case). One degree of freedom is 'removed' from the N measured values because we have the 'restriction' of the mean value. This function differs from the Gaussian PDF for small N, but is nearly identical for N > 30 or so. For each sample size N, there is a different Student's t-distribution. In the OpenStax textbook, a Student's t table (see Appendix H) gives t-scores for a given number of degrees of freedom and right-tailed probability. For our class, we will use MATLAB. Let's do an example.
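To see how quickly the two distributions converge, a short MATLAB sketch (assuming the Statistics Toolbox functions tpdf and normpdf):
tt = -5:0.01:5;                                                  % range of t values
plot(tt, normpdf(tt), 'k', tt, tpdf(tt,3), 'r', tt, tpdf(tt,29), 'b')
legend('Gaussian', 't, \nu = 3', 't, \nu = 29')
% the nu = 3 curve has visibly heavier tails; the nu = 29 curve is nearly Gaussian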

9 Based on Example 8.8 Suppose you do a study of the flow rate of gases through a porous material. You measure flow rates for 15 different samples with the results given below. Use the sample data to construct a 95% confidence interval for the mean flow rate for the sample (assumed normal) from which you took the data.
8.6; 9.4; 7.9; 6.8; 8.3; 7.3; 9.2; 9.6; 8.7; 11.4; 10.3; 5.4; 8.1; 5.5; 6.9
First of all, why don't we just calculate the standard deviation of the data and then say that the 95% confidence interval is ±2σ? Answer: because we have fewer than 30 samples, the Student-t distribution is appropriate rather than the Gaussian distribution.
STEP 1: To find the confidence interval, you need the sample mean and the error estimate (EBM).
>> x=[8.6,9.4,7.9,6.8,8.3,7.3,9.2,9.6,8.7,11.4,10.3,5.4,8.1,5.5,6.9];
>> mean(x)
ans = 8.2267
>> std(x)
ans = 1.6722

10 Based on Example 8.8 Suppose you do a study of the flow rate of gases through a porous material. You measure flow rates for 15 different samples with the results given below. Use the sample data to construct a 95% confidence interval for the mean flow rate for the sample (assumed normal) from which you took the data.
STEP 2: Use the confidence level in the problem and the INVERSE Student-t function, y = tinv(p,nu), where p is the percentile (cumulative probability) and nu is the number of degrees of freedom (see the MATLAB help for the syntax). If we have a 95% confidence level, then 2.5% of the values are ABOVE the confidence interval and 2.5% are BELOW it, so p = 1 − 0.025 = 0.975 = 97.5%.

11 Based on Example 8.8 Suppose you do a study of the flow rate of gases through a porous material. You measure flow rates for 15 different samples with the results given below. Use the sample data to construct a 95% confidence interval for the mean flow rate for the sample (assumed normal) from which you took the data.
STEP 2: Use the confidence level in the problem and the INVERSE Student-t function.
>> sizez=size(x);         % determine the number of data points in the data set
>> degFree=sizez(2)-1;    % determine the number of degrees of freedom
>> y=tinv(0.975,degFree)  % y is the t value, i.e. the distance from the mean in units of the estimated standard deviation of the mean
y = 2.1448
This is the t value for which 97.5% of the AREA under the Student-t distribution lies below it (only 2.5% lies above). It sets the error, relative to the mean, for a 95% confidence level.

12 Based on Example 8.8 Calculating the Error Bound (EBM)
Suppose you do a study of the flow rate of gases through a porous material. You measure flow rates for 15 different samples with the results given below. Use the sample data to construct a 95% confidence interval for the mean flow rate for the sample (assumed normal) from which you took the data.
STEP 3: Now that you have the value of y that corresponds to 2.5% of values lying ABOVE the confidence interval, you can use this y value to calculate the ERROR BOUND (EBM). The error bound formula for an unknown population mean μ, when the population standard deviation σ is NOT known, is EBM = t(α/2) · s/√N.
>> ErrorB=y*std(x)/sqrt(degFree+1)
ErrorB = 0.9261
Now we can write our error in the measurement and the confidence level. The mean is 8.2267, so the reported value should be 8.23±0.93, and the 95% confidence interval is (7.30, 9.15).
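Putting the three steps together, a minimal MATLAB sketch of the whole calculation (assuming the Statistics Toolbox function tinv):
x  = [8.6,9.4,7.9,6.8,8.3,7.3,9.2,9.6,8.7,11.4,10.3,5.4,8.1,5.5,6.9];
N  = numel(x);                       % number of data points (15)
CL = 0.95;                           % desired confidence level
t  = tinv(1-(1-CL)/2, N-1);          % two-sided t-score for N-1 degrees of freedom
EBM = t*std(x)/sqrt(N);              % error bound for the mean, about 0.93
CI  = [mean(x)-EBM, mean(x)+EBM]     % approximately (7.30, 9.15)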

13 Cartoon break…

14 Chi-squared tests In Phys 111 lab, you probably measured the distance that an object moves under constant acceleration. Did the values you measured seem to obey the constant acceleration formula? Do you think that the types of movies people prefer are different across different age groups? How would you determine (statistically) if a coffee machine was dispensing approximately the same amount of coffee each time? You could answer these questions by conducting a hypothesis test. We will use a new distribution called the chi-square distribution. In the OpenStax textbook, three major applications of the chi-square distribution are described:
1. the goodness-of-fit test, which determines if data fit a particular distribution, such as the constant acceleration example;
2. the test of independence, which determines if events are independent, such as in the movie example;
3. the test of a single variance, which tests variability, such as in the coffee example.
For this course, we will focus on the goodness-of-fit test with chi-squared.

15 Chi-Square Probability
I want to introduce the concept of the χ² (chi-square) test of goodness of fit. Let's consider the 'theoretical' constant acceleration formula, x(t) = xo + vo·t + ½at², and add some Gaussian noise to create 'experimental' points. (Figure parameters: a=0.5; vo=0; xo=0.)
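A minimal MATLAB sketch of how such 'experimental' points might be generated (the noise level sigma = 1 is an assumption for illustration; the class exercise at the end of the lecture uses the same idea):
a = 0.5; vo = 0; xo = 0;                 % constant acceleration parameters
t = linspace(0, 5, 20);                  % measurement times
yTheory = 0.5*a*t.^2 + vo*t + xo;        % theoretical distances
sigma = 1;                               % assumed measurement error (standard deviation of the noise)
yExp = yTheory + sigma*randn(size(t));   % noisy 'experimental' points
plot(t, yTheory, 'r-', t, yExp, 'bo')    % smooth theory curve and noisy data
xlabel('time t'); ylabel('distance y')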

16 Chi-Square Probability
Now let's imagine repeating the measurement 100 times. (Figure parameters: a=0.5; vo=0; xo=0.) The 'spread' in the dots is determined by the standard deviation. Note that, as expected, there is a RANGE of measured values for the distance y at each time t due to the noise. It is MORE LIKELY that the 'measured value' is close to the theoretical value than far away: the density of 'dots' is higher near the theory line.
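Continuing the sketch above, repeating the 'measurement' 100 times just means generating 100 noisy copies of the theory curve (same assumed noise level sigma = 1):
Nrep = 100;                                                     % number of repeated runs
yAll = repmat(yTheory, Nrep, 1) + sigma*randn(Nrep, numel(t));  % one noisy run per row
plot(t, yAll, 'b.'); hold on
plot(t, yTheory, 'r-', 'LineWidth', 2); hold off                % theory line through the cloud of points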

17 Chi-Square Probability
The definition of χ² is
χ² = Σi (yi − y(ti))² / σi²
where yi are the measurements, y(ti) is the expected value (i.e. the theoretical curve given by the constant acceleration formula), and σi is the expected standard deviation of each yi. You can see that at each time t you expect yi not to stray more than about σi from y on average, so each data point should contribute about 1 to the sum. Thus, the sum should be about n, the number of data points. This is almost right. In fact, statistically the expectation value of χ² is not n, but the number of degrees of freedom ν = n − nc, where nc is the number of constraints. Often we use the reduced chi-square, χ²/ν, whose expectation value is 1.
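Continuing the constant-acceleration sketch, χ² and the reduced χ² can be evaluated directly (sigma = 1 is the assumed noise level used to generate the data, and no parameters are being fitted here, so the number of constraints is taken as zero):
chi2 = sum((yExp - yTheory).^2 ./ sigma.^2);   % physicist's chi-square
nu   = numel(yExp);                            % degrees of freedom (no fitted parameters)
chi2red = chi2/nu                              % should come out close to 1 for a 'good fit'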

18 Meaning of the Chi-Square Test
Consider the plot below of the 'experimental measurements' and the smooth curve of the constant acceleration formula as a fit to the data. If we shift the smooth curve by changing xo, it will obviously not fit the data as well. Then χ² will be much larger than ν, because the deviations of the data at each time t from the shifted smooth curve are larger than σi. (Figure parameters: a=0.5; vo=0; xo=1.)

19 Meaning of the Chi-Square Test
Likewise, if we change the acceleration a, or the initial velocity vo, in the constant acceleration formula, either of these changes will also raise the value of χ². The best fit of the curve, in fact, is the one that minimizes χ², which then should be close to ν. What is ν in this case? It takes three parameters to define the constant acceleration formula (acceleration, initial velocity, initial position), so ν = n − nc = 6 − 3 = 3. (Figure parameters: a=1.3; vo=0; xo=0.)

20 Chi-Square in OpenStax textbook
It should be noted that in the OpenStax textbook, the value of chi-squared is defined differently:
χ² = Σ (O − E)² / E    (OpenStax textbook definition)
where O is the observed value and E is the expected value. The OpenStax textbook definition DOES NOT include the uncertainty in the measured values. As such, their definition of chi-squared is not well suited for experimental physics. Also note that their definition of chi-square has the UNITS of the measured quantity, while 'our' definition is unitless. Class Discussion: How does one determine if the theoretical curve is a 'good fit' to the experimental data?
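For the same sketch data as above, the two definitions can be evaluated side by side (points where the expected value is zero are excluded to avoid dividing by zero in the OpenStax form):
ok = yTheory > 0;                                            % avoid division by zero
chi2_phys = sum((yExp - yTheory).^2 ./ sigma.^2)             % unitless
chi2_open = sum((yExp(ok) - yTheory(ok)).^2 ./ yTheory(ok))  % carries the units of y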

21 Is theoretical curve a ‘good fit’?
For now, we will determine whether a theoretical curve is a good fit, but NOT consider changing the constants in the theory equation. So in our constant acceleration problem, a, vo and xo are fixed. To answer this question, the OpenStax textbook definition of chi-squared requires one to choose a significance level (say 1%) and then check whether the summed value of chi-squared corresponds to a tail-area probability of the chi-squared distribution that exceeds the 1% threshold. (Example 11.3: a plot of the chi-square distribution for the number of degrees of freedom in the problem, with the chi-square value for the data marked; the probability of a 'good fit' to the hypothesis does NOT meet the 1% standard, so it is NOT a good fit.)
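A minimal MATLAB sketch of this significance test, assuming the Statistics Toolbox function chi2cdf (the chi-square value and degrees of freedom below are hypothetical placeholders):
chi2val = 20; dof = 4;          % hypothetical chi-square value and degrees of freedom
p = 1 - chi2cdf(chi2val, dof);  % right-tail probability of a chi-square at least this large
alpha = 0.01;                   % chosen significance level (1%)
if p < alpha
    disp('Reject the hypothesis: NOT a good fit')
else
    disp('Cannot reject the hypothesis at this significance level')
end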

22 Is theoretical curve a ‘good fit’?
Let's use the 'physicist' definition of chi-squared. Unlike the OpenStax textbook approach, in which they DEFINE a significance level for the fit, our 'significance level' is set by the error, or standard deviation, of the measurement. If we assume that the experimental values are TYPICALLY within one standard deviation of the 'true' (i.e. theoretical) value, then each term contributes about 1, and the sum on the right-hand side becomes
χ² = Σi (yi − y(ti))² / σi² ≈ Σi σi²/σi² = n,
so a 'reduced' chi-squared value for a 'good fit' should be about 1. Of course, if we get a PERFECT fit (i.e. NO NOISE and a perfect fit), chi-squared is equal to zero.

23 Other Tests using Chi-Squared
The test of independence can be applied to questions such as: "Do you think that the types of movies people prefer are different across different age groups?" (See the OpenStax textbook.)

24 Other Tests using Chi-Squared
The test of a single variance can be applied to questions such as: "How would you determine (statistically) if a coffee machine was dispensing approximately the same amount of coffee each time?" (See the OpenStax textbook.)

25 Class Exercise Use the following MATLAB code to generate 'experimental' data for the distance travelled (yExp) under constant acceleration at various times t.
a=1; xo=0; vo=0;              % constant acceleration parameters
i=[1:100];
t=(i-1)/100*5;                % 100 times between 0 and just under 5 s
yTheory=0.5*a*t.^2+vo*t+xo;   % theoretical distances
noise=randn(1,100);           % Gaussian noise with standard deviation 1
yExp=yTheory+noise;           % 'experimental' data
How many data points are there? How many degrees of freedom? What is the standard deviation you should use in your formulas? Next, in MATLAB calculate chi-square using the 'experimental' and theoretical values above. Is chi-square close to the number of data points? Lastly, change the 'fitting' parameter to a=1.3. Repeat the calculation of chi-squared. Do you get a better fit? How do you know?

