Administrative Matters Midterm II Results Take max of two midterm scores Approx grades: A 82-92 B 70-82 C D 0 – 60 F
Last Time Confidence Intervals –For proportions (Binomial) Choice of sample size –For Normal Mean –For proportions (Binomial) Interpretation of Confidence Intervals
Reading In Textbook Approximate Reading for Today’s Material: Pages , , Approximate Reading for Next Class: Pages ,
Sample Size for Proportions i.e. find n so that: Now solve to get: (good candidate for list of formulas) Problem: don’t know p
Sample Size for Proportions Solution 1: Best Guess Use a value of p from: –Earlier Study –Previous Experience –Prior Idea
Sample Size for Proportions Solution 2: Conservative Recall p(1 - p) ≤ 1/4, with the maximum at p = 1/2 So “safe” to use: p = 1/2
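A minimal sketch of both solutions, assuming the usual margin-of-error setup: find n so that z* sqrt(p(1 - p)/n) is at most a target margin m, which solves to n = (z*/m)^2 p(1 - p), rounded up. The symbols m and z*, the function name, the 3% margin and the best guess p = 0.2 are illustrative assumptions, not taken from the slides.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin.

    p_guess=0.5 is the conservative choice, since p(1-p) <= 1/4.
    """
    # Two-sided critical value z*: leaves (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_exact = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n_exact)


# Solution 2 (conservative): p = 1/2.
conservative_n = sample_size_for_proportion(0.03)
# Solution 1 (best guess): hypothetical p = 0.2 from an earlier study.
best_guess_n = sample_size_for_proportion(0.03, p_guess=0.2)
print(conservative_n, best_guess_n)
```

The conservative choice can never give a smaller n than a best guess, which is why it is “safe”.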
Interpretation of Conf. Intervals Mathematically: pic 1, pic 2, 3rd interpretation
Interpretation of Conf. Intervals Frequentist View: If we repeat the experiment many times, about 95% of the time the CIs will contain μ (and about 5% of the time they won’t)
Interpretation of Conf. Intervals Nice Illustration: Publisher’s Website Statistical Applets Confidence Intervals Shows proper interpretation: –If repeat drawing the sample –Interval will cover truth 95% of time
Interpretation of Conf. Intervals Nice Illustration: Publisher’s Website Statistical Applets Confidence Intervals Lower Confidence Level (95% → 80%): –Shorter confidence intervals –Leads to lower hit rate
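The repeated-sampling idea can also be tried without the applet. The sketch below is not the publisher's applet. It is a small simulation under assumed settings (Normal data with known σ, n = 25, μ = 10, σ = 2, 4000 repeats, all hypothetical) that counts how often the interval for the mean covers the truth at the 95% and 80% levels.

```python
import random
from statistics import NormalDist, fmean


def coverage_rate(confidence, n=25, mu=10.0, sigma=2.0, reps=4000, seed=1):
    """Fraction of simulated CIs for the mean (sigma known) that contain mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        # Draw a fresh sample and check whether its interval covers mu.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps


rate_95 = coverage_rate(0.95)
rate_80 = coverage_rate(0.80)
print(f"95% intervals covered mu in {rate_95:.1%} of repeats")
print(f"80% intervals covered mu in {rate_80:.1%} of repeats")
```

Both runs use the same seed, so the 80% intervals are just shorter versions of the 95% ones, and the lower hit rate comes only from the shorter length.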
Interpretation of Conf. Intervals Recall Class HW: Estimate % of Male Students at UNC (Revisit Class Example 7) Recall: Q1: Sample of 25 from Class Q2: Sample of 25 from any doorway Q3: Sample of 25 think of names Q4: Random sample (from phone book)
Interpretation of Conf. Intervals Histogram analysis: Class Example 7 Q1: Sample from Class: - Compare to theoretical - Some bias - Less variation
Interpretation of Conf. Intervals Histogram analysis: Class Example 7 Q2: From Doorways: - No bias - More variation
Interpretation of Conf. Intervals Histogram analysis: Class Example 7 Q3: Think up names: - Upwards bias - More variation
Interpretation of Conf. Intervals Histogram analysis: Class Example 7 Q4: Random Sample: - Looks better? - Reasonable variation? - Really need CIs etc.
Interpretation of Conf. Intervals Now consider C.I. View: Class Example 13 Explore idea: CI should cover 90% of time
Interpretation of Conf. Intervals Class Example 13
Interpretation of Conf. Intervals Class Example 13 Q1: Summarize Coverage 94% > 90% (since sd too small)
Interpretation of Conf. Intervals Class Example 13 Q2: Summarize Coverage 77% < 90% (since too variable)
Interpretation of Conf. Intervals Class Example 13 Q3: Summarize Coverage 77% < 90% (since too biased)
Interpretation of Conf. Intervals Class Example 13 Q4: Summarize Coverage 87% ≈ 90% (seems OK?)
Interpretation of Conf. Intervals Class Example 13 Simulate from Binomial 87% ≈ 90% (shows within expected range)
Interpretation of Conf. Intervals Class Example 13: Q1: SD too small → Too many cover Q2: SD too big → Too few cover Q3: Big Bias → Too few cover Q4: Good sampling → About right Q5: Simulated Binomial → Shows “natural var’n”
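To get a feel for the “natural variation” in coverage, here is a rough sketch that simulates whole classes from the Binomial. It is not the class spreadsheet: the true proportion 0.4, the 100 students per class and the 200 simulated classes are hypothetical choices, and the 90% interval is the usual large-sample one, p-hat ± z* sqrt(p-hat(1 - p-hat)/n).

```python
import random
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.95)  # two-sided 90% critical value, about 1.645


def class_coverage(rng, n_students=100, n=25, p_true=0.4):
    """Percent of one simulated class's 90% CIs for p that contain p_true."""
    hits = 0
    for _ in range(n_students):
        # One student's sample: a Binomial(n, p_true) count.
        count = sum(rng.random() < p_true for _ in range(n))
        p_hat = count / n
        margin = Z90 * (p_hat * (1 - p_hat) / n) ** 0.5
        hits += (p_hat - margin <= p_true <= p_hat + margin)
    return 100 * hits / n_students


rng = random.Random(7)
coverages = [class_coverage(rng) for _ in range(200)]
print("lowest class coverage: ", min(coverages))
print("average class coverage:", sum(coverages) / len(coverages))
print("highest class coverage:", max(coverages))
```

Under these assumptions, the spread between the lowest and highest simulated class gives a yardstick for judging whether an observed coverage such as 87% is within the expected range.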
Interpretation of Conf. Intervals HW: 6.20 ($1260, $1540), (but use Excel & make histogram)
Research Corner Another SiZer analysis: British Incomes Data o Annual Survey (1985) o Done in Great Britain o Variable of Interest: Family Income o Distribution?
Research Corner British Incomes Data SiZer Results: 1 bump at coarse scale (expected) 2 bumps at medium scale (Quite a radical statement) Finer scale bumps not statistically significant
Research Corner British Incomes Data 2 bumps at medium scale Usual models for Incomes (one bump only) 2 bumps were verified (in PhD dissertation) But when worth looking?
Next time Add multiple year plots as well In: IncomesAllKDE.mpg
Deeper look at Inference Recall: “inference” = CIs and Hypo Tests Main Issue: In sampling distribution $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$, usually σ is unknown, so replace with an estimate, s. For n large, should be “OK”, but what about: n small? How large is n “large”?
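One way to see why small n is a worry: a quick simulation sketch, under assumed settings (Normal data, μ = 0, σ = 1, 4000 repeats, sample sizes 5 and 100, all hypothetical), of how often the interval x-bar ± z* s/sqrt(n) covers μ when s is simply plugged in for σ.

```python
import math
import random
from statistics import NormalDist, fmean

Z_STAR = NormalDist().inv_cdf(0.975)  # 95% critical value, about 1.96


def z_interval_coverage(n, reps=4000, mu=0.0, sigma=1.0, seed=3):
    """Coverage of xbar +/- z* s/sqrt(n), i.e. s plugged in for sigma."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # Sample standard deviation s (divisor n - 1).
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        if abs(xbar - mu) <= Z_STAR * s / math.sqrt(n):
            hits += 1
    return hits / reps


small_n_rate = z_interval_coverage(5)
large_n_rate = z_interval_coverage(100)
print(f"n = 5:   coverage {small_n_rate:.3f}")
print(f"n = 100: coverage {large_n_rate:.3f}")
```

In this sketch the nominal level is 95%, so any shortfall for n = 5 is the price of treating s as if it were σ.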
Unknown SD Goal: Account for “extra variability in the s ≈ σ approximation” Mathematics: Assume individual observations $X_i \sim N(\mu, \sigma)$ I.e. Data have mound shaped histogram Recall averages generally normal But now must focus on individuals
Unknown SD Then $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$ So can write: $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$ (recall: standardization (Z-score) idea, used in an important way here) Replace $\sigma$ by $s$, then $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ has a distribution named: “t-distribution with n-1 degrees of freedom”
t - Distribution Notes: 1. n is a parameter (Recall: parameters index families of probability distributions) that controls “added variability that comes from the s ≈ σ approximation” View: Study Densities, over degrees of freedom…
t - Distribution Compare N(0,1) distribution, to t-distribution, d.f. = 7 t is more spread: - Lower Peak - Fatter Tails - Smaller 5%-tile - Larger 99%-tile Makes sense, since the s ≈ σ approximation adds more variation
t - Distribution Compare N(0,1) distribution, to t-distribution, d.f. = 3 All effects are magnified Since s ≈ σ approx gets worse
t - Distribution Compare N(0,1) distribution, to t-distribution, d.f. = 1 Extreme Case Have terrible s ≈ σ approx
t - Distribution Compare N(0,1) distribution, to t-distribution, d.f. = 7 Now try larger d.f.
t - Distribution Compare N(0,1) distribution, to t-distribution, d.f. = 14 All approximations are better
t - Distribution Compare N(0,1) distribution, to t-distribution, d.f. = 25 Even better - Densities almost on top - Quantiles very close
t - Distribution Compare N(0,1) distribution, to t-distribution, d.f. = 100 Hard to see any difference Since excellent s ≈ σ approx
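The density comparison can be reproduced numerically. This is a small sketch using the standard formula for the t density (the function name t_pdf is just a label chosen here). It prints the peak height and the density out in the tail, at 3, for the degrees of freedom shown on the slides.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


normal = NormalDist()
print(f"N(0,1):   peak {normal.pdf(0):.4f}, density at 3: {normal.pdf(3):.4f}")
for df in (1, 3, 7, 14, 25, 100):
    print(f"d.f. {df:>3}: peak {t_pdf(0, df):.4f}, "
          f"density at 3: {t_pdf(3, df):.4f}")
```

A lower peak together with a larger tail density is the “more spread” effect, and both gaps shrink as d.f. grows.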
t - Distribution Notes: 2. Careful: set “degrees of freedom” = n – 1 (not n) Easy to forget later Good to add to sheet of notes for exam
t - Distribution Notes: 3. Must work with standardized version of $\bar{X}$, i.e. $\frac{\bar{X} - \mu}{s/\sqrt{n}}$ (will affect how we compute probs…) No longer can plug mean and SD into EXCEL formulas In text standardization was already done, since used in Normal table calc’ns
t - Distribution Notes: 4. Calculate t probs (e.g. areas & cutoffs), using TDIST & TINV Caution: these are set up differently from NORMDIST & NORMINV See Class Example 14
t - Distribution Class Example 14: Calculate Upper Prob Using TDIST (Check TDIST menu): - cutoff - d. f. - upper prob. only Careful: opposite from NORMDIST, use upper, NOT lower probs
t - Distribution Class Example 14: To compute lower prob Use “1 – trick”, i.e. the “Not” Rule of probability
t - Distribution Class Example 14: How about upper prob of negative? Give it a try Get an error message in response (Click this for sometimes useful info)
t - Distribution Class Example 14: Reason: TDIST tuned for 2-tailed (where need cutoff > 0) (correct version for CIs and H. tests)
t - Distribution Class Example 14: Approach: Use “1 – trick” (to write as prob. can compute)
t - Distribution Class Example 14: For Two-Tailed Prob TDIST is very convenient (much better than NORMDIST)
t - Distribution Class Example 14: For Interior Prob Use “1 – trick” TDIST again very convenient (again better than NORMDIST)
t - Distribution Class Example 14: Now try increasing d.f. Big difference for small n But converges for larger n To Normal(0,1) (as expected)
t - Distribution HW: C23 For T ~ t, with degrees of freedom: (a) 3 (b) 12 (c) 150 (d) N(0,1) Find: i. P{T > 1.7} (0.094, 0.057, 0.046, 0.045) ii. P{T < 2.14} (0.939, 0.973, 0.983, 0.984) iii. P{T < -0.74} (0.256, 0.237, 0.230, 0.230) iv. P{T > -1.83} (0.918, 0.954, 0.965, 0.966)
t - Distribution HW: C23 v. P{|T| > 1.18} (0.323, 0.261, 0.240, 0.238) vi. P{|T| < 2.39} (0.903, 0.966, 0.982, 0.983) vii. P{|T| < -2.74} (0, 0, 0, 0)
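For checking answers away from EXCEL, here is a rough stand-in for the upper-tail and two-tailed calculations. It is not how TDIST is implemented: it is a sketch that integrates the t density numerically with Simpson's rule. The function names are made up for illustration, and a negative cutoff is handled with the “1 – trick” plus symmetry instead of returning an error.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(cutoff, df, steps=2000):
    """P{T > cutoff}: 1/2 minus the area from 0 to |cutoff| (Simpson's rule)."""
    a = abs(cutoff)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3
    # Negative cutoff: "1 - trick" plus symmetry, P{T > -a} = 1 - P{T > a}.
    return upper if cutoff >= 0 else 1 - upper


def t_two_tail(cutoff, df):
    """P{|T| > cutoff} for cutoff >= 0."""
    return 2 * t_upper_tail(cutoff, df)


for df in (3, 12, 150):
    print(df,
          round(t_upper_tail(1.7, df), 3),       # i.  P{T > 1.7}
          round(1 - t_upper_tail(2.14, df), 3),  # ii. P{T < 2.14}
          round(t_upper_tail(-1.83, df), 3),     # iv. P{T > -1.83}
          round(t_two_tail(1.18, df), 3),        # v.  P{|T| > 1.18}
          round(1 - t_two_tail(2.39, df), 3))    # vi. P{|T| < 2.39}
```

Lower and interior probabilities come from the same two functions through the “1 – trick”, which mirrors the spreadsheet approach.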
And now for something completely different “Thinking Outside the Box” Also Called: “Lateral Thinking”
And now for something completely different Find the word or simple phrase suggested: death..... life Answer: life after death
And now for something completely different Find the word or simple phrase suggested: ecnalg Answer: backward glance
And now for something completely different Find the word or simple phrase suggested: He's X himself Answer: He's by himself
And now for something completely different Find the word or simple phrase suggested: THINK Answer: think big!!
And now for something completely different Find the word or simple phrase suggested: ababaaabbbbaaaabbbb ababaabbaaabbbb.. Answer: long time no 'C'
t - Distribution Class Example 14: Next explore TINV (Inverse function: TDIST was given cutoff, find area) Given prob. (area) & d.f., find cutoff (next think carefully about interpretation)
t - Distribution Class Example 14: Next explore TINV Recall TDIST e.g. from above: Now invert this, i.e. given prob., find cutoff For same d.f., use resulting prob. as input But new answer is different
t - Distribution Class Example 14: Next explore TINV Recall TDIST e.g. from above: Maybe due to rounding? Try exact value Still get wrong answer
t - Distribution Class Example 14: Next explore TINV Reason for inconsistency: Works via 2-tailed, not 1-tailed, probability
t - Distribution Class Example 14: Explore TINV Works via 2-tailed, not 1-tailed, probability Check by inverting 2-tailed answer above: Get: plug in above output, to return to input
EXCEL Functions Summary: Normal: NORMDIST: plug in cutoff, get out area NORMINV: plug in area, get out cutoff (but TDIST is set up really differently)
EXCEL Functions t distribution: 1 tail: TDIST: plug in cutoff, get out area EXCEL notes: - no explicit inverse - backwards from Normal…
EXCEL Functions t distribution: 2 tail: TDIST: plug in cutoff, get out area TINV: plug in area, get out cutoff (EXCEL note: this one has the inverse)
EXCEL Functions Note: when need to invert the 1-tail TDIST, use twice the area: 1-tail Area = A corresponds to 2-tail Area = 2A
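The two-tailed convention and the “twice the area” rule can be checked with a small sketch. Again this is only an illustration, not EXCEL's algorithm: the two-tailed area comes from Simpson's rule on the t density, and the inverse is found by bisection. The function names and the search limit of 100 are assumptions made here, so very small areas (cutoffs beyond 100) are out of scope.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_two_tail(cutoff, df, steps=2000):
    """P{|T| > cutoff} for cutoff >= 0, via Simpson's rule on [0, cutoff]."""
    h = cutoff / steps
    total = t_pdf(0, df) + t_pdf(cutoff, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * total * h / 3


def t_inv_two_tail(area, df, hi=100.0):
    """Cutoff C with P{|T| > C} = area, found by bisection on [0, hi]."""
    lo = 0.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_two_tail(mid, df) > area:
            lo = mid  # tail area still too big: move the cutoff out
        else:
            hi = mid
    return (lo + hi) / 2


# Inverting the 2-tailed answer returns to the input cutoff.
two_tail_area = t_two_tail(1.18, 3)
cutoff_back = t_inv_two_tail(two_tail_area, 3)
# To invert a 1-tail area A = 0.05, plug in twice the area.
one_tail_cutoff = t_inv_two_tail(2 * 0.05, 3)
print(round(cutoff_back, 3), round(one_tail_cutoff, 3))
```

Feeding a 1-tail area straight into the two-tailed inverse would give a cutoff for half the intended tail, which is the inconsistency seen in the class example.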
t - Distribution HW: C23 (cont.) viii. C so that 0.05 = P{|T| > C} (3.18, 2.18, 1.98, 1.96) ix. C so that 0.99 = P{|T| < C} (5.84, 3.05, 2.61, 2.58)