Last Time Histograms Notions of Center

Last Time
Histograms: Binomial Probability Distributions, Lists of Numbers, Real Data, Excel Computation
Notions of Center: Average of list of numbers, Weighted Average

Administrative Matters
Midterm I, coming Tuesday, Feb. 24
Excel notation to avoid actual calculation
So no computers or calculators
Bring sheet of formulas, etc.
No blue books needed (will just write on my printed version)

Midterm I, coming Tuesday, Feb. 24
Material Covered: HW 1 – HW 5
Note: due Thursday, Feb. 19
Will ask grader to return Mon. Feb. 23
Can pick up in my office (Hanes 352)
So this week's HW not included

Midterm I, coming Tuesday, Feb. 24
Extra Office Hours:
Monday, Feb :00 – 9:00
Monday, Feb :00 – 10:00
Monday, Feb :00 – 11:00
Tuesday, Feb :00 – 9:00
Tuesday, Feb :00 – 10:00
Tuesday, Feb :00 – 2:00

Midterm I, coming Tuesday, Feb. 24
How to study:
Rework HW problems
  Since problems come from there
  Actually do, not “just look over”
  In random order (as on exam)
  Print HW sheets, use as a checklist
Work Practice Exam
  Posted in Blackboard “Course Information” Area

Reading In Textbook Approximate Reading for Today’s Material:
Pages , Approximate Reading for Next Class: Pages 55-68,

Big Picture Margin of Error Choose Sample Size Need better prob tools
Start with visualizing probability distributions

Next exploit constant shape property of Binomial
Centerpoint feels p
Spread feels n
Now quantify these ideas, to put them to work

Notions of Center
Will later study “notions of spread”

Notions of Center
Textbook: Sections 4.4 and 1.2
Recall parallel development:
(a) Probability Distributions
(b) Lists of Numbers (study 1st, since easier)

Notions of Center
Lists of Numbers
“Average” or “Mean” of x1, x2, …, xn:
Mean = (x1 + x2 + … + xn) / n = x̄ (common notation)

Notions of Center Generalization of Mean: “Weighted Average”
Intuition: Corresponds to finding balance point of weights on number line
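
The balance-point intuition can be sketched in a few lines of Python. This is a minimal illustration, not part of the course materials: the function names and the sample numbers are assumptions chosen for the example.

```python
def mean(values):
    """Ordinary average: every value gets the same weight 1/n."""
    return sum(values) / len(values)


def weighted_average(values, weights):
    """Balance point of `weights` placed at `values` on the number line."""
    total_weight = sum(weights)
    return sum(w * x for x, w in zip(values, weights)) / total_weight


values = [1, 2, 6]                            # illustrative data
print(mean(values))                           # equal weights
print(weighted_average(values, [1, 1, 1]))    # same as the mean
print(weighted_average(values, [3, 1, 0]))    # weight shifted toward 1
```

With equal weights the weighted average reduces to the ordinary mean; putting more weight on a value pulls the balance point toward it.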

Notions of Center Textbook: Sections 4.4 and 1.2
Recall parallel development: (a) Probability Distributions (b) Lists of Numbers

Notions of Center Probability distributions, f(x)
Approach: use connection to lists of numbers

Approach: use connection to lists of numbers
Recall: think about many repeated draws
Draw X1, X2, …, Xn from f(x)
Compute the mean X̄ of the draws and express in terms of f(x)

Notions of Center
Express the mean of the draws in terms of f(x):
Rearrange list, depending on values
Count the number of Xi's that are 1 (and likewise for each other value)
Apply Distributive Law of Arithmetic
Recall “Empirical Probability Function”
Frequentist approximation
Result: a weighted average of values that X takes on, where weights are probabilities
This concept deserves its own name: Expected Value
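
The steps named above can be written out as follows. This is a reconstruction in standard notation, not necessarily the notation used on the slides; the symbols $\bar{X}$, $\hat{f}$ and the counting notation $\#\{\cdot\}$ are assumptions.

```latex
\begin{align*}
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i \\
  &= \frac{1}{n}\sum_{x} x \cdot \#\{i : X_i = x\}
     && \text{(rearrange list, depending on values)} \\
  &= \sum_{x} x \cdot \frac{\#\{i : X_i = x\}}{n}
     && \text{(distributive law)} \\
  &= \sum_{x} x \,\hat{f}(x)
     && \text{(empirical probability function)} \\
  &\approx \sum_{x} x \, f(x)
     && \text{(frequentist approximation, large } n\text{)}
\end{align*}
```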

Expected Value
Define Expected Value of a random variable X:
E(X) = Σx x f(x)
E(X) is useful shorthand notation
Recall f(x) = 0, for most x, so sum only operates for values X takes on

Expected Value
E.g. Roll a die, bet (as before):
Win $9 if 5 or 6, Pay $4 if 1, 2 or 3, otherwise (4) break even
Let X = “net winnings”
Are you keen to play?
Weighted average, weights & values:
E(X) = 9 × (2/6) + (−4) × (3/6) + 0 × (1/6) = 3 − 2 + 0 = 1
i.e. weighted average of values 9, -4 & 0, with weights of “how often expect”, thus “expected”
Conclusion: on average in many plays, expect to win $1 per play.
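
The die bet can be checked both exactly and by simulation. The sketch below is illustrative only: the names `distribution`, `expected_value` and `simulate_average` are assumptions, and the simulation uses a fixed seed so that it is repeatable.

```python
from fractions import Fraction
import random

# Net winnings X and their probabilities for one roll of a fair die.
distribution = {
    9: Fraction(2, 6),    # roll 5 or 6: win $9
    -4: Fraction(3, 6),   # roll 1, 2 or 3: pay $4
    0: Fraction(1, 6),    # roll 4: break even
}


def expected_value(dist):
    """Probability-weighted average of the values in `dist`."""
    return sum(x * p for x, p in dist.items())


def simulate_average(num_plays, seed=0):
    """Average net winnings over `num_plays` simulated rolls."""
    rng = random.Random(seed)
    total = 0
    for _ in range(num_plays):
        roll = rng.randint(1, 6)
        if roll >= 5:
            total += 9
        elif roll <= 3:
            total -= 4
    return total / num_plays


print(expected_value(distribution))   # exact expected value
print(simulate_average(100_000))      # long-run average, close to the exact value
```

No single play ever pays exactly the expected value; it only emerges as the average over many plays.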

Expected Value
Caution: “Expected value” is not what is expected on one play (which is either 9, -4 or 0)
But instead on average, over many plays
HW: , (1.9, 1)

Expected Value
Real life applications of expected value:
Decision Theory
  Operations Research
  Rational basis for making business decisions in presence of uncertainty
  Common goal: maximize expected profits
  Gives good average results over long run
Casino Gambling
  Casino offers games with + expected value (+ from their perspective)
  Their goal: good overall average performance
  Expected Value is a useful tool for this
Insurance
  Companies make profit by writing policies with + expected value
  Their goal is long run average performance
State Lotteries
  State's view: games with + expected value
  Raise money for state in long run overall

Flip Side of Expected Value
Decisions made against expected value:
Casino Gambling
  Why do people play?
  Odds are against them
  For sure will lose “over long run”
  But love of short run successes can make eventual long term loss worthwhile
  Are buying entertainment
Insurance
  Why should you buy it?
  You lose in expected value sense
  But E not applicable since you only play once
  Avoids chance of catastrophic loss
  Allows low cost sharing of risk
State Lotteries
  Why do people play?
  Clear loss in expected value
  But only play once (hopefully), so not applicable
  Worth feeling of hope from buying ticket?

Interesting issues about State Lotteries:
A very different type of tax
Big Plus: Only totally voluntary tax
Big Minus: Tax paid mostly by poor

Decisions made against expected value:
Key Lesson: Expected Value tells what happens on average over long run, not in one play
Conclude: Expected Value not good for everything, but very good for many things

Another Inverse View of Expected Value
Suppose you have $5000, and need $10,000
e.g. you owe mafia $5000 (gambling debt?), clean out safe at work for $5000. If give to mafia, you go to jail, so decide to raise another $5000 by gambling.
And can make even bets, with P[win] = 0.48
Can really do this, e.g. bet on Red in game of Roulette at a casino
Important question: Make one large bet? Or many small bets? Something in between?
For each $1 bet, get back $0 on a loss or $2 on a win:
E[Gain] = 0 × P[loss] + $2 × P[win] = 0 × (0.52) + $2 × (0.48) = $0.96
Interpretation: “expect” to lose $0.04 every time you play
Why games are so profitable for casinos
Many plays → Expected Value dictates result
After many plays, will surely lose! (lesson of expected value)
So best to make just one large bet!
Another View:
Strategy            P[win $10,000]
one $5000 bet       ≈ ½
two $2500 bets      ≈ (0.48)^2 ≈ ¼
four $1250 bets     ≈ 1/16
many bets           no chance
Casino Folklore: Sometimes people really make such bets
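
The arithmetic behind the one-large-bet argument can be sketched as below. The function names are illustrative assumptions, and the second function uses the same simplification as the strategy comparison: the $5000 is split into equal stakes and the goal is reached only if every bet wins.

```python
P_WIN = 0.48    # chance of winning one even-money bet (value from the slides)
START = 5000    # dollars in hand


def expected_return_per_dollar(p_win):
    """Expected amount returned on a $1 even-money bet: $0 on a loss, $2 on a win."""
    return 0 * (1 - p_win) + 2 * p_win


def chance_all_bets_win(num_bets, p_win=P_WIN):
    """Chance of winning every one of `num_bets` independent bets.

    Simplifying assumption: the stake is START / num_bets per bet,
    and the target is reached only if all of the bets win.
    """
    return p_win ** num_bets


print(round(expected_return_per_dollar(P_WIN), 2))
for num_bets in (1, 2, 4, 100):
    stake = START / num_bets
    print(f"{num_bets} bets of ${stake:.0f}: {chance_all_bets_win(num_bets):.4g}")
```

The chance shrinks geometrically with the number of bets, which is why a single large bet is the least bad strategy here.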

Expected Value
Binomial Expected Value: For X ~ Bi(n,p), Expected Value is
E(X) = Σx x f(x)   (probability weighted average of values)
 = Σx x (n choose x) p^x (1 − p)^(n − x), summed over x = 0, 1, …, n   (use Binomial Probability Distribution)
 = n p   (after a long and tricky calculation; details beyond scope of this course)
Makes sense: “Expect” to win proportion p, of n trials
Just use this formula from here on out
E.g. to capture “shifting mean”
HW: 5.28a, mean part only (900)
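
The "long and tricky calculation" can at least be checked numerically: summing x times the binomial probability over all x should agree with n p. A minimal sketch, with illustrative function names and example values of n and p:

```python
from math import comb


def binomial_pmf(x, n, p):
    """P[X = x] for X ~ Bi(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_mean_by_sum(n, p):
    """Expected value as the probability-weighted average of 0, 1, ..., n."""
    return sum(x * binomial_pmf(x, n, p) for x in range(n + 1))


for n, p in [(10, 0.3), (20, 0.5), (100, 0.48)]:
    # The two printed numbers should agree up to rounding error.
    print(n, p, binomial_mean_by_sum(n, p), n * p)
```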

Properties of Expected Value
Linearity: For X ~ f(x) and Y ~ g(y)
(here the joint probabilities are taken as f(x) g(y), i.e. X and Y independent; the result holds in general)
E(aX + bY) = Σx Σy (ax + by) f(x) g(y)   (weighted average, where weights are probabilities)
 = Σx Σy (ax f(x) g(y) + by f(x) g(y))   (Distributive Rule)
 = Σx Σy ax f(x) g(y) + Σx Σy by f(x) g(y)   (Associative Property of Addition)
 = (Σx ax f(x)) (Σy g(y)) + (Σx f(x)) (Σy by g(y))   (Distributive Rule)
 = Σx ax f(x) + Σy by g(y)   (since Σx f(x) = Σy g(y) = 1)
 = a Σx x f(x) + b Σy y g(y)   (Distributive Rule)
 = a E(X) + b E(Y)
i.e. E(linear combo) = linear combo (E)
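
Linearity can be checked numerically by computing the double sum directly and comparing it with a E(X) + b E(Y). The sketch below is illustrative: the function names and the constants a and b are assumptions, and the two distributions reuse the die bet and the $1 even-money bet from earlier in the lecture.

```python
def expected_value(dist):
    """Probability-weighted average of the values in `dist`."""
    return sum(x * p for x, p in dist.items())


def expected_linear_combo(a, b, dist_x, dist_y):
    """E(aX + bY) from the double sum, taking X and Y independent."""
    return sum(
        (a * x + b * y) * px * py
        for x, px in dist_x.items()
        for y, py in dist_y.items()
    )


f = {9: 2 / 6, -4: 3 / 6, 0: 1 / 6}   # die bet: net winnings
g = {0: 0.52, 2: 0.48}                # $1 even-money bet: amount returned
a, b = 3, -2                          # arbitrary illustrative constants

lhs = expected_linear_combo(a, b, f, g)
rhs = a * expected_value(f) + b * expected_value(g)
print(lhs, rhs)   # the two sides agree up to rounding error
```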

HW: (mean part only) (mean part only)

HW: C18 An insurance company sells 1378 policies to cover bicycles against theft for 1 year. It costs $300 to replace a stolen bicycle and the probability of theft is estimated at Suppose there is no chance of more than one theft per individual.

HW: C18 (cont.) Calculate the expected payout for each policy, to give a break even price for each policy. ($24) If 2 times the break even price is actually charged, what is the company’s expected profit per policy, if the theft rate is actually 0.10? ($18)
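
The structure of this calculation can be sketched as follows. The theft probability is missing from the problem statement as transcribed, so the value 0.08 below is an assumption, inferred from the stated break-even answer of $24 on a $300 replacement cost; the names are likewise illustrative.

```python
REPLACEMENT_COST = 300       # dollars per stolen bicycle (from the problem)
ASSUMED_THEFT_PROB = 0.08    # assumption: inferred from the stated $24 answer


def expected_payout(theft_prob, cost=REPLACEMENT_COST):
    """Expected payout per policy: pay `cost` on a theft, nothing otherwise."""
    return theft_prob * cost + (1 - theft_prob) * 0


break_even_price = expected_payout(ASSUMED_THEFT_PROB)
price_charged = 2 * break_even_price
# Expected profit per policy if the true theft rate is 0.10.
expected_profit = price_charged - expected_payout(0.10)

print(round(break_even_price, 2), round(expected_profit, 2))
```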

Research Corner
Recall Hidalgo Stamp Data & movie over binwidth
Main point: Binwidth drives histogram performance
Less known fact: Bin location also has serious effect (even for fixed width)
How many bumps? ~2?
Explanation? Compare with “smoothed version” called “Kernel Density Estimate”
Peaks appear: when entirely in a bin
Peaks disappear: when split between two bins
Question: If understand problem with histogram, using Kernel Density Estimate, then why not use KDE for data analysis?
Will explore KDE later.
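
The bin-location effect is easy to reproduce on a toy data set. In the sketch below the data are made up for illustration (they are not the Hidalgo stamp data), and `bin_counts` is a hypothetical helper: with bin width fixed at 1, shifting the bin origin decides whether a cluster near 1.0 falls entirely in one bin or is split between two.

```python
import math


def bin_counts(data, width, origin):
    """Count data points per bin [origin + k*width, origin + (k+1)*width)."""
    counts = {}
    for x in data:
        k = math.floor((x - origin) / width)
        counts[k] = counts.get(k, 0) + 1
    return counts


# Made-up data: a tight cluster around 1.0 plus two stray points.
data = [0.8, 0.9, 0.95, 1.05, 1.1, 1.2, 2.3, 3.6]

for origin in (0.0, 0.5):
    counts = bin_counts(data, width=1.0, origin=origin)
    for k in sorted(counts):
        left = origin + k * 1.0
        print(f"origin {origin}: [{left:.1f}, {left + 1:.1f}) {'*' * counts[k]}")
```

With origin 0.0 the cluster straddles the bin edge at 1.0 and shows up as two modest bars; with origin 0.5 the same points land in one bin and form a clear peak.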

Notions of Center
Caution about mean:
Works well for ~symmetric distributions
E.g. Buffalo Snowfalls
Analyzed in:
Mean = 80.3 (from Excel)
Visually sensible notion of “center”
But poorly for asymmetric distributions
E.g. British Suicides Data
Time (in days) to suicide attempt of suicide patients after initial treatment
Analyzed in:
Clearly not mound shaped
Very asymmetric
Called “right skewed”
Mean = 122.3
Sensible as “center”??
%(data ≥ mean) = 30.2%
Too small…

Notions of Center
Perhaps better notion of “center”:
Take center to be point in middle
I.e. have 50% of data smaller
And 50% of data larger
This is called the “median”

Notions of Center
Median: = Value in middle (of sorted list)
Unsorted E.g: take the one in middle??? NO, must sort
Sorted E.g: the one in middle is a sensible version of “middle”
What about ties?
Sorted E.g: even number of values, so 1 and 2 tie for point in middle
Break by taking average (of two tied values): e.g. Median = (1 + 2) / 2 = 1.5
EXCEL: use function “MEDIAN”
Very similar to other functions
E.g. see:
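
For readers working outside Excel, the same rule (sort first, then take the middle value, averaging the two middle values when there is a tie) can be sketched in Python. The function below is an illustrative hand-written version with made-up example lists; Python's standard library also provides `statistics.median`, which follows the same rule.

```python
import statistics


def median(values):
    """Middle value of the sorted list; average of the two middle values on a tie."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


unsorted_list = [3, 1, 4, 2, 0]       # illustrative data
print(median(unsorted_list))          # middle of the sorted list, not of the list as given
print(median([0, 1, 2, 3]))           # tie: average of the two middle values
print(statistics.median([0, 1, 2, 3]))
```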

Notions of Center
E.g. Buffalo Snowfalls
Mean = 80.3
Median = 79.6 (from Excel)
Very similar (expected from symmetry)
E.g. British Suicides Data
Mean = 122.3
Median = 77.5
Substantially different
But which is better?
Goal 1: ½ - ½ middle
Goal 2: long run average
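
The gap between mean and median under right skew can be reproduced on a small made-up list. The numbers below are purely illustrative (they are not the British Suicides data), and the variable names are assumptions.

```python
import statistics

# Made-up right-skewed data: most values small, a few very large.
skewed = [5, 10, 20, 30, 40, 60, 80, 150, 300, 700]

mean_value = statistics.mean(skewed)
median_value = statistics.median(skewed)
# Fraction of the data at or above the mean.
share_at_or_above_mean = sum(x >= mean_value for x in skewed) / len(skewed)

print(mean_value, median_value, share_at_or_above_mean)
```

The few large values drag the mean well above the median, so far fewer than half of the observations sit at or above the mean, the same pattern as in the skewed example above.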
