CS 189 Brian Chu Office Hours: Cory 246, 6-7p Mon. (hackerspace lounge) brianchu.com
Agenda me for slides Questions? Random / HW Why logistic regression Worksheet
Questions Any grad students? – Thoughts on final project? Who would be able to make my 12-1pm section? – Lecture / worksheet split section Questions? Concerns? Lecture pace / content / coverage?
Features scikit-image hog (skimage.feature.hog), sklearn tfidf, bag of words, etc.
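As a rough illustration of the bag-of-words feature mentioned above, here is a minimal standard-library sketch. The helper name `bag_of_words` and the two toy documents are made up for this example; in practice a library vectorizer such as sklearn's `CountVectorizer` does the same job with more careful tokenization.

```python
from collections import Counter


def bag_of_words(docs):
    """Turn a list of strings into (vocabulary, count vectors).

    Hypothetical helper for illustration: lowercases, splits on
    whitespace, and counts how often each vocabulary word appears.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # Sorted vocabulary so each word has a stable column index.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


# Toy documents (assumed, not from the slides).
docs = ["the cat sat", "the cat saw the dog"]
vocab, vectors = bag_of_words(docs)
print(vocab)
print(vectors)
```

Each document becomes a fixed-length vector of word counts, which is the form a classifier expects as input.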
Terminology
– Shrinkage (regularization)
– Variable with a hat (ŷ): estimated/predicted
– P(Y | X) ∝ P(X | Y) P(Y)
– posterior ∝ likelihood * prior
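To make "posterior ∝ likelihood * prior" concrete, here is a small sketch that multiplies the two per class and then normalizes so the posteriors sum to 1. The class labels and the numbers are invented for illustration only.

```python
def posterior(likelihoods, priors):
    """Return P(Y | X) for each class Y, given P(X | Y) and P(Y).

    The unnormalized posterior is likelihood * prior; dividing by the
    sum over classes turns the proportionality into an equality.
    """
    unnormalized = {y: likelihoods[y] * priors[y] for y in priors}
    total = sum(unnormalized.values())
    return {y: value / total for y, value in unnormalized.items()}


# Hypothetical two-class example (numbers are assumptions).
priors = {"spam": 0.2, "ham": 0.8}          # P(Y)
likelihoods = {"spam": 0.6, "ham": 0.1}     # P(X | Y) for one observed X
post = posterior(likelihoods, priors)
print(post)
```

Even though "ham" has the larger prior here, the larger likelihood under "spam" is enough to make "spam" the more probable class after normalization.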
Why logistic regression
Odds: a measure of relative confidence
P = .9998; 4999:1
P = .9999; 9999:1
– Doubled confidence!
P = .5001 → .5002; about 1.0004:1 → 1.0008:1
– (basically no change in confidence)
“relative increase or decrease of a factor by one unit becomes more pronounced as the factors absolute difference increases.”
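The odds figures on this slide can be checked with a few lines of Python. This is only a sketch; the function name `odds` is chosen here for illustration.

```python
def odds(p):
    """Odds in favor of an event with probability p, i.e. p : (1 - p)."""
    return p / (1.0 - p)


# The same +0.0001 step in probability, near 1 and near 0.5.
high_before, high_after = odds(0.9998), odds(0.9999)
mid_before, mid_after = odds(0.5001), odds(0.5002)

print(high_before, high_after, high_after / high_before)
print(mid_before, mid_after, mid_after / mid_before)
```

Near P = 1 the step roughly doubles the odds, while near P = 0.5 the odds barely move, which is the asymmetry the slide points out.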
Log-odds (calculations in base 10)
Maps (0, 1) → (-∞, ∞)
Symmetric: .99 ≈ 2, .01 ≈ -2
X units of log-odds → same Y % change in confidence
– 0.5 → 0.91 ≈ 0 → 1
– .999 → .9999 ≈ 3 → 4
“Log-odds make it clear that increasing from 99.9% to 99.99% is just as hard as increasing from 50% to 91%”
Credit:
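The base-10 log-odds values quoted on this slide can be reproduced with the short sketch below. The function name `log_odds10` is an illustrative choice, not something from the slides.

```python
import math


def log_odds10(p):
    """Base-10 log-odds of probability p; maps (0, 1) onto (-inf, inf)."""
    return math.log10(p / (1.0 - p))


# Symmetry around p = 0.5.
print(log_odds10(0.99), log_odds10(0.01))

# Two one-unit steps in log-odds: 0.5 -> 0.91 and 0.999 -> 0.9999.
step_low = log_odds10(0.91) - log_odds10(0.5)
step_high = log_odds10(0.9999) - log_odds10(0.999)
print(step_low, step_high)
```

Both steps come out close to one unit of log-odds, even though the first is a 41-point change in probability and the second is less than a tenth of a point.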
Logistic Regression
w · x = lg [ P(Y=1|x) / (1 – P(Y=1|x)) ]
Intuition: some linear combination of the features tells us the log-odds that Y = 1
Intuition: some linear combination of the features tells us the “confidence” that Y = 1
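Solving the log-odds equation for P(Y=1|x) gives the logistic (sigmoid) function of the linear score. The sketch below assumes the natural log, as is conventional for logistic regression (a different base only rescales w). The weights and features are made-up values, and `predict_proba` and `log_odds` are names chosen for this example.

```python
import math


def predict_proba(w, x):
    """P(Y=1 | x) under a logistic model with weight vector w."""
    z = sum(wi * xi for wi, xi in zip(w, x))  # linear score = log-odds
    # Two algebraically equal forms, chosen to avoid overflow in exp.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_odds(p):
    """Natural-log odds of probability p."""
    return math.log(p / (1.0 - p))


# Hypothetical weights and features; the score is 1 - 1 + 2 = 2.
w = [1.0, -2.0, 0.5]
x = [1.0, 0.5, 4.0]
p = predict_proba(w, x)
print(p, log_odds(p))
```

Taking the log-odds of the predicted probability recovers the linear score, which is the relationship stated on the slide.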