CSC444F'05 Lecture 5: The Stochastic Capacity Constraint

Presentation transcript:

Slide 1: The Stochastic Capacity Constraint

Slide 2: MIDTERM NEW DATE AND TIME AND PLACE
Tuesday, November 1, 8pm to 9pm
Woodsworth College, WW111

Slide 3: Estimates
Estimates are never 100% certain.
E.g., if we estimate a feature at 20 ECDs:
– We are not saying it will be done in 20 ECDs.
– But then what are we saying? Are we confident in it? Is it optimistic? Is it pessimistic?
A quantity whose value depends upon unknowns (or upon random chance) is called a stochastic variable.
Release planning contains many such stochastic variables.

Slide 4: Confidence Intervals
Say we toss a fair coin 5000 times.
– We expect it to come up heads ½ the time: 2500 times or so.
– Exactly 2500? Chance is only 1.1%.
– ≤ 2500? Chance is 50%. If we repeat this experiment over and over again (tossing a coin 5000 times), on average ½ the time it will be more, ½ the time less.
– ≤ 2530? Chance is 80%.
– ≤ 2550? Chance is 92%.
These percentages (50%, 80%, 92%) are confidence levels; each one pairs with a one-sided confidence interval.
– With 80% confidence we can say that the number of heads will be at most 2530.
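The coin-toss percentages on this slide can be checked with exact binomial arithmetic. The sketch below is one way to do so; the helper names (binomial_counts, heads_pmf, heads_cdf) are illustrative and not part of the lecture.

```python
# Exact binomial probabilities for 5000 tosses of a fair coin.
# Helper names are hypothetical; they do not come from the lecture.

N_TOSSES = 5000


def binomial_counts(n):
    """Return [C(n, 0), C(n, 1), ..., C(n, n)] using exact integers."""
    counts = [1]
    for k in range(n):
        # C(n, k+1) = C(n, k) * (n - k) / (k + 1); the division is exact.
        counts.append(counts[-1] * (n - k) // (k + 1))
    return counts


COUNTS = binomial_counts(N_TOSSES)
TOTAL = 2 ** N_TOSSES  # number of equally likely toss sequences


def heads_pmf(k):
    """Probability of exactly k heads."""
    return COUNTS[k] / TOTAL


def heads_cdf(k):
    """Probability of at most k heads."""
    return sum(COUNTS[: k + 1]) / TOTAL


if __name__ == "__main__":
    print(f"P(heads = 2500)  = {heads_pmf(2500):.1%}")
    for k in (2500, 2530, 2550):
        print(f"P(heads <= {k}) = {heads_cdf(k):.1%}")
```

Python integers are arbitrary precision, so the counts are exact and only the final division rounds to a float.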

Slide 5: Stochastic Variables
Consider the work factor of a coder, w.
– When estimating in advance, w is a stochastic variable.
– Stochastic variables are described by statistical distributions.
– A statistical distribution will tell you, for any range of w, the probability of w being within that range.
– A distribution can be described completely with a probability density function (p.d.f.).
   X-axis: all possible values of the stochastic variable.
   Y-axis: numbers >= 0.
   The probability that the stochastic variable lies between two values a and b is given by the area under the p.d.f. between a and b.

Slide 6: PDF for w
Probability that 0.5 < w < 0.7 = 66%.
Looks to be fairly accurate.
– Has a finite probability of being 0.
– Has little chance of being much greater than 1.2 or so.
Drawing such a curve is the only real way of describing a stochastic variable mathematically.

Slide 7: Parameterized Distributions
“So, Bill, here’s a piece of paper, could you please draw me a p.d.f. for your work factor?”
– Nobody knows the distribution to this level of accuracy.
– Very hard to work with mathematically.
The usual method is to make an assumption about the overall shape of the curve, choosing from a few set shapes that are easy to work with mathematically.
Then ask Bill for a few parameters that we can use to fit the curve.
Because we are not so sure of our estimates anyway, the inaccuracy introduced by choosing from a set of mathematically tractable p.d.f.s is small compared to the other estimation errors.

Slide 8: e.g., a Normal for w
Assume work factors are adequately described by a bell-shaped Normal distribution.
2 points are required to fit a Normal, e.g., the average case and some reasonable “worst case”.
– Average case: half the time less, half the time more = 0.6
– “Worst” case: 95% of the time w won’t be that bad (small) = 0.4
The Normal curve that fits is N(0.6, 0.12): mean 0.6, standard deviation 0.12.
Figure annotation: area = 68% (the area within one standard deviation of the mean).
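The two-point fit can be sketched with Python's standard-library NormalDist. The function name fit_normal and its parameters are hypothetical; the only inputs taken from the slide are the mean of 0.6 and the 95% "worst case" of 0.4.

```python
from statistics import NormalDist


def fit_normal(mean, worst_case, worst_case_tail=0.05):
    """Fit a Normal from its mean and a low-side "worst case" value.

    worst_case_tail is the probability of doing even worse (falling below
    worst_case); 0.05 matches "95% of the time w won't be that bad".
    """
    z = NormalDist().inv_cdf(worst_case_tail)  # about -1.645 for a 5% tail
    sigma = (worst_case - mean) / z
    return NormalDist(mean, sigma)


w = fit_normal(mean=0.6, worst_case=0.4)

if __name__ == "__main__":
    print(f"standard deviation = {w.stdev:.3f}")
    print(f"5% best case       = {w.inv_cdf(0.95):.2f}")
    one_sd = w.cdf(w.mean + w.stdev) - w.cdf(w.mean - w.stdev)
    print(f"P(within one s.d.) = {one_sd:.0%}")
```

Because a Normal is symmetric, the fitted curve's 5% best case sits as far above the mean as the worst case sits below it.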

Slide 9: Maybe not Normal
Normals are easiest to work with mathematically, but may not be the best thing to use for w.
– A Normal is symmetric about the mean.
   E.g., N(0.6, 0.12) predicts a 5% “best case” of 0.8.
   What if Bill tells us the 5% best case is really 1.0?
   Then we can’t use a Normal; we would need a skewed (tilted) distribution with asymmetric 5% and 95% cases.
– A Normal extends to infinity in both directions.
   Finite probability of w < 0 or w > 10.

Slide 10: Estimates
Must define our quantities very precisely.
E.g., for a feature estimate of 1 week:
– Post-facto: What are the units? 40 hours? Longer? Shorter? Dedicated? Disrupted? One person or two? ... Dealt with this last lecture in great detail.
– Stochastic: 1 week best case? 1 week worst case? 1 week average case? Need a p.d.f.
Depending upon these concerns, my “1 week” may be somebody else’s 4 weeks.
– Very significant issue in practice.

Slide 11: The Stochastic Capacity Constraint
T is fixed.
F and N are both stochastic quantities.
Can only speak about the chance of the goo fitting into the rectangle.
Say F=400, N=10, T=40: are we good to go?
– Cannot say.
– Need precise distributions for F and N to answer, and then only at some confidence level.

Slide 12: Summing Distributions
F and N are sums and products over many contributing stochastic variables.
E.g.,
– F = f1 + f2
– If f1 and f2 have associated statistical distributions, what is the statistical distribution of F?
– In general, no answer.
– Special case: f1 and f2 are both Normal (and independent).
   Then F will be Normal as well.
   The mean of F will be the sum of the means of f1 and f2.
   The standard deviation of F will be the square root of the sum of the squares of the standard deviations of f1 and f2.
– How about f1 * f2? Forget about it! Huge formula, and the result is not a Normal distribution.
– One needs statistical simulation software tools to do arithmetic on stochastic variables.
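A small sketch of both halves of this slide: the closed-form rule for the sum of two independent Normals, and a Monte Carlo simulation of the kind a statistical tool would run for the sum and the product. The function names and the example estimates (f1 with mean 20 and standard deviation 5, f2 with mean 10 and standard deviation 3) are made up for illustration.

```python
import math
import random
from statistics import fmean, pstdev


def sum_of_normals(mean1, sd1, mean2, sd2):
    """Mean and standard deviation of f1 + f2 for independent Normals."""
    return mean1 + mean2, math.sqrt(sd1 ** 2 + sd2 ** 2)


def simulate_sum_and_product(mean1, sd1, mean2, sd2, trials=100_000, seed=1):
    """Monte Carlo (mean, sd) of f1 + f2 and of f1 * f2."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sums, products = [], []
    for _ in range(trials):
        f1 = rng.gauss(mean1, sd1)
        f2 = rng.gauss(mean2, sd2)
        sums.append(f1 + f2)
        products.append(f1 * f2)
    return (fmean(sums), pstdev(sums)), (fmean(products), pstdev(products))


if __name__ == "__main__":
    # Hypothetical feature estimates, not taken from the lecture.
    print("closed form, sum:", sum_of_normals(20, 5, 10, 3))
    sim_sum, sim_product = simulate_sum_and_product(20, 5, 10, 3)
    print("simulated, sum:    ", sim_sum)
    print("simulated, product:", sim_product)
```

The simulation needs no formula for the product at all, which is why simulation tools are the practical route for arithmetic on stochastic variables.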

Slide 13: Law of Large Numbers
If we sum lots and lots of independent stochastic variables, the sum will approach a Normal distribution. (Strictly, this result is the Central Limit Theorem.)
Therefore something like F is going to be pretty close to Normal.
– E.g., 400 features summed.
N will also be, but a bit less so.
– E.g., 10 w’s summed.
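The effect can be seen in a quick simulation. The sketch below assumes, purely for illustration, that each summed quantity is an independent exponential variable with mean 1 (a heavily skewed shape). It measures how often the sum lands within one standard deviation of its mean; a Normal would give about 68%.

```python
import random


def fraction_within_one_sd(n_terms, trials=2000, seed=7):
    """Fraction of simulated sums that fall within one s.d. of the mean.

    Each term is Exp(1), so the sum of n_terms terms has mean n_terms and
    standard deviation sqrt(n_terms).
    """
    rng = random.Random(seed)
    mean = float(n_terms)
    sd = n_terms ** 0.5
    hits = 0
    for _ in range(trials):
        total = sum(rng.expovariate(1.0) for _ in range(n_terms))
        if abs(total - mean) <= sd:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (1, 10, 400):
        print(f"{n:>3} terms summed: {fraction_within_one_sd(n):.1%} within one s.d.")
```

A single exponential term is far from the Normal's 68%, and the sum of 400 terms should come out very close to it, in line with the slide's remark that F (400 features) is closer to Normal than N (10 w's).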

Slide 14: Delta Statistic
D(T) = N × T − F
If we have Normal approximations for N and F, we can compute the Normal curve for D as a function of various T’s.
We can then choose a T that leads to a D we can live with.
Interested in Probability [ D(T) ≥ 0 ]: the probability that all features will be finished by dcut.
In choosing T, we will want to choose a confidence level the company can live with, e.g., 80%.
Then we will pick a T such that D(T) ≥ 0 80% of the time.
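Written out, the Normal curve for D(T) follows from the summing rule for Normals. The derivation below is a sketch under the added assumption that N and F are independent. The symbols μ_N, σ_N, μ_F, σ_F (means and standard deviations of N and F) and Φ (the standard Normal cumulative distribution function) are introduced here for illustration, and the second argument of the Normal is the standard deviation, as in N(0.6, 0.12).

```latex
\begin{aligned}
N &\sim \mathcal{N}(\mu_N,\ \sigma_N), \qquad F \sim \mathcal{N}(\mu_F,\ \sigma_F)
  \quad \text{(assumed independent)} \\
D(T) &= N \times T - F \ \sim\ \mathcal{N}\!\left(\mu_N T - \mu_F,\ \sqrt{\sigma_N^2 T^2 + \sigma_F^2}\right) \\
\Pr[\,D(T) \ge 0\,] &= \Phi\!\left(\frac{\mu_N T - \mu_F}{\sqrt{\sigma_N^2 T^2 + \sigma_F^2}}\right)
\end{aligned}
```

Since T is a fixed number, N × T is just a rescaled Normal, so no product of stochastic variables is involved.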

Slide 15: Example Picking T
F is Normal with mean 400 and 90% worst case 500.
N is Normal with mean 10 and 90% worst case 8.
Cells are D(T) = N × T − F at the indicated confidence level. Note the transitions through 0.
[Table: one row per value of T; columns are confidence levels 25%, 40%, 50%, 60%, 80%, 90%, 95%. The cell values did not survive in this transcript.]

Slide 16: Choices for T
To be 95% certain of hitting the dates, choose T = 60 workdays.
Or, put another way: if we plan to take 40 workdays, only 5% of the time will we be late by more than 20 workdays.
To be 80% sure, T = 49.
To gamble, for a 25% fighting chance, make T = 33.
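These choices can be approximately reproduced from the example's inputs (F with mean 400 and 90% worst case 500, N with mean 10 and 90% worst case 8). The sketch below assumes F and N are independent Normals and searches for the T that gives a requested confidence. The names prob_on_time and t_for_confidence are hypothetical, and the lecture's own table may have been computed differently.

```python
import math
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.90)  # z-score of a 90% worst case

# Fit each Normal from its mean and its 90% worst case.
MU_F, SD_F = 400.0, (500.0 - 400.0) / Z90  # worst case for F is MORE work
MU_N, SD_N = 10.0, (10.0 - 8.0) / Z90      # worst case for N is LESS capacity


def prob_on_time(t):
    """P[D(T) >= 0] for D(T) = N*T - F, assuming independent Normal N and F."""
    mean = MU_N * t - MU_F
    sd = math.sqrt((SD_N * t) ** 2 + SD_F ** 2)
    return NormalDist().cdf(mean / sd)


def t_for_confidence(confidence, lo=0.0, hi=1000.0):
    """Bisect for the T where prob_on_time(T) reaches the given confidence."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if prob_on_time(mid) < confidence:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    for conf in (0.25, 0.50, 0.80, 0.95):
        print(f"{conf:.0%} confidence: T = {t_for_confidence(conf):.1f} workdays")
```

Under these assumptions the search gives T just above 60 for 95% and just above 49 for 80%, and a value between 33 and 34 for 25%, which agrees with the slide's figures to within a workday. Bisection is valid here because prob_on_time rises steadily as T grows.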

Slide 17: Shortcut
Ask for 80% worst case estimates for everything.
If F = N × T using the 80% worst case values, then there is an 80% chance of making the release.
The Deterministic Release Plan is based on this approach.
If you also ask for mean cases for everything, you can then fit a Normal distribution for D(T) and predict the approximate probability of slipping.
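As a rough check of the shortcut, the sketch below reuses the F and N of the "Example Picking T" slide, derives their 80% worst case values, solves F = N × T for T, and then evaluates the chance of making the release at that T. It assumes independent Normal F and N; all variable names are illustrative.

```python
import math
from statistics import NormalDist

Z90 = NormalDist().inv_cdf(0.90)
Z80 = NormalDist().inv_cdf(0.80)

# Normals fitted from the example's means and 90% worst cases.
MU_F, SD_F = 400.0, (500.0 - 400.0) / Z90
MU_N, SD_N = 10.0, (10.0 - 8.0) / Z90

# 80% worst case values: more work than expected, less capacity than expected.
f80 = MU_F + Z80 * SD_F
n80 = MU_N - Z80 * SD_N

# Shortcut: pick T so that F = N * T holds for the 80% worst case values.
t_shortcut = f80 / n80


def prob_on_time(t):
    """P[N*T - F >= 0], assuming independent Normal N and F."""
    mean = MU_N * t - MU_F
    sd = math.sqrt((SD_N * t) ** 2 + SD_F ** 2)
    return NormalDist().cdf(mean / sd)


p_shortcut = prob_on_time(t_shortcut)

if __name__ == "__main__":
    print(f"shortcut T         = {t_shortcut:.1f} workdays")
    print(f"chance of on time  = {p_shortcut:.0%}")
```

Under these assumptions the shortcut T comes out a little under 54 workdays and the on-time probability somewhat above 80%, so the shortcut errs on the safe side. The 80% figure is best read as a floor when the worst cases are unlikely to all occur together.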

Slide 18: Initial Planning
Start with a T.
Choose a feature set.
See if the plan works out.
If not, adjust T and/or the feature set and continue.

Slide 19: Adjusting the Release Plan
Count on the w estimates being too high and the feature estimates being too low.
Re-adjust as new data comes in.
Can “pad the plan” by choosing a 95% T.
– Will make it with a high degree of confidence.
– May run out of work.
– May gold plate features.
Better to have an A-list and a B-list.
– Choose one T such that, e.g., we have 95% confidence of making the A list and 40% confidence of making the A+B list.

Slide 20: Appreciating Uncertainty
Successful gamblers and traders:
– Really understand probabilities.
– Both will tell you the trick is to know when to take your losses.
In release planning, the equivalent is knowing when to go to the boss and say:
– We need to move out the date.
– Or we need to drop features from the plan.

Slide 21: Risk Tolerance
Say a plan is at 60%.
A developer may say:
– Chances are poor: 60% at best.
An entrepreneurial CEO will say:
– Looking great! At least a 60% chance of making it.
Should have an explicit discussion of risk tolerance.

Slide 22: Loading the Dice
Can manage to affect the outcome.
Like a football game:
– Odds may be 3-to-1 against a team winning.
– But by making a special effort, the team may still win.
In release planning:
– Base the odds on history.
– But as a manager, don’t ever accept that history is as good as you can do!
E.g., introduce a new practice that will boost productivity.
– The estimate is that it will increase productivity by 20%.
– Don’t plan for that!
– Plan for what was achieved historically.
– Manage to get that 20% and change history for next time around.

Slide 23: Example Stochastic Release Plan
Sample Stochastic Release Plan