Learning Simio Chapter 10 Analyzing Input Data


Outline
Working with various types of data.
Fitting distributions to data.
Summary of common distributions.
Modeling customer arrivals.
Modeling task times.
Sensitivity of results to data.
In this chapter we will discuss input data and its role in your model. We will discuss some common distributions and their appropriate use.

Model Input Data
A model has both structure and input data.
Both the model structure and the input data have a significant impact on the results.
The data can be a problematic aspect of a modeling project.

Typical Data Cases
No data exists.
Data exists in the wrong form.
Lots of good data exists.

No data exists
Consider using the Triangular or Pert distributions (minimum, mode, maximum) for activity times.
Hypothesize distributions based on the underlying processes, and make educated guesses for the parameters.
Run experiments to test the sensitivity of the results to the parameters.
Don’t use a mean in place of a distribution: a constant removes the variability that drives queueing and delays.
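As a rough illustration of the minimum/mode/maximum approach above, the sketch below samples both a Triangular and a Pert distribution from the same three estimates using only Python's standard library. The three estimates (2, 5, and 14 minutes) are hypothetical, and the Pert is built as a rescaled Beta with the conventional weight of 4 on the mode, which is an assumption here rather than something the slides specify.

```python
import random
import statistics


def sample_triangular(rng, minimum, mode, maximum):
    """Draw one value from a Triangular(minimum, mode, maximum) distribution."""
    # random.triangular takes its arguments in the order (low, high, mode).
    return rng.triangular(minimum, maximum, mode)


def sample_pert(rng, minimum, mode, maximum, weight=4.0):
    """Draw one value from a Pert distribution, built as a rescaled Beta.

    weight=4 is the conventional choice (an assumption in this sketch).
    """
    span = maximum - minimum
    alpha = 1.0 + weight * (mode - minimum) / span
    beta = 1.0 + weight * (maximum - mode) / span
    return minimum + span * rng.betavariate(alpha, beta)


rng = random.Random(42)
lo, likely, hi = 2.0, 5.0, 14.0  # hypothetical expert estimates, in minutes

tri = [sample_triangular(rng, lo, likely, hi) for _ in range(50_000)]
pert = [sample_pert(rng, lo, likely, hi) for _ in range(50_000)]

tri_mean, tri_sd = statistics.fmean(tri), statistics.stdev(tri)
pert_mean, pert_sd = statistics.fmean(pert), statistics.stdev(pert)
print(f"triangular: mean={tri_mean:.2f} stdev={tri_sd:.2f}")
print(f"pert:       mean={pert_mean:.2f} stdev={pert_sd:.2f}")
```

With a right-skewed set of estimates like this one, the Pert puts less weight on the long tail than the Triangular, so its mean and spread should come out somewhat smaller. Either choice keeps the variability that a single mean value would discard.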

Data exists in the wrong form
Data observed from a different real-world process.
Time between failures when failures are count based.
Time to repair when repairs are resource constrained.
Data recorded during a “slow time” or a “busy time.”
Values from multiple processes with no discriminatory information (e.g., repair times without noting the type of stoppage).
Use the data that does exist to make intelligent guesses for the required data.

Lots of data exists
If a large amount of data is available, an empirical distribution may be used; however, a theoretical distribution is preferred (compact, fast, easy to change).
If possible, hypothesize a distribution based on the underlying process (combine data and theory).
Use goodness-of-fit software to test the hypothesis and estimate the parameters.

Data Fitting Procedure
Assess the IID assumptions: independent observations, identically distributed.
Use software to view the data using a histogram.
Hypothesize a distribution family/form.
Use software to estimate the distribution parameters and assess the quality of fit.
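The steps of the procedure above can be sketched in a few lines of standard-library Python. This is only a minimal illustration, not what fitting packages do internally: the "observed" data is synthetic Gamma data standing in for real measurements, independence is checked with a simple lag-1 autocorrelation, and the parameters are estimated by the method of moments (real packages more often use maximum likelihood). All function names are hypothetical.

```python
import random
import statistics


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation; values near 0 are consistent with independence."""
    mean = statistics.fmean(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


def text_histogram(xs, bins=10, width=40):
    """Print a crude text histogram and return the bin counts."""
    lo, hi = min(xs), max(xs)
    step = (hi - lo) / bins
    counts = [0] * bins
    for x in xs:
        idx = min(int((x - lo) / step), bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    peak = max(counts)
    for i, c in enumerate(counts):
        print(f"{lo + i * step:7.2f} | {'#' * round(width * c / peak)}")
    return counts


def fit_gamma_moments(xs):
    """Method-of-moments estimates (shape, scale) for a Gamma distribution."""
    mean = statistics.fmean(xs)
    var = statistics.variance(xs)
    return mean * mean / var, var / mean


rng = random.Random(7)
# Stand-in for observed task times: synthetic Gamma(shape=2, scale=1.5) data.
data = [rng.gammavariate(2.0, 1.5) for _ in range(5_000)]

r1 = lag1_autocorrelation(data)           # step 1: check independence
counts = text_histogram(data)             # step 2: look at the shape
shape_hat, scale_hat = fit_gamma_moments(data)  # steps 3-4: hypothesize Gamma, estimate
print(f"lag-1 autocorrelation: {r1:.3f}")
print(f"gamma fit: shape={shape_hat:.2f}, scale={scale_hat:.2f}")
```

Assessing the quality of fit is the remaining step; in practice that is where a goodness-of-fit test or a dedicated fitting package comes in.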

Sample Data Sets
Suppose that you have a set of observed values of the phenomenon for which you’re developing an input model. One of the first steps is to plot a frequency histogram to get an idea of the probability mass/density function and the general shape of the distribution.

Common Distributions
Binomial: Models the number of successes in n trials, when the trials are independent with common success probability p; for example, the number of defective computer chips found in a lot of n chips.
Negative Binomial: Models the number of trials required to achieve k successes; for example, the number of computer chips that we must inspect to find 4 defective chips.
Poisson: Models the number of independent events that occur in a fixed amount of time or space; for example, the number of customers that arrive at a store during 1 hour, or the number of defects found in 30 square meters of sheet metal.
Normal: Models the distribution of a process that can be thought of as the sum of a number of component processes; for example, the time to assemble a product, which is the sum of the times required for each assembly operation.
Lognormal: Models the distribution of a process that can be thought of as the product of a number of component processes; for example, the rate of return on an investment, when interest is compounded, is the product of the returns for a number of periods.
Banks et al., pp. 314-316

Common Distributions
Exponential: Models the time between independent events, or a process time that is memoryless; for example, the times between arrivals from a large population of potential customers who act independently of each other. The exponential is a highly variable distribution; it is sometimes overused because it often leads to mathematically tractable models. Recall that, if the time between events is exponentially distributed, then the number of events in a fixed period of time is Poisson.
Gamma: An extremely flexible distribution used to model nonnegative random variables (can be shifted away from 0 by adding a constant).
Beta: An extremely flexible distribution used to model bounded random variables. The beta can be shifted away from 0 by adding a constant and can be given a range larger than [0, 1] by multiplying by a constant.
Erlang: Models processes that can be viewed as the sum of several exponentially distributed processes; for example, a computer network fails when a computer and two backup computers fail, and each has a time to failure (TTF) that is exponentially distributed.
Banks et al., pp. 314-316
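The link between exponential interarrival times and Poisson counts noted above can be checked numerically. The sketch below is a minimal simulation under an assumed rate of 6 events per period (a hypothetical value): it generates exponential gaps, counts events per unit period, and compares the mean and variance of the counts, which are both equal to the rate for a Poisson distribution.

```python
import random
import statistics


def counts_per_period(rng, rate, periods):
    """Generate exponential interarrival times and count events in each unit period."""
    counts = [0] * periods
    t = rng.expovariate(rate)  # time of the first event
    while t < periods:
        counts[int(t)] += 1
        t += rng.expovariate(rate)
    return counts


rng = random.Random(2024)
rate = 6.0  # hypothetical: 6 arrivals per hour on average
counts = counts_per_period(rng, rate, periods=20_000)

mean_count = statistics.fmean(counts)
var_count = statistics.variance(counts)
print(f"mean count per hour:      {mean_count:.3f}")
print(f"variance of hourly count: {var_count:.3f}")
```

Both printed values should land close to the rate; a variance well above or below the mean in real count data would be a hint that the arrivals are not a simple Poisson process.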

Common Distributions
Weibull: Models the time to failure for components; for example, the time to failure for a disk drive. The exponential is a special case of the Weibull.
Discrete or Continuous Uniform: Models complete uncertainty: all outcomes are equally likely. This distribution is often used inappropriately, when there are no data.
Triangular: Models a process for which only the minimum, most likely, and maximum values of the distribution are known; for example, the minimum, most likely, and maximum time required to test a product. This model is a marked improvement over the uniform distribution [in many cases].
Pert: A special case of the Beta with minimum, most likely, and maximum values. The Pert provides a “smooth” alternative to the triangular in the absence of data.
Empirical: Samples from the distribution of the actual data collected; often used when no theoretical distribution seems appropriate.
Banks et al., pp. 314-316

Goodness-of-fit (GOF) Tests
Statistical hypothesis tests that are used to assess formally whether the observations X1, X2, …, Xn constitute an independent sample from a particular distribution function.
Hypothesis: H0: The Xi’s are IID random variables with the specified distribution function.

GOF Test Considerations
Failure to reject the null hypothesis should not be interpreted as “accepting H0 as being true.”
GOF tests are not very powerful for small-to-moderate sample sizes.
Also, when n is large, the tests will often reject H0 since even minute differences will be detected.
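The sample-size effect described above can be illustrated with a hand-rolled Kolmogorov-Smirnov test, one common GOF test (the slides do not single out a particular test, so this choice is an assumption). In this sketch the data comes from a Gamma distribution with shape 1.1 and mean 1, which is close to but not exactly Exponential(1), and H0 is that the data is Exponential(1). The critical value uses the large-sample approximation 1.36/sqrt(n) for a 5% level with a fully specified distribution.

```python
import math
import random


def ks_statistic(xs, cdf):
    """Kolmogorov-Smirnov statistic: largest gap between empirical and hypothesized CDF."""
    xs = sorted(xs)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_critical(n, coefficient=1.36):
    """Large-sample 5% critical value, valid when no parameters were fitted from the data."""
    return coefficient / math.sqrt(n)


def exp_cdf(x, rate=1.0):
    """CDF of the hypothesized Exponential(rate) distribution."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


rng = random.Random(11)


def near_exponential(n):
    """Data that almost matches H0: Gamma with shape 1.1, scaled to have mean 1."""
    return [rng.gammavariate(1.1, 1.0 / 1.1) for _ in range(n)]


results = {}
for n in (30, 300, 30_000):
    d = ks_statistic(near_exponential(n), exp_cdf)
    results[n] = (d, ks_critical(n))
    verdict = "reject H0" if d > ks_critical(n) else "fail to reject H0"
    print(f"n={n:6d}  D={d:.4f}  critical={ks_critical(n):.4f}  -> {verdict}")
```

With only 30 observations the test will typically fail to reject, even though H0 is false; with 30,000 it should reject, even though an Exponential(1) model might be perfectly adequate for the simulation. Neither outcome settles whether the distribution is a good enough input model.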

Some GOF Software Options
General packages: EasyFit (www.mathwave.com)
Simulation-specific packages: Stat::Fit (www.geerms.com), ExpertFit (www.averill-law.com)

Modeling Arrivals
If arrivals are independent and random, they follow a Poisson process: the number of arrivals in a fixed time is Poisson, and the time between arrivals is exponential.
In some cases the arrival rate may vary over time; Simio supports step-wise linear arrival rates using a Rate Table.
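To make the idea of a time-varying arrival rate concrete, the sketch below generates a non-stationary Poisson process from a table of rates using the thinning method. This is a generic textbook technique, not a description of how Simio's Rate Table is implemented, and it assumes the rate is constant within each equal-width interval. The rate values and function names are hypothetical.

```python
import random


def rate_at(t, rate_table, interval):
    """Arrival rate in effect at time t for a table of equal-width intervals."""
    return rate_table[min(int(t / interval), len(rate_table) - 1)]


def nspp_arrivals(rng, rate_table, interval):
    """Arrival times of a non-stationary Poisson process, generated by thinning."""
    horizon = interval * len(rate_table)
    max_rate = max(rate_table)
    arrivals = []
    t = 0.0
    while True:
        # Candidate events come from a stationary Poisson process at the peak rate.
        t += rng.expovariate(max_rate)
        if t >= horizon:
            return arrivals
        # Keep each candidate with probability rate(t) / max_rate.
        if rng.random() < rate_at(t, rate_table, interval) / max_rate:
            arrivals.append(t)


rng = random.Random(5)
rates_per_hour = [5.0, 20.0, 40.0, 10.0]  # hypothetical rates for four one-hour intervals

# Repeat many days so the per-interval averages are stable.
days = 500
totals = [0] * len(rates_per_hour)
for _ in range(days):
    for t in nspp_arrivals(rng, rates_per_hour, interval=1.0):
        totals[int(t)] += 1

averages = [total / days for total in totals]
print("average arrivals per interval:", [round(a, 1) for a in averages])
```

The average number of arrivals in each interval should track the corresponding entry of the rate table, while the arrivals within any one interval still behave like a Poisson process.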

Modeling Task Times
Use a distribution with a range >= 0 (e.g., not the Normal or JohnsonUB, which can return negative values).
In the absence of data, Triangular and Pert are possible choices.
With supporting data, the Gamma, LogNormal, Weibull, LogLogistic, Beta, PearsonVI, and JohnsonSB are possible choices.

Gamma, Log Normal, Weibull

Determining what data is critical
Some data may have a dominant impact on performance.
The variability is often more important than the mean.
Run scenarios specifically designed to determine the sensitivity of the model to the data inputs.
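A small experiment shows why variability can matter as much as the mean. The sketch below simulates a single-server, first-come-first-served queue with the Lindley recursion and runs two scenarios that share the same mean task time but differ in variability: a constant task time and an exponentially distributed one. The arrival rate, mean task time, and function names are all hypothetical choices for illustration.

```python
import random


def mean_wait(rng, n_customers, interarrival, service):
    """Average time in queue for a single-server FIFO queue (Lindley recursion)."""
    wait = 0.0
    total = 0.0
    for _ in range(n_customers):
        total += wait
        # Next wait = this wait + this service time - time until the next arrival, floored at 0.
        wait = max(0.0, wait + service(rng) - interarrival(rng))
    return total / n_customers


mean_service = 0.8   # hypothetical mean task time
arrival_rate = 1.0   # hypothetical: one arrival per time unit, so utilization is 0.8


def exp_interarrival(rng):
    return rng.expovariate(arrival_rate)


def constant_service(rng):
    return mean_service  # the mean used in place of a distribution


def exponential_service(rng):
    return rng.expovariate(1.0 / mean_service)  # same mean, high variability


n = 200_000
wait_constant = mean_wait(random.Random(1), n, exp_interarrival, constant_service)
wait_exponential = mean_wait(random.Random(1), n, exp_interarrival, exponential_service)

print(f"mean wait, constant service:    {wait_constant:.2f}")
print(f"mean wait, exponential service: {wait_exponential:.2f}")
```

For these parameters, queueing theory gives long-run average waits of 1.6 (constant service) and 3.2 (exponential service), so the simulated values should be in that neighborhood: the same mean task time produces roughly twice the delay once variability is included. Running scenarios like these against each input is one way to find out which data deserves more collection effort.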

References
Leemis, L., “Input Modeling Techniques for Discrete-Event Simulations,” Proceedings of the 2001 Winter Simulation Conference, Washington, DC, December 2001.
Vincent, S., “Input Data Analysis,” in Handbook of Simulation, edited by J. Banks, John Wiley & Sons, Inc., New York, NY, pp. 55-91, 1998.
Chapter 9, Input Modeling (Banks et al.)
Chapter 6, Selecting Input Probability Distributions (Law)
Leemis is theoretical; Vincent is practical. The Banks and Law chapters focus primarily on “fitting” distributions from “historical” data.

Summary
Distributions are the primary method for capturing variability in the system.
Never use a mean in place of a distribution for a random component.
When data exists, hypothesize a distribution, estimate its parameters, and test the fit using goodness-of-fit software.
In the absence of data, use appropriate distributions. Arrivals: exponential time between arrivals, or non-stationary Poisson. Activities: Triangular or Pert.
Use the model to determine the critical data elements.