Simulation: Using computers to simulate real-world observations



Why do we use it?
In business, we identify a random variable and want to estimate its probability information.
Problem: It is not practical to gather reliable data using real-world observations.
Solution: Replace the real-world observations with computer simulations.

Types
We have two types of simulation that we look at:
–Generating integers
–Generating values between 0 and 1 with uniform distribution
In the next example, we are going to look at creating a simulation by generating integers.

Example #1
The operator of a phone switchboard at a large company takes a break after every 50 phone calls he handles. To plan for a back-up operator, the office manager would like to have information on the length of time that it might take for a set of 50 calls to arrive at the switchboard. Specifically, she would like to know the probability that, starting at 9am, a run of 50 calls will arrive before 10am.

Example #1 (continued)
It would be impractical for the manager to wait and collect records on thousands and thousands of incoming calls.
The manager could, instead, collect a small amount of data and then use computer simulation to generate more.
Using all of this information, the manager could then estimate the probability of a run of 50 calls coming between 9 and 10am.

Example #1 (continued)
For one day, 270 calls were received from 9am to 3pm.
T is the random variable giving the time, in minutes, between the arrival of successive calls.
The 270 calls determine 269 time intervals; these intervals are assumed to be independent observations of T.
Look in Phone Log.xls for the information recorded.

Example #1 (continued)
In Phone Log.xls, the average interval between calls was found to be 1 min, 20 sec.
Thus, a run of 50 calls would be expected to arrive in 50 × 80 sec = 1 hour, 6 min, 40 sec, i.e., from 9am to 10:06:40.
If END is the random variable that is the end arrival time for a run of 50 calls, the manager wants P(END < 10:00:00). Why?
Estimate the probability by letting Excel simulate 2,000 runs of 50 calls each.

Creating a Simulation
We want to run a simulation that generates 2,000 runs, where each run has 50 calls.
How do you generate one run of 50 calls? We need to use 2 Excel functions:
–RANDBETWEEN
–VLOOKUP

RANDBETWEEN
RANDBETWEEN selects a random integer.
In our case, we want it to select a random integer between 1 and 269.
–WHY?
It is found under the Math & Trig submenu of the Function Wizard.
Syntax: Enter the smallest and largest integer for RANDBETWEEN to return.

VLOOKUP
Once a random number has been picked, VLOOKUP can be used to cross-reference the information we have to find another value.
VLOOKUP “searches for a value in the leftmost column of a table, and then returns a value in the same row from a column you specify”.
Let’s look at Phone Log.xls.

VLOOKUP
How do we use VLOOKUP and what do we enter?
–Lookup_value – value to find in the leftmost column of the table. Since we are generating random numbers, this is where you would use the RANDBETWEEN function
–Table_array – location of the table
–Col_index_num – number of the column where the value is to be found
–Range_lookup – we leave this blank (Excel then assumes the leftmost column is sorted in ascending order)
For example, VLOOKUP(RANDBETWEEN(1,269),$B$8:$D$276,3)
–picks a number between 1 and 269, finds it in the leftmost column of the table B8:D276, and moves two cells to the right (to the third column) to return the value you want
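
The RANDBETWEEN and VLOOKUP pairing can be mirrored outside Excel. The following is a minimal Python sketch, not the workbook's method: the `intervals` table holds made-up values standing in for the contents of Phone Log.xls, and the helper names `randbetween` and `lookup_interval` are hypothetical.

```python
import random

# Hypothetical stand-in for the table in B8:D276: row number -> interval (minutes).
# The real 269 values are in Phone Log.xls and are not reproduced here.
intervals = {1: 0.5, 2: 2.25, 3: 1.0, 4: 1.75, 5: 0.9}

def randbetween(low, high, rng=random):
    """Like Excel's RANDBETWEEN: a random integer with both ends included."""
    return rng.randint(low, high)

def lookup_interval(table, rng=random):
    """Pick a random row number, then return the value stored for that row."""
    row = randbetween(1, len(table), rng)
    return table[row]

if __name__ == "__main__":
    print(lookup_interval(intervals))
```

A dictionary lookup is an exact match on the row number, which is the behavior the slide relies on because every integer from 1 to 269 appears in the leftmost column.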

Generating 2,000 observations
Once you have the syntax to generate one run, you can generate however many you want.
Drag the formula to the desired areas (across and down).

Finding the Probability
Once you have all the observations you want, we can finally answer the original question.
The office manager wanted to know the probability that a run of 50 calls will arrive between 9 and 10am.
Use the COUNTIF function to count how many times the values in the last column are less than 10:00:00.
Why are we only concerned with the last column of our generated data?
Once we have that number, divide it by 2,000 to get the probability.
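
The whole procedure (resample 50 intervals, add them up, count the runs that finish before 10am, divide by the number of runs) can be sketched in Python. This is an illustration under stated assumptions: the `sample` intervals are invented rather than taken from Phone Log.xls, the first call is assumed to arrive one interval after 9am, and the function names are hypothetical.

```python
import random

def simulate_end_times(intervals, runs=2000, calls=50, rng=random):
    """For each run, return the minutes after 9am at which call number `calls` arrives."""
    ends = []
    for _ in range(runs):
        # Resample `calls` intervals with replacement and add them up.
        total = sum(rng.choice(intervals) for _ in range(calls))
        ends.append(total)
    return ends

def prob_before(ends, cutoff_minutes=60.0):
    """COUNTIF-style step: fraction of runs that finish strictly before the cutoff."""
    return sum(1 for e in ends if e < cutoff_minutes) / len(ends)

if __name__ == "__main__":
    # Made-up intervals in minutes; the 269 observed values are in Phone Log.xls.
    sample = [0.2, 0.5, 0.8, 1.0, 1.3, 1.6, 2.0, 3.2]
    ends = simulate_end_times(sample, rng=random.Random(1))
    print(prob_before(ends))
```

Only the final sum of each run matters here, which corresponds to looking at the last column of the generated table.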

New Simulations
To get an accurate probability, you should do repeated trials of the simulation.
–How many repeated trials will be needed to get an accurate probability? How do you know?
Phone Log.xls is set for manual calculation; thus, you can press F9 to produce a new simulation.
Repeated trials show that P(END < 10:00:00) is about 26%.

What do you gain with simulation?
You only have to gather data for one day as opposed to numerous days.
For our simulation in Phone Log.xls, we only had to gather one day’s data to get the probability.
If we didn’t run the simulation, we would have had to gather about 7½ years’ worth of data to get about the same approximation.
This type of method is called bootstrapping.

Continuous Random Variables
For other business situations, generating integers will not be enough. Thus, using RANDBETWEEN will not be helpful.
Excel has another function that generates random numbers between 0 and 1 with a uniform distribution. This function is called RAND.

RAND function
Syntax:
–RAND does not take any arguments
–Type =RAND() in the cell and hit Enter
Found under the Math & Trig submenu of the Function Wizard.

RAND function
For example, let’s generate 10 numbers using the function RAND.
P(x ≤ 0.5) = 1/2 because RAND generates the number with a uniform distribution. Why?
–Each number generated has a probability of 1/n, where n is the amount of numbers that can be generated between 0 and 1.
–Now, there are m numbers between 0 and 0.5, and m = n/2 because m is half of the numbers between 0 and 1.
–Thus, P(x ≤ 0.5) = m(1/n) = m/n = (n/2)/n = 1/2
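
The claim P(x ≤ 0.5) = 1/2 can be checked empirically. The sketch below uses Python's `random.random()` as a stand-in for RAND; the function name `fraction_at_most` and the sample size are illustrative choices, not part of the slides.

```python
import random

def fraction_at_most(threshold, n=100_000, rng=random):
    """Draw n uniform numbers in [0, 1) and return the share that are <= threshold."""
    hits = sum(1 for _ in range(n) if rng.random() <= threshold)
    return hits / n

if __name__ == "__main__":
    # With many draws, the share should settle near 0.5.
    print(fraction_at_most(0.5))
```

With only 10 draws, as on the slide, the observed share can easily be 0.3 or 0.7; the agreement with 1/2 shows up as the number of draws grows.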

IF function
We would like to have another function that gives us some information about the numbers that have just been generated. This function is called IF and is found under the Logical submenu of the Function Wizard.
What does the IF function do?
–Returns one value if a condition you specify is true
–Returns another value if the condition you specify is false
What condition?
–Any value or expression that can be evaluated to be true or false

IF function
Syntax:
–Logical_test – the condition you specify (it must be a condition that evaluates to true or false)
–Value_if_true – the value IF returns if the condition is true (text values need to be typed in quotes)
–Value_if_false – the value IF returns if the condition is false (text values need to be typed in quotes)
E.g., IF(B11<=0.5, "H", "T") means the following:
–if the value in cell B11 is less than or equal to 0.5, return H
–if it is greater than 0.5, return T
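
As a small illustration, the same coin-flip rule can be written as a Python conditional expression. The function name `coin_from_uniform` is hypothetical, and `random.random()` plays the role of the RAND value in cell B11.

```python
import random

def coin_from_uniform(u):
    """Mirror of IF(B11<=0.5,"H","T"): heads when u is at most 0.5, tails otherwise."""
    return "H" if u <= 0.5 else "T"

# Ten simulated coin flips, one per uniform draw.
flips = [coin_from_uniform(random.random()) for _ in range(10)]
print(flips)
```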

Example #2
Errors in procedure coding are a major problem for Health Maintenance Organizations (HMOs). Suppose that Health Associates receives forms from doctors’ offices, with each form requiring two codes. If 5 forms in a random sample from a given office contain more than 1 coding error altogether, then a complaint is sent to the doctor. At Doctor Bustamante’s office, the actual probabilities of errors on a single form are as follows: 0 errors with probability 0.90, 1 error with probability 0.08, and 2 errors with probability 0.02.

Example #2 (continued)
Question: What is the probability that a random sample of 5 forms from Dr. Bustamante’s office will contain more than 1 coding error?
–I.e., what is P(T > 1), if T is the random variable that counts the number of coding errors in a random sample of 5 forms from Dr. Bustamante’s office?

Solution to Example #2
To answer the question from the previous slide, we will need to look at some probabilities and set up a simulation.
Let X be the random variable that is the number of coding errors on a single, randomly selected form from Dr. Bustamante’s office.
X = 0, 1, or 2 (from the information that was given earlier).
We also know that P(X = 0) = 0.90, P(X = 1) = 0.08 and P(X = 2) = 0.02.
Thus, E(X) = 0(0.90) + 1(0.08) + 2(0.02) = 0.12

Solution to Example #2 (cont.)
We are looking for the total number of errors. Thus, we need to look at the random variable T defined earlier.
The 5 forms are picked at random, and the numbers of errors on the forms are independent observations of X.
Since T is the sum of 5 independent observations of X, E(T) = 5*E(X) = 0.60
Use this number as a check when we run our simulation.
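
Because X takes only three values and a sample has only 5 forms, the distribution of T can also be enumerated exactly as a cross-check. This is a sketch that goes slightly beyond the slides, which rely on simulation; the helper name `pmf_of_sum` is hypothetical.

```python
from itertools import product

# P(X = x) for the number of coding errors on one form, as given in the example.
pmf_x = {0: 0.90, 1: 0.08, 2: 0.02}

def pmf_of_sum(pmf, n):
    """Exact p.m.f. of the sum of n independent draws from `pmf`, by enumeration."""
    out = {}
    for combo in product(pmf, repeat=n):
        p = 1.0
        for x in combo:
            p *= pmf[x]
        out[sum(combo)] = out.get(sum(combo), 0.0) + p
    return out

pmf_t = pmf_of_sum(pmf_x, 5)
e_x = sum(x * p for x, p in pmf_x.items())
e_t = sum(t * p for t, p in pmf_t.items())
p_more_than_one = sum(p for t, p in pmf_t.items() if t > 1)
print(e_x, e_t, p_more_than_one)
```

The enumeration reproduces E(X) = 0.12 and E(T) = 0.60, and gives P(T > 1) = 1 − 0.9^5 − 5(0.08)(0.9^4), about 0.147, which a simulation estimate should land close to.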

Simulation for Example #2
Look in the file Coding.xls to see 4,000 generated samples [of 5 forms from Dr. Bustamante’s office].
We need to look at how the functions RAND and IF were used.
Let’s look at cell G11 in the file. The formula typed there is =IF(B11<=0.90,0,IF(B11<=0.98,1,2))
What does this mean?
–Go to B11. If the number is less than or equal to 0.90, return a 0 (for no errors). If not, then:
–Go to B11, and if the number is between 0.90 and 0.98 (how do we know this is what the formula means?), return a 1 (1 error). If not, then:
–Return a 2 (for 2 errors)
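
The nested IF and the 4,000-sample table can be imitated in Python. This is a hedged sketch rather than a copy of Coding.xls: the function names are hypothetical, and `random.random()` stands in for RAND.

```python
import random

def errors_on_form(u):
    """Mirror of IF(B11<=0.90,0,IF(B11<=0.98,1,2))."""
    if u <= 0.90:
        return 0
    if u <= 0.98:  # only reached when u > 0.90
        return 1
    return 2

def simulate_totals(samples=4000, forms=5, rng=random):
    """Each sample is the total number of errors on `forms` independent forms."""
    return [
        sum(errors_on_form(rng.random()) for _ in range(forms))
        for _ in range(samples)
    ]

if __name__ == "__main__":
    totals = simulate_totals(rng=random.Random(7))
    mean_t = sum(totals) / len(totals)                      # compare with E(T) = 0.60
    share = sum(1 for t in totals if t > 1) / len(totals)   # estimate of P(T > 1)
    print(mean_t, share)
```

The second test in `errors_on_form` never needs to check the lower bound, because that branch is reached only after the first test has failed; this is the answer to the slide's question about how "between 0.90 and 0.98" is implied.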

Focus on the Project
How does all of this relate to our project?
We can use the functions RANDBETWEEN and VLOOKUP to create a simulation.
This simulation will help us determine the price of our option on the starting date of the option.

Focus on the Project
Recall, we have the following random variables:
–C_norm – normalized closing price at the end of the option period (the number of weeks your option ran)
–R_norm – normalized weekly ratios
–FV – normalized value of our option at the end of the option period
–PV – option price, per share, on the starting date of the option

Focus on the Project
If we go to Option Focus.xls, we can look in the sheet Simulation to see how RANDBETWEEN and VLOOKUP are used to select random values of R_norm.
After randomly selecting normalized ratios, you can use them to find an observation of the normalized closing price.

Focus on the Project
How?
–Take the actual starting price of the stock on the starting date of the option, March 7, 2003 (this is found from the historical data)
–Multiply the starting price by a randomly selected normalized ratio; this gives you a current price of your stock for that week
–Repeat this procedure, multiplying the current week’s stock price (found in the step before) by the next randomly selected normalized ratio; do this for the length of your option (how many weeks your option is running)
–At the end of your option length, you will have an observation of the normalized closing price, C_norm
–This is 1 run of your simulation
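
One run of this procedure can be sketched in Python. The ratios, starting price, and number of weeks in the usage lines are invented for illustration and do not come from Option Focus.xls; the function name `closing_price` is hypothetical.

```python
import random

def closing_price(start_price, ratios, weeks, rng=random):
    """One run: multiply the price by a randomly chosen weekly ratio, once per week."""
    price = start_price
    for _ in range(weeks):
        price *= rng.choice(ratios)  # plays the role of RANDBETWEEN + VLOOKUP
    return price

if __name__ == "__main__":
    # Hypothetical normalized weekly ratios and starting price.
    ratios = [0.97, 0.99, 1.00, 1.01, 1.04]
    print(closing_price(25.0, ratios, weeks=20))
```

Each call returns one observation of the closing price; calling it repeatedly fills in the rows of the simulation table.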

Focus on the Project
Once you have an observation of your normalized closing price, we need to use this to find the starting option value.
We know that the Future Value of the option (FV) is max{S_0 - C_norm, 0} (Why?)
So for the normalized closing price you found, you should find the corresponding future value of the option.
–Note – the number will be either 0 or positive
Once you have the future value of the option, we can use the present value formula to find the present value of the option (or the value of the option on the starting date of the option).
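
The two steps on this slide can be written as small Python functions. The future value follows the expression exactly as the slide states it. The slides do not say which present value formula is meant, so the continuous discounting below, the `annual_rate` parameter, and the 52-week year are assumptions made only for illustration.

```python
import math

def future_value(s0, c_norm):
    """FV as written on the slide: max(S_0 - C_norm, 0), so never negative."""
    return max(s0 - c_norm, 0.0)

def present_value(fv, annual_rate, weeks):
    """Discount FV back to the starting date.

    Assumption: continuous compounding at `annual_rate` over weeks/52 years.
    Substitute the present value formula used in your own course materials.
    """
    return fv * math.exp(-annual_rate * weeks / 52)

if __name__ == "__main__":
    fv = future_value(26.0, 24.5)       # hypothetical numbers
    print(fv, present_value(fv, 0.04, 20))
```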

Focus on the Project
At the end of this 1 run of the simulation, you will have a value of your option on the starting date of your option.
You will need to do more than 1 run of your simulation to make your estimate accurate.
–The more observations you have, the better your estimate will be
How many runs of your simulation should you have?
–The more the better; have at least 5,000 runs
–You can click and drag your formula down 5,000 cells and across the number of weeks your option runs to fill in your simulation table

Focus on the Project
Once you have 5,000 runs, you will have 5,000 observations of your closing price. Thus, you can find 5,000 future values of your option and, in turn, 5,000 values of your option price (PV).
Find the sample mean of these 5,000 observations of PV, which provides an approximation of E(PV).
Each time the simulation is recomputed (by pressing F9), a new approximation of E(PV) is calculated.
Run your simulation at least 20 times; this provides you with 20 approximate values of E(PV).
Average these numbers together to determine your option price, per share.

Focus on the Project
5,000 runs to compute one E(PV), repeated 20 times to produce 20 E(PV)’s, gives 100,000 simulations.
Is this enough simulations? How can you tell?
Well, what happens if you run 200,000 simulations? 300,000 simulations? Will you be getting a more accurate price?
You will need to look into this issue.
–Perhaps you can graph the values that you get for E(PV) when you run the simulation 100,000, 200,000, and 300,000 times. What is the graph telling you?
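
One way to explore the "is this enough?" question is to repeat the estimate 20 times at several run counts and compare how much the 20 values scatter. The Python sketch below does this with a toy payoff, because the project data are not available here: `toy_payoff` is a hypothetical stand-in for one simulated PV, and the function names are illustrative.

```python
import random
import statistics

def estimate_mean(sample_fn, runs, rng):
    """Average of `runs` simulated values: one approximation of the expected value."""
    return sum(sample_fn(rng) for _ in range(runs)) / runs

def spread_of_estimates(sample_fn, runs, repeats=20, seed=0):
    """Repeat the estimate `repeats` times; return the mean and standard deviation."""
    rng = random.Random(seed)
    estimates = [estimate_mean(sample_fn, runs, rng) for _ in range(repeats)]
    return statistics.mean(estimates), statistics.stdev(estimates)

def toy_payoff(rng):
    """Hypothetical stand-in for one simulated PV: a uniform draw minus 0.5, floored at 0."""
    return max(rng.random() - 0.5, 0.0)

if __name__ == "__main__":
    for runs in (500, 5000, 50000):
        print(runs, spread_of_estimates(toy_payoff, runs))
```

If the scatter among the 20 estimates is already small compared with the precision you need for the price, more runs add little; if it is large, more runs are worth the computing time.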

What should you do?
You will need to input your data into the given simulation (or better yet, create your own by hand; you will understand it better if you do this!)
If you just change the given simulation, what do you need to input?
–Change all cell references to match your data
–Some examples are (this might not be all):
The number of weeks of ratios you have
Your week dates
Your normalized ratios
The length of time your option is running for (match with your data); there are many cell references for this
The actual closing price of the stock on March 7, 2003
The cell references in the formulas, to match the length of your option time and the number of normalized ratios you have

What should you do?
After creating your own simulation OR changing the current simulation, you should end up with one value for E(PV).
Run the simulation (by hitting F9) at least 20 times.
Record the values obtained for E(PV) and average them together.
This number is your final estimate for the value of your option on March 7, 2003, per share.
Remember our underlying is 100 shares, so your option premium will be the value of your option, per share, multiplied by 100.