History of Numbers and Expectations

Slides:



Advertisements
Similar presentations
Bayes rule, priors and maximum a posteriori
Advertisements

Introduction to Probability The problems of data measurement, quantification and interpretation.
Copyright © 2006 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Probability Simple Events
Statistics: Purpose, Approach, Method. The Basic Approach The basic principle behind the use of statistical tests of significance can be stated as: Compare.
INTRODUCTION TO ARTIFICIAL INTELLIGENCE Massimo Poesio LECTURE 11 (Lab): Probability reminder.
Chapter 4 Probability and Probability Distributions
COUNTING AND PROBABILITY
Copyright © Cengage Learning. All rights reserved.
NIPRL Chapter 1. Probability Theory 1.1 Probabilities 1.2 Events 1.3 Combinations of Events 1.4 Conditional Probability 1.5 Probabilities of Event Intersections.
1 BASIC NOTIONS OF PROBABILITY THEORY. NLE 2 What probability theory is for Suppose that we have a fair dice, with six faces, and that we keep throwing.
1 Chapter 6: Probability— The Study of Randomness 6.1The Idea of Probability 6.2Probability Models 6.3General Probability Rules.
Independence, Dependence and History of Numbers 1 Krishna.V.Palem Kenneth and Audrey Kennedy Professor of Computing Department of Computer Science, Rice.
Probability & Certainty: Intro Probability & Certainty.
Chapter 4 Probability.
September SOME BASIC NOTIONS OF PROBABILITY THEORY Universita’ di Venezia 29 Settembre 2003.
1 Basic Probability Statistics 515 Lecture Importance of Probability Modeling randomness and measuring uncertainty Describing the distributions.
Probability & Certainty: Intro Probability & Certainty.
Chapter 6 Probability.
Problem A newly married couple plans to have four children and would like to have three girls and a boy. What are the chances (probability) their desire.
© 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license.
Chapter 4 Probability See.
Standardized Score, probability & Normal Distribution
Independence and Dependence 1 Krishna.V.Palem Kenneth and Audrey Kennedy Professor of Computing Department of Computer Science, Rice University.
Chapter 11 Data Descriptions and Probability Distributions
PROBABILITY AND STATISTICS FOR ENGINEERING Hossein Sameti Department of Computer Engineering Sharif University of Technology Independence and Bernoulli.
Permutations & Combinations and Distributions
“PROBABILITY” Some important terms Event: An event is one or more of the possible outcomes of an activity. When we toss a coin there are two possibilities,
Sullivan – Fundamentals of Statistics – 2 nd Edition – Chapter 5 Section 1 – Slide 1 of 33 Chapter 5 Section 1 Probability Rules.
Simple Mathematical Facts for Lecture 1. Conditional Probabilities Given an event has occurred, the conditional probability that another event occurs.
Copyright © Cengage Learning. All rights reserved. CHAPTER 9 COUNTING AND PROBABILITY.
Chapter 4 Correlation and Regression Understanding Basic Statistics Fifth Edition By Brase and Brase Prepared by Jon Booze.
1 2. Independence and Bernoulli Trials Independence: Events A and B are independent if It is easy to show that A, B independent implies are all independent.
Expectations 1 Krishna.V.Palem Kenneth and Audrey Kennedy Professor of Computing Department of Computer Science, Rice University.
Probability and Independence 1 Krishna.V.Palem Kenneth and Audrey Kennedy Professor of Computing Department of Computer Science, Rice University.
Introduction to Simulations and Experimental Probability Chapter 4.1 – An Introduction to Probability Learning goals: Describe a simulation for a given.
BINOMIALDISTRIBUTION AND ITS APPLICATION. Binomial Distribution  The binomial probability density function –f(x) = n C x p x q n-x for x=0,1,2,3…,n for.
1 1 Slide © 2016 Cengage Learning. All Rights Reserved. Probability is a numerical measure of the likelihood Probability is a numerical measure of the.
 History and Relevance of probability theory Probability theory began with the study of game of chance that were related to gambling, like throwing a.
1 TABLE OF CONTENTS PROBABILITY THEORY Lecture – 1Basics Lecture – 2 Independence and Bernoulli Trials Lecture – 3Random Variables Lecture – 4 Binomial.
1.3 Simulations and Experimental Probability (Textbook Section 4.1)
Bayes Theorem and Review of Concepts Learnt 1 Krishna.V.Palem Kenneth and Audrey Kennedy Professor of Computing Department of Computer Science, Rice University.
Bayesian vs. frequentist inference frequentist: 1) Deductive hypothesis testing of Popper--ruling out alternative explanations Falsification: can prove.
Chapter 4 Probability ©. Sample Space sample space.S The possible outcomes of a random experiment are called the basic outcomes, and the set of all basic.
The Cartoon Guide to Statistics
Week 21 Conditional Probability Idea – have performed a chance experiment but don’t know the outcome (ω), but have some partial information (event A) about.
Independence and Bernoulli Trials. Sharif University of Technology 2 Independence  A, B independent implies: are also independent. Proof for independence.
Making sense of randomness
POSC 202A: Lecture 4 Probability. We begin with the basics of probability and then move on to expected value. Understanding probability is important because.
Copyright © 2014 by McGraw-Hill Higher Education. All rights reserved. Essentials of Business Statistics: Communicating with Numbers By Sanjiv Jaggia and.
Conditional Probability Mass Function. Introduction P[A|B] is the probability of an event A, giving that we know that some other event B has occurred.
Copyright © 2006 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Review of Statistics I: Probability and Probability Distributions.
MATH 256 Probability and Random Processes Yrd. Doç. Dr. Didem Kivanc Tureli 14/10/2011Lecture 3 OKAN UNIVERSITY.
Independence and Dependence 1 Krishna.V.Palem Kenneth and Audrey Kennedy Professor of Computing Department of Computer Science, Rice University.
+ Chapter 5 Overview 5.1 Introducing Probability 5.2 Combining Events 5.3 Conditional Probability 5.4 Counting Methods 1.
Week 21 Rules of Probability for all Corollary: The probability of the union of any two events A and B is Proof: … If then, Proof:
PROBABILITY AND BAYES THEOREM 1. 2 POPULATION SAMPLE PROBABILITY STATISTICAL INFERENCE.
Statistical NLP: Lecture 4 Mathematical Foundations I: Probability Theory (Ch2)
Chance We will base on the frequency theory to study chances (or probability).
The Law of Averages. What does the law of average say? We know that, from the definition of probability, in the long run the frequency of some event will.
PROBABILITY 1. Basic Terminology 2 Probability 3  Probability is the numerical measure of the likelihood that an event will occur  The probability.
AP Statistics From Randomness to Probability Chapter 14.
Virtual University of Pakistan
Probability Axioms and Formulas
Presented by: Karen Miller
What is Probability? Quantification of uncertainty.
Probability Probability underlies statistical inference - the drawing of conclusions from a sample of data. If samples are drawn at random, their characteristics.
Statistical NLP: Lecture 4
28th September 2005 Dr Bogdan L. Vrusias
Presentation transcript:

History of Numbers and Expectations Krishna.V.Palem Kenneth and Audrey Kennedy Professor of Computing Department of Computer Science, Rice University

Content Take home exercise Continuation of History of Numbers Bayes’ Theorem General Product Rule Expectations

Take Home Exercise - 7 Suppose two players A and B play a game of dice Let the dice be fair and unique: Suppose one is colored blue and other black Let Player A rolls this pair of dice and player B guess the numbers on them Let the outcome of rolling be (x,y), where x is outcome on the blue die and y is the outcome on the black die. This implies (x, y) ≠ (y, x) Player A “may” inform Player B about the number of “even numbers” in the outcome of rolling the pair of dies. For example, (2,3) has only 1 even number while (4,6) has 2 even numbers (1)Calculate the probabilities of guessing the correct (x,y) for each pair of x,y Without any information With the information of number of even numbers as stated above. Tabulate the results of all the (x,y) with and without the information Take a single case say (2,5) and derive the above probability of guessing (2,5) using conditional probability.

Take Home Exercise - 7 Part 2: (Slightly tricky) Consider the same game but we are not interested in individual cases of (x,y) Instead we are only interested in the probability of player B guessing the correct number How often will player B guess the correct answer Without any information And with information Hint: Find the probabilities of getting “1 even number”, “2 even numbers and “3 even numbers”

Content Take home exercise Continuation of History of Numbers Bayes’ Theorem General Product Rule Expectations

The Babylonian Number system 20000BC 3000BC 500BC 300BC 500AD 1000AD The Babylonian Number system The Babylonians lived in Mesopotamia, a fertile plain between the Tigris and Euphrates rivers. 3400 BC: The Egyptians and Babylonians were first recorded as using the natural numbers and rational numbers. The base of a number system is the number of symbols available for representation. The modern day numbers are a base-10 system, because we have 10 different symbols for all our numbers. Babylonians had a base of 60. That means they had 60 symbols. 1 2 3 4 5 6 7 8 9 …. 1 2 3 4 58 59 Notice the white space for a zero !!

Sumerian/ Babylonian Numerals 20000BC 3000BC 500BC 300BC 500AD 1000AD Sumerian/ Babylonian Numerals Though we mentioned that the Babylonians had a base of 60 which means that they have to remember 60 different symbols to use numbers, they had invented a clever way of creating all the 60 numerals from just two symbols . And They needed 59 numerals to represent all the numbers from 1 and 59. So they had a symbol for 10 and another one for 1. Thus the number of times each one of them is repeated gives the numeral for that particular number. Now let us create the numeral 7 Thus to create a numeral for 7, we need to combine 7 of the symbols for 1.

Sumerian/ Babylonian Numerals 20000BC 3000BC 500BC 300BC 500AD 1000AD Sumerian/ Babylonian Numerals Now let us consider another example Let us create the numeral 23 Thus to create a symbol for 23, we need two symbols of ‘10’ and three of ‘1’. Similarly

Sumerian/ Babylonian Numerals To create other numerals there are some simple rules that have to be followed. When we are stacking symbols for 1 to create numerals, each stack should have at most three in each row. We stack all the three symbols in a single row. For example, To create the number 3 We stack three in the first row and then the remaining two in the next row. To create the number 5 Similarly when we stack the symbols for 10, the symbols are stacked in the same way except diagonally. Can you infer from the following ?

Constructing bigger numerals from small numerals 20000BC 3000BC 500BC 300BC 500AD 1000AD Constructing bigger numerals from small numerals Thus we obtain the complete set of symbols for all the 59 numbers. Now as we have all the numerals for the number system, we need to understand how to write to write a number.

Positional Significance Consider the number 4256. This is a numeral. What is the value of this numeral ? How is it evaluated? 4256 = 4*1000 + 2*100 + 5*10 + 6 Powers of 10 Positional significance Positional significance is the mechanism by which a symbol is elevated in it’s value to easily create bigger numbers. This is done by writing the symbols adjacent to each other to create bigger numerals.

Sumerian/ Babylonian Numbers For example, let us try writing in Babylonian system, the value 4256 As the Babylonian system has a base of 60, the positional significance of each symbol varies with as a power of 60. So, 4256 can be written as 4256 = 1*3600 + 10*60 + 56*1 = 1*602 + 10*601 + 56*600 Now, we have to represent this in terms of the Babylonian numerals 4256 Exercise: Represent the quantity 2764 in Babylonian number system. Solution:

Positional significance A quick recap Topic Tally Marks Conventional Sumerian Symbol 0,1,2,3… ,9 Numerals 4256 Base 1 10 60 Positional significance Powers of 1 Powers of 10 Powers of 60

Problems in Sumerian/ Babylonian System Now there is a potential problem with the system. Using this number system let us represent the two numbers 61 and 2. First 61 = 1*60 +1*1 is represented as The only difference being the space between the symbols. And 2 = 2*1 is represented as A much more serious problem was the fact that there was no symbol for zero. Let us see for ourselves. Let us represent the numbers 1 and 60. 1 = 1*1 60= 1*60 They have exactly the same representation and now there was no way that spacing could help.

Egyptian Civilization 20000BC 3000BC 500BC 300BC 500AD 1000AD Egyptian Civilization The ancient Egyptians were possibly the first civilization to practice the scientific arts. But each symbol represented a power of 10. All other decimal numbers were represented using the above symbols

Egyptian Number System 20000BC 3000BC 500BC 300BC 500AD 1000AD Egyptian Number System To represent a quantity in Egyptian system, we first represent the quantity in terms of the powers of 10, similar to the present day system. For example, 3244 3*1000 + 2*100 + 4*10 +4*1 Exercise: Represent in Egyptian number system the quantity 21,237. Solution:

The different number systems Tally Marks(20000BC) Sumerian(3000BC) Egyptian(3000BC) Easy to update the number Only two symbols used to generate all numerals. Ease of representation and manipulation. Larger numbers become difficult to represent and manipulate. Manipulation is cumbersome because of the larger number of numerals. Difficult to use too, but has a hint of the modern base-10 number system in it’s positional significance.

Content Take home exercise Continuation of History of Numbers Bayes’ Theorem General Product Rule Expectations

Who was Bayes? Thomas Bayes was a British mathematician and a Church minister He formulated the famous Bayes theorem his work on this was published posthumously as Essay Towards Solving a Problem in the Doctrine of Chances (1764) His work on Bayes theorem gave birth to the branch of Statistics Bayesian probability is the name given to several related interpretations of probability, they have in common the notion of probability as something like a partial belief, rather than a frequency. "Bayesian" has been used in this sense since about 1950

Deriving Bayes Theorem with an Example Suppose you have a closed box containing a large number of black and white balls. you do not know the proportion of black and white balls You take out a sample of balls from the box and find that there are three-fourths of black balls in the sample Bayes worked out a theorem which indicates exactly how opinions held before the experiment should be modified by the evidence of the sample What is your guess about the composition of balls in the box? Now, what is your guess about the composition of balls in the box?

Birth of Statistics Statistics arose from the need of states to collect data on their people and economies for administrative purposes started in 18th century Bayes theorem provided the mathematical basis for this branch initial intuition was given by Francis Bacon Thomas Bayes provided the first mathematical basis to this branch of logic Its meaning broadened in the early 19th century to include the collection and analysis of data in general. today statistics is widely employed in government, business, and the natural and social sciences.

Probability Theory Vs Statistics Probability theory computes the probability that future (and hence presently unknown) samples out of a known population turn out to have stated characteristics Statistics looks at the present and hence known sample taken out of an unknown population, and makes estimates of what the population is likely to be, compares likelihood of various populations and tells how confident you have a right to be about these estimates

What is Bayes Theorem? Bayes' theorem relates the conditional and unconditional probabilities of events A and B, where B has a non-zero probability: Each term in Bayes' theorem has a conventional name: P(A) is the prior probability or unconditional probability of A. It is "prior" in the sense that it does not take into account any information about B. P(A|B) is the conditional probability of A, given B. P(B|A) is the conditional probability of B given A. P(B) is the prior or marginal probability of B

Alternate Form of Bayes Theorem Consider that A has two events : A1 and A2 If we want to compute the probability of A1 given B, then But, P(B) can be written as Hence, we get More generally, Bayes theorem can be written as

Understanding Bayes Theorem Bayes theorem is often used to compute posterior probabilities given observations. For example, a patient may be observed to have certain symptoms. Bayes' theorem can be used to compute the probability that a proposed diagnosis is correct, given that observation. Intuitively, Bayes’ theorem in this form describes the way in which one's beliefs about observing ‘A’ are updated by having observed ‘B’.

Derivation of Bayes Theorem To derive the theorem, we start from the definition of conditional probability. The probability of event A given event B is Equivalently, the probability of event B given event A is Rearranging and combining these two equations, we find Dividing both sides by P(B), provided that it is non-zero, we obtain Bayes' theorem: ( is probability of A and B occurring simultaneously)

Content Take home exercise Continuation of History of Numbers Bayes’ Theorem General Product Rule Expectations

General Product Rule All along, we have been using product rule as given below P(A and B and C and …) = P(A)P(B)P(C)…. The above formula is a “Special case” of the general Product Rule. All the problems we have been dealing with have consisted of “Independent” Events Rolling of a pair of dies Tossing of coins Therefore, P(A and B and C and ….) = P(A)P(B)P(C)….. But what if they were not independent? Will the same formula work? NO!! So is there a general product rule which can be applied? YES!!

General Product Rule Suppose we are interested in simultaneous occurrence of event A, B and C. Suppose these events are all dependent on each other P(A and B and C) = P(A)P(B|A)P(C|A,B) In general for n different dependent events A1, A2, A3….An P(A1 and A2 and A3 …. An) = P(A1)P(A2|A1)P(A3|A1,A2)P(A4|A1,A2,A3)……………… P(An|A1,A2,A3,….,An-1) Can we derive it?

Proof for General Product Rule Let us consider just 2 “Dependent” events A1 and A2 Definition of conditional probability is P(A2| A1) = P(A1 and A2 )/P(A1) So P(A1 and A2 ) = P(A2| A1) P(A1) Now let us add a third event A3 We must somehow represent P(A1 and A2 and A3 ) it in terms of P(A1 and A2 ). How about P(A3| A1 A2)? P(A3| A1 A2) = P(A1 and A2 and A3 )/P(A1 and A2) P(A1 and A2 and A3 ) = P(A3| A1 A2 ) P(A1 and A2) = P(A3| A1 A2 ) P(A2| A1) P(A1) In general we can extend this to n events

Content Take home exercise Continuation of History of Numbers Bayes’ Theorem General Product Rule Introduction to Expectations

Any guesses In-class exercise -1 Let us change gears and move to a new topic Consider the following exercise You will be given a coin and you toss it If you get heads you get some reward (2 chips) And if you get tails you do not get anything After 10 rounds, how many chips do you think you will have ? Let us test it out for ourselves. Any guesses

Number of chips till that point Maintain the record of your game in the following way Play this game for 20 rounds Toss Outcome Number of chips till that point 1 2 3 4 5 6 7 8 9 10 Observe these numbers Can you now take a guess of your earnings after 20 turns, 50 turns, 100 turns ?

½ Can you now take a guess of your earnings after 20 turns, 50 turns, 100 turns ? Answering this question is very important to get an idea of how much you are going to earn. Let us use probability i p HEAD=0 ½ TAIL=1 First let us calculate the earnings we can expect in one turn So we have ½ chance of earning 2 chips We have ½ chance of earning nothing So we can expect to earn ½ * 2 = 1 chip at the end of every turn on an average Very important

Number of chips till that point Let us see another example This time we will consider another randomizing device – our friendly die Maintain a similar record till 20 throws in the following format Earnings Outcome Earning 1 2 3 4 5 6 Throw Outcome Number of chips till that point 1 2 3 4 5 6 7 8 9 10

Can you now take a guess of your earnings after 20 turns, 50 turns, 100 turns ? We again visit the same question. This time we know that the probability function of a die can be represented as i p(i) 1 1/6 2 3 4 5 6 There is a 1/6th chance of winning 2 chips There is a 1/6th chance of winning 2 chips There is a 1/6th chance of winning 2 chips There is a 1/6th chance of winning 1 chip There is a 1/6th chance of winning 1 chip There is a 1/6th chance of winning 1 chip On an average we can expect (1/6)*2 + (1/6)*2 + (1/6)*2 + (1/6)*1+ (1/6)*1 + (1/6)*1 = 3/2 chips every throw What does this mean ?

What do we mean when we say that the earning per game is 3/2 chips? That does not seem right!! To understand this, let us first consider the following question. Consider an unbiased coin toss. The probability of obtaining a HEAD = ½ But for n trials of the experiment do we always get n/2 HEADs and n/2 TAILs ? Consider the following experiment: Toss a coin 5, 10, 50, 100, 500, 1000 … 10000 times. At each point collect the data regarding number of HEADs and number of TAILs. Now let us analyze data obtained from one such experiment.

Notice that the Y-axis is in logarithmic scale What do you observe as the number of trials grows large ?

tends to agree with the ideal case ? Can you observe that as the number of trials grows “large” the result of the experiment tends to agree with the ideal case ?

What does this mean with regard to expectation of 3/2 chip for each experiment ? It means that as the number of trials(n) grows “large” then it can be expected that the earnings will be equal to 3/2 * n Take Home: Perform a similar analysis of the coin to the die experiment. Show that on an average when the number of trials grows very large the earnings is 3/2 per trial.

END