Modeling of User Behavior In Matching Task Based on Previous Reward History and Personal Risk Factor
April 1, 2004
Helen Belogolova, Amy Daitch



Project Summary
- Experiment:
  - Subjects were given a matching task in which they chose between button A and button B
  - They received a reward based on predetermined reward functions
- Our Model:
  - Subject's memory decay: leaky integration
  - Personal Risk Factor
  - Cumulative Risk Factor

Method for Modeling the Behavior
General Method
- Part I. Exploratory Phase
  - P(A) = 0.5, P(B) = 0.5
  - In the first 10 trials, A and B are chosen with equal probability
- Part II. Choices Based on Past Reward History
  - The reward function took into account a 40-trial buffer, updated after each trial
  - The vector of rewards is weighted according to a leaky integrator model with decay parameter d:
    weighted_rewards_vector = [exp(1*d) exp(2*d) … exp(240*d)]' * reward_vector
  - The most recent reward carries the most influence on the subject's next move
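The leaky-integrator weighting described above can be sketched in Python roughly as follows. This is a minimal illustration, not the authors' code: it assumes the reward vector is ordered from oldest to newest (so that, for d > 0, the most recent reward gets the largest weight) and that the product in the slide's formula is element-wise. The function names `leaky_weights` and `weighted_rewards` are hypothetical.

```python
import numpy as np

def leaky_weights(n_trials, d):
    """Return the weights exp(1*d), exp(2*d), ..., exp(n_trials*d).

    Assumption: index 1 is the oldest trial in the buffer, so for d > 0
    the newest trial receives the largest weight.
    """
    k = np.arange(1, n_trials + 1)
    return np.exp(k * d)

def weighted_rewards(reward_vector, d):
    """Weight each past reward element-wise (assumed reading of the formula)."""
    rewards = np.asarray(reward_vector, dtype=float)
    return leaky_weights(rewards.size, d) * rewards

# Example: three equal rewards, decay parameter d = 1.0
w = weighted_rewards([1.0, 1.0, 1.0], d=1.0)
```

With a larger d, the weights grow faster from one trial to the next, so older rewards contribute relatively less.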

Method for Modeling the Behavior
To choose between A and B, we sum the weighted rewards received after A button presses (rewA) and after B button presses (rewB):
P(A) = rewA/(rewA + rewB)
P(B) = 1 - P(A)
Based on these total rewards, the next choice is generated as follows:
if rand(1) < P(A), choose A; otherwise, choose B
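A small Python sketch of this choice rule is given below. The function names are hypothetical, and the fallback to 0.5 when no reward has been accumulated is an assumption added here to avoid division by zero; the slides do not say how that case was handled.

```python
import numpy as np

def prob_a(rew_a, rew_b):
    """P(A) = rewA / (rewA + rewB)."""
    total = rew_a + rew_b
    if total == 0:
        # Assumption: with no reward history, fall back to an even split.
        return 0.5
    return rew_a / total

def choose(p_a, rng):
    """Draw a uniform number in [0, 1) and pick A if it falls below P(A)."""
    return "A" if rng.random() < p_a else "B"

rng = np.random.default_rng(0)
p = prob_a(3.0, 1.0)        # 0.75
next_choice = choose(p, rng)
```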

Method for Modeling the Behavior
Model Accounting for Risk
- Risk = subject's willingness to deviate from the optimal choice based on past trials
- Personality Risk
  - Constant throughout the experiment; ranges from 0 to 1
  - A function of personality: willingness to take risks in general
- Cumulative Risk
  - Ranges from 0 to 1
  - Increases as the cumulative reward increases:
    cumulative_risk(trial) = cumulative_reward(trial)/max_cumulative_reward
  - The maximum cumulative reward in our case was 6
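The cumulative risk formula can be illustrated with a short Python sketch. The function name is hypothetical; the default maximum of 6 is the value quoted on the slide.

```python
def cumulative_risk(cumulative_reward, max_cumulative_reward=6.0):
    """cumulative_risk(trial) = cumulative_reward(trial) / max_cumulative_reward."""
    return cumulative_reward / max_cumulative_reward

# Cumulative risk after the subject has accumulated half of the maximum reward
half_way = cumulative_risk(3.0)   # 0.5
```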

Method for Modeling the Behavior
The weighted Personal Risk Factor and Cumulative Reward Risk Factor make up the Total Risk Factor:
total_risk = personal_risk*personal_risk_weight + cumulative_risk*cumulative_risk_weight
With the total risk parameter defined as above, the choice probabilities become:
P(A) = rewA/(rewA + rewB) - (rewA/(rewA + rewB) - 0.5)*2*total_risk
P(B) = 1 - P(A)
The choice of A or B is then generated the same way as in the general model.
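To see how the total risk factor reshapes the choice probability, the two formulas can be written as a short Python sketch. The function names are hypothetical and the input values are made up for illustration.

```python
def total_risk(personal_risk, cumulative_risk,
               personal_risk_weight, cumulative_risk_weight):
    """Weighted sum of the personal and cumulative risk factors."""
    return (personal_risk * personal_risk_weight
            + cumulative_risk * cumulative_risk_weight)

def risk_adjusted_prob_a(rew_a, rew_b, risk):
    """P(A) = base - (base - 0.5) * 2 * risk, with base = rewA / (rewA + rewB)."""
    base = rew_a / (rew_a + rew_b)
    return base - (base - 0.5) * 2 * risk

# Illustrative values: rewA = 3, rewB = 1, so the risk-free P(A) is 0.75
p_no_risk = risk_adjusted_prob_a(3.0, 1.0, 0.0)    # 0.75
p_half_risk = risk_adjusted_prob_a(3.0, 1.0, 0.5)  # 0.5
p_full_risk = risk_adjusted_prob_a(3.0, 1.0, 1.0)  # 0.25
```

The formula moves P(A) linearly from the reward-matching value at total_risk = 0, through a coin flip at total_risk = 0.5, to the mirrored value 1 - rewA/(rewA + rewB) at total_risk = 1.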

Results
- We ran the experiment on the model, varying one parameter at a time
- Because the decisions are stochastic, we ran the experiment several times for each set of parameters to diminish the effects of randomness
- A subject could produce somewhat different results if the experiment were done more than once, so we ran the experiment on the model many times to see how a subject with certain characteristics would behave

Results
We then plotted the ratio of the subject's button presses within the buffer against the trial number and observed that:
- Varying only the personal risk factor: the model was most successful when the risk factor was very high or very low (the two behave the same in this experiment)
Below: personal risk = 0, cumulative risk = 0

Personal risk = 0.5, Cumulative Risk = 0

Personal Risk = 1, Cumulative Risk = 0
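The quantity plotted in these figures, the ratio of button presses within the buffer, could be computed along the following lines. This sketch assumes the ratio means the fraction of A presses among the most recent trials and uses the 40-trial buffer length from the method slides as the default window; the function name is hypothetical.

```python
import numpy as np

def windowed_fraction_a(choices, window=40):
    """Fraction of 'A' presses among the last `window` trials, per trial.

    Early trials use however many choices are available so far.
    """
    is_a = np.array([c == "A" for c in choices], dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(is_a)))
    out = np.empty(is_a.size)
    for t in range(is_a.size):
        start = max(0, t + 1 - window)
        out[t] = (csum[t + 1] - csum[start]) / (t + 1 - start)
    return out

# Example with a short window so the result is easy to check by hand
ratio = windowed_fraction_a(["A", "B", "A", "A"], window=2)
```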

Results
- Varying only the cumulative reward risk factor: the model was more successful as the cumulative reward risk increased
Below: cumulative risk = 0.25, personal risk = 0

Cumulative risk = 1, Personal risk = 0

Results
- Decay rates of 0.5, 1.0, and 2.0 were tested while keeping the risk factor at zero
  - The model succeeded the most at a decay rate of 2.0
  - The model had the least success at a decay rate of 0.5
  - This suggests that the most important rewards to remember are the ones in the immediate past

Decay rate = 0.5

Decay Rate = 2.0

Comparison of Results With Real Data
We compared the choices of our model with those of the tested subjects
- We cross-correlated each subject's choice vector (real data) with the choice vectors generated by our model for all of the parameter variations
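One way such a cross-correlation might be computed is sketched below in Python with NumPy. The slides do not state how choices were encoded or normalized, so the +1/-1 encoding, the mean removal, and the normalization are all assumptions, and the function names are hypothetical.

```python
import numpy as np

def encode(choices):
    """Assumed encoding: A -> +1, B -> -1."""
    return np.array([1.0 if c == "A" else -1.0 for c in choices])

def cross_correlation(subject_choices, model_choices):
    """Normalized cross-correlation of two choice sequences at all lags.

    For equal-length inputs of length N, the zero-lag value is at index N - 1.
    """
    s = encode(subject_choices)
    m = encode(model_choices)
    s = s - s.mean()
    m = m - m.mean()
    denom = np.sqrt(np.sum(s ** 2) * np.sum(m ** 2))
    if denom == 0:
        # A constant sequence has no variance; correlation is undefined, return zeros.
        return np.zeros(s.size + m.size - 1)
    return np.correlate(s, m, mode="full") / denom

subject = ["A", "B", "A", "B"]
cc_same = cross_correlation(subject, subject)
zero_lag = cc_same[len(subject) - 1]
```

Under this normalization, identical sequences give a zero-lag value of 1 and exactly opposite sequences give -1.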

Comparison of Results With Real Data
- We observed a strong correlation between our subjects and the models with very high personal risk factors and very low personal risk factors (below: personal risk = 0, cumulative risk = 0)

Comparison of Results With Real Data
- For the cumulative reward risk parameter, we found that as it increased, with personal risk constant at zero, the correlation improved (below: cumulative risk = 0.25)

Cumulative risk = 1, Personal risk = 0

Comparison of Results With Real Data
- Changing the decay rate in the model didn't appear to affect the correlation between model-generated and subject-generated data (below: decay rate = 1)

Decay rate = 2