Graphical Multiagent Models
Quang Duong
Computer Science and Engineering
Chair: Michael P. Wellman

Example: Election in the City of AA
(Diagram: residents hold political discussions and then vote; May, a political analyst, has access to phone surveys, demographic information, party registration, …)

Modeling Objectives
Construct a model that takes into account people (agent) interactions (graph edges) in:
– representing the joint probability of all vote outcomes (vote Republican or Democrat?)
– computing marginal and conditional probabilities

Modeling Objectives (cont.)
Generate predictions:
– individual actions, and dynamic behavior induced by individual decisions
– detailed or aggregate

More Applications of Modeling Multiagent Behavior
– Financial institutions
– Computer networks / the Internet
– Social networks

Challenges: Uncertainty (from the system modeler's perspective)
1a. Agent choice: vote for a personal favorite or conform with others?
1b. Correlation: will the historic district of AA unanimously pick one candidate to support?
1c. Interdependence: May does not know all friendship relations in AA.

Challenges: Complexity
2a. Representation and inference: the number of action configurations (all vote outcomes) is exponential in the number of agents (people).
2b. Historical information: people may change their minds about whom to vote for after discussions.

Existing Approaches That This Work Builds On
– Game-theoretic approach: assumes game structure and perfect rationality
– Statistical modeling approach: aggregates statistical measures and makes simplifying assumptions

Approach Outline
Graphical Multiagent Models (GMMs) are probabilistic graphical models designed to:
– facilitate expression of different knowledge sources about agent reasoning and capture correlated behaviors (addressing uncertainty)
– while exploiting dependence structure (addressing complexity)

Roadmap
(Ch. 2) Background
(Ch. 3) GMM (static)
(Ch. 4) History-Dependent GMM
(Ch. 5) Learning Dependence Graph Structure
(Ch. 6) Application: Information Diffusion

Multiagent Systems
n agents {1, …, i, …, n}
Agent i chooses action a_i; the joint action (action configuration) of the system is a = (a_1, …, a_n).
In dynamic settings:
– time period t, time horizon T
– history H^t with history horizon h: H^t = (a^{t-h}, …, a^{t-1})
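A minimal sketch of this notation in code (container choices are mine, not the thesis'):

```python
from collections import deque

# A joint action is a tuple a = (a_1, ..., a_n); a history H^t holds the
# last h joint actions, with old entries rolling off the window.
n, h = 4, 3
a = (0, 1, 1, 0)          # one joint action for n = 4 agents
H = deque([a], maxlen=h)  # H^t = (a^{t-h}, ..., a^{t-1})
```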

Game Theory
Each player (agent i) chooses a strategy (action a_i); a strategy profile is the joint action a of all players.
Payoff function: u_i(a_i, a_{-i})
Player i's regret ε_i(a): the maximum gain player i could obtain by choosing some strategy a_i' instead of a_i, given that everyone else fixes their strategies.
a* is a Nash equilibrium (NE) if for every player i, regret ε_i(a*) = 0.
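These definitions translate directly into code. A minimal sketch (the payoff matrices below are the standard prisoner's dilemma, used here only as an illustration):

```python
import numpy as np

# u[i][a] is player i's payoff at joint action a (a tuple of action indices).

def regret(u, a, i, n_actions):
    """Max gain for player i from a unilateral deviation away from a."""
    best = max(u[i][a[:i] + (ai,) + a[i+1:]] for ai in range(n_actions[i]))
    return best - u[i][a]

def is_pure_nash(u, a, n_actions):
    """a is a pure-strategy Nash equilibrium iff every player's regret is 0."""
    return all(regret(u, a, i, n_actions) == 0 for i in range(len(u)))

# Prisoner's dilemma: action 0 = cooperate, 1 = defect.
u = [np.array([[3, 0], [5, 1]]), np.array([[3, 5], [0, 1]])]
print(is_pure_nash(u, (1, 1), [2, 2]))  # True: mutual defection
print(is_pure_nash(u, (0, 0), [2, 2]))  # False: each player regrets cooperating
```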

Graphical Representations of Multiagent Systems
1. Graphical game models [Kearns et al. '01]: an agent's payoff depends on the strategies chosen by itself and its neighbors J_i. Payoff/utility: u_i(a_i, a_{J_i})
Similar approaches: multiagent influence diagrams (MAIDs) [Koller & Milch '03], networks of influence diagrams [Gal & Pfeffer '08], action-graph games [Jiang et al. '11].

Graphical Representations (cont.)
2. Probabilistic graphical models: Markov random fields (static) [Kindermann & Snell '80, Koller & Friedman '09]; dynamic Bayesian networks [Kanazawa & Dean '89, Ghahramani '98]

This Work
Demonstrates and examines the benefits of applying probabilistic graphical models, building on and incorporating game models, to the problem of modeling multiagent behavior in scenarios with different sets of assumptions and information available to the system modeler.

Roadmap
(Ch. 2) Background
(Ch. 3) GMM (static): 1. Overview 2. Examples 3. Knowledge Combination 4. Empirical Study
(Ch. 4) History-Dependent GMM
(Ch. 5) Learning Dependence Graph Structure
(Ch. 6) Application: Information Diffusion

Graphical Multiagent Models (GMMs) [Duong, Wellman & Singh '08]
– Nodes: agents
– Edges: dependencies among agent actions, defining a dependence neighborhood N_i for each agent i

GMMs
Pr(a) ∝ Π_i π_i(a_{N_i})
The joint probability distribution over the system's actions factors into neighborhood potentials, where π_i(a_{N_i}) is the potential of neighborhood i's joint actions.
(Markov random field for graphical games [Daskalakis & Papadimitriou '06])
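A minimal sketch of this factorization (my own encoding, not the thesis implementation), evaluated by brute force for a small binary-action system; marginals and conditionals can then be read off the normalized table:

```python
import itertools
import numpy as np

def gmm_joint(n_agents, neighborhoods, potentials, n_actions=2):
    """neighborhoods[i]: indices in N_i (including i); potentials[i]: maps
    the sub-profile a_{N_i} to a positive weight."""
    configs = list(itertools.product(range(n_actions), repeat=n_agents))
    weights = np.array([
        np.prod([potentials[i](tuple(a[j] for j in neighborhoods[i]))
                 for i in range(n_agents)])
        for a in configs])
    return configs, weights / weights.sum()

# Example: 3 voters on a line with a conformity potential favoring agreement.
nbrs = [(0, 1), (0, 1, 2), (1, 2)]
pots = [lambda sub: np.exp(sum(x == sub[0] for x in sub))] * 3
configs, probs = gmm_joint(3, nbrs, pots)
print(max(zip(probs, configs)))  # unanimous profiles are most likely
```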

Example GMMs
– Markov random field for computing pure-strategy Nash equilibria
– Markov random field for computing correlated equilibria
– Information diffusion GMMs [Ch. 6]
– Regret GMMs [Ch. 3]

Examples: Regret Potential
Assume a graphical game with regret ε_i(a_{N_i}):
π_i(a_{N_i}) = exp(−λ ε_i(a_{N_i}))
Illustration: assume an agent prefers Republican to Democrat (fixing others' choices).
– Near-zero λ: the agent picks randomly
– Larger λ: the agent is more likely to pick Republican
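A sketch of this potential for a binary-action neighborhood (`payoff` is a hypothetical stand-in for the graphical game's local payoff function; index 0 of the sub-profile is agent i's own action):

```python
import numpy as np

def regret_potential(payoff, lam):
    """pi_i(a_{N_i}) = exp(-lam * eps_i(a_{N_i}))."""
    def pot(sub):
        best = max(payoff((ai,) + sub[1:]) for ai in (0, 1))
        return np.exp(-lam * (best - payoff(sub)))
    return pot

# lam near 0: all sub-profiles weighted ~equally (random play);
# large lam: weight concentrates on regret-free (best-response) profiles.
```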

Flexibility: Knowledge Combination
Assume known graph structures, and two GMMs G_1 and G_2 that represent two different knowledge sources (e.g., a regret GMM reG and a heuristic rule-based GMM hG). They can be combined into a final GMM finalG by:
1. Direct update
2. Opinion pool
3. Mixing data
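A minimal sketch of one combination mechanism, the opinion pool (the exact pooling form used in the thesis is not reproduced here): combine the action distributions induced by two input GMMs over the same joint-action space.

```python
import numpy as np

def linear_pool(p1, p2, w=0.5):
    """Weighted arithmetic average of two distributions."""
    return w * p1 + (1 - w) * p2

def log_pool(p1, p2, w=0.5):
    """Weighted geometric mean, renormalized (logarithmic opinion pool)."""
    q = p1 ** w * p2 ** (1 - w)
    return q / q.sum()

# e.g. pooling predictions from a regret GMM (reG) and a heuristic GMM (hG):
p_reg, p_heu = np.array([0.7, 0.3]), np.array([0.4, 0.6])
print(linear_pool(p_reg, p_heu), log_pool(p_reg, p_heu))
```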

Example Domain: Technology Upgrade [Krebs '02]
– Node: company; edge: partnership
– Action: upgrade or retain existing technology
– Ground truth: a reinforcement learning process (derived from both reG and hG)

Empirical Study: Findings
1. Combining knowledge sources in one GMM improves predictions.
2. Combined models fail to improve on input models when the input does not capture any underlying behavior.
(Metric: ratio > 1 means the combined model outperforms the input model. Panels: mixing-data GMM vs. regret GMM; mixing-data GMM vs. heuristic GMM.)

Empirical Results
Combining knowledge sources in one GMM improves predictions.
(ratio > 1: the combined model performs better than the input model)

Empirical Results (cont.)
Combined models fail to improve on input models when the input does not capture any underlying behavior.
(Input D & test D': similar behavior; input E & test E': similar behavior)

Summary of Contributions (Ch. 3)
(I.A) GMMs accommodate expressions of different knowledge sources.
(I.B) This flexibility allows the combination of models for improved predictions.

Roadmap
(Ch. 2) Background
(Ch. 3) GMM (static)
(Ch. 4) History-Dependent GMM: 1. Consensus Dynamics 2. Description 3. Joint vs. Individual Behavior 4. Empirical Study
(Ch. 5) Learning Dependence Graph Structure
(Ch. 6) Application: Information Diffusion

Example: Consensus Dynamics [Kearns et al. '09]
An abstracted version of the AA mayor election example: examine agents' ability to make collective decisions with limited communication and observation.
(Figure: per-agent payoffs for blue consensus, red consensus, or neither; the observation graph from agent 1's perspective)

Network structure here plays a large role in determining the outcomes.
(Figure: example runs over time)

Modeling Multiagent Behavior in the Consensus Dynamics Scenario
Given time-series action data + an observation graph:
1. predict detailed actions, or
2. predict aggregate measures

History-Dependent Graphical Multiagent Models (hGMMs) [Duong, Wellman, Singh & Vorobeychik '10]
We condition actions on an abstracted history H^t.
Note: dependence graphs can differ from observation graphs.

hGMMs
(Undirected) within-time edges capture dependencies between agent actions in the same time period, and define a dependence neighborhood N_i for each agent i: a GMM at every time t.

hGMMs (cont.)
(Directed) across-time edges capture dependencies of agent i's action on some abstraction of prior actions by agents in i's conditioning set Γ_i. Example abstraction: a frequency function.

hGMMs
Pr(a^t | H^t) ∝ Π_i π_i(a^t_{N_i} | H^t_{Γ_i})
The joint distribution over the system's actions at time t factors into potentials of each neighborhood's joint actions at t, conditioned on the (abstracted) history of the conditioning set.
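A sketch of one hGMM time step under assumed encodings (mine, not the thesis'), where history enters only through an abstraction, here the frequency of each action among agents in i's conditioning set over the recent window:

```python
import itertools
import numpy as np

def action_frequency(history, cond_set, action):
    """Fraction of (agent, period) picks in the window equal to `action`."""
    picks = [a[j] for a in history for j in cond_set]
    return picks.count(action) / len(picks) if picks else 0.5

def hgmm_step(n_agents, neighborhoods, cond_sets, potentials, history):
    """potentials[i](sub_profile, freq) -> positive weight."""
    configs = list(itertools.product((0, 1), repeat=n_agents))
    weights = np.array([
        np.prod([
            potentials[i](tuple(a[j] for j in neighborhoods[i]),
                          action_frequency(history, cond_sets[i], a[i]))
            for i in range(n_agents)])
        for a in configs])
    return configs, weights / weights.sum()
```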

Challenge: Dependence
Conditioning on the complete history renders agents' actions conditionally independent; abstracting/summarizing the history induces dependence among actions in the same time step.
(Figure: timelines t−2, t−1, t, with full versus abstracted history)

Individual vs. Joint Behavior Models
Given complete history, autonomous agents' behaviors are conditionally independent. Individual behavior models: π_i(a^t_i | H^t_{Γ_i, complete})
Joint behavior models allow specifying any action dependence within an agent's within-time neighborhood, given some (abstracted) history: π_i(a^t_{N_i} | H^t_{Γ_i, abstracted})

Empirical Study: Summary
Evaluation: compare joint behavior and individual behavior models by likelihood of test data (time-series votes). The observation graph defines both the dependence neighborhoods N and the conditioning sets Γ.
1. Joint behavior models outperform individual behavior models for shorter history lengths, which induce more action dependence.
2. Approximation does not deteriorate performance.

Summary of Contributions (Ch. 4)
(II.A) hGMMs support inference about system dynamics.
(II.B) hGMMs allow the specification of action dependence emerging from history abstraction.

Roadmap
(Ch. 2) Background
(Ch. 3) GMM (static)
(Ch. 4) History-Dependent GMM
(Ch. 5) Learning Dependence Graph Structure: 1. Learning Graphical Game Models 2. Learning hGMMs
(Ch. 6) Application: Information Diffusion

Learning Graphical Game Models [Duong, Vorobeychik, Singh & Wellman '09]
Payoff dependence (graphical games) ≠ probabilistic dependence (GMMs). Example: u_1(a_1, a_2, a_3, a_5, a_6)
We do not directly observe the underlying graph structures, only strategy profiles and their corresponding payoffs.

Contributions (Ch. 5.1)
(III.A) The learning problem's definition, theoretical characteristics, and evaluation metrics are formally introduced and formulated.
(III.B) An evaluation of structure learning algorithms reveals that a greedy approach often offers the best time-performance tradeoff.

Learning History-Dependent Graphical Multiagent Models
Objective: given action data + an observation graph, build a model that predicts:
– detailed actions in the next period
– aggregate measures of actions in the more distant future
Challenge: learn the dependence graph
– the (within-time) dependence graph ≠ the observation graph
– complexity of the dependence graph

Consensus Dynamics Joint Behavior Model
Extended Joint Behavior hGMM (eJCM):
π_i(a_{N_i} | H^t_{Γ_i}) = r_i(a_{N_i}) · f(a_i, H^t_{Γ_i})^γ · I(a_i, H^t_i)^β
1. r_i(a_{N_i}): reward for action a_i, discounted by the number of dissenting neighbors in N_i
2. f(a_i, H^t_{Γ_i}): frequency of a_i chosen previously by agents in the conditioning set Γ_i
3. I(a_i, H^t_i): inertia, proportional to how long i has maintained its most recent action
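A sketch of this potential under assumed encodings (the per-dissenter discount factor below is my illustration, not the thesis' exact form):

```python
def ejcm_potential(a_i, neighbor_actions, base_reward, freq, inertia,
                   gamma, beta, discount=0.5):
    """pi_i = r_i(a_{N_i}) * f(a_i, H)^gamma * I(a_i, H)^beta.
    base_reward: reward for a_i; freq: window frequency of a_i in Gamma_i;
    inertia: number of periods i has held its current action."""
    dissenters = sum(a != a_i for a in neighbor_actions)
    r = base_reward * discount ** dissenters  # discount per dissenting neighbor
    return r * freq ** gamma * inertia ** beta
```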

Consensus Dynamics Individual Behavior Models
1. Extended Individual Behavior hGMM (eICM): similar to eJCM, but assumes N_i contains i only:
π_i(a_i | H^t_{Γ_i}) = Pr(a_i | H^t_{Γ_i}) ∝ r_i(a_i) · f(a_i, H^t_{Γ_i})^γ · I(a_i, H^t_i)^β
2. Proportional Response Model (PRM) [Kearns et al. '09]: only incorporates the most recent time period:
Pr(a_i | H^t_{Γ_i}) ∝ r_i(a_i) · f(a_i, H^t_{Γ_i})
3. Sticky Proportional Response Model (sPRM)
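A sketch of the PRM choice rule as displayed above, Pr(a_i) ∝ r_i(a_i) · f(a_i, H); the reward values in the example are hypothetical:

```python
import numpy as np

def prm_distribution(rewards, freqs):
    """Probabilities proportional to reward times recent action frequency."""
    w = np.array(rewards) * np.array(freqs)
    return w / w.sum()

# e.g. reward 1.0 for red and 0.8 for blue; 60% of recently observed
# neighbors chose red:
print(prm_distribution([1.0, 0.8], [0.6, 0.4]))  # red is favored
```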

Learning hGMMs
Input: observation graph
Search space: 1. model parameters γ, β; 2. within-time edges
Output: hGMM
Objective: likelihood of the data
Constraint: maximum node degree

Greedy Learning
Initialize the graph with no edges.
Repeat: add the edge that generates the biggest increase (> 0) in the training data's likelihood,
until no edge can be added without violating the maximum node degree constraint.
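A sketch of this greedy search with the scoring step abstracted away (`log_likelihood(edges)` is assumed to fit the hGMM parameters for a given edge set and return the training data's log-likelihood):

```python
import itertools

def greedy_learn(n_agents, max_degree, log_likelihood):
    edges, degree = set(), [0] * n_agents
    best = log_likelihood(edges)
    while True:
        candidates = [
            (i, j) for i, j in itertools.combinations(range(n_agents), 2)
            if (i, j) not in edges
            and degree[i] < max_degree and degree[j] < max_degree]
        scored = [(log_likelihood(edges | {e}), e) for e in candidates]
        if not scored:
            break
        score, e = max(scored)
        if score <= best:  # stop when no edge strictly improves the likelihood
            break
        edges.add(e)
        degree[e[0]] += 1
        degree[e[1]] += 1
        best = score
    return edges
```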

Empirical Study: Learning from Human-Subject Data
Use asynchronous human-subject data, varying the following environment parameters:
– discretization interval δ (0.5 and 1.5 seconds)
– history length h
– graph structures/payoff functions: coER_2, coPA_2, and power22 (strongly connected minority)
Goal: evaluate eJCM, eICM, PRM, and sPRM using two metrics:
– negative likelihood of agents' actions
– convergence rates/outcomes

Predicting Dynamic Behavior
eJCMs and eICMs outperform the existing PRMs/sPRMs. eJCMs predict actions in the next time period noticeably more accurately than PRMs and sPRMs, and (statistically significantly) more accurately than eICMs.

Predicting Consensus Outcomes
eJCMs have prediction performance comparable to the other models in two settings: coER_2 and coPA_2. In power22, eJCMs predict consensus probability and colors much more accurately.

Graph Analysis
In learned graphs, intra-group edges far outnumber inter-group edges. In power22, a large majority of edges are intra-red, identifying the presence of a strongly connected red minority.

Summary of Contributions (Ch. 5.2)
(II.B) [revisited] This study highlights the importance of joint behavior modeling.
(III.C) It is feasible to learn both dependence graph structure and model parameters.
(III.D) Learned dependence graphs can be substantially different from observation graphs.

Modeling Multiagent Systems: Step by Step
(Diagram: for GMMs and hGMMs, the dependence graph structure and potential functions can each be given as input, learned from data, drawn from intuition and background information, or approximated from the observation graph structure)

Roadmap
(Ch. 2) Background
(Ch. 3) GMM (static)
(Ch. 4) History-Dependent GMM
(Ch. 5) Learning Dependence Graph Structure
(Ch. 6) Application: Information Diffusion: 1. Definition 2. Joint Behavior Modeling 3. Learning Missing Edges 4. Experiments

Networks with Unobserved Links
Links govern how information diffuses from one node to another; real-world nodes have links unobserved by third parties.
(Figure: true network G* vs. observed network G)

Problem [Duong, Wellman & Singh '11]
Given: 1. a network G (with missing links) and 2. diffusion traces (snapshots of the network's states over time, generated on G*).
Objective: model information diffusion on this network.

Approach 1: Structure Learning
Recover missing edges: learn a network G', then learn the parameters of an individual behavior model built on G'.
Learning algorithms: NetInf [Gomez-Rodriguez et al. '10] and MaxInf.

Approach 2: Potential Learning
Construct an hGMM on G without recovering missing links: hGMMs can capture state correlations between neighbors who appear disconnected in the input network.
– Theoretical evidence [Sec. 6.3.2]
– Empirical illustrations: hGMMs outperform individual behavior models on learned graphs, for random graphs with sufficient training data and for preferential attachment graphs (varying amounts of data)

Summary of Contributions (Ch. 6)
(II.C) Joint behavior hGMMs can capture state dependence caused by missing edges.

Conclusions
1. The machinery of probabilistic graphical models helps improve modeling in multiagent systems by:
– allowing the representation and combination of different knowledge sources about agent reasoning
– relaxing assumptions about action dependence (which may be a result of history abstraction or missing edges)
2. One can learn from action data both (i) model parameters and (ii) dependence graph structure, which can differ from the interaction/observation graph structure.

Conclusions (cont.)
3. The GMM framework contributes to the integration of:
– strategic behavior modeling techniques from AI and economics
– probabilistic models from statistics that can efficiently extract behavior patterns from massive amounts of data
toward the goal of understanding fast-changing and complex multiagent systems.

Summary
– Graphical multiagent models: flexibility to represent different knowledge sources and combine them [UAI '08]
– History-dependent GMMs: capture dependence in dynamic settings [AAMAS '10, AAMAS '12]
– Learning graphical game models [AAAI '09]
– Learning hGMM dependence graphs, distinguishing observation/interaction graphs from probabilistic dependence graphs [AAMAS '12]
– Modeling information diffusion in networks with unobserved links [SocialCom '11]

Acknowledgments
Advisor: Professor Michael P. Wellman
Committee members: Prof. Satinder Singh Baveja, Prof. Edmund H. Durfee, and Asst. Prof. Long Nguyen
Research collaborators: Yevgeniy Vorobeychik (Sandia Labs), Michael Kearns (U Penn), Gregory Frazier (Apogee Research), David Pennock and others (Yahoo/Microsoft Research)
Undergraduate advisor: David Parkes
Family, friends, and CSE staff

THANK YOU!