1
Applications of Multi-Agent Learning in E-Commerce and Autonomic Computing Jeff Kephart IBM Research kephart@us.ibm.com December 14, 2002
2
Two broad application areas
E-commerce
– Large-scale competitive MAS
– Billions of economically motivated agents buying and selling information goods and services
– Adaptive, and coupled directly and indirectly (through markets)
– http://www.research.ibm.com/infoecon
Autonomic computing
– Large-scale cooperative or competitive MAS
– Self-managing computing systems: self-configuring, self-healing, self-optimizing, self-protecting
– http://www.research.ibm.com/autonomic (IEEE Computer, January 2003)
3
The Future Information Economy Billions of interacting, adaptive agents. What emergent behaviors will arise?
4
Experiment: Bidding Agents vs. Humans (Continuous Double Auction)
The CDA is common in financial markets
Extensive prior literature
– All-human experiments (Vernon Smith)
– All-agent experiments (SFI DA, Gode-Sunder, Cliff, TAC)
[Figure: market history chart of price vs. time, showing bids, asks, and trades]
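For concreteness, here is a minimal sketch of continuous double auction clearing in Python. The Order/CDA classes and the trade-at-the-ask convention are illustrative assumptions, not the experimental platform used in the talk.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Order:
    key: float                          # heap key: -price for bids, +price for asks
    price: float = field(compare=False)
    trader: str = field(compare=False)

class CDA:
    """Continuous double auction: orders arrive one at a time and a trade
    occurs immediately whenever the best bid meets or crosses the best ask."""
    def __init__(self):
        self.bids, self.asks = [], []   # max-heap (via negated key) / min-heap

    def submit(self, trader, side, price):
        book = self.bids if side == "bid" else self.asks
        key = -price if side == "bid" else price
        heapq.heappush(book, Order(key, price, trader))
        return self._try_match()

    def _try_match(self):
        if self.bids and self.asks and self.bids[0].price >= self.asks[0].price:
            bid, ask = heapq.heappop(self.bids), heapq.heappop(self.asks)
            return (bid.trader, ask.trader, ask.price)   # trade at the ask (one convention)
        return None

market = CDA()
market.submit("buyer1", "bid", 40)           # no match yet -> None
print(market.submit("seller1", "ask", 35))   # ('buyer1', 'seller1', 35)
```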
5
Watson Experimental Economics Lab (photo copyright The New York Times)
6
Seller GUI (CDA)
[Screenshot: trading interface showing limit prices, the bid queue, the ask queue, and a Submit button]
7
Bidding Agent Architecture
[Diagram: a Bidding Agent composed of a Brain (wakeup?, compute order, place order), a Bookkeeper holding market state and agent state, and a Message Handler exchanging auction info and orders with the Auctioneer; an Agent GUI is attached]
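A rough sketch of this agent loop follows. The stub auctioneer and the placeholder bidding rule are invented; none of the actual strategies (ZIP, GD, MGD, ...) is implemented here.

```python
import random

class StubAuctioneer:
    """Stand-in for the auctioneer agent; just records submitted orders."""
    def submit(self, agent, side, price):
        print(f"{side} order placed at {price:.2f}")

class BiddingAgent:
    """Sketch of the wakeup / compute-order / place-order cycle on the slide."""
    def __init__(self, limit_price, auctioneer):
        self.limit_price = limit_price                        # agent state
        self.auctioneer = auctioneer                          # reached via the message handler
        self.market = {"best_bid": None, "best_ask": None}    # market state (bookkeeper)

    def on_market_update(self, best_bid, best_ask):
        # Bookkeeper: maintain a local summary of auction info
        self.market.update(best_bid=best_bid, best_ask=best_ask)

    def wakeup(self):
        # Brain: compute an order from market state and agent state
        # (placeholder rule: randomly shade the limit price)
        bid = self.limit_price * random.uniform(0.8, 1.0)
        best_ask = self.market["best_ask"]
        if best_ask is not None and best_ask <= bid:
            bid = best_ask                                    # take a profitable ask
        self.place_order(bid)

    def place_order(self, price):
        # Message handler: send the order to the auctioneer
        self.auctioneer.submit(self, "bid", price)

agent = BiddingAgent(limit_price=50.0, auctioneer=StubAuctioneer())
agent.on_market_update(best_bid=42.0, best_ask=47.0)
agent.wakeup()
```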
8
Agent-Human experiments
Human subjects
– Recruited from local colleges and IBM Research
– Given interactive instructions and a test
– Paid in proportion to surplus
Setup
– 6 humans, 6 agents; 6 buyers, 6 sellers
– Each agent shares limit prices with a human
Experiment
– 9 to 16 three-minute periods
– Limit prices change every 3-5 periods
– Record bids, asks, trades
9
Experiment #6: Fast GD vs. Humans
10
Summary of experimental results
– Agents won by substantial margins in all experiments: ~20% more surplus than novice humans, ~5-7% more than experienced humans
– Agents and humans interact with one another (not two decoupled markets): ~30-50% of trades are agent-human
– Market efficiency improves with the number of agents; humans fare better when there are more agents
– Agents can supplant humans as economic decision makers
11
All-agent experiments
Simulator
– Discrete-time; stochastic asynchronous dynamics
– Ran mixtures of several strategies and variants
Market parameters
– 10 buyers, 10 sellers, 10 units each; fixed limit prices (chosen randomly)
– 100 experiments, 5 trading periods per experiment, 300 time steps per period
Experimental comparisons
– Homogeneous (0 A vs. 20 B)
– One-in-Many tests (1 A vs. 19 B)
– Balanced Team tests (10 A vs. 10 B)
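A sketch of how this experimental design could be organized in code: the trading logic itself is stubbed out with random surplus, so only the 100-experiment, 5-period, 300-step harness and the three population mixtures are shown.

```python
import random

def run_period(strategies, n_steps=300):
    """Stub for one 300-step trading period; returns surplus per trader.
    A real run would wake a random trader each step and clear a CDA book;
    here surplus is random so only the harness structure is visible."""
    return [random.random() for _ in strategies]

def experiment(n_a, strategy_a, strategy_b, n_traders=20,
               n_expts=100, periods_per_expt=5):
    """Total surplus earned by strategy A vs. strategy B over the whole design."""
    strategies = [strategy_a] * n_a + [strategy_b] * (n_traders - n_a)
    total_a = total_b = 0.0
    for _ in range(n_expts):
        for _ in range(periods_per_expt):
            surplus = run_period(strategies)
            total_a += sum(s for st, s in zip(strategies, surplus) if st == strategy_a)
            total_b += sum(s for st, s in zip(strategies, surplus) if st == strategy_b)
    return total_a, total_b

# The three comparison types from the slide:
print(experiment(0,  "GDX", "ZIP"))    # homogeneous:    0 A vs. 20 B
print(experiment(1,  "GDX", "ZIP"))    # one-in-many:    1 A vs. 19 B
print(experiment(10, "GDX", "ZIP"))    # balanced team: 10 A vs. 10 B
```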
12
Expt. 2: 1 A vs. 19 B (Differential Efficiency)
– ZIP and MGD invade ZI, Kaplan & GD, but don't invade one another
– Kaplan can invade all strategies
– All strategies invade ZI; ZI doesn't invade any
13
Expt. 3: 10 A vs. 10 B (Differential Surplus, out of 2612 total)
– ZI beats Kaplan 100-0!
– Other strategies beat Kaplan, but by a smaller margin
– Overall ordering: GDX > MGD > ZIP > GD > ZI > Kaplan
14
Evolutionary dynamics: CDA game
– What happens when agents gradually switch to more successful strategies?
– No strategy is dominant
– This is a useful view for mechanism design and agent design
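One standard way to model agents "gradually switching to more successful strategies" is replicator dynamics over the meta-game payoff matrix. A minimal sketch follows; the 3x3 payoff matrix is invented for illustration and is not the measured CDA meta-game.

```python
import numpy as np

def replicator_step(x, payoff, dt=0.01):
    """One Euler step of replicator dynamics:
    dx_i/dt = x_i * ((payoff @ x)_i - average fitness)."""
    fitness = payoff @ x
    avg = x @ fitness
    return x + dt * x * (fitness - avg)

# Hypothetical 3-strategy meta-game payoffs (row strategy vs. column strategy).
# NOT the measured CDA numbers from the talk.
A = np.array([[2.0, 3.0, 1.0],
              [1.0, 2.0, 4.0],
              [3.0, 1.0, 2.0]])

x = np.array([0.4, 0.3, 0.3])                # initial strategy mixture
for _ in range(5000):
    x = replicator_step(x, A)
    x = np.clip(x, 0, None); x /= x.sum()    # stay on the simplex
print(x)                                     # strategy mixture after 5000 Euler steps
```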
15
Dynamic pricing game Shopbots and pricebots
16
Dynamic Pricing Meta-payoff matrix
17
Evolutionary dynamics: dynamic pricing game
[Figures: Nash equilibrium computation; evolutionary dynamics for 5 agents and for 20 agents]
18
Dynamic pricing game Shopbots and pricebots
19
A cure for myopia
20
Price response curves Symmetric and Asymmetric solutions Higher efficiency, but unstable!
21
Sequential Learning Limit Cycle
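The limit cycle arises because each myopically optimizing pricebot repeatedly undercuts its competitor until price approaches cost, then resets near the buyers' reservation price. A minimal sketch reproducing this price-war cycle; the cost, reservation price, shopbot share, and sequential-update rule are illustrative assumptions.

```python
# Two pricebots take turns setting the myopic best response to the other's
# current price. A fraction of buyers uses a shopbot (buys from the cheaper
# seller); the rest buy from a random seller. All numbers are illustrative.
COST = 0.2            # unit cost
P_MAX = 1.0           # buyers' reservation price
SHOPBOT_SHARE = 0.75  # fraction of buyers who compare prices
EPS = 0.01            # smallest price decrement

def profit(own_price, other_price):
    if own_price > P_MAX or own_price < COST:
        return 0.0
    random_buyers = (1 - SHOPBOT_SHARE) / 2
    shoppers = SHOPBOT_SHARE if own_price < other_price else (
        SHOPBOT_SHARE / 2 if own_price == other_price else 0.0)
    return (own_price - COST) * (random_buyers + shoppers)

def best_response(other_price):
    # either price at the reservation level, or undercut the rival slightly
    candidates = [P_MAX, max(COST, other_price - EPS)]
    return max(candidates, key=lambda p: profit(p, other_price))

prices = [0.9, 1.0]
for t in range(30):
    i = t % 2                                  # sellers update sequentially
    prices[i] = best_response(prices[1 - i])
    print(t, [round(p, 2) for p in prices])
# Prices ratchet down toward cost, then jump back up: a price-war limit cycle.
```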
22
Autonomic Computing: self-managing computing systems
Administration of individual systems is increasingly difficult
– 100s of configuration and tuning parameters for DB2, WebSphere
Heterogeneous systems are becoming increasingly connected; integration is becoming ever more difficult
– Architects can't intricately plan interactions among components
– Increasingly dynamic; more frequently with unanticipated components
– More of the burden must be assumed at run time
But human system administrators can't assume the burden
– Already a 6:1 cost ratio between storage administration and the storage itself
– 40% of outages are due to operator error
We need self-managing computing systems
– Behavior specified by sysadmins via high-level policies
– The system and its components figure out how to carry out the policies
23
Evolving towards self-management: Today vs. the Autonomic Future
Self-configure
– Today: corporate data centers are multi-vendor, multi-platform; installing, configuring, and integrating systems is time-consuming and error-prone.
– Autonomic future: automated configuration of components and systems according to high-level policies; the rest of the system adjusts automatically. Seamless, like adding a new cell to a body or a new individual to a population.
Self-heal
– Today: problem determination in large, complex systems can take a team of programmers weeks.
– Autonomic future: automated detection, diagnosis, and repair of localized software/hardware problems.
Self-optimize
– Today: WebSphere and DB2 have hundreds of nonlinear tuning parameters; many new ones with each release.
– Autonomic future: components and systems continually seek opportunities to improve their own performance and efficiency.
Self-protect
– Today: manual detection and recovery from attacks and cascading failures.
– Autonomic future: automated defense against malicious attacks or cascading failures; use early warning to anticipate and prevent system-wide failures.
24
Autonomic element structure
Fundamental atom of the architecture
– Managed element(s): database, storage system, etc.
– Plus one autonomic manager
Responsible for:
– Providing its service
– Managing its own behavior in accordance with policies
– Interacting with other autonomic elements
[Diagram: an autonomic element, i.e. an autonomic manager (monitor, analyze, plan, execute around shared knowledge) wrapped around a managed element and connected to it through sensors and effectors]
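The monitor-analyze-plan-execute loop around shared knowledge can be sketched as follows; the managed element, its response-time model, and the "grow the buffer pool by 25%" policy are invented for illustration.

```python
import random

class ManagedElement:
    """Stand-in for a managed resource (e.g. a database) with sensors and effectors."""
    def __init__(self):
        self.buffer_pool_mb = 256
    def sense(self):
        # invented model: response time falls as the buffer pool grows
        return {"response_time_ms": random.gauss(2000 / self.buffer_pool_mb ** 0.5, 20)}
    def effect(self, buffer_pool_mb):
        self.buffer_pool_mb = buffer_pool_mb

class AutonomicManager:
    """Monitor -> Analyze -> Plan -> Execute around a shared knowledge base."""
    def __init__(self, element, target_rt_ms=100):
        self.element = element
        self.knowledge = {"target_rt_ms": target_rt_ms, "history": []}

    def monitor(self):
        reading = self.element.sense()                       # via sensors
        self.knowledge["history"].append(reading)
        return reading

    def analyze(self, reading):
        return reading["response_time_ms"] > self.knowledge["target_rt_ms"]

    def plan(self):
        return int(self.element.buffer_pool_mb * 1.25)       # placeholder policy: grow by 25%

    def execute(self, new_size):
        self.element.effect(new_size)                        # via effectors

    def run_once(self):
        reading = self.monitor()
        if self.analyze(reading):
            self.execute(self.plan())

manager = AutonomicManager(ManagedElement())
for _ in range(5):
    manager.run_once()
print(manager.element.buffer_pool_mb, manager.knowledge["history"][-1])
```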
25
Autonomic element interactions
Relationships
– Dynamic, ephemeral
– Formed by agreement; may be negotiated
– Full spectrum: peer-to-peer, hierarchical
– Subject to policies
26
AC System and Infrastructure
[Diagram: infrastructure elements (Registry, Sentinel, Broker, Aggregator, Negotiator, Arbiter, Planner, Reputation Authority, Monitor, Event Correlator, Provisioner, Workload Manager) layered above managed resources (servers, database, storage, network)]
27
Multi-agent learning scenarios
– Designing system behavior
– Interacting feedback loops
– Negotiation and resource allocation
– Problem determination
28
Control & harness emergent behavior
Understand, control, and exploit emergent behavior in autonomic systems
How do self-* properties, stability, etc. depend on:
– Behaviors and goals of the agents
– Pattern and type of interactions among agents
– External influences and demands on the system
How do we invert this relationship to achieve system goals?
29
Interacting control/optimization loops
[Diagram: transaction requests drive a DB service (Server 1) and a file system (Server 2), each backed by storage services; rising demand triggers "increase service" responses]
– Feedback control & optimization of single autonomic elements has been done for 1-2 variables
– What happens when feedback loops interact?
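One way to see the problem is to put two independent feedback controllers on top of a single shared storage pool: each optimizes its own loop, yet their adjustments are coupled through the shared capacity. A minimal sketch; the proportional-control rule, the 100-unit pool, and the demand sequence are all invented.

```python
# Two independent feedback controllers share one pool of storage. Each sees
# only its own service's demand, so their adjustments interact through the
# shared capacity. All numbers and the control rule are illustrative.
CAPACITY = 100.0

class StorageController:
    def __init__(self, name, allocation):
        self.name, self.allocation = name, allocation

    def step(self, demand, free_pool):
        """Proportional rule: move halfway toward demand, capped by free capacity."""
        error = demand - self.allocation
        change = min(max(0.5 * error, -10.0), free_pool)
        self.allocation += change
        return change

db_ctrl = StorageController("DB service", 40.0)
fs_ctrl = StorageController("File system", 40.0)
demands = [(70, 30), (70, 60), (30, 90)]        # shifting demand for the two services

for d_db, d_fs in demands:
    free = CAPACITY - db_ctrl.allocation - fs_ctrl.allocation
    free -= db_ctrl.step(d_db, free)            # whoever adjusts first grabs the slack
    fs_ctrl.step(d_fs, max(free, 0.0))
    print(round(db_ctrl.allocation, 1), round(fs_ctrl.allocation, 1))
# The second loop can be starved even when its demand is high: the loops interact.
```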
30
Interacting control/optimization loops (2)
[Diagram, same scenario: "Capacity limit reached: get more storage"; one service asks its storage service for more capacity]
31
Interacting control/optimization loops (3)
[Diagram, same scenario: "Demand not being met: find an alternate supplier", while the original storage service is still in the middle of getting more storage]
32
Interacting control/optimization loops Transaction Requests Sorry; already found an alternative Server 1 DB Service Server 2 File System Storage Service 2 Storage Service 1 Ready to give you that extra service X
33
Negotiation and resource allocation
[Diagram: transaction requests flow to DB 1 (Server 1) and File System 1 (Server 2), which negotiate with Storage 1 and Storage 2; example messages:]
– Request(QueryService, Queries = 800/sec, Type = 2, RT = 5 sec)
– Request(QueryService, Queries = 400/sec, Type = 5, RT = 3 sec)
– Request(TableSpace, Size = 3 GBytes, Reads = 2000/sec, Writes = 100/sec)
– Request(LogicalVolume, Size = 12 GBytes, Reads = 500/sec, Writes = 500/sec)
– Counterpropose(TableSpace, Size = 3 GBytes, Reads = 1600/sec, Writes = 100/sec)
– Counterpropose(QueryService, Queries = 320/sec, Type = 5, RT = 4 sec)
Policies: utility functions; compute costs and benefits from the business contract and propagate them down
Forms of negotiation: bilateral, multilateral, auction, supply chain, competitive/cooperative
Learning: during a negotiation, strategy evolution, collective behavior?
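The request/counterpropose exchange above can be sketched in a few lines. The provider's capacity threshold (1600 reads/sec) mirrors the counterproposal on the slide, but the acceptance rule (keep at least 80% of the requested rate) and the class names are invented.

```python
from dataclasses import dataclass, replace

@dataclass
class Request:
    service: str
    size_gb: float
    reads_per_sec: float
    writes_per_sec: float

class StorageProvider:
    """Counterproposes when the requested read rate exceeds its capacity."""
    def __init__(self, max_reads_per_sec=1600):
        self.max_reads = max_reads_per_sec
    def respond(self, req: Request):
        if req.reads_per_sec <= self.max_reads:
            return ("accept", req)
        return ("counterpropose", replace(req, reads_per_sec=self.max_reads))

class DatabaseService:
    """Accepts a counterproposal if it keeps at least 80% of the requested
    read rate (a stand-in for a utility-function policy)."""
    def negotiate(self, provider, req):
        verb, offer = provider.respond(req)
        if verb == "accept":
            return offer
        if offer.reads_per_sec >= 0.8 * req.reads_per_sec:
            return offer
        return None          # utility too low: look for an alternate supplier

req = Request("TableSpace", size_gb=3, reads_per_sec=2000, writes_per_sec=100)
deal = DatabaseService().negotiate(StorageProvider(), req)
print(deal)   # counterproposal at 1600 reads/sec, accepted (0.8 * 2000 = 1600)
```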
34
Problem determination
Construct adaptive statistical models of large networked systems
– Learn about inter-element dependencies (within a locale)
– Determine model structure and parameters, e.g. a Bayes net
Monitor logs and use the model to:
– Detect potential problems
– Set up monitors as needed
– Diagnose problems
Challenge: shifting topology
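The Bayes-net idea can be illustrated with a toy, hand-rolled noisy-OR dependency model. The three-component topology, the prior failure probabilities, and the 0.9 "leak" probability are all invented; a real system would learn structure and parameters from monitoring logs.

```python
# Invented spontaneous-failure priors and inter-element dependencies.
PRIOR = {"storage": 0.02, "database": 0.01, "web_app": 0.01}
DEPENDS_ON = {"database": ["storage"], "web_app": ["database"]}
LEAK = 0.9                                   # chance a failed dependency takes the dependent down
ORDER = ["storage", "database", "web_app"]   # dependency order for propagation

def failure_prob(component, failed):
    """Noisy-OR: a component fails on its own or via any failed dependency."""
    p_ok = 1.0 - PRIOR[component]
    for dep in DEPENDS_ON.get(component, []):
        if failed.get(dep, False):
            p_ok *= (1.0 - LEAK)
    return 1.0 - p_ok

def diagnose(observed_down):
    """Score each candidate root cause: its prior if propagating the failure
    through the dependency graph reproduces the observed pattern, else 0."""
    scores = {}
    for root in PRIOR:
        failed = {root: True}
        for comp in ORDER:
            if comp not in failed:
                failed[comp] = failure_prob(comp, failed) > 0.5
        match = all(failed[c] == down for c, down in observed_down.items())
        scores[root] = PRIOR[root] if match else 0.0
    return scores

# Storage is up, but the database and the web app are both down:
print(diagnose({"storage": False, "database": True, "web_app": True}))
# -> only "database" is a consistent root cause
```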
35
Closing remarks
E-commerce: competitive, giga-agent MAS
– "Naïve" learning: apply single-agent learning and see what happens
– Analyze the outcome: compute Nash equilibria of the meta-game, study evolutionary dynamics
– Use the analysis to improve strategies or improve market mechanisms
– http://www.research.ibm.com/infoecon
Autonomic computing: cooperative or competitive MAS
– Coordinated learning is more possible
– Explicit correlation signals to reach correlated equilibria?
– Multi-agent Q learning
– Can dictate goals and strategies to agents
– http://www.research.ibm.com/autonomic (IEEE Computer, January 2003)
36
Backup slides
37
Autonomic Computing Architecture
Based on a distributed, service-oriented architectural approach (e.g., OGSA)
– Every component provides or consumes services
– Policy-based management
Autonomic elements
– Every component must be resilient, robust, self-managing
– Autonomic elements are the architecture's way of achieving this for a component
– Behavior is specified and driven by policies
Relationships between autonomic elements
– Based on agreements established and maintained by autonomic elements
– Governed by policies
– Give rise to resiliency, robustness, and self-management of the system; relationships are the architecture's way of achieving this for the system as a whole
No architectural single point of failure
38
Engineering Challenge: Conflict Resolution
[Diagram: a server element with 1000 MIPs and 1000 MBytes; Application Manager 1 requests 700 MIPs and 500 MBytes, Application Manager 2 requests 500 MIPs and 600 MBytes; the workload manager pushes to increase throughput while the intrusion detector pushes to decrease it; a network manager is also involved; policies include Priority(ID) > Priority(WLM) and Utility(AM1) > Utility(AM2)]
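A minimal arbitration sketch for this kind of conflict: the priority-then-utility ordering encodes the slide's policies Priority(ID) > Priority(WLM) and Utility(AM1) > Utility(AM2), while the intrusion detector's own resource request and the proportional scale-down rule are invented.

```python
from dataclasses import dataclass

@dataclass
class ResourceRequest:
    requester: str
    mips: int
    mbytes: int
    priority: int    # higher wins first (intrusion detector outranks workload manager)
    utility: float   # tie-breaker within a priority level

SERVER = {"mips": 1000, "mbytes": 1000}

requests = [
    ResourceRequest("ApplicationManager1", mips=700, mbytes=500, priority=1, utility=0.9),
    ResourceRequest("ApplicationManager2", mips=500, mbytes=600, priority=1, utility=0.7),
    ResourceRequest("IntrusionDetector",   mips=200, mbytes=100, priority=2, utility=0.5),
]

def arbitrate(requests, capacity):
    """Grant requests in (priority, utility) order, scaling a request down
    proportionally when the remaining capacity cannot cover it in full."""
    remaining = dict(capacity)
    grants = {}
    for r in sorted(requests, key=lambda r: (r.priority, r.utility), reverse=True):
        scale = min(1.0, remaining["mips"] / r.mips, remaining["mbytes"] / r.mbytes)
        grant = {"mips": int(r.mips * scale), "mbytes": int(r.mbytes * scale)}
        remaining["mips"] -= grant["mips"]
        remaining["mbytes"] -= grant["mbytes"]
        grants[r.requester] = grant
    return grants

print(arbitrate(requests, SERVER))   # AM2 is scaled down once the server is nearly full
```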
39
Engineering Challenge: Human-Computer Interface
Develop new languages, metaphors, and translation technologies that enable humans to:
– Monitor, visualize, and control AC systems
– Specify goals and objectives to AC systems, and visualize their potential effect
Techniques must be:
– Sufficiently expressive of preferences regarding cost vs. performance, security, risk, and reliability
– Sufficiently structured and/or naturally suited to human psychology and cognition to keep specification errors to an absolute minimum
– Robust to specification errors
40
Scientific Challenge: Multi-agent Learning
Today: lots of good practical techniques for a single agent to learn about a static agent or environment, with solid theory to back them up.
41
Scientific Challenge: Multi-agent Learning
Challenge: establish a theoretical foundation for understanding and performing learning and optimization in multi-agent systems.