1
Agents and Artificial Intelligence. Based on various tutorials and presentations; edited by V. Terziyan and the Industrial Ontologies Group.
2
Intelligent perception of the external environment, self-management, mining data and discovering knowledge about the environment, reasoning to derive new facts about it, planning one's own behavior within it, and acting on those plans are among the basic abilities of an intelligent agent. [Diagram: Agent, Environment, Knowledge and facts, Behavior, Plans]
3
The Agent Architecture: A Model. [Diagram: Sensors, Actuators, Motorics; Head: general abilities; Body: application-specific abilities]
4
Agent-Driven Self-Management. [Diagram: a Resource Agent linked to a “Device”, an “Expert” and a “Service”]
5
IBM: Autonomic Computing (1)
- The computing domain is now a vast and diverse matrix of complex software, hardware and services. By 2020 we expect billions of devices and trillions of software processes, producing enormous amounts of data. And it is not just a matter of numbers: it is the complexity of these systems, and the way they work together, that is creating a shortage of skilled IT workers able to manage them all. The problem is not going away; it will grow exponentially, just as our dependence on technology has.
- Autonomic Computing is about enabling computing systems to operate in a fully autonomous manner: no administration, just simple high-level policy statements.
- Autonomic Computing is an approach to self-managed computing systems with a minimum of human interference. The term derives from the body's autonomic nervous system, which controls key functions without conscious awareness or involvement.
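Where the slides stay at the vision level, a concrete shape of this idea is a policy-driven control loop. The sketch below is only illustrative: the monitor/analyze/plan/execute split echoes IBM's autonomic-computing literature, and all names and the toy load model are our assumptions, not from the slides.

```python
# A minimal, illustrative autonomic control loop: the managed system is
# observed, deviations from a high-level policy are detected, and a
# corrective action is planned and executed -- with no operator involved.
from dataclasses import dataclass

@dataclass
class Policy:
    max_load: float  # the only "administration" is this high-level statement

def monitor(system) -> float:
    return system["load"]          # sense the managed resource

def analyze(load: float, policy: Policy) -> bool:
    return load > policy.max_load  # does the state violate the policy?

def plan(system) -> dict:
    return {"add_workers": 1}      # choose a corrective reconfiguration

def execute(system, action: dict) -> None:
    system["workers"] += action["add_workers"]
    system["load"] /= 2            # toy model: more workers halve the load

def autonomic_step(system, policy: Policy) -> None:
    load = monitor(system)
    if analyze(load, policy):
        execute(system, plan(system))

system = {"load": 0.9, "workers": 2}
autonomic_step(system, Policy(max_load=0.7))
print(system)  # {'load': 0.45, 'workers': 3}
```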
6
IBM: Autonomic Computing (2)
7
“Self-(re)configuration” example: Professor in Software Engineering, Head of the Artificial Intelligence Department and Head of the “MetaIntelligence” Research Lab at Kharkov National University of Radioelectronics (Ukraine) … becomes … Professor in Distributed Systems and Head of the Industrial Ontologies Group in the Department of Mathematical Information Technology, University of Jyvaskyla (Finland). “Self-reconfigurable” means that the system is capable of utilizing its own system of control to change its overall structural shape.
8
Self-Configurable Systems: Invented by Hollywood?
9
Why self-(re)configuration?
- Versatility: self-reconfigurable systems are potentially more adaptive than conventional systems. The ability to reconfigure allows a system to disassemble and reassemble components to form new morphologies that are better suited for new tasks.
- Robustness: since system components are interchangeable (within a system and between different systems), self-reconfigurable systems can also replace faulty parts autonomously, leading to self-repair.
- Low cost: self-reconfigurable systems can potentially lower overall cost by making many copies of one (or relatively few) type of module, so economies of scale and mass production come into play. Also, a range of systems can be made from one set of modules, saving costs through reuse and generality.
One source of inspiration for the development of these systems comes from biological systems that self-construct out of a relatively small repertoire of lower-level building blocks (cells or amino acids, depending on the scale of interest). This architecture underlies the ability of biological systems to physically adapt, grow, heal, and even self-replicate, capabilities that would be desirable in many engineered systems.
10
Dynamic reconfigurability vs. self-configurability. Dynamic reconfigurability is reconfigurability at run-time, but not necessarily based on the system's self-awareness and intentions, and not necessarily supported by any special algorithms. Self-configurability goes one step further: the system is expected to perform the reconfiguration autonomously and deliberately.
11
What is configuration? Composition (content: components of data and capabilities), Structure (partonomy, business logic and interaction applied to the content) and Parameters (features of the content and structure).
12
What is the configuration of a self-configurable system? Everything is configurable!
13
What is an Intelligent Agent? Self-configurability! [Diagram: an agent in an ENVIRONMENT turning Events into Behavior via Self-Awareness and Self-Configuration]
14
Agent and the World (W = internal + external environments). The world may be a “read only”, a “write only”, or a “read and write” world. Agent i alone may “play” a function φ_i = f_i(ω_i), where ω_i is the agent's input from the world, φ_i is its output to the world, and f_i is its individual “behavior” function.
15
Indirect Collaboration (via communication). Agent i and Agent k together may “play” 6 types of functions:
φ_i = f_i(ω_i); φ_k = f_k(ω_k);
φ_i = f_ki(ω_k); φ_k = f_ik(ω_i);
φ_i = F_1(f_i(ω_i), f_ki(ω_k)); φ_k = F_2(f_k(ω_k), f_ik(ω_i)),
where F_1, F_2 are collaborative behaviors. This requires an ontological mapping between Ont(Agent_i) and Ont(Agent_k).
16
Indirect Control (via environment). Agent i may “play” 2 types of functions knowing the rules of the environment: φ_i = f(ω_i); φ_j = R(f(ω_i)), where R is the function of environmental rules.
17
Summary: Why Agents?
- Growing complexity of computer systems and networks
- Distributed nature of systems (data, software, users, etc.)
- Ubiquitous computing and “Internet of Things” scalability challenges
- Need for self-manageability of complex systems
- Need for new software development paradigms for designing distributed systems
The agent-based approach meets the above challenges. And finally: agents are an excellent tool for self-configuration!
18
Web of Configurations … is the Web of “partonomy” (a classification based on the part-of relation; not the same as taxonomy, which is a classification based on similarities). The configuration of an object (its parts and their relationships), together with all policies applied to these parts, fully describes the object from the inside, and facilitates configuration-to-configuration interaction. [Diagram: objects linked by isPartOf relations]
19
Proactive Configuration. A part-of product hierarchy in the ontology results in a hierarchical multi-agent system (MAS).
20
Configuration of objects vs. processes. Axiom 1: each resource in the dynamic Industrial World is a process, and each process in this world is a resource. Axiom 2: the hierarchy of subordination among resource agents in GUN corresponds to the “part-of” hierarchy of the Industrial World resources. [Diagram: two matching trees with nodes 1; 1.1, 1.2; 1.1.1, 1.1.2, 1.1.3; 1.2.1, 1.2.2, 1.2.3]
21
Resource Configuration Example:
Train (ID1); hasPart (ID1, ID3); hasPart (ID1, ID4); hasPart (ID1, ID5); hasDestinationFrom (ID1, “Amsterdam”); hasDestinationTo (ID1, “Paris”); hasConfiguration (ID1, ID2).
Locomotive (ID3); hasColor (ID3, “Multicolor”); hasBehind (ID3, ID4); hasConfiguration (ID3, ID6).
Car (ID4); hasColor (ID4, “Beige”); hasAhead (ID4, ID3); hasBehind (ID4, ID5); hasConfiguration (ID4, ID7).
Car (ID5); hasColor (ID5, “Red”); hasAhead (ID5, ID4); hasConfiguration (ID5, ID8).
22
Configuration Components. In the example, the object of configuration is the resource itself, and the content of the configuration comprises: the class of the resource (Train (ID1)); the structure of the resource (hasPart (ID1, ID3); hasPart (ID1, ID4); hasPart (ID1, ID5)); the parameter values of the resource (hasDestinationFrom (ID1, “Amsterdam”); hasDestinationTo (ID1, “Paris”)); and the configurations of the structural components (hasConfiguration (ID3, ID6); hasConfiguration (ID4, ID7); hasConfiguration (ID5, ID8); hasConfiguration (ID1, ID2)).
23
Reconfiguration. After reconfiguration, the train keeps its parts and destination, but the configuration links are replaced: hasConfiguration (ID1, ID2) becomes hasConfiguration (ID1, ID8); hasConfiguration (ID3, ID6) becomes hasConfiguration (ID3, ID9); hasConfiguration (ID4, ID7) becomes hasConfiguration (ID4, ID10); hasConfiguration (ID5, ID8) becomes hasConfiguration (ID5, ID11).
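To make the triple-based notation concrete, here is a minimal sketch of a configuration as a set of (predicate, subject, object) triples and of reconfiguration as replacing hasConfiguration links. The set-based representation and the helper name are our illustrative choices, not from the slides.

```python
# Illustrative only: a configuration as a set of (predicate, subject, object)
# triples, in the spirit of the train example above.
train = {
    ("type", "ID1", "Train"),
    ("hasPart", "ID1", "ID3"),
    ("hasPart", "ID1", "ID4"),
    ("hasPart", "ID1", "ID5"),
    ("hasDestinationFrom", "ID1", "Amsterdam"),
    ("hasDestinationTo", "ID1", "Paris"),
    ("hasConfiguration", "ID1", "ID2"),
}

def reconfigure(triples, updates):
    """Replace hasConfiguration links: updates maps subject -> new config ID."""
    kept = {t for t in triples
            if not (t[0] == "hasConfiguration" and t[1] in updates)}
    return kept | {("hasConfiguration", s, c) for s, c in updates.items()}

new_train = reconfigure(train, {"ID1": "ID8"})
print(("hasConfiguration", "ID1", "ID8") in new_train)  # True
```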
24
Reconfiguration behavior (option 1: reordering). The train triples stay the same, but the car order changes: Locomotive (ID3) now hasBehind (ID3, ID5); Car (ID5) hasAhead (ID5, ID3) and hasBehind (ID5, ID4); Car (ID4) hasAhead (ID4, ID5). Colors are unchanged: hasColor (ID3, “Multicolor”); hasColor (ID4, “Beige”); hasColor (ID5, “Red”). New configurations: hasConfiguration (ID1, ID8); hasConfiguration (ID3, ID9); hasConfiguration (ID4, ID10); hasConfiguration (ID5, ID11).
25
Reconfiguration behavior (option 2: recolor). The order is unchanged (hasBehind (ID3, ID4); hasAhead (ID4, ID3); hasBehind (ID4, ID5); hasAhead (ID5, ID4)), but the cars swap colors: hasColor (ID4, “Red”); hasColor (ID5, “Beige”); hasColor (ID3, “Multicolor”) stays. New configurations: hasConfiguration (ID1, ID12); hasConfiguration (ID3, ID6) (unchanged); hasConfiguration (ID4, ID13); hasConfiguration (ID5, ID14).
26
UBIWARE Abstract Architecture
27
Current UBIWARE Agent Architecture. S-APL (Semantic Agent Programming Language, RDF-based; see http://www.cs.jyu.fi/ai/OntoGroup/ubidoc/) is a hybrid of semantic (metadata / ontologies / rules) specification languages, semantic reasoners, and agent programming languages. It integrates the semantic description of domain resources with the semantic prescription of the agents' behaviors.
28
UBIWARE 3.0 (2009-2010): a platform for cyber-physical systems (August 2010). UBIWARE 3.0 is a platform for creating and executing configurable distributed systems based on generalized and reusable business scenarios, whose heterogeneous components (actors) are not predefined but can be selected, replaced and configured at runtime.
29
Key Components of UBIWARE Scientific Impact: 1. UBIWARE: Approach and Architecture; 2. Engine; 3. Language; 4. Ontonuts.
30
UbiDubi: UBIWARE-driven UBIWARE (i.e., the UBIWARE architectural components are themselves agent-driven). [Diagram: an agent situated in an Environment, composed of HardBody, SoftBody, SoftMind, HardMind, HardSoul and SoftSoul; it holds Beliefs (facts, rules, policies, plans, collaboration protocols), Meta-Beliefs (contexts) and a “Life” Behavior Configuration (GENOME), and draws on Shared Hardware, Shared Beliefs, Shared Meta-Beliefs, Shared RABs and Shared RBEs; each component may itself be an agent, and the agent is “visible” to other agents through observation.] RAB: Reusable Atomic Behavior; RBE: Reusable Behavior Engine. An Ontobility is a self-contained, self-described, semantically marked-up proactive agent capability (an agent-driven ontonut) which can be “seen”, discovered, exchanged, composed and “executed” (internally or remotely) across the agent platform in a task-driven way, and which can perform social utility-based behavior. The Genome is the part of the semantically marked-up agent configuration settings which can serve as a tool for agent evolution: inheritance, crossover and mutation.
31
Consequence of Self-Management: "Everything-as-a-User"
32
Brief Summary: My view of the future of doing internet-assisted business is based on the assumption that quite soon we (humans) will not be the only ones who use (decide why and what, order and pay for) the services provided through the Web. The ongoing trend of bridging heterogeneous Webs (Web of Humans, Web of Services, Web of Things, Web of Knowledge, etc.) has resulted in the well-known slogan EaaS, "Everything-as-a-Service". But why not go wider: EaaS4E, "Everything-as-a-Service-for-Everything", which also implies EaaU, "Everything-as-a-User"? I suggest discussing this. It is really difficult to imagine how many different and interesting challenges must be addressed to make this vision happen; I will try to point out some of them. To make the discussion more concrete, consider three groups of services (traditionally human-centric): 1. Education for Everything; 2. Wellness and Healthcare for Everything; 3. Financial Services for Everything. Let's imagine that the actual service consumer is a device or machine, a software component or software system, a mathematical abstraction, knowledge or intelligence, ... and try to figure out together why and how all of these may proactively discover and utilize various services through the Web. And finally the most interesting question: how will we (humans) benefit from it?
33
Services as well as products can be consumed not only by humans
34
Educational Services for Everything?
35
Wellness and Healthcare Services for Everything ?
36
Financial Services for Everything ?
37
Urban Services for Everything ?
38
Have we realized it yet? We are no longer only users of the agents... We are “service providers” as well. Are we ready for that?
39
Agents and Decision-Making
40
“Culture of an individual or a group is a systematic manner of the deliberate use of own freedom.” Vagan Terziyan, 3 March, 2013
41
Watch this video first ! https://www.youtube.com/watch?v=1bqMY82xzWo “The Paradox of Choice” (by Prof. Renata Salecl)
42
Culture of an individual or a group is a systematic manner of the deliberate use of one's own freedom.
1. We mean the freedom to choose the preferred alternative of an action from the list of possible ones in certain situations...
2. The choice is made intentionally, with an understanding of the possible effects (impact), on the basis of previously accumulated (own and others') experience and knowledge, and not on the basis of instincts (as in the animal world) or random (chaotic) guessing...
3. The choice of action is made on the basis of a stable, well-established system of values (a vector of weights of criteria for evaluating the alternatives) and its steady evolutionary dynamics (if any), and can therefore be predicted with reasonable confidence and precision...
4. Thus, a stable value system completely determines the culture of its host; and because such a system can easily be formalized in mathematical terms, the culture itself can also be formalized, documented, provided upon request, visualized, analyzed, computed (and so on) as an object for possible use in information systems that support decision-making processes.
44
Possible actions to choose among: change the internal environment; change the external environment; move to another place; communicate with someone; clone oneself; collaborate with someone; infer new knowledge; make a plan; buy new components; choose objectives, methods, strategies, tactics... Good selection and objective evaluation of alternatives is the basis of decision-making.
45
Making a predictive assessment of the available choices, aiming to maximize expected utility (positive impact and personal benefits from the choice) and at the same time to minimize cost (negative impact and personal losses from the choice).
46
[Decision-making diagram: the inputs to a decision are (a) laws, formal duties, regulations, rules, policies and instructions, (b) knowledge, capabilities and experience, (c) personal interest, culture, value system and emotions, and (d) the list of alternatives for the decision with the relevant input data; the output is the decision made / chosen alternative.]
47
[The same decision-making diagram.] Of these inputs, the laws, formal duties, regulations, rules, policies and instructions are comparatively easy to change or reconfigure.
48
[The same decision-making diagram.] The knowledge, capabilities and experience are more difficult to change, as changing them demands “re-training”, reconfiguring or replacing agents.
49
[The same decision-making diagram.] The personal interest, culture, value system and emotions are extremely difficult to capture, recognize, predict, influence and drive change in …
50
[The same decision-making diagram.] … and this is the most vulnerable (weak) spot for “corruption” and for the agent's unpredictability!
52
Major components of a personal «value system» in terms of «Business Intelligence». Business Intelligence: «the road from data to evaluations and decisions supported by analytics». Components of a value system:
1. Quality indicators;
2. What is good and what is bad among the quality indicators (“+” or “–” attached to the weights in the formula);
3. Absolute importance of an indicator (absolute values of weights, normalized to [0, 1]);
4. Harmony of indicators (relative importance of the indicators, or a covariance matrix);
5. Choice-of-decision-function weights;
6. Trust-to-information-sources weights;
7. Meta-indicators (weights of importance for imported value systems).
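A hedged sketch of how such a value system could score decision alternatives. The linear weighted-sum formula, the signed normalized weights, and all names below are illustrative assumptions; the slide lists the components but does not fix a formula.

```python
# Illustrative scoring of alternatives with a signed, normalized weight
# vector: magnitudes in [0, 1] express importance, the sign marks whether
# the indicator is "good" (+) or "bad" (-).
weights = {"comfort": +0.5, "price": -0.3, "speed": +0.2}

alternatives = {
    "train": {"comfort": 0.8, "price": 0.4, "speed": 0.6},
    "bus":   {"comfort": 0.4, "price": 0.2, "speed": 0.3},
}

def score(option: dict, weights: dict) -> float:
    return sum(w * option[ind] for ind, w in weights.items())

best = max(alternatives, key=lambda name: score(alternatives[name], weights))
print(best, {n: round(score(a, weights), 2) for n, a in alternatives.items()})
# train {'train': 0.4, 'bus': 0.2}
```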
53
Value Meta-System. Voting? Maybe, but definitely not a democratic one: before assessing the possible decision options, one has to assess the decision-makers themselves (weighted voting).
- Consensus decision-making: the majority approves and the minority agrees;
- Voting-based methods: range voting lets each member score one or more of the available options, and the option with the highest average is chosen; majority voting requires support from more than 50% of the members of the group; weighted voting is based on the idea that not all voters are equal; etc.
- Delphi method: a facilitator-driven structured communication technique for groups;
- Auctions: used in multi-agent systems to support group decision-making.
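As an illustration of weighted range voting (one of the methods listed above), the following sketch lets weighted voters score the options and picks the highest weighted average; all names and numbers are made up.

```python
# Illustrative weighted range voting: each voter scores each option, voters
# themselves carry weights (their assessed competence), and the option with
# the highest weighted average wins.
voter_weight = {"agent_a": 0.6, "agent_b": 0.3, "agent_c": 0.1}
ballots = {
    "agent_a": {"plan_x": 7, "plan_y": 4},
    "agent_b": {"plan_x": 2, "plan_y": 9},
    "agent_c": {"plan_x": 5, "plan_y": 5},
}

def weighted_range_winner(ballots, voter_weight):
    totals, norm = {}, sum(voter_weight.values())
    for voter, scores in ballots.items():
        for option, s in scores.items():
            totals[option] = totals.get(option, 0.0) + voter_weight[voter] * s
    averages = {o: round(t / norm, 2) for o, t in totals.items()}
    return max(averages, key=averages.get), averages

print(weighted_range_winner(ballots, voter_weight))
# ('plan_y', {'plan_x': 5.3, 'plan_y': 5.6})
```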
54
Decisions in conflict situations (Game Theory): strategy anticipation, mixed strategies and expected utility. http://www.st.ewi.tudelft.nl/~mathijs/tutorial/Paul-GTEASSS2007.pdf
55
A Taxonomy of Decision-Making Problems in Multi-Agent Systems
56
Agent Planning The task of coming up with a sequence of actions that will achieve a goal is called planning.
57
Logic as a KR and planning language: propositional logic, first-order logic, higher-order logic, modal logic, temporal logic, non-monotonic logic, fuzzy logic, multi-valued logic, probabilistic logic.
- Propositional logic studies ways of joining and/or modifying entire propositions, statements or sentences to form more complicated ones, as well as the logical relationships and properties derived from these methods of combining or altering statements.
- First-order logic is a system of deduction extending propositional logic with the ability to express relations between individuals (e.g. people, numbers, and "things") more generally.
- A higher-order logic is a logic in which one may quantify over predicates; a higher-order predicate is a predicate that takes one or more other predicates as arguments.
- A modal logic is any logic for handling modalities: concepts like possibility, existence, necessity, eventually, formerly, can, could, might, may, must, etc.
- Temporal logic describes any system of rules and symbolism for representing, and reasoning about, propositions qualified in terms of time.
- A non-monotonic logic is a formal logic in which adding a formula to a theory may reduce its set of consequences.
- Fuzzy logic is derived from fuzzy set theory and deals with reasoning that is approximate rather than precisely deduced from classical first-order logic.
- Multi-valued logics are logics with more than two truth values.
- Probabilistic logic is a logic in which the truth values of sentences are probabilities.
58
Syntax of FOL: basic elements
Constants: KingJohn, 2, Penn, ...
Predicates: Brother, >, ...
Functions: Sqrt, LeftLegOf, ...
Variables: x, y, a, b, ...
Connectives: ¬, ∧, ∨, ⇒, ⇔
Equality: =
Quantifiers: ∀, ∃
59
Atomic sentences
Term = function(term_1, ..., term_n) or constant or variable.
Atomic sentence = predicate(term_1, ..., term_n) or term_1 = term_2.
For example:
- Brother(KingJohn, RichardTheLionheart)
- >(Length(LeftLegOf(Richard)), Length(LeftLegOf(KingJohn)))
60
Complex sentences. Complex sentences are made from atomic sentences using connectives: ¬S, S1 ∧ S2, S1 ∨ S2, S1 ⇒ S2, S1 ⇔ S2. For example: Sibling(KingJohn, Richard) ⇒ Sibling(Richard, KingJohn).
61
Precondition and Effect of Agent Actions. [Diagram: airports JFK and SFO, planes P1 and P2]
62
What is Planning? Given knowledge about the task domain (actions), and given a problem specified by an initial state configuration and goals to achieve, the agent tries to find a solution, i.e. a sequence of actions that solves the problem. [Diagram: an Agent in Room 1 / Room 2]
63
Notions:
- Plan: a sequence of actions transforming the initial state into a final state;
- Operators: representations of actions;
- Planner: an algorithm that generates a plan from a (partial) description of the initial and final states and from a specification of the operators.
[Diagram: Room 1 / Room 2; actions “Go to the basket”, “Go to the cas…”]
64
State-Space Search: Vacuum World example. Actions: L (moveLeft), R (moveRight), S (suck). [Diagram: initial state and goal]
65
Planning Example
66
Planning alternatives: forward vs. backward chaining, i.e., data-driven vs. goal-driven, or prediction vs. diagnostics.
67
Goals and Anti-Goals: Attractors vs. Reflectors. Attractors: desirable states (goals). Reflectors: undesirable states (“traps”). Policies: constraints (“rules of the game”). Planning is about finding a course of actions (with respect to the policies) that aims to reach the attractors and avoid the reflectors.
68
Example: The “Wolf-Goat-Cabbage” Problem. A man (M) once had to travel with a wolf (W), a goat (G) and a cabbage (C). He had to take good care of them, since the wolf would like to taste a piece of goat if he got the chance, while the goat appeared to long for a tasty cabbage. After some traveling, he suddenly stood before a river. This river could only be crossed using the small boat lying nearby at the shore. The river separates beach A from beach B. The boat was only good enough to take himself and one of his loads across the river; the other two subjects/objects he had to leave on their own. How must the man row across the river, back and forth, to take himself as well as his luggage safely to the other side of the river, without one eating another?
69
Example: The “Wolf-Goat-Cabbage” Problem. Location predicate: At (Where: X_1; What: X_2).
Initial state S_0: At(A; M) ∧ At(A; W) ∧ At(A; G) ∧ At(A; C).
Location policy: At(X_1; X_2) ∧ (X_3 ≠ X_1) ⇒ ¬At(X_3; X_2).
Goal (attractor): At(B; M) ∧ At(B; W) ∧ At(B; G) ∧ At(B; C).
70
Example: The “Wolf-Goat-Cabbage” Problem. Anti-Goals (Reflectors):
R_1: At(A; M) ∧ At(B; W) ∧ At(B; G) ∧ At(B; C);
R_2: At(A; M) ∧ At(A; W) ∧ At(B; G) ∧ At(B; C);
R_3: At(A; M) ∧ At(A; C) ∧ At(B; G) ∧ At(B; W);
R_4: At(A; W) ∧ At(A; G) ∧ At(A; C) ∧ At(B; M);
R_5: At(A; G) ∧ At(A; C) ∧ At(B; M) ∧ At(B; W);
R_6: At(A; W) ∧ At(A; G) ∧ At(B; C) ∧ At(B; M).
71
Policies (allowed action settings). Action available: Carry (From: X_1; To: X_2; Whom: X_3).
Ranges (policies applied to parameters): Range(X_1) = {A; B}; Range(X_2) = {A; B}; Range(X_3) = {W; G; C; None}.
8 legal action configurations:
a_1: Carry(A; B; None); a_2: Carry(B; A; None);
a_3: Carry(A; B; C); a_4: Carry(B; A; C);
a_5: Carry(A; B; W); a_6: Carry(B; A; W);
a_7: Carry(A; B; G); a_8: Carry(B; A; G).
Action precondition and effect:
Precondition: At(X_1; X_3) ∧ At(X_1; M) [implicit precondition: ¬At(X_2; X_3) ∧ ¬At(X_2; M)];
Effect: At(X_2; X_3) ∧ At(X_2; M) [implicit effect: ¬At(X_1; X_3) ∧ ¬At(X_1; M)].
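For concreteness, a minimal breadth-first solver for exactly this formalization; it is our own sketch, not code from the slides. The reflectors R_1..R_6 are encoded as a safety predicate, and the solver finds the seven-crossing plans shown on the following slides.

```python
# Breadth-first search over Wolf-Goat-Cabbage states. A state maps each of
# M, W, G, C to beach "A" or "B"; an action carries the man and at most one
# passenger. Reflector states (goat with wolf, or goat with cabbage, without
# the man) are pruned, mirroring the anti-goals R1..R6 above.
from collections import deque

START = {"M": "A", "W": "A", "G": "A", "C": "A"}
GOAL  = {"M": "B", "W": "B", "G": "B", "C": "B"}

def safe(state):
    for eater, eaten in (("W", "G"), ("G", "C")):
        if state[eater] == state[eaten] != state["M"]:
            return False
    return True

def successors(state):
    here, there = state["M"], "B" if state["M"] == "A" else "A"
    for who in (None, "W", "G", "C"):
        if who is not None and state[who] != here:
            continue  # precondition: the passenger must be on the man's beach
        new = dict(state)
        new["M"] = there
        if who is not None:
            new[who] = there
        if safe(new):
            yield f"Carry({here}; {there}; {who or 'None'})", new

def plan(start=START, goal=GOAL):
    frontier = deque([(start, [])])
    seen = {tuple(sorted(start.items()))}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for action, new in successors(state):
            key = tuple(sorted(new.items()))
            if key not in seen:
                seen.add(key)
                frontier.append((new, actions + [action]))

print(plan())  # 7 crossings, starting with Carry(A; B; G)
```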
72
Suggest your solution!
73
The “Wolf-Goat-Cabbage” Problem: Planning
S_0: At(A; M) ∧ At(A; W) ∧ At(A; G) ∧ At(A; C)
a_7: Carry(A; B; G) →
S_1: At(B; M) ∧ At(A; W) ∧ At(B; G) ∧ At(A; C)
a_2: Carry(B; A; None) →
S_2: At(A; M) ∧ At(A; W) ∧ At(B; G) ∧ At(A; C)
a_3: Carry(A; B; C) →
S_3: At(B; M) ∧ At(A; W) ∧ At(B; G) ∧ At(B; C)
a_8: Carry(B; A; G) →
S_4: At(A; M) ∧ At(A; W) ∧ At(A; G) ∧ At(B; C)
a_5: Carry(A; B; W) →
S_5: At(B; M) ∧ At(B; W) ∧ At(A; G) ∧ At(B; C)
a_2: Carry(B; A; None) →
S_6: At(A; M) ∧ At(B; W) ∧ At(A; G) ∧ At(B; C)
a_7: Carry(A; B; G) →
Goal: At(B; M) ∧ At(B; W) ∧ At(B; G) ∧ At(B; C)
74
The “Wolf-Goat-Cabbage” Problem: Forward-Chaining Planning (1). Start from S_0: At(A; M) ∧ At(A; W) ∧ At(A; G) ∧ At(A; C).
75
Forward-Chaining Planning (2). From S_0, try a_1: Carry(A; B; None).
76
Forward-Chaining Planning (3). a_1 leads to At(B; M) ∧ At(A; W) ∧ At(A; G) ∧ At(A; C), which is reflector R_4.
77
Forward-Chaining Planning (4). The R_4 branch is pruned.
78
Forward-Chaining Planning (5). Back at S_0.
79
Forward-Chaining Planning (6). From S_0, try a_3: Carry(A; B; C).
80
Forward-Chaining Planning (7). a_3 leads to At(B; M) ∧ At(A; W) ∧ At(A; G) ∧ At(B; C), which is reflector R_6.
81
Forward-Chaining Planning (8). The R_6 branch is pruned.
82
Forward-Chaining Planning (9). Back at S_0.
83
Forward-Chaining Planning (10). From S_0, try a_5: Carry(A; B; W).
84
Forward-Chaining Planning (11). a_5 leads to At(B; M) ∧ At(B; W) ∧ At(A; G) ∧ At(A; C), which is reflector R_5.
85
Forward-Chaining Planning (12). The R_5 branch is pruned.
86
Forward-Chaining Planning (13). Back at S_0.
87
Forward-Chaining Planning (14). From S_0, try a_7: Carry(A; B; G).
88
Forward-Chaining Planning (15). a_7 leads to the safe state S_1: At(B; M) ∧ At(A; W) ∧ At(B; G) ∧ At(A; C).
89
Forward-Chaining Planning (16). From S_1, a_2: Carry(B; A; None) leads to S_2: At(A; M) ∧ At(A; W) ∧ At(B; G) ∧ At(A; C).
90
Forward-Chaining Planning (17). Next, a_8: Carry(B; A; G) is considered.
91
Forward-Chaining Planning (18). The a_8 branch is rejected: it would only undo the crossing already made.
92
Forward-Chaining Planning (19). Back at S_2.
93
Forward-Chaining Planning (20). From S_2, a_1: Carry(A; B; None) is considered.
94
Forward-Chaining Planning (21). a_1 leads back to the already-visited state S_1, so the branch is pruned.
95
Forward-Chaining Planning (22). Back at S_2.
96
Forward-Chaining Planning (23). From S_2, a_3: Carry(A; B; C) leads to S_3: At(B; M) ∧ At(A; W) ∧ At(B; G) ∧ At(B; C).
97
Forward-Chaining Planning (24). Alternatively, from S_2, a_5: Carry(A; B; W) leads to S_7: At(B; M) ∧ At(B; W) ∧ At(B; G) ∧ At(A; C).
98
Forward-Chaining Planning (25). Both safe branches, S_3 (via a_3) and S_7 (via a_5), are kept.
99
Forward-Chaining Planning (26). The completed search: from S_3, a_8: Carry(B; A; G) leads to S_4: At(A; M) ∧ At(A; W) ∧ At(A; G) ∧ At(B; C), and a_5: Carry(A; B; W) then leads to S_5: At(B; M) ∧ At(B; W) ∧ At(A; G) ∧ At(B; C). From S_7, a_8: Carry(B; A; G) leads to S_8: At(A; M) ∧ At(B; W) ∧ At(A; G) ∧ At(A; C), and a_3: Carry(A; B; C) then leads to the same S_5. From S_5, a_2: Carry(B; A; None) leads to S_6: At(A; M) ∧ At(B; W) ∧ At(A; G) ∧ At(B; C), and finally a_7: Carry(A; B; G) reaches the Goal: At(B; M) ∧ At(B; W) ∧ At(B; G) ∧ At(B; C).
100
The “Wolf-Goat-Cabbage” Problem: Two Valid Final Plans
Plan 1: S_0 → a_7: Carry(A; B; G) → a_2: Carry(B; A; None) → a_3: Carry(A; B; C) → a_8: Carry(B; A; G) → a_5: Carry(A; B; W) → a_2: Carry(B; A; None) → a_7: Carry(A; B; G) → Goal.
Plan 2: S_0 → a_7: Carry(A; B; G) → a_2: Carry(B; A; None) → a_5: Carry(A; B; W) → a_8: Carry(B; A; G) → a_3: Carry(A; B; C) → a_2: Carry(B; A; None) → a_7: Carry(A; B; G) → Goal.
101
Forward-Chaining Planning. Formally, forward-chaining planning can be described as search through a landscape where each node is defined by a tuple ⟨STATE, PLAN⟩: STATE is a world state comprised of predicate facts, and PLAN is the plan (a series of ordered actions) used to reach the current STATE from the initial state STATE_0. Search begins from the initial problem state, corresponding to the tuple ⟨STATE_0, {}⟩. Edges between pairs of nodes in the search landscape correspond to applying actions that lead from one state to another. As unguided search in this manner is computationally expensive, heuristics are used to guide the search; commonly, a heuristic value provides a goal-distance estimate. [Diagram: a search tree over STATE_0 … STATE_5 and GOAL, with actions A1 … A7 as edges and partial plans such as {A1}, {A1, A4}, {A1, A4, A5}; the goal is reached with plan {A1, A4, A6} or {A1, A4, A5, A7}]
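A compact sketch of forward-chaining search over ⟨STATE, PLAN⟩ tuples with a heuristic. The goal-distance estimate used here (the count of unsatisfied goal facts) and the toy logistics domain are our illustrative assumptions, not from the slides.

```python
# Greedy best-first forward search: states are frozensets of predicate
# facts, actions are (name, preconditions, add, delete) tuples, and the
# heuristic estimates goal distance as the number of missing goal facts.
import heapq

def h(state: frozenset, goal: frozenset) -> int:
    return len(goal - state)  # unsatisfied goal facts

def forward_search(state0, goal, actions):
    frontier = [(h(state0, goal), 0, state0, [])]  # (h, tiebreak, STATE, PLAN)
    seen, counter = {state0}, 0
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        if goal <= state:
            return plan
        for name, pre, add, dele in actions:
            if pre <= state:                      # preconditions satisfied
                new = (state - dele) | add        # apply the action's effects
                if new not in seen:
                    seen.add(new)
                    counter += 1
                    heapq.heappush(frontier,
                                   (h(new, goal), counter, new, plan + [name]))

# Toy domain: move a package from a to b with one truck.
acts = [
    ("drive(a,b)", frozenset({"truck@a"}),
     frozenset({"truck@b"}), frozenset({"truck@a"})),
    ("load", frozenset({"truck@a", "pkg@a"}),
     frozenset({"pkg@truck"}), frozenset({"pkg@a"})),
    ("unload", frozenset({"truck@b", "pkg@truck"}),
     frozenset({"pkg@b"}), frozenset({"pkg@truck"})),
]
print(forward_search(frozenset({"truck@a", "pkg@a"}),
                     frozenset({"pkg@b"}), acts))
# ['load', 'drive(a,b)', 'unload']
```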
102
Backward-Chaining Planning. Formally, backward-chaining planning can be described as search through a landscape where each node is defined by a tuple ⟨STATE, PLAN⟩: STATE is a world state comprised of predicate facts, and PLAN is the plan (a series of ordered actions) used to reach the final GOAL (the desired world state) from the current node. Search begins from the GOAL state, corresponding to the tuple ⟨GOAL, {}⟩, and ends when the currently valid state STATE_0 is reached. Edges (inverted) between pairs of nodes in the search landscape correspond to applying actions that lead from one state to another. [Diagram: the same tree searched from the GOAL backwards, with partial plans such as {A7}, {A4, A7}, {A6}, {A3, A6}, {A5}; STATE_0 is reached with plan {A1, A6} or {A2, A4, A7}]
103
Bi-Directional Planning (incrementally combining forward and backward chaining). A bidirectional plan search is about to succeed when a branch from the start node meets a branch from the goal node. The motivation is that the area of the two small circles is less than the area of one big circle centered on the start and reaching to the goal.
104
Depth-first search. [Diagram: a search tree rooted at S_0 with a Goal node; the frames below show the order in which nodes are visited.] Frame 1: node 1 (the root S_0) visited.
105
Depth-first search. Frame 2: nodes 1–2 visited.
106
Depth-first search. Frame 3: nodes 1–3 visited.
107
Depth-first search. Frame 4: nodes 1–4 visited.
108
Depth-first search. Frame 5: nodes 1–5 visited.
109
Depth-first search. Frame 6: nodes 1–6 visited and the Goal found. Depth-first search does not necessarily find the shortest path, but has a limited memory requirement.
110
Breadth-first search. Frame 1: nodes 1–2 visited.
111
Breadth-first search. Frame 2: nodes 1–6 visited.
112
Breadth-first search. Frame 3: nodes 1–9 visited.
113
Breadth-first search. Final frame: nodes 1–9 visited and the Goal found. Breadth-first search finds the shortest path, but has a large memory requirement.
114
Both depth-first and breadth-first search can be: forward (from the initial state to the goal), backward (from the goal to the initial state), or bi-directional (from both starting points until a meeting point).
115
Depth-Limited and Iterative Deepening search. Usually, breadth-first search requires too much memory to be practical. The main problem with depth-first search is that it can follow a dead-end path very far before this is discovered. Depth-limited search imposes a depth limit l and never explores nodes at depth > l. Iterative deepening search is depth-limited search with an increasing limit; the solution improves with more computation time.
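A minimal sketch of depth-limited search with the iterative-deepening wrapper described above, over a plain adjacency-list graph (our illustration; the graph and names are made up):

```python
# Depth-limited DFS plus iterative deepening: restart with limit 0, 1, 2, ...
# until the goal is found, trading repeated work for bounded memory.
def depth_limited(graph, node, goal, limit, path=None):
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None  # cut off: never explore below the current limit
    for child in graph.get(node, []):
        found = depth_limited(graph, child, goal, limit - 1, path)
        if found:
            return found
    return None

def iterative_deepening(graph, start, goal, max_limit=20):
    for limit in range(max_limit + 1):
        found = depth_limited(graph, start, goal, limit)
        if found:
            return found

graph = {"S0": ["a", "b"], "a": ["c"], "b": ["goal"], "c": []}
print(iterative_deepening(graph, "S0", "goal"))  # ['S0', 'b', 'goal']
```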
116
Depth-Limited search (limit = 3) example
117
Depth-Limited search (limit = 2) example. Frame 1: node 1 visited.
118
Depth-Limited search (limit = 2) example. Frame 2: nodes 1–2 visited.
119
Depth-Limited search (limit = 2) example. Frame 3: nodes 1–3 visited.
120
Depth-Limited search (limit = 2) example. Frame 4: nodes 1–4 visited.
121
Depth-Limited search (limit = 2) example. Frame 5: nodes 1–5 visited.
122
Depth-Limited search (limit = 2) example. Frame 6: nodes 1–6 visited across the two subtrees (I and II); since the Goal lies deeper than level 2, it is not found.
123
Iterative Deepening search example
124
Uniform-Cost search (“cheapest-first search”). Uniform-cost search is a tree search algorithm used for traversing or searching a weighted tree or graph. The search begins at the start (or goal) node, and continues by visiting the next node with the least total cost from the root. [Diagram: a weighted tree from S_0 to the Goal with edge costs]
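A minimal uniform-cost search sketch using a priority queue: our illustration of the “visit the cheapest frontier node next” rule, on a made-up weighted graph.

```python
# Uniform-cost search: always expand the frontier node with the smallest
# accumulated path cost, kept in a priority queue.
import heapq

def uniform_cost(graph, start, goal):
    frontier = [(0, start, [start])]   # (total cost from root, node, path)
    best = {start: 0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        for child, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best.get(child, float("inf")):
                best[child] = new_cost
                heapq.heappush(frontier, (new_cost, child, path + [child]))

graph = {"S0": [("a", 4), ("b", 1)],
         "b":  [("a", 1), ("goal", 7)],
         "a":  [("goal", 2)]}
print(uniform_cost(graph, "S0", "goal"))  # (4, ['S0', 'b', 'a', 'goal'])
```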
125
Goal as a “class”. The goal (of, e.g., Black) is checkmate: a position in which a player's king is directly attacked by an opponent's piece or pawn and has no possible move to escape the check; the attacking player thus wins the game. The instances of this goal class are extremely numerous, so is only forward chaining feasible?
126
Utility-Based Planning (“Best-First Search”, or “enjoy your life while on the move”). Goals (as desired states) are not by themselves drivers for planning. Based on a comparison between different states of the world, a utility value is assigned to each state: the utility function maps a state (or a sequence of states) to a numeric representation of satisfaction. The ultimate objective of this type of planning task is to maximize the utility value derived from exploring the world by actions. A goal-based agent for, e.g., playing chess is infeasible: every time it decides which move to play next, it would have to check whether that move eventually leads to a checkmate. Instead, it is better for the agent to assess its progress not against the overall goal, but against a localized measure. Agent programs often have a utility function which calculates a numerical value for each world state the agent would find itself in if it undertook a particular action; the agent then checks which of its available actions would lead to the highest value. Usually the best action with respect to the utility function is taken, as this is the rational thing to do. Utility: a “system of values”, aka a “personal price list of the options”.
127
Planning as Problem Solving. A problem is a situation which an agent experiences as different from the situation the agent would ideally like to be in. A problem is solved by a sequence of actions that reduce the difference (measured by a fitness function, aka distance) between the initial situation and the goal. [Diagram: current state X = (x_1, x_2, ..., x_k) and goal state Y = (y_1, y_2, ..., y_k), described by attributes R_1, R_2, ..., R_k, with the distance D(X, Y) between them as an example fitness function]
128
Fitness Function (Example) - 1. Current state: white wine served at 15 °C. Goal state: red wine served at 25 °C. Importance: wine color ω_1 = 0.7; wine temperature ω_2 = 0.3.
129
Fitness Function (Example) - 2. Domain objects: 1000 drinks; 300 red, 500 white, 200 other. Soft drinks: 600 (100 red, 400 white, 100 other). Wines: 400 (200 red, 100 white, 100 other). Then:
P(wine | colour = white) = 100 / 500 = 0.2;
P(wine | colour = red) = 200 / 300 ≈ 0.67;
P(soft_drink | colour = white) = 400 / 500 = 0.8;
P(soft_drink | colour = red) = 100 / 300 ≈ 0.33.
130
Fitness Function (Example) - 3. Using these conditional probabilities, the distance between the colour values is
d(“white”, “red”) = √[(P(soft_drink | colour = white) − P(soft_drink | colour = red))² + (P(wine | colour = white) − P(wine | colour = red))²] = √[(0.8 − 0.33)² + (0.2 − 0.67)²] ≈ 0.665,
and the overall weighted distance, with 0.5 as the distance between the temperature values, is
D(Current State, Goal State) = √(0.7 · 0.665 + 0.3 · 0.5) ≈ 0.784.
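The same computation in a few lines; the conditional probabilities are rounded as on the slide, and the temperature distance 0.5 is taken as given there.

```python
# Reproduce the fitness-function example: colour distance from conditional
# class probabilities, then the weighted overall distance.
from math import sqrt

p_wine_white, p_wine_red = 0.2, 0.67   # 100/500 and 200/300, rounded
p_soft_white, p_soft_red = 0.8, 0.33   # 400/500 and 100/300, rounded

d_colour = sqrt((p_soft_white - p_soft_red) ** 2
                + (p_wine_white - p_wine_red) ** 2)
d_temp = 0.5                  # given distance between 15 C and 25 C
w_colour, w_temp = 0.7, 0.3   # importance weights

D = sqrt(w_colour * d_colour + w_temp * d_temp)
print(round(d_colour, 3), round(D, 3))  # 0.665 0.784
```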
131
Partial Plan Example
132
Planning for multiple agents: centralized planning with decentralized execution; merging plans.
133
Multi-Agent Planning Example
135
Collaborative Planning as a Constraint Satisfaction Problem (example). Each agent controls its own location variables (x, y) in a 4x4 space and can move (change location) according to the chess queen's rules. Constraint: avoid conflicts, i.e., being in the same row or the same column as any other agent. The search reduces the number of conflicts until the goal is achieved: the constraint is satisfied!
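A small sketch of solving this with a min-conflicts strategy: our illustration, in which agents relocate one at a time to a queen-reachable cell minimizing the row/column conflict count; the step budget and all names are assumptions.

```python
# Min-conflicts sketch: four agents on a 4x4 board repeatedly relocate to
# the reachable cell that minimizes row/column conflicts with the others.
import random

def conflicts(agents):
    return sum(a != b and (agents[a][0] == agents[b][0]
                           or agents[a][1] == agents[b][1])
               for a in agents for b in agents) // 2

def queen_moves(x, y):
    cells = set()
    for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1),
                   (1, 1), (1, -1), (-1, 1), (-1, -1)]:
        nx, ny = x + dx, y + dy
        while 0 <= nx < 4 and 0 <= ny < 4:
            cells.add((nx, ny))
            nx, ny = nx + dx, ny + dy
    return cells

def solve(agents, steps=500):
    for _ in range(steps):
        if conflicts(agents) == 0:
            return agents
        name = random.choice(list(agents))
        x, y = agents[name]
        best = min(queen_moves(x, y) | {(x, y)},
                   key=lambda c: conflicts({**agents, name: c}))
        agents = {**agents, name: best}
    return agents

random.seed(1)
agents = {"a1": (0, 0), "a2": (0, 1), "a3": (0, 2), "a4": (0, 3)}
result = solve(agents)
print(result, "conflicts:", conflicts(result))
# typically reaches 0 conflicts (distinct rows and columns) within the budget
```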
136
More Planning Issues: planning with fuzzy goals; planning with multiple goals; planning in a dynamic environment; worst-case-scenario planning; planning to maintain a situation; planning with multiple (collaborative) agents; planning with multiple (competing) agents; real-time (re)planning; (re)configuration as part of a plan; … many others …
137
Complex planning example
138
Some relaxing video on “Planning” (for “present-”, “past-”, and “future-oriented” agents) “Are you past oriented or future oriented” https://www.youtube.com/watch?v=isPj5KgpVqg
139
Agents and Machine Learning
140
Smart machines are systems that use machine learning (data mining and knowledge discovery) to perform work traditionally conducted by humans in an effort to boost efficiency and productivity.
141
Machine Learning (ML), an instrument for Data Mining and Knowledge Discovery:
- Supervised learning: used by agents to automatically build their own decision models from examples (classification, regression, etc.);
- Unsupervised learning: used by agents to discover new (hidden) patterns within (sensor or communicated) data (clustering, anomaly detection, etc.);
- Reinforcement learning: used to guide and optimize an agent's behavior (optimal behavior, self-management and control).
142
Reinforcement Learning. "Reinforcement learning is learning what to do---how to map situations to actions---so as to maximize a numerical reward signal. The learner is not told which actions to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward, but also the next situation and, through that, all subsequent rewards. These two characteristics---trial-and-error search and delayed reward---are the two most important distinguishing features of reinforcement learning." (Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto, MIT Press, Cambridge, MA, 1998; book online: http://webdocs.cs.ualberta.ca/~sutton/book/ebook/the-book.html)
143
Reinforcement Learning basics (Markov Decision Processes) Source: http://www.cs.colorado.edu/~grudic/teaching/CSCI4202/RL.pdf
144
Reinforcement Learning basics (Agent’s Learning Task)
145
Reinforcement Learning basics (Value Function)
146
Reinforcement Learning basics (What to Learn?)
147
Reinforcement Learning basics (Training rule to learn Q)
148
Reinforcement Learning basics (Q learning for deterministic worlds)
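Since the Q-learning slides above are figures, here is a minimal tabular Q-learning loop for a deterministic world, using the training rule Q(s, a) ← r + γ · max_a' Q(s', a'); the 4-state corridor domain and all parameters are our illustrative assumptions, not material from the course source.

```python
# Tabular Q-learning in a deterministic 4-state corridor: states 0..3,
# actions -1 (left) / +1 (right), reward 100 on reaching state 3.
import random

GAMMA, STATES, ACTIONS = 0.9, range(4), (-1, +1)
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def step(s, a):
    s2 = min(max(s + a, 0), 3)                # deterministic transition
    return s2, (100.0 if s2 == 3 else 0.0)    # reward on reaching the goal

random.seed(0)
for _ in range(200):                          # episodes
    s = random.choice([0, 1, 2])
    while s != 3:
        a = random.choice(ACTIONS)            # explore randomly
        s2, r = step(s, a)
        # deterministic-world rule: Q(s,a) = r + gamma * max_a' Q(s2, a')
        Q[(s, a)] = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        s = s2

print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES})
# greedy policy: move right in states 0..2; state 3 is terminal (Q stays 0),
# so its argmax is an arbitrary tie -> {0: 1, 1: 1, 2: 1, 3: -1}
```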
149
Swarm Intelligence (Ant Colony Optimization). http://www.scholarpedia.org/article/Ant_colony_optimization ; http://www.math.ucla.edu/~wittman/10c.1.11s/Lectures/Raids/ACO.pdf
150
Cooperative Search by Pheromone Trails http://staff.washington.edu/paymana/swarm/krink_01.pdf
151
Cooperative Search by Pheromone Trails http://staff.washington.edu/paymana/swarm/krink_01.pdf
152
Cooperative Search by Pheromone Trails http://staff.washington.edu/paymana/swarm/krink_01.pdf
153
Cooperative Search by Pheromone Trails http://staff.washington.edu/paymana/swarm/krink_01.pdf
154
Cooperative Search by Pheromone Trails http://staff.washington.edu/paymana/swarm/krink_01.pdf
155
Cooperative Search by Pheromone Trails http://staff.washington.edu/paymana/swarm/krink_01.pdf
156
Positive vs. Negative Pheromone Trails. [Diagram: ants from the Nest lay a positive pheromone trail toward Food (an attractor) and a negative pheromone trail around an Enemy (a reflector)]
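A tiny sketch of the pheromone mechanism behind these slides: the standard evaporate-and-deposit update over two alternative paths. The parameters and the pheromone-proportional path choice are our illustrative assumptions, not code from the linked sources.

```python
# Minimal ant-colony step over two alternative paths: ants choose a path in
# proportion to its pheromone, shorter paths get larger deposits, and
# evaporation lets the colony forget and re-explore.
import random

lengths = {"short": 1.0, "long": 2.0}
pheromone = {"short": 1.0, "long": 1.0}
RHO, N_ANTS = 0.1, 20  # evaporation rate, ants per iteration

random.seed(0)
for _ in range(50):
    choices = random.choices(list(pheromone),
                             weights=list(pheromone.values()), k=N_ANTS)
    for path in pheromone:                   # evaporation
        pheromone[path] *= (1 - RHO)
    for path in choices:                     # deposit ~ 1 / path length
        pheromone[path] += 1.0 / lengths[path]

print({p: round(v, 1) for p, v in pheromone.items()})
# the short path ends up with far more pheromone than the long one
```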
157
See also: http://c2class.com/Multi-Robot-Formations