Artificial Intelligence/Intelligent Agents


This page discusses the general ideas and overall implementation of an intelligent agent.


Terminology

  • PEAS - An abbreviation to remember the "basics" of a functioning agent.
    • Stands for "Performance Measures, Environment, Actuators, Sensors"
  • Sensor - A device that takes some form of input from the surrounding environment.
  • Actuator - A device that achieves physical movement of some kind, by converting energy into mechanical force.
  • Percept - An agent's perceptual inputs at a given time.
  • Percept Sequence - The complete history of everything the agent has perceived, in order.
  • Agent Function - A mathematical function that maps percept sequences to actions (i.e., behavior); a small sketch follows after this list.
    • This "Agent Function" is effectively the theoretical implementation of the agent.
  • Agent Program - A concrete implementation of an agent, within some physical system.
    • This "Agent Program" is effectively the practical implementation of the agent.
  • Degree of Happiness - Maps the current state or outcome to some real number, which is compared against a target value that indicates the end goal and/or optimal state has been reached.
    • Effectively another way of saying "degree of success".
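
To make the "Agent Function" vs "Agent Program" distinction concrete, below is a minimal Python sketch (the percept values and action names are hypothetical): the abstract agent function is modeled as a lookup table from percept sequences to actions, and the agent program is the small piece of code that actually runs, recording each new percept and consulting the table.

  # Hypothetical sketch: the abstract agent function as a lookup table from
  # percept sequences to actions; the agent program is the code that uses it.
  agent_function = {
      ("clean",): "move",
      ("dirty",): "clean",
      ("clean", "dirty"): "clean",
  }

  percept_sequence = []  # complete history of percepts, in order

  def agent_program(percept):
      """Record the new percept, then look up an action for the full sequence."""
      percept_sequence.append(percept)
      return agent_function.get(tuple(percept_sequence), "wait")

  print(agent_program("clean"))  # -> "move"
  print(agent_program("dirty"))  # -> "clean"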


What is an Agent?

As mentioned in Artificial Intelligence, an "Intelligent Agent" is an entity that can act.
It generally acts to achieve the "best possible" outcome, and uses information from the surrounding environment in some manner.


To elaborate, an Intelligent Agent acquires environmental information via some form of sensors.
It then typically acts upon that environment through some sort of actuators.


Said actions will depend on the agent's programming, as well as the current input from its sensors.
It may process this input using one (or multiple) of the following:

  • Reasoning via probabilities and statistics.
  • Decisions via Bayesian Networks, analysis of observation (sensory) sequences, or other methods.
  • Learning via training from statistical models, neural networks, and other data-driven methods.


Types of Sensors

Sensors are anything that detects, or otherwise mimics, input of the 5 human senses (sight, touch, smell, hearing, taste), or even detects values that humans can't inherently perceive. Examples include:

  • Cameras - Mimic "sight" in the form of continuous image sequences.
  • Microphones - Mimic "hearing" in the form of audio waveforms.
  • Thermometers - Mimic "touch" in the form of detecting temperature.
  • Current & Voltage Sensors - Detect electricity through a given circuit.
  • Chemical & Gas Sensors - Detect environmental composition.
  • Etc.


Which sensors an AI uses depends heavily on the AI's application, as well as the environment it will reside in.
For example, an AI that exists within a closed, indoor space may never need a thermometer.
Meanwhile, a self-driving car AI may need a thermometer to help detect the type of weather it's driving in.

Types of Actuators

An actuator is anything that exerts some sort of force on the agent or the surrounding environment.

For example, an agent might have an actuator in the form of:

  • A robotic arm that grabs or manipulates objects.
  • A wheel for a self-driving car, which allows the car to move.

Evaluating Performance

We generally use objective, quantifiable metrics to describe the level of success for a given agent's behavior.
The exact details of this measurement are often called "Performance Measures".


For AI meant to have rational behavior, we tend to measure based on the following:

  • Prior knowledge of the environment.
  • Possible actions that can be performed.
  • The percept sequence up to the time the action is taken.

Ideally, for each possible percept sequence, the agent will take whichever action is expected to maximize objective success.
Prior knowledge, as well as current understanding of the environment, can play a large role in this.
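
As a rough illustration of "maximize objective success", the sketch below picks whichever action has the highest expected performance score over its possible outcomes. The actions, outcomes, and probabilities are invented for the example; a real agent would derive them from its prior knowledge and percept sequence.

  # Hypothetical sketch: choose the action with the best expected performance,
  # given the agent's beliefs about each action's possible outcomes.
  action_outcomes = {
      "brake":      [(0.9, 10), (0.1, -5)],    # (probability, performance score)
      "accelerate": [(0.5, 20), (0.5, -50)],
  }

  def expected_score(outcomes):
      return sum(prob * score for prob, score in outcomes)

  best_action = max(action_outcomes, key=lambda a: expected_score(action_outcomes[a]))
  print(best_action)  # -> "brake" (expected score 8.5 vs -15.0)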


Agent Knowledge Properties

Below are various properties that affect an agent's sensory input and overall environmental knowledge.
They should be taken into account when trying to evaluate the performance of an agent.


Environments can be "Fully Observable", "Partially Observable", or "Unobservable":

  • Fully Observable - Sensors can detect all aspects of the environment that are relevant to the agent's actions.
  • Unobservable - Sensors are broken or otherwise unable to gather any data relevant to the agent's actions.
  • Partially Observable - Sensors can detect some, but not all, of the relevant aspects of the environment.


Sensor data can be "Episodic" or "Sequential":

  • Episodic - One percept leads to exactly one action.
    • Agents with episodic sensory data tend not to retain memory of past percepts.
    • For example, a robotic arm at a factory: it only cares whether an item is present at the given moment.
  • Sequential - Multiple percepts (i.e., multiple moments in time) are used together to determine an action.
    • Agents with sequential sensory data have to retain some memory of past percepts.
    • For example, a self-driving car needs to retain at least the past few seconds of history in order to determine road state, nearby traffic, etc. (see the sketch below).
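
One way to picture the difference: an episodic agent decides from the latest percept alone, while a sequential agent keeps a short buffer of recent percepts and decides from the whole buffer. The sketch below is hypothetical; the window size and sensor readings are made up.

  from collections import deque

  # Hypothetical sketch of a sequential agent: it keeps a sliding window of
  # recent percepts, whereas an episodic agent would only look at the latest one.
  recent_percepts = deque(maxlen=5)  # the last few moments in time

  def sequential_decide(speed_reading):
      recent_percepts.append(speed_reading)
      # Decide based on the whole recent history, not just the newest reading.
      return "slow_down" if sum(recent_percepts) > 10 else "continue"

  for reading in [2, 3, 4, 5]:
      print(sequential_decide(reading))  # continue, continue, continue, slow_down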


On taking an action, the resulting state can have one of three properties:

  • Deterministic - The next state is always determined with 100% certainty, by a combination of the current state plus the agent's action.
    • In other words, upon taking an action, the next state is always fully predictable.
    • Ex: A robotic arm in a controlled factory setting. It will know what the resulting state of the environment will be, before taking an action.
  • Non-deterministic - The next state is not determinable by the agent's action. The agent will need to take an action and then check the environment to find out the current state.
    • Ex: A self-driving car may turn left. But that only tells the car what it's doing. It has no idea what other entities on the road will do, regardless of its action.
  • Stochastic - The next state is uncertain, but quantifiable by probabilities.
    • Ex: A vacuum cleaning a square might have a 90% chance to clean it, and a 10% chance to spread debris to nearby squares.
      • In such a case, the next state is uncertain, but the agent can make intelligent assumptions based on the likelihood of each outcome from each action (see the sketch below).
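
The stochastic vacuum example can be sketched as a probabilistic transition: given an action, the next state is sampled from a distribution rather than being fixed. This is a minimal, hypothetical illustration, not a full environment simulator.

  import random

  # Hypothetical sketch of a stochastic outcome: cleaning a square succeeds
  # with probability 0.9 and spreads debris with probability 0.1.
  def attempt_clean():
      if random.random() < 0.9:
          return "clean"          # 90% of the time the square becomes clean
      return "debris_spread"      # 10% of the time debris spreads to nearby squares

  outcomes = [attempt_clean() for _ in range(1000)]
  print(outcomes.count("clean") / len(outcomes))  # roughly 0.9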


The environment can behave in one of several ways while an agent determines its next action:

  • Static - The environment does not change while the agent considers its next action.
    • Ex: An agent designed to play a game of solitaire. The board will not change until the agent takes an action.
  • Dynamic - The environment may change while the agent considers its next action.
    • Ex: A self-driving car. Nearby entities will act, regardless of how long the current agent takes to determine an action.
  • Semi-Dynamic - The environment technically does not change while the agent considers its next action, but the agent's overall performance score can change depending on how long it takes to decide.
    • Ex: A game of chess played with a clock.


Related to the above, the passage of time can be treated as discrete or continuous:

  • Discrete Time - The environment is split into disparate steps.
    • Ex: A card game.
  • Continuous Time - The environment is a continuous stream of input.
    • Ex: A self-driving car on the road.


Lastly, the number of other agents in the environment also makes an impact:

  • Single Agent Environment - Only one agent exists in the environment. No consideration is needed for other entities.
  • Multi-Agent Environment - Two or more agents exist in the same environment. They may or may not need to account for each other in their actions.
    • Multi-Agent Environments can be further divided into the following:
      • Cooperative Multi-Agent Environment - Two or more agents exist in the same environment, and work together to achieve some goal.
      • Competitive Multi-Agent Environment - Two or more agents exist in the same environment, and directly oppose each other in achieving goals.


Basic Agent Types

The following are general types of simple/basic agents.
Each one generally builds on the previous.

Simple Reflex Agent

This is the simplest agent type.

  • Selects actions based on the current percept.
  • Ignores percept history; only the current moment in time is considered.
  • Generally implemented via "condition -> action" (i.e., if-then-else) rules, as sketched below.
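
A minimal sketch of the "condition -> action" idea, assuming a simple vacuum-style world; the percepts and rules are hypothetical.

  # Hypothetical sketch of a simple reflex agent: only the current percept
  # matters, and behavior is a flat set of condition -> action rules.
  def simple_reflex_agent(percept):
      if percept == "dirty":
          return "clean"
      elif percept == "obstacle":
          return "turn"
      else:
          return "move_forward"

  print(simple_reflex_agent("dirty"))  # -> "clean"
  print(simple_reflex_agent("clear"))  # -> "move_forward"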

Limitations:

  • Generally requires the environment to be fully observable.
  • A correct decision can only be made on the basis of the current percept; applications that require two or more percepts can't use this agent type.

Model-Based Reflex Agent

Similar to a "Simple Reflex Agent", but:

  • Allows the environment to be only partially observable.
  • Keeps track of the parts of the environment it cannot currently see, using their last known state.
    • In other words, it tracks minimal state history in order to make more informed decisions (see the sketch below).
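
Below is a minimal, hypothetical sketch of the state-tracking idea: the agent updates an internal model from each percept, and falls back on the last known value for any part of the environment its sensors can't currently see.

  # Hypothetical sketch of a model-based reflex agent: an internal model
  # remembers the last known status of squares the sensors can't see right now.
  internal_model = {}  # location -> last known status

  def model_based_agent(location, visible_status):
      if visible_status is not None:
          internal_model[location] = visible_status  # update from the sensor
      believed_status = internal_model.get(location, "unknown")
      return "clean" if believed_status == "dirty" else "move"

  print(model_based_agent("A", "dirty"))  # sees that A is dirty -> "clean"
  print(model_based_agent("A", None))     # can't see A; uses last known state -> "clean"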

Limitations:

  • Requires some knowledge of the "goal" in order to accurately use state info and make good decisions.
  • However, this agent type doesn't really have a means to account for the goal, so it's effectively useless outside of theoretical exercises.

Goal-Based Agent

This is effectively a "Model-Based Reflex Agent", but one that is able to take goal information into account.

  • The goal should describe a desirable state or end outcome.
  • The goal should be quantifiable in some manner, so the agent can determine which states/actions will progress towards it (see the sketch below).
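
A rough sketch of using a quantifiable goal: the agent scores each candidate action by how much closer its predicted next state gets to the goal state. The grid world and distance measure are assumptions made for illustration.

  # Hypothetical sketch of a goal-based agent on a grid: pick the action whose
  # predicted next state is closest to the goal state.
  GOAL = (5, 5)
  ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

  def distance_to_goal(state):
      return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])

  def goal_based_agent(state):
      def next_state(action):
          dx, dy = ACTIONS[action]
          return (state[0] + dx, state[1] + dy)
      return min(ACTIONS, key=lambda action: distance_to_goal(next_state(action)))

  print(goal_based_agent((0, 0)))  # -> "up" (ties with "right"; dict order decides)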

Limitations:

  • Different action sequences can lead to the same ultimate outcome, but with different levels of efficiency.
  • As with the above, this agent type technically doesn't account for this, so it's effectively useless outside of theoretical exercises.

Utility-Based Agent

This is the most advanced of the "basic" agent types.

  • Has a "utility function" of some kind.
    • This function maps a state (or sequence of states) to a real number.
    • This real number describes a "degree of happiness" with the outcome.
    • The agent attempts to increase this number by getting as close as possible to the highest-valued states.
  • Actions taken should attempt to maximize this function; ideally, the agent maximizes it at each individual step (see the sketch below).
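
A minimal, hypothetical sketch of a utility function: each candidate state maps to a real-valued "degree of happiness", and the agent picks the action leading to the highest-utility state. The weights and candidate states are invented.

  # Hypothetical sketch of a utility-based agent: a utility function maps
  # states to real numbers, and the agent picks the action that maximizes it.
  def utility(state):
      # "Degree of happiness": invented weights for safety and speed.
      return 0.7 * state["safety"] + 0.3 * state["speed"]

  candidate_actions = {
      "slow_lane": {"safety": 0.9, "speed": 0.4},
      "fast_lane": {"safety": 0.6, "speed": 0.9},
  }

  best = max(candidate_actions, key=lambda a: utility(candidate_actions[a]))
  print(best)  # -> "slow_lane" (utility 0.75 vs 0.69)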

Limitations:

  • The agent still only knows exactly what it was programmed to handle, and cannot learn new functionality on its own.

Learning Agent

This agent type can technically be applied to all of the above agents.
However, it's by far most commonly combined with a Utility-Based Agent, so that learned values can be mapped to a "degree of happiness" indicating how close or far new actions or states are from the objective.

  • Generally uses some form of feedback to learn and improve performance.
  • Ex: Neural Networks, which adjust weights to learn how to categorize new items appropriately (see the sketch below).
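
As a rough sketch of feedback-driven learning, the example below nudges a single numeric weight after each observed outcome, loosely mirroring how a neural network adjusts its weights; the data and learning rate are made up, and real learning agents are far more elaborate.

  # Hypothetical sketch of a learning agent: a single weight is adjusted from
  # feedback so that predictions of the "degree of happiness" improve over time.
  weight = 0.0
  learning_rate = 0.1

  def predict(feature):
      return weight * feature

  # Made-up feedback: pairs of (feature value, observed outcome score).
  feedback = [(1.0, 0.5), (2.0, 1.0), (3.0, 1.5)]

  for feature, observed in feedback * 50:
      error = observed - predict(feature)
      weight += learning_rate * error * feature  # simple gradient-style update

  print(round(weight, 2))  # converges near 0.5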