Artificial Intelligence/Intelligent Agents

From Dev Wiki

This page discusses the general ideas behind an intelligent agent, as well as its overall implementation.


Terminology

  • PEAS - An abbreviation to remember the "basics" of a functioning agent.
    • Stands for "Performance Measures, Environment, Actuators, Sensors"
  • Sensor - A device that takes some form of input from the surrounding environment.
  • Actuator - A device that achieves physical movement of some kind, by converting energy into mechanical force.
  • Percept - An agent's perceptual inputs at a given time.
  • Percept Sequence - The complete history of everything the agent has perceived, in order.
  • Agent Function - A mathematical function that maps each percept sequence to an action.
    • This "Agent Function" is effectively the theoretical implementation of the agent.
  • Agent Program - A concrete implementation of an agent, within some physical system.
    • This "Agent Program" is effectively the practical implementation of the agent.
  • Degree of Happiness - Maps the current state or outcome to some real number. This number is compared to an ultimate objective number, which indicates having reached the end-goal and/or optimal state.
    • Effectively another way of saying "degree of success".
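
As a rough illustration of the "Agent Function" and "Agent Program" terms, the sketch below writes an agent function as an explicit lookup table over percept sequences, next to an agent program that computes the same answers from code. The two-square vacuum world, the percept format, and every name here are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# A percept is (location, status), e.g. ("A", "Dirty"). Hypothetical format.
Percept = Tuple[str, str]

# Agent function: an explicit table from percept sequences to actions.
# Only a few sequences are listed; a complete table grows without bound.
AGENT_FUNCTION: Dict[Tuple[Percept, ...], str] = {
    (("A", "Dirty"),): "Suck",
    (("A", "Clean"),): "Right",
    (("A", "Dirty"), ("A", "Clean")): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}


def agent_program(percepts: List[Percept]) -> str:
    """Agent program: computes an action instead of looking it up."""
    location, status = percepts[-1]
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"
```

For the sequences listed in the table, the program and the table agree. The program is the practical form because it does not need to enumerate every possible history.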


What is an Agent?

As mentioned in Artificial Intelligence, an "Intelligent Agent" is an entity that can act.
It generally acts to achieve the "best possible" outcome, and uses information from the surrounding environment in some manner.


To elaborate, an Intelligent Agent acquires environmental information via some form of sensors.
And it often acts upon that environment through some sort of actuators.


Said actions will depend on the agent's programming, as well as the current input from its sensors.
It may process this input using one (or multiple) of the following:

  • Reasoning via probabilities and statistics.
  • Decisions via Bayesian Networks, analysis of observation (sensory) sequences, or other methods.
  • Learning via training from statistical models, neural networks, and other data-driven methods.


Types of Sensors

A sensor is anything that detects input corresponding to one of the 5 human senses (sight, touch, smell, hearing, taste), or a value that humans can't inherently detect. Examples include:

  • Cameras - Mimic "sight" in the form of continuous image sequences.
  • Microphones - Mimic "hearing" in the form of audio waveform sequences.
  • Thermometers - Mimic "touch" in the form of detecting temperature.
  • Current & Voltage Sensors - Detect electricity flowing through a given circuit.
  • Chemical & Gas Sensors - Detect environmental composition.
  • Etc


What sensor an AI uses heavily depends on the AI's application, as well as the environment it will reside in.
For example, an AI that exists within a closed, indoor space may never need a thermometer.
Meanwhile, a self-driving car AI may need a thermometer to help detect the type of weather it's driving in.

Types of Actuators

An actuator is anything that exerts some sort of force on the agent or the surrounding environment.

For example, an agent might have an actuator in the form of:

  • A robotic arm that grabs or manipulates objects.
  • A wheel for a self-driving car, which allows the car to move.

Evaluating Performance

We generally use objective, quantifiable metrics to describe the level of success for a given agent's behavior.
The exact details of this measurement are often called "Performance Measures".


For AI meant to have rational behavior, we tend to judge each action based on the following:

  • Prior knowledge of environment.
  • Possible actions that can be performed.
  • Percept sequence at the time of the action that was taken.

Ideally, for each possible percept sequence, the agent will take an action that will maximize objective success.
Prior knowledge, as well as current understanding of the environment, can play a large role in this.
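
A minimal sketch of one possible performance measure, assuming a two-square vacuum world where the agent earns one point per clean square at each time step. The state format and the function name `performance_measure` are hypothetical and chosen only for this example.

```python
from typing import Dict, List


def performance_measure(history: List[Dict[str, str]]) -> int:
    """Award one point per clean square at each time step."""
    return sum(
        1
        for state in history
        for status in state.values()
        if status == "Clean"
    )


# Environment states observed over three time steps.
history = [
    {"A": "Dirty", "B": "Dirty"},
    {"A": "Clean", "B": "Dirty"},
    {"A": "Clean", "B": "Clean"},
]
score = performance_measure(history)  # 0 + 1 + 2 = 3
```

Scoring the environment's states, and not the agent's own actions, keeps the measure objective: the agent cannot raise its score without actually improving the environment.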


Agent Knowledge Properties

Below are various properties that affect an agent's sensory input and overall environmental knowledge.
They should be taken into account when trying to evaluate the performance of an agent.


Environments can be "Fully Observable", "Partially Observable", or "Unobservable":

  • Fully Observable - Sensors can detect all aspects of environment that are relevant to the agent's actions.
  • Unobservable - Sensors are broken or otherwise unable to gather data relevant to the agent's actions.
  • Partially Observable - Some combination of the above two states.


Sensor data can be "Episodic" or "Sequential":

  • Episodic - One percept leads to exactly one action.
    • Agents with episodic sensory data tend to not retain memory of past percepts.
    • For example, a robotic arm at a factory only cares whether an item is present or not at the given moment.
  • Sequential - Multiple percepts (aka, multiple moments in time) are used together to determine an action.
    • Agents with sequential sensory data have to retain some memory of past percepts.
    • For example, a self-driving car needs to, at minimum, retain recent history of the past few seconds, in order to determine road state, nearby traffic, etc.
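
The difference between the two can be sketched as follows. The factory arm and the braking rule are hypothetical simplifications, and the names `episodic_arm` and `SequentialBrakeAgent` are assumptions for this example only.

```python
from collections import deque
from typing import Deque


def episodic_arm(item_present: bool) -> str:
    """Episodic: the action depends only on the current percept."""
    return "grab" if item_present else "wait"


class SequentialBrakeAgent:
    """Sequential: keeps a short window of recent distance readings."""

    def __init__(self, window: int = 3) -> None:
        self.distances: Deque[float] = deque(maxlen=window)

    def act(self, distance_m: float) -> str:
        self.distances.append(distance_m)
        if len(self.distances) < 2:
            return "hold"  # not enough history to judge a trend yet
        # Brake when the gap to the vehicle ahead shrinks across every
        # pair of consecutive readings in the window.
        readings = list(self.distances)
        closing = all(b < a for a, b in zip(readings, readings[1:]))
        return "brake" if closing else "hold"
```

The bounded `deque` reflects the idea of retaining only recent history: old readings fall out of the window automatically.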


On taking an action, the resulting state can have one of three properties:

  • Deterministic - The next state is always determined with 100% certainty, by a combination of the current state plus the agent's action.
    • In other words, upon taking an action, the next state is always fully predictable.
    • Ex: A robotic arm in a controlled factory setting. It will know what the resulting state of the environment will be, before taking an action.
  • Non-deterministic - The next state cannot be determined from the current state and the agent's action alone. The agent will need to take an action and then check the environment to find out the resulting state.
    • Ex: A self-driving car may turn left. But that only tells the car what it's doing. It has no idea what other entities on the road will do, regardless of its action.
  • Stochastic - The next state is uncertain, but quantifiable by probabilities.
    • Ex: A vacuum cleaning a square has a 90% chance to clean it, and a 10% chance to spread debris to nearby locations.
      • In such a case, the next state is uncertain, but the agent can make intelligent assumptions, based on the likelihood of what will happen from each action.
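
As a sketch of how a stochastic outcome can be handled, the code below samples a next state from known probabilities and computes an expected value across outcomes. The 90% / 10% split comes from the vacuum example; the outcome names, the numeric values assigned to them, and all function names are assumptions for illustration.

```python
import random
from typing import Dict

# Hypothetical transition model for cleaning one dirty square.
SUCK_OUTCOMES: Dict[str, float] = {
    "square_cleaned": 0.9,
    "debris_spread": 0.1,
}


def deterministic_next(state: int, action: int) -> int:
    """Deterministic: the next state follows directly from state + action."""
    return state + action


def stochastic_next(outcomes: Dict[str, float], rng: random.Random) -> str:
    """Stochastic: sample the next state from known probabilities."""
    names = list(outcomes)
    weights = [outcomes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def expected_value(outcomes: Dict[str, float], values: Dict[str, float]) -> float:
    """Average the value of each outcome, weighted by its probability."""
    return sum(p * values[name] for name, p in outcomes.items())
```

If a cleaned square is valued at +1 and spread debris at -1, the expected value of the action is 0.9 * 1 + 0.1 * (-1) = 0.8. An agent can compare such numbers across actions even though any single outcome is uncertain.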


The environment can have one of several properties, as an agent determines its next action:

  • Static - The environment does not change while the agent considers its next action.
    • Ex: An agent designed to play a game of solitaire. The board will not change until the agent takes an action.
  • Dynamic - The environment may change while the agent considers its next action.
    • Ex: A self-driving car. Nearby entities will act, regardless of how long the current agent takes to determine an action.
  • Semi-Dynamic - The environment technically does not change while the agent considers its next action. BUT the agent's overall performance score can change depending on how long it takes to decide.


Related to the above:

  • Discrete Time - The environment is split into distinct steps.
    • Ex: A card game.
  • Continuous Time - The environment is a continuous stream of input.
    • Ex: A self-driving car on the road.


Lastly, the number of other agents in the environment also makes an impact:

  • Single Agent Environment - Only one agent exists in the environment. No consideration is needed for other entities.
  • Multi-Agent Environment - Two or more agents exist in the same environment. They may or may not need to account for each other in their actions.
    • Multi-Agent Environments can be further divided into the following:
      • Cooperative Multi-Agent Environment - Two or more agents exist in the same environment, and work together to achieve some goal.
      • Competitive Multi-Agent Environment - Two or more agents exist in the same environment, and directly oppose each other in achieving goals.


Basic Agent Types

The following are general types of simple/basic agents.
Each one generally builds on the previous.

Simple Reflex Agent

This is the simplest agent type.

  • Selects actions based on the current percept.
  • Ignores percept history. Aka, only the current state in time is considered.
  • Generally implemented via "condition -> action" (aka if-then-else) rules.

Limitations:

  • Generally requires the environment be fully observable.
  • A correct decision can only be made on the basis of the current percept. Applications that require two or more percepts can't use this agent type.
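
A minimal sketch of "condition -> action" rules, assuming a two-square vacuum world with percepts of the form (location, status). The rule table and the name `simple_reflex_agent` are hypothetical.

```python
from typing import Callable, List, Tuple

Percept = Tuple[str, str]  # (location, status); hypothetical format
Rule = Tuple[Callable[[Percept], bool], str]

# Rules are checked in order; the first matching condition wins.
RULES: List[Rule] = [
    (lambda p: p[1] == "Dirty", "Suck"),
    (lambda p: p[0] == "A", "Right"),
    (lambda p: p[0] == "B", "Left"),
]


def simple_reflex_agent(percept: Percept) -> str:
    """Pick an action from the current percept only; no history is kept."""
    for condition, action in RULES:
        if condition(percept):
            return action
    return "NoOp"  # no rule matched
```

Because the function takes a single percept and stores nothing, calling it twice with the same percept always returns the same action.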

Model-Based Reflex Agent

Similar to a "Simple Reflex Agent", but:

  • Allows the environment to be only partially observable.
  • Keeps track of the part of the environment it cannot currently see, using the last known state.
    • In other words, it tracks minimal state-history, in order to make more informed decisions.

Limitations:

  • Requires some knowledge of the "goal", in order to accurately use state info and make good decisions.
  • However, this agent technically doesn't really have means to account for the goal. So it's effectively useless outside of theoretical exercises.
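
The state tracking described above can be sketched as follows, again assuming a two-square vacuum world where the agent only senses the square it is standing on. The class name `ModelBasedVacuum` and its internal model are assumptions for this example.

```python
from typing import Dict, Tuple


class ModelBasedVacuum:
    """Remembers the last known status of each square."""

    def __init__(self) -> None:
        # Internal model: last known status per location.
        self.model: Dict[str, str] = {"A": "Unknown", "B": "Unknown"}

    def act(self, percept: Tuple[str, str]) -> str:
        location, status = percept
        self.model[location] = status  # update the model from the percept
        if status == "Dirty":
            self.model[location] = "Clean"  # predicted effect of "Suck"
            return "Suck"
        other = "B" if location == "A" else "A"
        if self.model[other] == "Clean":
            return "NoOp"  # both squares are believed clean
        return "Right" if location == "A" else "Left"
```

Unlike a simple reflex agent, this one can stop moving once its model says both squares are clean, even though it can only see one square at a time.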

Goal-Based Agent

Is effectively a "Model-Based Reflex Agent", but able to take goal information into account.

  • Goal should describe a desirable state or ending outcome.
  • Goal should be quantifiable in some manner, so the agent can determine which states/actions will progress towards said goal.

Limitations:

  • Different action sequences can lead to the same ultimate outcome, but with different levels of efficiency.
  • Similarly to above, this agent type technically doesn't account for this. So it's effectively useless outside of theoretical exercises.
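
One common way to act toward a goal is to search for a sequence of actions that reaches it. The sketch below uses breadth-first search over a small hypothetical map; the graph, the location names, and the function `plan_to_goal` are assumptions, and other search methods would work equally well.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical map: each location lists the locations reachable in one move.
GRAPH: Dict[str, List[str]] = {
    "start": ["hall", "yard"],
    "hall": ["lab"],
    "yard": ["shed"],
    "shed": ["lab"],
    "lab": [],
}


def plan_to_goal(
    graph: Dict[str, List[str]], start: str, goal: str
) -> Optional[List[str]]:
    """Breadth-first search: returns a path with the fewest moves, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for neighbor in graph.get(path[-1], []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None
```

Here both "start, hall, lab" and "start, yard, shed, lab" satisfy the goal. The goal test alone treats them as equally good. Breadth-first search happens to return the one with fewer moves, but it has no notion of how costly or desirable each move is.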

Utility-Based Agent

This is the most advanced of the "basic" agent types.

  • Has a "utility function" of some kind.
    • This function maps a state (or sequence of states) to a real number.
    • The agent attempts to increase the value of this function by reaching states that score as highly as possible.
    • This real number describes a "degree of happiness" with the final outcome.
  • Actions taken should attempt to maximize this function. Ideally it attempts to maximize at each individual step.

Limitations:

  • Agent still only knows exactly what it was programmed to handle, and cannot learn more functionality on its own.
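
A minimal sketch of step-by-step utility maximization, assuming a hypothetical thermostat whose states are temperatures in Celsius. The utility function `comfort`, the target temperature, and the action table are all invented for illustration.

```python
from typing import Callable, Dict


def choose_action(
    state: float,
    actions: Dict[str, Callable[[float], float]],
    utility: Callable[[float], float],
) -> str:
    """Return the action whose predicted next state has the highest utility."""
    return max(actions, key=lambda name: utility(actions[name](state)))


TARGET_C = 21.0  # assumed target temperature


def comfort(temp_c: float) -> float:
    """Utility: rises toward zero as the temperature nears the target."""
    return -abs(temp_c - TARGET_C)


# Each action maps the current state to a predicted next state.
ACTIONS: Dict[str, Callable[[float], float]] = {
    "heat": lambda t: t + 1.0,
    "cool": lambda t: t - 1.0,
    "idle": lambda t: t,
}
```

At 18.0 degrees the predicted utilities are -2.0 for "heat", -4.0 for "cool", and -3.0 for "idle", so `choose_action(18.0, ACTIONS, comfort)` picks "heat". This is the greedy, one-step form of maximization mentioned above.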

Learning Agent

This agent type can technically be applied to all of the above agents.
However, it's by far most commonly paired with a Utility-Based Agent, so that what it learns can be mapped to a "degree of happiness", indicating how close or far the new actions or states are from the objective.

  • Generally uses some form of feedback to learn and improve performance.
  • Ex: Neural Networks which adjust weights to learn how to categorize new items appropriately.
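
As a much simpler stand-in for a neural network, the sketch below shows the feedback idea with a running reward estimate per action. The class name `LearningAgent`, the learning rate, and the reward values are assumptions. The agent here is purely greedy and never explores, which a practical learner would usually need to do.

```python
from typing import Dict, List


class LearningAgent:
    """Keeps a running estimate of each action's reward and improves it."""

    def __init__(self, actions: List[str], learning_rate: float = 0.5) -> None:
        self.values: Dict[str, float] = {a: 0.0 for a in actions}
        self.learning_rate = learning_rate

    def best_action(self) -> str:
        """Return the action with the highest estimated reward so far."""
        return max(self.values, key=self.values.get)

    def learn(self, action: str, reward: float) -> None:
        """Move the estimate a fraction of the way toward the observed reward."""
        error = reward - self.values[action]
        self.values[action] += self.learning_rate * error
```

Each call to `learn` is one round of feedback: the gap between the observed reward and the current estimate plays the same role as the error signal used to adjust weights in a neural network.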