<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Ghost API]]></title><description><![CDATA[Thoughts, stories and ideas.]]></description><link>https://api.loudpumpkins.com/</link><image><url>https://api.loudpumpkins.com/favicon.png</url><title>Ghost API</title><link>https://api.loudpumpkins.com/</link></image><generator>Ghost 5.82</generator><lastBuildDate>Thu, 30 Apr 2026 07:55:32 GMT</lastBuildDate><atom:link href="https://api.loudpumpkins.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Heuristic Search Algorithms]]></title><description><![CDATA[<p>In the last article, we saw how a <strong>problem-solving </strong>agent could use search trees to look ahead and find a sequence of actions that will solve a problem, but we faced a drastic issue: <strong>uninformed search trees grow too fast!</strong></p><p>This article will see how we can pick more promising</p>]]></description><link>https://api.loudpumpkins.com/heuristic-search-algorithms/</link><guid isPermaLink="false">668369f017478208e8b21beb</guid><dc:creator><![CDATA[Loud Pumpkins]]></dc:creator><pubDate>Tue, 02 Jul 2024 02:50:58 GMT</pubDate><media:content url="https://api.loudpumpkins.com/content/images/2024/07/featured.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://api.loudpumpkins.com/content/images/2024/07/featured.jpg" alt="Heuristic Search Algorithms"><p>In the last article, we saw how a <strong>problem-solving </strong>agent could use search trees to look ahead and find a sequence of actions that will solve a problem, but we faced a drastic issue: <strong>uninformed search trees grow too fast!</strong></p><p>This article will see how we can pick more promising paths first and put aside the least likely ones for last, known as a 
<strong>best-first </strong>search (of which <strong>uniform-cost</strong> search is a special case). The idea is that while breadth-first search spreads out in waves of uniform depths (depth 1 -&gt; depth 2 -&gt; &#x2026; ), best-first search spreads out in waves of uniform evaluations ( f(n) ).</p><h2 id="best-first-search"><strong>Best-first Search</strong></h2><p>Best-first search is a very general approach in which we choose a node, n, with the <strong>minimum </strong>value of some <strong>evaluation function</strong>, f(n). On each iteration, we choose a node on the frontier with a <strong>minimum f(n)</strong> value, return it if its state is a goal state, and otherwise expand it to generate child nodes. Each child node is added to the frontier if it has not been reached before, or is re-added if it is reached from a less costly path. By employing different evaluation functions, we get different specific algorithms, which we discuss in this article.</p><pre><code class="language-python">import heapq
import itertools
from typing import Callable, Optional

# Node and Problem are the classes implemented in the last article.
def best_first_search(problem: Problem, f: Callable) -&gt; Optional[Node]:
   &quot;&quot;&quot;
   Takes in an implemented `Problem` and returns a `Node` once a solution is
   found or `None` otherwise. Uses a priority queue to always explore nodes
   in the frontier with the lowest f(n) value.

   This algorithm will track reached/explored nodes and will insert a node into
   the frontier if that node has not been reached before, or if the new path has
   a lower cost. To avoid duplicates in the frontier, use an A* search with a
   monotonic heuristic.

   :param problem: Implemented Problem class
   :param f: Callable evaluation function that takes in 1 argument; Node
   :return: Node or None
   &quot;&quot;&quot;
   explored = {}
   frontier = []

   root = Node(problem.initial_state)
   counter = itertools.count()  # tie breaker for heapq
   heapq.heappush(frontier, (f(root), next(counter), root))

   while frontier:
      node: Node = heapq.heappop(frontier)[2]
      if problem.is_goal(node.state):
         return node
      for child in problem.expand(node):
         if child.state in explored \
            and child.path_cost &gt;= explored[child.state].path_cost:
            # this state was already reached by a path at least as cheap
            continue
         explored[child.state] = child
         heapq.heappush(frontier, (f(child), next(counter), child))
   return None</code></pre><h2 id="heuristic-function"><strong>Heuristic Function</strong></h2><p>A heuristic function will take a node and <strong>evaluate how close it is to the solution</strong>. It uses domain-specific clues to estimate the cheapest path from the given node to a goal. We typically want heuristics that <strong>underestimate </strong>the actual cost ( optimistic heuristic ) rather than <strong>overestimating </strong>( pessimistic heuristic ). If the heuristic is too high, then we don&#x2019;t explore nodes, and we might miss a low-cost solution, but if the estimate is too low, then we tend to expand nodes and then later discard them when the actual cost for the node starts adding up.</p><p>Recall our fictional map of towns from the last article, for which we will create an example of an optimistic heuristic function.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/state_space_graph-2.jpg" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="1045" height="675" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/state_space_graph-2.jpg 600w, https://api.loudpumpkins.com/content/images/size/w1000/2024/07/state_space_graph-2.jpg 1000w, https://api.loudpumpkins.com/content/images/2024/07/state_space_graph-2.jpg 1045w" sizes="(min-width: 720px) 720px"></figure><p>Our heuristic will return the straight line distance between any node and the goal from that map. And we use that distance as an <strong>estimate </strong>to evaluate a node. 
It is <strong>optimistic </strong>because a straight line between two points will always be less than or equal to any other path.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/heuristic_map.jpg" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="1053" height="628" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/heuristic_map.jpg 600w, https://api.loudpumpkins.com/content/images/size/w1000/2024/07/heuristic_map.jpg 1000w, https://api.loudpumpkins.com/content/images/2024/07/heuristic_map.jpg 1053w" sizes="(min-width: 720px) 720px"></figure><h2 id="greedy-best-first-search"><strong>Greedy Best-first Search</strong></h2><p>A <strong>greedy best-first search</strong> is a form of best-first search that expands the node with the lowest heuristic value or, in other words, the node that appears to be the most promising. And recall that a <strong>best-first search</strong> algorithm will pick the node with the lowest evaluation function. So, a greedy best-first search is a best-first search where <strong>f(n) = h(n)</strong>.</p><p>As an example, we will look for a path from A to Z using the greedy approach.</p><p>Our heuristic values:</p>
<!--kg-card-begin: html-->
<table style="box-sizing: inherit; border-spacing: 0px; border-collapse: collapse; background-color: transparent; margin: 0px 0px 1.5em; width: 1408px;"><tbody style="box-sizing: inherit;"><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(A) = 366</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(E) = 380</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(J) = 100</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(N) = 80</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(R) = 226</td></tr><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(B) = 374</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(F) = 176</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(K) = 160</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(O) = 77</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(S) = 161</td></tr><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(C) = 253</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(G) = 193</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(L) = 241</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(P) = 199</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(T) = 234</td></tr><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(D) = 329</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(H) = 244</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(M) = 242</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(Q) = 151</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">h(Z) = 
0</td></tr></tbody></table>
<!--kg-card-end: html-->
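<p>To make the evaluation functions concrete, here is a minimal sketch that transcribes the heuristic table above into a dictionary and defines the f(n) callables that would be passed to best_first_search. The tiny Node stand-in is an assumption for illustration (the real class comes from the last article), and the map&#x2019;s edge costs are omitted since they only appear in the figure.</p>

```python
from dataclasses import dataclass

# Heuristic values transcribed from the table above.
h = {'A': 366, 'B': 374, 'C': 253, 'D': 329, 'E': 380,
     'F': 176, 'G': 193, 'H': 244, 'J': 100, 'K': 160,
     'L': 241, 'M': 242, 'N': 80,  'O': 77,  'P': 199,
     'Q': 151, 'R': 226, 'S': 161, 'T': 234, 'Z': 0}

@dataclass
class Node:  # minimal stand-in for the Node class of the last article
    state: str
    path_cost: int = 0  # g(n): cost incurred to reach this node

def greedy_f(node: Node) -> int:
    """Greedy best-first search evaluates nodes by f(n) = h(n)."""
    return h[node.state]

def a_star_f(node: Node) -> int:
    """A* search evaluates nodes by f(n) = g(n) + h(n)."""
    return node.path_cost + h[node.state]

# From A's successors B, C and D, greedy search picks C first: the node
# that merely *looks* closest to the goal Z.
frontier = [Node('B'), Node('C'), Node('D')]
print(min(frontier, key=greedy_f).state)  # -> C
```

<p>Passing greedy_f or a_star_f as the f argument of best_first_search yields the two algorithms discussed next.</p>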
<p>The algorithm will create a search tree internally, just like uninformed searches. Still, I find it easier to superimpose the search tree onto the state graph to better demonstrate how the frontier is expanded closer and closer toward the goal.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/greedy_best-first_search_graph.jpg" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="1156" height="301" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/greedy_best-first_search_graph.jpg 600w, https://api.loudpumpkins.com/content/images/size/w1000/2024/07/greedy_best-first_search_graph.jpg 1000w, https://api.loudpumpkins.com/content/images/2024/07/greedy_best-first_search_graph.jpg 1156w" sizes="(min-width: 720px) 720px"></figure><p>We first expand our initial node A, and we get three successor nodes: B, C and D. Looking at our heuristic values table, we determine that C is the closest to the goal, so we expand it and add its successors to the frontier. We continue to do so until we reach the goal. Notice that the algorithm did not expand a single node that is not on the path to the solution. However, the solution found is <strong>not the optimal solution</strong>. The optimal path is &#x2018;go_to_C&#x2019;, &#x2018;go_to_G&#x2019;, &#x2018;go_to_J&#x2019;, &#x2018;go_to_Z&#x2019;, but our algorithm did not find it. Greedy best-first search is concerned with neither the cost already incurred nor the cost required to reach a promising node. It just expands the most promising node from the frontier, and this is why it&#x2019;s called a <strong>greedy approach</strong>.</p><p>When using this approach, we keep track of all visited nodes, and so, the worst-case space and time complexity is proportional to the size of the state space. 
And, assuming that the state space is finite, greedy search is a <strong>complete</strong> search algorithm.</p><h2 id="a-search"><strong>A* search</strong></h2><p>A more common approach to solving search problems is the <strong>A-star search</strong>, which is a best-first search that employs the heuristic function as well. But unlike greedy best-first search, A-star search will include the cost incurred to reach a node in conjunction with the heuristic value. And so, its evaluation function is <strong>f(n) = g(n) + h(n)</strong> where g(n) is the cost to reach a node n from the initial state and h(n) is the heuristic value of that same node.</p><p>In other words, <strong>the evaluation function returns the exact cost to reach node n plus an estimate of the cost remaining to reach the goal from node n</strong>.</p><p>We will do another example using the same heuristic table and search problem as we did for the greedy best-first search, but this time using A* search.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/a_star_search_graph.jpg" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="1095" height="667" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/a_star_search_graph.jpg 600w, https://api.loudpumpkins.com/content/images/size/w1000/2024/07/a_star_search_graph.jpg 1000w, https://api.loudpumpkins.com/content/images/2024/07/a_star_search_graph.jpg 1095w" sizes="(min-width: 720px) 720px"></figure><p>There are a few things to note here. Contrary to the greedy search, A-star&#x2019;s frontier expands conservatively yet steadily towards the goal. Also, notice that we had to explore more nodes to find a solution because A* search is more cautious. It also found the less optimal solution when it expanded node F (4th graph) with an f(n) value of 450, but another node, J, still offered a possible path to the goal with an f(n) value of 416. 
And so, the algorithm explored node J before returning the solution that it had already found. Sure enough, the path through J led to another, better solution. In fact, it&#x2019;s the cost-optimal solution.</p><p>Whether A* is cost-optimal depends entirely on the heuristic function selected. Two properties matter here: being <strong>admissible</strong> and being <strong>monotonic</strong>.</p><h3 id="admissibility"><strong>Admissibility</strong></h3><p>An <strong>admissible heuristic</strong> is one that never overestimates the cost to reach a goal. And as long as your heuristic is admissible, you will get a cost-optimal solution. The intuition behind it is that an admissible heuristic never exceeds the true <strong>minimum cost</strong>, so at the very least, you will need to spend g(n) + h(n) to go down path n. If you found a solution that costs less than all the evaluation values in the frontier, you are guaranteed to have found the optimal solution. This can be shown more formally through a <strong>proof by contradiction</strong>.</p><p>Suppose an optimal solution exists with a cost C*, but our algorithm returned another solution with a cost C, greater than C*. Then that means a node n that is on the optimal path has not been expanded. And because we are using a best-first search algorithm, it must mean that the evaluation value of that node was at least C, and thus, greater than C* (otherwise, we would have expanded it). 
For this proof, we denote c(n, z) as the optimal cost from node n to the goal.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/image-2.png" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="703" height="90" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/image-2.png 600w, https://api.loudpumpkins.com/content/images/2024/07/image-2.png 703w"></figure><p>by A* definition, we get</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/image-3.png" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="671" height="60" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/image-3.png 600w, https://api.loudpumpkins.com/content/images/2024/07/image-3.png 671w"></figure><p>because h(n) is admissible, we get</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/image-4.png" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="656" height="110" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/image-4.png 600w, https://api.loudpumpkins.com/content/images/2024/07/image-4.png 656w"></figure><p>because n is on the optimal path, the cost to reach n in conjunction with the optimal cost to reach the goal from n, we get the optimal cost, C*.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/image-5.png" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="621" height="124" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/image-5.png 600w, https://api.loudpumpkins.com/content/images/2024/07/image-5.png 621w"></figure><p>The first and last lines form a contradiction, so it is not possible to have a node on the optimal path unexplored, and thus, impossible to return a suboptimal solution.</p><h3 
id="monotonicity"><strong>Monotonicity</strong></h3><p>A stronger property of a heuristic function is <strong>monotonicity</strong>, sometimes called <strong>consistency</strong>. A heuristic is monotonic if the evaluation function is non-decreasing as we explore a path. Or in other words, it means that if the heuristic evaluates a node at a particular value, not only will it be an optimistic estimate, but it is also the minimum cost to pursue a solution through that node.</p><p>More formally, we have:</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/image-6.png" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="658" height="66" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/image-6.png 600w, https://api.loudpumpkins.com/content/images/2024/07/image-6.png 658w"></figure><p>Where n is any node and n&#x2019; is any of its successors. This inequality must hold for every node, so it&#x2019;s true for its successors n&#x2019; as well. And so we have:</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/image-7.png" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="727" height="72" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/image-7.png 600w, https://api.loudpumpkins.com/content/images/2024/07/image-7.png 727w" sizes="(min-width: 720px) 720px"></figure><p>And from here, you can see a pattern. Once f(n) is evaluated along a path, it can only increase or stay the same. Hence the name monotonic.</p><h3 id="admissible-vs-monotonic"><strong>Admissible vs monotonic</strong></h3><p>Every monotonic heuristic is admissible, but not every admissible heuristic is monotonic. And so, just like an admissible heuristic, a monotonic heuristic will return a cost-optimal solution. 
But in addition, with a monotonic heuristic, the first time we expand a node, we will have reached it along an optimal path, so we never have to re-add a node to the frontier.</p><p>If it cannot be shown that a heuristic is monotonic, then we cannot guarantee that every node in the frontier has the lowest evaluation value. We may reach a node in the frontier from a different path with a lower cost. Suppose a heuristic is admissible but not monotonic. In that case, we have to keep track of all the nodes in the frontier and update their values, parent and successors when we find a better path, or we must introduce duplicate nodes in our frontier. For this reason, it&#x2019;s best to use a monotonic heuristic if possible.</p><h3 id="inadmissible-heuristics"><strong>Inadmissible heuristics</strong></h3><p>As powerful as an A* search can be, it can sometimes have a high space and time complexity. An admissible heuristic cannot take risks because it needs to guarantee a minimal cost. This restriction can produce conservative heuristic values that rank many successors equally. In such a case, an A* search can degenerate into a breadth-first search. But if we are willing to accept the possibility of a suboptimal solution, we may use an inadmissible heuristic: a heuristic that may overestimate the cost but does a much better job of estimating a solution&#x2019;s cost.</p><h2 id="inventing-heuristic-functions"><strong>Inventing heuristic functions</strong></h2><p>Sometimes inventing a good heuristic can be challenging, so I want to pass along a helpful tip before I end this article. Take an 8-puzzle problem as an example.</p><p>For those not familiar with an 8-puzzle problem, the premise is simple. You are presented with a board that can fit nine tiles, but only eight are present. 
You must rearrange the tiles until the numbers are sorted, like in the goal image below.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/8Puzzle.jpg" class="kg-image" alt="Heuristic Search Algorithms" loading="lazy" width="1031" height="174" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/8Puzzle.jpg 600w, https://api.loudpumpkins.com/content/images/size/w1000/2024/07/8Puzzle.jpg 1000w, https://api.loudpumpkins.com/content/images/2024/07/8Puzzle.jpg 1031w" sizes="(min-width: 720px) 720px"></figure><p>There is only one rule:</p><p><strong>You can move tile X to position Y if Y is adjacent to X and if Y is empty.</strong></p><p>A concrete implementation of an 8-Puzzle problem</p><pre><code class="language-python">class Puzzle(Problem):
   &quot;&quot;&quot;
   The values of the state member variables of instances of this class are
   represented as a string as follows:

   &quot;12345678_&quot; where &apos;_&apos; represents the empty tile.

   The initial state of the Puzzle problem is completely random.
   &quot;&quot;&quot;

   def actions(self, state):
      &quot;&quot;&quot;
      Actions labeled with &apos;U&apos;, &apos;D&apos;, &apos;L&apos; or &apos;R&apos; indicate the direction taken
      by the blank space.

      eg: +-------+      +-------+
         | _ 2 3 |      | 1 2 3 |
         | 1 4 6 | -D-&gt; | _ 4 6 |
         | 7 5 8 |      | 7 5 8 |
         +-------+      +-------+
      &quot;&quot;&quot;
      actions = []
      index = state.index(&apos;_&apos;)
      if int(index / 3) &gt; 0:
         # &quot;move up allowed&quot;
         actions.append(&apos;U&apos;)

      if int(index / 3) &lt; 2:
         # &quot;move down allowed&quot;
         actions.append(&apos;D&apos;)

      if index % 3 &gt; 0:
         # &quot;move left allowed&quot;
         actions.append(&apos;L&apos;)

      if index % 3 &lt; 2:
         # &quot;move right allowed&quot;
         actions.append(&apos;R&apos;)

      return actions

   def is_goal(self, state):
      &quot;&quot;&quot;
      Return True if the puzzle is solved, i.e. every tile is in its
      sorted position and the blank occupies the last cell.
      &quot;&quot;&quot;
      return state == &apos;12345678_&apos;

   def transition(self, state, action):
      &quot;&quot;&quot;
      Paths labeled with &apos;U&apos;, &apos;D&apos;, &apos;L&apos; or &apos;R&apos; indicate the direction taken by the
      blank space.

      eg: +-------+      +-------+
         | _ 2 3 |      | 1 2 3 |
         | 1 4 6 | -D-&gt; | _ 4 6 |
         | 7 5 8 |      | 7 5 8 |
         +-------+      +-------+
      &quot;&quot;&quot;
      index = state.index(&apos;_&apos;)
      if action == &apos;U&apos;:
         # &quot;move up&quot;
         new_state = list(state)
         new_state[index - 3], new_state[index] = new_state[index], new_state[index - 3]
         return &apos;&apos;.join(new_state)

      if action == &apos;D&apos;:
         # &quot;move down&quot;
         new_state = list(state)
         new_state[index + 3], new_state[index] = new_state[index], new_state[index + 3]
         return &apos;&apos;.join(new_state)

      if action == &apos;L&apos;:
         # &quot;move left&quot;
         new_state = list(state)
         new_state[index - 1], new_state[index] = new_state[index], new_state[index - 1]
         return &apos;&apos;.join(new_state)

      if action == &apos;R&apos;:
         # &quot;move right&quot;
         new_state = list(state)
         new_state[index + 1], new_state[index] = new_state[index], new_state[index + 1]
         return &apos;&apos;.join(new_state)</code></pre><p>One method to create a heuristic for this problem is to use what&#x2019;s known as a relaxed problem approach. We relax the restrictions and open more edges between nodes in the state graph. The added edges will simplify the computation needed to find a solution, and we use the solution to the simplified problem as a heuristic to our original problem. Because we are exploring the same state graph but with more paths, we are guaranteed to find an optimistic solution. For example, with the 8-puzzle problem, we can remove the restriction that Y must be adjacent to X and Y must be empty. The relaxed problem becomes:</p><p><strong>You can move tile X to position Y </strong><s>if Y is adjacent to X and if Y is empty</s>.</p><p>This heuristic is called the &#x2018;misplaced tiles&#x2019; heuristic. It will always underestimate the true cost because it is synonymous with you just lifting a tile and placing it where it belongs. For each tile that isn&#x2019;t where it is supposed to be, it will cost you at least one action to move it.</p><p>Or, we can remove the restriction that Y must be empty, and we get:</p><p><strong>You can move tile X to position Y if Y is adjacent to X </strong><s>and if Y is empty</s>.</p><p>The result is a heuristic that measures the sum of the distances of the tiles from their goal positions. Because this heuristic has more restrictions than the misplaced tiles heuristic, it is better at estimating the true cost, and thus, will find a solution quicker. But because it has fewer restrictions than the original problem, it is still an admissible heuristic.</p><p>Solving an 8-puzzle using a misplaced tiles heuristic:</p><pre><code class="language-python">def misplaced_tiles_heuristic(node: Node) -&gt; int:
   &quot;&quot;&quot;
   Return the heuristic value of a given node. The heuristic counts how
   many tiles are not in their designated position, which is a lower bound
   on the number of moves needed to solve the problem.

   :param node: Node
   :return: int
   &quot;&quot;&quot;
   heuristic = 0
   for (tile, goal) in zip(node.state, &apos;12345678_&apos;):
      # the blank (&apos;_&apos;) is not a tile, so it is never counted as misplaced
      if tile != &apos;_&apos; and tile != goal:
         heuristic += 1
   return heuristic</code></pre><p>Solution:</p><pre><code class="language-python">initial_state = &apos;_13425786&apos;
print(&apos;Initial State:&apos;)
print_puzzle(initial_state)
problem = Puzzle(initial_state)

node = a_star_search(problem, misplaced_tiles_heuristic)
if node is not None:
   solution = node.solution()
   print(&quot;Solution:&quot;, &quot;&quot;.join(solution))
else:
   print(&quot;No solution found&quot;)</code></pre><pre><code class="language-python">Initial State:
 +-------+
 | _ 1 3 |
 | 4 2 5 |
 | 7 8 6 |
 +-------+
Solution: RDRD</code></pre>]]></content:encoded></item><item><title><![CDATA[What is an intelligent agent]]></title><description><![CDATA[<h1 id="the-fundamentals-of-an-intelligent-program"><strong>The fundamentals of an intelligent program.</strong></h1><p>What is AI?</p><p>Not in the philosophical sense, &#x201C;Can only humans think? Is AI about making machines act like they are thinking? Or does thought transcend the human species, and biological organisms?&#x201D;</p><p>No. I mean the practical definition. What is AI for</p>]]></description><link>https://api.loudpumpkins.com/what-is-an-intelligent-agent/</link><guid isPermaLink="false">668368c817478208e8b21bd0</guid><dc:creator><![CDATA[Loud Pumpkins]]></dc:creator><pubDate>Tue, 02 Jul 2024 02:45:40 GMT</pubDate><media:content url="https://api.loudpumpkins.com/content/images/2024/07/1_intelligent_agent.png" medium="image"/><content:encoded><![CDATA[<h1 id="the-fundamentals-of-an-intelligent-program"><strong>The fundamentals of an intelligent program.</strong></h1><img src="https://api.loudpumpkins.com/content/images/2024/07/1_intelligent_agent.png" alt="What is an intelligent agent"><p>What is AI?</p><p>Not in the philosophical sense, &#x201C;Can only humans think? Is AI about making machines act like they are thinking? Or does thought transcend the human species, and biological organisms?&#x201D;</p><p>No. I mean the practical definition. What is AI for someone trying to solve a real-world problem?</p><p>In practice, AI is a set of design principles for building <strong>successful</strong> systems that can be called <strong>intelligent,</strong> where &#x2018;successful&#x2019; and &#x2018;intelligent&#x2019; are left for you to decide. Those systems are known as <strong>intelligent agents.</strong></p><p>An <strong>agent </strong>is anything that can be viewed as perceiving its <strong>environment</strong> through <strong>sensors</strong> and acting upon that environment through <strong>actuators</strong>. 
Think of a robot, for example: the robot itself is the agent; the cameras, microphones and thermometers are its sensors; and the stepper motors controlling the robot&#x2019;s various components are the actuators.</p><p>The environment could be anything you define to be relevant to the agent. It could be the actual world we live in, or it could be a small set of states. It&#x2019;s the part that <strong>percepts</strong> come from and that <strong>actions</strong> are sent to.</p><p>The agent will rely on the percepts perceived by the sensors and its own built-in <strong>knowledge base</strong> to decide what action to take. It may have to analyze its entire percept sequence to make an intelligent decision. This mapping from percepts to actions is what&#x2019;s known as an <strong>agent function</strong>.</p><p>The simplest implementation of this agent function is to hard-code all possible combinations of percepts and all possible combinations of sequences of those percepts and manually assign an action to each one. This implementation of the agent function is called an agent program. Admittedly, it&#x2019;s a terrible way to implement an agent function for any nontrivial problem, but it is a valid agent program.</p><p>In fact, the discussion of the various ways to implement agent functions is at the core of this site. But before we start solving problems using intelligent agents, we must first fully define the problem.</p><h2 id="the-task-environment-and-its-properties"><strong>The task environment and its properties</strong></h2><p>The task environment is essentially the problem that our agent is trying to solve. For example, if we were building a self-driving vehicle, we would specify it as such:</p>
<!--kg-card-begin: html-->
<table style="box-sizing: inherit; border-spacing: 0px; border-collapse: collapse; background-color: transparent; margin: 0px 0px 1.5em; width: 1408px;"><tbody style="box-sizing: inherit;"><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Agent</strong></td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;"><a class="glossaryLink" href="https://loudpumpkins.com/glossary/performance-measure/?ref=api.loudpumpkins.com" style="box-sizing: inherit; background-color: transparent; color: rgb(0, 0, 0); text-decoration: none !important; outline: 0px; border-bottom: 1px none rgb(0, 0, 0); box-shadow: rgb(255, 0, 60) 0px -2px 0px inset;">Performance Measure</a></strong></td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Environment</strong></td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Actuators</strong></td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Sensors</strong></td></tr><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Self-driving 
vehicle</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Safe, legal, comfortable ride, minimize fuel consumption, maximize time efficiency.</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Roads, other vehicles, pedestrians, debris, wildlife, weather.</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Steering column, engine throttle, brake cylinder, signals, horn.</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Cameras, radar, speedometer, GPS, engine sensors, microphones.</td></tr></tbody></table>
<!--kg-card-end: html-->
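The agent description in the table can be captured as a small data structure. A minimal sketch, assuming a hypothetical `TaskEnvironment` dataclass (illustrative only, not part of any agent framework):

```python
from dataclasses import dataclass

@dataclass
class TaskEnvironment:
    """A task-environment description: the agent plus the four columns above."""
    agent: str
    performance_measure: list  # desirable (possibly conflicting) qualities
    environment: list          # objects and other agents the agent cares about
    actuators: list            # components that act on the world
    sensors: list              # components that perceive the world

# The self-driving vehicle row from the table above:
sdv = TaskEnvironment(
    agent="Self-driving vehicle",
    performance_measure=["safe", "legal", "comfortable ride",
                         "minimize fuel consumption", "maximize time efficiency"],
    environment=["roads", "other vehicles", "pedestrians", "debris",
                 "wildlife", "weather"],
    actuators=["steering column", "engine throttle", "brake cylinder",
               "signals", "horn"],
    sensors=["cameras", "radar", "speedometer", "GPS",
             "engine sensors", "microphones"],
)
```

Spelling the description out like this forces us to decide, up front, exactly what the agent is measured on and what it can sense and do.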
<p>The <strong>performance measure </strong>describes good scenarios: what we want our actions to lead to. In our example, some desirable qualities for the self-driving car are to be safe, legal, comfortable, etc. Note that some desired qualities conflict (such as &#x2018;safe, yet fast&#x2019;), so the agent will need to find means to maximize time efficiency without putting us in danger.</p><p>The <strong>environment</strong> in our example is anything that our vehicle would care about: the roads, other vehicles, pedestrians, wildlife, the weather, etc. Note that elements of the environment may be simple objects (e.g. road debris), but they may also be other intelligent agents (e.g. other vehicles).</p><p>And finally, the <strong>actuators </strong>and <strong>sensors</strong> are the different components that let the vehicle interact with the world.</p><p>The range of task environments that might arise in AI is vast, but they tend to share common properties which we can use to categorize environments. Once an environment is categorized, we can pick known algorithms suited to solve it or build upon a family of existing techniques.</p><h3 id="1fully-observable-vs-partially-observable"><strong>1- FULLY OBSERVABLE vs. PARTIALLY OBSERVABLE</strong></h3><p>If an agent&#x2019;s sensors have access to the <strong>state</strong> of all the relevant elements <strong>at all times</strong>, then it is a fully observable environment. In a partially observable environment, by contrast, the agent has access to only a portion of the environment&#x2019;s state. Fully observable environments are easier to handle, as the agent does not need to maintain a record of past states or infer the probable state of unobserved components. 
You might even have an unobservable environment which can still be solved (with increased difficulty) if you have a way to measure performance.</p><p>Examples</p><p>Observable: [Chess] &#x2013; The agent knows precisely where each piece is located, who owns it, and whose turn it is, and it has access to this information at any point in time.</p><p>Partially observable: [Self-driving vehicle] &#x2013; The agent has complete access to <strong>some</strong> of the information, such as the current speed, a view of its immediate surroundings, weather conditions, and road conditions, but it does not know if another vehicle is approaching around the corner, or what traffic is like a few intersections away.</p><h3 id="2single-agent-vs-multiagent"><strong>2- SINGLE-AGENT vs. MULTIAGENT</strong></h3><p>Are other agents involved in the task environment? In a sudoku puzzle, no. But in a game of chess, absolutely. Simple enough, but sometimes it may not be clear whether some elements of the task environment should be agents or simple objects. For example, in the self-driving vehicle environment, other vehicles are also agents. But what about pedestrians? Animals? Or fire hydrants?</p><p>For that, we observe the <strong>performance measure</strong> of those elements. Other vehicles have the same desired outcome of &#x2018;safety&#x2019;, and thus, working together to &#x2018;not collide with one another&#x2019; is for each other&#x2019;s benefit. Our agent and other vehicles form a <strong>cooperative multiagent environment</strong>. In turn, when two elements each work towards maximizing their own performance measure while minimizing the other&#x2019;s, such as in a game of chess, it is known as a <strong>competitive multiagent environment</strong>.</p><h3 id="3deterministic-vs-nondeterministic"><strong>3- DETERMINISTIC vs. 
NONDETERMINISTIC</strong></h3><p>If the agent is aware with absolute certainty of the effects of all its actions on the environment and the resulting state, given any current state, then it&#x2019;s a deterministic environment. But if the agent does not know with certainty what might happen next, it is a nondeterministic environment. Some textbooks treat &#x2018;stochastic&#x2019; as a synonym for &#x2018;nondeterministic&#x2019;, while others distinguish the two and define a task environment as &#x2018;stochastic&#x2019; only if it explicitly deals with probabilities. (&#x201C;There is a 60% probability of a collision given X&#x201D; vs &#x201C;This action might lead to a collision&#x201D;)</p><p>Examples</p><p>Deterministic: [Chess] &#x2013; The agent knows precisely what actions it can take and their consequences. It also knows of all the actions the competing agent can take and their consequences.</p><p>Nondeterministic: [Self-driving vehicles] &#x2013; Just like most real-world problems, the agent never really knows how the vehicle will react when controlled. The brakes may malfunction, the engine may seize, the tires may slip. The agent is able to make good predictions, but not with absolute certainty.</p><h3 id="4episodic-vs-sequential"><strong>4- EPISODIC vs. SEQUENTIAL</strong></h3><p>In an <strong>episodic </strong>task environment, the agent&#x2019;s decisions are based on the <strong>present percepts</strong>. They are not influenced by past states or decisions and will not impact future states or decisions. Think of a facial detection AI. Every image fed to the AI is an atomic event that is not influenced by previous images or future ones. In contrast, a <strong>sequential</strong> environment <strong>has long-term consequences</strong> for present decisions. 
Sequential environments add another level of complexity because the agent now needs to look ahead before taking an action.</p><h3 id="5static-vs-dynamic"><strong>5- STATIC vs. DYNAMIC</strong></h3><p>Can the environment change while the agent is processing? If so, it is a <strong>dynamic</strong> environment. A self-driving car&#x2019;s environment will constantly change in real-time whether the agent has deliberated an action yet or not. Time is of the essence. In a <strong>static</strong> environment, time essentially stops while the agent is thinking. For example, in chess, the state of the board will not change until the agent takes an action. That is not to say that time is not part of the environment; it can and likely will be part of the performance measure. Stockfish (the leading chess engine) will search roughly 20-30 plies (half moves) ahead to satisfy its &#x2018;time&#x2019; performance measure.</p><h3 id="6discrete-vs-continuous"><strong>6- DISCRETE vs. CONTINUOUS</strong></h3><p>This property refers to the <strong>range of values</strong> the agent&#x2019;s percepts, actions and states can hold. In the chess example, each percept, state, and action is <strong>distinct and finite</strong>. This is known as a <strong>discrete</strong> environment. In contrast, a self-driving vehicle&#x2019;s percepts are continuous (speed, time, video feed, audio feed, etc.) and so are the actions (steering column angle, pressure on the engine throttle, etc.). Even the environment itself is continuous (with respect to time). The self-driving vehicle agent is part of a <strong>continuous</strong> environment.</p><h3 id="7known-vs-unknown"><strong>7- KNOWN vs. UNKNOWN</strong></h3><p>Although it is a property of the task environment, <strong>known</strong> and <strong>unknown</strong> task environments refer to the agent&#x2019;s knowledge of the environment. You can think of it as &#x2018;understood&#x2019;. 
An environment might be fully observable, yet completely unknown &#x2013; for example, an agent interacting with a game whose rules it does not know. The agent is usually forced to experiment and learn the rules given some performance measure. In contrast, in a known environment, the consequences of each action are known (or, in a stochastic environment, the probability of each consequence).</p><h3 id="summary"><strong>Summary</strong></h3><p>These are the fundamental properties of task environments. It is interesting to note that the hardest problem to solve with intelligent agents is one that is <strong>partially observable, multiagent, nondeterministic, sequential, dynamic, continuous,</strong> and <strong>unknown</strong>. This almost perfectly describes the self-driving vehicle agent (except for the unknown; the consequences of our actions are known), yet engineers all over the world are making great progress in this field!</p><p>Examples (known/unknown omitted as it depends on implementation):</p>
<!--kg-card-begin: html-->
<table style="box-sizing: inherit; border-spacing: 0px; border-collapse: collapse; background-color: transparent; margin: 0px 0px 1.5em; width: 1408px;"><tbody style="box-sizing: inherit;"><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Task Environment</strong></td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Observable</strong></td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Agents</strong></td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Deterministic</strong></td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Episodic</strong></td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Static</strong></td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;"><strong style="box-sizing: inherit; font-weight: bold;">Discrete</strong></td></tr><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Chess</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Fully</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Multi</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Deterministic</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Sequential</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Static</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Discrete</td></tr><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Poker</td><td style="box-sizing: 
inherit; padding: 0.5em; border: 1px solid;">Partially</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Multi</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Stochastic</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Sequential</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Static</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Discrete</td></tr><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Image analysis</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Fully</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Single</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Deterministic</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Episodic</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Static</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Continuous</td></tr><tr style="box-sizing: inherit;"><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Refinery controller</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Partially</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Single</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Stochastic</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Sequential</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Dynamic</td><td style="box-sizing: inherit; padding: 0.5em; border: 1px solid;">Continuous</td></tr></tbody></table>
<!--kg-card-end: html-->
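The classification table can likewise be written down in code. A rough sketch (the `EnvProperties` class and `is_hardest_kind` helper are illustrative, not from any library):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EnvProperties:
    """Six of the seven properties; known/unknown is omitted, as in the
    table, because it depends on the implementation."""
    observable: str      # "fully" or "partially"
    agents: str          # "single" or "multi"
    deterministic: bool  # False means nondeterministic/stochastic
    episodic: bool       # False means sequential
    static: bool         # False means dynamic
    discrete: bool       # False means continuous

# The rows of the table above:
ENVIRONMENTS = {
    "chess":               EnvProperties("fully",     "multi",  True,  False, True,  True),
    "poker":               EnvProperties("partially", "multi",  False, False, True,  True),
    "image analysis":      EnvProperties("fully",     "single", True,  True,  True,  False),
    "refinery controller": EnvProperties("partially", "single", False, False, False, False),
}

def is_hardest_kind(p):
    """Partially observable, multiagent, nondeterministic, sequential,
    dynamic, and continuous: the hardest combination discussed above."""
    return (p.observable == "partially" and p.agents == "multi"
            and not p.deterministic and not p.episodic
            and not p.static and not p.discrete)
```

Note that none of the four example rows hits the hardest combination; chess, for instance, fails on the very first property, since it is fully observable.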
<p>Now that we have discussed different types of agent environments, we delve into the last section of this article.</p><h2 id="the-different-types-of-agent-programs"><strong>The different types of agent programs</strong></h2><p>The agent program (alongside the agent architecture) is what we are really interested in. It&#x2019;s the <strong>solution</strong> to our problem. In fact, designing agent programs is at the center of modern AI. They typically follow this simple structure: a percept comes in, the program is executed, and an action comes out (&#x2018;do nothing&#x2019; is a valid action). Note that contrary to the agent function, the <strong>agent program will only take one percept at a time</strong> to process.</p><p>There are four basic types of agent programs.</p><h3 id="1simple-reflex-agents"><strong>1- Simple reflex agents</strong></h3><p>The simplest kind of agent is the simple reflex agent. These agents select actions on the basis of the current percept, ignoring past or possible future percepts.</p><figure class="kg-card kg-gallery-card kg-width-wide"><div class="kg-gallery-container"><div class="kg-gallery-row"><div class="kg-gallery-image"><img src="https://api.loudpumpkins.com/content/images/2024/07/1_intelligent_agent-1.png" width="747" height="554" loading="lazy" alt="What is an intelligent agent" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/1_intelligent_agent-1.png 600w, https://api.loudpumpkins.com/content/images/2024/07/1_intelligent_agent-1.png 747w" sizes="(min-width: 720px) 720px"></div><div class="kg-gallery-image"><img src="https://api.loudpumpkins.com/content/images/2024/07/Screenshot-2024-04-21-221305.png" width="565" height="350" loading="lazy" alt="What is an intelligent agent"></div></div></div></figure><p>Note that the &#x2018;interpreter&#x2019; in our example is purely conceptual. It may be implemented using logical gates (e.g. 
a vending machine), neural networks with activation functions used to match percepts to an interpretation, or just &#x2018;if &#x2013; then &#x2013;&#x2019; clauses.</p><h3 id="2model-based-reflex-agents"><strong>2- Model-based reflex agents</strong></h3><p>In some situations, a simple if &#x2013; then &#x2013; approach is not enough. An agent may find it beneficial to act differently given the same percept at two different points in time. For example, if someone offers you water, you may or may not accept it depending on your current thirst levels. The model-based reflex agent adds an <strong>internal state</strong> to the agent: &#x2018;If water is offered and state is &#x201C;thirsty&#x201D;, accept water&#x2019;. Additionally, the model-based reflex agent <strong>keeps track of previous percepts</strong>. This is particularly helpful in a partially observable environment. The agent now has historical data to draw upon for some of the currently unobservable aspects of the environment.</p><p>Updating its internal state and the state of objects/agents around it requires two kinds of knowledge to be programmed into the model in some form. First, the agent needs to know &#x2018;how the world works&#x2019; &#x2013; &#x2018;how the world changes over time&#x2019;. For example, a self-driving vehicle needs to know what happens when it opens the engine throttle or turns the steering column. This is called a <strong>transition model</strong>. Second, the agent needs some information about how the state of the world is reflected in the agent&#x2019;s percepts &#x2013; in other words, &#x2018;how to make sense of the world it sees&#x2019;. 
This is called a <strong>sensor model</strong>.</p><p>Together, they allow the agent to keep track of its state, the environment, and other objects/agents.</p><figure class="kg-card kg-gallery-card kg-width-wide"><div class="kg-gallery-container"><div class="kg-gallery-row"><div class="kg-gallery-image"><img src="https://api.loudpumpkins.com/content/images/2024/07/3_model_based_reflex_agent.png" width="733" height="525" loading="lazy" alt="What is an intelligent agent" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/3_model_based_reflex_agent.png 600w, https://api.loudpumpkins.com/content/images/2024/07/3_model_based_reflex_agent.png 733w" sizes="(min-width: 720px) 720px"></div><div class="kg-gallery-image"><img src="https://api.loudpumpkins.com/content/images/2024/07/Screenshot-2024-04-21-221447.png" width="632" height="235" loading="lazy" alt="What is an intelligent agent" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/Screenshot-2024-04-21-221447.png 600w, https://api.loudpumpkins.com/content/images/2024/07/Screenshot-2024-04-21-221447.png 632w"></div></div></div></figure><h3 id="3goal-based-agents"><strong>3- Goal-based agents</strong></h3><p>Knowing the current state of the environment is not always enough. Take chess, for example: knowing the rules of the game is not enough to succeed. You also need &#x2018;<strong>goal information</strong>&#x2019;, which describes a desirable situation, such as &#x201C;don&#x2019;t lose the king&#x201D; or &#x201C;capture the opponent&#x2019;s king&#x201D;. This type of agent can easily be combined with a model-based agent, with the difference being in the decision-making process. Where the model-based agent will map a state and percept to an action, the goal-based agent will decide on an action based on which action (or sequence of actions) will help achieve a goal. 
And so, a goal-based chess agent can sacrifice its own piece to protect its king not because this behaviour was mapped, but because it is the only action it predicts will achieve its goal of not losing the king.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/4_goal_based_agent.png" class="kg-image" alt="What is an intelligent agent" loading="lazy" width="733" height="581" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/4_goal_based_agent.png 600w, https://api.loudpumpkins.com/content/images/2024/07/4_goal_based_agent.png 733w" sizes="(min-width: 720px) 720px"></figure><h3 id="4utility-based-agents"><strong>4- Utility-based agents</strong></h3><p>Sometimes, achieving a goal is not enough. Sometimes we want to achieve said goal faster, cheaper, safer, more reliably, etc. Think of Google Maps; we don&#x2019;t want just some route to our destination, we want the best route to our destination. A goal-based agent will happily take you on a journey around the country as long as it gets you to your destination eventually.</p><p>What this agent type adds is an implementation of the performance measure called the <strong>utility function</strong>. It will weigh and balance all our desired qualities and rank a scenario. This is particularly helpful, first, when we have conflicting goals (such as a &#x2018;fast&#x2019; yet &#x2018;safe&#x2019; self-driving vehicle): the utility function specifies the appropriate tradeoff. Second, when there are several goals that the agent can aim to achieve, the utility function can pick the path with the highest probability of success or importance. 
Or both.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/5_utility_based_agent.png" class="kg-image" alt="What is an intelligent agent" loading="lazy" width="733" height="645" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/5_utility_based_agent.png 600w, https://api.loudpumpkins.com/content/images/2024/07/5_utility_based_agent.png 733w" sizes="(min-width: 720px) 720px"></figure><p>And this concludes our lesson about intelligent agents. Next, we will learn about solving problems by searching.</p>]]></content:encoded></item><item><title><![CDATA[Solving Problems by Searching]]></title><description><![CDATA[<p>In this article, we will see how a <strong>problem-solving</strong> agent can use search trees to look ahead and find a sequence of actions that will solve our problem.</p><p>This method works particularly well in environments that are deterministic, fully observable, <strong>static</strong>, <strong>discrete</strong>, and <strong>known</strong>. Search trees can be used to</p>]]></description><link>https://api.loudpumpkins.com/solving-problems-by-searching/</link><guid isPermaLink="false">6683658b17478208e8b21b94</guid><dc:creator><![CDATA[Loud Pumpkins]]></dc:creator><pubDate>Tue, 02 Jul 2024 02:40:21 GMT</pubDate><media:content url="https://api.loudpumpkins.com/content/images/2024/07/state_space_graph.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://api.loudpumpkins.com/content/images/2024/07/state_space_graph.jpg" alt="Solving Problems by Searching"><p>In this article, we will see how a <strong>problem-solving</strong> agent can use search trees to look ahead and find a sequence of actions that will solve our problem.</p><p>This method works particularly well in environments that are deterministic, fully observable, <strong>static</strong>, <strong>discrete</strong>, and <strong>known</strong>. 
Search trees can be used to solve multi-agent environments, which we will discuss in another article. In this article, we will only cover single-agent environments. Also, search problems must have <strong>atomic states</strong>: states with no internal structure or components.</p><p>We will start by introducing systematic algorithms, also known as <strong>uninformed algorithms</strong>, and in the next article, we will move on to intelligent algorithms, also known as <strong>informed algorithms</strong>.</p><p>Problem-solving agents are currently used in many real-world situations. Pathfinding in video games, routing video streams in computer networks, and finding optimal driving directions are amongst the first examples that come to mind. Those are known as <strong>route-finding problems</strong>.</p><p>But they are also used for <strong>touring problems</strong> (e.g. the travelling salesperson problem), where a salesperson needs to visit every city as efficiently as possible, and <strong>VLSI layout problems</strong>, where millions of transistors need to be strategically placed on a single chip to minimize area and signal delay (propagation delay). 
They are even used in robot navigation and automatic assembly sequencing.</p><h2 id="search-problems"><strong>Search Problems</strong></h2><p>We will start with a simple <strong>route-finding search problem</strong>.</p><p>Imagine that you have the following map of your local surroundings and that you need to build an agent that will find a path from your current location A to your goal destination Z.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/state_space_graph-1.jpg" class="kg-image" alt="Solving Problems by Searching" loading="lazy" width="1045" height="675" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/state_space_graph-1.jpg 600w, https://api.loudpumpkins.com/content/images/size/w1000/2024/07/state_space_graph-1.jpg 1000w, https://api.loudpumpkins.com/content/images/2024/07/state_space_graph-1.jpg 1045w" sizes="(min-width: 720px) 720px"></figure><p>Before we continue, we will <strong>formally define the problem</strong> to make our lives easier by defining:</p><ul><li>The <strong>state space</strong>: A set of all possible states that the environment can be in.</li><li>The <strong>initial state</strong>: The initial state that the environment starts in (the &#x201C;root&#x201D; of the tree).</li><li>The <strong>goal state</strong>: A set of goal states or one goal state.</li><li>The <strong>actions</strong>: All the actions available to the agent. Given a state s, Actions(s) returns a finite set of actions that can be executed.</li><li>The <strong>transition model</strong>: A transition model which describes what each action does. Result(s, a) returns the state that results from doing action <em>a</em> in state <em>s</em>.</li><li>The <strong>action cost function</strong>: A function that takes in a state, action, and the resultant state to determine the cost of an action. 
This can be a measure of time, resources, or work.</li></ul><pre><code class="language-bash">State space = { A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, Z }
Initial state = A
Goal state = Z
actions = {
    Actions(A) = { go_to_B, go_to_C, go_to_D }
    Actions(B) = { go_to_A, go_to_E }
    Actions(C) = { go_to_A, go_to_E, go_to_G, go_to_F }
    Actions(D) = { go_to_A, go_to_H }
    Actions(E) = { go_to_B, go_to_C }
    [ &#x2026; ]
}
transition model = {
    Result(A, go_to_B) = B
    Result(A, go_to_C) = C
    Result(A, go_to_D) = D
    Result(B, go_to_A) = A
    Result(B, go_to_E) = E
    [ &#x2026; ]
}
action cost function = {
    ActionCost(A, go_to_B, B) = 75
    ActionCost(A, go_to_C, C) = 140
    ActionCost(A, go_to_D, D) = 118
    ActionCost(B, go_to_A, A) = 75
    ActionCost(B, go_to_E, E) = 71
    [ &#x2026; ]
}</code></pre><p>And we will define an abstract Problem class as such:</p><pre><code class="language-python">class Problem(object):
   &quot;&quot;&quot;
   An abstract search problem with core functionality that needs to be
   implemented.
   &quot;&quot;&quot;

   def __init__(self, initial_state):
      self.initial_state = initial_state

   def actions(self, state):
      &quot;&quot;&quot;
      Generate a list of actions that can be taken given any state.

      :param state: Any: Same datatype as the `state` in `Node`
      :return: list
      &quot;&quot;&quot;
      raise NotImplementedError(&quot;Abstract class method not implemented&quot;)

   def is_goal(self, state):
      &quot;&quot;&quot;
      Test the given state to see if a goal has been reached. The `state` is
      of the same data type as the state in `Node` from `search_problem.util`

      :param state: Any: Same datatype as the `state` in `Node`
      :return: bool
      &quot;&quot;&quot;
      raise NotImplementedError(&quot;Abstract class method not implemented&quot;)

   def transition(self, state, action):
      &quot;&quot;&quot;
      Generate the resultant state given any state and a valid action.

      :param state: Any: Same datatype as the `state` in `Node`
      :param action: Any
      :return: result_state: Any: Same datatype as the `state` in `Node`
      &quot;&quot;&quot;
      raise NotImplementedError(&quot;Abstract class method not implemented&quot;)

   def expand(self, node: Node):
      &quot;&quot;&quot;
      Expand the given node with a state, return a list of successor nodes
      using the problem&apos;s actions and transitions.

      :param node: Node
      :return: list[Node]
      &quot;&quot;&quot;
      return [Node(self.transition(node.state, action), node, action) for action in
              self.actions(node.state)]</code></pre><p>Recall that our objective is to find a path from A to Z where each letter is a fictional city or town. A solution is found once a path from the initial state to one of the goal states is found. And so, a <strong>path</strong> is just a <strong>sequence of actions</strong>.</p><p>By the way, you may wonder if the state space should be so abstract as to include only towns and cities. Surely we would want to include different intersections within each city or town. And the answer is: it depends.</p><p>The choice of a good abstraction involves removing as much detail as possible while retaining validity and ensuring that the abstract actions can be carried out. Too many real-world details will completely overwhelm the intelligent agent.</p><h2 id="search-algorithms"><strong>Search Algorithms</strong></h2><p>A search algorithm takes a search problem as input and returns a solution or an indication of failure. In a single-agent environment, the search algorithm will form a <strong>search tree</strong> that will superimpose the <strong>state space graph</strong>, forming various paths from the initial state, trying to find a path that reaches a goal state. Each <strong>node</strong> in the search tree corresponds to a state in the state space and the <strong>edges </strong>in the search tree correspond to actions. The <strong>root </strong>of the tree corresponds to the initial state of the problem.</p><p>It is important to understand the distinction between a state space graph and a search tree. 
The state space graph has only as many nodes as there are states and only as many edges as there are actions. The search tree, on the other hand, always has the initial state as its root node, can contain a potentially infinite number of nodes (if we do not keep track of repeated states or cycles), and, most importantly, gives each node a unique path back to the root.</p><p>The first step a search algorithm can take is to <strong>expand</strong> the root node by considering the available actions for that state through the given transition model. This will <strong>generate</strong> new nodes called <strong>successor nodes</strong> or <strong>child nodes</strong>.</p><p>Each generated node will record its parent node and will be placed in the <strong>frontier </strong>of the search tree (unexplored nodes).</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/simple_root_node_expanded_with_a_view_of_the_frontier.jpg" class="kg-image" alt="Solving Problems by Searching" loading="lazy" width="530" height="211"></figure><p>Now we must choose which of these three child nodes to consider next. This is the essence of search &#x2013; following up on one option now and putting the others aside for later.</p><h2 id="uninformed-search-algorithms"><strong>Uninformed search algorithms</strong></h2><p>An uninformed search algorithm is given no clue about how close a state is to the goal(s). 
It is a systematic way of exploring state space graphs and we have two strategies: breadth-first search and depth-first search.</p><h3 id="breadth-first-search-first-in-first-out-queue"><strong>Breadth-first Search (first-in-first-out / queue)</strong></h3><p>In breadth-first search, the root node is expanded first, then all its successors, and then all their successors, and so on until a goal state is found or the entire graph has been traversed.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/search_tree_pattern_of_a_breadth_first_search.jpg" class="kg-image" alt="Solving Problems by Searching" loading="lazy" width="1096" height="601" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/search_tree_pattern_of_a_breadth_first_search.jpg 600w, https://api.loudpumpkins.com/content/images/size/w1000/2024/07/search_tree_pattern_of_a_breadth_first_search.jpg 1000w, https://api.loudpumpkins.com/content/images/2024/07/search_tree_pattern_of_a_breadth_first_search.jpg 1096w" sizes="(min-width: 720px) 720px"></figure><pre><code class="language-python">def breadth_first_search(problem: Problem) -&gt; Optional[Node]:
   # Assumes `import queue` and `from typing import Optional` at the
   # top of the module, alongside `Node` and `Problem`.
   frontier = queue.SimpleQueue()
   frontier.put(Node(problem.initial_state))
   reached = {problem.initial_state}  # states already seen; guards against cycles

   while not frontier.empty():
      node: Node = frontier.get()
      if problem.is_goal(node.state):
         return node
      for child in problem.expand(node):
         if child.state not in reached:  # skip states reached before
            reached.add(child.state)
            frontier.put(child)
   return None</code></pre><p>This strategy is used quite regularly because it is a <strong>complete</strong> search algorithm: if a solution exists, it will find it. It will also return the solution with the fewest number of actions, and if all actions have the same cost, that solution is also <strong>cost optimal</strong>.</p><p>However, this algorithm has an exponential <strong>time complexity</strong> and <strong>space complexity</strong>, meaning it is expensive in both time and memory.</p><p>Imagine a problem where, on average, each node has ten possible actions and the solution is fifteen levels deep in the search tree.</p><p>Then we will need to generate 1 node for the root node, 10 more to expand the root node, 10 more for each of those 10 nodes, and so on 15 times. That gives us:</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/image.png" class="kg-image" alt="Solving Problems by Searching" loading="lazy" width="664" height="66" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/image.png 600w, https://api.loudpumpkins.com/content/images/2024/07/image.png 664w"></figure><p>And if each node is allocated just a kilobyte and takes just a millisecond to process, that&#x2019;s over <strong>1000 petabytes</strong> of RAM and over <strong>35 000 years.</strong></p><p>We can calculate the time and space complexity more abstractly by declaring a variable &#x2018;d&#x2019; as the depth of the solution and &#x2018;b&#x2019; as the branching factor (average actions per node):</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/image-1.png" class="kg-image" alt="Solving Problems by Searching" loading="lazy" width="644" height="89" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/image-1.png 600w, https://api.loudpumpkins.com/content/images/2024/07/image-1.png 644w"></figure><p>For 
this reason, this algorithm is typically not used to solve non-trivial problems.</p><h3 id="depth-first-search-last-in-first-out-stack"><strong>Depth-first Search (last-in-first-out / stack)</strong></h3><p>A depth-first search always expands the deepest node in the frontier first. It is implemented with a stack (last-in-first-out): the search expands nodes deeper and deeper until it reaches the deepest level of the search tree, where the nodes have no successors, and then &#x201C;backs up&#x201D; to the next deepest node.</p><figure class="kg-card kg-image-card"><img src="https://api.loudpumpkins.com/content/images/2024/07/search_tree_pattern_of_a_depth_first_search.jpg" class="kg-image" alt="Solving Problems by Searching" loading="lazy" width="1093" height="707" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/search_tree_pattern_of_a_depth_first_search.jpg 600w, https://api.loudpumpkins.com/content/images/size/w1000/2024/07/search_tree_pattern_of_a_depth_first_search.jpg 1000w, https://api.loudpumpkins.com/content/images/2024/07/search_tree_pattern_of_a_depth_first_search.jpg 1093w" sizes="(min-width: 720px) 720px"></figure><pre><code class="language-python">def depth_first_search(problem: Problem) -&gt; Optional[Node]:
   frontier = queue.LifoQueue()
   frontier.put(Node(problem.initial_state))

   while not frontier.empty():
      node: Node = frontier.get()
      if problem.is_goal(node.state):
         return node
      for child in problem.expand(node):
         frontier.put(child)
   return None</code></pre><p>This method is also complete, like breadth-first search, and it has the added benefit of a <strong>reduced space complexity</strong>, which comes at the cost of forgoing optimality: the first solution found is not necessarily the one with the fewest actions.</p><p>The space complexity drops from an exponential bound to a linear one. We now explore only one branch at a time, so the frontier holds only the successors of one node per level of the tree. This means the space complexity of depth-first search is O(bm), where &#x2018;b&#x2019; is the branching factor and &#x2018;m&#x2019; is the maximum depth of the tree.</p><h3 id="cycles-and-redundant-paths"><strong>Cycles and Redundant Paths</strong></h3><p>We saw how both of the uninformed searches are <strong>complete</strong>, but that is only true if the state graph has <strong>no cycles</strong>. Looking at our original route-finding problem&#x2019;s state graph, we can see how an algorithm can easily get stuck in an infinite loop if it were to visit A &gt; B &gt; E &gt; C &gt; A &gt; B &gt; E &gt; C &gt; A, or how it can add <strong>redundant paths</strong>. This is especially problematic in undirected graphs like ours, as each node has an edge back to its parent, creating a duplicate sub-tree rooted at a copy of the parent node.</p><p>We have a couple of approaches we can take to avoid redundant paths and cycles.</p><p>The first is to remember every state we have already visited, storing them in a set of explored states.</p><pre><code class="language-python">def breadth_first_search_with_reached(problem: Problem) -&gt; Optional[Node]:
   explored = set()
   explored.add(problem.initial_state)
   frontier = queue.SimpleQueue()
   frontier.put(Node(problem.initial_state))

   while not frontier.empty():
      node: Node = frontier.get()
      if problem.is_goal(node.state):
         return node
      for child in problem.expand(node):
         if child.state not in explored:
            explored.add(child.state)
            frontier.put(child)
   return None</code></pre><p>However, storing every state in memory can exhaust it, just as we saw for breadth-first search. So, as a compromise, instead of holding a set of explored states, we can traverse the chain of parent nodes to check whether the state of a given successor already appears on this particular path. This avoids cycles, and thus infinite loops, but the same state can still be explored multiple times via different paths.</p><pre><code class="language-python">def depth_first_search_with_reached(problem: Problem) -&gt; Optional[Node]:
   frontier = queue.LifoQueue()
   frontier.put(Node(problem.initial_state))

   while not frontier.empty():
      node: Node = frontier.get()
      if problem.is_goal(node.state):
         return node
      # walk the chain of parent nodes to collect the states on this path
      path_states = {n.state for n in node.path_to_root()}
      for child in problem.expand(node):
         if child.state not in path_states:
            frontier.put(child)
   return None</code></pre><h3 id="depth-limited-and-iterative-deepening-search"><strong>Depth-limited and Iterative Deepening Search</strong></h3><p>To keep depth-first search algorithms from travelling too far down a certain path. We can define a limit &#x2018;l&#x2019; and refuse to expand nodes on that level. This allows us to not worry about infinite cycles, but of course, sometimes we may fail to find a solution if our choice of &#x2018;l&#x2019; is too low.</p><pre><code class="language-python">def depth_limited_search(problem: Problem, depth: int) -&gt; Optional[Node]:
   frontier = queue.LifoQueue()
   frontier.put(Node(problem.initial_state))

   while not frontier.empty():
      node: Node = frontier.get()
      if problem.is_goal(node.state):
         return node
      path = node.path_to_root()
      for child in problem.expand(node):
         if len(path) &gt; depth or any(n.state == child.state for n in path):
            # max depth reached, or this state is already on the current path
            continue
         frontier.put(child)
   return None</code></pre><p>This algorithm has its place in solving search problems, but it is particularly helpful when used in conjunction with an <strong>iterative deepening search</strong> algorithm.</p><p>Iterative deepening solves the problem of picking the right value of &#x2018;l&#x2019; by trying all values sequentially from 0 to infinity.</p><pre><code class="language-python">def iterative_deepening_search(problem: Problem) -&gt; Optional[Node]:
   for depth in range(sys.maxsize):
      result = depth_limited_search(problem, depth)
      if result is not None:
         return result</code></pre><p>Just like depth-first search and breadth-first search, iterative deepening search is <strong>complete on finite state spaces</strong> (assuming the graph is acyclic or that cycles are handled somehow), and just like breadth-first search, it will return a solution with the fewest actions and is thus <strong>cost optimal</strong> if all actions have the same cost. But unlike breadth-first search, it has the space complexity of depth-first search.</p><p>This essentially combines the best of both uninformed algorithms into one.</p><p>Admittedly, iterative deepening search seems wasteful at first because each call to the depth-limited search rebuilds the entire search tree. But recall that a search tree grows exponentially: each level has &#x2018;b&#x2019; times as many nodes as the level before it, so most of the computation is in the last level anyway.</p><p>For example, take a search problem with an average branching factor of just six and a depth limit of six.</p>
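These node counts are easy to check with a few lines of standalone Python (a quick sketch, independent of the `Problem` and `Node` classes used throughout this article):

```python
def bfs_nodes(b: int, d: int) -> int:
    """Nodes generated by breadth-first search with branching factor
    'b' down to depth 'd': 1 + b + b^2 + ... + b^d."""
    return sum(b ** k for k in range(d + 1))


def ids_nodes(b: int, d: int) -> int:
    """Nodes generated by iterative deepening search: one complete
    depth-limited run for every limit from 0 up to 'd'."""
    return sum(bfs_nodes(b, limit) for limit in range(d + 1))


# The earlier example: branching factor 10, solution fifteen levels deep.
print(f"{bfs_nodes(10, 15):,}")  # 1,111,111,111,111,111 nodes (~10^15)

# Most of the work really does sit in the deepest level (b = 6, d = 6):
print(6 ** 6 / bfs_nodes(6, 6))  # ~0.83 of all generated nodes
```

In the worst case, iterative deepening's total work exceeds breadth-first search's by roughly a factor of b/(b-1), which is why the gap narrows as the branching factor grows.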
<!--kg-card-begin: html-->
<table style="border-collapse: collapse; margin: 0 0 1.5em; width: 100%;"><tbody>
<tr><td style="padding: 0.5em; border: 1px solid;">Algorithm</td><td style="padding: 0.5em; border: 1px solid;">Depth 0</td><td style="padding: 0.5em; border: 1px solid;">Depth 1</td><td style="padding: 0.5em; border: 1px solid;">Depth 2</td><td style="padding: 0.5em; border: 1px solid;">Depth 3</td><td style="padding: 0.5em; border: 1px solid;">Depth 4</td><td style="padding: 0.5em; border: 1px solid;">Depth 5</td><td style="padding: 0.5em; border: 1px solid;">Depth 6</td><td style="padding: 0.5em; border: 1px solid;">Total</td></tr>
<tr><td style="padding: 0.5em; border: 1px solid;"><strong>BFS</strong></td><td style="padding: 0.5em; border: 1px solid;">1</td><td style="padding: 0.5em; border: 1px solid;">6</td><td style="padding: 0.5em; border: 1px solid;">36</td><td style="padding: 0.5em; border: 1px solid;">216</td><td style="padding: 0.5em; border: 1px solid;">1296</td><td style="padding: 0.5em; border: 1px solid;">7776</td><td style="padding: 0.5em; border: 1px solid;">46656</td><td style="padding: 0.5em; border: 1px solid;"><strong>55987</strong></td></tr>
<tr><td style="padding: 0.5em; border: 1px solid;">DLS(0)</td><td style="padding: 0.5em; border: 1px solid;">1</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">1</td></tr>
<tr><td style="padding: 0.5em; border: 1px solid;">DLS(1)</td><td style="padding: 0.5em; border: 1px solid;">1</td><td style="padding: 0.5em; border: 1px solid;">6</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">7</td></tr>
<tr><td style="padding: 0.5em; border: 1px solid;">DLS(2)</td><td style="padding: 0.5em; border: 1px solid;">1</td><td style="padding: 0.5em; border: 1px solid;">6</td><td style="padding: 0.5em; border: 1px solid;">36</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">43</td></tr>
<tr><td style="padding: 0.5em; border: 1px solid;">DLS(3)</td><td style="padding: 0.5em; border: 1px solid;">1</td><td style="padding: 0.5em; border: 1px solid;">6</td><td style="padding: 0.5em; border: 1px solid;">36</td><td style="padding: 0.5em; border: 1px solid;">216</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">259</td></tr>
<tr><td style="padding: 0.5em; border: 1px solid;">DLS(4)</td><td style="padding: 0.5em; border: 1px solid;">1</td><td style="padding: 0.5em; border: 1px solid;">6</td><td style="padding: 0.5em; border: 1px solid;">36</td><td style="padding: 0.5em; border: 1px solid;">216</td><td style="padding: 0.5em; border: 1px solid;">1296</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">1555</td></tr>
<tr><td style="padding: 0.5em; border: 1px solid;">DLS(5)</td><td style="padding: 0.5em; border: 1px solid;">1</td><td style="padding: 0.5em; border: 1px solid;">6</td><td style="padding: 0.5em; border: 1px solid;">36</td><td style="padding: 0.5em; border: 1px solid;">216</td><td style="padding: 0.5em; border: 1px solid;">1296</td><td style="padding: 0.5em; border: 1px solid;">7776</td><td style="padding: 0.5em; border: 1px solid;">&#xA0;</td><td style="padding: 0.5em; border: 1px solid;">9331</td></tr>
<tr><td style="padding: 0.5em; border: 1px solid;">DLS(6)</td><td style="padding: 0.5em; border: 1px solid;">1</td><td style="padding: 0.5em; border: 1px solid;">6</td><td style="padding: 0.5em; border: 1px solid;">36</td><td style="padding: 0.5em; border: 1px solid;">216</td><td style="padding: 0.5em; border: 1px solid;">1296</td><td style="padding: 0.5em; border: 1px solid;">7776</td><td style="padding: 0.5em; border: 1px solid;">46656</td><td style="padding: 0.5em; border: 1px solid;">55987</td></tr>
<tr><td style="padding: 0.5em; border: 1px solid;"><strong>IDS(6)</strong></td><td style="padding: 0.5em; border: 1px solid;">7</td><td style="padding: 0.5em; border: 1px solid;">36</td><td style="padding: 0.5em; border: 1px solid;">180</td><td style="padding: 0.5em; border: 1px solid;">864</td><td style="padding: 0.5em; border: 1px solid;">3888</td><td style="padding: 0.5em; border: 1px solid;">15552</td><td style="padding: 0.5em; border: 1px solid;">46656</td><td style="padding: 0.5em; border: 1px solid;"><strong>67183</strong></td></tr>
</tbody></table>
<!--kg-card-end: html-->
<p>Looking at the table, we can see that in the worst case, breadth-first search generates 55987 nodes while iterative deepening generates 67183 nodes &#x2013; only about 20% more. Not bad, right?</p><p>Note that the relative overhead shrinks further with a larger branching factor. In real-world problems, such as chess, the average branching factor is around 35.</p><h2 id="concrete-implementation-of-a-search-problem"><strong>Concrete Implementation of a Search Problem</strong></h2><p>As an example, we will implement the &#x201C;cabbage, goat, and wolf&#x201D; problem, which involves a person travelling with a wolf, a goat and a cabbage who finds himself at a river. There is a single small boat to afford passage across the river. The boat can hold the person and only one of the wolf, goat or cabbage. The person must ferry his animals and vegetables across the river, making as many trips as necessary to do so. However, if the goat is left unattended with the cabbage, the goat will eat the cabbage. Similarly, if the wolf is left unattended with the goat, the wolf will eat the goat. How can the person ferry all items across the river without anything being eaten?</p><p>The search tree:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://api.loudpumpkins.com/content/images/2024/07/cgw_graph.jpg" class="kg-image" alt="Solving Problems by Searching" loading="lazy" width="1801" height="725" srcset="https://api.loudpumpkins.com/content/images/size/w600/2024/07/cgw_graph.jpg 600w, https://api.loudpumpkins.com/content/images/size/w1000/2024/07/cgw_graph.jpg 1000w, https://api.loudpumpkins.com/content/images/size/w1600/2024/07/cgw_graph.jpg 1600w, https://api.loudpumpkins.com/content/images/2024/07/cgw_graph.jpg 1801w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">(nodes in red represent visited nodes and thus not explored)</span></figcaption></figure><p>Implementation:</p><pre><code class="language-python">class CGW(Problem):
   &quot;&quot;&quot;
   This class implements a Cabbage-Goat-Wolf problem as a sub-class of the
   search `Problem` class.

   The values of the state member variables of instances of this class are
   represented in a tuple as follows:

   (human, left, right)

   Where:
   human is an integer;
     1 = human (and boat) on left side,
     2 = human (and boat) on right side,

   left is a string containing zero to three of the characters &quot;CGW&quot; indicating
   which entities are on the left side of the river;

   right is a string containing zero to three of the characters &quot;CGW&quot;
   indicating which entities are on the right side of the river.

   The initial state of the CGW problem is (1,&quot;CGW&quot;,&quot;&quot;).
   &quot;&quot;&quot;

   def __init__(self, initial_state):
      &quot;&quot;&quot;
      Initialize the CGW state member variables based on the passed initial state.
      &quot;&quot;&quot;
      super().__init__(self.validate(initial_state))

   def actions(self, state):
      &quot;&quot;&quot;
      Return a list of valid actions to take: each entity on the human&apos;s side
      that can be ferried across, plus `None` for the human crossing alone.

      Recall that a state is: (human, left_side, right_side) where human is
      1 if on the left side or 2 if on the right.

      So we can access the side we are working on using:

         state[state[0]] -&gt; state[human location] -&gt; the side human is on

      &quot;&quot;&quot;
      actions = []
      for animal in state[state[0]]:
         actions.append(animal)  # animal crossing with human
      actions.append(None)  # the human crossing alone

      return actions

   def is_goal(self, state):
      &quot;&quot;&quot;
      Return True if the problem is solved, i.e. the human and all items are
      on the right side.
      &quot;&quot;&quot;
      return state[0] == 2 and state[2] == &quot;CGW&quot;

   def transition(self, state, action):
      &quot;&quot;&quot;
      The transition is to always move the human from one side to the other.
      The action contains a letter &quot;C&quot;, &quot;G&quot;, &quot;W&quot; to indicate which animal
      should the human take with him, or `None` if the human is to move alone.
      &quot;&quot;&quot;
      animal = action
      human = state[0]
      left_side = state[1]
      right_side = state[2]

      if human == 1:
         # human is going from left to right, move animal from left to right
         human = 2
         left_side = self._remove_animal(left_side, animal)
         right_side = self._add_animal(right_side, animal)

      elif human == 2:
         human = 1
         # human is going from right to left, move animal from right to left
         right_side = self._remove_animal(right_side, animal)
         left_side = self._add_animal(left_side, animal)

      else:  # Where is the human ?
         raise IndexError(&quot;State[0] &apos;human&apos; must be &apos;1&apos; or &apos;2&apos;.&quot;)

      state = (human, left_side, right_side)
      return self.validate(state)

   def validate(self, state):
      &quot;&quot;&quot;
      Update a state variable based on the goat eating the cabbage or the wolf
      eating the goat.
      &quot;&quot;&quot;
      if state[0] == 1:  # human is on left side
         right = state[2]  # retrieve the contents of the right side
         if &quot;G&quot; in right:  # goat is on right side
            right = self._remove_animal(right, &quot;C&quot;)  # goat eats cabbage
         if &quot;W&quot; in right:  # wolf is on right side
            right = self._remove_animal(right, &quot;G&quot;)  # wolf eats goat
         # reconstruct the state with the new right-side contents
         state = (state[0], state[1], right)

      elif state[0] == 2:  # human is on right side
         left = state[1]  # retrieve the contents of the left side
         if &quot;G&quot; in left:  # goat is on the left side
            left = self._remove_animal(left, &quot;C&quot;)  # goat eats cabbage
         if &quot;W&quot; in left:  # wolf is on the left side
            left = self._remove_animal(left, &quot;G&quot;)  # wolf eats goat
         # reconstruct state with new left-side contents
         state = (state[0], left, state[2])

      else:  # Where is the human ?
         raise IndexError(&quot;State[0] &apos;human&apos; must be &apos;1&apos; or &apos;2&apos;.&quot;)
      return state

   def _remove_animal(self, input_string, animal):
      &quot;&quot;&quot;
      This is a utility function that returns a string that matches the
      input_string except with the given animal (letter) removed.

      e.g. input: &quot;CGW&quot;, animal: &quot;G&quot;, output: &quot;CW&quot;
      &quot;&quot;&quot;
      if animal is None:
         return input_string
      return input_string.replace(animal, &quot;&quot;)

   def _add_animal(self, input_string, animal):
      &quot;&quot;&quot;
      This is a utility function that returns a string that matches the
      input_string with the given animal added in alphabetical position.
      &quot;&quot;&quot;
      if animal is None:
         return input_string
      return &quot;&quot;.join(sorted(input_string + animal))</code></pre><pre><code class="language-python">initial_state = (1, &quot;CGW&quot;, &quot;&quot;)
node = iterative_deepening_search(CGW(initial_state))
if node is not None:
   solution = node.solution()
   for index, action in enumerate(solution):
      if action is None:
         solution[index] = &apos;-&apos;
   print(&quot;&quot;.join(solution))
else:
   print(&quot;No solution found&quot;)</code></pre><p>Run:</p><pre><code class="language-bash">&gt; G-WGC-G</code></pre>]]></content:encoded></item></channel></rss>