AI_UNIT2
(22MCA0304)
Unit-2
Best-first Search Algorithm (Greedy Search):
Greedy best-first search always selects the path which appears best at that moment. It combines ideas from depth-first search and breadth-first search, taking the advantages of both, and it uses a heuristic function to guide the search. With the help of best-first search, at each step we can choose the most promising node. In the best-first search algorithm, we expand the node which is closest to the goal node, and the closeness is estimated by the heuristic function, i.e.
f(n) = h(n),
where h(n) = estimated cost from node n to the goal.
The greedy best-first algorithm is implemented with a priority queue.
In this search example, we use two lists, OPEN and CLOSED. The following are the iterations for traversing the above example.
Initialization: Expand the start node S and put it in the CLOSED list: Open [A, B], Closed [S]
Iteration 1: Open [A], Closed [S, B]
Iteration 2: Open [E, F, A], Closed [S, B]
             Open [E, A], Closed [S, B, F]
Iteration 3: Open [I, G, E, A], Closed [S, B, F]
A* Search Algorithm:
A* search is the most commonly known form of best-first search. It uses the heuristic function h(n) together with g(n), the cost to reach node n from the start state. It combines features of uniform-cost search (UCS) and greedy best-first search, which lets it solve problems efficiently. A* finds the shortest path through the search space using the heuristic function; it expands fewer nodes of the search tree and provides an optimal result faster. A* is similar to UCS except that it uses g(n) + h(n) instead of g(n).
In the A* search algorithm, we use the search heuristic as well as the cost to reach the node. Hence we can combine both costs as follows, and this sum is called the fitness number:
f(n) = g(n) + h(n).
Algorithm of A* search:
Step 1: Place the starting node in the OPEN list.
Step 2: Check whether the OPEN list is empty; if it is, return failure and stop.
Step 3: Select the node from the OPEN list which has the smallest value of the evaluation function (g + h). If node n is the goal node, return success and stop; otherwise go to Step 4.
Step 4: Expand node n, generate all of its successors, and put n into the CLOSED list. For each successor n', check whether n' is already in the OPEN or CLOSED list; if not, compute its evaluation function and place it into the OPEN list.
Step 5: Otherwise, if node n' is already in OPEN or CLOSED, attach it to the back pointer which reflects the lowest g(n') value.
Step 6: Return to Step 2.
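The steps above can be sketched in Python as follows, with the OPEN list kept as a priority queue ordered by f(n) = g(n) + h(n). The example graph, step costs, and heuristic values are hypothetical and used only for illustration.

import heapq

def a_star_search(graph, h, start, goal):
    """graph[node] -> list of (successor, step_cost); h[node] -> heuristic estimate."""
    open_list = [(h[start], 0, start, [start])]   # entries are (f, g, node, path)
    best_g = {start: 0}
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path, g
        for succ, cost in graph.get(node, []):
            g2 = g + cost
            # Keep a successor only if this path reaches it more cheaply (back pointer with lowest g).
            if g2 < best_g.get(succ, float('inf')):
                best_g[succ] = g2
                heapq.heappush(open_list, (g2 + h[succ], g2, succ, path + [succ]))
    return None, float('inf')   # OPEN list empty: failure

# Hypothetical example graph with (successor, cost) pairs and an admissible heuristic.
graph = {'S': [('A', 1), ('G', 10)], 'A': [('B', 2), ('C', 1)], 'B': [('D', 5)],
         'C': [('D', 3), ('G', 4)], 'D': [('G', 2)]}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'D': 1, 'G': 0}
print(a_star_search(graph, h, 'S', 'G'))   # expected: (['S', 'A', 'C', 'G'], 6)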
Advantages:
A* search performs better than many other search algorithms.
A* search is optimal and complete (given an admissible heuristic).
It can solve very complex problems.
Disadvantages:
It does not always produce the shortest path when the heuristic is a poor approximation, since it relies on heuristic estimates.
A* search has some complexity issues.
The main drawback of A* is its memory requirement: it keeps all generated nodes in memory, so it is not practical for many large-scale problems.
Example
In this example, we traverse the given graph using the A* algorithm. The heuristic values of all states are given in the table, so we calculate f(n) for each state using the formula f(n) = g(n) + h(n), where g(n) is the cost to reach a node from the start state. Here we again use the OPEN and CLOSED lists.
If the heuristic function is admissible, then A* tree search will always find the least cost path.
Time Complexity: The time complexity of the A* search algorithm depends on the heuristic function; in the worst case, the number of nodes expanded is exponential in the depth of the solution d. So the time complexity is O(b^d), where b is the branching factor.
Local Search Algorithms:
For many optimization problems the path to the goal is irrelevant; only the final configuration matters. In such cases, we can use local search algorithms. They operate using a single current state (rather than multiple paths) and generally move only to neighbors of that state.
The important applications of this class of problems are (a) integrated-circuit design, (b) factory-floor layout, (c) job-shop scheduling, (d) automatic programming, (e) telecommunications network optimization, (f) vehicle routing, and (g) portfolio management.
Key advantages of Local Search Algorithms
They use very little memory – usually a constant amount; and
They can often find reasonable solutions in large or infinite (continuous) state spaces for which
systematic algorithms are unsuitable.
Local search can be used on problems that can be formulated as finding a solution maximizing a criterion
among a number of candidate solutions.
Local search algorithms move from solution to solution in the space of candidate solutions (the search space) until a solution deemed optimal is found or a time bound has elapsed.
For example, in the travelling salesman problem a candidate solution is a cycle containing all the nodes of the graph, and the objective is to minimize the total length of the cycle.
A local search algorithm starts from a candidate solution and then iteratively moves to a neighbor
solution.
The search terminates when a time bound is reached or when the solution has not improved after a given number of steps.
Hill-Climbing Search:
The hill-climbing search algorithm is the most basic local search technique. At each step the current node is replaced by the best neighbor; in this version, that means the neighbor with the highest VALUE, but if a heuristic cost estimate h is used, we would find the neighbor with the lowest h. Its success depends very much on the shape of the state-space landscape: if there are few local maxima, random-restart hill climbing will find a "good" solution very quickly.
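A minimal sketch of steepest-ascent hill climbing, assuming the problem supplies hypothetical neighbors() and value() functions:

def hill_climbing(start, neighbors, value):
    """Steepest ascent: repeatedly move to the best neighbor; stop at a local maximum."""
    current = start
    while True:
        best = max(neighbors(current), key=value, default=current)
        if value(best) <= value(current):
            return current   # no neighbor is better: a local (not necessarily global) maximum
        current = best

# Illustrative problem: maximize f(x) = -(x - 3)**2 over the integers, starting from x = 10.
print(hill_climbing(10, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2))   # -> 3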
Simulated Annealing:
To avoid getting stuck in a local maximum, simulated annealing tries randomly (using a probability function) to move to another state: if the new state is better it moves there, otherwise it may still accept the move with some probability, and so on. It terminates when it finds an acceptably good solution in a fixed amount of time, rather than the best possible solution, and is used to locate a good approximation to the global optimum of a given function in a large search space. It is widely used in VLSI layout, airline scheduling, etc.
The problem with purely greedy local moves is that the neighbors of a state are not guaranteed to contain any of the existing better solutions, which means that failure to find a better solution among them does not guarantee that no better solution exists. Simulated annealing, by occasionally accepting worse moves, does not get stuck at a local optimum: if it runs for an infinite amount of time (with a suitably slow cooling schedule), the global optimum will be found.
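A hedged sketch of this scheme follows; the cooling schedule and the example problem are illustrative assumptions only.

import math, random

def simulated_annealing(start, random_neighbor, value, schedule):
    """Always accept better moves; accept worse moves with probability exp(delta / T)."""
    current = start
    t = 1
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = random_neighbor(current)
        delta = value(nxt) - value(current)
        if delta > 0 or random.random() < math.exp(delta / T):
            current = nxt
        t += 1

# Illustrative problem: maximize f(x) = -(x - 3)**2 over the integers, starting far from the optimum.
result = simulated_annealing(
    start=50,
    random_neighbor=lambda x: x + random.choice([-1, 1]),
    value=lambda x: -(x - 3) ** 2,
    schedule=lambda t: 10.0 * (0.99 ** t) if t < 2000 else 0.0,   # geometric cooling, then stop
)
print(result)   # typically ends near x = 3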
GENETIC ALGORITHMS:
Genetic algorithms are inspired by evolutionary biology and natural selection (e.g. inheritance) and evolve toward better solutions. A successor state is generated by combining two parent states, rather than by modifying a single state. We start with k randomly generated states (the population); each state is an individual. A state is represented as a string over a finite alphabet (often a string of 0s and 1s). An evaluation function (fitness function) assigns higher values to better states. The next generation of states is produced by selection, crossover, and mutation.
Commonly, the algorithm terminates when either a maximum number of generations has been produced, or a satisfactory fitness level has been reached for the population.
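A compact sketch of one selection-crossover-mutation loop over bit-string individuals; the fitness function (the "one-max" problem) and all parameter values are illustrative assumptions.

import random

def genetic_algorithm(fitness, n_bits=8, pop_size=20, generations=100, p_mutation=0.05):
    """Evolve a population of bit strings; return the fittest individual found."""
    population = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: fitness-proportionate choice of parents (small epsilon avoids zero weights).
        weights = [fitness(ind) + 1e-9 for ind in population]
        parents = random.choices(population, weights=weights, k=pop_size)
        # Crossover: single random cut point between pairs of parents.
        next_gen = []
        for i in range(0, pop_size, 2):
            a, b = parents[i], parents[(i + 1) % pop_size]
            cut = random.randint(1, n_bits - 1)
            next_gen += [a[:cut] + b[cut:], b[:cut] + a[cut:]]
        # Mutation: flip each bit with a small probability.
        population = [[bit ^ 1 if random.random() < p_mutation else bit for bit in ind]
                      for ind in next_gen[:pop_size]]
    return max(population, key=fitness)

# Illustrative fitness: the number of 1s in the string.
print(genetic_algorithm(fitness=lambda ind: sum(ind)))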
Line search: In gradient ascent the state is updated by x ← x + α∇f(x), where α is a small step size. Line search tries to overcome the dilemma of adjusting α by extending the current gradient direction, repeatedly doubling α until f starts to decrease again. The point at which this occurs becomes the new current state.
In cases where the objective function is not differentiable, we calculate an empirical gradient by evaluating the response to small increments and decrements in each coordinate. Empirical gradient search is the same as steepest-ascent hill climbing in a discretized version of the state space.
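A small sketch of empirical-gradient ascent with the doubling line search described above; the objective function, step sizes, and tolerances are illustrative assumptions.

def empirical_gradient(f, x, eps=1e-4):
    """Approximate the gradient of f at x by small increments and decrements in each coordinate."""
    grad = []
    for i in range(len(x)):
        up = list(x); up[i] += eps
        down = list(x); down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def line_search_step(f, x, grad, alpha=1e-3):
    """Extend the gradient direction, doubling alpha until f stops improving."""
    def step(a):
        return [xi + a * gi for xi, gi in zip(x, grad)]
    best = step(alpha)
    if f(best) <= f(x):
        return x                 # even the smallest step does not improve f
    while True:
        candidate = step(2 * alpha)
        if f(candidate) <= f(best):
            return best          # the point just before f starts to decrease
        best, alpha = candidate, 2 * alpha

# Illustrative objective: maximize f(x, y) = -(x - 1)**2 - (y + 2)**2 starting from the origin.
f = lambda p: -(p[0] - 1) ** 2 - (p[1] + 2) ** 2
x = [0.0, 0.0]
for _ in range(20):
    x = line_search_step(f, x, empirical_gradient(f, x))
print(x)   # approaches [1, -2]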
Newton-Raphson method: A general technique for finding roots of a function g, i.e. solving g(x) = 0, by iterating the update x ← x − g(x)/g′(x). To find a maximum or minimum of an objective function f, it can be applied to ∇f(x) = 0, giving the update x ← x − H_f(x)^(-1) ∇f(x), where H_f(x) is the Hessian matrix of second derivatives.
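A one-dimensional sketch of the Newton-Raphson update x ← x − g(x)/g′(x); the function and starting point below are arbitrary examples.

def newton_raphson(g, g_prime, x, tolerance=1e-10, max_iterations=50):
    """Iterate x <- x - g(x)/g'(x) until g(x) is close enough to zero."""
    for _ in range(max_iterations):
        if abs(g(x)) < tolerance:
            break
        x = x - g(x) / g_prime(x)
    return x

# Example: find the square root of 2 by solving g(x) = x**2 - 2 = 0.
print(newton_raphson(g=lambda x: x**2 - 2, g_prime=lambda x: 2*x, x=1.0))   # ~1.414213562...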
Constrained optimization: An optimization problem is constrained if solutions must satisfy hard constraints on the values of the variables (e.g. linear programming problems).
Many local search methods also apply to problems in continuous spaces. Linear programming and convex optimization problems obey certain restrictions on the shape of the state space and the nature of the objective function, and admit polynomial-time algorithms that are often extremely efficient in practice.
Nondeterministic problems:
The transition model is defined by a RESULTS function that returns a set of possible outcome states; a solution is not a sequence but a contingency plan (strategy),
e.g.
[Suck, if State = 5 then [Right, Suck] else []];
In nondeterministic environments, agents can apply AND-OR search to generate contingent plans that
reach the goal regardless of which outcomes occur during execution.
AND-OR search trees
OR nodes: In a deterministic environment, the only branching is introduced by the agent's own choices in each state; we call these nodes OR nodes.
AND nodes: In a nondeterministic environment, branching is also introduced by the environment's choice of outcome for each action; we call these nodes AND nodes.
AND-OR tree: OR nodes and AND nodes alternate. State nodes are OR nodes where some action must be chosen. At the AND nodes (shown as circles), every outcome must be handled.
A solution (shown in bold lines) for an AND-OR search problem is a subtree that (1) has a goal node at every leaf, (2) specifies one action at each of its OR nodes, and (3) includes every outcome branch at each of its AND nodes.
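A hedged sketch of depth-first AND-OR graph search that returns such a contingency plan, assuming hypothetical actions(state), results(state, action), and goal_test(state) functions supplied by the problem.

def and_or_search(start, actions, results, goal_test):
    """Return a conditional plan of the form [action, {outcome: subplan, ...}], or None on failure."""

    def or_search(state, path):
        if goal_test(state):
            return []                  # empty plan: already at a goal
        if state in path:
            return None                # avoid cycles along the current path
        for action in actions(state):
            plan = and_search(results(state, action), path + [state])
            if plan is not None:
                return [action, plan]
        return None

    def and_search(states, path):
        plans = {}
        for s in states:               # every possible outcome must be handled
            plan = or_search(s, path)
            if plan is None:
                return None
            plans[s] = plan
        return plans

    return or_search(start, [])

# Toy nondeterministic problem: from state 1, action 'a' may lead to state 2 or 3; both reach goal 4.
actions = lambda s: [] if s == 4 else ['a']
results = lambda s, a: {1: [2, 3], 2: [4], 3: [4]}[s]
print(and_or_search(1, actions, results, goal_test=lambda s: s == 4))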
A belief-state problem can be defined as follows (the underlying physical problem P is defined by ACTIONS_P, RESULT_P, GOAL-TEST_P and STEP-COST_P):
·Belief states: The belief-state space contains every possible set of physical states. If P has N states, the sensorless problem has up to 2^N belief states (although many may be unreachable from the initial state).
·Initial state: Typically the set of all states in P.
·Actions:
a. If illegal actions have no effect on the environment, take the union of all the actions in any of the physical states in the current belief state b:
ACTIONS(b) = ∪_{s∈b} ACTIONS_P(s).
b. If illegal actions are extremely dangerous, take the intersection.
·Transition model:
a. For deterministic actions, b' = RESULT(b, a) = {s' : s' = RESULT_P(s, a) and s ∈ b} (b' is never larger than b).
b. For nondeterministic actions, b' = RESULT(b, a) = {s' : s' ∈ RESULTS_P(s, a) and s ∈ b} = ∪_{s∈b} RESULTS_P(s, a) (b' may be larger than b).
The process of generating the new belief state after the action is called the prediction step.
·Goal test: A belief state satisfies the goal only if all the physical states in it satisfy GOAL-TEST_P.
·Path cost: This is tricky if the same action can have different costs in different states; for now, assume that the cost of an action is the same in all states, so it can be transferred directly from the underlying physical problem.
If an action sequence is a solution for a belief state b, it is also a solution for any subset of b. Hence, we
can discard a path reaching the superset if the subset has already been generated. Conversely, if the
superset has already been generated and found to be solvable, then any subset is guaranteed to be
solvable.
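The belief-state formulation above can be sketched as follows. Belief states are represented as frozensets, and actions_p, result_p / results_p, and goal_test_p are assumed to be supplied by the underlying physical problem; the function names are illustrative.

def belief_actions(b, actions_p):
    """Union of the physical actions, assuming illegal actions have no effect."""
    return set().union(*(actions_p(s) for s in b))

def predict(b, a, result_p):
    """Prediction step for a deterministic action: b' = {RESULT_P(s, a) : s in b}."""
    return frozenset(result_p(s, a) for s in b)

def predict_nondeterministic(b, a, results_p):
    """For nondeterministic actions, take the union of all possible outcome states."""
    return frozenset(s2 for s in b for s2 in results_p(s, a))

def belief_goal_test(b, goal_test_p):
    """A belief state satisfies the goal only if every physical state in it is a goal state."""
    return all(goal_test_p(s) for s in b)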
The update stage determines, for each possible percept, the belief state that would result from that percept. The new belief state b_o is the set of states in the predicted belief state b̂ that could have produced the percept o:
b_o = UPDATE(b̂, o) = {s : o = PERCEPT(s) and s ∈ b̂}.
An agent for partially observable environments is similar to the simple problem-solving agent: it formulates a problem, calls a search algorithm, and executes the solution.
Main difference:
1) The solution will be a conditional plan rather than a sequence.
2) The agent will need to maintain its belief state as it performs actions and receives percepts.
Given an initial state b, an action a and a percept o, the new belief state is
b’ = UPDATE(PREDICT(b, a), o). //recursive state estimator
State estimation: a.k.a. monitoring or filtering, a core function of an intelligent system in partially observable environments, namely maintaining one's belief state.
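A small sketch of this recursive state estimator, assuming hypothetical results_p(state, action) and percept(state) functions for the underlying physical problem:

def predict(b, a, results_p):
    """All physical states that could result from taking action a in some state of b."""
    return frozenset(s2 for s in b for s2 in results_p(s, a))

def update(b_hat, o, percept):
    """Keep only the predicted states that could have produced the observed percept o."""
    return frozenset(s for s in b_hat if percept(s) == o)

def state_estimate(b, a, o, results_p, percept):
    """Recursive state estimator: b' = UPDATE(PREDICT(b, a), o)."""
    return update(predict(b, a, results_p), o, percept)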
Online Search Agents:
In general, an offline search would have to come up with an exponentially large contingency plan that considers all possible happenings, while an online search need only consider what actually does happen. For example, a chess-playing agent is well advised to make its first move long before it has figured out the complete course of the game. Online search is a necessary idea for an exploration problem, where the states and actions are unknown to the agent. An agent in this state of ignorance must use its actions as experiments to determine what to do next, and hence must interleave computation and action.
The canonical example of online search is a robot that is placed in a new building and must explore it to build a map that it can use for getting from A to B. Methods for escaping from labyrinths (required knowledge for aspiring heroes of antiquity) are also examples of online search algorithms. Spatial exploration is not the only form of exploration, however.
Consider a newborn baby: it has many possible actions, but knows the outcomes of none of them,
and it has experienced only a few of the possible states that it can reach. The baby’s gradual discovery
of how the world works is, in part, an online search process.
Note in particular that the agent cannot access the successors of a state except by actually trying all the actions in that state. For example, in the maze problem shown in the figure, the agent does not know that going Up from (1,1) leads to (1,2); nor, having done that, does it know that going Down will take it back to (1,1). This degree of ignorance can be reduced in some applications; for example, a robot explorer might know how its movement actions work and be ignorant only of the locations of obstacles.
We will assume that the agent can always recognize a state that it has visited before, and we will assume that the actions are deterministic. Finally, the agent might have access to an admissible heuristic function h(s) that estimates the distance from the current state to a goal state. For example, in the figure, the agent might know the location of the goal and be able to use the Manhattan distance heuristic.
Figure: A simple maze problem. The agent starts at S and must reach G, but knows nothing of the environment.
Figure (a): Two state spaces that might lead an online search agent into a dead end. Any given agent will fail in at least one of these spaces.
Figure (b): A two-dimensional environment that can cause an online search agent to follow an arbitrarily inefficient route to the goal. Whichever choice the agent makes, the adversary blocks that route with another long, thin wall, so that the path followed is much longer than the best possible path.
Perhaps you find the term "accidentally" unconvincing; after all, there might be an algorithm that happens not to take the dead-end path as it explores. Our claim, to be more precise, is that no algorithm can avoid dead ends in all state spaces.
Consider the two dead-end state spaces in Figure (a). To an online search algorithm that has visited states S and A, the two state spaces look identical, so it must make the same decision in both. Therefore, it will fail to reach the goal in one of them.
Dead ends are a real difficulty for robot exploration: staircases, ramps, cliffs, and all kinds of natural terrain present opportunities for irreversible actions. To make progress, we will simply assume that the state space is safely explorable, that is, some goal state is reachable from every reachable state. State spaces with reversible actions, such as mazes and 8-puzzles, can be viewed as undirected graphs and are clearly safely explorable.
Even in safely explorable environments, no bounded competitive ratio can be guaranteed if there are paths
of unbounded cost. This is easy to show in environments with irreversible actions, but in fact it remains
true for the reversible case as well, as Figure (b) shows. For this reason, it is common to describe the
performance of online search algorithms in terms of the size of the entire state space rather than just the
depth of the shallowest goal.
For example, offline algorithms such as A* have the ability to expand a node in one part of the space and
then immediately expand a node in another part of the space, because node expansion involves simulated
rather than real actions. An online algorithm, on the other hand, can expand only a node that it physically
occupies. To avoid traveling all the way across the tree to expand the next node, it seems better to expand
nodes in a local order.
Depth-first search has exactly this property, because (except when backtracking) the next node expanded
is a child of the previous node expanded.
An online depth-first search agent is shown in the figure. This agent stores its map in a table, result[a, s], that records the state resulting from executing action a in state s. Whenever an action from the current state has not been explored, the agent tries that action.
The difficulty comes when the agent has tried all the actions in a state. In offline depth-first search, the
state is simply dropped from the queue; in an online search, the agent has to backtrack physically.
In depth-first search, this means going back to the state from which the agent entered the current state most recently. That is achieved by keeping a table that lists, for each state, the predecessor states to which the agent has not yet backtracked. If the agent has run out of states to which it can backtrack, then its search is complete.
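A hedged Python rendering of the ONLINE-DFS-AGENT behaviour described above; actions_fn and goal_test are assumed to be supplied by the problem, and the result, untried, and unbacktracked tables persist across calls.

class OnlineDFSAgent:
    """Online depth-first exploration; backtracks physically, so actions must be reversible."""

    def __init__(self, actions_fn, goal_test):
        self.actions_fn = actions_fn      # actions available in a given state
        self.goal_test = goal_test
        self.result = {}                  # result[(s, a)]: state observed after doing a in s
        self.untried = {}                 # state -> actions not yet tried there
        self.unbacktracked = {}           # state -> predecessor states not yet backtracked to
        self.s = None                     # previous state
        self.a = None                     # previous action

    def __call__(self, s_prime):
        """Given the current state (percept), return the next action, or None to stop."""
        if self.goal_test(s_prime):
            return None
        if s_prime not in self.untried:
            self.untried[s_prime] = list(self.actions_fn(s_prime))
        if self.s is not None:
            self.result[(self.s, self.a)] = s_prime
            self.unbacktracked.setdefault(s_prime, []).insert(0, self.s)
        if not self.untried[s_prime]:
            if not self.unbacktracked.get(s_prime):
                return None               # nothing left to explore or backtrack to
            back_to = self.unbacktracked[s_prime].pop(0)
            # All actions here have been tried, so the reverse action's outcome is already recorded.
            self.a = next(b for b in self.actions_fn(s_prime)
                          if self.result.get((s_prime, b)) == back_to)
        else:
            self.a = self.untried[s_prime].pop()
        self.s = s_prime
        return self.a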
The progress of ONLINE-DFS-AGENT can be traced when applied to the maze given in Figure. It is fairly
easy to see that the agent will, in the worst case, end up traversing every link in the state space exactly
twice.
For exploration, this is optimal; for finding a goal, on the other hand, the agent's competitive ratio could be
arbitrarily bad if it goes off on a long excursion when there is a goal right next to the initial state. An
online variant of iterative deepening solves this problem; for an environment that is a uniform tree, the
competitive ratio of such an agent is a small constant.
Because of its method of backtracking, ONLINE-DFS-AGENT works only in state spaces where the actions are reversible. There are slightly more complex algorithms that work in general state spaces, but no such algorithm has a bounded competitive ratio.
Figure: An online search agent that uses depth-first exploration. The agent is applicable only in bidirected search spaces.
Like depth-first search, hill-climbing search has the property of locality in its node expansions. In fact,
because it keeps just one current state in memory, hill-climbing search is already an online search
algorithm! Unfortunately, it is not very useful in its simplest form because it leaves the agent sitting at
local maxima with nowhere to go. Moreover, random restarts cannot be used, because the agent cannot
transport itself to a new state.
Instead of random restarts, one might consider using a random walk to explore the environment. A
random walk simply selects at random one of the available actions from the current state; preference can
be given to actions that have not yet been tried. It is easy to prove that a random walk will eventually find
a goal or complete its exploration, provided that the space is finite. On the other hand, the process can be very slow. The figure below shows an environment in which a random walk will take exponentially many steps to find the goal, because, at each step, backward progress is twice as likely as forward progress.
The example is contrived, of course, but there are many real-world state spaces whose topology causes
these kinds of "traps" for random walks.
Augmenting hill climbing with memory rather than randomness turns out to be a more effective approach. The basic idea is to store a "current best estimate" H(s) of the cost to reach the goal from each state that has been visited. H(s) starts out being just the heuristic estimate h(s) and is updated as the agent gains experience in the state space.
Figure: An environment in which a random walk will take exponentially many steps to find the goal.
In the example, there are two actions with estimated costs 1 + 9 and 1 + 2, so it seems best to move right.
Now, it is clear that the cost estimate of 2 for the shaded state was overly optimistic. Since the best move
cost 1 and led to a state that is at least 2 steps from a goal, the shaded state must be at least 3 steps from
a goal, so its H should be updated accordingly, as shown in Figure. Continuing this process, the agent will
move back and forth twice more, updating H each time and "flattening out" the local minimum until it
escapes to the right.
An agent implementing this scheme, which is called learning real-time A* (LRTA*), is shown in the figure.
Like ONLINE-DFS-AGENT, it builds a map of the environment using the result table. It updates the cost
estimate for the state it has just left and then chooses the "apparently best" move according to its current
cost estimates. One important detail is that actions that have not yet been tried in a state s are always
assumed to lead immediately to the goal with the least possible cost, namely h(s). This optimism under
uncertainty encourages the agent to explore new, possibly promising paths.
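A hedged sketch of an LRTA* agent following the description above; actions_fn, step_cost, the heuristic h, and goal_test are assumed to be supplied by the problem.

class LRTAStarAgent:
    """Learning real-time A* (LRTA*) agent sketch."""

    def __init__(self, actions_fn, step_cost, h, goal_test):
        self.actions_fn = actions_fn
        self.step_cost = step_cost        # c(s, a, s')
        self.h = h                        # heuristic estimate of cost to the goal
        self.goal_test = goal_test
        self.result = {}                  # result[(s, a)] -> state reached
        self.H = {}                       # learned cost-to-goal estimates
        self.s = None
        self.a = None

    def cost(self, s, a):
        """Estimated cost of taking a in s; untried actions optimistically cost h(s)."""
        s2 = self.result.get((s, a))
        if s2 is None:
            return self.h(s)
        return self.step_cost(s, a, s2) + self.H[s2]

    def __call__(self, s_prime):
        if self.goal_test(s_prime):
            return None
        if s_prime not in self.H:
            self.H[s_prime] = self.h(s_prime)
        if self.s is not None:
            self.result[(self.s, self.a)] = s_prime
            # Update the cost estimate for the state the agent has just left.
            self.H[self.s] = min(self.cost(self.s, b) for b in self.actions_fn(self.s))
        # Choose the "apparently best" action from the current state.
        self.a = min(self.actions_fn(s_prime), key=lambda b: self.cost(s_prime, b))
        self.s = s_prime
        return self.a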
The initial ignorance of online search agents provides several opportunities for learning. First, the agents
learn a "map" of the environment-more precisely, the outcome of each action in each state-simply by
recording each of their experiences. (Notice that the assumption of deterministic environments means that
one experience is enough for each action.) Second, the local search agents acquire more accurate
estimates of the value of each state by using local updating rules.
Figure: Five iterations of LRTA* on a one-dimensional state space. Each state is labeled with H(s), the current cost estimate to reach a goal, and each arc is labeled with its step cost. The shaded state marks the location of the agent, and the updated values at each iteration are circled.
These updates eventually converge to exact values for every state, provided that the agent explores the
state space in the right way. Once exact values are known, optimal decisions can be taken simply by
moving to the highest-valued successor; that is, pure hill climbing is then an optimal strategy.
If you followed our suggestion to trace the behavior of ONLINE-DFS-AGENT in the environment, you will have noticed that the agent is not very bright. For example, after it has seen that the Up action goes from (1,1) to (1,2), the agent still has no idea that the Down action goes back to (1,1), or that the Up action also goes from (2,1) to (2,2), from (2,2) to (2,3), and so on. In general, we would like the agent to learn that Up increases the y-coordinate unless there is a wall in the way, that Down reduces it, and so on. For this to happen, we need two things. First, we need a formal and explicitly manipulable representation for these kinds of general rules; so far, we have hidden the information inside the black box called the successor function. Second, we need algorithms that can construct suitable general rules from the specific observations made by the agent.
*******