Airtificial Intelligence II
A simple Planning Agent
An agent uses the percepts provided by the environment to build a complete and correct model of the current world state, after which , to achieve its goal, it calls a suitable planning algorithm (which we will call IDEAL-PLANNER) to generate a plan of action.
A simple planning agent: The agent first generates a goal to achieve, and then constructs a plan to achieve it from the current state. Once agent has a plan, it keeps executing the plan until the plan is finished, then begins again with a new goal.
function SIMPLE-PLANNING-AGENT(percept) returns an action
static: KB, a knowledge base (includes action descriptions)
p, a plan, initially No Plan
t, a counter, initially 0, indicating time
local variables: G, a goal
current, a current state description
TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
current = STATE-DESCRIPT1ON(KB, t)
if p = NoPlan then
G <— ASK(KB, MAKE-GOAL-QUERY(O)
p <— lDEAL-PLANNER(current, G, KB)
if p = No Plan or p is empty then action = No Op
else
action <= FIRST(p)
p <= Rest(p)
Tell(Kb, MAKE-ACTION-SENTENCE(action,t)
t=t+1
return action
Problem solving and Planning
Basic elements of a search-based problem-solver are Representation of actions,states,goals and plans.
To make planning practical
(1) Restrict the language with which PLANNER we define problems. With a restrictive
language, there are fewer possible solutions to search through.
(2) Use a special-purpose algorithm called a planner rather than a general-purpose
theorem prover to search for a solution.
Basic representation for planning
LEAST COMMITMENT
the principle of least commitment, which says that one should only make choices about things that you currently care about, leaving the other choices to be worked out later.
PARTIAL ORDER
A planner that can represent plans in which some steps are ordered (before or after) with respect to each other and other steps are unordered is called a partia order planner. The alternative is a total order planner, in which plans consist of a simple list of steps.
LINEARIZATION
A totally ordered plan that is derived from a plan P by adding ordering constraints is called a linearization of P.
What is a PLAN?
A plan is formally defined as a data structure consisting of the following four components:
- Set of plan steps
- Set of step ordering constraints
- Set of variable binding constraints
- Set of casual links
What is a Solution?
A solution is a plan that an agent can execute, and that guarantees achievement of the goal. If we wanted to make it really easy to check that a plan is a solution, we could insist that only fully instantiated, totally ordered plans can be solutions.
Partial Order Planning Algorithm
Partial order planning algorithm.
function POP(initial, goal, operators) returns plan
plan <— MAKE-MINIMAL-PLAN(initial, goal)
loop do
if SOLUTION?(plan) then return plan
Sneed, C = SELECT-SUBGOAL(plan)
CHOOSE-OPERATOR(plan, operators, Sneed, c)
RESOLVE-THREATS(plan)
end
function SELECT-SUBGOAL(plan) returns Sneed, c
pick a plan step Sneed from STEPS(plan)
with a precondition c that has not been achieved
return Sneed, c
procedure CHOOSE-OPERATOR(plan, operators, Sneed, c)
choose a step Sadd from operators or STEPS(plan)
that has c as an effect
if there is no such step then fail
add the causal link Sadd= Sneed to LINKS(plan)
add the ordering constraint Sadd=Sneed
to ORDERINGS(plan)
if Sadd is a newly added step from operators then
add Sadd to STEPS(plan)
add Start = Sadd X Finish to ORDERINGS( plan)
procedure RESOLVE-THREATS(plan)
for each Sthreat that threatens a link
Si ---Sj in LINKS(plan)do
choose either
Promotion: Add Sthreat = S; to ORDERINGS(plan)
Demotion: Add Sj = Sthreat to ORDERINGS(plan)
if not CONSISTENT(plan) then fail
end
How to Resolve threats?
There are three main approaches to dealing with possible threats:
- Resolve now with an equality constraint
- Resolve now with an inequality constraint
- Resolve Later
Knowledge Engineering for planning
- Decide what to talk about.
- Decide on a vocabulary of conditions (literals), operators, and objects.
- Encode operators for the domain.
- Encode a description of the specific problem instance.
- Pose problems to the planner and get back plans.
Practical Planning
Insufficiency of STRIPS for practical planning
The STRIPS language is insufficient for practical planning domain because it cannot express four key concepts:
- Hierarchical plans
- Complex conditions
- Time
- Resources
The problem that a factory solves is to take in raw materials and components, and assemble them into finished products. The problem can be divided into a planning task (deciding what assembly steps are going to be performed) and a scheduling task (deciding when and where each step will be performed).
Scheduling for space missions
Planning and scheduling systems have been used extensively in planning space missions as well as in constructing spacecraft. There are two main reasons for this.
- First, spacecraft are very expensive and sometimes contain humans, and any mistake can be costly and irrevocable.
- Second, space missions take place in space, which does not contain many other agents to mess up the expected effects of actions.
SIPE (System for Interactive Planning and Execution monitoring) was the first planner to deal with the problem of replanning, and the first to take some important steps toward expressive operators.
Hierchical decomposition
The practical planners have adopted the idea of hierarchical decomposition: that an abstract operator can be decomposed into a group of steps that forms a plan that implements the operator. These decompositions can be stored in a library of plans and retrieved as needed. Extending the language
To incorporate hierarchical decomposition, we have to make two additions to the description
of each problem domain.
- We partition the set of operators into primitive and nonprimitive operators. In our
domain, we might define Hammer(Nail) to be primitive and Build(House) to be nonprimitive. - We add a set of decomposition methods. Each method is an expression of the form
Decompose(o, p), which means that a nonprimitive operator that unifies with o can be decomposed into a plan p.
Hierarchical decomposition of the operator which build Hotel into small partial-order plans

Modifying the planner
A hierarchical decomposition partial-order planning algorithm(HD-POP): On each iteration of the loop we first achieve an unachieved condition (CHOOSEOPERATOR), then decompose a nonprimitive operator (CHOOSEDECOMPOSITION), then resolve threats.
function HD-POP(plan, operators, methods) returns plan
inputs: plan, an abstract plan with start
and goal steps (and possibly other steps)
loop do
if SOLUTION(plan) then return plan
Sneed, C = SELECT-SUB-GOAL(plan)
CHOOSE-OPERATOR(plan, operators, Sneed, c)
Snonprim = SELECT-NONPRIMITIVE(plan)
CHOOSE-DECOMPOSITION(PLOT, methods, Snonprim)
RESOLVE-THREATS( plan)
end
Detailed decomposition of a plan step.

Analysis of Hierarchical Decomposition
Abstract solution A plan that contains abstract operators, but is consistent and complete once an abstract solution is found we can prune away all other abstract plans from the search tree. This property is the downward solution property.
We can prune away all the descendants of any inconsistent abstract plan. This is called the
upward solution property
One way to guarantee the upward solution property is to make sure that each decomposition method satisfies the unique main subaction condition: that there is one step of the decomposed plan to which all preconditions and effects of the abstract operator are attached.
More Expressive operator descriptions
Conditional effects
Operators with conditional effects have different effects depending on what the world is like when they are executed.
A version of RESOLVE-THREATS with the confrontation technique for resolving conditional effects.
procedure RESOLVE-THREATS(plan)
for each Si=Sj , in LINKS(plan) do
for each Sthreat in STEPS(plan) do
for each c' in EFFECT(Sthreat) do
if SUBST(BINDINGS(plan),c) = SUBST(BINDINGS(plan),~ c) then
choose either
Promotion: Add Sthreat = Si to ORDERINGS(plan)
Demotion: Add Si = S,i,ml, to ORDERINGS(plan)
Confrontation: if c' is really of the form (~c when p) then
CHOOSE-OPERATOR(plan, operators, Sthreat, ~p)
RESOLVE-THREATS( plan)
if not CONSISTENT(plan) then fail
end
end
end
Resource constraints
Using measures in planning
The solution is to introduce numeric-valued measures. Measures such as the price of gas are realities with which the planner must deal, but over which it j has little control. Other measures, such as Cash and GasLevel, are treated as resources that can be produced and consumed.
Temporal constraints
In most ways, time can be treated like any other resource. The initial state specifies a start
time for the plan.
Planning and Acting
Its a process in which planning systems must face up to the awful prospect of actually having to take their own advice.
Ways to deal with the problems arising from incomplete and incorrect information:
Conditional planning: Also known as contingency planning, conditional planning deals with incomplete information by constructing a conditional plan that accounts for each possible situation or contingency that could arise. The agent finds out which part of the plan to execute by including sensing actions in the plan to test for the appropriate conditions.
example, the shopping agent might want to include a sensing action in its ; shopping plan to check the price of some object in case it is too expensive
Execution monitoring: once it has a plan to execute, it does not use its percepts to J select actions.
example, if the agent discovers that it does not have enough money to pay for all the items it has picked up, it can return some and replace them with cheaper versions.
Conditional Planning
The nature of conditional plans
The condition must be known to the agent at that point in the plan. To ensure that a conditional plan is executable, the agent must insert actions that cause the relevant conditions to become known by the agent.
The CPOP algorithm for constructing conditional plans.
function CPOP(initial, goals, operators) returns plan
plan <— MAKE-PLAN(initial, goals)
loop do
Termination:
if there are no unsatisfied preconditions
and the contexts of the finish steps are exhaustive
then return plan
Alternative context generation:
if the plans for existing finish steps are
complete and have contexts C1 ... Cn then
add a new finish step with a context i
(Ci V ... V Cn)
this becomes the current context
Subgoal selection and addition:
find a plan step Sneed with an open precondition c
Action selection:
choose a step Sadd from operators or STEPS(plan)
that adds c or
knowledge of c and has a context compatible with
the current context
if there is no such step
then fail
add Sadd and Sneed to LINKS( plan)
add Sadd < Swed to ORDERINGS( plan)
if Sadd is a newly added step then
add Sadd to STEPS(plan)
add Start < Sadd < Finish to ORDERINGS(plan)
Threat resolution:
for each step Sthreat that potentially
threatens any causal link Si
with a compatible context do{
choose one of
Promotion: AddS,i,rea, < S; to ORDERINGS(pfan)
Demotion: Add Sj < Sthreat; to ORDERINGS(plan)c;S}
Conditioning:
find a conditional step Scond possibly before
both Sthreat and Sj where
1. the context of Scond is compatible with
the contexts of Sthreat and Si;
2. the step has outcomes consistent with
Sthreat and Sj, respectively
add conditioning links for the outcomes from
Scond to Sthreat and Sj
augment and propagate the contexts of Sthreat
and Sj
if no choice is consistent
then fail
end
end
A simple replanning Agent
To check the preconditions of each action as it is executed, rather than checking the preconditions of the entire remaining plan. This is called action monitoring.
We can divide the causes of plan failure into two kinds, depending on whether it is possible to anticipate the possible contingencies:
Bounded indeterminacy: In this case, actions can have unexpected effects, but the possible effects can be enumerated and described as part of the action description axiom. For example, the result of opening a can of paint can be described as the disjunction of having paint available, having an empty can, or spilling the paint.
Unbounded indeterminacy: In this case, the set of possible unexpected outcomes is too large to be completely enumerated. This would be the case in very complex and/or dynamic domains such as driving, economic planning, and military strategy. In such cases, we can plan for at most a limited number of contingencies, and must be able to replan when reality does not behave as expected.
What is a Situated planning Agent?
Rather than thinking of Agent as the planner which passes its results to execution monitor as separate processes, we can think of them as a single process in a situated planning agent.
A situated planning agent.
function SITUATED-PLANNING-AGENT(percept) returns an action
static: KB, a knowledge base (includes action descriptions)
p, a plan, initially NoPlan
t, a counter, initially 0, indicating time
G, a goal
Tell(KB, MAKE-PERCEPT-SENTENCE(percept,
current = STATE-DESCRIPTION(KB, t)
EFFECTS(START(P)) — current
if p = NoPlan then
G = ASK(KB, MAKE-GOAL-QUERY(t))
p = MAKE-PLAN(current, G, KB)
action = NoOp (the default)
Termination:
if there are no open preconditions and p has no
steps other than START and FINISH then
p = NoPlan and skip remaining steps
Resolving standard flaws:
resolve any open condition by adding a causal
link from any existing
possibly prior step or a new step
resolve potential threats by promotion or demotion
Remove unsupported causal links:
if there is a causal link START
c; S protecting a proposition c
that no longer holds in START then
remove the link and any associated bindings
Extend causal links back to earliest possible step:
if there is a causal link Si = > Sk, such that
another step Si exists with Si < Sj and the
link Si , Sk is safe then
replace Si-->Sk with Sj --> Sk,
Remove redundant actions:
remove any step S that supplies no causal links
Execute actions when ready for execution:
if a step S in the plan other than FINISH
satisfies the following:
(a) all preconditions satisfied by START;
(b) no other steps necessarily between START and S; and
(c) S does not threaten any causal link in p then
add ordering constraints to force all other steps after S
remove S from p, and all causal links to and from S
action <= the action in S
TELL(KB, MAKE-ACTION-SENTENCE(action, t))
t — t+ 1
return action
Coercion and abstraction
Coercion, which reduces uncertainty about the world by forcing it into a known state regardless of the initial state.
Abstraction also allows the agent to ignore details of a problem about which it may not have exact and complete knowledge
Acting Under Uncertainty
Agents almost never have access to the whole truth about their environment. The agent must therefore act under uncertainty.
Uncertainty and rational decisions
The presence of uncertainty changes radically the way in which an agent makes decisions. To make such choices, an agent must first have preferences between the different possible outcomes of the various plans , utility theory can be used to represent and reason with preferences.
Design for a decision-theoretic agent
A decision-theoretic agent that selects rational actions
function DT-AGENT(percept) returns an action
static: a set probabilistic beliefs about the
state of the world
calculate updated probabilities for current
state based on
available evidence including current percept
and previous action
calculate outcome probabilities for actions,
given action descriptions and probabilities of
current states
select action with highest expected utility
given probabilities of outcomes and utility
information
return action.
Basic Probability notation
Prior probability
We will use the notation P(A) for the unconditional or prior probability that the proposition A is true. For example, if Cavity denotes the proposition that a particular patient has a cavity, P(Cavity) = O.1
Conditional probability
Once the agent has obtained some evidence concerning the previously unknown propositions making up the domain we use conditional or posterior probabilities, with the notation P(A\B). This is read as "the probability of A given that all we know is B." For example,
P(Cavity\Toothache) – 0.8
The Axioms of Probability
The following are suffecient axioms of probability:
1. All probabilities are between 0 and 1.
0 < P(A) < 1
2. Necessarily true (i.e., valid) propositions have probability 1, and necessarily false (i.e., unsatisfiable) propositions have probability 0.
P(True) = 1
P(False) = 0
3. The probability of a disjunction is given by
P(A V B) = P(A) + P(B) - P(A /\ B)
A Venn diagram showing the propositions A, B, A V B (the union of A and B), and A /\ B (the intersection of A and B) as sets of possible worlds.

a proposition in terms of the probability of the proposition itself:
P(A V ~A) = P(A) + P(~A) - P(A A ~A) (by 3 with B = ~A)
P(True} = P(A) + P(~A) – P(False) (by logical equivalence)
1 = P(A) + P(~A) (by 2)
P(~A) = 1 – P(A) (by algebra)
de Finetti proved something much stronger: if Agent 1 expresses a set of degrees of belief that violate the axioms of probability theory then there is a betting strategy for Agent 2 that guarantees that Agent 1 will lose money. So if you accept the idea that an agent should be willing to "put its money where its probabilities are," then you should accept that it is irrational to have beliefs that violate the axioms of probability.
The joint probability distribution
A probabilistic model of a domain consists of a set of random variables that can take on particular values with certain probabilities. Let the variables be X1 ... Xn. An atomic event is an assignment of particular values to all the variables—in other words, a complete specification of the state of the domain.
Bayes Rule and its Use
product rule:
P(A /\ B)=P(A|B)P(B) *
P(A /\ B)= P(B|A)P(A)
Baye's rule :

general version conditionalized on some background evidence E:

Applying Bayes' rule: The simple case
Suppose there is a school with 60% boys and 40% girls as students. The female students wear trousers or skirts in equal numbers; the boys all wear trousers. An observer sees a (random) student from a distance; all the observer can see is that this student is wearing trousers. What is the probability this student is a girl? The correct answer can be computed using Bayes' theorem.
The event A is that the student observed is a girl, and the event B is that the student observed is wearing trousers. To compute P(A|B), we first need to know:
- P(A), or the probability that the student is a girl regardless of any other information. Since the observers sees a random student, meaning that all students have the same probability of being observed, and the fraction of girls among the students is 40%, this probability equals 0.4.
- P(B|A), or the probability of the student wearing trousers given that the student is a girl. As they are as likely to wear skirts as trousers, this is 0.5.
- P(B), or the probability of a (randomly selected) student wearing trousers regardless of any other information. Since half of the girls and all of the boys are wearing trousers, this is 0.5×0.4 + 1×0.6 = 0.8.
Given all this information, the probability of the observer having spotted a girl given that the observed student is wearing trousers can be computed by substituting these values in the formula:

Where Do Probabilities Come from ?
The frequentist position is that the numbers can eome only from experiments: if we test 100 people and find that 10 of them have a cavity, then we can say the probability of a cavity is approximately 0.1. A great deal of work has gone into making such statistical assessments reliable.
The objectivist view is that probabilities are real aspects of the universe—propensities of objects
to behave in certain ways—rather than being just descriptions of an observer's degree of belief. In this view, frequentist measurements are attempts to observe the real probability value.
The subjectivist view describes probabilities as a way of characterizing an agent's beliefs, rather
than having any external physical significance. This allows the doctor or analyst to make these numbers up, to say, "In my opinion, I expect the probability of a cavity to be about 0.1." even if you not allow subjective methods, the choice of which of the experiments to use is a subjective choice known as the reference class problem..
What are Probabilistic Reasoning System?
Belief Network
Data structure called a belief network is used to represent the dependence between variables and to give a concise specification of the joint probability distribution. A belief network is a graph in which the following holds:
- A set of random variables makes up the nodes of the network.
- A set of directed links or arrows connects pairs of nodes. The intuitive meaning of an arrow from node X to node Y is that X has a direct influence on Y.
- Each node has a conditional probability table that quantifies the effects that the parents have on the node. The parents of a node are all those nodes that have arrows pointing to it.
- The graph has no directed cycles (hence is a directed, acyclic graph, or DAG).
A method for constructing belief networks
The general procedure for incremental network construction is as follows:
- Choose the set of relevant variables X, that describe the domain.
- Choose an ordering for the variables.
- While there are variables left:
(a) Pick a variable X,- and add a node to the network for it.
(b) Set Parents(Xi) to some minimal set of nodes already in the net such that the conditional
independence property (15.2) is satisfied.
(c) Define the conditional probability table for Xi
Compactness and node ordering
The compactness of belief networks is an example of a very general property of locally structured (also called sparse) systems
Network structure depends on order of introduction. the correct order to add nodes is to add the "root causes" first, then the variables they influence, and so on until we reach the "leaves" which have no direct causal influence on the other variables.
Types of Distribution
Canonical distributions—distributions fit some standard pattern. In such cases, the complete table can be specified by naming the pattern and perhaps supplying a few parameters.
A deterministic node has its value specified exactly by the values of its parents, with no uncertainty.
Conditional independence relations in belief networks
D-separation Three ways in which a path from X to Y can be blocked, given the evidence E. If every path from X to Y is blocked, then we say that E d-separates X and Y.

Inference in belief Network
An algorithm for answering queries

Inference in multi connected belief network
A multi connected graph is one in which two nodes are connected by more than one path.There exixts three basic classes of algorithms for evaluating multi connected networks:
Clustering methods transform the network into a probabilistically equivalent (but topologically
different) polytree by merging offending nodes.
Conditioning methods do the transformation by instantiating variables to definite values, and then evaluating a polytree for each possible instantiation.
Stochastic simulation methods use the network to generate a large number of concrete models of the domain that are consistent with the network distribution. They give an approximation of the exact evaluation.
Knowledge engineering for uncertainity reasoning
The approach to knowledge engineering for probabilistic reasoning systems
- Decide what to talk about.
- Decide on a vocabulary of random variables
- Encode general knowledge about the dependence between variables
- Encode a description of the specific problem instance.
- Pose queries to the inference procedure and get answers
Other approaches to uncertainity reasoning
Default reasoning
Rule-based methods for uncertain reasoning
Representing ignorance: Dempster-Shafer theory
Representing vagueness: Fuzzy sets and fuzzy logic
Fuzzy set theory is a means of specifying how well an object satisfies a vague description. For example, consider the proposition "Nate is tall." Is this true, given that Nate is 5' 10"? Most
people would hesitate to answer "true" or "false," preferring to say, "sort of." Note that this is not a question of uncertainty about the external world — we are sure of Nate's height. Rather it is a case of vagueness or uncertainty about the meaning of the linguistic term "tall.
Making Simple Decision
expected utility formulae :

The principle of maximum expected utility (MEU) says that a rational agent should choose an
action that maximizes the agent's expected utility.
The Basis for Utility Theory :: The axioms of utility theory
Ordcrability: Given any two states, a rational agent must either prefer one to the other or else rate the two as equally preferable. That is, an agent should know what it wants.
(A >- B) V (B >- A) V (A ~ B)
Transitivity: Given any three states, if an agent prefers A to B and prefers B to C, then the agent must prefer A to C.
(A >- B) A (B >- C) => (A >- C)
Continuity: If some state B is between A and C in preference, then there is some probability p for which the rational agent will be indifferent between getting B for sure and the lottery that yields A with probability and C with probability 1 — p.
A >~B y C => 3p [B, A; 1 — p, C]
Substitutability: If an agent is indifferent between two lotteries, A and B, then the agent is indifferent between two more complex lotteries that are the same except that B is substituted for A in one of them. This holds regardless of the probabilities and the other outcome(s) inthe lotteries.
A ~ B => lp,A; l-p,C]~ l-p,B; l-p,C]
Monotonicity: Suppose there are two lotteries that have the same two outcomes, A and B. If an agent prefers A to B, then the agent must prefer the lottery that has a higher probability for A (and vice versa).
A>B => (p>q O [p,A; l~p,B]t[q,A; 1 -q,B])
Decomposability: Compound lotteries can be reduced to simpler ones using the laws of probability. This has been called the "no fun in gambling" rule because it says that an agent should not prefer (or disprefer) one lottery just because it has more choice points than another.
[p, A; 1 — p, [q,B; 1 — q,C]\ ~ [p,A; (1 — p)q,B; (1 — p)(l — q),C]
also...
1 . Utility principle
If an agent's preferences obey the axioms of utility, then there exists a real-valued function U that operates on states such that U(A) > U(B) if and only if A is preferred to B, and U(A) - U(B) if and only if the agent is indifferent between A and B.
U(A) > U(B) o A X B
U(A) = U(B) ^ A ~ B
2. Maximum Expected Utility principle The utility of a lottery is the sum of the probabilities of
each outcome times the utility of that outcome.
Some definations
Utility Function : Utility is a function that maps from states to real numbers.
Multiattribute utility functions : Decision making in the field of public policy involves both millions of dollars and life and death.
Dominance : Suppose that airport site Si costs less, generates less noise pollution, and is safer than site S2. One would not hesitate to reject S2. We say that there is strict dominance of Si over S2
Decision Networks
Evaluating decision networks
The algorithm for evaluating decision networks is the following:
- Set the evidence variables for the current state.
- For each possible value of the decision node:
(a) Set the decision node to that value.
(b) Calculate the posterior probabilities for the parent nodes of the utility node, using a
standard probabilistic inference algorithm.
(c) Calculate the resulting utility for the action. - Return the action with the highest utility.
The Value of Informant
A general formula :

Decision Theoritic Expert Systems
The knowledge engineering process required for building and using decision-theoretic expert
system is as follows:
- Determine the scope of the problem
- Lay out the topology.
- Assign probabilities.
- Assign utilities.
- Enter available evidence.
- Evaluate the diagram.
- Obtain new evidence.
- Perform sensitivity analysis.
Making Complex Decisions
Sequential decision problems : where the agent's utility depends on a sequence of decisions.

In the deterministic version of the problem, each action reliably moves one square in the intended direction, except that moving into a wall results in no change in position.
The problem of calculating an optimal policy in an accessible, stochastic environment with a known transition model is called a Markov decision problem (MDP),
In an inaccessible environment, the percept does not provide enough information to determine the state or the associated transition probabilities. In the operations research literature, such problems are called partially observable Markov decision problems, or POMDP
An optimal policy for the stochastic environment

An agent that calculates and uses an optimal policy.
function SIMPLE-POLICY-AGENT(percept) returns an action
static: M, a transition model
U, a utility function on environment histories
P, a policy, initially unknown
if P is unknown then P = the optimal policy given U, M
return P[percept]
Value Iteration
The value iteration algorithm for calculating utilities of states.

Policy Iteration
The policy iteration algorithm for calculating an optimal policy.

A decision-theoretic agent
function DECISION-THEORETlC-AGENT(percept) returns action
calculate updated probabilities for current state based on
available evidence including current percept and previous action
calculate outcome probabilities for actions
given action descriptions and probabilities of current states
select action with highest expected utility
given probabilities of outcomes and utility information
return action
Sensing in uncertain worlds
Detailed design for a decision-theoretic agent.

Dynamic belief network
The environment is modelled by the conditional probability distribution P(X, X,_|,A,_|), which
describes how the state depends on the previous state and the action of the agent. As with the
sensor model, we make a stationarity assumption: the conditional probabilities are the same for
all t. In this section, we will cover the case where the agent is passively monitoring and
predicting a changing environment, rather than acting on it. The agent is thus concerned with a
sequence of X, values, where each one is determined solely by the previous one: P(X, X,_i). This
sequence is called a state evolution model or Markov chain.
.