Section 11.2. Planning with State-Space Search

Notice that there are many irrelevant actions that can also lead to a goal state. For example, we can fly an empty plane from JFK to SFO; this action reaches a goal state from a predecessor state in which the plane is at JFK and all the goal conjuncts are satisfied. A backward search that allows irrelevant actions will still be complete, but it will be much less efficient. If a solution exists, it will be found by a backward search that allows only relevant actions. The restriction to relevant actions means that backward search often has a much lower branching factor than forward search. For example, our air cargo problem has about 1000 actions leading forward from the initial state, but only 20 actions working backward from the goal.

Searching backwards is sometimes called regression planning. The principal question in regression planning is this: what are the states from which applying a given action leads to the goal? Computing the description of these states is called regressing the goal through the action. To see how to do it, consider the air cargo example. We have the goal

At(C1, B) ∧ At(C2, B) ∧ ... ∧ At(C20, B)

and the relevant action Unload(C1, p, B), which achieves the first conjunct. The action will work only if its preconditions are satisfied. Therefore, any predecessor state must include these preconditions: In(C1, p) ∧ At(p, B). Moreover, the subgoal At(C1, B) should not be true in the predecessor state.4 Thus, the predecessor description is

In(C1, p) ∧ At(p, B) ∧ At(C2, B) ∧ ... ∧ At(C20, B) .

In addition to insisting that actions achieve some desired literal, we must insist that the actions not undo any desired literals. An action that satisfies this restriction is called consistent. For example, the action Load(C2, p) would not be consistent with the current goal, because it would negate the literal At(C2, B).
Given definitions of relevance and consistency, we can describe the general process of constructing predecessors for backward search. Given a goal description G, let A be an action that is relevant and consistent. The corresponding predecessor is as follows:

• Any positive effects of A that appear in G are deleted.
• Each precondition literal of A is added, unless it already appears.

Any of the standard search algorithms can be used to carry out the search. Termination occurs when a predecessor description is generated that is satisfied by the initial state of the planning problem. In the first-order case, satisfaction might require a substitution for variables in the predecessor description. For example, the predecessor description in the preceding paragraph is satisfied by the initial state

In(C1, P12) ∧ At(P12, B) ∧ At(C2, B) ∧ ... ∧ At(C20, B)

with substitution {p/P12}. The substitution must be applied to the actions leading from the state to the goal, producing the solution [Unload(C1, P12, B)].

4 If the subgoal were true in the predecessor state, the action would still lead to a goal state. On the other hand, such actions are irrelevant because they do not make the goal true.
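The regression steps just described can be sketched in code. The following is a minimal illustrative sketch, not the book's own program: the set-of-strings representation of literals and the function name regress are our assumptions.

```python
# Minimal sketch of goal regression for propositional STRIPS actions.
# Literals are represented as strings in Python sets; this layout is an
# illustrative assumption, not notation from the text.

def regress(goal, precond, pos_effects, neg_effects):
    """Regress a goal (set of positive literals) through an action.

    Returns the predecessor description, or None when the action is
    irrelevant (achieves no goal conjunct) or inconsistent (undoes one).
    """
    goal = set(goal)
    if not (pos_effects & goal):   # not relevant: achieves nothing we want
        return None
    if neg_effects & goal:         # not consistent: deletes a desired literal
        return None
    # Delete the achieved positive effects, then add the preconditions.
    return (goal - pos_effects) | set(precond)

# Regressing At(C1,B) ∧ At(C2,B) through Unload(C1,p,B):
pred = regress({"At(C1,B)", "At(C2,B)"},
               precond={"In(C1,p)", "At(p,B)"},
               pos_effects={"At(C1,B)"},
               neg_effects={"In(C1,p)"})
# pred == {"In(C1,p)", "At(p,B)", "At(C2,B)"}
```

An action such as Load(C2, p) would be rejected by these checks: it achieves no conjunct of this goal, and it would delete At(C2, B), matching the consistency discussion above.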
Heuristics for state-space search

It turns out that neither forward nor backward search is efficient without a good heuristic function. Recall from Chapter 4 that a heuristic function estimates the distance from a state to the goal; in STRIPS planning, the cost of each action is 1, so the distance is the number of actions. The basic idea is to look at the effects of the actions and at the goals that must be achieved and to guess how many actions are needed to achieve all the goals. Finding the exact number is NP-hard, but it is possible to find reasonable estimates most of the time without too much computation. We might also be able to derive an admissible heuristic, one that does not overestimate. This could be used with A* search to find optimal solutions.

There are two approaches that can be tried. The first is to derive a relaxed problem from the given problem specification, as described in Chapter 4. The optimal solution cost for the relaxed problem (which we hope is very easy to solve) gives an admissible heuristic for the original problem. The second approach is to pretend that a pure divide-and-conquer algorithm will work. This is called the subgoal independence assumption: the cost of solving a conjunction of subgoals is approximated by the sum of the costs of solving each subgoal independently. The subgoal independence assumption can be optimistic or pessimistic. It is optimistic when there are negative interactions between the subplans for each subgoal, for example, when an action in one subplan deletes a goal achieved by another subplan. It is pessimistic, and therefore inadmissible, when subplans contain redundant actions; for instance, two actions that could be replaced by a single action in the merged plan.

Let us consider how to derive relaxed planning problems. Since explicit representations of preconditions and effects are available, the process will work by modifying those representations.
(Compare this approach with search problems, where the successor function is a black box.) The simplest idea is to relax the problem by removing all preconditions from the actions. Then every action will always be applicable, and any literal can be achieved in one step (if there is an applicable action; if not, the goal is impossible). This almost implies that the number of steps required to solve a conjunction of goals is the number of unsatisfied goals: almost but not quite, because (1) there may be two actions, each of which deletes the goal literal achieved by the other, and (2) some action may achieve multiple goals. If we combine our relaxed problem with the subgoal independence assumption, both of these issues are assumed away and the resulting heuristic is exactly the number of unsatisfied goals.

In many cases, a more accurate heuristic is obtained by considering at least the positive interactions arising from actions that achieve multiple goals. First, we relax the problem further by removing negative effects (see Exercise 11.6). Then, we count the minimum number of actions required such that the union of those actions' positive effects satisfies the goal. For example, consider

Goal(A ∧ B ∧ C)
Action(X, EFFECT:A ∧ P)
Action(Y, EFFECT:B ∧ C ∧ Q)
Action(Z, EFFECT:B ∧ P ∧ Q) .

The minimal set cover of the goal {A, B, C} is given by the actions {X, Y}, so the set cover heuristic returns a cost of 2. This improves on the subgoal independence assumption, which
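The set-cover computation for this example can be sketched as follows. Because exact minimal set cover is NP-hard, the sketch uses the standard greedy approximation (repeatedly pick the action covering the most unsatisfied goals); the function name and dict-of-effect-sets layout are our own assumptions, and on this example greedy happens to find the optimum.

```python
# Greedy set-cover sketch for the heuristic described above.
# The representation (dict mapping action name to its set of positive
# effects) is an illustrative assumption.

def set_cover_heuristic(goals, positive_effects):
    """Count actions whose combined positive effects cover all goals."""
    uncovered = set(goals)
    cost = 0
    while uncovered:
        # Greedy choice: the action covering the most remaining goals.
        best = max(positive_effects.values(),
                   key=lambda eff: len(eff & uncovered))
        if not (best & uncovered):
            return float("inf")    # some goal literal is unachievable
        uncovered -= best
        cost += 1
    return cost

effects = {"X": {"A", "P"}, "Y": {"B", "C", "Q"}, "Z": {"B", "P", "Q"}}
h = set_cover_heuristic({"A", "B", "C"}, effects)
# h == 2: Y covers {B, C}, then X covers {A}
```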
gives a heuristic value of 3. There is one minor irritation: the set cover problem is NP-hard. A simple greedy set-covering algorithm is guaranteed to return a value that is within a factor of log n of the true minimum value, where n is the number of literals in the goal, and usually works much better than this in practice. Unfortunately, the greedy algorithm loses the guarantee of admissibility for the heuristic.

It is also possible to generate relaxed problems by removing negative effects without removing preconditions. That is, if an action has the effect A ∧ ¬B in the original problem, it will have the effect A in the relaxed problem. This means that we need not worry about negative interactions between subplans, because no action can delete the literals achieved by another action. The solution cost of the resulting relaxed problem gives what is called the empty-delete-list heuristic. The heuristic is quite accurate, but computing it involves actually running a (simple) planning algorithm. In practice, the search in the relaxed problem is often fast enough that the cost is worthwhile.

The heuristics described here can be used in either the progression or the regression direction. At the time of writing, progression planners using the empty-delete-list heuristic hold the lead. That is likely to change as new heuristics and new search techniques are explored. Since planning is exponentially hard,5 no algorithm will be efficient for all problems, but many practical problems can be solved with the heuristic methods in this chapter, far more than could be solved just a few years ago.

11.3 PARTIAL-ORDER PLANNING

Forward and backward state-space search are particular forms of totally ordered plan search. They explore only strictly linear sequences of actions directly connected to the start or goal. This means that they cannot take advantage of problem decomposition.
Rather than work on each subproblem separately, they must always make decisions about how to sequence actions from all the subproblems. We would prefer an approach that works on several subgoals independently, solves them with several subplans, and then combines the subplans.

Such an approach also has the advantage of flexibility in the order in which it constructs the plan. That is, the planner can work on “obvious” or “important” decisions first, rather than being forced to work on steps in chronological order. For example, a planning agent that is in Berkeley and wishes to be in Monte Carlo might first try to find a flight from San Francisco to Paris; given information about the departure and arrival times, it can then work on ways to get to and from the airports.

The general strategy of delaying a choice during search is called a least commitment strategy. There is no formal definition of least commitment, and clearly some degree of commitment is necessary, lest the search make no progress. Despite the informality, least commitment is a useful concept for analyzing when decisions should be made in any search problem.

5 Technically, STRIPS-style planning is PSPACE-complete unless actions have only positive preconditions and only one effect literal (Bylander, 1994).
Our first concrete example will be much simpler than planning a vacation. Consider the simple problem of putting on a pair of shoes. We can describe this as a formal planning problem as follows:

Goal(RightShoeOn ∧ LeftShoeOn)
Init()
Action(RightShoe, PRECOND:RightSockOn, EFFECT:RightShoeOn)
Action(RightSock, EFFECT:RightSockOn)
Action(LeftShoe, PRECOND:LeftSockOn, EFFECT:LeftShoeOn)
Action(LeftSock, EFFECT:LeftSockOn) .

A planner should be able to come up with the two-action sequence RightSock followed by RightShoe to achieve the first conjunct of the goal and the sequence LeftSock followed by LeftShoe for the second conjunct. Then the two sequences can be combined to yield the final plan. In doing this, the planner will be manipulating the two subsequences independently, without committing to whether an action in one sequence is before or after an action in the other. Any planning algorithm that can place two actions into a plan without specifying which comes first is called a partial-order planner. Figure 11.6 shows the partial-order plan that is the solution to the shoes and socks problem. Note that the solution is represented as a graph of actions, not a sequence. Note also the “dummy” actions called Start and Finish, which mark the beginning and end of the plan. Calling them actions simplifies things, because now every step of a plan is an action. The partial-order solution corresponds to six possible total-order plans; each of these is called a linearization of the partial-order plan.

Partial-order planning can be implemented as a search in the space of partial-order plans. (From now on, we will just call them “plans.”) That is, we start with an empty plan. Then we consider ways of refining the plan until we come up with a complete plan that solves the problem.
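The claim that the sock/shoe plan has six linearizations can be checked by brute force: enumerate every total order of the four real actions and keep those that respect the two ordering constraints. Start and Finish are omitted here since they are pinned to the ends of every linearization; the variable names are our own. A small sketch:

```python
from itertools import permutations

# Count linearizations of the sock/shoe partial-order plan by filtering
# all total orders of the four actions against the ordering constraints.
actions = ["RightSock", "RightShoe", "LeftSock", "LeftShoe"]
before = [("RightSock", "RightShoe"), ("LeftSock", "LeftShoe")]

linearizations = [
    order for order in permutations(actions)
    if all(order.index(a) < order.index(b) for a, b in before)
]
# len(linearizations) == 6, matching the six total-order plans above
```

Of the 4! = 24 total orders, each of the two independent constraints rules out half, leaving 24/4 = 6.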
The actions in this search are not actions in the world, but actions on plans: adding a step to the plan, imposing an ordering that puts one action before another, and so on.

We will define the POP algorithm for partial-order planning. It is traditional to write out the POP algorithm as a stand-alone program, but we will instead formulate partial-order planning as an instance of a search problem. This allows us to focus on the plan refinement steps that can be applied, rather than worrying about how the algorithm explores the space. In fact, a wide variety of uninformed or heuristic search methods can be applied once the search problem is formulated.

Remember that the states of our search problem will be (mostly unfinished) plans. To avoid confusion with the states of the world, we will talk about plans rather than states. Each plan has the following four components, where the first two define the steps of the plan and the last two serve a bookkeeping function to determine how plans can be extended:

• A set of actions that make up the steps of the plan. These are taken from the set of actions in the planning problem. The “empty” plan contains just the Start and Finish actions. Start has no preconditions and has as its effect all the literals in the initial state of the planning problem. Finish has no effects and has as its preconditions the goal literals of the planning problem.