Forward search (Progression search)
Forward search is not useful without heuristic.
From the current state, we try to apply actions and see what states we can get in this domain until we reach the goal.
Backward search (Regression / Relevant-state search)
state: conjunction of ground literals
description: conjunction of relevant ground literals + any other fluent
- Subset of first-order logic
- Easy for us to understand and specify
- Hard to deal with
- Hard to develop good heuristics
- Allows formula to represent a set of propositional variables
Description represents a set of states.
For $n$ ground fluents, there is $2^n$ ground states(true or false) and $3^n$ descriptions(true, false and NA).
Backward search starts from the goal states and go back.
It needs inverse transitions and relevant actions.
Forward:
Backward:
Preconditions must have been true β add $πππππππ(π)$ to $πβ²$
Add-list (or effects) need not be true before the action β remove $π΄π·π·(π)$
$π·πΈπΏ(π)$ disappears: it doesnβt matter if they were true or not before the action.
Relevant actions:
At least one of the actionβs effect must unify with an element of the current goal;
Must not contain effects that negate an element of the current goal.
Backward search can improve efficiency, has lower branching factor as compared to forward search (mostly) and does not rule out any solution.
Heuristics for Planning (Relax the problem)
Ignore preconditions heuristics
Drop all the preconditions
Relaxation: actions applicable everywhere
Number of steps to solve the problem β number of unsatisfied goals
- Further relax: remove all remove effects except goal literals
Need minimum number of actions such that their union satisfies the goal
Becomes a set-cover problem (NP-Hard)
Greedy algorithm β within a factor of logβ‘(π) factor of the optimal
Ignore delete list heuristic
Monotonically move towards the goal
Relaxation : no action will undo progress made by another action
Optimal length of the solution can be used as heuristic
Still NP-hard (searching is still exponential)
Hill-climbing gives approximate solution in polynomial time(not exact solution)
Problem decomposition
If each subplan uses an admissible heuristic, take maximum of them for the plan
Assume subgoal independence: decompose to disjoint subsets
Sum of cost of each plan, if admissible, is better than max of the heuristics
These are heuristics because:
- There are factor representation
- Use STRIPS to represent the problem
SATPLAN
Transfer planning(search) problem to boolean satisfiability problem.
SAT is NP-hard, but is still effective in practice with millions of variables and constraints so it can be solved in a good(polynominal) time.
SAT: Finding an assignment to variables such that: boolean formula becomes TRUE
Input to SAT solvers: Conjunctive Normal Form (CNF) formulas
- CNF β conjunction of clauses
- Clause β disjunction of literals
- Literal β variable or its negation
- $\neg$ should appear only on literals
Any Boolean formula can be converted to CNF
Step1: Eliminate $\Leftrightarrow$; replace $\alpha\Leftrightarrow\beta$ with $\alpha\Rightarrow\beta\wedge\beta\Rightarrow\alpha$
Step2: Eliminate $\Rightarrow$; replace $\alpha\Rightarrow\beta$ with $\neg\alpha\vee\beta$
Step3: Move $\neg$ inwards by repeatedly applying:
$\qquad\neg(\neg\alpha)\equiv\alpha\quad$ Double negation elimination
$\qquad\neg(\alpha\wedge\beta)\equiv\neg\alpha\vee\neg\beta\quad$DeMorgan
$\qquad\neg(\alpha\vee\beta)\equiv\neg\alpha\wedge\neg\beta\quad$DeMorgan
Step4: Apply distributive law
Distribute $\vee$ over $\wedge$ when possible.
Transfrom STRIPS to SAT
- Propositionize actions
replace each action scheme with all ground actions. - Define initial state
Assert $F^0$ for every fluent $F$ in initial state and $\neg F^0$ for every fluent not in the initial state. - Propositionalize the goal
Construct a disjunction over all possible ground conjunctions obtained by replacing the variables with constants. - Add successor-state axiom (defining the transition)
Stating what is and what is not affected by the action
For each fluent $F$ add axiom of the form:which means F is true i.i.f. some action caused it or it was already true and no action changed it.
$ActionCausesF^t$ is a disjunction of all ground actions that have $F$ in their add list.
$ActionCausesNotF^t$ is a disjunction of all ground actions that have $F$ in their delete list.
Frame problem
Most actions leave most fluents unchanged.
Solution: write frame axioms to show nothing else changed on taking an action.
For $m$ actions and $n$ fluents, need $O(mn)$ axioms
Improvement: successor-state axiom only need $O(n)$
Each axiom is longer, but only involves actions that has an affect on the fluent. - Add precondition axioms
If an action is taken at time $t$, its preconditions must have been true.
For each ground action $A$, add axiom $A^t\Rightarrow PRE(A)^t$ - Add action exclusion axiom
Every action is distinct from every other action (only one action is allowed at each time step, two actions cannot happen at the same time)
Specify mutual exclusion $\neg (A_i^t\wedge A_j^t)\Leftrightarrow\neg A_i^t\vee\neg A_j^t$ - Create a conjunction of all the axioms (This is the problem definition)
- Convert this conjunction to CNF
SATPLAN
Problem: do not know how many steps
Solution: try values $T$ up to $T_{max}$
If you get a solution within $T_{max}$ then stop and output it, otherwise we do not know if there exists valid solutions.
After finding a solution, we extract the actions and trace back what actions have been done.
Complexity of planning
Worst case: intractable
PlanSat: answers the question is there any plan that solves a planning problem.
Bounded PlanSat: answers the question if there is a solution of length $k$ or less.
Decidable for classical planning as the number of states is finite.
PlanSat and BoundedPlanSat are PSPACE (solvable with polynomial space)