Planning
Planning means identifying appropriate actions and their sequence, itβs the basis of rational behavior.
We need to design intelligent machines because:
- the machine may be out of reach, eg. satellites
- experts are not always available.
So the machine should learn how to work rationally by itself.
Types of planning
- Path and motion planning
Generate a path from start to end and a control trajectory along the path. - Perception planning
Sensing actions to gather information and model the environment. - Navigation planning
Motion + perception planning to generate a policy based on localization/sensor-based primitives to explore an area. - Manipulation planning
Concerned with handling objects (think automobile assembly line).
Solving planning problems
Need domain independent representation and efficient solution mechanism.
STRIPS
STRIPS and PDDL both are languages to specify the problem.
STRIPS is one of the first representation languages for planning by Stanford.
Factored representation
A atomic representation is one in which each state is treated as a black box, having no internal structure: the state either does or does not match the goal.
A factored representation is one in which the states are defined by set of features, having more internal structure. State is represented as a set of attribute/value pairs, e.g. in Strips and Pddl, we use binary variables which menas the state is either true or false.
In planning problems, state variables are often referred to as fluents.
Logic notation
Term: an object in the world. Can be a variable, constant or function
Atomic sentence or Atom: a predicate(relations) symbol, optionally followed by a parenthesized list of terms, it can be true or false
Literal: atom or its negation
Ground literal: A literal with no variable
Sentence or Formula: Formed from atoms together with quantifiers ($\forall$, $\exists$), logical connectives ($\wedge$, $\vee$, $\neg$), equality symbol ($=$), Sentence can take values πππ’π or πΉπππ π
Substitution: replaces variables by terms
Unifier: takes 2 sentences and returns a substitution that makes the sentence look identical, if such a substitution exists
For every pair of unifiable sentences, there is a most general unifier (MGU), which is unique up to renaming and substitution of variables
Let $\sigma$ be a most general unifier and $\theta$ be a unifier. After applying $\sigma$, we can find another unifier $w$ such that applying $w$ to the output of $\sigma$ gives the same effect as applying $\theta$.
STRIPS
To specify a problem, we need 5 components:
- Initial state
- Goal test
- Set of actions
- Transition model
- Path cost
With STRIPS, we can specify 4 of them:
- Initial state
- Actions available
- Result of applying an action in a (Effect)
- Goal state
Thereβs no cost function in STRIPS
State is represented by a conjunction of positive literals that are grounded and function-free. Literals not mentioned are false due to closed-world assumption. (unique name assumption)
Eg. π΄π‘(πΉππ‘βππ(πΉπππ), ππ¦ππππ¦) is not allowed because STRIPS doesnβt allow functions.
Goal is a partially specified state, represented as a conjunction of positive ground literals.
A state π satisfies a goal π if π contains all the literals in π.
Action Schema consists ofοΌ
- the action name,
- list of variables used,
- precondition(conjunction of function-free positive literals),
- effect(conjunction of function-free (positive/negative) literals).
Action π is applicable in state π if all precondition of π is satisfied in π .
Ground actions: obtained by instantiating the variables in the schema.
For an action π with π£ variables in a domain with π objects, there are $O(k^v)$ distinct ground actions in the worst case. So thereβs a blow up in the number of ground actions which are difficult to evaluate.
Result of executing π in state π is a state π β² which is a set of fluents formed as follows:
- Start from π
- Remove fluents that appear as negative literals in the effect (delete list: π·πΈπΏ(π))
- Add fluents that appear as positive literals in the effect (add list: π΄π·π·(π))
STRIPS summary
State: conjunction of ground positive function-free literals (Closed world assumption)
Goal: conjunction of ground positive function-free literals (Satisfied by states that contain all literals)
Action precondition: conjunction of positive function-free literals
Action effects: conjunction of function-free literals
Positive literals for add list
Negative literals for delete list
Planning domain
A set of action schemas define planning domain
Planning problem: action schema + initial state + goal
PDDL
Strips β restrictive
$\surd$ Efficient to solve (β΅ of restrictions)
$\Large\times$ Hard to describe the problem
So we need more expressive power.
Action Description Language (ADL)
- positive and negative literals in states, open world assumption
- quantified variables at goal, conjunctions and disjunctions at goal
- conditional effects
- equality predicate, typed variables
PDDL : ADL + extensions
Inequality predicate β not in STRIPS but available in PDDL eg. $a\neq b$
Solving the planning problem
- Forward Search (heuristics)
Very large state space β exponential with the number of state variables
Action space can be very large
Forward search is hopeless without good heuristics - Backward Search (Research / relevant-state search)
- or combined