Decision Networks

Post author: HUANG Liu
Post link: https://huangliu0909.github.io/2021/02/10/AI_Decision_Theory_2/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 3.0 (https://creativecommons.org/licenses/by-nc-sa/3.0/) unless stated otherwise.

Essentially, decision networks are extensions of Bayesian networks.
Bayesian networks exploit conditional independence, which reduces the number of quantities that must be considered to reach a result.
A decision network (also known as an influence diagram) combines a Bayesian network with action and utility nodes in order to work out which action leads to a desired result.

A simplified form eliminates the chance nodes that represent the outcome state, so the action node and the current-state nodes are connected directly to the utility node.
The utility node then represents the expected utility associated with each action, and is described by the action-utility function or 𝑄-function (Q: quality of action).
In this lecture we treat probabilistic inference (refer to Uncertainty in AI) as a black box and use an existing tool to solve it.
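
As a minimal sketch of the simplified form, the snippet below picks the action with the highest expected utility. The belief over the current state stands in for the output of the black-box inference step, and the umbrella scenario, the numbers, and the names `belief`, `q_table`, `expected_utilities` and `best_action` are all illustrative assumptions, not taken from the lecture.

```python
# Hypothetical umbrella example: every number below is an illustrative assumption.

def expected_utilities(belief, q_table):
    """Return {action: sum over states of P(state | e) * Q(action, state)}."""
    return {
        action: sum(belief[state] * q for state, q in row.items())
        for action, row in q_table.items()
    }

def best_action(belief, q_table):
    """Return the action with maximum expected utility and that utility."""
    eus = expected_utilities(belief, q_table)
    action = max(eus, key=eus.get)
    return action, eus[action]

# P(state | e), assumed to come from some inference tool (the black box).
belief = {"sun": 0.7, "rain": 0.3}

# Q(action, state): the utility node is a direct function of action and current state.
q_table = {
    "leave": {"sun": 100.0, "rain": 0.0},
    "take": {"sun": 20.0, "rain": 70.0},
}

action, eu = best_action(belief, q_table)
print(action, eu)
```

With these assumed numbers, leaving the umbrella at home has the higher expected utility, because sun is considered much more likely than rain.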

Value of information

Assume exact evidence can be obtained about a variable $E_j$.
To compute the value of perfect information (VPI):

Given current evidence $e$, the expected utility of the current best action $\alpha$ is: $EU(\alpha|e)=\max_a\sum_{s'}P(Result(a)=s'|a,e)U(s')$
After the new evidence $E_j=e_j$ is obtained, the value of the new best action $\alpha_{e_j}$ is: $EU(\alpha_{e_j}|e,e_j)=\max_a\sum_{s'}P(Result(a)=s'|a,e,e_j)U(s')$
The variable $E_j$ can take multiple values $e_{jk}$ that are not known in advance, so we average over them using the current beliefs: $VPI_e(E_j)=\left(\sum_k P(E_j=e_{jk}|e)\,EU(\alpha_{e_{jk}}|e,e_{jk})\right)-EU(\alpha|e)$
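
The VPI formula can be sketched in a few lines, assuming a single hidden state, one evidence variable, and the simplified form in which utility depends directly on action and state. Bayes' rule by enumeration stands in for the black-box inference, and the forecast scenario, the numbers, and the names `prior`, `likelihood`, `q_table`, `max_expected_utility` and `vpi` are hypothetical.

```python
# Hypothetical forecast example: every number below is an illustrative assumption.

def max_expected_utility(belief, q_table):
    """Expected utility of the best action under the given belief over states."""
    return max(
        sum(belief[s] * q for s, q in row.items())
        for row in q_table.values()
    )

def vpi(prior, likelihood, q_table):
    """Value of perfect information about one evidence variable E_j.

    prior[s]          = P(s | e)
    likelihood[s][v]  = P(E_j = v | s, e)
    """
    eu_now = max_expected_utility(prior, q_table)
    values = next(iter(likelihood.values())).keys()
    eu_informed = 0.0
    for v in values:
        # P(E_j = v | e), by summing out the state.
        p_v = sum(prior[s] * likelihood[s][v] for s in prior)
        if p_v == 0.0:
            continue  # an impossible observation contributes nothing
        # Posterior P(s | e, E_j = v) by Bayes' rule.
        posterior = {s: prior[s] * likelihood[s][v] / p_v for s in prior}
        eu_informed += p_v * max_expected_utility(posterior, q_table)
    return eu_informed - eu_now

prior = {"sun": 0.7, "rain": 0.3}
likelihood = {
    "sun": {"good": 0.8, "bad": 0.2},
    "rain": {"good": 0.1, "bad": 0.9},
}
q_table = {
    "leave": {"sun": 100.0, "rain": 0.0},
    "take": {"sun": 20.0, "rain": 70.0},
}

print(vpi(prior, likelihood, q_table))
```

In this assumed setup the forecast is worth something only because a bad forecast changes the best action from leaving the umbrella to taking it.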

Properties:

Expected value of information is always non-negative:
$\forall e,E_j\quad VPI_e(E_j)\geq 0$
VPI is not additive in general:
$VPI_e(E_j,E_k)\neq VPI_e(E_j) + VPI_e(E_k)$
VPI is order independent:
$VPI_e(E_j,E_k)= VPI_e(E_j) + VPI_{e,e_j}(E_k)= VPI_e(E_k) + VPI_{e,e_k}(E_j)$
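
These properties can be checked numerically on a small example. The sketch below assumes a hidden state with two noisy, conditionally independent forecasts, and reads the conditional term $VPI_{e,e_j}(E_k)$ as an average over the possible values of $E_j$. The joint table, the numbers, and the helper names `meu`, `vpi` and `chained` are hypothetical.

```python
from itertools import product

def meu(joint, q_table, given):
    """Return (max expected utility, probability of the evidence `given`).

    joint maps (state, e1, e2) -> probability; given maps tuple index -> value.
    """
    weights = {}
    for outcome, p in joint.items():
        if all(outcome[i] == v for i, v in given.items()):
            weights[outcome[0]] = weights.get(outcome[0], 0.0) + p
    p_given = sum(weights.values())
    if p_given == 0.0:
        return 0.0, 0.0  # impossible evidence
    best = max(
        sum(w / p_given * row[s] for s, w in weights.items())
        for row in q_table.values()
    )
    return best, p_given

def vpi(joint, q_table, observe, given=None):
    """VPI of observing the variables at indices `observe`, given prior evidence."""
    given = dict(given or {})
    eu_now, p_given = meu(joint, q_table, given)  # assumes P(given) > 0
    domains = [sorted({o[i] for o in joint}) for i in observe]
    eu_informed = 0.0
    for values in product(*domains):
        extended = {**given, **dict(zip(observe, values))}
        eu, p_ext = meu(joint, q_table, extended)
        eu_informed += (p_ext / p_given) * eu
    return eu_informed - eu_now

# Illustrative numbers only.
prior = {"sun": 0.7, "rain": 0.3}
lik = {"sun": {"good": 0.8, "bad": 0.2}, "rain": {"good": 0.1, "bad": 0.9}}
q_table = {
    "leave": {"sun": 100.0, "rain": 0.0},
    "take": {"sun": 20.0, "rain": 70.0},
}
joint = {
    (s, a, b): prior[s] * lik[s][a] * lik[s][b]
    for s in prior for a in ("good", "bad") for b in ("good", "bad")
}

def chained(first, second):
    """VPI(first) plus the expected VPI of `second` once `first` is known."""
    total = vpi(joint, q_table, [first])
    for value in ("good", "bad"):
        _, p_value = meu(joint, q_table, {first: value})
        total += p_value * vpi(joint, q_table, [second], {first: value})
    return total

v1 = vpi(joint, q_table, [1])
v2 = vpi(joint, q_table, [2])
v12 = vpi(joint, q_table, [1, 2])
print(v1, v2, v12, chained(1, 2), chained(2, 1))
```

In this assumed example the joint value is smaller than the sum of the two individual values, since the second forecast partly repeats what the first already revealed, while both observation orders give the same total.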

Design a new agent

An agent should gather information before taking actions, if possible.
Choose Information