This document discusses embedding deterministic equations in Bayesian networks through discretization. Standard approaches impose arbitrary imputations that induce information loss. The proposed approach converts the Bayesian network to a credal network with interval-shaped credal sets to provide inferences robust to any imputation. Discretization algorithms aim to minimize information loss by optimizing variable partitions. Inference in the resulting credal networks is challenging due to the addition of interval probabilities.
2. QUICK OUTLINE
▸ Embedding deterministic equations in (discrete) graphical models
▸ Discretisation induces information loss (of course)
▸ (But) only generalised probabilities avoid arbitrary imputation
▸ The Bayesian network might become a credal network
▸ Strategies for fast optimal discretisation in those credal networks
▸ An example of limitation of classical "Bayesian" probabilities
advocating the role of alternative uncertainty models such as
imprecise probability, Dempster-Shafer, possibility theory, ...
6. Continuous: X = x
Discrete: ˜X = ˜x ⟺ x ∈ [x′, x′′]
π(x| ˜x) = ? with ∫_{x′}^{x′′} π(x| ˜x) dx = 1
[Figure: x axis with the interval endpoints x′ and x′′ marked]
8. THEORETICAL BACKGROUND
IMPRECISELY SPECIFIED PROBABILITIES AS CREDAL SETS
▸ A (convex) set K(X) of probability mass functions (Walley)
▸ K(X) = {P(X) : linear constraints} or K(X) = convex hull {P_i(X)}_{i=1}^{k}
▸ Vacuous credal set? No constraints. All the mass functions!
▸ "Interval" credal set? All mass functions giving probability
one to a set of possible values (qualitative model)
▸ Equivalent models defined in evidence (Dempster &
Shafer) and possibility theory (Zadeh, Dubois & Prade)
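As a minimal sketch of the two credal sets named on this slide (an illustration, not the authors' code), a credal set over a finite variable can be stored by its extreme points, and lower/upper expectations are then the minimum and maximum over those points. The helper names `vacuous`, `interval_set` and `bounds`, and the test function `g`, are hypothetical.

```python
def vacuous(n):
    """Extreme points of the vacuous credal set over n states:
    the n degenerate mass functions."""
    return [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]

def interval_set(n, allowed):
    """Extreme points of the credal set of all mass functions that
    give probability one to the states in `allowed`."""
    return [[1.0 if j == i else 0.0 for j in range(n)] for i in sorted(allowed)]

def bounds(extreme_points, g):
    """Lower and upper expectation of g over the convex hull of the
    extreme points (a linear functional attains its extrema at vertices)."""
    values = [sum(p * gi for p, gi in zip(point, g)) for point in extreme_points]
    return min(values), max(values)

g = [10.0, 20.0, 30.0, 40.0]            # hypothetical function of the state
vacuous_bounds = bounds(vacuous(4), g)
interval_bounds = bounds(interval_set(4, {1, 2}), g)
```

Restricting the possible values from all four states to {1, 2} tightens the bounds, which is the sense in which the interval credal set is a qualitative model.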
10. THE PROBLEM
DETERMINISM IN BAYESIAN NETS
▸ Continuous (e.g., Gaussian) BNs?
▸ Not always suited for knowledge-
based systems (expert elicitation
might require discrete values)
▸ Given equation + discretisation find
"best" CPT quantification
Equation: y = f(x) ⇒ P(y|x) := δ[y − f(x)]
Discretisation: (x, y) → (˜x, ˜y)
CPT: P(˜y| ˜x)
[Figure: node Y with parent nodes X1, ..., Xn, linked by f]
11. THE (EXISTING) SOLUTIONS
(SHARP) IMPUTATION STRATEGIES
▸ Value ˜x of discrete variable ˜X
corresponds to a (hyper)interval
[x′, x′′] of values of x
▸ From Dirac to Kronecker?
Degenerate probability P(˜y| ˜x)?
▸ Find a representative point x ∈ [x′, x′′] and
put all the mass on the interval of f(x)
▸ Different imputations might lead to
different quantifications, i.e.,
¯x, ¯¯x ∈ ˜x but f(¯x) ∈ ˜y and f(¯¯x) ∈ ˜y′
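The dependence on the representative point can be sketched in a few lines. The equation f(x) = x², the input interval [1, 2] and the unit-width output partition are all assumptions chosen for illustration; `bin_index` is a hypothetical helper.

```python
import bisect

def bin_index(edges, value):
    """Index of the half-open interval [edges[i], edges[i+1]) containing value."""
    return bisect.bisect_right(edges, value) - 1

def f(x):
    return x * x  # hypothetical deterministic equation y = f(x)

y_edges = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # assumed output partition
x_lo, x_hi = 1.0, 2.0                      # one assumed input interval

# Three common "sharp" imputations of the same discrete input state.
representatives = {
    "left endpoint": x_lo,
    "midpoint": (x_lo + x_hi) / 2,
    "right endpoint": x_hi,
}
imputed_state = {name: bin_index(y_edges, f(x)) for name, x in representatives.items()}
```

Each choice puts all the probability mass on a different output state, so the resulting degenerate CPT column is an artefact of the imputation rather than of the equation.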
12. A TOY EXAMPLE
DISCRETISING THE BODY MASS INDEX
BMI = W / H²
▸ We discretise W (ranges of 5 kg) and H (ranges of 5 cm)
▸ Joe (89 kg, 1.71 m) is moderately obese (BMI = 30.4)
▸ Bill (86 kg, 1.74 m) is overweight (BMI = 28.4)
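The arithmetic behind this example can be checked with a short sketch. Two things here are assumptions rather than statements from the slides: the intervals are taken to be aligned to multiples of 5 (kg and cm), and the class cut-offs are taken to be 25 and 30. The function names are hypothetical.

```python
import math

def bmi(weight_kg, height_m):
    return weight_kg / height_m ** 2

def cell(weight_kg, height_cm):
    """Indices of the 5 kg and 5 cm intervals (assumed aligned to multiples of 5)."""
    return (math.floor(weight_kg / 5), math.floor(height_cm / 5))

def bmi_class(value):
    # Assumed cut-offs: 25 (overweight) and 30 (obese).
    if value < 25:
        return "normal or below"
    if value < 30:
        return "overweight"
    return "obese"

joe_cell = cell(89, 171)
bill_cell = cell(86, 174)
joe_class = bmi_class(bmi(89, 1.71))
bill_class = bmi_class(bmi(86, 1.74))
```

Under these assumptions Joe and Bill share the same discrete (W, H) state but fall in different BMI classes, so no single sharp CPT entry for that state can be right for both.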
13. THE (EXISTING) SOLUTIONS
(SOFT) IMPUTATION STRATEGIES
▸ What if the image of the hyper-interval of an
input variable overlaps different
intervals of the output variable? Split the
probability mass over those intervals?
▸ No reason for equiprobable options!
We are in a condition of ignorance (not
indifference) between the options
▸ Probability proportional to the coverage?
No reason for a uniform prior!
▸ A Bayesian network with multiple
quantifications of its CPTs?
▸ This is a credal network (Cozman, 2000)
P(˜y| ˜x) ∈ [0,1]
∀˜y : [min_{x∈˜x} f(x), max_{x∈˜x} f(x)] ∩ ˜y ≠ ∅
15. OUR PROCEDURE
RELIABLE QUANTIFICATION
▸ INPUT: Bayesian network + equation
▸ Node Y with parents X to be quantified by the equation Y = f(X)
▸ For each hyper-interval compute lower/upper bounds of f
▸ Set a credal interval over the states touched by the bounds
▸ The resulting model is a credal network giving inferences
robust with respect to any possible imputation
▸ Challenge: dedicated algorithms for this special class of nets
INFERENCE IN CREDAL NETS IS NP-HARD (AS IN BAYESIAN NETS),
BUT WITH MORE LIMITATIONS (E.G., FAST IN POLYTREES ONLY IF BINARY)
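The quantification steps listed on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: f is assumed monotone in every argument, so its bounds over a hyper-interval are found at the corners, and the names `touched_states` and `credal_cpt` are hypothetical.

```python
import bisect
from itertools import product

def touched_states(y_edges, lo, hi):
    """Indices of output intervals [y_edges[k], y_edges[k+1]) that intersect [lo, hi]."""
    first = max(bisect.bisect_right(y_edges, lo) - 1, 0)
    last = min(bisect.bisect_right(y_edges, hi) - 1, len(y_edges) - 2)
    return list(range(first, last + 1))

def credal_cpt(f, x_edges_per_parent, y_edges):
    """Map each hyper-interval of the parents to the output states it can reach.

    Assumption: f is monotone in every argument, so its extrema over a
    hyper-interval are attained at the corners.
    """
    table = {}
    index_ranges = [range(len(e) - 1) for e in x_edges_per_parent]
    for idx in product(*index_ranges):
        corners = product(*[(e[i], e[i + 1]) for e, i in zip(x_edges_per_parent, idx)])
        values = [f(*c) for c in corners]
        # Interval credal set: P(state | cell) in [0, 1] for the touched states,
        # exactly 0 elsewhere. One touched state gives a sharp 0/1 column.
        table[idx] = touched_states(y_edges, min(values), max(values))
    return table

# Hypothetical one-parent example: y = x^2, two input cells, unit output cells.
table = credal_cpt(lambda x: x * x, [[0.5, 1.5, 2.5]],
                   [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
```

For a non-monotone f the corner evaluation would have to be replaced by a proper bound computation (for instance interval arithmetic or optimisation over the hyper-interval).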
18. A TOY EXAMPLE
DISCRETISING THE BODY MASS INDEX
BMI = W / H²
▸ We discretise W (ranges of 5 kg) and H (ranges of 5 cm)
P(i5 |h4, w7) ∈ [0,1]
P(i6 |h4, w7) ∈ [0,1]
P(ik |h4, w7) = 0 for k ≠ 5, 6
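A worked instance of such bounds, as a sketch: the slides do not say which intervals h4 and w7 denote, so the cell below (W in [85, 90] kg, H in [1.70, 1.75] m, the one containing Joe and Bill if intervals are aligned to multiples of 5) is an assumption. Since BMI increases with W and decreases with H, the bounds sit at opposite corners of the cell:

```latex
\begin{align*}
\underline{\mathrm{BMI}} &= \frac{W_{\min}}{H_{\max}^{2}} = \frac{85}{1.75^{2}} \approx 27.8, \\
\overline{\mathrm{BMI}} &= \frac{W_{\max}}{H_{\min}^{2}} = \frac{90}{1.70^{2}} \approx 31.1 .
\end{align*}
```

The interval [27.8, 31.1] straddles a class cut-off at 30 (again an assumed value), so two BMI states are touched: both get probability in [0, 1] and every other state gets probability 0, which is the pattern shown in the CPT above.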
19. THEORETICAL BACKGROUND
DISCRETISATION STRATEGIES
▸ So far, we assumed fixed discretisation of both X and Y
▸ Freedom of discretisation can reduce information loss
▸ Loss measure? Upper entropy (Abellán & Moral)
▸ Polynomial solution for fixed input variable discretisation
▸ Analogous to classical interval partitioning problem
▸ Proof? Trivial as the solution
is on the O(n) values
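For an interval credal set the upper entropy has a closed form: the maximum Shannon entropy over all mass functions supported on m touched states is log2(m), attained by the uniform one. A minimal sketch follows; the aggregation into a mean over input cells is a hypothetical choice for illustration, not the loss used by the authors.

```python
import math

def upper_entropy(num_touched_states):
    """Maximum Shannon entropy (bits) over all mass functions supported on
    the touched states; attained by the uniform mass function."""
    return math.log2(num_touched_states)

def information_loss(touched_per_cell):
    """Hypothetical aggregate loss: mean upper entropy over the input cells."""
    return sum(upper_entropy(len(s)) for s in touched_per_cell) / len(touched_per_cell)

# Touched output states for three input cells under two assumed output partitions.
coarse_output = [[0], [0, 1], [1]]
fine_output = [[0, 1], [1, 2, 3, 4], [4, 5]]

coarse_loss = information_loss(coarse_output)
fine_loss = information_loss(fine_output)
```

With the input partition fixed, refining the output partition makes each cell touch more states and raises this measure, which is why the choice of partition is an optimisation problem rather than a free parameter.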
21. SUMMARY
CONCLUSIONS
▸ Standard procedure to embed deterministic equations in
discrete Bayesian networks
▸ Our conservative approach converts the Bayesian network
into a credal network with interval-shaped credal sets
▸ The credal network returns inferences robust with respect
to any imputation
22. SUMMARY
OUTLOOKS
▸ Discretisation algorithms for input variables and for input/output
with penalties for under/over partitioning
▸ Prove NP-hardness of credal nets with interval credal sets in
the credal nodes and probabilities in the "Bayesian" ones
▸ Ad hoc inference algorithms for credal networks with
interval credal sets (hints from possibility theory?)