We demonstrate how to estimate the expected optimisation time of UMDA, an estimation of distribution algorithm, using the level-based theorem. The talk was given at the GECCO 2015 conference in Madrid, Spain.
Simplified Runtime Analysis of Estimation of Distribution Algorithms
1. Simplified Runtime Analysis of Estimation of Distribution Algorithms
Duc-Cuong Dang, Per Kristian Lehre
University of Nottingham
United Kingdom
Madrid, Spain
July 11-15, 2015
3. Runtime Analysis of EAs
Analysis of the time the EA requires to optimise a function f:
- expected number of fitness evaluations
- expressed asymptotically w.r.t. the instance size n
- dependence on characteristics of the problem and on the parameter settings of the EA
5. Previous Work

Problem             (1+1) EA           cGA¹
Onemax              Θ(n log n)         Θ(K√n) [Droste, 2006]                       EDA worse
Linear functions    Θ(n log n)         Ω(Kn) [Droste, 2006]                        EDA worse

Problem             (µ+1) EA           cGA²
Onemax + N(0, σ²)   superpoly(n) whp   O(Kσ²√n log(Kn)) [Friedrich et al., 2015]   EDA better

Problem             (1+1) EA           UMDA
LeadingOnes         Θ(n²)              O(λn), λ = ω(n²) [Chen et al., 2007]
BVLeadingOnes       Θ(n²)              ∞ w.o.p. (w/o margins) [Chen et al., 2010]
                                       O(λn), λ = ω(n²) [Chen et al., 2010]
SubString           2^Ω(n) w.o.p.      O(λn), λ = ω(n²) [Chen et al., 2009]        EDA better
LeadingOnes         Θ(n²)              O(nλ log λ + n²) [this paper]³
Onemax              Θ(n log n)         O(nλ log λ) [this paper]

¹ K = n^(1/2+ε)    ² K = ω(σ²√n log n)    ³ λ = Ω(log n)
6. Univariate Marginal Distribution Algorithm
1: Initialise the vector p_0 := (1/2, ..., 1/2).
2: for t = 0, 1, 2, ... do
3:   Sample λ bitstrings y_1, ..., y_λ according to the distribution
         p_t(x) = ∏_{i=1}^{n} p_t(i)^{x_i} (1 − p_t(i))^{1−x_i}
4:   Let y^(1), ..., y^(λ) be the bitstrings sorted by the fitness function f.
5:   Compute the next vector p_{t+1} according to
         p_{t+1}(i) := X_i / µ,  where X_i := ∑_{j=1}^{µ} y_i^(j)
6: end for
7. Univariate Marginal Distribution Algorithm
1: Initialise the vector p_0 := (1/2, ..., 1/2).
2: for t = 0, 1, 2, ... do
3:   Sample λ bitstrings y_1, ..., y_λ according to the distribution
         p_t(x) = ∏_{i=1}^{n} p_t(i)^{x_i} (1 − p_t(i))^{1−x_i}
4:   Let y^(1), ..., y^(λ) be the bitstrings sorted by the fitness function f.
5:   Compute the next vector p_{t+1} according to
         p_{t+1}(i) := 1/n         if X_i = 0
                       X_i / µ     if 1 ≤ X_i ≤ µ − 1
                       1 − 1/n     if X_i = µ,
         where X_i := ∑_{j=1}^{µ} y_i^(j).
6: end for
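As a concrete illustration, the pseudocode above (including the margins 1/n and 1 − 1/n) can be sketched in Python. The names `onemax`, `lam`, `mu`, and `max_gens`, and the parameter values in the usage note, are illustrative choices, not from the talk.

```python
import random

def onemax(x):
    """Number of one-bits in x (the fitness function used later in the talk)."""
    return sum(x)

def umda(f, n, lam, mu, max_gens, rng):
    """Sketch of UMDA with margins, following the pseudocode above.

    f: fitness function, n: bitstring length, lam: offspring per generation,
    mu: number of selected individuals, rng: a random.Random instance.
    Returns the best bitstring sampled over all generations.
    """
    p = [0.5] * n                                  # step 1: p0 := (1/2, ..., 1/2)
    best = None
    for _ in range(max_gens):                      # step 2
        # step 3: sample lam bitstrings from the product distribution p_t
        pop = [[1 if rng.random() < p[i] else 0 for i in range(n)]
               for _ in range(lam)]
        pop.sort(key=f, reverse=True)              # step 4: sort by fitness
        if best is None or f(pop[0]) > f(best):
            best = pop[0]
        for i in range(n):                         # step 5: update with margins
            x_i = sum(y[i] for y in pop[:mu])      # X_i over the mu fittest
            p[i] = min(1 - 1 / n, max(1 / n, x_i / mu))
    return best
```

For example, `onemax(umda(onemax, 16, 100, 50, 500, random.Random(1)))` should equal 16 for these (arbitrary) parameters, since the margins keep every bit reachable in each sample.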
12. Level-Based Theorem
[Figure: population P_t partitioned by levels A_j ⊆ A_{j+1} ⊆ ... ⊆ A_m; γ₀λ individuals of P_t lie in A_j and γλ in A_{j+1}; an offspring sampled from D(P_t) reaches A_{j+1} with probability ≥ γ(1 + δ), or ≥ z_j when A_{j+1} is empty.]
Theorem (Corus, Dang, Eremeev, Lehre (2014))
If for any level j < m and population P where
    |P ∩ A_j| ≥ γ₀λ > |P ∩ A_{j+1}| =: γλ,
an individual y ∼ D(P) is in A_{j+1} with
    Pr(y ∈ A_{j+1}) ≥ γ(1 + δ)  if γ > 0,
    Pr(y ∈ A_{j+1}) ≥ z_j       if γ = 0,
and the population size λ is at least
    λ = Ω(ln(m/(δ z_j)) / δ²),
then level A_m is reached in expected time
    O( (1/δ⁵) (m ln λ + ∑_{j=1}^{m} 1/(λ z_j)) ).
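To get a feel for the bound, the theorem's conclusion can be evaluated numerically once m, δ, λ, and the z_j are fixed. The helper below is an illustrative sketch only: the hidden constants in the Ω and O are set to 1, and the level structure in the usage note is a hypothetical choice, not taken from the talk.

```python
import math

def level_based_bound(m, delta, z, lam):
    """Evaluate (1/delta^5) * (m ln(lambda) + sum_j 1/(lambda * z_j)),
    the expression inside the O(...) of the level-based theorem, after
    checking the population-size condition (hidden constants set to 1).

    z is the list of upgrade probabilities z_1, ..., z_m.
    """
    required = math.log(m / (delta * min(z))) / delta ** 2
    if lam < required:
        raise ValueError(f"lambda = {lam} is below the required {required:.0f}")
    return (m * math.log(lam) + sum(1 / (lam * zj) for zj in z)) / delta ** 5
```

For instance, with m = 101 levels, z_j = 1/101 for every j, and δ = 0.1, the condition asks for λ on the order of a thousand, and `level_based_bound(101, 0.1, [1/101]*101, 2000)` returns a finite positive value.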
20. Proof idea (ignoring margins)
Recall the definition of UMDA.
Probability for the i-th position (assuming within margins):
    p_i := X_i / µ,  where X_i := ∑_{j=1}^{µ} y_i^(j)
21. Proof idea (ignoring margins)
Recall the definition of UMDA.
Probability for the i-th position (assuming within margins):
    p_i := X_i / µ,  where X_i := ∑_{j=1}^{µ} y_i^(j)
Definition of levels and a first observation.
Choosing levels x ∈ A_j ⟺ Onemax(x) ≥ j, we need to show
    |P ∩ A_j| ≥ γ₀λ > |P ∩ A_{j+1}| =: γλ    (1)
    ⟹ Pr(Y ∈ A_{j+1}) ≥ γ(1 + δ)             (2)
Note that assumption (1) with γ₀ := µ/λ implies
    ∑_{i=1}^{n} X_i ≥ µj + γλ.
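The last implication on this slide can be checked empirically: whenever condition (1) holds with γ₀ = µ/λ, the µ fittest individuals have total Onemax value at least µj + γλ, and this total equals ∑_i X_i. A small randomized check (all parameter values in the usage are arbitrary):

```python
import random

def check_observation(n, lam, mu, trials, rng):
    """For random populations, verify: if |P ∩ A_j| >= mu > |P ∩ A_{j+1}|,
    then the mu fittest individuals satisfy sum_i X_i >= mu*j + gamma*lam,
    where gamma*lam := |P ∩ A_{j+1}| and A_j := {x : Onemax(x) >= j}."""
    for _ in range(trials):
        pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(lam)]
        pop.sort(key=sum, reverse=True)            # fittest first
        total = sum(sum(y) for y in pop[:mu])      # = sum_i X_i
        for j in range(n):
            in_j = sum(1 for y in pop if sum(y) >= j)
            in_j1 = sum(1 for y in pop if sum(y) >= j + 1)
            if in_j >= mu > in_j1:                 # condition (1), gamma0 = mu/lam
                assert total >= mu * j + in_j1     # the claimed implication
    return True
```

For example, `check_observation(10, 20, 10, 50, random.Random(0))` passes every trial: all in_j1 individuals above level j+1 are among the µ selected, and the remaining µ − in_j1 selected individuals each contribute at least j.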
23. Proof idea (taking into account margins)
Pr(Y ∈ A_{j+1}) ≥ Pr(Y_{1,k} > γλ/µ + j − ℓ) · Pr(Y_{k+1,k+ℓ+1} = ℓ)
               ≥ Pr(Y_{1,k} > E[Y_{1,k}] − γλ/(12µ)) · (1 − 1/n)^ℓ
24. Feige’s Inequality
Theorem
Given n independent r.v. Y_1, ..., Y_n ∈ [0, 1], then for all δ > 0,
    Pr( ∑_{i=1}^{n} Y_i > ∑_{i=1}^{n} E[Y_i] − δ ) ≥ min( 1/13, δ/(1 + δ) )
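Feige's inequality can be sanity-checked by simulation. The sketch below estimates the left-hand probability for independent Bernoulli variables and compares it with the bound; the means, δ, and sample count in the usage note are arbitrary choices.

```python
import random

def feige_lower_bound(delta):
    """min(1/13, delta/(1+delta)), the right-hand side of the theorem."""
    return min(1 / 13, delta / (1 + delta))

def empirical_probability(means, delta, samples, rng):
    """Estimate Pr(sum_i Y_i > sum_i E[Y_i] - delta) for independent
    Bernoulli variables Y_i with the given means."""
    target = sum(means) - delta
    hits = 0
    for _ in range(samples):
        total = sum(1 for m in means if rng.random() < m)
        if total > target:
            hits += 1
    return hits / samples
```

For example, with 20 Bernoulli(0.1) variables and δ = 0.5 the bound is 1/13 ≈ 0.077, and the estimated probability comfortably exceeds it.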
27. Conclusion and Future Work
The recent level-based method seems well suited for EDAs
Straightforward runtime analysis of the UMDA
Trivial analysis of LeadingOnes; smaller populations suffice, i.e., O(ln n) vs ω(n²)
First upper bound on Onemax
How tight are the upper bounds?
o(n ln n) on Onemax?
Other problems and algorithms
linear functions
multi-variate EDAs
28. Thank you
The research leading to these results has received funding from the
European Union Seventh Framework Programme (FP7/2007-2013)
under grant agreement no. 618091 (SAGE).
29. References
Chen, T., Lehre, P. K., Tang, K., and Yao, X. (2009).
When is an estimation of distribution algorithm better than an evolutionary
algorithm?
In Proceedings of the 10th IEEE Congress on Evolutionary Computation
(CEC 2009), pages 1470–1477. IEEE.
Chen, T., Tang, K., Chen, G., and Yao, X. (2007).
On the analysis of average time complexity of estimation of distribution
algorithms.
In Proceedings of 2007 IEEE Congress on Evolutionary Computation (CEC’07),
pages 453–460.
Chen, T., Tang, K., Chen, G., and Yao, X. (2010).
Analysis of computational time of simple estimation of distribution algorithms.
IEEE Trans. Evolutionary Computation, 14(1):1–22.
Droste, S. (2006).
A rigorous analysis of the compact genetic algorithm for linear functions.
Natural Computing, 5(3):257–283.
Friedrich, T., Kötzing, T., Krejca, M. S., and Sutton, A. M. (2015).
The benefit of sex in noisy evolutionary search.
CoRR, abs/1502.02793.