Simplified Runtime Analysis of Estimation of Distribution Algorithms

Simpliﬁed Runtime Analysis of
Estimation of Distribution Algorithms
Duc-Cuong Dang Per Kristian Lehre
University of Nottingham
United Kingdom
Madrid, Spain
July, 11-15th 2015

Outline
Background
Previous Runtime Analyses of EDAs
Univariate Marginal Distribution Algorithm (UMDA)
Our Tool - Level-based Analysis
Non-elitist Processes
Level-based Theorem
Our Results
Warm up: LeadingOnes
Onemax and Feige’s Inequality
Conclusion

Runtime Analysis of EAs
Analysis of the time the EA requires to optimise a function f:
expected number of ﬁtness evaluations
expressed asymptotically wrt instance size n
dependence on
characteristics of the problem
parameter settings of the EA

Previous Work
(1+1) EA cGA1
Onemax Θ(n log n) Θ(K
√
n) [Droste, 2006] EDA worse
Linear functions Θ(n log n) Ω(Kn) [Droste, 2006] EDA worse
(µ+1) EA cGA2
Onemax + N(0, σ2) superpoly(n) whp O(Kσ2√
n log Kn) [Friedrich et al., 2015] EDA better
(1+1) EA UMDA
LeadingOnes Θ(n2) O(λn), λ = ω(n2) [Chen et al., 2007]
BVLeadingOnes Θ(n2) ∞ w.o.p. [Chen et al., 2010] w/o margins
O(λn), λ = ω(n2) [Chen et al., 2010]
SubString 2Ω(n) w.o.p. O(λn), λ = ω(n2) [Chen et al., 2009] EDA better
1
K = n1/2+ε
2
K = ω(σ2√
n log n)
3
λ = Ω(log n)

Previous Work
(1+1) EA cGA1
Onemax Θ(n log n) Θ(K
√
n) [Droste, 2006] EDA worse
Linear functions Θ(n log n) Ω(Kn) [Droste, 2006] EDA worse
(µ+1) EA cGA2
Onemax + N(0, σ2) superpoly(n) whp O(Kσ2√
n log Kn) [Friedrich et al., 2015] EDA better
(1+1) EA UMDA
LeadingOnes Θ(n2) O(λn), λ = ω(n2) [Chen et al., 2007]
BVLeadingOnes Θ(n2) ∞ w.o.p. [Chen et al., 2010] w/o margins
O(λn), λ = ω(n2) [Chen et al., 2010]
SubString 2Ω(n) w.o.p. O(λn), λ = ω(n2) [Chen et al., 2009] EDA better
LeadingOnes Θ(n2) O(nλ log λ + n2) this paper3
Onemax Θ(n log n) O(nλ log λ) this paper
1
K = n1/2+ε
2
K = ω(σ2√
n log n)
3
λ = Ω(log n)

Univariate Marginal Distribution Algorithm
1: Initialise the vector p0 := (1/2, . . . , 1/2).
2: for t = 0, 1, 2, . . . do
3: Sample λ bitstrings y1, . . . , yλ according to the distribution
pt(x) =
n
i=1
pt(i)xi
(1 − pt(i))1−xi
4: Let y(1), . . . , y(λ) be the bitstrings sorted by ﬁtness func. f
5: Compute the next vector pt+1 according to
pt+1(i) :=
Xi
µ
where Xi :=
µ
j=1
y
(j)
i
6: end for

Univariate Marginal Distribution Algorithm
1: Initialise the vector p0 := (1/2, . . . , 1/2).
2: for t = 0, 1, 2, . . . do
3: Sample λ bitstrings y1, . . . , yλ according to the distribution
pt(x) =
n
i=1
pt(i)xi
(1 − pt(i))1−xi
4: Let y(1), . . . , y(λ) be the bitstrings sorted by ﬁtness func. f
5: Compute the next vector pt+1 according to
pt+1(i) :=



1
n if Xi = 0
Xi
µ if 1 ≤ Xi ≤ µ − 1
1 − 1
n if Xi = µ,
where Xi := µ
j=1 y
(j)
i .
6: end for

UMDA on Onemax (n = 5000)
0 10 20 30 40 50
0.00.20.40.60.81.0
iteration

Non-elitist populations
X
Pt+1 = (y1
, y2
, . . . , yλ
)
where yi
∼ D(Pt)

Non-elitist populations
X
Pt+1 = (y1
, y2
, . . . , yλ
)
where yi
∼ D(Pt)
Example (UMDA)
D(P)(x) :=
n
i=1
pxi
i (1 − pi)1−xi

Level-Based Theorem
D(Pt)
AmA1 Aj Aj+1 · · ·
Pt
γ0λ γλ
X = A1 ⊃ A2 ⊃ · · · ⊃ Am−1 ⊃ Am = A

Level-Based Theorem
≥ γ(1 + δ)
≥ zj
D(Pt)
A+
mA+
1
A+
j A+
j+1 · · ·
Pt
γ0λ γλ
Theorem (Corus, Dang, Ereemeev, Lehre (2014))
If for any level j < m and population P where
|P ∩ Aj| ≥ γ0λ > |P ∩ Aj+1| =: γλ
an individual y ∼ D(P) is in Aj+1 with
Pr (y ∈ Aj+1) ≥
γ(1 + δ) if γ > 0
zj if γ = 0
and the population size λ is at least
λ = Ω ln(m/(δzj))/δ2
then level Am is reached in expected time
O

 1
δ5

m ln λ +
m
j=1
1
λzj



 .

LeadingOnes
LeadingOnes(x) :=
n
i=1
i
j=1
xi

LeadingOnes
LeadingOnes(x) :=
n
i=1
i
j=1
xi
Theorem
The expected optimisation time of UMDA with
λ ≥ b ln(n) for some constant b > 0,
λ > (1 + δ)eµ
on LeadingOnes is
O(nλ ln(λ) + n2
).

Proof idea
Level deﬁnition x ∈ Aj ⇐⇒ LeadingOnes(x) ≥ j
If |P ∩ Aj| ≥ γ0λ > |P ∩ Aj+1| =: γλ > 0
then Pr(y ∈ Aj+1) ≥ γ(1+δ)

Proof idea
11111111111111111111********
11111111111111111111********
11111111111111111111********
11111111111111111110********
11111111111111111110********
11111111111111111110********
11111111111111111110********
****************************
****************************
****************************
If |P ∩ Aj| ≥ γ0λ > |P ∩ Aj+1| =: γλ > 0
then Pr(y ∈ Aj+1) ≥ γ(1+δ)

Proof idea
11111111111111111111********
11111111111111111111********
11111111111111111111********
11111111111111111110********
11111111111111111110********
11111111111111111110********
11111111111111111110********
****************************
****************************
****************************
If |P ∩ Aj| ≥ γ0λ > |P ∩ Aj+1| =: γλ > 0
then Pr(y ∈ Aj+1) = j+1
i=1 pi ≥ 1 − 1
n
j γλ
µ ≥ γλ
eµ ≥ γ(1+δ)

Onemax
Onemax(x) =
n
i=1
xi
Theorem
The expected optimisation time of UMDA with
λ ≥ b ln(n) for some constant b > 0,
µ < min{λ/(13e), n}
on Onemax is
O(nλ ln λ).

Proof idea (ignoring margins)
Recall deﬁnition of UMDA
Probability for i-th position (assuming within margins)
pi :=
Xi
µ
where Xi :=
µ
j=1
y
(j)
i

Proof idea (ignoring margins)
Recall definition of UMDA
Probability for i-th position (assuming within margins)
pi :=
Xi
µ
where Xi :=
µ
j=1
y
(j)
i
Definition of levels and a first observation
Choosing levels x ∈ Aj ⇐⇒ Onemax(x) ≥ j, need to show
|P ∩ Aj| ≥ γ0λ > |P ∩ Aj+1| =: γλ (1)
=⇒ Pr (Y ∈ Aj+1) ≥ γ(1 + δ) (2)
Note that assumption (1) with γ0 := µ/λ implies
n
i=1
Xi ≥ µj + γλ

Proof idea (taking into account margins)

Proof idea (taking into account margins)
Pr (Y ∈ Aj+1) ≥ Pr Y1,k >
γλ
µ
+ j − · Pr (Yk+1,k+ +1 = )
≥ Pr Y1,k > E [Y1,k] −
γλ
12µ
· 1 −
1
n

Feige’s Inequality
i E [Yi]
Theorem
Given n independent r.v. Y1, . . . , Yn ∈ [0, 1], then for all δ > 0
Pr
n
i=1
Yi >
n
i=1
E [Yi] − δ ≥ min
1
13
,
δ
1 + δ

Proof idea
Pr (Y ∈ Aj+1) ≥ Pr Y1,k >
γλ
µ
+ j − · Pr (Yk+1,k+ +1 = µ )
≥ Pr Y1,k > E [Y1,k] −
γλ
12µ
· 1 −
1
n
≥ min
1
13
,
γλ
12µ
γλ
12µ + 1
·
1
e
≥
γλ
13eµ

Proof idea
Pr (Y ∈ Aj+1) ≥ Pr Y1,k >
γλ
µ
+ j − · Pr (Yk+1,k+ +1 = µ )
≥ Pr Y1,k > E [Y1,k] −
γλ
12µ
· 1 −
1
n
≥ min
1
13
,
γλ
12µ
γλ
12µ + 1
·
1
e
≥
γλ
13eµ
≥ γ(1 + δ) if λ ≥ 13e(1 + δ)µ

Conclusion and Future Work
The recent level-based method seems well suited for EDAs
Straightforward runtime analysis of the UMDA
Trivial analysis of LeadingOnes,
smaller populations suﬃce, i.e., O(ln n) vs ω(n2
)
First upper bound on Onemax
How tight are the upper bounds?
o(n ln n) on Onemax?
Other problems and algorithms
linear functions
multi-variate EDAs

Thank you
The research leading to these results has received funding from the
European Union Seventh Framework Programme (FP7/2007-2013)
under grant agreement no. 618091 (SAGE).

References
Chen, T., Lehre, P. K., Tang, K., and Yao, X. (2009).
When is an estimation of distribution algorithm better than an evolutionary
algorithm?
In Proceedings of the 10th IEEE Congress on Evolutionary Computation
(CEC 2009), pages 1470–1477. IEEE.
Chen, T., Tang, K., Chen, G., and Yao, X. (2007).
On the analysis of average time complexity of estimation of distribution
algorithms.
In Proceedings of 2007 IEEE Congress on Evolutionary Computation (CEC’07),
pages 453–460.
Chen, T., Tang, K., Chen, G., and Yao, X. (2010).
Analysis of computational time of simple estimation of distribution algorithms.
IEEE Trans. Evolutionary Computation, 14(1):1–22.
Droste, S. (2006).
A rigorous analysis of the compact genetic algorithm for linear functions.
Natural Computing, 5(3):257–283.
Friedrich, T., K¨otzing, T., Krejca, M. S., and Sutton, A. M. (2015).
The beneﬁt of sex in noisy evolutionary search.
CoRR, abs/1502.02793.

Simplified Runtime Analysis of Estimation of Distribution Algorithms

Recommended

Recommended

More Related Content

What's hot

What's hot (19)

Viewers also liked

Viewers also liked (17)

Similar to Simplified Runtime Analysis of Estimation of Distribution Algorithms

Similar to Simplified Runtime Analysis of Estimation of Distribution Algorithms (20)

Recently uploaded

Recently uploaded (20)

Simplified Runtime Analysis of Estimation of Distribution Algorithms