SlideShare a Scribd company logo
1 of 12
Download to read offline
ELSEVIER European Journal of Operational Research 81 (1995)617-628
EUROPEAN
JOURNAL
OF OPERATIONAL
RESEARCH
Theory and Methodology
A parallel depth first search branch and bound algorithm
for the quadratic assignment problem
Bernard Mans a,c, Thierry Mautor a,b,, and Catherine Roucairol a,b
aINRIA, Domaine de Voluceau, Rocquencourt BP 105, F-78153 Le Chesnay Cedex, France
b MASI-UniversitE de Versailles 45, Avenue des Etats-Unis, 78000 Versailles, France
c Carleton University, School of Computer Science, Ottawa K1S 5B6, Canada
Received January 1993;revised March 1993
Abstract
We propose a new parallel Branch and Bound algorithm for the Quadratic Assignment Problem, which is a
Combinatorial Optimization problem known to be very hard to solve exactly. An original method to distribute work
to processors using the notion of Feeding Tree is presented. When adequately used, it allows to reduce memory
contention and load unbalance. Therefore, a linear speed-up in the number of processors is reached on a shared
memory multiprocessor, the Cray 2 and the optimality of solutions for famous problems of size less than 20 (Nugent
16, Elshafei 19, Scriabin-Vergin 20,...) is proved by this program. The implementation analysis shows that these
results are more than an improvement due to hardware evolution and confirms the usefulness of our parallel Branch
and Bound algorithm for larger Quadratic Assignment Problems.
Keywords: Branch and bound; Combinatorial optimization; Parallel algorithm; Quadratic assignment problem
1. Introduction
The Quadratic Assignment Problem (QAP)is a combinatorial optimization problem introduced by
Koopmans and Beckman, in 1957, [12]. It has numerous and various applications, such as location
problems, VLSI design [18], architecture design [9]....
The objective is to assign n units to n sites in order to minimize the quadratic cost of this assignment,
which depends both on the distances between the sites and on the flows between the units. It can be
formulated as follows.
Given two (n × n) matrices,
F = (fij) where f/j is the flow between units i and j,
D = (dkt) where dkt is the distance between sites k and l,
* Corresponding author.
0377-2217/95/$09.50 © 1995Elsevier Science B.V. All rights reserved
SSDI 0377-2217(93)E0334-T
618 B. Martset al./ EuropeanJournalof OperationalResearch81 (1995)617-628
find a permutation p of the set N = {1, 2 .... , n} which minimizes the global cost function:
Cost(p) : ~ ~fijdp(i)p(j).
i=l j=l
2. Parallel solution methods
The QAP has shown itself to be computationally a very difficult problem. This problem, of which the
Travelling Salesman problem is a specific case, is NP-hard [13]. Moreover, finding an e-approximate
solution is also NP-hard [22].
But its theoretical complexity is not a sufficient description of the extreme difficulty of this problem,
for which even applications of moderate size (n -- 20) cannot, up to now, be solved exactly.
Therefore, many heuristics have been developed in the past thirty years. More recently, due to the
development of parallel architectures, many parallel algorithms have been fruitfully implemented to
speed up the search and, thus, overcome the difficulty of the problem.
2.1. Parallel exact methods
The best way to solve exactly a quadratic assignment problem is, to date, to use a Branch and Bound
algorithm, as other methods such as cutting planes methods have not been successful, failing to solve
exactly problems of size greater than eight.
As a consequence, Branch and Bound algorithms have been the only exact parallel solution methods
suggested for this problem. They are due to Roucairol, in 1987, [21] and to Crouse and Pardalos, in 1989,
[8]. We will describe later the main characteristics of these algorithms (see Section 4.2). Nevertheless,
these parallel Branch and Bound algorithms are also limited to the solution of small size problems
(size < 15).
This strong limitation explains why most recent approaches to this problem have been heuristics.
2.2. Parallel heuristics
Among sequential solution methods, the adjustments of the recent meta-heuritics (Simulated Anneal-
ing, Tabu Search and Genetic Algorithms) to the QAP provide the best approximate results, outstripping
the results of the first heuristic solution methods (construction methods, exchange methods).
Thus, logically, implementations of parallel heuristics are issued from these meta-heuristics.
In 1989, Brown, Huntley and Spillane [3], on a 32-nodes Intel iPCS/2 hypercube, and Miihlenbein
[17], on a 64 processors system with distributed memory, have developed parallel genetic algorithms.
In 1991, Taillard [24], on a network of 10 transputers T800C, has proposed a parallel Tabu search,
while Chakrapani and Skorin-Kapov [7] have developed a massively parallel Tabu search on a Connec-
tion Machine using 16K processing units.
These algorithms always find the optimal solution for problems of small sizes from the literature.
Since the optimality cannot be proved for higher instances, it is hard to estimate the quality of the
results. Nevertheless, for special instances built with the knowledge of the optimum (Palubetskes [19],
Burkard et al. [5]), the results of these heuristics are very close to optimal.
Let us also mention a connectionist approach, developed by Wang, in 1990, [25] on a microVax
computer, which simulates n 2 processing units performing nonlinear transformations.
B. Manset aL/ EuropeanJournalof OperationalResearch81 (1995)617-628 619
3. Sequential Branch and Bound
Let us briefly recall the main principles of the sequential algorithm introduced by Mautor and
Roucairol, in 1992, [16,15], which currently provides the best sequential results for the quadratic
assignment problem.
Meanwhile, we discuss important features, such as lower bounds and branching strategies, concerning
any Branch and Bound algorithm.
3.1. Bounding
Undoubtedly, the computation of the lower bound is one of the major difficulties in the exact solution
of the quadratic assignment problem.
Indeed, up to the present time, this bound is either too loose, the number of nodes of the search tree
becoming quickly huge as the size of the problem increases, or, when the bound is slightly tighter, the
time needed to compute the bound on one node is prohibitive.
The oldest lower bound, developed independantly in 1962 by Gilmore [11] and Lawler [13], is obtained
by solving a linear assignment problem on a (n x n) matrix C = (Cig), where Cig is the lower bound of
the assignment of facility i to site k and is given by the computation of the ranked product of the row fi.
of F and the colunm d.k of D.
This bound is, therefore, quickly computed in O(n 3) but its results are not very tight. For instance, the
Gilmore-Lawler bound is more than 20% away from the best solution for Nugent's problems of size
greater than 20.
The most interesting other lower bounds are based on an eigenvalue approach (Finke, Burkard and
Rendl [10], Rendl and Wolkowicz [20]) or on equivalent dual formulations of the problem (Assad and Xu
[2], Carraresi and MaluceUi [6]). If they provide a slightly better evaluation, it is at the cost of a very
significant increase in computational time. The best ratio quality/time is, therefore, still achieved by the
Gilmore-Lawler bound.
For this reason, the most successful Branch and Bound algorithms (Burkard and Derigs [4], Roucairol
[21], Crouse and Pardalos [8]) use the Gilmore-Lawler bound. Mautor and Roucairol use this bound and
concentrate their effort on more efficient approaches to reduce the enumeration.
3.2. Reduction tests
Symmetric equivalences
In most classical applications of QAP, the sites are located on a regular figure - grid, circle or line -
on which several symmetric or isometric equivalences can be detected.
We want to avoid creating, visiting and bounding these different but equivalent nodes in different
branches of the search tree.
For this purpose, a simple test which is quickly computed, has been introduced by Mautor and
Roucairol [16]. For any partial solution, this test identifies the different isometric classes. Thus, a unit
can be assigned to only one site of each isometric class.
Reduction test using the search gap
This classical test forbids some assignments and, therefore, reduces the size of the B&B tree.
Moreover, it allows to choose an efficient branching for the next level (see Section 3.3).
620 B. Mans et al. / European Journal of Operational Research 81 (1995) 617-628
Let us denote:
lb: the value of the lower bound obtained,
bks: the value of the best known solution (upper bound).
Test. If the alternative cost of the assignment of an unit to a site is greater or equal than the search gap
(difference between the value of the best known solution and the lower bound: bks- lb), then this
assignment can be forbidden (the proof can be found in [16]).
3.3. Branching strategies
Depth first search strategy
As mentioned above, exact methods can only deal with problems of small size, for which efficient and
recent heuristics, such as Tabu search, almost always find the optimal solution or, at least, a solution
extremely close to optimal.
As a consequence, since only the nodes of the critical tree (evaluation lower than the best solution)
have to be examined, a depth first search strategy is more efficient than a best first search strategy. This
approach has two advantages. First, the data structures are lighter. Second, we avoid incessant sortings of
partial solutions and reconstructions of independant subproblems (our experiments have shown that
building the matrices at each exploration of a B&B node takes at least 40% of the global computation
time).
Polytomic branching
The branching scheme is polytomic, that is a selected unit is assigned to all the free (not already
assigned), not forbidden (the alternative cost is in the search gap) and isometrically different sites. This
selected unit is the one with the lowest number of sons to be examined (highest number of forbidden
elements).
This scheme has several advantages. First, with respect to a dichotomic branching, fewer nodes are
created.
Second, the memory requirement is reduced by avoiding the unnecessary information on revoked
assignments generated by classical dichotomic branching schemes, where an unit is assigned or not to a
site.
Moreover, this scheme discriminates between the lower bounds of a node and its sons and allows early
pruning of some branches. Our experiments have shown that this strategy produces a significant decrease
in the size of the search tree.
3.4. Size of the search tree
We can see in Table 1 that the reduction tests and the branching strategy, described above, lead to a
drastic decrease in the total number of created nodes compared to the previous most efficient exact
algorithms (Burkard and Derigs, Roucairol, Crouse and Pardalos).
Table 1
Total number of created nodes
Algorithm Nugent 8 Nugent 12 Nugent 15 Elshafei 19
Burkard-Derigs 403 36 966 2 064415 not proved
Roucairol 428 83 379 not proved not proved
Crouse-Pardalos 798 42 706 1596 353 not proved
Mautor-Roucairol 32 3 474 97 287 491
B. Manset al./ EuropeanJournalof OperationalResearch81 (1995)617-628 621
4. Parallel Branch and Bound
411. Main principles
We have parallelized the Mautor-Roucairol sequential algorithm on an asynchronous shared memory
multiprocessor, the Cray 2. Our main concern has been to minimize the waiting time (inactivity) of the
processors and, thus, to obtain the highest speed-up. Thus, our approach considers the major difficulties
appearing in parallelizing B&B methods [14]: task allocation, choice of granularity, overhead
detection ....
As previously argued, we use a depth first search strategy. In a parallel implementation, an additional
reason appears, since the granularity (relative number of operations done between synchronizations)
defined by a parallel best first search strategy would correspond only to the exploration of a B&B node,
which is quickly done. Therefore, the partial sort of the newly generated nodes, at each branching step,
in a global shared list would create many contentions on the memory access, and hence a bottleneck for
processors.
For these reasons, we propose that each processor should execute the same depth first search
algorithm on different B & B subtrees. Each processor is, therefore, assigned to the root of a subtree and it
develops a local depth first search on the corresponding subtree.
When a processor has completed its own exploration ofa subtree, it accesses a shared data structure,
the feeding tree, in which it takes a new node, the root of a new subtree to develop. Since this task
allocation cannot be done statically, processors schedule themselves (self-scheduling technique) and
trigger their own work demand. The consistency and fairness of this shared list of tasks is ensured by
synchronizing with classical primitives as locks.
The procedure stops when the feeding tree is empty and all processors are idle.
The main features of our algorithm are illustrated on the Nugent et al. problem of size 15. Indeed, for
this problem, the size of the tree and the computational time are significant enough for the implementa-
tion choices to be carefully analyzed. It should be noted that we have obtained similar results with other
problems, in that the speed-up has been linear in all cases.
4.2. Task allocation
Different approaches can be adopted for the task allocation. In Roucairol's parallel algorithm [21], a
global heap is used to memorize the generated B&B nodes. Since a best first search strategy is used,
each exploration of a B & B node requires that the shared data structure be accessed once to obtain the
required node and be accessed several times to insert the newly generated nodes. In order to limit this
global access, Crouse and Pardalos [8] initially create several heaps in the global memory, so that each
processor can select one and explore it completely locally. In this case, the parallel B&B execution
terminates when all heaps have been explored and all processors are idle.
Since our approach is quite different, we describe the way we distribute the work amongst the
processors, through the notion of the feeding tree.
z
The feeding tree
Definition 4.1 The Feeding Tree, shared data structure, is the upper part of the B&B tree developed
down to depth (or level) i. The leaves of the Feeding Tree (nodes of level i of the B&B tree) are the
roots of the subtrees allocated to the processors (see Fig. 1).
622 13.Marts et al./European Journal of Operational Research 81 (1995) 617-628
Fig. 1. Feeding Tree and allocated subtrees.
The first free processor initializes the left part of the Feeding Tree until it generates, by successive
branchings, the leftmost node at the chosen depth i. Then, this processor unlocks the shared structure
and begins its own exploration on the subtree, whose the node is the root. While the exploration of the
allocated subtree is not completed, the processor does not need to access the global shared structure.
Gradually, the other processors access the feeding tree, in a mutually exclusive way, and develop it
until a new node of depth i - the next depth-i-node to the right of the last allocated node - with
eventual back-trackings in the feeding tree. Likewise, as soon as a processor becomes idle, the
exploration of its last subtree being completed, it tries to access again the feeding tree to build a new
"depth-i-node".
When the development of the feeding tree is completed (at the first level, it is not possible to branch
on a new partitioned subproblem), each processor that becomes idle terminates (the main program
terminates when the last processor becomes idle).
Since an assignment is fixed at each branching step, the maximal depth of the B&B tree is n - 1,
where n is the size of the QAP problem. The maximal depth of allocated subtrees is, therefore, equal to
(n - i - 1).
Due to the depth first search strategy, in the feeding tree, only the nodes on the path from the root to
the last allocated node (the facilities assigned to reach this depth and the set of the remaining available
locations) have to be memorized. Similarly, the context memorized by a processor is the path between
the current node and its ascendant of depth i.
Thus, the memory requirement for this parallel implementation is slightly increased with respect to
the sequential implementation. Indeed, this increase is only in O(p) ((i +p(n - i - 1)/(n - 1)), where p
is the number of processors.
Algorithm
procedure Parallel B&B
begin
/* initialize Feeding Tree */
Nprocsldle := 0
Create fid, the first leftmost descendant of root with depth i
for each of the Nprocs processors do/* self-schedule */
lock (Feeding Tree)
Create rfid right sibling of rid
B. Manset al.~EuropeanJournalof OperationalResearch81 (1995)617-628 623
if rfid ~ null then
Keep context of rfid
rid := rfid
unlock (Feeding Tree)
else /* no more tasks in this branch */
Backtrack until creating a right sibling ancfid
if ancfid ¢ null then
Create rf/d, the first leftmost descendant of ancfid with-depth i
Keep context of rfid
..= 'rid
unlock (Feeding Tree)
else / * no more tasks to assign */
Nprocsldle .'= Nprocsldle + 1
unlock (Feeding Tree)
Terminate
endif
endif
/* Depth First Search exploration */
Expand the whole B&B subtree
if a new best solution nbs is found then
lock (bks)
if (nbs < bks) bks := nbs
unlock (bks)
endif
endo / * Nprocsldle = Nprocs */
end Parallel B&B
4.3. Granularity - memory contentions
As previously mentioned, we want to minimize the waiting time of the processors. Now, a processor
has to wait in two cases:
• when the processor needs to access a global shared structure, locked by another processor (memory
contention),
• during the termination phase, when the processor is idle and has to wait for other processors to
complete their last local exploration (termination wait).
Let us discuss these two cases in more detail.
Memory contention
The overhead problem, fully dependant on the parallel implementation, is decisive on the Cray 2. A
lock operation requires 200 CPU cycles when the lock is free, while the conflict management requires
4000 CPU cycles.
Since the possible updatings of the best known solution are extremely rare and quickly done, only the
dynamic development of the feeding tree can take so much time to induce a significant waiting time for
other processors.
But the total waiting time of processors is a function of level i of the feeding tree.
The deeper this level is, the smaller the average granularity of the tasks allocated to the processors
will be. The processors will then access the feeding tree more frequently. Moreover, the average time for
624 B. Mans et al. / European Journal of Operational Research 81 (1995) 617-628
0.2
0. I
............
o ......... ......... ......... ! .................
,
shared level
Fig. 2. Conflictratio depending on the level(Nugent 15).
a processor to develop the feeding tree and generate a new "depth-i-node" will increase, due to
numerous backtrackings.
This phenomenon is shown in Fig. 2, where the access conflict ratio (percentage of access require-
ments when the feeding tree is locked) is given, with respect to the shared level,
For example, when the depth of the feeding tree is equal to 5, nearly 30% of the access requirements
result in waiting time.
Termination wait
This delay occurs when a processor without work (no more task) waits for the completion of the
subtree(s) allocated to other processor(s). The maximal idle time is the time needed to explore the
largest subtree which can be assigned to a processor.
When the level i of the feeding tree decreases, the average granularity of the tasks (subtrees)
allocated to the processors will increase. Therefore, the probability of a long termination wait becomes
more and more significant.
This is illustrated in Fig. 3, where the difference in the numbers of processed nodes (generated and
explored) by the most and the least active processors is given. The effect of the chosen depth i on the
0.4--
0.3
0.2
O. 1
I nI n- n_
0 . 0 . . . . . . . . . I . . . . . . . . . . . . . . . . . . I . . . . . . . . . I . . . . . . . . . I . . . . . . . . . I
O 1 2 3 4 5 6
shared level
Fig. 3. Difference in the numbers of processed nodes with respect to the shared level.
[:' explored
I generated
B. Manset al./ EuropeanJournalof OperationalResearch81 (1995)617"628 625
load balancing is confirmed, since, with a small feeding tree(level 1 or 2), this difference can reach 30%
(a processor has explored 5000 subproblems more than another) while a deeper feeding tree produces a
very good balance of the load.
5. Experiments
Our algorithm, written in Fortran, was run on the Cray 2 which is a 4-processor asynchronous machine
with shared memory.
The test data used for the computational results is as follows:
• Nugent et al. [18]: classical problems for QAP of sizes n = 12, n = 15, and n = 16 (the flow matrix is
extracted from the problem of size 20 and the sites are located on a 4 × 4 square),
• Elshafei: hospital layout problem of size 19 from [9],
• Scriabin and Vergin: economic layout problem of size 20 [23], from Armour and Buffa [1].
5.1. Computational results
We compared the results obtained by our method with those obtained by one of the fastest sequential
algorithms (Burkard and Derigs [4]) and by the two previously most efficient parallel algorithms available
(Crouse and Pardalos [8], and Roucairol [21]). Since Burkard and Derigs' algorithm has been tested in
1980 on a Cyber 76 and because of the evolution in computational performances, we ran their sequential
algorithm on the Cray 2. We also report on two different sets of results from Crouse and Pardalos. The
first is obtained from a sequential run of this algorithm and the second from running the algorithm in
parallel on a four-processor machine.
In Table 2, we present a comparison of the running times of these algorithms.
Some tests done on a Cray YMP are also presente d.
Remark. Since our dedicated use of the Cray 2 (i.e. a single execution on the machine) was limited to 400
seconds, cumulated for all used processors, parallel executions of Nugent of size 16, and of Scriabin and
Vergin of size 20 could not be done with a reasonably busy machine. The amount of time needed for
parallel executions of these problems depends on the number of users working on the machine. Thus, in
these two cases,-we could not reach the predictable linear speedup shown for Nugent et al. of size 15.
Table 2
Comparison of running times (in seconds)..
Algorithm Machine No. Nugent Nugent Nugent Elshafei Scr.-Ver.
procs size 12 size 15 size 16 size 19 size 20
Optimal
solution 578 1150 1550 17212548 110030
Burk. Der. Cray 2 1 24 1290 not proved not proved not proved
Roucairol Cray XMP 4 312 out of time not proved not proved not proved
Cr. Pard. IBM 3090 1 34 2005 not proved not proved not proved
Cr. Pard. IBM 3090 4 10 out of space not proved not proved not proved
Mans et al. Cray 2 1 2.68 109 969 1.04 1189
Mans et al. Cray 2 4 0.99 28 436 0.68 560
Mans et al. Cray YMP 1 not tested 62 not tested not tested not tested
Mans et al. Cray YMP 6 not tested 11 not tested not tested not tested
626 B. Mans et aL/European Journalof OperationalResearch81 (1995) 617-628
38 i i i i i i i
37
36
35
34
Time 3
3
in sec
32
31, "~
3O
29
28 I
2 3 4
sharedlevel
Fig. 4. Running time obtained on Nugent 15 with respect to the level i.
Among the classical quadratic assignment problems from the literature, the problem of Nugent of size
15 was considered to be the hardest to solve by an exact method. Only the most efficient Branch and
Bound algorithms managed to solve this problem and they needed a long computational time (more than
20 minutes) to compute the solution.
Our parallel algorithm manages to solve this problem in a few seconds (11 seconds on the Cray YMP,
28 seconds on the Cray 2).
This result is a good illustration of the efficiency of this parallel algorithm, which, on all the classical
problems, obtains the fastest running time.
Moreover, this algorithm obtains the best solution and the proof of the optimality of this solution for
problems of size 16 to 20 which have never been solved exactly before: Nugent of size 16, Scriabin and
Vergin of size 20, Elshafei of size 19.
We have also run our algorithm to solve randomly generated problems with previously known optimal
solutions, (Palubetskes [19], Burkard and al. [5]). Problems of size 18 were solved sequentially in less than
6 minutes, while they were solved in parallel with a linear speedup.
5.2. Parallel analysis
In order to analyze the effects of the granularity and the sizes of the allocated subtrees, we considered
the level /-depth of the feeding tree-as a parameter and ran our program with different values of i.
1101 , , i ~ , '
i00 
9
0 level3
"~ level 4 -b--
Time 70
in see. 60
50
40
30
2
0 i J t I l
2 3 4
Numberof Processors
Fig. 5. Running time (Nugent 15) with respect to the number of processors.
B. Mans et al. / European Journal of Operational Research 81 (1995) 617-628 627
4
3.5
3
Speed 2.5
up
2
1.5
t i i i i
level 3 ~ ~ ' ~
lord 4 +- / / J
2 3
Number of Processors
Fig. 6. Speed up on Nugent 15.
The tests have been done on the Nugent 15 problem, while our program was the only one executed by
the machine.
As previously seen, the level i must not be set to extreme values, neither too high due to memory
contentions (see figure 2), nor too low for a good balance of the load (see figure 3).
This influence of the level i is also shown in Fig. 4, where the running times obtained with different
depths of the feeding tree are presented.
On the Nugent 15 problem, the best compromise is achieved when the depth is equal to 3 (or 4).
Of course, the value of the level i leading to the best behaviour of the program depends on the
problem and on the structure of the corresponding search tree. Nevertheless, all the experiments we
have done show that a middle-valued depth of the feeding tree, equal to [n/5] or [n/4] (where n is the
size of the problem), leads to very good performances.
Therefore, as it can be seen in Fig. 5, with an appropriate choice of i, we obtain a significant decrease
in the running time, as the number of processors increases.
A linear speed-up in the number of processors is, thus, achieved (see Fig. 6). We emphasize the fact
that the comparisons of the parallel running times are done with the running time of the best sequential
implementation of the algorithm.
6. Conclusions
We have presented an original and very efficient parallel algorithm for solving the Quadratic
Assignment problem. We introduced the notion of the feeding tree which allows a good distribution of
work to processors.
This algorithm which is the parallel version of a sequential B &B algorithm leads to a linear speedup
(nearly equal to the number of processors), with very little overhead. Since the heuristics for QAP give
very good solutions, no oversearch is pursued (only the critical tree is explored). The parallel implemen-
tation of the Depth First Search strategy is scalable in CPU time and memory requirement.
Nevertheless, whereas the parallelism is fully used, a global improvement of the algorithm is still
required to solve problems of size greater than 20. Without considering the hardware evolution, these
solutions can not be reached with the existing tight bounds.
Acknowledgements
We thank Franz Rendl, Professor at the Technische Universit~it of Graz, Austria, for his helpful
comments and suggestions on this work during the last few years.
628 B. Mans et aL/ European Journal of Operational Research 81 (1995) 617-628
The authors are indebted to Cray France (esp. G. Simeoni) and to CCVR (Centre de Calcul Vectoriel
pour la Recherche, Palaiseau, France), for supporting this research with computer time and for helpful
discussions.
References
[1] Armour, G., and Buffa E., "A heuristic algorithm and simulation approach to the relative allocation of facilities", Management
Science, 9 (1963) 294-309.
[2] Assad, A., and Xu, W., "On lower bounds for a class of quadratic 0-1 programs", Operations Research Letters 4 (1985)
175- 180.
[3] Brown, D., Huntley, C., and Spillane, A., "A parallel genetic heuristic for the quadratic assignment problem", in: Proc. of the
3rd Conference on Genetic Algorithms, 406-415, Arlington, 1989.
[4] Burkard, R., and Derigs, U., Assignment and Matching Problems: Solution Methods with Fortran Programs, Springer Verlag,
Berlin 1980.
[5] Burkard, R., Karisch, S., and Rendl, F., "Qaplib - a quadratic assignment problem library. European Journal of Operational
Research 55 (1991) 115-119.
[6] Carraresi, P., and Malucelli F., "A new lower bound for the quadratic assignment problem", Operations Research 40 (1992)
$22-$27.
[7] Chakrapani, J., and Skorin-Kapov, J., "Massively parallel tabu search for the quadratic assignment problem", Technical
Report HAR-91-06, Harriman School for Management and Policy, NY 11794, 1991.
[8] Crouse, J., and Pardalos, P., "A parallel algorithm for the quadratic assignment problem", in: Proceedings of Supercomputing
89, 351-360, ACM, 1989.
[9] Elshafei, A., "Hospital layout as a quadratic assignment problem", Operational Research Quarterly 28 (1977) 167-179.
[10] Finke, G., Burkard, R., and Rend1, F., "Quadratic assignment problems", Annals of Discrete Math. 28 (1987) 61-82.
[11] Gilmore, P.C., "Optimal and suboptimal algorithms for the quadratic assignment problem", SIAM Journal on Applied Math. 10
(1962) 305-313.
[12] Koopmans, T.C., and Beckman, M.J., "Assignment problems and the location of economic activities", Econometrica 25 (1957)
53-76.
[13] Lawler, E., "The quadratic assignment problem", Management Science 9 (1963) 586-599.
[14] Mans, B., "Contribution a l'algorithmique non num6dque parallele: Parall61isations de m6thodes de recherche arborescentes",
Thbse d'universit6, Universit6 Paris VI, 4, place Jussieu, 75252 Pal-is cedex 05, June 1992.
[15] Mautor, T., "Contribution ~ la r6solution des probl~mes d'implantation: Algorithmes s6quentiels et paraU~les pour l'affecta-
tion quadratique", Th~se d'universit6, Universit6 Paris VI; 4, place Jussieu, 75252 Paris Cedex 05, February 1993.
[16] Mautor, T., and Roucairol C., "A new exact algorithm for the solution of quadratic assignment problems", Discrete Applied
Mathematics to appear, 1993. MASI-RR-92-09-Universit6 de Paris 6, 4 place Jussieu, 75252 Paris C6dex 05.
[17] Mulhenbein, H., "Parallel genetic algorithms, population genetics and combinatorial optimization", in: J.D. Becket I. Eisele,
and F.W. Mundemann, (eds.), Parallelism, Learning, Evolution; Workshop on Evolutionary Models and Strategies and Workshop
on Parallel Processing: Logic, Organization and Technology WOPPLOT89. Springer-Verlag, Berlin, July 1989.
[18] Nugent, C., Vollmann, T., and Ruml. J., "An experimental comparison of techniques for the assignment of facilities to
locations", Operations Research 16 (1968) 150-173.
[19] Palubetskes, G.S., "Generation of quadratic assignment test problems with known optimal solution". Zhurnal Vychislitel'noi
Matematiki i Matematicheskoi Fisiki 28 11 (1988) 1740-t743, (in Russian).
[20] Rendl. F., and Wolkowicz, H., "Applications of parametric programming and eigenvalue maximization to the quadratic
assignment problem", Mathematical Programming (1992) 63-78.
[21] Roucairol, C., "A parallel branch and bound algorithm for the quadratic assignment problem", Discrete Applied Mathematics
18 (1987) 211-225.
[22] Sahni, S., and Gonzalez, T., "P-complete approximation problems", Journal of the ACM 23 (1976) 555-565.
[23] Scriabin, M., and Vergin, R.C., "Comparison of computer algorithms and visual based methodes for plant layout",
Management Science 22 (1975) 172-187.
[24] Taillard, E.,' "Robust tabu search for the quadratic assignment problem", Parallel Computing 17 (1991) 443-455.
[25] Wang, Jun, "A parallel distributed processor for the quadratic assignment problem", in: Proceedings of the INNC90,
International Neural Network Conference 1, 278-281. Paris. France, July 1990.

More Related Content

Similar to A Parallel Depth First Search Branch And Bound Algorithm For The Quadratic Assignment Problem

Trust Region Algorithm - Bachelor Dissertation
Trust Region Algorithm - Bachelor DissertationTrust Region Algorithm - Bachelor Dissertation
Trust Region Algorithm - Bachelor DissertationChristian Adom
 
3 article azojete vol 7 24 33
3 article azojete vol 7 24 333 article azojete vol 7 24 33
3 article azojete vol 7 24 33Oyeniyi Samuel
 
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETSFAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETScsandit
 
2011 santiago marchi_souza_araki_cobem_2011
2011 santiago marchi_souza_araki_cobem_20112011 santiago marchi_souza_araki_cobem_2011
2011 santiago marchi_souza_araki_cobem_2011CosmoSantiago
 
A New Neural Network For Solving Linear Programming Problems
A New Neural Network For Solving Linear Programming ProblemsA New Neural Network For Solving Linear Programming Problems
A New Neural Network For Solving Linear Programming ProblemsJody Sullivan
 
Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...
Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...
Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...SSA KPI
 
7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930ea7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930eaAlvaro
 
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsParallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsSubhajit Sahu
 
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsParallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsSubhajit Sahu
 
Shor's discrete logarithm quantum algorithm for elliptic curves
 Shor's discrete logarithm quantum algorithm for elliptic curves Shor's discrete logarithm quantum algorithm for elliptic curves
Shor's discrete logarithm quantum algorithm for elliptic curvesXequeMateShannon
 
Anomaly Detection in Temporal data Using Kmeans Clustering with C5.0
Anomaly Detection in Temporal data Using Kmeans Clustering with C5.0Anomaly Detection in Temporal data Using Kmeans Clustering with C5.0
Anomaly Detection in Temporal data Using Kmeans Clustering with C5.0theijes
 
Volume 2-issue-6-2200-2204
Volume 2-issue-6-2200-2204Volume 2-issue-6-2200-2204
Volume 2-issue-6-2200-2204Editor IJARCET
 
Volume 2-issue-6-2200-2204
Volume 2-issue-6-2200-2204Volume 2-issue-6-2200-2204
Volume 2-issue-6-2200-2204Editor IJARCET
 
A Bibliography on the Numerical Solution of Delay Differential Equations.pdf
A Bibliography on the Numerical Solution of Delay Differential Equations.pdfA Bibliography on the Numerical Solution of Delay Differential Equations.pdf
A Bibliography on the Numerical Solution of Delay Differential Equations.pdfJackie Gold
 
An Accelerated Branch-And-Bound Algorithm For Assignment Problems Of Utility ...
An Accelerated Branch-And-Bound Algorithm For Assignment Problems Of Utility ...An Accelerated Branch-And-Bound Algorithm For Assignment Problems Of Utility ...
An Accelerated Branch-And-Bound Algorithm For Assignment Problems Of Utility ...Samantha Vargas
 
A stochastic algorithm for solving the posterior inference problem in topic m...
A stochastic algorithm for solving the posterior inference problem in topic m...A stochastic algorithm for solving the posterior inference problem in topic m...
A stochastic algorithm for solving the posterior inference problem in topic m...TELKOMNIKA JOURNAL
 
Amelioration of Modeling and Solving the Weighted Constraint Satisfaction Pro...
Amelioration of Modeling and Solving the Weighted Constraint Satisfaction Pro...Amelioration of Modeling and Solving the Weighted Constraint Satisfaction Pro...
Amelioration of Modeling and Solving the Weighted Constraint Satisfaction Pro...IJCSIS Research Publications
 
HOME ASSIGNMENT (0).pptx
HOME ASSIGNMENT (0).pptxHOME ASSIGNMENT (0).pptx
HOME ASSIGNMENT (0).pptxSayedulHassan1
 
HOME ASSIGNMENT omar ali.pptx
HOME ASSIGNMENT omar ali.pptxHOME ASSIGNMENT omar ali.pptx
HOME ASSIGNMENT omar ali.pptxSayedulHassan1
 

Similar to A Parallel Depth First Search Branch And Bound Algorithm For The Quadratic Assignment Problem (20)

Trust Region Algorithm - Bachelor Dissertation
Trust Region Algorithm - Bachelor DissertationTrust Region Algorithm - Bachelor Dissertation
Trust Region Algorithm - Bachelor Dissertation
 
3 article azojete vol 7 24 33
3 article azojete vol 7 24 333 article azojete vol 7 24 33
3 article azojete vol 7 24 33
 
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETSFAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS
 
2011 santiago marchi_souza_araki_cobem_2011
2011 santiago marchi_souza_araki_cobem_20112011 santiago marchi_souza_araki_cobem_2011
2011 santiago marchi_souza_araki_cobem_2011
 
A New Neural Network For Solving Linear Programming Problems
A New Neural Network For Solving Linear Programming ProblemsA New Neural Network For Solving Linear Programming Problems
A New Neural Network For Solving Linear Programming Problems
 
Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...
Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...
Efficient Solution of Two-Stage Stochastic Linear Programs Using Interior Poi...
 
7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930ea7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930ea
 
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsParallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
 
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower BoundsParallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
Parallel Batch-Dynamic Graphs: Algorithms and Lower Bounds
 
Shor's discrete logarithm quantum algorithm for elliptic curves
 Shor's discrete logarithm quantum algorithm for elliptic curves Shor's discrete logarithm quantum algorithm for elliptic curves
Shor's discrete logarithm quantum algorithm for elliptic curves
 
Anomaly Detection in Temporal data Using Kmeans Clustering with C5.0
Anomaly Detection in Temporal data Using Kmeans Clustering with C5.0Anomaly Detection in Temporal data Using Kmeans Clustering with C5.0
Anomaly Detection in Temporal data Using Kmeans Clustering with C5.0
 
Volume 2-issue-6-2200-2204
Volume 2-issue-6-2200-2204Volume 2-issue-6-2200-2204
Volume 2-issue-6-2200-2204
 
Volume 2-issue-6-2200-2204
Volume 2-issue-6-2200-2204Volume 2-issue-6-2200-2204
Volume 2-issue-6-2200-2204
 
A Bibliography on the Numerical Solution of Delay Differential Equations.pdf
A Bibliography on the Numerical Solution of Delay Differential Equations.pdfA Bibliography on the Numerical Solution of Delay Differential Equations.pdf
A Bibliography on the Numerical Solution of Delay Differential Equations.pdf
 
An Accelerated Branch-And-Bound Algorithm For Assignment Problems Of Utility ...
An Accelerated Branch-And-Bound Algorithm For Assignment Problems Of Utility ...An Accelerated Branch-And-Bound Algorithm For Assignment Problems Of Utility ...
An Accelerated Branch-And-Bound Algorithm For Assignment Problems Of Utility ...
 
A stochastic algorithm for solving the posterior inference problem in topic m...
A stochastic algorithm for solving the posterior inference problem in topic m...A stochastic algorithm for solving the posterior inference problem in topic m...
A stochastic algorithm for solving the posterior inference problem in topic m...
 
Amelioration of Modeling and Solving the Weighted Constraint Satisfaction Pro...
Amelioration of Modeling and Solving the Weighted Constraint Satisfaction Pro...Amelioration of Modeling and Solving the Weighted Constraint Satisfaction Pro...
Amelioration of Modeling and Solving the Weighted Constraint Satisfaction Pro...
 
HOME ASSIGNMENT (0).pptx
HOME ASSIGNMENT (0).pptxHOME ASSIGNMENT (0).pptx
HOME ASSIGNMENT (0).pptx
 
HOME ASSIGNMENT omar ali.pptx
HOME ASSIGNMENT omar ali.pptxHOME ASSIGNMENT omar ali.pptx
HOME ASSIGNMENT omar ali.pptx
 
https:::arxiv.org:pdf:2105.13813.pdf
https:::arxiv.org:pdf:2105.13813.pdfhttps:::arxiv.org:pdf:2105.13813.pdf
https:::arxiv.org:pdf:2105.13813.pdf
 

More from Carrie Romero

Paper Mate Write Bros Comfort 0.7Mm Mechanical Pencil - 5
Paper Mate Write Bros Comfort 0.7Mm Mechanical Pencil - 5Paper Mate Write Bros Comfort 0.7Mm Mechanical Pencil - 5
Paper Mate Write Bros Comfort 0.7Mm Mechanical Pencil - 5Carrie Romero
 
Introduction Template For Report - Professional Tem
Introduction Template For Report - Professional TemIntroduction Template For Report - Professional Tem
Introduction Template For Report - Professional TemCarrie Romero
 
Transition In Essay Writing - Paragraph Transitions
Transition In Essay Writing - Paragraph TransitionsTransition In Essay Writing - Paragraph Transitions
Transition In Essay Writing - Paragraph TransitionsCarrie Romero
 
How To Help Students Brainstorm And Organize An Essay
How To Help Students Brainstorm And Organize An EssayHow To Help Students Brainstorm And Organize An Essay
How To Help Students Brainstorm And Organize An EssayCarrie Romero
 
Quality Management Essay
Quality Management EssayQuality Management Essay
Quality Management EssayCarrie Romero
 
Essay On Helping Others In English 10 Lines Essay O
Essay On Helping Others In English 10 Lines Essay OEssay On Helping Others In English 10 Lines Essay O
Essay On Helping Others In English 10 Lines Essay OCarrie Romero
 
Buy Essays Cheap Review - Resumverlet
Buy Essays Cheap Review - ResumverletBuy Essays Cheap Review - Resumverlet
Buy Essays Cheap Review - ResumverletCarrie Romero
 
RESEARCH PAPER WRITERS CAN HELP Y
RESEARCH PAPER WRITERS CAN HELP YRESEARCH PAPER WRITERS CAN HELP Y
RESEARCH PAPER WRITERS CAN HELP YCarrie Romero
 
Beautiful Beloved Essay Thatsnotus
Beautiful Beloved Essay ThatsnotusBeautiful Beloved Essay Thatsnotus
Beautiful Beloved Essay ThatsnotusCarrie Romero
 
Essay Benefits Of Reading Spm Teleg
Essay Benefits Of Reading Spm TelegEssay Benefits Of Reading Spm Teleg
Essay Benefits Of Reading Spm TelegCarrie Romero
 
Templates Writing Paper Template, Writing Paper, Ki
Templates Writing Paper Template, Writing Paper, KiTemplates Writing Paper Template, Writing Paper, Ki
Templates Writing Paper Template, Writing Paper, KiCarrie Romero
 
Buy College Essays. Essa
Buy College Essays. EssaBuy College Essays. Essa
Buy College Essays. EssaCarrie Romero
 
HSC Essay Plans For Questions On The National S
HSC Essay Plans For Questions On The National SHSC Essay Plans For Questions On The National S
HSC Essay Plans For Questions On The National SCarrie Romero
 
Essay About Importance Of Le
Essay About Importance Of LeEssay About Importance Of Le
Essay About Importance Of LeCarrie Romero
 
1 3 1 Essay Example - Lawwustl.Web.Fc2.Com
1 3 1 Essay Example - Lawwustl.Web.Fc2.Com1 3 1 Essay Example - Lawwustl.Web.Fc2.Com
1 3 1 Essay Example - Lawwustl.Web.Fc2.ComCarrie Romero
 
What Color Resume Paper Should You Use Prepar
What Color Resume Paper Should You Use PreparWhat Color Resume Paper Should You Use Prepar
What Color Resume Paper Should You Use PreparCarrie Romero
 
Munchkin Land Kindergarten Kid Writing P
Munchkin Land Kindergarten Kid Writing PMunchkin Land Kindergarten Kid Writing P
Munchkin Land Kindergarten Kid Writing PCarrie Romero
 
Sample Informative Speech Outline On Caf
Sample Informative Speech Outline On CafSample Informative Speech Outline On Caf
Sample Informative Speech Outline On CafCarrie Romero
 
Top Websites To Get Free Essays Online CustomEs
Top Websites To Get Free Essays Online CustomEsTop Websites To Get Free Essays Online CustomEs
Top Websites To Get Free Essays Online CustomEsCarrie Romero
 

More from Carrie Romero (20)

Paper Mate Write Bros Comfort 0.7Mm Mechanical Pencil - 5
Paper Mate Write Bros Comfort 0.7Mm Mechanical Pencil - 5Paper Mate Write Bros Comfort 0.7Mm Mechanical Pencil - 5
Paper Mate Write Bros Comfort 0.7Mm Mechanical Pencil - 5
 
Introduction Template For Report - Professional Tem
Introduction Template For Report - Professional TemIntroduction Template For Report - Professional Tem
Introduction Template For Report - Professional Tem
 
Transition In Essay Writing - Paragraph Transitions
Transition In Essay Writing - Paragraph TransitionsTransition In Essay Writing - Paragraph Transitions
Transition In Essay Writing - Paragraph Transitions
 
How To Help Students Brainstorm And Organize An Essay
How To Help Students Brainstorm And Organize An EssayHow To Help Students Brainstorm And Organize An Essay
How To Help Students Brainstorm And Organize An Essay
 
Quality Management Essay
Quality Management EssayQuality Management Essay
Quality Management Essay
 
Essay On Helping Others In English 10 Lines Essay O
Essay On Helping Others In English 10 Lines Essay OEssay On Helping Others In English 10 Lines Essay O
Essay On Helping Others In English 10 Lines Essay O
 
Buy Essays Cheap Review - Resumverlet
Buy Essays Cheap Review - ResumverletBuy Essays Cheap Review - Resumverlet
Buy Essays Cheap Review - Resumverlet
 
RESEARCH PAPER WRITERS CAN HELP Y
RESEARCH PAPER WRITERS CAN HELP YRESEARCH PAPER WRITERS CAN HELP Y
RESEARCH PAPER WRITERS CAN HELP Y
 
Beautiful Beloved Essay Thatsnotus
Beautiful Beloved Essay ThatsnotusBeautiful Beloved Essay Thatsnotus
Beautiful Beloved Essay Thatsnotus
 
CRITICAL ANALYSIS
CRITICAL ANALYSISCRITICAL ANALYSIS
CRITICAL ANALYSIS
 
Essay Benefits Of Reading Spm Teleg
Essay Benefits Of Reading Spm TelegEssay Benefits Of Reading Spm Teleg
Essay Benefits Of Reading Spm Teleg
 
Templates Writing Paper Template, Writing Paper, Ki
Templates Writing Paper Template, Writing Paper, KiTemplates Writing Paper Template, Writing Paper, Ki
Templates Writing Paper Template, Writing Paper, Ki
 
Buy College Essays. Essa
Buy College Essays. EssaBuy College Essays. Essa
Buy College Essays. Essa
 
HSC Essay Plans For Questions On The National S
HSC Essay Plans For Questions On The National SHSC Essay Plans For Questions On The National S
HSC Essay Plans For Questions On The National S
 
Essay About Importance Of Le
Essay About Importance Of LeEssay About Importance Of Le
Essay About Importance Of Le
 
1 3 1 Essay Example - Lawwustl.Web.Fc2.Com
1 3 1 Essay Example - Lawwustl.Web.Fc2.Com1 3 1 Essay Example - Lawwustl.Web.Fc2.Com
1 3 1 Essay Example - Lawwustl.Web.Fc2.Com
 
What Color Resume Paper Should You Use Prepar
What Color Resume Paper Should You Use PreparWhat Color Resume Paper Should You Use Prepar
What Color Resume Paper Should You Use Prepar
 
Munchkin Land Kindergarten Kid Writing P
Munchkin Land Kindergarten Kid Writing PMunchkin Land Kindergarten Kid Writing P
Munchkin Land Kindergarten Kid Writing P
 
Sample Informative Speech Outline On Caf
Sample Informative Speech Outline On CafSample Informative Speech Outline On Caf
Sample Informative Speech Outline On Caf
 
Top Websites To Get Free Essays Online CustomEs
Top Websites To Get Free Essays Online CustomEsTop Websites To Get Free Essays Online CustomEs
Top Websites To Get Free Essays Online CustomEs
 

Recently uploaded

भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,Virag Sontakke
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfakmcokerachita
 
Science lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonScience lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonJericReyAuditor
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting DataJhengPantaleon
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptxENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptxAnaBeatriceAblay2
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 

Recently uploaded (20)

भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
Science lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonScience lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lesson
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptxENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 

A Parallel Depth First Search Branch And Bound Algorithm For The Quadratic Assignment Problem

  • 1. ELSEVIER European Journal of Operational Research 81 (1995)617-628 EUROPEAN JOURNAL OF OPERATIONAL RESEARCH Theory and Methodology A parallel depth first search branch and bound algorithm for the quadratic assignment problem Bernard Mans a,c, Thierry Mautor a,b,, and Catherine Roucairol a,b aINRIA, Domaine de Voluceau, Rocquencourt BP 105, F-78153 Le Chesnay Cedex, France b MASI-UniversitE de Versailles 45, Avenue des Etats-Unis, 78000 Versailles, France c Carleton University, School of Computer Science, Ottawa K1S 5B6, Canada Received January 1993;revised March 1993 Abstract We propose a new parallel Branch and Bound algorithm for the Quadratic Assignment Problem, which is a Combinatorial Optimization problem known to be very hard to solve exactly. An original method to distribute work to processors using the notion of Feeding Tree is presented. When adequately used, it allows to reduce memory contention and load unbalance. Therefore, a linear speed-up in the number of processors is reached on a shared memory multiprocessor, the Cray 2 and the optimality of solutions for famous problems of size less than 20 (Nugent 16, Elshafei 19, Scriabin-Vergin 20,...) is proved by this program. The implementation analysis shows that these results are more than an improvement due to hardware evolution and confirms the usefulness of our parallel Branch and Bound algorithm for larger Quadratic Assignment Problems. Keywords: Branch and bound; Combinatorial optimization; Parallel algorithm; Quadratic assignment problem 1. Introduction The Quadratic Assignment Problem (QAP)is a combinatorial optimization problem introduced by Koopmans and Beckman, in 1957, [12]. It has numerous and various applications, such as location problems, VLSI design [18], architecture design [9].... The objective is to assign n units to n sites in order to minimize the quadratic cost of this assignment, which depends both on the distances between the sites and on the flows between the units. It can be formulated as follows. Given two (n × n) matrices, F = (fij) where f/j is the flow between units i and j, D = (dkt) where dkt is the distance between sites k and l, * Corresponding author. 0377-2217/95/$09.50 © 1995Elsevier Science B.V. All rights reserved SSDI 0377-2217(93)E0334-T
  • 2. 618 B. Martset al./ EuropeanJournalof OperationalResearch81 (1995)617-628 find a permutation p of the set N = {1, 2 .... , n} which minimizes the global cost function: Cost(p) : ~ ~fijdp(i)p(j). i=l j=l 2. Parallel solution methods The QAP has shown itself to be computationally a very difficult problem. This problem, of which the Travelling Salesman problem is a specific case, is NP-hard [13]. Moreover, finding an e-approximate solution is also NP-hard [22]. But its theoretical complexity is not a sufficient description of the extreme difficulty of this problem, for which even applications of moderate size (n -- 20) cannot, up to now, be solved exactly. Therefore, many heuristics have been developed in the past thirty years. More recently, due to the development of parallel architectures, many parallel algorithms have been fruitfully implemented to speed up the search and, thus, overcome the difficulty of the problem. 2.1. Parallel exact methods The best way to solve exactly a quadratic assignment problem is, to date, to use a Branch and Bound algorithm, as other methods such as cutting planes methods have not been successful, failing to solve exactly problems of size greater than eight. As a consequence, Branch and Bound algorithms have been the only exact parallel solution methods suggested for this problem. They are due to Roucairol, in 1987, [21] and to Crouse and Pardalos, in 1989, [8]. We will describe later the main characteristics of these algorithms (see Section 4.2). Nevertheless, these parallel Branch and Bound algorithms are also limited to the solution of small size problems (size < 15). This strong limitation explains why most recent approaches to this problem have been heuristics. 2.2. Parallel heuristics Among sequential solution methods, the adjustments of the recent meta-heuritics (Simulated Anneal- ing, Tabu Search and Genetic Algorithms) to the QAP provide the best approximate results, outstripping the results of the first heuristic solution methods (construction methods, exchange methods). Thus, logically, implementations of parallel heuristics are issued from these meta-heuristics. In 1989, Brown, Huntley and Spillane [3], on a 32-nodes Intel iPCS/2 hypercube, and Miihlenbein [17], on a 64 processors system with distributed memory, have developed parallel genetic algorithms. In 1991, Taillard [24], on a network of 10 transputers T800C, has proposed a parallel Tabu search, while Chakrapani and Skorin-Kapov [7] have developed a massively parallel Tabu search on a Connec- tion Machine using 16K processing units. These algorithms always find the optimal solution for problems of small sizes from the literature. Since the optimality cannot be proved for higher instances, it is hard to estimate the quality of the results. Nevertheless, for special instances built with the knowledge of the optimum (Palubetskes [19], Burkard et al. [5]), the results of these heuristics are very close to optimal. Let us also mention a connectionist approach, developed by Wang, in 1990, [25] on a microVax computer, which simulates n 2 processing units performing nonlinear transformations.
  • 3. B. Manset aL/ EuropeanJournalof OperationalResearch81 (1995)617-628 619 3. Sequential Branch and Bound Let us briefly recall the main principles of the sequential algorithm introduced by Mautor and Roucairol, in 1992, [16,15], which currently provides the best sequential results for the quadratic assignment problem. Meanwhile, we discuss important features, such as lower bounds and branching strategies, concerning any Branch and Bound algorithm. 3.1. Bounding Undoubtedly, the computation of the lower bound is one of the major difficulties in the exact solution of the quadratic assignment problem. Indeed, up to the present time, this bound is either too loose, the number of nodes of the search tree becoming quickly huge as the size of the problem increases, or, when the bound is slightly tighter, the time needed to compute the bound on one node is prohibitive. The oldest lower bound, developed independantly in 1962 by Gilmore [11] and Lawler [13], is obtained by solving a linear assignment problem on a (n x n) matrix C = (Cig), where Cig is the lower bound of the assignment of facility i to site k and is given by the computation of the ranked product of the row fi. of F and the colunm d.k of D. This bound is, therefore, quickly computed in O(n 3) but its results are not very tight. For instance, the Gilmore-Lawler bound is more than 20% away from the best solution for Nugent's problems of size greater than 20. The most interesting other lower bounds are based on an eigenvalue approach (Finke, Burkard and Rendl [10], Rendl and Wolkowicz [20]) or on equivalent dual formulations of the problem (Assad and Xu [2], Carraresi and MaluceUi [6]). If they provide a slightly better evaluation, it is at the cost of a very significant increase in computational time. The best ratio quality/time is, therefore, still achieved by the Gilmore-Lawler bound. For this reason, the most successful Branch and Bound algorithms (Burkard and Derigs [4], Roucairol [21], Crouse and Pardalos [8]) use the Gilmore-Lawler bound. Mautor and Roucairol use this bound and concentrate their effort on more efficient approaches to reduce the enumeration. 3.2. Reduction tests Symmetric equivalences In most classical applications of QAP, the sites are located on a regular figure - grid, circle or line - on which several symmetric or isometric equivalences can be detected. We want to avoid creating, visiting and bounding these different but equivalent nodes in different branches of the search tree. For this purpose, a simple test which is quickly computed, has been introduced by Mautor and Roucairol [16]. For any partial solution, this test identifies the different isometric classes. Thus, a unit can be assigned to only one site of each isometric class. Reduction test using the search gap This classical test forbids some assignments and, therefore, reduces the size of the B&B tree. Moreover, it allows to choose an efficient branching for the next level (see Section 3.3).
  • 4. 620 B. Mans et al. / European Journal of Operational Research 81 (1995) 617-628 Let us denote: lb: the value of the lower bound obtained, bks: the value of the best known solution (upper bound). Test. If the alternative cost of the assignment of an unit to a site is greater or equal than the search gap (difference between the value of the best known solution and the lower bound: bks- lb), then this assignment can be forbidden (the proof can be found in [16]). 3.3. Branching strategies Depth first search strategy As mentioned above, exact methods can only deal with problems of small size, for which efficient and recent heuristics, such as Tabu search, almost always find the optimal solution or, at least, a solution extremely close to optimal. As a consequence, since only the nodes of the critical tree (evaluation lower than the best solution) have to be examined, a depth first search strategy is more efficient than a best first search strategy. This approach has two advantages. First, the data structures are lighter. Second, we avoid incessant sortings of partial solutions and reconstructions of independant subproblems (our experiments have shown that building the matrices at each exploration of a B&B node takes at least 40% of the global computation time). Polytomic branching The branching scheme is polytomic, that is a selected unit is assigned to all the free (not already assigned), not forbidden (the alternative cost is in the search gap) and isometrically different sites. This selected unit is the one with the lowest number of sons to be examined (highest number of forbidden elements). This scheme has several advantages. First, with respect to a dichotomic branching, fewer nodes are created. Second, the memory requirement is reduced by avoiding the unnecessary information on revoked assignments generated by classical dichotomic branching schemes, where an unit is assigned or not to a site. Moreover, this scheme discriminates between the lower bounds of a node and its sons and allows early pruning of some branches. Our experiments have shown that this strategy produces a significant decrease in the size of the search tree. 3.4. Size of the search tree We can see in Table 1 that the reduction tests and the branching strategy, described above, lead to a drastic decrease in the total number of created nodes compared to the previous most efficient exact algorithms (Burkard and Derigs, Roucairol, Crouse and Pardalos). Table 1 Total number of created nodes Algorithm Nugent 8 Nugent 12 Nugent 15 Elshafei 19 Burkard-Derigs 403 36 966 2 064415 not proved Roucairol 428 83 379 not proved not proved Crouse-Pardalos 798 42 706 1596 353 not proved Mautor-Roucairol 32 3 474 97 287 491
  • 5. B. Manset al./ EuropeanJournalof OperationalResearch81 (1995)617-628 621 4. Parallel Branch and Bound 411. Main principles We have parallelized the Mautor-Roucairol sequential algorithm on an asynchronous shared memory multiprocessor, the Cray 2. Our main concern has been to minimize the waiting time (inactivity) of the processors and, thus, to obtain the highest speed-up. Thus, our approach considers the major difficulties appearing in parallelizing B&B methods [14]: task allocation, choice of granularity, overhead detection .... As previously argued, we use a depth first search strategy. In a parallel implementation, an additional reason appears, since the granularity (relative number of operations done between synchronizations) defined by a parallel best first search strategy would correspond only to the exploration of a B&B node, which is quickly done. Therefore, the partial sort of the newly generated nodes, at each branching step, in a global shared list would create many contentions on the memory access, and hence a bottleneck for processors. For these reasons, we propose that each processor should execute the same depth first search algorithm on different B & B subtrees. Each processor is, therefore, assigned to the root of a subtree and it develops a local depth first search on the corresponding subtree. When a processor has completed its own exploration ofa subtree, it accesses a shared data structure, the feeding tree, in which it takes a new node, the root of a new subtree to develop. Since this task allocation cannot be done statically, processors schedule themselves (self-scheduling technique) and trigger their own work demand. The consistency and fairness of this shared list of tasks is ensured by synchronizing with classical primitives as locks. The procedure stops when the feeding tree is empty and all processors are idle. The main features of our algorithm are illustrated on the Nugent et al. problem of size 15. Indeed, for this problem, the size of the tree and the computational time are significant enough for the implementa- tion choices to be carefully analyzed. It should be noted that we have obtained similar results with other problems, in that the speed-up has been linear in all cases. 4.2. Task allocation Different approaches can be adopted for the task allocation. In Roucairol's parallel algorithm [21], a global heap is used to memorize the generated B&B nodes. Since a best first search strategy is used, each exploration of a B & B node requires that the shared data structure be accessed once to obtain the required node and be accessed several times to insert the newly generated nodes. In order to limit this global access, Crouse and Pardalos [8] initially create several heaps in the global memory, so that each processor can select one and explore it completely locally. In this case, the parallel B&B execution terminates when all heaps have been explored and all processors are idle. Since our approach is quite different, we describe the way we distribute the work amongst the processors, through the notion of the feeding tree. z The feeding tree Definition 4.1 The Feeding Tree, shared data structure, is the upper part of the B&B tree developed down to depth (or level) i. The leaves of the Feeding Tree (nodes of level i of the B&B tree) are the roots of the subtrees allocated to the processors (see Fig. 1).
  • 6. 622 13.Marts et al./European Journal of Operational Research 81 (1995) 617-628 Fig. 1. Feeding Tree and allocated subtrees. The first free processor initializes the left part of the Feeding Tree until it generates, by successive branchings, the leftmost node at the chosen depth i. Then, this processor unlocks the shared structure and begins its own exploration on the subtree, whose the node is the root. While the exploration of the allocated subtree is not completed, the processor does not need to access the global shared structure. Gradually, the other processors access the feeding tree, in a mutually exclusive way, and develop it until a new node of depth i - the next depth-i-node to the right of the last allocated node - with eventual back-trackings in the feeding tree. Likewise, as soon as a processor becomes idle, the exploration of its last subtree being completed, it tries to access again the feeding tree to build a new "depth-i-node". When the development of the feeding tree is completed (at the first level, it is not possible to branch on a new partitioned subproblem), each processor that becomes idle terminates (the main program terminates when the last processor becomes idle). Since an assignment is fixed at each branching step, the maximal depth of the B&B tree is n - 1, where n is the size of the QAP problem. The maximal depth of allocated subtrees is, therefore, equal to (n - i - 1). Due to the depth first search strategy, in the feeding tree, only the nodes on the path from the root to the last allocated node (the facilities assigned to reach this depth and the set of the remaining available locations) have to be memorized. Similarly, the context memorized by a processor is the path between the current node and its ascendant of depth i. Thus, the memory requirement for this parallel implementation is slightly increased with respect to the sequential implementation. Indeed, this increase is only in O(p) ((i +p(n - i - 1)/(n - 1)), where p is the number of processors. Algorithm procedure Parallel B&B begin /* initialize Feeding Tree */ Nprocsldle := 0 Create fid, the first leftmost descendant of root with depth i for each of the Nprocs processors do/* self-schedule */ lock (Feeding Tree) Create rfid right sibling of rid
  • 7. B. Manset al.~EuropeanJournalof OperationalResearch81 (1995)617-628 623 if rfid ~ null then Keep context of rfid rid := rfid unlock (Feeding Tree) else /* no more tasks in this branch */ Backtrack until creating a right sibling ancfid if ancfid ¢ null then Create rf/d, the first leftmost descendant of ancfid with-depth i Keep context of rfid ..= 'rid unlock (Feeding Tree) else / * no more tasks to assign */ Nprocsldle .'= Nprocsldle + 1 unlock (Feeding Tree) Terminate endif endif /* Depth First Search exploration */ Expand the whole B&B subtree if a new best solution nbs is found then lock (bks) if (nbs < bks) bks := nbs unlock (bks) endif endo / * Nprocsldle = Nprocs */ end Parallel B&B 4.3. Granularity - memory contentions As previously mentioned, we want to minimize the waiting time of the processors. Now, a processor has to wait in two cases: • when the processor needs to access a global shared structure, locked by another processor (memory contention), • during the termination phase, when the processor is idle and has to wait for other processors to complete their last local exploration (termination wait). Let us discuss these two cases in more detail. Memory contention The overhead problem, fully dependant on the parallel implementation, is decisive on the Cray 2. A lock operation requires 200 CPU cycles when the lock is free, while the conflict management requires 4000 CPU cycles. Since the possible updatings of the best known solution are extremely rare and quickly done, only the dynamic development of the feeding tree can take so much time to induce a significant waiting time for other processors. But the total waiting time of processors is a function of level i of the feeding tree. The deeper this level is, the smaller the average granularity of the tasks allocated to the processors will be. The processors will then access the feeding tree more frequently. Moreover, the average time for
  • 8. 624 B. Mans et al. / European Journal of Operational Research 81 (1995) 617-628 0.2 0. I ............ o ......... ......... ......... ! ................. , shared level Fig. 2. Conflictratio depending on the level(Nugent 15). a processor to develop the feeding tree and generate a new "depth-i-node" will increase, due to numerous backtrackings. This phenomenon is shown in Fig. 2, where the access conflict ratio (percentage of access require- ments when the feeding tree is locked) is given, with respect to the shared level, For example, when the depth of the feeding tree is equal to 5, nearly 30% of the access requirements result in waiting time. Termination wait This delay occurs when a processor without work (no more task) waits for the completion of the subtree(s) allocated to other processor(s). The maximal idle time is the time needed to explore the largest subtree which can be assigned to a processor. When the level i of the feeding tree decreases, the average granularity of the tasks (subtrees) allocated to the processors will increase. Therefore, the probability of a long termination wait becomes more and more significant. This is illustrated in Fig. 3, where the difference in the numbers of processed nodes (generated and explored) by the most and the least active processors is given. The effect of the chosen depth i on the 0.4-- 0.3 0.2 O. 1 I nI n- n_ 0 . 0 . . . . . . . . . I . . . . . . . . . . . . . . . . . . I . . . . . . . . . I . . . . . . . . . I . . . . . . . . . I O 1 2 3 4 5 6 shared level Fig. 3. Difference in the numbers of processed nodes with respect to the shared level. [:' explored I generated
  • 9. B. Manset al./ EuropeanJournalof OperationalResearch81 (1995)617"628 625 load balancing is confirmed, since, with a small feeding tree(level 1 or 2), this difference can reach 30% (a processor has explored 5000 subproblems more than another) while a deeper feeding tree produces a very good balance of the load. 5. Experiments Our algorithm, written in Fortran, was run on the Cray 2 which is a 4-processor asynchronous machine with shared memory. The test data used for the computational results is as follows: • Nugent et al. [18]: classical problems for QAP of sizes n = 12, n = 15, and n = 16 (the flow matrix is extracted from the problem of size 20 and the sites are located on a 4 × 4 square), • Elshafei: hospital layout problem of size 19 from [9], • Scriabin and Vergin: economic layout problem of size 20 [23], from Armour and Buffa [1]. 5.1. Computational results We compared the results obtained by our method with those obtained by one of the fastest sequential algorithms (Burkard and Derigs [4]) and by the two previously most efficient parallel algorithms available (Crouse and Pardalos [8], and Roucairol [21]). Since Burkard and Derigs' algorithm has been tested in 1980 on a Cyber 76 and because of the evolution in computational performances, we ran their sequential algorithm on the Cray 2. We also report on two different sets of results from Crouse and Pardalos. The first is obtained from a sequential run of this algorithm and the second from running the algorithm in parallel on a four-processor machine. In Table 2, we present a comparison of the running times of these algorithms. Some tests done on a Cray YMP are also presente d. Remark. Since our dedicated use of the Cray 2 (i.e. a single execution on the machine) was limited to 400 seconds, cumulated for all used processors, parallel executions of Nugent of size 16, and of Scriabin and Vergin of size 20 could not be done with a reasonably busy machine. The amount of time needed for parallel executions of these problems depends on the number of users working on the machine. Thus, in these two cases,-we could not reach the predictable linear speedup shown for Nugent et al. of size 15. Table 2 Comparison of running times (in seconds).. Algorithm Machine No. Nugent Nugent Nugent Elshafei Scr.-Ver. procs size 12 size 15 size 16 size 19 size 20 Optimal solution 578 1150 1550 17212548 110030 Burk. Der. Cray 2 1 24 1290 not proved not proved not proved Roucairol Cray XMP 4 312 out of time not proved not proved not proved Cr. Pard. IBM 3090 1 34 2005 not proved not proved not proved Cr. Pard. IBM 3090 4 10 out of space not proved not proved not proved Mans et al. Cray 2 1 2.68 109 969 1.04 1189 Mans et al. Cray 2 4 0.99 28 436 0.68 560 Mans et al. Cray YMP 1 not tested 62 not tested not tested not tested Mans et al. Cray YMP 6 not tested 11 not tested not tested not tested
  • 10. 626 B. Mans et aL/European Journalof OperationalResearch81 (1995) 617-628 38 i i i i i i i 37 36 35 34 Time 3 3 in sec 32 31, "~ 3O 29 28 I 2 3 4 sharedlevel Fig. 4. Running time obtained on Nugent 15 with respect to the level i. Among the classical quadratic assignment problems from the literature, the problem of Nugent of size 15 was considered to be the hardest to solve by an exact method. Only the most efficient Branch and Bound algorithms managed to solve this problem and they needed a long computational time (more than 20 minutes) to compute the solution. Our parallel algorithm manages to solve this problem in a few seconds (11 seconds on the Cray YMP, 28 seconds on the Cray 2). This result is a good illustration of the efficiency of this parallel algorithm, which, on all the classical problems, obtains the fastest running time. Moreover, this algorithm obtains the best solution and the proof of the optimality of this solution for problems of size 16 to 20 which have never been solved exactly before: Nugent of size 16, Scriabin and Vergin of size 20, Elshafei of size 19. We have also run our algorithm to solve randomly generated problems with previously known optimal solutions, (Palubetskes [19], Burkard and al. [5]). Problems of size 18 were solved sequentially in less than 6 minutes, while they were solved in parallel with a linear speedup. 5.2. Parallel analysis In order to analyze the effects of the granularity and the sizes of the allocated subtrees, we considered the level /-depth of the feeding tree-as a parameter and ran our program with different values of i. 1101 , , i ~ , ' i00 9 0 level3 "~ level 4 -b-- Time 70 in see. 60 50 40 30 2 0 i J t I l 2 3 4 Numberof Processors Fig. 5. Running time (Nugent 15) with respect to the number of processors.
  • 11. B. Mans et al. / European Journal of Operational Research 81 (1995) 617-628 627 4 3.5 3 Speed 2.5 up 2 1.5 t i i i i level 3 ~ ~ ' ~ lord 4 +- / / J 2 3 Number of Processors Fig. 6. Speed up on Nugent 15. The tests have been done on the Nugent 15 problem, while our program was the only one executed by the machine. As previously seen, the level i must not be set to extreme values, neither too high due to memory contentions (see figure 2), nor too low for a good balance of the load (see figure 3). This influence of the level i is also shown in Fig. 4, where the running times obtained with different depths of the feeding tree are presented. On the Nugent 15 problem, the best compromise is achieved when the depth is equal to 3 (or 4). Of course, the value of the level i leading to the best behaviour of the program depends on the problem and on the structure of the corresponding search tree. Nevertheless, all the experiments we have done show that a middle-valued depth of the feeding tree, equal to [n/5] or [n/4] (where n is the size of the problem), leads to very good performances. Therefore, as it can be seen in Fig. 5, with an appropriate choice of i, we obtain a significant decrease in the running time, as the number of processors increases. A linear speed-up in the number of processors is, thus, achieved (see Fig. 6). We emphasize the fact that the comparisons of the parallel running times are done with the running time of the best sequential implementation of the algorithm. 6. Conclusions We have presented an original and very efficient parallel algorithm for solving the Quadratic Assignment problem. We introduced the notion of the feeding tree which allows a good distribution of work to processors. This algorithm which is the parallel version of a sequential B &B algorithm leads to a linear speedup (nearly equal to the number of processors), with very little overhead. Since the heuristics for QAP give very good solutions, no oversearch is pursued (only the critical tree is explored). The parallel implemen- tation of the Depth First Search strategy is scalable in CPU time and memory requirement. Nevertheless, whereas the parallelism is fully used, a global improvement of the algorithm is still required to solve problems of size greater than 20. Without considering the hardware evolution, these solutions can not be reached with the existing tight bounds. Acknowledgements We thank Franz Rendl, Professor at the Technische Universit~it of Graz, Austria, for his helpful comments and suggestions on this work during the last few years.
  • 12. 628 B. Mans et aL/ European Journal of Operational Research 81 (1995) 617-628 The authors are indebted to Cray France (esp. G. Simeoni) and to CCVR (Centre de Calcul Vectoriel pour la Recherche, Palaiseau, France), for supporting this research with computer time and for helpful discussions. References [1] Armour, G., and Buffa E., "A heuristic algorithm and simulation approach to the relative allocation of facilities", Management Science, 9 (1963) 294-309. [2] Assad, A., and Xu, W., "On lower bounds for a class of quadratic 0-1 programs", Operations Research Letters 4 (1985) 175- 180. [3] Brown, D., Huntley, C., and Spillane, A., "A parallel genetic heuristic for the quadratic assignment problem", in: Proc. of the 3rd Conference on Genetic Algorithms, 406-415, Arlington, 1989. [4] Burkard, R., and Derigs, U., Assignment and Matching Problems: Solution Methods with Fortran Programs, Springer Verlag, Berlin 1980. [5] Burkard, R., Karisch, S., and Rendl, F., "Qaplib - a quadratic assignment problem library. European Journal of Operational Research 55 (1991) 115-119. [6] Carraresi, P., and Malucelli F., "A new lower bound for the quadratic assignment problem", Operations Research 40 (1992) $22-$27. [7] Chakrapani, J., and Skorin-Kapov, J., "Massively parallel tabu search for the quadratic assignment problem", Technical Report HAR-91-06, Harriman School for Management and Policy, NY 11794, 1991. [8] Crouse, J., and Pardalos, P., "A parallel algorithm for the quadratic assignment problem", in: Proceedings of Supercomputing 89, 351-360, ACM, 1989. [9] Elshafei, A., "Hospital layout as a quadratic assignment problem", Operational Research Quarterly 28 (1977) 167-179. [10] Finke, G., Burkard, R., and Rend1, F., "Quadratic assignment problems", Annals of Discrete Math. 28 (1987) 61-82. [11] Gilmore, P.C., "Optimal and suboptimal algorithms for the quadratic assignment problem", SIAM Journal on Applied Math. 10 (1962) 305-313. [12] Koopmans, T.C., and Beckman, M.J., "Assignment problems and the location of economic activities", Econometrica 25 (1957) 53-76. [13] Lawler, E., "The quadratic assignment problem", Management Science 9 (1963) 586-599. [14] Mans, B., "Contribution a l'algorithmique non num6dque parallele: Parall61isations de m6thodes de recherche arborescentes", Thbse d'universit6, Universit6 Paris VI, 4, place Jussieu, 75252 Pal-is cedex 05, June 1992. [15] Mautor, T., "Contribution ~ la r6solution des probl~mes d'implantation: Algorithmes s6quentiels et paraU~les pour l'affecta- tion quadratique", Th~se d'universit6, Universit6 Paris VI; 4, place Jussieu, 75252 Paris Cedex 05, February 1993. [16] Mautor, T., and Roucairol C., "A new exact algorithm for the solution of quadratic assignment problems", Discrete Applied Mathematics to appear, 1993. MASI-RR-92-09-Universit6 de Paris 6, 4 place Jussieu, 75252 Paris C6dex 05. [17] Mulhenbein, H., "Parallel genetic algorithms, population genetics and combinatorial optimization", in: J.D. Becket I. Eisele, and F.W. Mundemann, (eds.), Parallelism, Learning, Evolution; Workshop on Evolutionary Models and Strategies and Workshop on Parallel Processing: Logic, Organization and Technology WOPPLOT89. Springer-Verlag, Berlin, July 1989. [18] Nugent, C., Vollmann, T., and Ruml. J., "An experimental comparison of techniques for the assignment of facilities to locations", Operations Research 16 (1968) 150-173. [19] Palubetskes, G.S., "Generation of quadratic assignment test problems with known optimal solution". Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fisiki 28 11 (1988) 1740-t743, (in Russian). [20] Rendl. F., and Wolkowicz, H., "Applications of parametric programming and eigenvalue maximization to the quadratic assignment problem", Mathematical Programming (1992) 63-78. [21] Roucairol, C., "A parallel branch and bound algorithm for the quadratic assignment problem", Discrete Applied Mathematics 18 (1987) 211-225. [22] Sahni, S., and Gonzalez, T., "P-complete approximation problems", Journal of the ACM 23 (1976) 555-565. [23] Scriabin, M., and Vergin, R.C., "Comparison of computer algorithms and visual based methodes for plant layout", Management Science 22 (1975) 172-187. [24] Taillard, E.,' "Robust tabu search for the quadratic assignment problem", Parallel Computing 17 (1991) 443-455. [25] Wang, Jun, "A parallel distributed processor for the quadratic assignment problem", in: Proceedings of the INNC90, International Neural Network Conference 1, 278-281. Paris. France, July 1990.