## Theory of Randomized Search Heuristics (ThRaSH) Seminar Series 2021

Randomized search heuristics such as stochastic gradient methods, simulated annealing, evolutionary algorithms, stochastic neural networks for optimization, ant colony and swarm optimization, and the cross-entropy method are frequently used across many scientific communities. They have been successfully applied in various domains, both for combinatorial and numerical optimization. Despite their success in practice, proving that such algorithms satisfy certain performance guarantees is still a difficult and widely open problem.

The mission of the ThRaSH workshop series is to contribute to the theoretical understanding of randomized search heuristics, in particular of their computational complexity. The aim is to stimulate interactions within the research field and between people from different disciplines working on randomized algorithms. The primary focus is on discussing recent ideas and detecting challenging topics for future work, rather than on the presentation of final results.

Organisers:

### Mailing list

If you would like to receive information about the 2021 ThRaSH seminars, please subscribe to our mailing list by sending an email to majordomo@lists.bham.ac.uk with the following text in the body of the email:

subscribe thrash-seminar-2021

### Seminars Spring 2021

There will be no seminar on May 13th 2021 due to public holidays in several countries.

#### July 1st: Representing fitness landscapes by valued constraints to understand open-ended evolution

Artem Kaznatcheev (Pennsylvania/Oxford) July 1st 2021, (7pm BST, 20:00 CEST, 2pm EDT) (NB! Note the unusual time.)

Experiments show that evolutionary fitness landscapes can have a rich combinatorial structure due to epistasis. For some landscapes, this structure can produce a computational constraint that prevents evolution from finding local fitness optima thus overturning the traditional assumption that local fitness peaks can always be reached quickly if no other evolutionary forces challenge natural selection. This creates a distinction between easy landscapes where local fitness peaks can be found quickly, and the hard landscapes of open-ended evolution where finding local optima is infeasible (PLS-complete). To understand what separates easy landscape from hard landscapes, we represent landscapes using valued constraints.

First, we show that for fitness landscapes representable by binary Boolean valued constraints there is a minimal necessary constraint graph that can be easily computed. Second, we consider landscapes as equivalent if they allow the same (improving) local search moves; we show that a minimal constraint graph still exists, but is NP-hard to compute.

We then develop several techniques to bound the length of any sequence of improving local search moves. We show that a level-based analysis bound can be obtained from the numerical values of the constraints in the representation, and show how this bound may be tightened by considering equivalent representations. This level-based analysis gives a quadratic bound on the number of improving moves made by any local search for a degree 2 constraint graph. But to extend the quadratic bound to any tree-structured constraint graph requires us to introduce a new encouragement-path analysis instead of the level-based analysis.

Finally, we build two families of examples to show that the conditions in our tractability results are essential. With domain size three, even just a path of binary constraints can model a landscape with an exponentially long sequence of improving moves. With a treewidth-two constraint graph, even with a maximum degree of three, binary Boolean constraints can model a landscape with an exponentially long sequence of improving moves.

This talk is based on joint work with Dave Cohen and Peter Jeavons:

Kaznatcheev, A. (2019). Computational complexity as an ultimate constraint on evolution. Genetics, 212(1), 245-265.

Kaznatcheev, A., Cohen, D., & Jeavons, P. (2020). Representing fitness landscapes by valued constraints to understand the complexity of local search. Journal of Artificial Intelligence Research, 69, 1077-1102.

#### Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size

Christoph Hertrich (TU Berlin, Germany) June 24 2021, (1pm BST, 14:00 CEST, 8am EDT)

The development of a satisfying and rigorous mathematical understanding of the performance of neural networks is a major challenge in computer science. Against this background, we study the expressive power of neural networks through the example of the classical NP-hard Knapsack Problem. Our main contribution is a class of recurrent neural networks (RNNs) with rectified linear units that are iteratively applied to each item of a Knapsack instance and thereby compute optimal or provably good solution values. We show that an RNN of depth four and width depending quadratically on the profit of an optimum Knapsack solution is sufficient to find optimum Knapsack solutions. We also prove the following tradeoff between the size of an RNN and the quality of the computed Knapsack solution: for Knapsack instances consisting of $$n$$ items, an RNN of depth five and width $$w$$ computes a solution of value at least $$1 - O( n^2 / \sqrt{w} )$$ times the optimum solution value. Our results build upon a classical dynamic programming formulation of the Knapsack Problem as well as a careful rounding of profit values that are also at the core of the well-known fully polynomial-time approximation scheme for the Knapsack Problem. A carefully conducted computational study qualitatively supports our theoretical size bounds. Finally, we point out that our results can be generalized to many other combinatorial optimization problems that admit dynamic programming solution methods, such as various Shortest Path Problems, the Longest Common Subsequence Problem, and the Travelling Salesperson Problem.

#### Lazy Parameter Tuning and Control: Choosing All Parameters Randomly From a Power-Law Distribution

Denis Antipov (ITMO University, Saint Petersburg, Russia) June 17 2021, (1pm BST, 14:00 CEST, 8am EDT)

Most evolutionary algorithms have multiple parameters and their values drastically affect the performance. Due to the often complicated interplay of the parameters, setting these values right for a particular problem is a challenging task. This task becomes even more complicated when the optimal parameter values change significantly during the run of the algorithm since then a dynamic parameter choice is necessary.

In this work, we propose a lazy but effective solution, namely choosing all parameter values in each iteration randomly from a suitably scaled power-law distribution. We demonstrate the effectiveness of this approach via runtime analyses of the $$(1 + (\lambda, \lambda))$$ genetic algorithm with all three parameters chosen in this manner. We show that this algorithm on the one hand can imitate simple hill-climbers like the (1+1) EA, giving the same asymptotic runtime on some simple problems. On the other hand, this algorithm is also very efficient on jump functions, where the best static parameters are very different from those necessary to optimize simple problems. We prove a performance guarantee that is comparable to, and sometimes even better than, the best performance known for static parameters. We complement our theoretical results with a rigorous empirical study confirming what the asymptotic runtime results suggest.

(Joint work with Benjamin Doerr (Laboratoire d’Informatique (LIX), CNRS, cole Polytechnique, Palaiseau, France) and Maxim Buzdalov (ITMO University, Saint Petersburg, Russia))

#### Populations in Dynamic Optimization

Johannes Lengler (ETH, Switzerland) June 10 2021, (1pm BST, 14:00 CEST, 8am EDT)

I will talk about optimization in an environment where the fitness function changes in every round. While we cannot expect reasonable behaviour in the general case, I will discuss two benchmarks, Dynamic Binary Value and Dynamic Linear Functions, in which we can hope that an optimization heuristic may find the global optimum. We understand the case of the (1+1)-EA very well. However, when we increase the population size and go to the ($$\mu$$+1)-EA, some quite unexpected population dynamics happen. In some cases, a larger population size makes it much easier to find the optimum. In other cases, a larger population size makes it much harder, and practically infeasible to find the opimum. Even for $$\mu=2$$, the population dynamics has quite counter-intuitive effects.

Our understanding is good enough to know that the population dynamics must have some strong and counterintuitive effect. But we do not understand the dynamics, and we don’t even have an intuitive explanation where these effects might come from. I will give an overview of the known results, and particularly about the unsolved questions in this setting.

#### Generalized Jump Functions

Henry Bambury and Antoine Bultel (Ecole Polytechnique, France) June 3 2021, (1pm BST, 14:00 CEST, 8am EDT)

Jump functions are the most studied non-unimodal benchmark in the theory of randomized search heuristics, in particular, evolutionary algorithms (EAs). They have significantly improved our understanding of how EAs escape from local optima. However, their particular structure – to leave the local optimum one can only jump directly to the global optimum – raises the question of how representative such results are.

For this reason, we propose an extended class $$\text{Jump}_{k,\delta}$$ of jump functions that contain a valley of low fitness of width $$\delta$$ starting at distance $$k$$ from the global optimum. We prove that several previous results extend to this more general class: for all $$k = o(n^{1/3})$$ and $$\delta < k$$, the optimal mutation rate for the (1+1) EA is $$\frac{\delta}{n}$$, and the fast (1+1) EA runs faster than the classical (1+1) EA by a factor super-exponential in $$\delta$$. However, we also observe that some known results do not generalize: the randomized local search algorithm with stagnation detection, which is faster than the fast (1+1) EA by a factor polynomial in $$k$$ on $$\text{Jump}_k$$, is slower by a factor polynomial in $$n$$ on some $$\text{Jump}_{k,\delta}$$ instances.

Computationally, the new class allows experiments with wider fitness valleys, especially when they lie further away from the global optimum.

(Joint work with Benjamin Doerr)

#### When Move Acceptance Selection Hyper-Heuristics Outperform Metropolis and Elitist Evolutionary Algorithms and When Not

John Alasdair Warwicker (Karlsruhe Institute of Technology, Germany) May 27th 2021, (1pm BST, 14:00 CEST, 8am EDT)

Selection hyper-heuristics (HHs) are automated algorithm selection methodologies that choose between different heuristics during the optimisation process. Recently Selection HHs have been shown to optimise standard unimodal benchmark functions from evolutionary computation in the optimal expected runtime achievable with the available low-level heuristics. In this talk, we extend our understanding of the performance of HHs to the domain of multimodal optimisation by considering a Move Acceptance HH (MAHH) from the literature that can switch between elitist and non-elitist heuristics during the run.

We first identify the range of parameters that allow MAHH to hillclimb efficiently. Afterwards, we use standard multimodal benchmark functions to highlight function characteristics where MAHH is efficient by quickly escaping local optima and ones where it is not. We then design a more general benchmark function class where the size and the gradient of the slopes leading to and away from the local optima can be tuned. This function class also allows us to highlight natural examples of when the use of non-elitism makes a difference between small polynomial expected runtimes versus the exponential runtime required by elitist search heuristics. We complete the picture by providing an example function where the sensitivity of Metropolis to differences in fitness values is crucial, allowing us to illustrate when it is able to significantly outperform the HH.

Since the MAHH is essentially a non-elitist random local search heuristic, the talk is of independent interest to researchers in the fields of artificial intelligence and randomised search heuristics.

(Joint work with Pietro Oliveto and Andrei Lissovoi.)

#### Non-elitist Evolutionary Algorithms Excel in Fitness Landscapes with Sparse Deceptive Regions and Dense Valleys

Per Kristian Lehre (University of Birmingham, UK) May 6th 2021, (1pm BST, 14:00 CEST, 8am EDT)

It is largely unknown how the runtime of evolutionary algorithms (EAs) depends on fitness landscape characteristics for broad classes of problems. Runtime guarantees for complex and multi-modal problems where EAs are typically applied are rarely available.

We present a parameterised problem class SparseLocalOpt$$_{\alpha,\varepsilon}$$, where the class with parameters $$\alpha,\varepsilon\in[0,1]$$ contains all fitness landscapes with deceptive regions of sparsity $$\varepsilon$$ and fitness valleys of density $$\alpha$$. We study how the runtime of EAs depends on these fitness landscape parameters.

We find that for any constant density and sparsity $$\alpha,\varepsilon\in(0,1)$$, SparseLocalOpt$$_{\alpha,\varepsilon}$$ has exponential elitist ($$\mu$$+$$\lambda$$) black-box complexity, implying that a wide range of elitist EAs fail even for mildly deceptive and multi-modal landscapes. In contrast, we derive a set of sufficient conditions for non-elitist EAs to optimise any problem in SparseLocalOpt$$_{\alpha,\varepsilon}$$ in expected polynomial time for broad values of $$\alpha$$ and $$\varepsilon$$. These conditions can be satisfied for tournament selection and linear ranking selection, but not for ($$\mu$$,$$\lambda$$)-selection.

(Joint work with Duc-Cuong Dang and Anton Eremeev.)

#### Analysis of Evolutionary Algorithms on Fitness Function with Time-linkage Property

Weijie Zheng (Southern University of Science and Technology) April 22nd 2021, (1pm BST, 14:00 CEST, 8am EDT)

In real-world applications, many optimization problems have the time-linkage property, that is, the objective function value relies on the current solution as well as the historical solutions. Although the rigorous theoretical analysis on evolutionary algorithms has rapidly developed in recent two decades, it remains an open problem to theoretically understand the behaviors of evolutionary algorithms on time-linkage problems. This paper takes the first step to rigorously analyze evolutionary algorithms for time-linkage functions. Based on the basic OneMax function, we propose a time-linkage function where the first bit value of the last time step is integrated but has a different preference from the current first bit. We prove that with probability $$1-o(1)$$, randomized local search and (1+1) EA cannot find the optimum, and with probability $$1 -o(1)$$, ($$\mu$$ + 1) EA is able to reach the optimum.

(Joint work with Huanhuan Chen and Xin Yao).

#### Tight Bounds on the Expected Runtime of a Standard Steady State Genetic Algorithm

Dirk Sudholt (University of Passau) April 15th 2021, (1pm BST, 14:00 CEST, 8am EDT)

Recent progress in the runtime analysis of evolutionary algorithms (EAs) has allowed the derivation of upper bounds on the expected runtime of standard steady-state Genetic Algorithms (GAs). These upper bounds have shown speed-ups of the GAs using crossover and mutation over the same algorithms that only use mutation operators (i.e., steady-state EAs) both for standard unimodal (i.e., OneMax) and multimodal (i.e., Jump) benchmark functions. The bounds suggest that populations are beneficial to the GA as well as higher mutation rates than the default $$1/n$$ rate. However, making rigorous claims was not possible because matching lower bounds were not available. Proving lower bounds on crossover-based EAs is a notoriously difficult task as it is hard to capture the progress that a diverse population can make. We use a potential function approach to prove a tight lower bound on the expected runtime of the (2+1) GA for OneMax for all mutation rates $$c/n$$ with $$c < 1.422$$. This provides the last piece of the puzzle that completes the proof that larger population sizes improve the performance of the standard steady-state GA for OneMax for various mutation rates, and it proves that the optimal mutation rate for the (2+1) GA on OneMax is $$(\sqrt{97}-5)/(4n) \approx 1.2122/n$$.

#### Evolutionary Minimization of Traffic Congestion

Martin Krejca (Sorbonne University) April 8th 2021, (1pm BST, 14:00 CEST, 8am EDT)

Traffic congestion is a major issue that can be solved by suggesting drivers alternative routes they are willing to take. This concept has been formalized as a strategic routing problem in which a single alternative route is suggested to an existing one. In this talk, we extend this formalization and introduce the Multiple-Routes problem, which is given a start and a destination and then aims at finding up to $$n$$ different routes that the drivers strategically disperse over, minimizing the overall travel time of the system. Due to the NP-hard nature of the problem, we introduce the Multiple-Routes evolutionary algorithm (MREA) as a heuristic solver. We study several mutation and crossover operators and evaluate them on real-world data of the city of Berlin, Germany.

Please note that this work is empirical. However, the formalization of the objective function stems from game theory, with some proven guarantees. Thus, there might be the possibility of approaching our setting also from a theoretical perspective.

slides

#### Pareto Optimization for Subset Selection with Dynamic Partition Matroid Constraints

Viet Anh Do (University of Adelaide) April 1st 2021, (1pm BST, 14:00 CEST, 8am EDT)

In this talk, we consider the subset selection problems with submodular or monotone discrete objective functions under partition matroid constraints. We focus on POMC, a simple Pareto optimization approach that has been shown to be effective on such problems. Our analysis departs from singular constraint problems and extends to problems with multiple constraints. We add to the existing results of POMC’s worst-case performance with parameters related to the partition matroid constraints. Additionally, we show that when the constraint thresholds change dynamically, the algorithm is able to maintain the approximation guarantees w.r.t. the new constraints and optima in polynomial time. Lastly, we highlight the difficulties in worst-case analyses in the presence of multiple constraints.

#### Result Diversification by Multi-objective Evolutionary Algorithms

Chao Qian (Nanjing University) March 25th 2021, (1pm GMT, 14:00 CET, 9am EDT)

Given a ground set of items, result diversification aims to select a subset with high quality" and diversity" while satisfying some constraints. It arises in various real-world applications, such as web-based search, document summarization, feature selection and facility location, just to name a few. The quality can often be characterized by a monotone submodular function, and the diversity is usually measured by the sum of distances between the selected items. In this talk, we will introduce how to apply multi-objective evolutionary algorithms (MOEAs) to solve the result diversification problem. We will show that MOEAs can achieve the (asymptotically) optimal polynomial-time approximation ratio for the problem with cardinality constraints as well as the more general matroid constraints. Furthermore, we will show that when the quality or distance changes dynamically, MOEAs can maintain the optimal approximation ratio in polynomial running time, which addresses the open question proposed by [Borodin et al. TALG'17].

### Seminars Autumn 2020

These seminars were organised by Carsten Witt (DTU) and Timo Koetzing (HPI).

#### Exponential Upper Bounds

Speaker: Benjamin Doerr, Ecole Polytechnique, France October 9, 2020

Throughout the (young) history of runtime analysis of evolutionary algorithms (EAs), exponential lower bounds have been used to argue that a certain algorithm is not suited for a certain task, and this was generally seen as the end of this story. In this talk, I argue that there are at least two reasons to not stop thinking here. One is the general recent trend in classic algorithmics to also study exponential-time algorithms (for various reasons). The other is that an exponential lower bound still leaves open the basic question what is the runtime in this case. It could be much higher than exponential, and indeed, runtimes like $$n^{\Theta(n)}$$ have been observed with standard EAs as well. There is some progress on the second point: Via a simple hope-for-extreme-luck argument, I easily prove several exponential upper bounds, many of them matching well-known lower bounds. As one particular result, I show that any of the algorithms randomized local search, Metropolis algorithm, simulated annealing, and (1+1) evolutionary algorithm can optimize any pseudo-Boolean weakly monotonic function under a large set of noise assumptions in a runtime that is at most exponential. This extends a previous result, limited to the (1+1) EA, the LeadingOnes function, and one-bit or bit-wise prior noise with noise probability at most $$1/2$$. Unfortunately, all these bounds are not yet of the type $$C^n$$ for a constant C < 2, and thus they are not yet interesting in the perspective of classic algorithms. This is clearly an interesting problem for future work. The hope-for-extreme-luck argument so far seems to work mostly for (1+1)-type algorithms. Already for the $$(1+\lambda)$$ EA, some technical difficulties arise. Overcoming these (or proving super-exponential bounds) is a second problem we have to leave open. A preliminary version of these results was presented at PPSN 2020. The full paper can be found on the arxiv preprint server.

slides

#### Web-survey on the analysis of randomized search heuristics

Timo Koetzing, Hasso Plattner Institute Potsdam, Germany October 16, 2020

Presentation of ongoing work to establish a web survey covering the most important concepts in the theoretical analysis of randomized search heuristics, followed by discussion in break-out groups

#### Exponential Upper Bounds via Drift Analysis

Carsten Witt, Technical University of Denmark October 23, 2020

Drift analysis is a key tool for the runtime analysis of randomized search heuristics. Usually, it is used to translate information on the expected one-step progress of the underlying stochastic process, the so-called drift towards the optimum, into a statement on the expected optimization time of the search heuristic. However, drift analysis is more versatile. In the talk, I will discuss the use of potential functions in drift theorems to bound the optimization time in scenarios where the process does not have a drift towards the optimum. This results in exponential upper bounds on the expected optimization time. Moreover, I would like to discuss how this approach compares with the recent technique by Doerr (PPSN 2020) that derives exponential upper bounds based on lucky sequences of improving steps.

slides

#### Time-flexible Drift Theorems

Martin Krejca, Hasso Plattner Institute Potsdam, Germany October 30, 2020

Drift theory is a collection of theorems that bound the expected stopping time of time-discrete random processes over the reals that have a consistent bias – the drift – in decreasing in expectation in each step. The beauty of drift theory stems from translating a bound for the drift immediately into a bound for the expected stopping time. In other words, local information about the one-step change of the process is turned into global information about the behavior of the entire process. However, a downside of drift theory is that the application of a drift theorem is usually preceded by defining a useful potential function that transform the random process into one that has the desired drift. This step is not straightforward.

slides

November 4, 2020

#### Self-Adjusting Evolutionary Algorithms for Multimodal Optimization

Amirhossein Rajabi, Technical University of Denmark November 13, 2020

Recent theoretical research has shown that self-adjusting and self-adaptive mechanisms can provably outperform static settings in evolutionary algorithms for binary search spaces. However, the vast majority of these studies focuses on unimodal functions which do not require the algorithm to flip several bits simultaneously to make progress. In fact, common self-adjusting algorithms are not designed to detect local optima and do not have any obvious benefit to cross large Hamming gaps. A mechanism called stagnation detection is suggested, which can be added as a module to existing evolutionary algorithms. Added to a simple (1+1) EA and RLS, we proved an efficient expected runtime on leaving local optima without changing the behaviour of the algorithm on unimodal functions.

slides