Differential Evolution Optimization from Scratch with Python
August 27, 2017
Besides particle swarm optimization (PSO) which I touched on previously, differential evolution (DE) is one of my go-to favorites. Just like PSO, differential evolution falls within the evolutionary algorithms (EA) family. Differential evolution is very similar to genetic algorithms (GA) which are based on the principles of evolutionary biology such as mutation, crossover, and selection. The downside of genetic algorithms is that at their core, they are based on a bit level information structure. Because of this, GAs excel at combinatorial optimization problems such as the traveling salesman problem. The downside is that GAs don’t natively support real valued (float values) cost functions. Sure, genetic algorithms can be modified to support float values, but in my experience it just isn’t worth it. This is where differential evolution comes it. Differential evolution is basically a genetic algorithm that natively supports float value based cost functions. In this tutorial, I hope to teach you the fundamentals of differential evolution and implement a bare bones version in Python.
The basic structure of differential evolution can be summed up below:
1) Initialize a random population of individuals throughout the search space.
2) while iter <= max num of generations
3) cycle through each individual in the population
3.A) perform mutation
3.B) perform recombination ("crossover" in GA lingo)
3.C) perform selection
4) if stopping criterion has been met:
exit and return best individual
iter = iter + 1
go back to step #3
So let’s go through each step while implementing it in Python. Before we can get to step one, we need to construct the main function which will contain the bulk of the code:
Simple, let’s move to step one. Given a list of tuples representing the search space bounds for each input variable \( x_n \) such that:
Initializing a population of size (popsize) given user specified bounds:
It would be prudent to note at this point that the term individual which is simply just a one-dimensional list, or array of values will be used interchangeably with the term vector, since they are essentially the same exact thing. Within the Python code, this may take the form of vec or just simply v.
Next, we need to begin the main loop of the algorithm represented by step #2, while we’re at it, we’ll knock out step #3A. There exists many different flavors of mutation for differential evolution but we’re going to stick with the simplest for now. In this version of mutation, we need to select three individuals \( x_1 \), \( x_2 \), and \( x_3 \) from the current generation that are both unique to themselves, but also unique to the currently selected individual that we’re mutating. From here we subtract individuals \( x_2 \) and \( x_3 \) from each other, and multiply the difference by the user controlled mutation factor \( F \) which ranges from [0,2]. This is then added to the individual \( x_1 \) which forms the new individual \( v \) or what’s called the donor. Below is the equation form of this mutation scheme.
$$ v = x_1 + F(x_2 - x_3) $$
One thing we need be aware of at this point is the possibility of this new donor vector existing outside of the bounds specified at initialization. Because of this, we need to create a separate function that checks and corrects for this. In the event that one of these rogue points are found, we’ll simply move it to the nearest boundary whether it’s the minimum or maximum. In the code, this new function will be called ensure_bounds. Implementing all this into our code looks like this:
Where maxiter represents the number of generations we want to run the algorithm for and mutate represents the mutation factor \( F \). Next step is recombination or “crossover” in the language of genetic algorithms. For this, we need to introduce a new user selected value known as the “recombination rate” (some texts refer to this as the crossover ratio or CR). This recombination rate is a float value that varies between zero and one and determines if recombination occurs. Cycling through each index position in the target vector, we generate a random value between zero and one. If this random value is less than the recombination rate, recombination occurs and we swap out the current variable in our target vector with the corresponding variable in the donor vector. If the randomly generated value is greater than the recombination rate, recombination does not happen and the variable in the target vector is left alone. This new offspring individual is called the trial vector.
The last real step (#3.C) is selection which consists of evaluating our new trial individual v_trial against the currently selected individual (target individual, x_t). We’ll go with the easiest selection scheme for this tutorial which means we’re using greedy selection. This basically means that if the new trial individual performs better than the currently selected target individual, we delete our target individual from the population and replace it with the trial individual. If the target individual is better, than we leave it alone and move on.
You might have noticed that I’ve skipped over step #4 which is the stopping criteria. This is again, in the interest of keeping the code as simple as possible. Because of this, the stopping criteria is simply when we have cycled through all the generations specified at initialization. Any number of stopping mechanisms from other optimization routines can be implemented here.
Putting everything together, adding a few lines for text output, score keeping, and some example cost functions, the final code looks like this (github repository -> here):
That’s it! Differential evolution is a very simple evolutionary routine that produces great results when used correctly. I hope this was helpful. Thanks for reading!
In step #3, this implementation differs from most DE algorithms in that we cycle through each member of the population, generate a donor vector, then perform selection. In this setup, every member of the population becomes a target vector at some point which means that every individual has the possibility of being replaced. In most DE implementations, the target vector is randomly chosen. In my experience using DE (both in aerodynamic shape optimization, as well as in deep learning applications), I have found that the current implementation works much better. The standard DE implementation might be slightly more stochastic in nature, but whatever… If, you’re interested in the “standard” DE implementation swap out the lines in the above code with the following: