Genetic Algorithms

Pascal Francq

January 23, 2012 (May 19, 2011)

Abstract

Genetic Algorithms (GA) are a family of meta-heuristics for solving optimization problems.

1 Definition of Genetic Algorithms

Before studying the different elements of Genetic Algorithms (GA) introduced by John H. Holland [1], it may be interesting first of all to define what GA are. GA can be defined as exploration algorithms based on the mechanisms of natural selection and genetics [2]. GA use the survival of the best adapted structures and a pseudo-random exchange of information to search the solution space. At each generation, a new set of artificial individuals is constructed using parts of the best elements of the previous generation and, sometimes, new characteristics. GA are thus not based on random exploration alone: they also use information from the past to evaluate the best positions to explore in the future. At each generation, the population is evaluated against an objective function to determine its adaptation.

2 Origin of Genetic Algorithms

GA are based on the natural evolution of species as described by Charles Darwin [3]: in any population, the best adapted (fittest) individuals have a greater chance of reproducing than less adapted ones; this is called natural selection. In nature, new characteristics are transmitted to offspring by crossing the genes of two individuals. This process is not deterministic, and some uncertainty in these characteristics is introduced. Furthermore, mutations may occur as genes are passed from parents to children, i.e. a small modification occurs in the copy and the child acquires a characteristic not present in the parents. If this mutation has a positive influence, the child will be better adapted than its fellows without it, and the newly introduced characteristic may be present in the next generation as a result of natural selection. Natural selection is thus based on the fact that some characteristics in a population reflect the specifications of the strongest (best adapted) individuals, and should generally speaking lead to a better adapted population.
As in nature, where individuals may be represented by their chromosomes alone, GA work with a population of abstract representations of solutions called chromosomes. At each iteration, called a generation, the chromosomes representing the best solutions [A]  [A] The solutions with the best values for a given cost function. are selected and used in a crossover to produce new solutions. Their children replace the less adapted chromosomes. By exchanging “good information” between chromosomes, GA try to create new chromosomes with better values with respect to the fitness function.
Note that the “bad” genes are not taken into account by GA.

3 Genetic Algorithms and other Methods

There are four major differences between GA and traditional approaches to solving optimization problems:
1. GA use a coding of the parameters rather than the parameters themselves, i.e. GA do not have to know the meaning of the parameters.
2. GA work on a population of solutions and not on a single one, and so explore a larger portion of the search space.
3. GA use a value of the function to be optimized and not another form of this function. The function does not need to have any special analytical properties.
4. GA use probabilistic transitions and not deterministic ones. For the same instance of a problem (same initial conditions and data), the results may be different after two different runs [B]  [B] In fact, this can make debugging more difficult because it is impossible to reproduce the bugs from one run to another. Because of this, there is generally a debug mode in GA where, under the same initial conditions, two different runs will produce the same results. This can be done by implementing a pseudo-random generator which reproduces the same sequence of numbers on demand.

4 The Structure of Genetic Algorithms

The generic structure of a GA is shown in Figure 1↓.
The different steps are:
1. A population is constructed which is made up of a fixed number of solutions, known as chromosomes. Each chromosome is evaluated through a cost function describing the optimum targeted.
2. GA reproduce some of the statistically best chromosomes. These chromosomes replace others, statistically the worst, since the size of the population has to remain constant [C]  [C] This is not strictly necessary, but in a computer environment, memory and speed problems increase with the size of the population. This is why fixed-size populations are generally used.
3. GA construct new chromosomes by making a crossover between existing ones.
4. With a given probability, some of the chromosomes are affected by a mutation.
5. With a given probability, some of the chromosomes are affected by an inversion.
6. Each newly created chromosome is evaluated by means of the cost function.
7. GA stop when a final condition indicating that an acceptable solution exists is reached; otherwise GA produce a new generation by repeating from step 2.
All modern GA hold a separate copy of the best chromosome ever computed so as to avoid its accidental destruction when it forms part of the population.
Both the operators and the encoding scheme are equally important.
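The steps above can be sketched as a generic loop. The following Python sketch is illustrative only: the create, evaluate, select, crossover and mutate functions, and all names used here, are assumptions standing in for problem-specific operators.

```python
import random

def genetic_algorithm(create, evaluate, select, crossover, mutate,
                      pop_size=4, mutation_prob=0.0333,
                      max_generations=100, target=None):
    """Generic GA loop following steps 1-7 above."""
    # Step 1: construct and evaluate an initial population.
    population = [create() for _ in range(pop_size)]
    # Keep a separate copy of the best chromosome ever computed.
    best = max(population, key=evaluate)
    for _ in range(max_generations):
        # Steps 2-3: reproduction selects parent pairs, crossover builds children.
        parents = select(population, evaluate)
        children = [crossover(a, b) for a, b in parents]
        # Steps 4-5: with a given probability, apply a mutation (or inversion).
        children = [mutate(c) if random.random() < mutation_prob else c
                    for c in children]
        # Step 6: the children replace the statistically worst chromosomes.
        population.sort(key=evaluate)
        population[:len(children)] = children
        best = max(population + [best], key=evaluate)
        # Step 7: stop when an acceptable solution is reached.
        if target is not None and evaluate(best) >= target:
            break
    return best
```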
The remainder of this article is devoted to a brief introduction to each operator. To illustrate each of them, a well-known example will be developed progressively to help the reader master the concepts: the optimization of the function f(x) = x², where x is a number coded using five bits (0 ≤ x ≤ 31), each bit representing a gene of the chromosome [2]. In this particular case the function f(x) can, of course, be used as a cost function, which is maximal for the number 31, i.e. when all the bits coding the chromosome are set to 1. For the example, the size of the population is fixed at 4.
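The coding and the cost function of the example can be sketched as follows in Python. The sketch assumes the classic f(x) = x² function from Goldberg's book [2]; the function names are illustrative.

```python
def decode(chromosome):
    """Interpret a list of five bits (the genes) as an integer 0..31."""
    value = 0
    for bit in chromosome:
        value = (value << 1) | bit
    return value

def cost(chromosome):
    """Cost function f(x) = x**2, maximal when all five bits are set (x = 31)."""
    return decode(chromosome) ** 2
```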
Remark: The reader should note that the operators have to be chosen so that they can reach the optimal solution, i.e. the operators described in the next sections cannot be used to solve every optimization problem.

5 Initial Construction and Evaluation

The first step is to construct and evaluate an initial population. The easiest way to construct this population is to randomly create the different chromosomes. But, when the initial population is of poor quality, GA need many more generations to reach the end condition. The initialization is therefore a crucial step in GA. In general, problem-dependent heuristics are used to initialize these solutions.
For our example, random initialization is acceptable. Table 1↓ shows the initial population and the value of the cost function for each chromosome.
Table 1 Initial population for the example (columns: chromosome, coding, ranking).
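The random construction and evaluation of such an initial population can be sketched as follows (a Python sketch with illustrative names; the cost function assumes the f(x) = x² example):

```python
import random

def initial_population(size=4, genes=5):
    """Step 1: randomly create `size` chromosomes of `genes` bits each."""
    return [[random.randint(0, 1) for _ in range(genes)] for _ in range(size)]

def evaluate(population):
    """Evaluate every chromosome with the cost function f(x) = x**2."""
    def value(chromosome):
        x = int("".join(str(bit) for bit in chromosome), 2)
        return x ** 2
    return [(chromosome, value(chromosome)) for chromosome in population]
```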

6 Reproduction

Historically, the reproduction step was the creation of a simple copy of a set of selected chromosomes, replacing a set of other selected chromosomes. In fact, no modern GA performs the reproduction step by really copying chromosomes. Instead, the role of reproduction is to select how the different chromosomes will be treated by the crossover.
Different methods exist for the construction of these sets; only two of them are dealt with here: proportional selection and the tournament.

6.1 Proportional Selection

Proportional selection is based on the idea that an individual which is twice as good as another should have double the chance of being reproduced [1]. This selection is usually implemented using the “roulette wheel” strategy. The wheel is divided into as many sections as there are chromosomes, each section being labeled with a chromosome and having an arc length directly proportional to the fitness value of the corresponding individual. This is illustrated in Figure 2↓.
A ball is run around the wheel and stops at a random point, thus determining the section selected. Applied several times, this method selects the individuals that will be used as parents in the crossover. The individuals with high fitness values naturally have more chance of being selected, and those with low fitness values have more chance of disappearing in the next generation. The selected chromosomes are then randomly regrouped to form pairs, which are used as the parents for the crossovers. The number of chromosomes to be selected for the crossover is obtained from the crossover probability. For example, in a population of 10 chromosomes with a crossover probability of 0.6, 6 chromosomes will be selected as parents.
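The “roulette wheel” strategy described above can be sketched as follows (a Python sketch; the fitness function and all names are illustrative):

```python
import random

def roulette_wheel(population, fitness, n_parents):
    """Select n_parents chromosomes, each section of the wheel having an
    arc length proportional to the fitness of the corresponding individual."""
    total = sum(fitness(c) for c in population)
    selected = []
    for _ in range(n_parents):
        ball = random.uniform(0, total)          # where the ball stops
        cumulative = 0.0
        for chromosome in population:
            cumulative += fitness(chromosome)
            if ball <= cumulative:
                selected.append(chromosome)
                break
    # Randomly regroup the selected chromosomes into parent pairs.
    random.shuffle(selected)
    return [(selected[i], selected[i + 1])
            for i in range(0, len(selected) - 1, 2)]
```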
Let us apply proportional selection to our example (Table 2↓). The length of the arc for each chromosome is computed as its fitness value divided by the total fitness of the population. By multiplying this length by the size of the population, it is possible to compute the number of copies expected for each chromosome.
Table 2 Proportional selection on the example (columns: chromosome, copies).
As expected, the result of the selection is that the chromosome with the highest cost value is copied twice, while the chromosome with the lowest cost value is not copied at all. So, after the proportional selection, there are two copies of the best chromosome and no copy of the worst one, i.e. at this stage two chromosomes of the population are identical. The selected chromosomes are then regrouped into the two pairs forming the parents for the potential crossovers.
A disadvantage of this method is that there is no guarantee that the best individual will be reproduced, because proportional selection uses a probabilistic approach. Even if the best chromosome ever computed is stored outside the population, the best chromosome in the population, which may be different, can still fail to be reproduced. On the other hand, this randomness can be an advantage: the best current solution is not always on the path to the global optimum, but sometimes only to a local one, and a probabilistic selection makes it possible to escape such local optima.

6.2 Tournament

The tournament strategy is illustrated in Figure 3↓. The idea is to create an ordered list of the individuals, with the best solution always at the top and the others ordered according to the method described below [4]. The upper part of the list is then used when “good” chromosomes are needed, and the lower part when “bad” chromosomes are needed.
An initial set of all the individual identifiers is established. Two identifiers of this set are chosen randomly and their fitness values are compared. The better of the two, the one corresponding to the individual with the best fitness value, is reinserted into the set, while the other is pushed into the list. This is repeated until all the identifiers of the set are in the list, leading to an ordered list of individuals with the best one at the top. The operator can then use the chromosomes ranked in the top half as parents for the crossovers, with the resulting children replacing the chromosomes in the bottom half.
Let us apply this strategy to the example in Table 3↓. Initially, the set contains all the chromosomes of the population and the list is empty.
Table 3 An example of a tournament (columns: set, test, put in the list, resulting list).
At each test, the better of the two randomly compared chromosomes goes back into the set, while the one with the lower fitness value goes into the list. Finally, the chromosomes are ordered, the last inserted chromosome representing the best individual. The best ranked chromosomes are chosen as parents for the crossovers, and the worst ranked ones will contain the results of these crossovers.
In comparison with proportional selection, with the tournament strategy:
1. The chromosomes are not copied but just ordered, i.e. there is always only one copy of each chromosome. This helps keep the population as diversified as possible, which is important for well-designed operators.
2. The last inserted chromosome in the list is the best chromosome of the population, i.e. in comparison with proportional selection the best individual is always reproduced.
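The tournament ordering described above can be sketched as follows (a Python sketch with illustrative names; the chromosomes themselves stand in for the identifiers):

```python
import random

def tournament_order(population, fitness):
    """Repeatedly compare two randomly chosen chromosomes: the loser is
    pushed into the list, the winner goes back into the set."""
    pool = list(population)        # the set of identifiers
    ordered = []                   # the list, worst chromosomes first
    while len(pool) > 1:
        a, b = random.sample(pool, 2)
        loser = a if fitness(a) < fitness(b) else b
        pool.remove(loser)
        ordered.append(loser)
    ordered.append(pool[0])        # the last one inserted is always the best
    return ordered
```

Because the best individual wins every comparison it takes part in, it is always the last one inserted, while the ordering of the others is only statistical.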

7 Crossover

The role of a crossover operator is to create new chromosomes in the population by using characteristics of parent chromosomes. If the crossover operator copies interesting characteristics, the corresponding new chromosomes should enjoy a better potential of adaptation, i.e. their fitness values should increase. The basic idea is that information, known as genes, is exchanged between two selected chromosomes to create two new chromosomes. An example of such an exchange is shown in Figure 4↓. A position called the crossing site is selected randomly from the string. The new chromosomes are constructed by mixing the left- and the right-hand sides of each parent’s crossing site.
Let us apply this crossover operator to our example by choosing as parents the result of the tournament, with a randomly selected crossing site (Table 4↓).
Table 4 Crossover on the example (columns: parents, children, chromosomes replaced).
The result of the crossover shows that one newly created chromosome has a better fitness value than its parents. The other child, however, has a very low fitness value and will probably be destroyed during the next reproduction step.
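The single-point crossover of Figure 4 can be sketched as follows (a Python sketch; the function name is illustrative):

```python
import random

def single_point_crossover(parent_a, parent_b):
    """Mix the left- and right-hand sides of a randomly chosen crossing site."""
    site = random.randint(1, len(parent_a) - 1)   # the crossing site
    child_a = parent_a[:site] + parent_b[site:]
    child_b = parent_b[:site] + parent_a[site:]
    return child_a, child_b
```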

8 Mutation

If the crossover were the only operator, it would be impossible for GA to scan the whole search space. Indeed, if we assume that no chromosome in the initial population of our example has the third bit set to 1, the best solution for the problem, corresponding to a chromosome with all bits set to 1, would never be found. The role of the mutation operator is to randomly insert new characteristics into the population to diversify it. An example of a mutation is shown in Figure 5↓, where a gene is randomly chosen and modified.
In our example, this modification consists in flipping a bit from 1 to 0 or from 0 to 1. If the chosen chromosome has all bits but the third set to 1 and the randomly selected bit is the third one, the best solution to the problem will have been found (Table 5↓).
Table 5 Mutation on the example (columns: before mutation, after mutation).
Even if the interest of such an operator is clear in certain cases, the probability that a mutation occurs must be kept very low, because too many mutations would destroy the effects of the reproduction and the crossover. It is therefore necessary to adopt a method to choose the chromosome and, possibly, the gene on which the mutation must be carried out. Section 10↓ discusses some strategies that can be used.
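A bit-flip mutation for the example can be sketched as follows (a Python sketch; the function name is illustrative):

```python
import random

def mutate(chromosome):
    """Flip one randomly chosen bit (gene) of the chromosome."""
    gene = random.randrange(len(chromosome))
    mutated = list(chromosome)
    mutated[gene] = 1 - mutated[gene]
    return mutated
```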

9 Inversion

The crossover and the mutation operators affect the information contained in the chromosomes. The inversion operator does not change the information, but changes its presentation. To understand this point, it is important to know that the position of a given gene in a chromosome has an influence on whether this gene will, or will not, be used by the different operators. For example, in the basic crossover presented in Figure 4↑, a gene in the middle of a chromosome has more chance of being involved in a crossover than the first or last gene. It is therefore sometimes interesting to change the way information is presented. Figure 6↓ shows a simple inversion operator, assuming that the value of the corresponding fitness function is not changed: two genes, here D and E, have been randomly chosen and their information exchanged.
The pertinence of this operator strongly depends on the problem studied, because the operator must not affect the information contained in a chromosome. This means that it is not always applicable in real situations. In our example, the inversion of two bits in a chromosome automatically changes the number represented if the two bits have different values. In the Grouping Genetic Algorithms (GGA), however, the inversion operator is applied.
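An inversion operator exchanging two randomly chosen genes can be sketched as follows (a Python sketch; the function name is illustrative):

```python
import random

def invert(chromosome):
    """Exchange the information of two randomly chosen genes."""
    i, j = random.sample(range(len(chromosome)), 2)
    inverted = list(chromosome)
    inverted[i], inverted[j] = inverted[j], inverted[i]
    return inverted
```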

10 Mutation and Inversion Frequencies

The mutation and inversion operators must not be used in each generation and on each chromosome. It is therefore necessary to develop a method to choose when and where to apply these operators.
A first strategy, widely used in GA, is to assign a probability to each operator; values of 0.0333 are often found in the literature. In each generation, each chromosome then has the corresponding probability of being chosen for a mutation or an inversion.
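This probability-based strategy can be sketched as follows (a Python sketch; the default probability values and all names are illustrative):

```python
import random

def apply_operators(population, mutate, invert,
                    p_mutation=0.0333, p_inversion=0.0333):
    """Each generation, each chromosome has probability p_mutation of being
    mutated and p_inversion of being inverted."""
    result = []
    for chromosome in population:
        if random.random() < p_mutation:
            chromosome = mutate(chromosome)
        if random.random() < p_inversion:
            chromosome = invert(chromosome)
        result.append(chromosome)
    return result
```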
Another strategy is to apply these operators only when it “seems necessary”. Because the mutation is used to insert diversity into the population and the inversion to facilitate the work of the other operators, these operators are needed when GA seem to be stagnating. Two situations can indicate that GA no longer evolve:
• A given number of generations has been run through without the best chromosome ever computed changing. When a given threshold is reached, the best chromosome ever computed is chosen, a mutation is performed on it and the result replaces a chromosome in the population.
• A given number of generations has been run through without the best chromosome in the population changing. When a given threshold is reached, the best chromosome in the population is chosen, a mutation is performed on it and the result replaces a chromosome in the population.
The following approaches can be used to choose the chromosome to be replaced:
• The chromosome with the lowest fitness value is used for the mutation.
• A chromosome is chosen using the “roulette wheel” strategy (see section 6.1↑).
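A stagnation-triggered mutation combining these ideas can be sketched as follows (a Python sketch; the threshold value and the choice of replacing the chromosome with the lowest fitness are assumptions for illustration):

```python
def maybe_force_mutation(best_ever, population, fitness, mutate,
                         stagnation, threshold=50):
    """If the best chromosome has not changed for `threshold` generations,
    mutate it and let the result replace the worst chromosome."""
    if stagnation >= threshold:
        mutant = mutate(best_ever)
        worst = min(range(len(population)),
                    key=lambda i: fitness(population[i]))
        population[worst] = mutant
        return 0                    # reset the stagnation counter
    return stagnation
```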
In some GA, when a mutation must be carried out on the best chromosome ever computed, a special operator known as the strong mutation is used. In general, this type of operator modifies the chromosome profoundly while a normal mutation only brings about minor changes.

11 Genetic Algorithms for Multiple Objective Optimization Problems

11.1 Philosophy

Most of the approaches merging multiple objective optimization and GA are illustrated in Figure 7↓a: GA compute a set of Pareto solutions and a multi-criteria decision-aid method is then used to choose the best one as the final solution. Because most Pareto-based methods focus on a single point on the Pareto front, some potentially interesting solutions fail to be studied.
To tackle grouping problems with multiple objectives, Brahim Rekiek proposed an extension of GA that uses a multi-criteria decision-aid method for ranking (Figure 7↑b) [5]. The important feature is this integration, because it enables the GA to scan all the interesting parts of the search space. Because the method is called each time an evaluation of the population is needed, it must be fast. The multi-criteria decision-aid method used is the PROMETHEE method [6].
At each generation, all the chromosomes are considered as a set of solutions to be ranked with regard to the criteria, i.e. at each generation a multiple objective problem must be solved. Each problem is defined by a set of solutions, consisting of the chromosomes of the population and the best chromosome ever computed, and by a set of objectives. An evaluation criterion and a weight are associated with each objective. To each pair of a chromosome and an evaluation criterion corresponds an evaluation, given as a real number representing the quality of the chromosome for the given criterion. All these evaluations make up the evaluation table used as input by the PROMETHEE method (Table 6↓).
Table 6 Evaluation table.
The PROMETHEE method computes a net flow for each chromosome. The net flow is the result of pairwise comparisons with all the other chromosomes for all the criteria. Using the net flows, it is possible to establish a ranking of the chromosomes from the highest to the lowest value.
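The computation of such net flows can be sketched as follows. This is a standard PROMETHEE II-style formulation with a linear preference function; it is not necessarily the exact variant of [6], and all names are illustrative.

```python
def net_flows(evaluations, weights, prefer):
    """Net flow phi(a) = phi_plus(a) - phi_minus(a), aggregated over all
    pairwise comparisons and all criteria. evaluations[a][j] is the
    evaluation of chromosome a on criterion j; prefer(d) maps a difference
    of evaluations to a preference degree in [0, 1]."""
    n = len(evaluations)
    total_weight = sum(weights)
    def pi(a, b):
        # Aggregated preference of a over b across all criteria.
        return sum(w * prefer(evaluations[a][j] - evaluations[b][j])
                   for j, w in enumerate(weights)) / total_weight
    flows = []
    for a in range(n):
        plus = sum(pi(a, b) for b in range(n) if b != a) / (n - 1)
        minus = sum(pi(b, a) for b in range(n) if b != a) / (n - 1)
        flows.append(plus - minus)
    return flows

def linear_preference(q, p):
    """Linear preference function with indifference threshold q and
    preference threshold p, as used in PROMETHEE."""
    def prefer(d):
        if d <= q:
            return 0.0
        if d >= p:
            return 1.0
        return (d - q) / (p - q)
    return prefer
```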

11.2 Control Strategy

At each generation, the PROMETHEE method enables the GA to rank all the chromosomes of the population together with the best chromosome ever found (Figure 8↓). After this step the GA can compare two chromosomes and determine which one is the better. If the best chromosome ever found is not the top-ranked one, it is replaced by the top-ranked one.
When used with the tournament as selection operator, it is the ranking that determines the better of two solutions chosen for comparison, rather than the result of a fitness function. Because conventional GA need a fitness function to work, it is interesting to transform the ranking of the chromosomes into a fitness function.
A first idea is to use the net flow as the fitness value of each chromosome. The problem is that the net flow is the result of comparisons, i.e. between two generations the net flow of the best chromosome ever computed may change even though the chromosome itself has not. It is therefore impossible to use the net flow as a fitness function.
The method used here to determine the value of the cost function is the following. Let us suppose that at a given generation the chromosomes are ranked from the best to the worst. Let us also assign a cost function value to each chromosome. The cost function values can then be computed using the ranking obtained with the PROMETHEE method:
The value of the cost function of the highest ranked chromosome is set to a value greater than all the others if it is not the best chromosome ever computed; otherwise its cost function remains identical. The value of the cost function for the second highest ranked chromosome is set to 1. The values of the cost function for the other chromosomes are proportional to the corresponding ranking.
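This transformation can be sketched as follows in Python. Since the exact constant for the top-ranked chromosome is not given above, the sketch assumes it is set just above the cost of the best chromosome ever computed; the proportional scheme for the other ranks is likewise an illustrative choice.

```python
def ranking_to_cost(ranked, best_ever, best_ever_cost):
    """Turn a ranking (best chromosome first) into cost values.
    Assumption: the top-ranked chromosome gets best_ever_cost + 1 unless it
    already is the best ever (whose cost is kept); the second gets 1; the
    others get values proportional to their rank."""
    n = len(ranked)
    costs = {}
    for rank, chromosome in enumerate(ranked):
        if rank == 0:
            costs[chromosome] = (best_ever_cost if chromosome == best_ever
                                 else best_ever_cost + 1)
        else:
            # 1 for the second highest ranked, decreasing proportionally.
            costs[chromosome] = (n - rank) / (n - 1)
    return costs
```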
Let us illustrate the method on the example given in Table 7↓ for a population of 4 chromosomes and the first 2 generations. The ranking is shown for each generation and the cost function is computed.
Table 7 Transforming ranking into a cost function.
After initialization, the best chromosome has not been computed yet and is therefore ranked as the worst one. The chromosome ranked first also has the greatest cost value and is chosen as the best chromosome. After the first generation, the ranking leaves the best chromosome in its position as the best ranked one, i.e. its cost function has not changed and is still better than that of the others. After the second generation, the best ranked chromosome is no longer the best chromosome ever computed: the cost function of the newly top-ranked chromosome becomes better, and it is chosen as the new best chromosome ever computed. The fact that the old best chromosome may still be better ranked than chromosomes with better cost functions is not important, because it will be replaced by the new best chromosome and disappear from the Genetic Algorithm.
The advantage of the described method is that the GGA operators do not need to know anything about the multi-criteria decision-aid method used for ranking the chromosomes. Integration into a standard GGA is therefore facilitated.

11.3 Branching on Populations

As explained previously, a weight is assigned to each objective. In fact, in the PROMETHEE method, a triplet of values must be assigned to each evaluation criterion associated with an objective: the preference threshold, the indifference threshold and the weight. Finding the right values for all these parameters is not an easy task. Moreover, different values of these parameters may produce different results because the ranking changes. Good practice for a decision maker is therefore to run several GA with different values of these parameters and to choose the best solution produced.
Another method suggested by Rekiek is a branching on populations technique, inspired by an optimization method based on searching a population tree [7]. The basic idea is to work on a population tree where, at each node, a GA runs with specific values of the triplets for the different criteria used in the PROMETHEE method. With methods like branching and backtracking it is possible to construct such trees. Using this method, the decision maker runs the Genetic Algorithm for a number of generations with a given set of values for the triplets. If he is not satisfied with the population computed, he can change some of the values of the triplets and run the Genetic Algorithm again for a number of generations. If the new results are worse, the decision maker can go back to the previous population. If the new results are better, he can change some triplets again and rerun the Genetic Algorithm.
While this seems a promising proposition, it is difficult to implement such a solution automatically: a decision maker is generally needed to change the values of the triplets and to check the improvement or deterioration of the different populations while the tree is under construction. Moreover, this method transforms GA from unsupervised clustering methods, i.e. methods that can produce results without any human interaction, into supervised clustering methods where some human intervention is necessary.

11.4 Verification of the Best Solutions

The PROMETHEE method is a ranking method based on a comparison of the solutions on each criterion, i.e. the ranking of each solution depends on the other solutions compared. When dealing with approximate multi-criteria problems (Definition ), the values of the criteria for each solution are just an approximation of the quality of the solution. Consequently, when comparing two solutions where one is actually better than the other, the values of the other solutions in the population may lead the worse one to be ranked as the best, i.e. the actual best solution is displaced. This can be seen as a problem where a “local optimum” hides the “global optimum”.
To avoid this problem, an idea is to store in a separate set all the different solutions that the GA has at some point ranked as the best solution. At the end of the Genetic Algorithm, these best solutions are compared by applying the PROMETHEE method to this set, and the solution ranked first is returned as the result of the Genetic Algorithm. This approach has been successfully applied to a cell formation problem [8]. While it decreases the probability that a “local optimum” hides the “global optimum”, it does not eliminate it: the ranking done after the run of the Genetic Algorithm can also be incorrect.

12 Implementation

As explained above, Genetic Algorithms are a generic approach for solving optimization problems. Since there exist different classes of optimization problems, it should be clear that the chromosome coding as well as the operators (crossover, mutation, inversion, local optimization, etc.) depend on the problem class. In fact, GA may refer to a generic approach for an abstract problem class (for example grouping problems), or to a practical solution for a concrete problem class (such as the Bin Packing Problem). As shown in Table 8↓ for the Bin Packing Problem, generic GA are implemented as template classes and practical GA as normal classes.
 Problems         Optimization Problems   Grouping Problems   Bin Packing Problem
 Category         Abstract                Abstract            Concrete
 Implementation   Template Classes        Template Classes    Normal Classes
Table 8 GA and Optimization Problems.
The R Library proposes several template classes to implement a generic GA. The two main classes deal with the genetic algorithm instance and its chromosomes: the instance maintains a population of chromosomes and implements the different steps of a GA (population analysis, crossover, mutation, etc.). In practice, most of these steps call virtual methods of the chromosome that should be overridden in an inheriting class (see for example the chromosome of the Grouping Genetic Algorithms).
Most tasks within a GA may be parallelized: the chromosome evaluations, the crossovers, the mutations, etc. Although this is not currently the case in the R library, it is a planned evolution, and it is therefore necessary to design the GA implementation with such a parallelization in mind. One aspect is that some chromosome methods, such as the crossover, need temporary structures. One solution is to duplicate all the temporary structures in each chromosome, but this is a waste of memory. I therefore introduce the RThreadData class (and its child classes). Its role is to hold these temporary structures, the GA instance being responsible for allocating as many instances as threads created and for communicating the correct data structure to each chromosome (through the Init method).
Two more classes play a role in the implementation of a generic GA in the R Library: the template class RFitness represents the fitness function of a GA, and the RDebug class provides a generic class that holds debugging information of a running GA. This class should be derived to manage the information gathered. For example, the RDebugXML class stores it in an XML file.

References

[1] John H. Holland, Adaptation in Natural and Artificial Systems, University of Michigan Press, 1975.

[2] David E. Goldberg, Genetic Algorithms in Search, Addison-Wesley, 1991.

[3] Charles Darwin, On the Origin of Species, John Murray, 1859.

[4] David E. Goldberg, Bradley Kork & Kalyanmoy Deb, “Messy Genetic Algorithms: Motivation, Analysis, and First Results”, Complex Systems, 3(5), pp. 493—530, 1989.

[5] Brahim Rekiek, Assembly Line Design: Multiple Objective Grouping Genetic Algorithm and the Balancing of Mixed-model Hybrid Assembly Line, PhD Thesis, Université libre de Bruxelles, 2000.

[6] Jean-Pierre Brans & Bertrand Mareschal, “The PROMCALC & GAIA Decision Support System for Multicriteria Decision Aid”, Decision Support Systems, 12(4‑5), pp. 297—310, 1994.

[7] Louis Steinberg & Khaled Rasheed, “Optimizing by Searching a Tree of Populations”, In Proceedings of the Genetic and Evolutionary Computation Conference GECCO’99, pp. 1723—1730, 1999.

[8] Emmanuelle Vin, Pierre De Lit & Alain Delchambre, “A New Combined Approach to Solve the Cell Formation Problem with Alternative Routings”, In Proceedings of ACS’2002: Advanced Computer Systems, pp. 43—50, 2002.