# Grouping Genetic Algorithms

Pascal Francq

December 10, 2011 (May 19, 2011)

Abstract

The Grouping Genetic Algorithms (GGA) are a kind of Genetic Algorithms (GA) applicable to solve grouping problems.

## 1 Definition

The Grouping Genetic Algorithms (GGA) were developed by Falkenauer [1] to solve clustering problems. In fact, GGA are a genetic framework for grouping problems, i.e. each particular problem needs its own customization. As the name suggests, GGA are an extension of the conventional Genetic Algorithms adapted to grouping problems.

## 2 Encoding

There are other applications of Genetic Algorithms to solve grouping problems, but what makes GGA a well-designed solution is the coding used by Falkenauer. To illustrate the encoding scheme used in GGA, let us consider the example in Figure 1↓ where letters represent the groups and the numbers represent the objects.
Most Genetic Algorithms (GA) dealing with grouping problems will choose the objects assignation as the information to store in a gene. Such chromosomes could be written:
For , the first object is assigned to group , the second object to group , the third and the fourth objects to group and so on. Each group is different, which explains why other letters are used for both chromosomes. Assuming that a conventional crossover with the crossing site take at is applied, the resulting chromosomes will be:
It is evident that these newly created chromosomes have no sense. Moreover, if some constraints on the groups exist, the resulting chromosomes will certainly contain many illegal groups.
In GGA, the chromosomes are enhanced with a group element containing the group composition:
All the operators work on the group element of the chromosomes. This coding respects the spirit of the “building blocs” because GGA always manipulate groups. This coding has however a technical consequence, namely that the different chromosomes in the same population have different lengths.

## 3 Identical Solutions but Different Chromosomes

One of the problems associated with the coding of grouping problems, is the fact that “identical solutions” can be coded with different chromosomes. Figure 2↓ illustrates this; the solutions are identical but, they are coded differently. The fact that the “same” groups have different letters is because the chromosomes were constructed independently so a chromosome cannot know that such a group already exists in another chromosome.
A simple method can be used to identify these identical solutions. The idea is to construct a new object element like that of the chromosome, but using numbers for the groups rather than letters. The first object is always considered to be in group 1. If the second object is in a different group from the first one, it is put into group 2. Let us apply this method to the object element given by:
The first object is in group . The second object has been put into group because it is not grouped with the first object. The third object is put into group because it is not grouped with the first or the second object. The fourth object is put into the same group as the third one. The other objects are handled identically.
When the same method is employed with the second chromosome, the new object element becomes:
The two newly created objects elements are identical, i.e. the groups are identical. When GGA identifies the same solution on numerous occasions in the same population, it decides to keep only one chromosome coding this solution. All the other chromosomes are modified by means of a special operator, modify. This operator generally can just consist in the mutation of the corresponding chromosome.

## 4 Initialization

Once the coding has been defined, the Grouping Genetic Algorithms must initialize the population. The method used depends on the particular problem because the different solutions must satisfy the hard constraints. Because GGA are meta-heuristic, heuristics are used most of the time to initialize the population. For example, Falkenauer suggests a first-fit heuristic for the bin packing problem.
An important point concerning GGA is the diversity of the population, i.e. the heuristic must be adapted to produce different solutions each time it is called upon. If the objects are treated in random order in the case of the first-fit heuristic, the diversity of the population will be saved. Some heuristics cannot therefore be used for the initialization. In the bin packing problem the first-fit descending heuristic, where the objects are processed in descending order of size, is not used for initialization because if all the objects have different sizes, the heuristic will always produce the same solution and construct a population with identical chromosomes. This is of no particular interest.

## 5 Crossover

Crossover is one of the most important operators in genetic algorithms. The crossover paradigm used for the Grouping Genetic Algorithms is shown in Figure 3↓.
The crossover consists of four steps:
1. A crossing site is chosen in each parent.
2. The bins selected by the crossing site of one parent are inserted at the crossing site of the second parent. At this stage, some objects may appear in more than one bin.
3. The existing bins containing objects that are already in the inserted bins are eliminated, i.e. some objects are no longer in a bin. If some bins are empty, they are also removed from the solution.
4. The objects left aside are reinserted into the solution.
With two parents it is possible to create two children by inserting the selected bins of the first parent into the second one, and by doing the reverse.
Concerning this latter step, a method must be used for the insertion. This method in fact depends on the problem to be solved because the operator must construct some valid solutions. Falkenauer suggests a first-fit descending heuristic for example for the bin packing problem.

## 6 Mutation

The role of a mutation operator is to insert new characteristics into a population to enhance the search space of the Genetic Algorithm. In the case of the Grouping Genetic Algorithms, the operator proposed is illustrated in Figure 4↓.
The idea is to randomly choose a few bins and to remove them from the solution. The objects attached to these bins are then reinserted into the solution. The method used for the insertion of the objects is generally the same as the one used for the crossover operator.

## 7 Inversion

The role of the inversion operator is to propose the same solution to the Grouping Genetic Algorithm, but differently. As pointed out in 3↑, a single solution may have different presentations, and because crossovers work through crossing sites, the way in which a solution is presented influences the crossover operator’s results. The first group appearing in the group element of a chromosome is likely less probability to be chosen than the other groups. It is therefore important to include this operator in the Grouping Genetic Algorithms. Figure 5↓ gives the philosophy behind the inversion.
Two groups are randomly selected and their positions are switched in the solution. In the example, when the crossing site selected is between the second and third positions, the bins inserted will be “A” and “C” in the first solution and “D” and “C” in the second.

## 8 Implementation

The R Library proposes template classes to implement GGA. The main classes are the one dealing with the GGA instance and its chromosomes.

## References

[1] Emanuel Falkenauer, Genetic Algorithms and Grouping Problems, John Wiley, 1998.