 Theoretical Computer Science

# The use of crossovers in Genetic Algorithms

My questions concern the use of crossovers in genetic algorithms. The three basic ingredients of genetic algorithms are:

• selection
• mutation
• crossover

Consider a genetic algorithm acting on binary strings.

Crossover is the process by which two parent strings are mixed to give birth to offspring strings. For example, 11000 and 00010 can create the offspring 11010 by crossover alone.

Some genetic algorithms allow double crossover. For example, the two parents 110011 and 001100 can give birth to 111111.
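To make the two examples concrete, here is a minimal Python sketch of both operators. The function names and the cut positions are illustrative assumptions, not taken from any particular GA library.

```python
def single_point_crossover(a: str, b: str, cut: int) -> str:
    """Child takes a's bits before `cut` and b's bits from `cut` onward."""
    assert len(a) == len(b)
    return a[:cut] + b[cut:]


def two_point_crossover(a: str, b: str, lo: int, hi: int) -> str:
    """Child takes b's bits in [lo, hi) and a's bits everywhere else."""
    assert len(a) == len(b) and 0 <= lo <= hi <= len(a)
    return a[:lo] + b[lo:hi] + a[hi:]


# Single crossover: cut after the third bit.
print(single_point_crossover("11000", "00010", 3))    # 11010

# Double crossover: the middle two bits come from the second parent.
print(two_point_crossover("110011", "001100", 2, 4))  # 111111
```

In a real GA the cut positions would be drawn at random for each pair of parents; they are fixed here only to reproduce the two examples.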

I don't fully understand the advantage of crossover! It seems to me that one can obtain similar results with algorithms that do not use crossover.

My questions

• Under what conditions is the use of crossover in a genetic algorithm beneficial? Why?

• Under what conditions is double crossover more beneficial than single crossover? Why? What about triple crossover?

When I say "Under what conditions", I mean "for what fitness landscape" or "for what problem type".

By fitness landscape, I mean a mapping of each possibility to fitness (reproductive success) in a hyperspace of $n+1$ dimensions where $n$ is the length of the string (the additional dimension being the fitness).

## Solution

Without crossover, a genetic algorithm becomes something between gradient descent and simulated annealing. The main effect of crossover is the exchange of parts between different solutions. If an optimization task can be loosely decomposed into somewhat independent subtasks, and this decomposition is reflected in the genes, then crossover will improve the performance of the GA. For example, suppose the objective is f(x,y)=g(x)+h(y), x and y are encoded consecutively in the genome, and g(x) has the larger influence. Then the part of the genome that encodes x will be optimized first, and thanks to crossover it will become nearly the same across the whole population. After that, the h(y) term will be optimized. In other words, crossover helps to loosely decompose the task into subtasks without prior knowledge (though depending on the encoding scheme) when such a decomposition exists; otherwise it yields no benefit, or can even make the search less efficient. This is the main additional metaheuristic of GAs compared with other metaheuristic optimization methods.
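The separable case can be illustrated with a small sketch. The concrete g and h below (weighted counts of ones), the 8-bit genome, and the 4+4 split are assumptions chosen only to make the point visible; they are not part of the argument above.

```python
# Hypothetical separable objective f(x, y) = g(x) + h(y) on an 8-bit genome:
# x is the first four bits, y is the last four bits.
X_BITS = slice(0, 4)
Y_BITS = slice(4, 8)


def g(x: str) -> int:
    return 10 * x.count("1")  # assumed: the dominant term


def h(y: str) -> int:
    return y.count("1")       # assumed: the weaker term


def f(genome: str) -> int:
    return g(genome[X_BITS]) + h(genome[Y_BITS])


parent_a = "11110000"  # optimal x, worst y
parent_b = "00001111"  # worst x, optimal y

# Single-point crossover at the x|y boundary combines both optimized parts.
child = parent_a[:4] + parent_b[4:]

print(f(parent_a), f(parent_b), f(child))  # 40 4 44
```

Because the cut falls on the boundary between the two subtasks, one crossover step assembles the optimum from two partial solutions, whereas mutation alone would have to flip four specific bits of parent_a.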

Double-point crossover (or other, more complex types of crossover) can be more beneficial if the most crucial parts of a solution are encoded in the middle of the genome or are placed in separate locations. The simplest example is the function f(x,y,z)=h(x,z)+r(y)+g(z), where h(x,z) has the highest influence and the genes encode the string xyz. Single-point crossover tends to take x from one parent and z from the other. However, x and z determine the solution quality together. If one individual happens to have the optimal components (x*,z*), they will generally not appear together in its children after single-point crossover, so it will be harder for these parts of the genome to stabilize in the population.
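This can be checked by brute force on a toy genome. The sketch below assumes a 6-bit string xyz with two bits per component, takes the parents 110011 and 001100 from the question, and treats the first parent's x and z as the pair (x*,z*); all of these choices are illustrative.

```python
A = "110011"  # assumed to carry the good pair: x* = 11, z* = 11
B = "001100"
X, Z = slice(0, 2), slice(4, 6)
n = len(A)


def keeps_pair(child: str) -> bool:
    """True if the child has both x* and z* exactly as they appear in A."""
    return child[X] == A[X] and child[Z] == A[Z]


# Every single-point child: all interior cuts, both parent orders.
single_children = [p[:c] + q[c:]
                   for c in range(1, n)
                   for p, q in ((A, B), (B, A))]

# Every two-point child: all interior cut pairs, both parent orders.
two_point_children = [p[:lo] + q[lo:hi] + p[hi:]
                      for lo in range(1, n)
                      for hi in range(lo + 1, n)
                      for p, q in ((A, B), (B, A))]

single_hits = sum(keeps_pair(c) for c in single_children)
two_point_hits = sum(keeps_pair(c) for c in two_point_children)

print(single_hits, "of", len(single_children))        # 0 of 10
print(two_point_hits, "of", len(two_point_children))  # 3 of 20
```

With these parents, no single-point child keeps x* and z* together, while a two-point crossover can swap out only the middle segment and leave both ends intact.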