While the genetic algorithm, in contrast to backpropagation, allows a very efficient parallel implementation on distributed memory systems, its main disadvantage is that, due to its stochastic nature, its superior robustness and reliability are often outweighed by its slower convergence and thus longer calculation times, which often leaves backpropagation the better choice for sequential network training.
These facts suggest that a combined algorithm, which exploits the parallel nature and robustness of the genetic algorithm without sacrificing the efficiency of backpropagation, might perform better than either algorithm on its own.
A hybrid genetic backpropagation algorithm necessarily involves a conversion between the chromosome-string and the weight-matrix representation of a network, as introduced in Section 3.2.3. This brings up two encoding problems, for which a reasonable compromise must be found.
While backpropagation normally uses a hardware floating-point representation with a length of 32 or even 64 bits per weight, the genetic algorithm normally uses a representation just long enough to guarantee sufficient diversity of the population. Thus, a representation that is too short might cause small weight updates to be ignored, while one that is too long leads to unnecessarily long chromosome strings, which slow down the genetic features of the algorithm by increasing communication times.
The representation of numbers as Gray-encoded bit strings is limited to a certain range. If this range is too small, gradient updates exceeding the limits are lost; if it is too large, the search space for the stochastic genetic search is unnecessarily increased.
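The two trade-offs can be made concrete with a small sketch: a weight is mapped linearly onto an n-bit integer over a fixed interval and then Gray-encoded. All names, the interval and the bit length here are illustrative assumptions, not the encoding actually used in the project.

```python
def to_gray(n: int) -> int:
    """Standard binary-to-Gray conversion."""
    return n ^ (n >> 1)

def from_gray(g: int) -> int:
    """Inverse of to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def encode_weight(w: float, lo: float, hi: float, bits: int) -> int:
    """Clamp w to [lo, hi], quantise to `bits` bits, Gray-encode."""
    w = min(max(w, lo), hi)              # updates outside the range are lost
    step = (hi - lo) / (2**bits - 1)     # resolution: smaller updates are ignored
    return to_gray(round((w - lo) / step))

def decode_weight(g: int, lo: float, hi: float, bits: int) -> float:
    """Gray-decode and map the integer back onto [lo, hi]."""
    step = (hi - lo) / (2**bits - 1)
    return lo + from_gray(g) * step
```

A 16-bit encoding over [-10, 10], for instance, resolves weight changes of roughly 3e-4; any smaller backpropagation update is rounded away, illustrating the bit-length problem, while the clamping in `encode_weight` illustrates the range problem.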
The combined algorithm used in this project works, like the standard genetic algorithm, on a population of string-encoded networks, but before the fitness is calculated, each individual is decoded, undergoes a certain number of backpropagation steps and is encoded again. Then, selection is performed and a new population is generated.
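The generation step just described can be sketched as follows. The routines `decode`, `encode`, `backprop_step`, `fitness` and `select` stand in for the project's actual implementations; only the structure of the loop is the point.

```python
def hybrid_generation(population, bp_steps, decode, encode,
                      backprop_step, fitness, select):
    """One generation of the hybrid algorithm: local gradient
    refinement of every individual, then genetic selection."""
    refined, scores = [], []
    for chromosome in population:
        net = decode(chromosome)          # bit string -> weight matrices
        for _ in range(bp_steps):         # a fixed number of backprop steps
            net = backprop_step(net)
        chromosome = encode(net)          # write refined weights back
        refined.append(chromosome)
        scores.append(fitness(net))
    return select(refined, scores)        # selection -> next population
```

Because each individual is refined independently, the loop body parallelises over the population exactly as in the plain genetic algorithm.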
The initial population is not generated by randomising the chromosome strings but by randomising weight matrices, which are then encoded as strings. This allows the initial weights to be distributed in a smaller range than the encoding interval used, and it reduces the probability of starting backpropagation in a very flat region of the error function, which would lead to very small gradients.
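This initialisation strategy might be sketched as below; the interval of ±0.5 and the helper `encode_weights` are assumptions for illustration, not the project's actual values.

```python
import random

def initial_population(pop_size, n_weights, init_range, encode_weights):
    """Draw each weight from the narrow interval [-init_range, init_range]
    and only then encode it, so the (wider) encoding range is unaffected."""
    population = []
    for _ in range(pop_size):
        weights = [random.uniform(-init_range, init_range)
                   for _ in range(n_weights)]
        population.append(encode_weights(weights))
    return population
```

Randomising the strings directly would instead distribute the initial weights uniformly over the whole encoding interval.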
As introduced in Section 4.1, the main parameters of the backpropagation algorithm are the learn rate $\eta$ and the impulse rate $\alpha$. Instead of fixing $\eta$ and $\alpha$ for all individuals as simulation parameters, their values can also be optimised by the genetic algorithm by including them in the chromosome string. The new phenotype $p^{\mathrm{new}}_i$ of the individual $i$ is then determined by the phenotype $p_i$ of the network (described by the weight matrices, see Section 3.2.3 for details) and the backpropagation parameters $\eta_i$ and $\alpha_i$.
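A chromosome extended in this way carries two additional genes whose integer values are mapped back onto a learn rate and an impulse rate when the individual is decoded. The sketch below assumes a logarithmic mapping for the learn rate and a linear one for the impulse rate, as the text goes on to describe; the bit length, the intervals and all names are illustrative assumptions.

```python
BITS = 16
ETA_RANGE = (1e-4, 1.0)    # assumed learn-rate interval (logarithmic mapping)
ALPHA_RANGE = (0.0, 1.0)   # assumed impulse-rate interval (linear mapping)

def decode_eta(gene: int) -> float:
    """Logarithmic mapping from the integer gene onto the learn-rate interval."""
    lo, hi = ETA_RANGE
    frac = gene / (2**BITS - 1)
    return lo * (hi / lo) ** frac

def decode_alpha(gene: int) -> float:
    """Linear mapping from the integer gene onto the impulse-rate interval."""
    lo, hi = ALPHA_RANGE
    return lo + (hi - lo) * gene / (2**BITS - 1)

def split_chromosome(chrom):
    """chrom = weight genes followed by one gene each for eta and alpha."""
    *weight_genes, eta_gene, alpha_gene = chrom
    return weight_genes, decode_eta(eta_gene), decode_alpha(alpha_gene)
```

With this layout, selection acts on the backpropagation parameters exactly as it acts on the weights, so individuals with well-suited $\eta_i$ and $\alpha_i$ refine faster and gain fitness.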
The new encoding function $e^{\mathrm{new}}$ is then defined as
$$ e^{\mathrm{new}}\left(p^{\mathrm{new}}_i\right) = e(p_i) \,\|\, m_\eta(\eta_i) \,\|\, m_\alpha(\alpha_i), $$
where $e$ is the encoding function of Section 3.2.3 and $\|$ denotes concatenation of the Gray-encoded bit strings. The mapping functions $m_\eta$ and $m_\alpha$ map the possible values of $\eta$ and $\alpha$ onto the corresponding integer intervals. For the implementation in this project, $m_\eta$ was chosen to be logarithmic and $m_\alpha$ to be linear over the following intervals: