next up previous contents
Next: The Population Up: The Genetic Algorithm Previous: Chromosome Strings   Contents

Subsections

The Individual


Genotype and Phenotype

The genotype $S$ of an individual $I$ is its chromosome string, while its phenotype $P$ is the representation of the problem, the user originally wants to solve. Since the genetic algorithm operates only on the genotype, a function must be provided to translate $P$ into $S$.

How this encoding is done, depends on the type of the problem parameters $P$ and no fixed rules can be given. A encoding is optimal if the relation between the fitness $f(I)$ of a individual $I$ is in a linear relation to the Hamming distance between $S$ and $S_{opt}$, when $S_{opt}$ represents the genotype of the optimal solution:


\begin{displaymath}f(I) = -k \, H(S,S_{opt}) + c, \quad k>0 \end{displaymath}

Since the $S_{opt}$ in normally not known, the general strategy should be, that similar phenotypes should also have a small Hamming distance between their corresponding genotypes. Moreover, related groups of parameters should be encoded in subsequent substrings of the genotype.

The Grey Code

While Boolean parameters of the phenotype $P$ can be directly encoded by mapping them onto the chromosome string $S$, the encoding of integer and real values is not so straightforward.

Using binary numbers for the encoding of integers of the range $[0,2^n-1]$, leads to the problem, that the encoding of two successive numbers $a$ and $a+1$ may have a Hamming distance up to $n-1$.


\begin{displaymath}\max \{H(\mbox{\rm bin}_n a,\mbox{\rm bin}_n(a+1)) \,\vert\, ...
...H(\mbox{\rm bin}_n(2^{n-1}-1),\mbox{\rm bin}_n(2^{n-1})) = n-1 \end{displaymath}

The Grey code encodes the successor $a+1$ of an integer $a$ by inverting one digit of $\mbox{\rm grey}_n a$, and is defined by the following conditions.


\begin{displaymath}\mbox{\rm grey}_n: {\bf Z}_{2^n} \rightarrow {\bf B}^n, \quad
a,b \in {\bf Z}_{2^n}, \quad {\bf Z}_k = \{0,1, \ldots k-1\} \end{displaymath}


\begin{displaymath}\mbox{\rm grey}_n a = \mbox{\rm grey}_n b \, \Longleftrightarrow \, a=b \end{displaymath}


\begin{displaymath}\mbox{\rm grey}_1 0 = (0), \qquad
\mbox{\rm grey}_{n+1} a = (0) \, \oplus \, \mbox{\rm grey}_n a \end{displaymath}


\begin{displaymath}H(\mbox{\rm grey}_n a,\mbox{\rm grey}_n(a+1 \bmod 2^n))=1 \end{displaymath}

A real value $x \in [a,b)$ can be mapped onto ${\bf Z}_{2^n}$ and then be Grey encoded as $n$-bit strings. The mapping function $\phi$ must be strictly monotonous and its return values should be uniformly distributed. For already uniformly distributed parameters, $\phi$ should be defined as


\begin{displaymath}\phi: [a,b) \rightarrow {\bf Z}_{2^n}, \qquad
\phi(x) = \left\lfloor 2^n \, {x-a \over b-a} \right\rfloor \end{displaymath}


Encoding Neuronal Networks

If the genetic algorithm is to be used for the training of neural networks, the phenotype $P$ is the set of network parameters which are to be optimised, namely the set of the parameter vectors $P_k$ of all neurones in ${\bf N}$.

The main network type used in this project is a 2-layer network which is fully interconnected except for one extra neurone in the input and the hidden (the second) layer which has no inputs and is constantly set to 1. The other neurones are either input or standard backpropagation neurones as described in Section 2.4.4.

Thus, the parameters of a $n$-$m$-$o$-network can be described by two real weight matrices $W^{(1)}$ and $W^{(2)}$ of the dimensions $(n+1) \times m$ and $(m+1) \times o$. If we decide to use Grey code with a precision of $b$ bit, restrict the possible weight values to the interval $[-w,w]$ and assume them to be uniformly distributed, then the encoding function $e_{num}$ for weights can be defined as


\begin{displaymath}e_{num}: [-w,w] \rightarrow {\bf B}^b, \qquad e_{num}(x)=\mbox{\rm grey}_b \, \phi (x) \end{displaymath}


\begin{displaymath}\mbox{with} \quad \phi: [-w,w] \rightarrow {\bf Z}_{2^b}, \qq...
...phi(x) = \left\lfloor (2^b-1) \, {x+w \over 2 w} \right\rfloor \end{displaymath}

The encoding function $e_{net}$ for the whole network is very straightforward and simply concatenates all parameters. The input weights of one neurone are encoded together as one coherent substring.


\begin{displaymath}e_{net}: {\bf R}^{(n+1)m} \times {\bf R}^{(m+1)o} \rightarrow {\bf B}^l,
\quad l=b \left( (n+1)m+(m+1)o \right) \end{displaymath}


\begin{displaymath}e_{net}(W^{(1)},W^{(2)})=
\bigoplus_{j=1}^m \, \bigoplus_{i=...
...lus_{k=1}^o \, \bigoplus_{j=1}^{m+1} \, e_{num} (w_{ij}^{(2)}) \end{displaymath}


next up previous contents
Next: The Population Up: The Genetic Algorithm Previous: Chromosome Strings   Contents

(c) Bernhard Ömer - oemer@tph.tuwien.ac.at - http://tph.tuwien.ac.at/~oemer/