NEAT with Acceleration
Last Updated on November 5, 2023 by Editorial Team
Author(s): Kevin Guo
Originally published on Towards AI.
While discussing with several other users in the Learn AI Together discord server (https://discord.gg/learnaitogether), one of us came up with an idea: averaging weight mutations over several generations. With this idea, a genome would receive a weight mutation one generation, and a lesser version of the same weight mutation the next generation, and so on. While I immediately wanted to code this idea, I wasnβt finished implementing NEAT in Unity. Now, I am done and have come up with several techniques to make this strategy work.
Before discussing the aforementioned techniques, let me explain an issue NEAT suffers from. Being a genetic algorithm, NEAT doesnβt use any gradients, and all mutations are entirely random. This means a single genome β or even an entire population β could theoretically get worse. Even then, the algorithm wouldnβt know it got worse and would have to spend several following generations undoing these changes. Of course, on average, with hundreds of genomes, elitism, and speciation, this rarely makes NEAT incapable of solving a problem. Yet, it still slows down the rate at which NEAT converges and is a significant hurdle to its effectiveness.
Now, back to averaging weight mutations. The idea here is to give each weight some acceleration value and mutate that value. The weight is then mutated indirectly by its acceleration value, as illustrated above. Acceleration values are inherited by the children of a genome, with the hope that positive changes will continue propagating in the next generation.
Mathematically, the weight (w) and acceleration (a) for a new generation (t) of a connection after a random mutation (x):
Of course, if these were the only formulas used, then the acceleration could explode in value. Therefore (and here is where I deviate from the original idea), the acceleration values are adjusted by the genomeβs performance.
Logically, the current acceleration of a weight also represents how much the weight changed in the last generation. For instance, if a weight has an acceleration of 0.5, then it was also mutated by a value of 0.5 in the last generation.
You should already have the fitness (f_(t-1)) of a genome from the evaluation of generation (t-1). It is trivial to additionally keep track of the fitness of a genome in its previous evaluation as well (f_(t-2)) (The way this works with crossover is explained later).
With f_(t-1) and f_(t-2), you can calculate d_f, representing how much better or worse a genome got. Intuitively, if a genome got better, then the accelerations that were applied are effective, and vice versa. As such, we adjust the acceleration based on d_f:
Since d_f is a very small value tending towards zero, it is either multiplied by some constant or raised to some power less than one before being multiplied to a_(t-1). It is also capped at a maximum of 1 to prevent a_(t-1) from exploding.
Once the accelerations have been adjusted, crossover and reproduction can be done to produce the new generation (t).
For asexual reproduction, the last fitness (f_(t-2)) of the child is simply equal to the fitness of the parent (f_(t-1)). The acceleration values are directly inherited.
For crossover, the last fitness is averaged from the two parents (although a different strategy can be employed, like picking the lower fitness, higher fitness, weighting the average, etc.). The acceleration values are summed from the two parents, though an average can be taken instead.
Altogether, it looks like this:
Additional details:
- Connection mutations occur after applying weight changes, its acceleration is set equal to the weight it was added with.
- Genomes created from elitism have all their acceleration values set to zero
- Genomes of the 1st generation have all their acceleration values set to zero as f(t-2) does not exist.
- sqrt(d_f) was specifically used to normalize
Experiments
To test the effectiveness of NEAT with acceleration, three comparisons were made in Unity with XOR, pole balancing, and acrobot.
XOR
Surprisingly, XOR was tested a variety of times with different weight powers, and nearly no difference was seen between NEAT with acceleration and standard NEAT, although standard NEAT often did better by an average of one to three generations. This is likely because XOR relies on a certain topography being found and isnβt dependent on finding super-optimal weights.
Pole Balancing
A modified implementation of double pole balancing was used, with one pole being half the length of the other (I had issues getting small poles to work with Unity physics) and varying force magnitudes.
The population is considered successful when a single agent keeps both poles up for 40 seconds.
NEAT with Acceleration took a mean of 33 generations and a median of 21 generations (stddev: 84) out of 500 runs.
Standard NEAT took a mean of 46 generations and a median of 29 generations (stddev: 58) out of 500 runs.
A variety of weight powers were tested, where NEAT with Acceleration consistently outperforms Standard NEAT, showing the improvements were not just due to stronger weight mutations.
Acrobot
A slightly modified implementation of acrobot (from openai gym) was used. No significant differences are present except the magnitude of the forces exerted.
The population is considered successful if an agent reaches an upright position within 10 seconds.
NEAT with Acceleration took a mean of 31 generations and a median of 30 generations (stddev: 17) out of 500 runs.
Standard NEAT took a mean of 41 generations and a median of 38 generations (stddev: 21) out of 500 runs.
As with pole balancing, a variety of weight mutation powers were used and NEAT with Acceleration consistently does better than Standard NEAT.
Conclusion
Although my Unity replications of the benchmarks were admittedly inaccurate, the data strongly suggests that NEAT with Acceleration leads to faster convergence across certain problems. This is likely because, with the acceleration corrections, the mutations are no longer completely random and are instead based on results obtained from the evaluation. Another interesting idea to explore is crossover between more than two genomes since the acceleration contains useful information about which weights should be mutated in which direction. However, I do not suggest NEAT with Acceleration to be used in competitive environments, as d_f tends towards zero, and it would require extensive modification of the fitness function. I would also not suggest using acceleration in very noisy environments, where the same genome can get an entirely different fitness due to luck, as the acceleration values may swing around and explode in value.
Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming aΒ sponsor.
Published via Towards AI