Neural network is a computer modell based on human brain with with each output of one layer playing the input of the next layer. Normally the parameters of the modell can be optimized with training data and a cost function by the classical backpropagation. For each batch of training data the cost function will be evaluated and the internal parameters would be adjusted by calculating the gradients of the parameters in dependent of the cost function’s output and moving into the negative of the gradients.
This kind of optimization is however not impossible for training a network to play a video game like Flappy Bird. Assuming the cost function is a summation of how far a bird can move, the gradients of the internal parameters can not be infered by calculus methods. Therefore another approach was designed to look for the optimal parameters which make the computer modell plays the game the best.
The challenge of this project is to implement an environment where the network can obtain its inputs from. Since the neural network can not be optimized with proper backprogattion, genetic algorithms were used to select the fittest network and reproduce with a low mutation probability. This method sounds at first the same as the brute-force approache Trial And Error. However, in genetic algorithm, the best neural network of one generation will be kept (elitsm) and will stay in the gene pool of the next generation. The result of the algorithm therefore will produce guaranteed the same of even better results in the next generation.
The inputs of a neural network are
- Bird’s position
- Next pipes' positions
- Distances the the next pipes
- Bird’s current velocity
For evolution purpose, the network’s internal parameters will be flatten into one dimensional byte array to ease bit flipping operations.
The evolution algorithm itself works like following:
- Create 20 random initialized neural network.
- Let each network plays the game as long as it can. Of course some of the network would be able to beat the game and play it forever, so there must be a threshold where to game would stop, regardless of how far a bird can fly.
- Evaluate each network’s fitness.
- Choose N fittest of the population and add them to the next generation.
- Randomly choose two network, mutate them and mate them together. Fitter networks will have more chance of mating.
- Repeat the process until there is a sastified result.
The implementation itself was done in Java Swing and Kotlin. The implementation can be found at the following GitHub link
With no hidden layer, the network struggles to learn and the search optimization does not help that much. But as the GIF shows, the fitness of the population’s best neural network improves after each generation and the behavior of the bird is far different from random decision.
Already with one hidden layer, the agent is already able to beat the game after a few generations. With more layers, the agent can play the game perfectly. The result is however quite boring when the bird never dies.