Reinforcement Learning – The cart pole game, Part 1

[Image: The cart pole game]

This and the next few blogs are about TensorFlow 2.0. I will work through a fairly recent example – the cart pole game – and try to understand the advanced methods it uses along the way.

In this blog, TensorFlow will be upgraded, some key terms will be explained, and the example will be introduced.
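To give a first taste of the setup, here is a minimal sketch that runs the cart pole game with purely random actions. It assumes OpenAI Gym's "CartPole-v1" environment and the pre-0.26 Gym API – that exact setup is an assumption on my part, not necessarily what the tutorial uses.

```python
# Minimal sketch: play the cart pole game with random actions.
# Assumes OpenAI Gym's "CartPole-v1" environment and the pre-0.26 Gym API
# (an assumption, not necessarily the tutorial's exact setup).
import gym

env = gym.make("CartPole-v1")
state = env.reset()  # state = (cart position, cart velocity, pole angle, pole tip velocity)
done = False
total_reward = 0
while not done:
    action = env.action_space.sample()         # random action: 0 = push left, 1 = push right
    state, reward, done, info = env.step(action)
    total_reward += reward                     # +1 for every step the pole stays upright
env.close()
print("Episode reward:", total_reward)
```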

Reinforcement Learning – The mountain car game, Part 2

I did some experiments with this small example game. My goal was to find effective AI settings: settings that would reliably lead to quick success. Last time, the AI managed to beat the game after ~180 attempts – those settings were the starting point of a long testing session.

In this blog, I will go over the results of the experiments.
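To give an idea of what "AI settings" means here, below is a hypothetical set of hyperparameters of the kind such experiments tune. The names and values are illustrative assumptions, not the actual settings from my test runs.

```python
# Hypothetical hyperparameters of the kind tuned in these experiments;
# names and values are illustrative, not the actual settings from the blog.
settings = {
    "learning_rate": 0.001,    # optimizer step size
    "discount_factor": 0.97,   # gamma: how strongly future rewards count
    "epsilon_start": 1.0,      # initial exploration rate
    "epsilon_min": 0.01,       # exploration floor
    "epsilon_decay": 0.995,    # multiplicative decay per episode
    "max_episodes": 500,       # cap on training attempts
}
```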

Reinforcement Learning – The mountain car game, Part 1

I followed a TensorFlow tutorial that puts the paradigms mentioned a few blogs prior (Q-Learning, epsilon-greedy policies) into practice. I implemented an AI that learns to play the "mountain car game", which is about getting a small car onto the peak of a mountain. The car does this by swinging left and right, gaining momentum and using it to reach the top. To increase the difficulty a bit, there is a time constraint of three seconds.

This blog is about the setup and the initial results.
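For reference, here is a minimal sketch of such a setup, assuming Gym's "MountainCar-v0" environment and its pre-0.26 API – the tutorial's exact setup may differ.

```python
# Minimal sketch of the mountain car setup, assuming Gym's "MountainCar-v0"
# environment and the pre-0.26 API (the tutorial's exact setup may differ).
import gym

env = gym.make("MountainCar-v0")
state = env.reset()  # state = (car position, car velocity)
for step in range(200):  # the time constraint maps to a cap on steps
    action = env.action_space.sample()  # 0 = push left, 1 = no push, 2 = push right
    state, reward, done, info = env.step(action)  # reward is -1 per step until the peak
    if done:
        break
env.close()
```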

Reinforcement Learning – Starting to get started

So far in this project, no programming was needed. For the next step, which is to take a closer look at TensorFlow, a simple Notepad will not be sufficient. That is why this blog is about one of the first things to do when it comes down to development: choosing the right Integrated Development Environment (aka IDE), installing the best choice and getting TensorFlow running.

To be precise, this blog is about possible IDE options and what to keep in mind when installing TensorFlow.
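As a quick sanity check after the installation, the following lines verify that TensorFlow can actually be imported – a generic check, not a step from any specific installation guide.

```python
# Quick sanity check after installing TensorFlow (e.g. via "pip install tensorflow");
# a generic verification, not a step from a specific installation guide.
import tensorflow as tf

print(tf.__version__)  # should print the installed version without errors
```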

Reinforcement Learning – Advanced practices

Last week I gave a quick introduction to Q-Learning. This week I want to follow up on that topic by taking a closer look at more advanced development practices than those used in MarI/O.

This blog covers three useful practices in the field of reinforcement learning. The sources mentioned at the end provide code examples that will be useful for future experiments.
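As one example of the kind of practice meant here, below is a minimal sketch of an experience replay buffer, a widely used technique in reinforcement learning. Whether it is among the three practices covered in the post is my assumption; the code itself is only an illustration.

```python
# Minimal sketch of an experience replay buffer, a common advanced practice
# in reinforcement learning (not necessarily one of the three from this post).
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10000):
        self.memory = deque(maxlen=capacity)  # oldest experiences drop out automatically

    def push(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Training on a random batch of past experiences breaks up the
        # correlations between consecutive game steps.
        return random.sample(self.memory, batch_size)
```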

Reinforcement Learning – Introduction to Q-Learning

So far, the AI learned to play a game by interacting with its environment and maximizing a desired reward. Practically, the AI just repeatedly played the game, getting better with each iteration. To be successful, it needs to act upon a policy. There are several approaches to this, such as the so-called "Deep Q network" or the "epsilon-greedy policy". I will focus on the former for one main reason: it is compatible with TensorFlow, which is a Python library I wanted to take a closer look at anyway.

This blog serves as an introduction to the paradigm of Q-Learning.
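To make the two terms mentioned above concrete, here is a small sketch of epsilon-greedy action selection and the tabular Q-Learning update; a Deep Q network replaces the table with a neural network. This is a generic illustration of the paradigm, not code from the blog.

```python
# Sketch of epsilon-greedy action selection and the tabular Q-Learning update;
# a generic illustration of the paradigm, not code from this blog.
import random
import numpy as np

def choose_action(q_table, state, epsilon, n_actions):
    """Epsilon-greedy policy: explore with probability epsilon, else exploit."""
    if random.random() < epsilon:
        return random.randrange(n_actions)  # explore: pick a random action
    return int(np.argmax(q_table[state]))   # exploit: pick the best known action

def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Classic update rule: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    best_next = np.max(q_table[next_state])
    q_table[state, action] += alpha * (reward + gamma * best_next - q_table[state, action])
```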

Reinforcement Learning – Another Perspective, QnA

The AI learns. When playing Mario, at first, the AI tries to do nothing. It will see that doing nothing is not how the game is supposed to be played – and therefore it will try to move. Eventually, it will run into a wall and see that simply moving is not enough. After several attempts, it will jump over the wall, most likely by accident. It will soon learn that jumping is very important to reach the goal – and thus it will jump more often. At some point, it will jump into an enemy and learn that enemies have to be avoided. Step by step, the AI will make more and more progress until it eventually reaches the goal.

However, one important question arises here: how is this useful for us? I want to use this blog to look at my topic from a less technical standpoint. I want to answer these kinds of questions that might otherwise remain unclear – thus, this blog follows a Q&A approach.