Reinforcement Learning – Tool Dev – To Do

New blog series starting today; the final arc of my work: tool development.

In this diary-style post, I summarize the progress made so far and put together a small to-do list for myself.

The ultimate goal

The practical part of my work focuses on developing an API that sits between an AI and a Unity game, as visualized in Figure 1. The game provides data to the API, which passes it on to the AI. The AI uses that data to choose an appropriate action and communicates it to the API, which forwards it to the game as input. To make that work, some configuration is required, and that is the user’s job: it mainly covers information about the game and its inputs. When the AI stops, the user should receive results that point to possible difficulty issues or bad practices that may develop within the game.
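To make the data flow above a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Game` interface, the `CounterGame` stand-in and the `relay` function are hypothetical names, and the real tool would read from and write to a Unity window instead of a toy object.

```python
from typing import Callable, List, Protocol


class Game(Protocol):
    """Hypothetical game-side interface (not a Unity API)."""

    def observe(self) -> List[float]: ...

    def apply(self, action: str) -> None: ...


class CounterGame:
    """Toy stand-in for a Unity game: the state is one number, inputs move it."""

    def __init__(self) -> None:
        self.position = 0
        self.inputs: List[str] = []

    def observe(self) -> List[float]:
        return [float(self.position)]

    def apply(self, action: str) -> None:
        self.inputs.append(action)
        self.position += 1 if action == "right" else -1


def relay(game: Game, agent: Callable[[List[float]], str], steps: int) -> None:
    """The 'API' in the middle: game data goes to the AI, the AI's action goes back."""
    for _ in range(steps):
        observation = game.observe()   # game -> API
        action = agent(observation)    # API -> AI -> API
        game.apply(action)             # API -> game


def toy_agent(observation: List[float]) -> str:
    """Placeholder for the AI: walk right until position 3, then step back."""
    return "right" if observation[0] < 3 else "left"


game = CounterGame()
relay(game, toy_agent, steps=5)
print(game.inputs)
```

The point of the sketch is only the shape of the loop: the API never decides anything itself, it just moves observations one way and inputs the other.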

Figure 1 – Concept Visualization

 

The approach

My approach to this goal is to develop a wizard-like tool that guides users through the configuration process, displays the current AI progress in a console and finally, provides statistics about the overall session, as shown in Figure 2.

Figure 2 – Design of the RL Tool

 

The first step is to set the project settings: the company name, which can be changed in Unity’s project settings (the default is “DefaultCompany”), and the name of the project. A possible addition could be the name of the “score” PlayerPrefs key, which is a must-have for this tool. Together, these settings tell the tool where to look for the AI’s performance.
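As a rough sketch of why these settings matter: on Windows, a built Unity player keeps its PlayerPrefs in the registry under HKEY_CURRENT_USER\Software\[company name]\[product name]. The Python below builds that path and tries to read the score. The function names and the product name “MyGame” are hypothetical, and the assumption that Unity appends a “_h” plus a hash to each value name comes from looking at the registry, not from a documented guarantee, so treat the matching logic as a starting point only.

```python
import sys
from typing import Iterable, Optional


def playerprefs_subkey(company: str, product: str) -> str:
    """Registry subkey (below HKEY_CURRENT_USER) for a built Windows player."""
    return "\\".join(["Software", company, product])


def find_pref_name(value_names: Iterable[str], key: str) -> Optional[str]:
    """Find the registry value that belongs to a PlayerPrefs key.

    Assumption: the value is named either exactly like the key or like
    '<key>_h<hash>'; verify this against the actual registry entries.
    """
    prefix = key + "_h"
    for name in value_names:
        if name == key or name.startswith(prefix):
            return name
    return None


def read_score(company: str, product: str, key: str = "score"):
    """Return the stored value, or None if it is missing or we are not on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # only available on Windows

    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER,
                            playerprefs_subkey(company, product)) as handle:
            value_count = winreg.QueryInfoKey(handle)[1]
            names = [winreg.EnumValue(handle, i)[0] for i in range(value_count)]
            name = find_pref_name(names, key)
            return None if name is None else winreg.QueryValueEx(handle, name)[0]
    except OSError:
        return None


print(playerprefs_subkey("DefaultCompany", "MyGame"))
```

Games started from inside the Unity editor store their PlayerPrefs under a different subkey, so the tool would probably need a switch for that case.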

The AI will focus on mastering the controls of the game. Therefore, the second step is to set all possible inputs, according to Unity’s Input Manager. It might make sense to offer some presets for this step to ease the process.

The third step is to select the game window, which displays the Unity game. The idea is that the API forwards every pixel of that window as input to the AI, which analyzes the resulting image to decide on an action. The only requirement for the user is to open the game, preferably in windowed mode. Development-wise, this will be the toughest nut to crack.

After these three steps, the testing/playing/training process may be started. During this process, a console will display the latest results of the AI. The process runs until the user cancels it or the AI finishes the game so thoroughly that it crashes. It might make sense to add a time limit in the third step, to improve the process…
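The stopping conditions described above (user cancels, game ends, optional time limit) could be sketched roughly like this in Python. All names here are made up for illustration: `run_session`, `step`, `should_cancel` and `report_every` are not part of any existing tool, and `step` stands in for “let the AI act once and read the current score”.

```python
import time
from typing import Callable, List, Optional


def run_session(
    step: Callable[[], Optional[float]],
    time_limit_s: float,
    should_cancel: Callable[[], bool] = lambda: False,
    report_every: int = 10,
    clock: Callable[[], float] = time.monotonic,
) -> List[float]:
    """Run until the time limit is hit, the user cancels, or the game ends.

    `step` returns the latest score, or None once the game is over.
    """
    scores: List[float] = []
    start = clock()
    while clock() - start < time_limit_s and not should_cancel():
        score = step()
        if score is None:  # game finished (or crashed)
            break
        scores.append(score)
        if len(scores) % report_every == 0:
            # The "console" from the design: print the latest result.
            print(f"step {len(scores)}: latest score {score}")
    return scores


# Usage with a fake three-step game that then ends.
episode = iter([10.0, 12.0, 15.0, None])
scores = run_session(lambda: next(episode), time_limit_s=60.0, report_every=1)
print(scores)
```

Passing the clock in as a parameter is a deliberate choice: it lets the time limit be tested without actually waiting.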

After the process is finished, results will be shown. Ideally, the user can pinpoint where the AI struggled. It might make sense to use more debugging PlayerPrefs for this task, which would be added to the project settings in the first step… Anyway, the user should also be able to save the results in the form of a PDF file.

 

Challenges

Thinking through this process really shows how many gaps there still are. However, these will surely be filled along the way. So, after all that thinking – where should I start?

First, I need to find a fitting working environment – a tool to develop a tool. I will start researching appropriate options next week.

Second, I will tackle the biggest challenge in this tool – providing the game window as input to the AI. This will probably require research in the fields of computer graphics and operating systems.

Third, the second biggest challenge – simulating inputs without destroying the whole system. Taking over the computer’s keyboard may be required for this tool to work, and, as I have already experienced once, this might cause hardware malfunctions or at least severe trust issues.

Finally, the remaining tasks, which look like a cakewalk compared to the ones mentioned above:

  • Communicating PlayerPrefs from the game (which are stored in the registry) to the AI.
  • Implementing a console that iteratively prints status information.
  • Calculating results and creating an appropriate chart.
  • Implementing an option to export that chart as PDF.
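For the “calculating results” item, one possible heuristic for pinpointing where the AI struggled is to look for the longest stretch without a new best score. The Python sketch below does only that plus a few basic statistics. The function name `summarize` and the stall heuristic are my own assumptions, not a settled design, and charting and PDF export are left out on purpose.

```python
from statistics import mean
from typing import Dict, List


def summarize(scores: List[float]) -> Dict[str, object]:
    """Basic session statistics plus the longest stretch without a new best score."""
    if not scores:
        raise ValueError("no scores recorded")
    best = scores[0]
    stall_start = 0          # index at which the current best was reached
    longest = (0, 0)         # (start index, end index) of the longest stall
    for i, score in enumerate(scores):
        if score > best:
            best = score
            stall_start = i
        elif i - stall_start > longest[1] - longest[0]:
            longest = (stall_start, i)
    return {
        "steps": len(scores),
        "best": best,
        "mean": mean(scores),
        "longest_stall": longest,
    }


report = summarize([0, 1, 1, 1, 2, 2, 5])
print(report)
```

In this example the score sits at 1 from index 1 to index 3, so that range is reported as the longest stall. With extra debugging PlayerPrefs, the same idea could be applied per level instead of per step.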

 

…Seems manageable.
