I aim to create a tool that applies a Reinforcement Learning (RL) bot to any given Unity application. The bot will then try to maximize the game's built-in score, which is the only requirement for the tool to work. Functionality-wise, I see no major obstacles in the tool's development – there are a few things to clear up design-wise, though.
In this blog, I will go over some thoughts concerning the design of this (still) unnamed RL tool for Unity. This post marks the beginning of (another) new blog series, which I will dedicate to the tool's design.
Who could make use of this tool – and how?
On the one hand (the productivity-driven side), game designers and developers may use the tool to test their application's functionality. The AI would learn to act in the provided environment, fully exploiting every weakness it can find. The best-scoring runs could be analyzed later, leading to concrete points of optimization. Alternatively, the AI could also demonstrate why certain features just would not work, avoiding frustrating moments in future usability tests.
On the other hand (the entertainment-driven side), basically everyone might use the tool to see how an AI would perform in a certain game. The MarI/O project shows that people are interested in watching an AI play games they like to play – it highlights the contrast between an AI that exploits everything it can to reach the goal and our approach of reaching the goal while having as much fun as possible. Adding AI to this entertainment aspect opens up even more options – for example, users could try to race the AI.
Can this tool replace usability tests?
Although one aspect of this tool is to automate usability tests, this is not ideal: usability tests involving real people should not be replaced by AI. Since designers base their designs on people, not robots, it is definitely a good idea to test important features with real people. However, AI can be used for testing not-so-important features. Assuming developers implement several features per day, AI could test the features' integrity within the overall system overnight, which may save time when it comes to debugging or bugfixing (since developers might get clues where to look).
Flexibility issues
The RL bot will try to maximize a desired "score" of the application, which (taking Super Mario as a reference) could be collected coins, completion time, or the number of perfectly timed jumps. There is a problem with that: it means the user of the tool needs to make sure that the "score" increases only when the desired actions were performed. In other words, the Unity application needs to be "prepared" to be tested with the tool, which makes it rather inflexible.
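To make this concrete, here is a minimal sketch of how the tool might turn a game's cumulative score into a per-step reward for the RL bot. Everything here is an assumption for illustration – the class names, the `get_score`/`step` interface, and the toy game are all hypothetical stand-ins, not the actual tool or Unity API:

```python
# Hypothetical sketch: wrapping a game's built-in score as an RL reward signal.
# ToyGame stands in for a Unity application that exposes a score; the real
# interface to Unity is not designed yet.

class ToyGame:
    """Stand-in game: the score increases only for the 'collect' action."""
    def __init__(self):
        self.score = 0

    def reset(self):
        self.score = 0
        return 0  # dummy observation

    def get_score(self):
        return self.score

    def step(self, action):
        if action == "collect":
            self.score += 10
        return 0, False  # dummy observation, episode not done


class ScoreRewardWrapper:
    """Converts a cumulative score into per-step rewards (score deltas)."""
    def __init__(self, game):
        self.game = game
        self.last_score = 0

    def reset(self):
        obs = self.game.reset()
        self.last_score = self.game.get_score()
        return obs

    def step(self, action):
        obs, done = self.game.step(action)
        score = self.game.get_score()
        reward = score - self.last_score  # reward only when the score rises
        self.last_score = score
        return obs, reward, done


env = ScoreRewardWrapper(ToyGame())
env.reset()
_, r1, _ = env.step("collect")  # score rose by 10, so reward is 10
_, r2, _ = env.step("wait")     # score unchanged, so reward is 0
```

The sketch also shows exactly where the flexibility problem lives: the wrapper can only reward what `get_score` reports, so any behavior the user wants the bot to learn must already be reflected in the game's own scoring.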
A developer might not need the flexibility of being able to test individual approaches to completing a level. Game designers, however, would at least be curious about these things – as would everyone who uses the tool for entertainment. If users want to test a certain approach, they need to adapt their score system, which gets annoying very quickly.
In conclusion, we can define a few need statements:
Game developers need a way to ease their debugging processes. At the moment, they find it difficult to locate the source of unexpected glitches and bugs.
Game designers need a way to check whether a new feature would fit into a work-in-progress project. At the moment, they fear that the feature might influence the overall flow in a bad way.
Users need a way to compare their experience with a game to that of an AI. At the moment, they are curious about how an AI would perform – and might improve their own skills by learning from it.
Compared to the "Lost Chapters" series, this series will be short.
There are "just" a few design steps ahead, including…
- a stakeholder map based on the target groups mentioned today,
- the choice of primary and secondary personas,
- a project rundown that gives a general overview of the tool,
- user journey mapping that also presents the tool as a minimum viable product,
- a risk grid that tries to point out the tool's most important issues (like the flexibility issue),
- and a visualization – meaning a wireframe mock or pen-and-paper prototype.
I aim to finish the series within the next two blog posts.