My workplace organised an AWS DeepRacer competition, and a few colleagues and I were keen to give it a go. We received a 30-minute walkthrough covering the basics and which track we were going to race on, and were then left to our own devices.
With AWS DeepRacer you train a virtual car to drive a lap by writing a reward function. You reward or punish certain actions, and over time the car should learn how to get around the track while collecting the most reward points, which hopefully corresponds to a fast lap. We didn't spend a whole lot of time researching what a great reward function might look like; in the end we used a function that added a few of the example functions together. It used three metrics.
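A reward function of that shape looks roughly like the sketch below. This is a reconstruction, not our exact competition code: the thresholds and penalty factor are illustrative, though the `params` keys (`track_width`, `distance_from_center`, `all_wheels_on_track`, `steering_angle`) are standard DeepRacer inputs.

```python
def reward_function(params):
    """Sketch of a reward function summing three metrics.
    Illustrative values only, not the tuning we actually raced with."""
    track_width = params['track_width']
    distance_from_center = params['distance_from_center']
    all_wheels_on_track = params['all_wheels_on_track']
    steering = abs(params['steering_angle'])  # degrees; sign doesn't matter

    reward = 1e-3  # small default so the reward is never exactly zero

    # Metric 1: stay close to the center line (tiered bonus)
    if distance_from_center <= 0.1 * track_width:
        reward += 1.0
    elif distance_from_center <= 0.25 * track_width:
        reward += 0.5
    elif distance_from_center <= 0.5 * track_width:
        reward += 0.1

    # Metric 2: keep all four wheels on the track
    if all_wheels_on_track:
        reward += 1.0

    # Metric 3: penalise sharp steering
    STEERING_THRESHOLD = 15.0  # assumed value
    if steering > STEERING_THRESHOLD:
        reward *= 0.8

    return float(reward)
```

The function is called once per simulation step, so the car accumulates reward for every step it spends near the center line with its wheels on the track.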
As you can see in the video, the car was not driving smoothly at all, so we adjusted metric 3 to punish steering heavily and trained for another hour. Even though the logs looked very promising, the real-world experience was far from great: while the large turn was very smooth, the sharp turn got completely ignored and the car would drive straight into the wall. A great example of why I wouldn't want to write mission-critical software where lives are at stake. Overall we had a great time and learned a bit more about machine learning. In the end we came fourth.
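The heavier steering punishment amounted to lowering the threshold and hardening the multiplier, roughly like this (assumed values, not our exact tuning):

```python
def steering_penalty(reward, steering_angle):
    """Harsher steering penalty: lower threshold, bigger cut.
    Both numbers are illustrative."""
    if abs(steering_angle) > 10.0:  # trigger sooner than before
        reward *= 0.5               # halve the reward instead of a mild trim
    return reward
```

In hindsight this is exactly the trap: the penalty discouraged the steering needed for the sharp turn just as much as the jitter we wanted to remove.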
What would I try next time? Apparently the default training parameters weren't particularly efficient, so I'd learn more about them and how to configure them for better results. I'd probably also remove the steering metric from the reward function: if trained well, the model should figure out on its own not to steer a lot.
George Timmermans, Research Toolmaker, Software Engineer and Tinkerer