Today I talked at the hacking machine learning meetup in Munich. We looked at Alex Graves's classic RNN paper and what I took away from implementing the handwriting generation model in PyTorch. To me, the density of insights, the almost complete absence of mechanical boilerplate, and the relatively short training time make this a very worthwhile exercise that I can heartily recommend to anyone interested in RNNs. Thanks for the active participation; it was great to speak to and with such an engaged audience! My slides are available.
Here is the link to the paper by Alex Graves, Generating Sequences With Recurrent Neural Networks.
My source code is also available. It should run on PyTorch 0.4, even though it hasn't been ported to idiomatic PyTorch 0.4 yet. Model training takes about four hours on a GTX 1080; if you are in a hurry, you can also use my pretrained weights.
The thing I like about this project is that you can learn about
- The typical setup of RNNs for sequence generation with the difference between training mode and prediction mode.
- The value of having a probabilistic interpretation of your loss function.
- A really simple example of attention, which is also a great example of feeding "auxiliary" data to the RNN.
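The first point, the split between training and prediction mode, can be sketched without any framework at all. Below is a minimal pure-Python illustration; the `step` function is a toy stand-in for one RNN step (a real model would be a trained LSTM cell). During training the network always receives the ground-truth previous element (teacher forcing), while during generation it feeds its own output back in.

```python
def step(hidden, x):
    """Stand-in for one RNN step: returns (new_hidden, output).
    A toy deterministic rule; a real model would be a trained network."""
    new_hidden = (hidden + x) % 7
    output = (2 * new_hidden + 1) % 7
    return new_hidden, output

def train_mode(sequence):
    """Training: the input at each step is the ground-truth element."""
    hidden, outputs = 0, []
    for x in sequence:          # teacher forcing: always use the real data
        hidden, y = step(hidden, x)
        outputs.append(y)       # the loss compares y against the next element
    return outputs

def predict_mode(seed, length):
    """Generation: the input at each step is the previous output."""
    hidden, x, outputs = 0, seed, []
    for _ in range(length):     # feed the model's own output back in
        hidden, x = step(hidden, x)
        outputs.append(x)
    return outputs
```

The interesting consequence is that the two loops share the same `step`; only the source of the input changes, which is exactly how the handwriting model is structured.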
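For the second point: in the paper the loss is the negative log-likelihood of the pen trace under a mixture of bivariate Gaussians (plus a Bernoulli end-of-stroke bit) predicted by the network. Here is a sketch of the same idea in NumPy, reduced to a one-dimensional mixture for brevity; the function names are mine, not from the paper's code.

```python
import numpy as np

def mixture_nll(x, pi, mu, sigma):
    """Average negative log-likelihood of scalar targets x under a 1-D
    Gaussian mixture with weights pi, means mu, std devs sigma (shape K,).
    The handwriting loss is the same idea with bivariate Gaussians."""
    x = np.asarray(x, dtype=float)[:, None]                      # (N, 1)
    comp = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    density = (pi * comp).sum(axis=1)                            # (N,) mixture density
    return -np.log(density).mean()
```

Minimizing this quantity is what gives the loss its probabilistic interpretation: the network is maximizing the likelihood it assigns to the training strokes, and sampling from the predicted mixture is what makes generation work.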
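And for the third point: the attention in the paper is a soft Gaussian window over the character sequence, with parameters (alpha, beta, kappa) emitted by the RNN at every step. A NumPy sketch, assuming one-hot characters (in the real model kappa is predicted as an increment, so the window slides monotonically over the text):

```python
import numpy as np

def soft_window(alpha, beta, kappa, chars):
    """Graves-style Gaussian window over a character sequence.
    alpha, beta, kappa: (K,) window parameters emitted by the RNN.
    chars: (U, V) one-hot characters.
    Returns (window (V,), phi (U,)): a weighted blend of the characters."""
    u = np.arange(chars.shape[0])[:, None]                       # positions (U, 1)
    phi = (alpha * np.exp(-beta * (kappa - u) ** 2)).sum(axis=1) # weight per position
    return phi @ chars, phi                                      # blended character, weights
```

The blended character vector is then fed to the RNN as extra input alongside the pen coordinates, which is why this doubles as a nice example of passing "auxiliary" data into an RNN.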
In sharing my code, I hope that you actually tinker with it! Have fun.
Thank you to JetBrains for hosting me and meetup organizer Sergii Khomenko for taking the picture.