Oct. 27, 2018
Recently, I discussed the use of PyTorch on Mobile / IoT-like devices. Naturally, the Caffe2 Android tutorial was a starting point. Getting it to work with Caffe2 from PyTorch and a recent Android wasn't trivial, though. Apparently, other people have not had much luck either: I easily got a dozen questions about it on the first day after mentioning it in a discussion.
This should be easier. Here is how.
July 28, 2018
The beauty of PyTorch is that it makes its magic so conveniently accessible from Python. But how does it do so? We take a peek inside the gears that make PyTorch tick.
(Note that this is a work in progress. I'd be happy to hear your suggestions for additions or corrections.)
June 26, 2018
Today I gave a talk on Alex Graves's classic RNN paper and what I took away from implementing the handwriting generation model in PyTorch. To me, the density of insights, combined with the almost complete absence of mechanical bits and the relatively short training time, makes this a very worthwhile exercise that I can heartily recommend to anyone interested in RNNs.
June 15, 2018
The beautiful thing about PyTorch's immediate execution model is that you can actually debug your programs.
Sometimes, however, the asynchronous nature of CUDA execution makes that hard. Here is a little trick to debug your programs.
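As a rough illustration of the problem, here is one commonly used approach, which is not necessarily the trick the post describes: setting the `CUDA_LAUNCH_BLOCKING` environment variable makes kernel launches synchronous, so an error surfaces at the call that caused it instead of at some later, unrelated line. The tensors and names below are made up for the example, and the sketch falls back to the CPU when no GPU is available.

```python
import os

# Assumption: this has to be set before CUDA is initialized,
# so we do it before importing torch.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.arange(4.0, device=device)
idx = torch.tensor([0, 3], device=device)  # an out-of-range index here would raise at the next line
picked = x[idx]

# Without the environment variable, an explicit synchronize is another way
# to force pending kernels (and their errors) to complete at a known point.
if device.type == "cuda":
    torch.cuda.synchronize()

print(picked.cpu())
```

Synchronous launches slow things down, so this is something to switch on while hunting a bug rather than to leave enabled.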
June 4, 2018
At the excellent fast.ai course and website, they are training a language model zoo.
It's a charming idea, and here are the (not quite complete yet) code and model I got for German.
Oct. 29, 2017
The other day I was asked how to do a wavelet transformation in PyTorch in a way that allows computing gradients (that is, gradients of the outputs w.r.t. the inputs, probably not the coefficients). I like PyTorch and I happen to have a certain fancy for wavelets as well, so here we go.
May 29, 2017
This is a follow-up to my post on improved and semi-improved training of Wasserstein GANs. A few days ago, Kodali et al. published How to Train Your DRAGAN. They introduce an algorithmic game theory approach and propose to apply the gradient penalty only close to the real-data manifold. We take a look at their objective function, offer a new possible interpretation, and also consider what might be wrong with the Improved Training objective.
While doing so we introduce PRODGAN and SLOGAN.
April 13, 2017
We look at Improved Training of Wasserstein GANs and describe some geometric intuition on how it improves over the original Wasserstein GAN article.
Updated: We also introduce Semi-Improved Training of Wasserstein GANs, a variant that is simpler to implement as it does not need second derivatives.
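For readers who have not seen it, the gradient penalty from Improved Training of Wasserstein GANs can be sketched as follows. This is a minimal version assuming flat (batch, features) inputs; the function name `gradient_penalty` and the toy linear critic are made up for the example. The `create_graph=True` argument is where second derivatives enter: the penalty itself must be differentiable so that it can be minimized w.r.t. the critic's parameters.

```python
import torch


def gradient_penalty(critic, real, fake):
    """Penalize deviations of the critic's gradient norm from 1 at random interpolates."""
    eps = torch.rand(real.size(0), 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).detach().requires_grad_(True)
    scores = critic(interp)
    # create_graph=True keeps the graph of the gradient computation,
    # so the penalty can be backpropagated through (double backward).
    grads, = torch.autograd.grad(scores.sum(), interp, create_graph=True)
    return ((grads.norm(2, dim=1) - 1) ** 2).mean()


# Toy critic: a linear map whose gradient w.r.t. the input is w everywhere, with norm 5.
w = torch.tensor([3.0, 0.0, 4.0], requires_grad=True)
critic = lambda x: x @ w

real = torch.randn(4, 3)
fake = torch.randn(4, 3)
penalty = gradient_penalty(critic, real, fake)
penalty.backward()  # gradient of the penalty w.r.t. w, via second derivatives
print(penalty.item())
```

In training, this term is scaled by a penalty weight and added to the critic loss.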