In this blogpost, we explore how Wasserstein Generative Adversarial Nets (WGAN) improve upon the minimax game / objective of Generative Adversarial Nets (GAN) to stabilize training and make the value of the game correlate better with the performance of the generator. We first derive the divergence between the real data distribution and the generated one that GANs minimize. Then, we discuss how this divergence is sub-optimal for the optimization of neural networks and introduce the Wasserstein distance, proving that it has better properties w.r.t. neural net optimization. Thereafter, we prove that the Wasserstein distance, although intractable, can be approximated and indeed back-propagated to the generator. We assume the reader is familiar with basic principles of machine learning, like neural networks and gradient descent, simple probabilistic concepts like, probability density functions , the basic formulation of GANs and Lipschitz functions . GANs minimize the Jensen-Shannon di
The goal of this blogpost is to demonstrate how the formulation of the Variational Autoencoder (VAE) translates to empirical observations using the MNIST dataset. First, we examine how VAEs handle the tasks that their formulation dictates, i.e. reconstruction of their input and generation of samples using the decoder. Then, we study the output distribution of the encoder in the latent space. Last, we use our observations of the latent space and strategically choose the latent variable to generate examples in order to see how the latent space affects the pixel space and the final image. We assume the reader is familiar with VAEs and the how the formulation is derived and interpreted. If not, we have an in-depth study on the matter. Run in Google Colab Open in GitHub Download IPython notebook Run in Google Colab Open in GitHub Download IPython notebook Let's start by examining the loss function of the VAE: \[ \begin{equat