abstract: Generative adversarial training is often formulated as a $1$- or $2$-Wasserstein estimation problem, through the so-called Kantorovich formulation of Optimal Transport. In practice, the Kantorovich problem is approximated via neural networks (discriminators) and the dual functional is estimated from samples. Many heuristics have been developed to enforce the duality constraints or the Lipschitzianity of the Kantorovich potentials, which leads to different sources of error.
I will discuss mathematical aspects of the problem of estimating the Wasserstein distance via discriminators and show numerical results comparing several methods from the literature.