Optimal transport
Mathematics
Books
- Villani, 2003: Topics in optimal transportation (doi)
- Villani, 2009: Optimal transport: Old and new (doi, pdf)
- Mathematically formidable and physically massive, but surprisingly readable
- My preferred reference for the theory
- Santambrogio, 2015: Optimal transport for applied mathematicians (doi)
- A fine book, but misnamed: it is mostly pure mathematics
- An exception is Chapter 6: Numerical methods
- Rachev & Rüschendorf, 1998: Mass transportation problems, Volume I: Theory
(doi) and Volume II: Applications (doi)
- The standard reference until the publication of Villani’s books
Topical surveys
- De Philippis & Figalli, 2014: The Monge-Ampère equation and its link to
optimal transportation (doi)
- Mentioned in a curious MathOverflow question about uses of higher-order derivatives
Unbalanced optimal transport
Classical optimal transport requires the transport plan to preserve both marginal measures, which in particular forces them to have equal total mass. Sometimes that is too much to ask. Unbalanced optimal transport relaxes the marginal constraints, so that measures of different mass can be compared.
- Chizat et al, 2018: Unbalanced optimal transport: Dynamic and Kantorovich formulations (doi, arxiv)
- Figalli, 2009: The optimal partial transport problem (doi, pdf)
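To make the relaxation concrete, the display below contrasts the balanced Kantorovich problem with one common unbalanced variant, in which the hard marginal constraints are replaced by Kullback-Leibler penalties. This is a sketch for orientation only: the penalty weight \lambda, the choice of KL as the divergence, and the notation P_1, P_2 for the coordinate projections are assumptions made here, and the papers above treat other formulations as well (for instance, partial transport fixes the amount of mass to be moved instead of penalizing the marginals).

```latex
% Balanced (Kantorovich) problem: both marginals of the plan are fixed.
\mathrm{OT}(\mu, \nu)
  = \inf_{\pi \in \Pi(\mu, \nu)} \int c(x, y) \, d\pi(x, y),
\qquad
\Pi(\mu, \nu)
  = \{ \pi \ge 0 : (P_1)_\# \pi = \mu, \ (P_2)_\# \pi = \nu \}.

% One unbalanced relaxation (assumed form): marginals are only penalized,
% so \mu and \nu may have different total mass.
\mathrm{UOT}_\lambda(\mu, \nu)
  = \inf_{\pi \ge 0} \int c(x, y) \, d\pi(x, y)
  + \lambda \, \mathrm{KL}\big( (P_1)_\# \pi \,\big|\, \mu \big)
  + \lambda \, \mathrm{KL}\big( (P_2)_\# \pi \,\big|\, \nu \big).
```

Formally, letting \lambda tend to infinity enforces the marginal constraints again and recovers the balanced problem when \mu and \nu have equal mass.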
Computation
Books and surveys
- Peyré & Cuturi, 2019: Computational optimal transport (doi, arxiv)
- The friendliest book on optimal transport, not just for computational issues
- Solomon, 2018: Optimal transport on discrete domains (arxiv, pdf)
- From the 2018 AMS Short Course on Discrete Differential Geometry (pdf)
- For a general audience: Solomon, 2017: Computational optimal transport (doi)
Fast computation via entropic regularization
Chapter 4 of Peyré & Cuturi’s book is a good overview.
- Cuturi, 2013: Sinkhorn distances: Lightspeed computation of optimal transport
(arxiv, pdf)
- The important paper that first applied the Sinkhorn-Knopp algorithm to solve regularized optimal transport
- Summarized in: Peyré & Cuturi, 2019, Sec. 4.2: Sinkhorn’s algorithm and its convergence
- Altschuler, Weed, Rigollet, 2017: Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration (arxiv, pdf)
- Blanchet et al, 2018: Towards optimal running times for optimal transport (arxiv)
- Lin, Ho, Jordan, 2019: On efficient optimal transport: An analysis of greedy
and accelerated mirror descent algorithms (arxiv)
- Convergence analysis of a greedy variant of the Sinkhorn algorithm, horribly named the “Greenkhorn algorithm”
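As a rough illustration of the iteration that the references above analyze, here is a minimal NumPy sketch of Sinkhorn's algorithm for entropically regularized transport between two histograms. The function name `sinkhorn`, the regularization strength `eps`, the fixed iteration count, and the toy data are all assumptions made for illustration, not code taken from any of the cited papers.

```python
import numpy as np


def sinkhorn(a, b, C, eps=0.25, n_iters=1000):
    """Approximate the entropic optimal transport plan between histograms.

    a, b : 1-D arrays of positive weights, each summing to 1.
    C    : cost matrix of shape (len(a), len(b)).
    eps  : entropic regularization strength (illustrative default).
    """
    K = np.exp(-C / eps)            # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)             # rescale rows to match marginal a
        v = b / (K.T @ u)           # rescale columns to match marginal b
    # The plan has the form diag(u) K diag(v).
    return u[:, None] * K * v[None, :]


# Toy problem: two histograms on a 5-point grid with squared-distance cost.
x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2
a = np.full(5, 0.2)
b = np.array([0.1, 0.1, 0.2, 0.3, 0.3])

P = sinkhorn(a, b, C)
transport_cost = float(np.sum(P * C))   # cost of the regularized plan
```

Each iteration costs two matrix-vector products, which is the source of the method's speed. A practical implementation would likely add a stopping criterion based on the marginal error, and work in the log domain when `eps` is small, since `exp(-C / eps)` can underflow.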
Statistical inference
What are the statistical properties of optimal transport between random measures, such as empirical distributions? Such questions are just starting to be answered:
- Panaretos & Zemel, 2019: Statistical aspects of Wasserstein distances (doi,
arxiv)
- Review paper, citing and belonging to the same series as: Wang, Chiou, Müller, 2016: Functional data analysis (doi)
- See especially Sec. 4: Optimal transport as the object of inference, which is mainly about Fréchet means in Wasserstein space
- Zemel & Panaretos, 2019: Fréchet means and Procrustes analysis in Wasserstein space (doi, arxiv)
- Peyré & Cuturi, 2019: Computational optimal transport, Sec. 9.4: Minimum Kantorovich estimators
- Bassetti, Bodini, Regazzini, 2006: On minimum Kantorovich distance estimators
(doi)
- Studies existence, measurability, and consistency of “estimators defined as minimizers of Kantorovich distances between statistical models and empirical distributions”
- Bassetti & Regazzini, 2006: Asymptotic properties and robustness of minimum dissimilarity estimators of location-scale parameters (doi)
- Bernton, Jacob, Gerber, Robert, 2019: On parameter estimation with the
Wasserstein distance (pdf, supplementary)
- Extends results of Bassetti et al, 2006 to misspecified models and non-i.i.d. data
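For intuition about the minimum Kantorovich estimators in the references above, the following sketch works in one dimension, where the Wasserstein distance between two equal-size empirical distributions reduces to comparing sorted samples. It then fits a location parameter by minimizing that distance over a grid. Everything here is a toy assumption: the helper `wasserstein_1d`, the Gaussian data, the grid search, and the use of simulated model samples are illustrative choices, not the procedure of any cited paper.

```python
import numpy as np


def wasserstein_1d(x, y, p=2):
    """p-Wasserstein distance between two equal-size samples on the real line.

    In 1-D the optimal coupling pairs the i-th smallest point of x with
    the i-th smallest point of y, so sorting suffices.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes samples of equal size")
    return float(np.mean(np.abs(x - y) ** p) ** (1.0 / p))


# Toy data: observations from a shifted Gaussian (true shift 2.0, assumed).
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=500)

# Samples from the model at theta = 0; the model at theta is `base + theta`.
base = rng.normal(loc=0.0, scale=1.0, size=500)

# Minimum Wasserstein estimate of the location by grid search.
thetas = np.linspace(0.0, 4.0, 401)
losses = [wasserstein_1d(data, base + t) for t in thetas]
theta_hat = float(thetas[int(np.argmin(losses))])
```

For this particular location model with p = 2, the objective is quadratic in theta and its exact minimizer is the difference of the sample means, so the grid search should land within one grid step of `data.mean() - base.mean()`. Other models generally need a numerical optimizer.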
Philippe Rigollet and his students are doing much interesting work on the statistics of optimal transport:
- Rigollet & Weed, 2018: Entropic optimal transport is maximum-likelihood deconvolution (doi, arxiv)
- Rigollet & Weed, 2019: Uncoupled isotonic regression via minimum Wasserstein deconvolution (doi, arxiv)
- Forrow et al, 2018: Statistical optimal transport via factored couplings
(arxiv)
- Previously called: “Statistical optimal transport via geodesic hubs”
Other applications in ML and statistics include:
- Bonneel, Peyré, Cuturi, 2016: Wasserstein barycentric coordinates: Histogram regression using optimal transport (doi, online)
- Genevay, Peyré, Cuturi, 2017: GAN and VAE from an optimal transport point of view (arxiv)
- Schmitz et al, 2018: Wasserstein dictionary learning: Optimal transport-based unsupervised nonlinear dictionary learning (doi, arxiv)
- Bernton, Jacob, Gerber, Robert, 2019: Approximate Bayesian computation with the Wasserstein distance (doi, arxiv)