UnbalancedOptimalTransport

This package provides an MIT license, dependency-free implementation of Algorithm 1 of "Sinkhorn Divergences for Unbalanced Optimal Transport" [SFVTP19] in unbalanced_sinkhorn!, in a generic and extensible way. This is used to compute Sinkhorn divergences via sinkhorn_divergence!.

See Optimal transport for some background and a mathematical description of the quantities computed by this package, Public API for a description of the functions provided, and below for a quick tutorial.

While the code is generic, it is not currently written to dispatch to BLAS or non-scalar GPU operations, although such contributions would be welcomed.

Quick tutorial

julia> using UnbalancedOptimalTransport, Plots

julia> X = 1:4; # a set

julia> a_weights = [0.5, 1.0, 1.0, 0.5]; # weights on X

julia> Y = 3:5; # another set

julia> b_weights = [0.5, 0.75, 0.5]; # weights on Y

julia> plot(bar(X, a_weights, label="a"), bar(Y, b_weights, label="b"),
               xlims = (0, 6), ylims=(0, 1.2), legend=:topleft);

We wish to move the a histogram to the b histogram with the least total cost, however we will clearly need to remove some mass as well. We choose the KL divergence to penalize mass destruction.

julia> a = DiscreteMeasure(a_weights, X);

julia> b = DiscreteMeasure(b_weights, Y);

julia> cost = (x, y) -> abs(x - y)
#1 (generic function with 1 method)

julia> ϵ = 0.01 # small regularization
0.01

julia> D = UnbalancedOptimalTransport.KL(1.0)
UnbalancedOptimalTransport.KL{1.0}()

julia> SD = sinkhorn_divergence!(D, cost, a, b, ϵ)
1.689583294114332

The number SD provides us with a distance between the a and b histograms, as computed by the (unbalanced) Sinkhorn divergence. We can also compute the "optimal coupling" which shows us how to move between the histograms.

julia> π = optimal_coupling!(D, cost, a, b, ϵ)
4×3 Matrix{Float64}:
 0.0406497    9.57016e-9  2.37046e-9
 0.218817     5.15162e-8  1.27602e-8
 0.588947     1.38656e-7  3.43441e-8
 3.21737e-81  0.547343    0.135573

Here, π[x,y] represents how much mass we should move from x to y. Since the sum of the rows is less than the corresponding entries of a_weight, some of the mass is destroyed. To understand this better, let us lower the penalty for mass destruction and mass creation to almost nothing, and see how the optimal coupling changes.

julia> D = UnbalancedOptimalTransport.KL(0.01)
UnbalancedOptimalTransport.KL{0.01}()

julia> π = optimal_coupling!(D, cost, a, b, ϵ)
4×3 Matrix{Float64}:
 1.17188e-44  1.18819e-87  1.52781e-109
 1.21517e-22  1.23208e-65  1.58425e-87
 0.630032     6.38797e-44  8.21386e-66
 7.09839e-45  0.520064     6.68715e-23

We can see that the only non-tiny entries of π are the (3,1) and (4,2), corresponding to (x=3, y=3), as the first element of $Y$ is $3$. We see then with this choice of divergence and cost, we don't really move any mass, and just create and destroy as needed. On the other hand, let us see what happens when there is a high penalty for mass creation and destruction:

julia> D = UnbalancedOptimalTransport.KL(1000.0)
UnbalancedOptimalTransport.KL{1000.0}()

julia> sinkhorn_divergence!(D, cost, a, b, ϵ)
┌ Warning: Maximum iterations (100000) reached
│   max_residual = 0.003752393989088887
└ @ UnbalancedOptimalTransport ~/work/UnbalancedOptimalTransport.jl/UnbalancedOptimalTransport.jl/src/sinkhorn.jl:122
┌ Warning: Maximum iterations (100000) reached
│   max_residual = 0.00297370382702411
└ @ UnbalancedOptimalTransport ~/work/UnbalancedOptimalTransport.jl/UnbalancedOptimalTransport.jl/src/sinkhorn.jl:122
┌ Warning: Maximum iterations (100000) reached
│   max_residual = 0.004481400153572679
└ @ UnbalancedOptimalTransport ~/work/UnbalancedOptimalTransport.jl/UnbalancedOptimalTransport.jl/src/sinkhorn.jl:122
432.7413767888393

julia> π = optimal_coupling!(D, cost, a, b, ϵ)
┌ Warning: Maximum iterations (100000) reached
│   max_residual = 0.004481400153572679
└ @ UnbalancedOptimalTransport ~/work/UnbalancedOptimalTransport.jl/UnbalancedOptimalTransport.jl/src/sinkhorn.jl:122
4×3 Matrix{Float64}:
 0.163659     0.187912  0.125149
 0.327646     0.3762    0.250549
 0.327974     0.376576  0.2508
 3.45923e-88  0.287005  0.191146

We see warnings about the maximum number of iterations being exceeded, so let's increase that parameter and try again. Note that warnings can be disabled by passing warn=false as a keyword argument.

julia> sinkhorn_divergence!(D, cost, a, b, ϵ; max_iters = 10^6)
171.45854254853293

julia> π = optimal_coupling!(D, cost, a, b, ϵ; max_iters = 10^6)
4×3 Matrix{Float64}:
 0.130872     0.150266  0.100077
 0.262006     0.300833  0.200355
 0.262268     0.301134  0.200555
 2.76622e-88  0.229507  0.152852

We see that now we move some mass from each each element of X to each element of Y, to try to avoid needing to create or destroy mass, except no mass is moved from x=4 to y=3, presumably because it's better to move it to y=4 and y=5 instead.