Optimal Transport Between Two Matrices #508

Unanswered
kiasar asked this question in Q&A

Hi. I can't seem to find a module that computes the distance between two matrices (with positive values and equal sums of elements). It would be great to have one.

Replies: 4 comments 1 reply

Hello @kiasar,

The Gromov-Wasserstein distance can be used to compare any pair of matrices, including those you mention. POT includes several solvers for estimating this distance, e.g. `ot.gromov.gromov_wasserstein`, as well as related features such as the Gromov-Wasserstein barycenter.
I recommend taking a look at the gallery of examples on this subject, e.g. https://pythonot.github.io/auto_examples/gromov/plot_fgw.html#sphx-glr-auto-examples-gromov-plot-fgw-py
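
A minimal sketch of how this can look in code (the random matrices, sizes, and uniform weights below are illustrative assumptions, not part of your problem):

```python
# Minimal sketch: compare two matrices with the Gromov-Wasserstein distance.
import numpy as np
import ot

rng = np.random.default_rng(0)
C1 = rng.random((10, 10))
C2 = rng.random((12, 12))  # GW also works when the matrices differ in size
C1 = (C1 + C1.T) / 2  # symmetrize so each matrix reads as a (dis)similarity structure
C2 = (C2 + C2.T) / 2

p = ot.unif(C1.shape[0])  # uniform weights over the rows/nodes
q = ot.unif(C2.shape[0])

# gromov_wasserstein2 returns the GW discrepancy value itself;
# ot.gromov.gromov_wasserstein returns the optimal coupling instead.
gw_dist = ot.gromov.gromov_wasserstein2(C1, C2, p, q, loss_fun='square_loss')
print(gw_dist)
```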


Hi @cedricvincentcuaz,
Thank you for pointing me to the Gromov-Wasserstein distance and the resources associated with it.

I initially approached this by viewing the matrix cells as (i, j) points in 2D space and then used `ot.emd()` to compute the "cost" as the distance between the matrices. Given your expertise in this domain, I'd value your opinion: would `ot.gromov.gromov_wasserstein` offer advantages or be more appropriate for this task than my approach? (My goal is to cluster the matrices.)
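
Concretely, here is roughly what my approach looks like (a sketch; the sizes and random data are just for illustration):

```python
# Sketch of the approach: view each cell (i, j) as a 2D point carrying the
# cell's value as mass, then solve exact OT between the two distributions.
import numpy as np
import ot

n = 8
rng = np.random.default_rng(0)
A = rng.random((n, n)); A /= A.sum()  # normalize so both carry the same total mass
B = rng.random((n, n)); B /= B.sum()

# one 2D point per matrix cell
coords = np.array([(i, j) for i in range(n) for j in range(n)], dtype=float)
M = ot.dist(coords, coords)  # pairwise squared Euclidean ground cost between cells

G = ot.emd(A.ravel(), B.ravel(), M)  # optimal transport plan
cost = float(np.sum(G * M))          # the "distance" between the two matrices
```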

Thank you for your guidance.


In most applications, it is relevant to see a matrix as a graph where the $i^{th}$ row and column give the connections of node $i$ to all the other nodes of the graph.

If I understood correctly, what you did comes down to flattening each matrix $M \in \mathbb{R}^{n \times n}$, i.e. modelling it as a 1D distribution with $n^2$ bins. Then indeed you can compare these distributions using `ot.emd` (or `ot.emd_1d`, which gives the same result).
Doing so, you destroy the spatial arrangement of the graphs. Imagine two graphs with binary connections and the same number of nodes: it suffices that both graphs have $K$ non-zero connections for $\mathrm{emd}=0$, yet the graphs can be very different.
The Gromov-Wasserstein (GW) distance, however, is zero iff the graphs have the same spatial arrangement up to a permutation of the nodes within each graph. If this invariance makes sense for your matrices, GW is clearly the better choice.
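
To illustrate this point (a toy construction of mine): a path graph and a star graph on 4 nodes both have three edges, so any comparison based only on the multiset of entries returns zero, while GW separates them.

```python
# Path graph 0-1-2-3 vs. star graph centered at node 0: both adjacency
# matrices contain exactly six 1's, so their flattened value distributions
# are identical, yet the graph structures differ.
import numpy as np
import ot

C_path = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)
C_star = np.array([[0, 1, 1, 1],
                   [1, 0, 0, 0],
                   [1, 0, 0, 0],
                   [1, 0, 0, 0]], dtype=float)

p = ot.unif(4)
flat = ot.emd2_1d(C_path.ravel(), C_star.ravel())  # 0.0: same value histogram
gw = ot.gromov.gromov_wasserstein2(C_path, C_star, p, p,
                                    loss_fun='square_loss')  # > 0
print(flat, gw)
```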

Then, to cluster a set of several matrices, you could e.g. perform GW-based dictionary learning (https://pythonot.github.io/auto_examples/gromov/plot_gromov_wasserstein_dictionary_learning.html#sphx-glr-auto-examples-gromov-plot-gromov-wasserstein-dictionary-learning-py).
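
As a simpler alternative sketch of the clustering step (my own illustration, using plain hierarchical clustering on a precomputed pairwise GW matrix rather than dictionary learning):

```python
# Sketch: pairwise GW discrepancies over the dataset, then hierarchical clustering.
import numpy as np
import ot
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def gw_distance_matrix(matrices):
    """Pairwise GW discrepancies between a list of square matrices."""
    k = len(matrices)
    D = np.zeros((k, k))
    for a in range(k):
        for b in range(a + 1, k):
            p = ot.unif(matrices[a].shape[0])
            q = ot.unif(matrices[b].shape[0])
            D[a, b] = D[b, a] = ot.gromov.gromov_wasserstein2(
                matrices[a], matrices[b], p, q, loss_fun='square_loss')
    return D

rng = np.random.default_rng(0)
mats = []
for _ in range(5):
    M = rng.random((6, 6))
    mats.append((M + M.T) / 2)  # symmetrize for the GW solver

D = gw_distance_matrix(mats)
# average-linkage hierarchical clustering on the precomputed distances
labels = fcluster(linkage(squareform(D), method='average'),
                  t=2, criterion='maxclust')
print(labels)
```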


Firstly, thank you so much for the insights you've shared. I truly appreciate your expertise.

I wanted to touch base with you on a couple of things:

  1. I've been working with your `plot2D_samples_mat()` function and thought of an enhancement that might be beneficial: I introduced a boolean `draw_arrows` argument that draws directional arrows in the plots. I found it useful for my work and have opened a pull request; I'd love for you to take a look when you have a moment.

  2. While going through your "Plot Fused-Gromov-Wasserstein" documentation, I noticed a potential oversight: it seems all the `pl.scatter(y, x)` calls should be corrected to `pl.scatter(x, y)`, since the x-coordinate comes first in `pl.scatter`.


I've been looking into the Gromov-Wasserstein method, and it seems designed for situations a bit trickier than ours; our challenge feels more straightforward.

Just to paint a clearer picture: we have thousands of these square matrices, all the same size and with the same grand sum. When we show them as colorful heatmaps, you can easily spot some repeating patterns. Our big question is: how do we cluster these? Here's what they look like:
[image: heatmaps of the matrices, showing repeating patterns]

My initial thought was to plot these patterns in a simple way and use `ot.emd()` to see how different they are from each other, but as you might have guessed, it didn't quite hit the mark. Here's what that looked like:
[image: the matrices compared via ot.emd()]

Given everything I've shared, would you kindly advise whether you still think the Gromov-Wasserstein method is a suitable approach?


Hello @kiasar,

For 1), I find these arrow plots clearly less readable than the plot on the left, but I guess it is a matter of habit and taste, so it might indeed be a relevant new feature. For 2), I believe it was simply a rendering issue, but thank you for pointing these out.

For your original problem: since all the matrices have the same size, GW is mostly relevant if you want to find an optimal permutation that aligns the compared matrices. If the matrices are already aligned, i.e. for two matrices it makes sense to directly compare $M_1[i, j]$ and $M_2[i, j]$, you should rather compare them using distances based on element-wise operations, like the Euclidean distance or e.g. the Aitchison distance, which might be more suitable given your simplex constraints. Then use any kind of clustering algorithm that follows the geometry you chose.
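
As a concrete sketch of both options (the Aitchison implementation below is my own, following the standard centered log-ratio definition; it is not a POT function):

```python
# Element-wise distances between two aligned matrices with equal grand sums.
import numpy as np

def euclidean_dist(M1, M2):
    return np.linalg.norm(M1 - M2)

def aitchison_dist(M1, M2, eps=1e-12):
    # Aitchison distance: Euclidean distance between the centered log-ratio
    # (clr) transforms of the two compositions; eps guards against zero entries.
    x = np.log(M1.ravel() + eps)
    y = np.log(M2.ravel() + eps)
    return np.linalg.norm((x - x.mean()) - (y - y.mean()))

# These per-pair distances can then feed any standard clustering algorithm,
# e.g. hierarchical clustering on the resulting pairwise distance matrix.
```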

Best.

Category: Q&A
Labels: Converted from issue

This discussion was converted from issue #506 on August 22, 2023 09:04.
