I would like to ask for references to algorithms that can project shape information about an object to 1 dimension. Specifically, I am training a neural network to identify objects with shapes similar to those in the training set. The objects in my case are molecules for which the positions, masses and radii of the atoms are known. So far I have used information about the interatomic distances. The shape of the molecule is characterized by the distributions of atomic distances to four strategic reference locations. In turn, each of these distributions is described through its first three moments. In this way, each molecule is associated with a vector of 12 shape descriptors. For example:
shape = [5.263145206201491, 2.374050283937628, 0.5667128412399703, -0.9294169868091768, 5.248610806623028, 2.540858433874688, 0.6060854720036649, -0.8149532304046945, 10.186033275946016, 5.224912272773887, -0.5637327938909833, -1.1097296046204561]
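For concreteness, here is a minimal sketch of how a descriptor of this kind could be computed. The choice of the four reference locations (centroid, atom closest to the centroid, atom farthest from the centroid, and atom farthest from that one) and the choice of moments (mean, standard deviation, skewness) are assumptions for illustration; the function name `shape_descriptor` is hypothetical, and masses and radii are ignored. It is not meant to reproduce the exact numbers above.

```python
import numpy as np

def shape_descriptor(coords):
    """Return 12 numbers: 3 moments of the distance distribution
    for each of 4 reference points (assumed, see lead-in)."""
    coords = np.asarray(coords, dtype=float)

    # Reference 1: centroid of the atomic positions.
    centroid = coords.mean(axis=0)
    d_centroid = np.linalg.norm(coords - centroid, axis=1)

    # Reference 2 and 3: atoms closest to / farthest from the centroid.
    closest = coords[np.argmin(d_centroid)]
    farthest = coords[np.argmax(d_centroid)]

    # Reference 4: atom farthest from reference 3.
    d_farthest = np.linalg.norm(coords - farthest, axis=1)
    far_from_far = coords[np.argmax(d_farthest)]

    features = []
    for ref in (centroid, closest, farthest, far_from_far):
        d = np.linalg.norm(coords - ref, axis=1)
        mean = d.mean()
        std = d.std()
        # Skewness (third standardized moment); 0 if there is no spread.
        skew = ((d - mean) ** 3).mean() / std ** 3 if std > 0 else 0.0
        features.extend([mean, std, skew])
    return np.array(features)
```

Because only interatomic distances enter the computation, a vector built this way does not change when the molecule is rotated or translated.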
However, a 12-element feature vector is too small to accurately describe the molecular shape. I tried adding more reference locations, and consequently more atomic distances, but I noticed no gains in accuracy. Is anyone aware of any other algorithm that can express the molecular shape in feature vector form? Please point me to references if you know of any.
- Welcome to CS.SE! Interesting question. Are there any particular properties you want the shape vector to satisfy (e.g., rotation invariance, translation invariance)? How do you plan to use it? Do you plan to use it as an input to a machine learning algorithm? Are you going to use it to compare two molecules and see whether they have the same shape? If it's the latter, there might be better solutions that don't work by first constructing a feature vector for each and then comparing feature vectors. – D.W., May 30, 2017 at 16:13
- Hello! Yes I would like the shape vector of each molecule to be rotationally and translationally invariant. And yes, these vectors will be the input to a machine learning algorithm. If I wanted to do pairwise comparisons I wouldn't bother to project the shape information to a vector form. – tevang, May 30, 2017 at 17:28
- How many different molecules are there? – user16034, May 6, 2022 at 8:29
- What do you call "distributions of atomic distances", "four strategic reference locations" and "first three moments" exactly? Your description is pretty vague. – user16034, Sep 3, 2022 at 10:34
- Here is the reference: jcheminf.biomedcentral.com/articles/10.1186/1758-2946-4-27 – tevang, Sep 4, 2022 at 11:09
1 Answer
There's lots of work in the computer vision community on "shape descriptors". This might or might not be useful. See e.g., https://en.wikipedia.org/wiki/Shape_analysis_(digital_geometry), https://en.wikipedia.org/wiki/Shape_context, https://en.wikipedia.org/wiki/Image_moment, https://en.wikipedia.org/wiki/Spectral_shape_analysis.