Inconsistent behavior of stumpy.motifs across Stumpy versions #794
-
I am trying to recreate the example from this discussion, which uses stumpy.motifs() to detect the top 2 motifs from a univariate sensor time series dataset.
When I run the code using Stumpy 1.11, I do not get the same result. Specifically, the indices and distances returned by the call are of size (1, x) in Stumpy 1.11 and size (3, x) in Stumpy 1.9.1 (x depends on max_matches).
What am I doing wrong here? Or is the behavior indeed inconsistent?
-
@amurthy-sunysb I think the issue is that when cutoff=None, STUMPY tries its best to guess at the "best" rational cutoff value according to np.nanmax([np.nanmean(P) - 2.0 * np.nanstd(P), np.nanmin(P)]). However, in this case, np.nanmean(P) - 2.0 * np.nanstd(P) is negative and so the smallest possible cutoff is set to np.nanmin(P). Unfortunately, all of the matches have a distance that is greater than max_distance (also a guess by STUMPY). So, ultimately, you need to do something like:
stumpy.motifs(c, p[:,0], cutoff=0.053, max_distance=np.inf)
Increasing cutoff to an even larger value will give you more motifs, while decreasing max_distance will reduce the number of possible matches. There is no magic here, and it is up to you to play around with the "best" cutoff and max_distance values. All of this is mentioned clearly in the stumpy.motifs docstring:
Note that, in the best case scenario, the returned arrays would have shape (max_motifs, max_matches) and contain all finite values. However, in reality, many conditions (see below) need to be satisfied in order for this to be true. Any truncation in the number of rows (i.e., motifs) may be the result of insufficient candidate motifs with matches greater than or equal to min_neighbors or that the matrix profile value for the candidate motif was larger than cutoff. Similarly, any truncation in the number of columns (i.e., matches) may be the result of insufficient matches being found with distances (to their corresponding candidate motif) that are equal to or less than max_distance. Only motifs and matches that satisfy all of these constraints will be returned.
If you must return a shape of (max_motifs, max_matches), then you may consider specifying a smaller min_neighbors, a larger max_distance, and/or a larger cutoff. For example, while it is ill advised, setting min_neighbors=1, max_distance=np.inf, and cutoff=np.inf will ensure that the shape of the output arrays will be (max_motifs, max_matches). However, given the lack of constraints, the quality of each motif and the quality of each match may be drastically different. Setting appropriate conditions will help ensure appropriately constrained results that may be easier to interpret.
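To see why the guessed cutoff can collapse to the minimum of the matrix profile, here is a minimal sketch that evaluates the formula quoted above on made-up values. The helper name guess_cutoff and the sample array P are assumptions for illustration only; this is not STUMPY's internal code, just the same NumPy expression applied to toy data.

```python
import numpy as np


def guess_cutoff(P):
    """Hypothetical helper: evaluate the default-cutoff formula quoted above."""
    P = np.asarray(P, dtype=float)
    return np.nanmax([np.nanmean(P) - 2.0 * np.nanstd(P), np.nanmin(P)])


# Made-up matrix profile values (not taken from the dataset in this thread)
P = np.array([0.05, 0.1, 5.0, 6.0, 0.2, np.nan])

# With a wide spread, "mean minus two standard deviations" goes negative...
two_sigma_below_mean = np.nanmean(P) - 2.0 * np.nanstd(P)

# ...so the guessed cutoff falls back to the smallest matrix profile value
cutoff = guess_cutoff(P)

print(two_sigma_below_mean < 0)
print(cutoff == np.nanmin(P))
```

When the cutoff ends up equal to np.nanmin(P), only the single best candidate can pass it, which is consistent with getting one row back instead of several.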
I hope that helps
-
I've added additional warnings that provide suggestions on how the user may possibly overcome an empty set of results:
Note that these warnings will only appear in the next (future) release of STUMPY.
-
Thank you @seanlaw! Did the default behavior of cutoff and max_distance change across the two Stumpy versions: 1.9 and 1.11? I was hoping to get the same results across the two versions. Thank you again!
-
Unfortunately, I don't remember, as v1.9 was 1.5 years ago. I don't see any obvious difference, so it may have been a bug that was fixed, or we tried to make it more stable (i.e., make fewer assumptions), as much of that code was written over 2 years ago. @NimaSarajpoor Would you happen to have any idea what may have caused a difference in results?
-
@amurthy-sunysb Can you tell me which minor version of 1.9 you are referring to?
-
It looks like the cutoff argument was added in v1.9.2. Prior to this (i.e., in versions 1.9.1 and earlier), the cutoff argument didn't exist, so there was no "default". In those versions, the behavior was essentially the same as setting cutoff=np.inf (i.e., no cutoff).
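As a rough illustration of why "no cutoff" and cutoff=np.inf behave alike, the sketch below filters toy matrix profile values by a cutoff. The helper candidates_within_cutoff and the array P are hypothetical; the sketch covers only the cutoff condition and ignores the other constraints that stumpy.motifs applies (such as min_neighbors and max_distance).

```python
import numpy as np


def candidates_within_cutoff(P, cutoff):
    """Hypothetical helper: indices with matrix profile value <= cutoff, best first."""
    P = np.asarray(P, dtype=float)
    order = np.argsort(P, kind="stable")  # smallest matrix profile value first
    return order[P[order] <= cutoff]


# Made-up matrix profile values (not taken from the dataset in this thread)
P = np.array([0.05, 0.1, 5.0, 6.0, 0.2])

# A cutoff equal to the minimum keeps a single candidate
tight = candidates_within_cutoff(P, cutoff=np.nanmin(P))

# An infinite cutoff rejects nothing, mimicking versions without the argument
unbounded = candidates_within_cutoff(P, cutoff=np.inf)

print(tight)
print(unbounded)
```

Under these assumptions, passing cutoff=np.inf explicitly in newer versions is the closest way to approximate the older behavior, as described above.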
-
Sorry for the late response. As @seanlaw said, the major difference is the new param cutoff.
@amurthy-sunysb: If you are familiar with git and you want to explore further the difference between the two versions, you may run the following command to see the difference:
# In stumpy folder:
git diff 0cfa693^..7439720 "./stumpy/motifs.py"
The first commit, 0cfa693, is for version 1.9.0, and the second commit shown above is for the latest release version, i.e. 1.11.1.
-
Thank you @seanlaw and @NimaSarajpoor . I downgraded to Stumpy 1.9.1, which doesn't have the cutoff. All this makes sense now as the default behaviors wrt cutoff are different in the two versions.