Finding correct window size for the motif discovery · stumpy-dev/stumpy · Discussion #791

adityabhandwalkar
Feb 2, 2023

I was going through your tutorial on SKIMP. In that tutorial, max_m = 1000. My question is: how did you arrive at that value of 1000 when you do not know the maximum length of a subsequence/window? In my case, I have only 60k data points, along with the sampling rate. How can I determine max_m and min_m? Is there an alternative way to determine them?

thanks

Replies: 2 comments 9 replies

seanlaw
Feb 2, 2023
Maintainer

@adityabhandwalkar Unfortunately, there is no secret to choosing max_m. Ultimately, you should set max_m to something like 2x the maximum time that you believe an event is likely to take within your domain (nobody can know this but you). Usually, with time series, you might set max_m to an hour, week, month, or year (depending on your data). When you really have no idea, you basically have to set max_m to the length of your time series. Again, there is no magic here, and SKIMP is only a tool that may or may not help.
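As a rough illustration of the rule of thumb above, here is a minimal sketch that converts an assumed event duration into window sizes in samples. The helper name `window_bounds`, the 100 Hz sampling rate, and the 0.5 s / 5 s event durations are all hypothetical values chosen for the example; they are not from the tutorial, and the lower bound of 3 samples is simply an assumed floor.

```python
import math


def window_bounds(n_samples, sampling_rate_hz, min_event_s, max_event_s=None, factor=2.0):
    """Translate event durations (seconds) into (min_m, max_m) in samples.

    Hypothetical helper: max_m is `factor` times the longest expected event,
    and falls back to the full series length when no upper estimate exists.
    """
    # Shortest window: the shortest event of interest, with an assumed floor of 3 samples.
    min_m = max(3, math.ceil(min_event_s * sampling_rate_hz))

    if max_event_s is None:
        # No domain knowledge: fall back to the length of the time series.
        max_m = n_samples
    else:
        # Roughly 2x the longest event, capped at the series length.
        max_m = min(n_samples, math.ceil(factor * max_event_s * sampling_rate_hz))

    if min_m > max_m:
        raise ValueError("min_m exceeds max_m; check the durations and sampling rate")
    return min_m, max_m


# Assumed scenario: 60k points sampled at 100 Hz, events lasting 0.5 s to 5 s.
min_m, max_m = window_bounds(60_000, 100.0, min_event_s=0.5, max_event_s=5.0)
print(min_m, max_m)
```

The resulting pair could then be used as the `min_m` and `max_m` values in the tutorial. The estimate is only as good as the duration guess that goes into it, so it is worth trying a few plausible durations.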

9 replies


seanlaw Jan 17, 2025
Maintainer

@rajivsam I created an issue here and it looks like they actually have a notebook with some code!


rajivsam Jan 19, 2025

> @rajivsam Thank you for sharing this! I wasn't aware that this work was published (by the same authors as the matrix profile) and it seems reasonable. Would you have any time to create a Python implementation that reproduces one of the listed examples?

@seanlaw, thanks for the template. I was able to recreate the results in the paper. I ran the evaluation on two random datasets; the Google Colab notebook is here:
colab notebook
The template was an excellent start; it just needed a little bit of work to reconcile the sketch with section 4 of the paper.


seanlaw Jan 19, 2025
Maintainer

Perfect! This is super helpful @rajivsam. Would you be interested in submitting a PR for the issue (of creating a standalone notebook reproducer, which you've basically done)? You've already done a lot so there is absolutely no pressure to, especially if you are busy. We are always looking for new contributors but I can also take it from here if you are not able to.


rajivsam Jan 19, 2025

@seanlaw, thanks. If it is just adding this notebook to a known location (forking the repository, adding the notebook, and submitting the PR), I am happy to do it. If it involves running a bunch of regression tests, then I think it's best if someone familiar with the process does it.


seanlaw Jan 20, 2025
Maintainer

No problem. We can handle it. Thanks @rajivsam

rajivsam
Jan 17, 2025

Cool, thanks.


0 replies
