Finding correct window size for the motif discovery #791
-
Hello @seanlaw,
I was going through your tutorial for SKIMP. In that tutorial, max_m = 1000. My question is: how did you arrive at the value of 1000 when you do not know the maximum length of a subsequence/window? In my case, I only have 60k data points and the sampling rate. How can I determine max_m and min_m? Is there an alternative way to determine them?
Thanks
Replies: 2 comments 9 replies
-
@adityabhandwalkar Unfortunately, there is no secret to choosing max_m. Ultimately, you should set max_m to something like 2x the maximum time that you believe an event is likely to take within your domain (nobody can know this but you). Usually, with time series, you might set max_m to an hour, week, month, or year (depending on your data). When you really have no idea, you basically have to set max_m to the length of your time series. Again, there is no magic here, and SKIMP is only a tool that may or may not help.
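As a rough sketch of that rule of thumb, the helper below converts a guessed maximum event duration and a sampling rate into a window length in samples. It is not part of STUMPY: the function name `suggest_max_m`, the floor of 3 samples, the 10 Hz sampling rate, and the 50-second event duration are all hypothetical values chosen only for illustration.

```python
import math


def suggest_max_m(n_points, sampling_rate_hz, max_event_seconds=None,
                  factor=2.0, floor=3):
    """Hypothetical helper: turn a domain guess into a max_m in samples.

    n_points          -- length of the time series
    sampling_rate_hz  -- samples per second
    max_event_seconds -- longest event you expect, or None if unknown
    factor            -- safety multiplier (the "something like 2x" above)
    floor             -- assumed smallest useful window length
    """
    if max_event_seconds is None:
        # No domain knowledge: fall back to the full series length.
        return n_points
    m = math.ceil(factor * max_event_seconds * sampling_rate_hz)
    # Never exceed the series length or drop below the assumed floor.
    return max(floor, min(m, n_points))


# Assumed example: 60k points sampled at 10 Hz, events lasting up to 50 s.
max_m = suggest_max_m(60_000, 10.0, max_event_seconds=50)
print(max_m)
```

The result could then be used as the max_m value in the tutorial's computation. The domain guess (the event duration) still has to come from you; the helper only does the unit conversion and the capping.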
-
@rajivsam I created an issue here, and it looks like they actually have a notebook with some code!
-
@rajivsam Thank you for sharing this! I wasn't aware that this work was published (by the same authors as the matrix profile) and it seems reasonable. Would you have any time to create a Python implementation that reproduces one of the listed examples?
-
@seanlaw, thanks for the template. I was able to recreate the results in the paper. I ran the evaluation on two random datasets; the Google Colab notebook is here:
colab notebook
The template was an excellent start; it just needed a little bit of work to reconcile the sketch with section 4 of the paper.
-
Perfect! This is super helpful, @rajivsam. Would you be interested in submitting a PR for the issue (creating a standalone notebook reproducer, which you've basically done)? You've already done a lot, so there is absolutely no pressure, especially if you are busy. We are always looking for new contributors, but I can also take it from here if you are not able to.
-
@seanlaw, thanks. If it is just adding this notebook to a known location (forking the repository, adding the notebook, and submitting the PR), I am happy to do it. If it involves running a bunch of regression tests, then I think it's best if someone familiar with the process does it.
-
No problem. We can handle it. Thanks, @rajivsam