Trying to identify most common distinct motifs in time series data and then search for those motifs #438


Hi,

I'm wondering if anyone can point me in the right direction.

I'm reading the paper - https://www.researchgate.net/publication/343280455_Human_Presence_Detection_by_monitoring_the_indoor_CO2_concentration

They mention:
"We identified events of presence or absence using motif detection on the CO2 concentration time series. Therefore, we used Motif Clustering within computing the full distance matrix using the STOMP algorithm".

I believe this is one of the algorithms your API uses.

The following is an example of one of their figures:
fig4

I have used their open data and tried to use stumpy, to find the same first 3 motifs as them, by doing:

#!/usr/bin/python3
import stumpy
import csv
import datetime as dt
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as dates
from matplotlib.patches import Rectangle

if __name__ == "__main__":
    co2 = []
    with open('MUC2020/csv/RoomA_CO2.csv') as csvfile:
        reader = csv.DictReader(csvfile, delimiter=';')
        for row in reader:
            if len(row['CO2-Concentration (in ppm)'].strip()) > 0:
                co2.append(float(row['CO2-Concentration (in ppm)']))
    fig, axs = plt.subplots(1, sharex=True, gridspec_kw={'hspace': 0})
    plt.suptitle('Motif (Pattern) Discovery', fontsize='30')
    axs.plot(co2)
    axs.set_ylabel('CO2 ppm', fontsize='20')
    m = 10
    c = np.array(co2)
    p = stumpy.stump(c, m)
    dists, inde = stumpy.motifs(c, p[:,1], max_matches=3)
    for z in range(0, inde.shape[1]):
        rect = Rectangle((inde[0][z], 0), m, 10, facecolor=(1.0,0.0,0.0))
        axs.add_patch(rect)
    plt.show()

The following image shows my results using their data:

my_fig

The 3 motifs that the motifs function finds for me all seem very similar, in that they all sit below the downward trends of the CO2 ppm line.

I'm just wondering if anyone could point me in the right direction to find the same 3 unique motifs they have, and to then search for those motifs across the time series data.

I'm thinking the motifs function might not be the one I should be using to find the most common distinct motifs?

Many thanks!


@chrisruk Thank you for your question and welcome to the STUMPY community. Please be forewarned that the stumpy.motifs function is still in the experimental stage and should be officially released in the upcoming v1.9.0 release. In the meantime, the API may still change.

Having said that, it looks like what you've done seems fine. However, I could be wrong but your window size of m = 10 appears to be smaller than the window size used in the paper. Also, can you please plot the raw CO2 time series along with the matrix profile computed using stumpy.stump (like what is shown here)? That will at least give you an idea of where the potential motifs are located within your time series (by looking at the major local minima). Of course, the matrix profile will depend on the appropriate window size used.


Thanks again for all your help!

You bet!

Out of interest, what's the soon-to-be-released function going to be called, so I can look out for it?

It is called stumpy.stimp and actually computes what is referred to as a "pan matrix profile" (see this paper for more details). A work-in-progress tutorial can be found here.

Here is the code that I used for your reference (please be warned that this is still in development and subject to change):

min_m, max_m = 10, 100
co2_pan = stumpy.stimp(c, min_m=min_m, max_m=max_m, percentage=1.0) # This percentage controls the extent of `stumpy.scrump` completion
percent_m = 1.0 # The percentage of windows to compute
n = np.ceil((max_m - min_m) * percent_m).astype(int)
for _ in range(n):
    co2_pan.update()

And you can plot the results with:

import matplotlib.pyplot as plt
from matplotlib import cm
fig = plt.figure()
fig.canvas.toolbar_visible = False
fig.canvas.header_visible = False
fig.canvas.footer_visible = False
color_map = cm.get_cmap("Greys_r", 256)
threshold = 0.2 # 0.2 is usually an excellent default but this is something that you'll need to play around with
im = plt.imshow(co2_pan.pan(threshold=threshold), cmap=color_map, origin="lower", interpolation="none", aspect="auto")
plt.xlabel("Time", fontsize="20")
plt.ylabel("m", fontsize="20")
plt.clim(0.0, 1.0)
plt.colorbar()
plt.tight_layout()
plt.show()

download-5

As you adjust the threshold, what you are looking for is the location of the peaks of the right-angled triangles that are formed. Of course, this is definitely more art than science, but at least it is a "better" way to explore the data.


Sorry for the delay in replying. I'll have a look at the tutorial and paper you referenced - they look very handy!


No apologies necessary. Just an FYI that the new version was released today!


Thanks a lot for your help, it's much appreciated!

I adapted your changes to use the 'stumpy.match' function to find all matches for each motif.

I decided to use just 2 motifs as I'm mainly curious about the entering / leaving of a room.

Figure_1

I also added the 'max_distance' parameter to stumpy.match to get a couple more matches.

import stumpy
import csv
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = [20, 6]  # width, height
plt.rcParams['xtick.direction'] = 'out'

co2 = []
with open('MUC2020/csv/RoomA_CO2.csv') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=';')
    for row in reader:
        if len(row['CO2-Concentration (in ppm)'].strip()) > 0:
            co2.append(float(row['CO2-Concentration (in ppm)']))

m = 24
c = np.array(co2)
p = stumpy.stump(c, m)
dists, inde = stumpy.motifs(c, p[:, 0], max_motifs=2)

fig, axs = plt.subplots(2, sharex=True, gridspec_kw={'hspace': 0})
plt.suptitle('Motif (Pattern) Discovery', fontsize='30')
axs[0].plot(co2)
axs[0].set_ylabel('CO2 ppm', fontsize='20')
cols = ['red', 'green', 'blue']
for z in range(0, inde.shape[0]):
    col = cols[z]
    start = inde[z, 0]
    stop = inde[z, 0] + m
    matches = stumpy.match(c[start:stop], c, max_distance=2.0)
    for mt in range(matches.shape[0]):
        s = matches[mt, 1]
        st = s + m
        axs[0].plot(np.arange(s, st), c[s : st], c=col)
axs[1].plot(p[:, 0])
axs[1].set_ylabel('Matrix profile', fontsize='20')
plt.show()

This example doesn't seem to be working for Stumpy 1.11.1. The inde returned from the call to stumpy.motifs is of shape (1, 10) @chrisruk.


@amurthy-sunysb Please see this discussion

Answer selected by chrisruk