Load Profile Clustering #1037
-
I'm looking at an anonymized smart meter dataset from Switzerland. Each file contains a full month of readings at 15-minute resolution for about 200,000 meters. There are several years of data, in total about 900 GB of CSV files.
Reading the original papers, I had hoped to be able to use matrix profiles to:
- identify common patterns/motifs that are shared by some time series (like shapelets, but I have no labels)
- cluster time series according to some kind of MPdist
- understand trends or developments within these clusters (maybe using chains?)
This question might be related to the MP XXVII paper, but I'd prefer STUMPY and its ecosystem, with support for Dask and GPUs.
Could you advise how to go forward?
Replies: 1 comment 1 reply
-
The following publications should help you:
https://www.mdpi.com/1424-8220/21/23/8036 
https://link.springer.com/chapter/10.1007/978-981-16-2377-6_5 
Disclaimer: I am an author of these publications.
-
Dear Jakob,
Thank you for the ideas and the fast response. I see some differences that make it difficult for me to understand how to apply them:
- Your sampling rate is higher (per-second resolution) than my 15-minute resolution. The slow sampling is deliberate: it protects customers' privacy by preventing disaggregation. So I will not be able to cluster smart meters by the devices behind the meter.
- I have no labels in my data.