I'm looking for a sanity check on my thinking around the memory layout of a key-value store.
We have a system that periodically polls and monitors host-level jobs. On every poll, a given job can emit multiple timeseries. A timeseries is a {timestamp -> double} mapping. Each timeseries is identified by its timeseries id (e.g. "cpu", "memory"), which can also be user-defined. These timeseries ids are not globally unique, though - what's globally unique is the combination job_id + timeseries_id. I will call this job_ts.
The cardinality of job ids is ~5M and the cardinality of timeseries ids is ~1B.
We want to store the datapoints of every job timeseries in a key-value store. I figured the key could be: hash(job_ts) + fixed_width_unix_timestamp. The value is the datapoint value (e.g. cpu of the job at time T). This way I can easily write range queries, i.e. give me all datapoints for a given job_ts from [T, T+24 hours] (for example job=x and timeseries_id=cpu). The hash function I figured would work is xxhash128, which would cost ~32 bytes per entry (16-byte digest + 8-byte timestamp + 8-byte double value).
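Roughly, the key construction I have in mind looks like this (a sketch assuming the python-xxhash package; range_scan is a placeholder for whatever API the store actually exposes):

```python
import struct
import xxhash  # assumes the python-xxhash package

def make_key(job_id: str, timeseries_id: str, unix_ts: int) -> bytes:
    """24-byte key: 16-byte xxhash128(job_ts) + 8-byte big-endian timestamp.

    Big-endian keeps all keys for one job_ts lexicographically ordered by time,
    which is what makes the [T, T+N] range scan work.
    """
    job_ts = f"{job_id}\x00{timeseries_id}".encode()  # separator avoids ambiguous concatenation
    prefix = xxhash.xxh128(job_ts).digest()           # 16 bytes
    return prefix + struct.pack(">q", unix_ts)        # 8 bytes, signed 64-bit time

# Range query for one job_ts over 24 hours (range_scan is hypothetical):
# start = make_key("job-x", "cpu", T)
# end   = make_key("job-x", "cpu", T + 24 * 3600)
# kv.range_scan(start, end)
```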
Am I on the right track? I want to optimize for our memory footprint on the key-value store - what are some techniques/tradeoffs I should consider?
-
I would try to figure out a rough estimate for how much more space per month a readable format would cost, and then talk to some hardware people about the price of that memory/storage. – Thomas Koelle, Jul 23 at 11:45
-
Is there a use case to query a time range for all or multiple job_ids? Even if there is, is it rare enough to justify more complex storage and handling? Storing one separate time series per job_id would require the job_id only once, without any hash values. – Thibe, Jul 23 at 15:16
-
Not sure if this really fits as an answer, thus only a comment. It might be worth taking a look at some simple compression techniques, e.g. store one or a few base timestamps and make all other timestamps relative to them, so they become much smaller. Depending on the approach, this may or may not affect the time to find the entries. – Thibe, Jul 23 at 15:27
3 Answers
If I understand the proposal correctly, you will need to track the hash of the job_ts in some other location (hashes are not reversible). If you already have to do that, there's no reason you are limited to hashes. You could create a sequential id or any other scheme you like. With 32 bits, you will be good for billions of job_ts before you run out of unique ids. You will need to make sure these ids are unique before you associate logs with them, of course.
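A toy sketch of what such an id registry could look like (in-memory only for illustration; a real deployment would persist the mapping, e.g. in a database table, and make allocation atomic across writers):

```python
from itertools import count

class JobTsRegistry:
    """Hands out a monotonically increasing 32-bit id per (job_id, timeseries_id).

    In-memory toy only: a production version would back this with durable
    storage instead of a dict.
    """

    def __init__(self) -> None:
        self._ids: dict[tuple[str, str], int] = {}
        self._next = count(0)

    def get_or_create(self, job_id: str, timeseries_id: str) -> int:
        key = (job_id, timeseries_id)
        if key not in self._ids:
            new_id = next(self._next)
            if new_id >= 2**32:
                raise OverflowError("exhausted the 32-bit id space")
            self._ids[key] = new_id
        return self._ids[key]

# registry = JobTsRegistry()
# registry.get_or_create("job-x", "cpu")  # -> 0, reused on every later call
```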
-
The idea here is that a 4-byte uint32 should be sufficient to make an ID for each job, since the job_ts cardinality is low, right? I can see how that would reduce my memory footprint. Great suggestion! – nz_21, Jul 23 at 23:17
-
@nz_21 You can store around 4 billion unique values in a 32-bit int. How many unique job_ts ids do you get in a week? How many weeks of these do you need to store? That will tell you if 32 bits is enough. That part is the same whether you use hashing or not. The main upshot here is that you don't seem to need a hash. Hashing is cool, but I think you might be over-engineering a tiny bit. – JimmyJames, Jul 24 at 13:38
-
Sorry, I should clarify: you can store a lot more ids as a monotonically increasing sequence in 32 bits than you can with a hash in 32 bits (without a high risk of collision). Your monotonically increasing sequence will not produce collisions if managed properly. But the math for how many unique ids you require is the same. – JimmyJames, Jul 24 at 14:03
When using a 128-bit hash function like xxhash128 for ids in the range of 5 million, the risk of hash collisions is extremely low, hence that's surely a possible approach. Still, you have to make sure no job_id is used twice for different jobs, not even accidentally, hence you will need a central registry for the IDs.

But why complicate things with a hash function at all? Just introduce a unique job number which is automatically generated, maybe padded with leading zeros to get fixed-width numbers. Or, if you want to avoid the central registry, you may add an additional job_GUID - filled with GUIDs, of course. That is the standard solution for distributed systems.
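For the registry-free variant, a minimal illustration with Python's standard uuid module (purely a sketch; the 16-byte id trades size for not needing central coordination):

```python
import uuid

# A random 128-bit id per job: no central registry required, and the
# collision probability is negligible even at billions of ids. The cost
# is 16 bytes per id instead of the 4 bytes a sequential counter needs.
job_guid = uuid.uuid4()
key_prefix = job_guid.bytes  # 16 raw bytes, usable directly as a key prefix
```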
-
Thanks for the answer! The trouble is that some of these job_ts ids can be arbitrarily long since they can be user-defined - that's why I thought about hashing them. – nz_21, Jul 23 at 12:55
-
@nz_21 Be careful about hashing user-defined anything. Sometimes the user is not your friend. – candied_orange, Jul 24 at 22:15
-
@DocBrown xxhash128 is a 32-bit hash? – candied_orange, Jul 24 at 22:16
-
@candied_orange: you are right, I need more sleep ;-) – Doc Brown, Aug 6 at 15:07
-
@nz_21: I changed my answer according to the comments, hope it fits better now. – Doc Brown, Aug 6 at 15:17
"We have a system that periodically polls and monitors on host-level jobs. On every poll, a given job emit can multiple timeseries.
...
Each timeseries is identified by its timeseries id (e.g. "cpu", "memory"). It can also be user-defined.
...
The cardinality of job ids is ~5M and the cardinality of timeseries ids is ~1B.
...
We want to store the datapoints of every job timeseries into a key value store. I figured the key could be: hash(job_ts) + fixed_width_unix_timestamp. The value is the datapoint value (e.g. cpu of the job at time T).
...
I want to optimize for our memory footprint on the key value store - what are some techniques/tradeoffs I should consider?".
Map "CPU", "memory ", and "user-defined string" to a 32-bit number stored in a key map table; slower lookup but considerable memory savings.
Forget the hash: instead of xxhash128(job_id_string + timeseries_id_string), your "series ID" (the unique identifier for job_id + timeseries_id) becomes a composite of the two integer IDs.
You could combine these two int32 values into a single int64 (8-byte) value. For instance, you could concatenate them (e.g., (job_int_id << 32) | timeseries_int_id). This 8-byte int64 would then serve as the first part of your key in the key-value store.
Updated Key Structure:
Original Key:
[16-byte xxhash128(job_id_string + timeseries_id_string)] + [8-byte fixed_width_unix_timestamp] = 24 bytes
New Key:
[8-byte composite_int64(job_int_id + timeseries_int_id)] + [8-byte fixed_width_unix_timestamp] = 16 bytes
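A sketch of that key construction (the integer ids are assumed to come from the key map table above):

```python
import struct

def make_key(job_int_id: int, timeseries_int_id: int, unix_ts: int) -> bytes:
    """16-byte key: 8-byte composite series id + 8-byte big-endian timestamp.

    Packing everything big-endian keeps each series' datapoints contiguous
    and ordered by time, so [T, T+N] remains a simple range scan.
    """
    series_id = (job_int_id << 32) | timeseries_int_id  # job in high bits, timeseries in low bits
    return struct.pack(">Qq", series_id, unix_ts)       # unsigned composite id + signed 64-bit time

# Example with made-up ids from the key map table:
# key = make_key(job_int_id=17, timeseries_int_id=3, unix_ts=1_753_228_800)
# len(key) == 16
```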
Combine that with Gorilla time series compression.
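To give a flavor of why Gorilla-style compression works so well here (this is only a simplified illustration of the underlying ideas, not the actual bit-packing): timestamps from a regular poll interval have near-constant deltas, so delta-of-delta encoding collapses most of them to zero, and consecutive double values tend to XOR to words with long runs of zero bits.

```python
import struct

def delta_of_delta(timestamps: list[int]) -> list[int]:
    # Regular polling => most delta-of-deltas are 0 and need only ~1 bit each.
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return [d2 - d1 for d1, d2 in zip(deltas, deltas[1:])]

def xor_previous(values: list[float]) -> list[int]:
    # Slowly changing metrics => consecutive doubles share sign/exponent bits,
    # so their XOR has many leading zeros (and is exactly 0 for repeats).
    bits = [struct.unpack(">Q", struct.pack(">d", v))[0] for v in values]
    return [a ^ b for a, b in zip(bits, bits[1:])]

# Example: a 10s poll interval with a single 1s jitter
print(delta_of_delta([1000, 1010, 1020, 1030, 1041]))  # [0, 0, 1] -- mostly zeros
print(xor_previous([3.0, 3.0, 3.1]))                   # first element is 0 (repeated value)
```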
Consider a time-series database such as TimescaleDB, a PostgreSQL extension for high-performance real-time analytics on time-series and event data, or ArcticDB, which sits on MongoDB. That saves reinventing the wheel and provides a suggested format for the data, so you don't end up converting between the format you are using and the one your database expects.
Remember not to get caught by the Y2038 bug: instead of 32-bit, you may want to use signed 64-bit time_t integers.
-
Note ArcticDB.io is now just S3 (no MongoDB dependency). It's a Python data frame database optimised for storing tables of (primarily) numeric values as time series. – James Blackburn, Jul 25 at 7:18
-
The traditional way to monitor this in infrastructure would be a metrics store - Prometheus, InfluxDB, etc. Cardinality can be quite painful. If you can group your data in a table, and are happy reading/writing data in Python, ArcticDB would be a trivial tool to try. Demos on the website: colab.research.google.com/github/man-group/ArcticDB/blob/v4.5.1/… – James Blackburn, Jul 25 at 7:19
-
@James I wanted to suggest "Arctic" but they changed to "ArcticDB" (which is better, but different). I wanted to suggest open source and a simple upgrade to an existing database, rather than a standalone and complex offering: tdengine.com/how-to-choose-the-best-time-series-database - so I'd prefer Prometheus over the expense of Influx. – Rob, Jul 25 at 9:43
-
Here's a comparison of Prometheus and Influx: influxdata.com/comparison/influxdb-vs-prometheus (on Influx's website). – Rob, Jul 25 at 9:53
-
Great suggestion, thank you. To maintain the key map table, my thinking is to use a MySQL table with a primary key on (job_id, timeseries_id) so I can get atomically incremented IDs. – nz_21, Jul 25 at 16:44