Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Duration storage #5955

gavinking started this conversation in Design Proposals
Jan 14, 2023 · 3 comments · 11 replies
Discussion options

Now that I've got our Duration support working properly again, it has re-occurred to me that the way Hibernate stores Durations is not very user-friendly: sure, Java represents all durations with nanosecond precision, but it's not very nice to open up the database and see a value with 11 zeros on it representing one measly hour. A nanosecond is not actually a very useful human-scale measure of time.

So to my mind, a much friendlier way to store a duration is as seconds, i.e. using numeric(21,9) instead of numeric(21).

Or, hell, even as days would make sense, using something like numeric(22,14).

How could we accommodate this in a way that doesn't break backward compatibility with our own data?

Well, I know I've resisted introducing new special-purpose annotations for this kind of thing in the past, but here I guess I would propose a new @DurationStorage annotation, with enum TimeZoneStorageType { NANOSECOND, SECOND, DAY }.

This would actually be pretty straightforward to implement: all the machinery is there. I was very pleasantly surprised to notice that actually implemented HQL duration support in a pretty robust way, where durations know about their own precision and the translation code automatically adapts to the precision of the duration. So that's easy enough.

Finally, I believe that @beikov would like to be able to store a Duration in a SQL interval second column, and I think we can accommodate that here too, though it would take a bit more work to get it working in HQL. (What was there previously was not working.) Ultimately we could have something like:

enum DurationStorageType { 
 NANOSECOND, 
 SECOND, 
 DAY,
 INTERVAL_SECOND
}

Alternatives?

  • Well, one straightforward alternative worth due consideration is to just introduce SecondDurationJavaType and DayDurationJavaType, and use the @JavaType annotation. This would work well-enough, I believe, and I guess it could also be extended to accommodate IntervalDurationJavaType. Note, however, that the HQL translator would need to be aware of these new JavaTypes and special-case them using instanceof, so it's not quite as clean as it looks.
  • Perhaps this can even be handled with just a converter? I would have to try. (Hell, it probably could even be done with a JdbcType, though I think that would be an abuse of the APIs.)

Either of these alternative approaches has the advantage of not introducing new annotations, though on the flip side that makes the functionality less-discoverable to users. And I guess we could argue about which of:

@DurationStorage(DAY)
Duration duration;

vs

@JavaType(DayDurationJavaType.class)
Duration duration;

is more aesthetically pleasing. I'm not sure, but I guess I lean toward the first option. Especially since we have already gone there with @TimeZoneStorage.

You must be logged in to vote

Replies: 3 comments 11 replies

Comment options

That looks quite nice, and I have no objections at all in adding such handy mapping annotations - however let me point out that I don't agree with the premise of "user friendlyness" of not storing nanoseconds.

Encoding such values at high resolution (not precision) at least it ensures that on reading a stored value, the same value is returned. Let's remember that time-zone dependent values are a mess as they depend on politics and systems being updated timely and consistently to be aware of whatever weird non-technical decisions politicians made; even assuming time-zone settings are correct from a technical point of view, it's quite possible that the JVM and the DB are at different stages or quality of patching in such regard (i.e. one might not have been updated to the latest oddness yet), and since storing at lower precision implies truncations based on such inconsistent rules, the "user friendliness" is arguable. Storing at highest resolution at least prevents some classes of such oddities.

Also, most users won't care what you store. So while I like being able to control the resolution I'd keep high resolution as a default.

You must be logged in to vote
2 replies
Comment options

gavinking Jan 14, 2023
Maintainer Author

however let me point out that I don't agree with the premise of "user friendlyness" of not storing nanoseconds.

I'm not suggesting not storing them. numeric(21,9) lets you store nanoseconds as the fractional part, after the decimal point. And since numeric is an exact type, there's no rounding error.

UPDATE: this isn't quite right, as @yrodiere points out below, for DAY storage, there can be a rounding error, it looks like numeric is actually a fair bit worse than double in this respect.

On the other hand, SECOND storage doesn't have rounding error.

Comment options

great!

Comment options

but it's not very nice to open up the database and see a value with 11 zeros on it representing one measly hour

I'd argue that's an endless problem: when I'm expressing a duration in years, the number of days will likely be so large it will feel meaningless.

Or, hell, even as days would make sense, using something like numeric(22,14).

I very much doubt all durations expressed as a fraction of a day can be represented as a decimal number (with a finite number of decimal digits). You're bound to hit rational but non-decimal numbers at some point and then you might lose precision. That's the kind of limitation I'd hate to expose users to, especially just for the convenience of reading "understandable" number in the DB, and especially if the problem can be introduced by something so seemingly innocuous as picking one enum value over another.

For example take a duration representing a prime number of seconds, say 32212254719. In days (divided by 86400), that's 372827.0222106481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481481....

Rounding might work out fine in this case (I don't know), but then it might not in other cases.

enum TimeZoneStorageType { NANOSECOND, SECOND, DAY }.

I suppose you meant DurationStorageType? I don't see what Timezone has to do with this.

Alternatives?

I believe either alternative is preferable to @DurationStorage, be it only because the separation between "safe" storage and unsafe storage such as "days" is clearer.

You must be logged in to vote
9 replies
Comment options

gavinking Jan 16, 2023
Maintainer Author

Looks good.

Ah hahah, wait, no it doesn't. What?? Let me try again.

Comment options

gavinking Jan 16, 2023
Maintainer Author

OK, fine, you're right @yrodiere, numeric on postgres is significantly worse at this than double. This I did not know.

So, fine, DAY storage probably isn't a great idea.

hreact=# create table t(t double precision);
CREATE TABLE
hreact=# insert into t values(32212254719.0/86400);
INSERT 0 1
hreact=# select t*86400 from t;
 ?column? 
-------------
 32212254719
(1 row)

Good.

hreact=# create table t(t numeric(24,15));
CREATE TABLE
hreact=# insert into t values(32212254719::numeric/86400::numeric);
INSERT 0 1
hreact=# select t*86400::numeric from t;
 ?column? 
-----------------------------
 32212254718.999999987200000
(1 row)

Bad, as you predicted.

Comment options

Reversability of the function being able to get back to the exact same representation would be my primary concern; and not because it could be technically wrong (I doubt that) but because of user surprise.

No biggie, I suppose we could warn about that possibility. I just don't want to have to explain that to hundreds of users who don't have basic math training :)

Comment options

Wait, perhaps I misunderstood your concern. Is your concern reversibility of the conversion to days and back, when you have something like an exact quantity of hours?

Indeed, that was my concern. Users are not introducing the mistake here, we are.

No biggie, I suppose we could warn about that possibility

I doubt documentation will cut it here, especially if we make it so easy to switch from SECONDS to NANOSECONDS to DAYS. It just feels surprising that one of those carries such limitations, especially if those limitations are not consistent (only for some prime numbers etc.).

Like I said, the alternative solutions might be easier to document effectively (read: with users actually getting warned before using the feature).

OR we could just always truncate values, so that at least the behavior is consistent. It feels "right" to me that users needing nanosecond precision will end up storing an (integer) number of nanoseconds while users needing only day/second precision will end up storing an (integer) number of days/seconds. And if we name the annotations/enums right, the truncation won't be surprising at all.

Comment options

gavinking Jan 16, 2023
Maintainer Author

So it looks like it works as long as you call round() explicitly. (Tried on both postgres and MySQL.)

hreact=# select round(t*86400) from t;
 round 
-------------
 32212254719

Which is fine. But the whole point of this was to get values displayed nicely to the user in adhoc queries, where they won't (by default) know to use round(). So that undermines the value of having this at least somewhat.

Comment options

So it seems to me that we all agree the following options would be useful:

enum DurationStorageType { 
 NANOSECOND, // store as numeric(21)
 SECOND, // store as numeric(21,9)
 INTERVAL_SECOND // store as interval second
}

I think that a truncating DAY option could also make sense, similar to how @Temporal(DATE) would truncate the time part away, but maybe we should rather name it DAY_TRUNCATE if we want that and store it as bigint?

Apart from that, the idea to have a storage type enum kind of would help mapping Period to other interval types (also see #4892)

You must be logged in to vote
0 replies
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

AltStyle によって変換されたページ (->オリジナル) /