DDL support for partitioning · hibernate/hibernate-orm · Discussion #5360

beikov
Sep 30, 2022
Maintainer

Moving the discussion from #5353 to here, the idea is to introduce the following annotations which will be used for emitting partitioning DDL in hbm2ddl.

@PartitionByRange({ @RangePartition(name = "...", from = "...", to = "...") }) // from is inclusive, to is exclusive
@PartitionByList({ @ListPartition(name = "...", keys = {...}) })
@PartitionByHash({ @HashPartition(name = "...") }) // modulus is determined by count of partitions, index determines remainder value

I'll list the databases that have some support for partitioning:

Database	Range	List	Hash
Oracle	✓(only less than supported)	✓	✓
DB2	✓	✓	✓
PostgreSQL	✓	✓	✓ (since v11)
SQL Server	✓(bounds are inclusive)	✓(emulate through range?)	✓(emulate through computed column)
MySQL	✓(only less than supported)	✓	✓
MariaDB	✓(only less than supported)	✓	✓

For databases that don't support partitioning we simply don't generate DDL or fail. Partitioning is a tool for data management but it shouldn't matter semantically if partitioning is used or not.

The only part I am unsure about is range partitioning since SQL Server uses inclusive bounds, we can't just omit the from member and infer that through the order of defined partitions, as we will have to use LEFT bound values for the DDL on SQL Server. We could think about dropping the to member though, as we could infer that from the next partition value and for the last partition, use the special MAXVALUE. So I am pretty sure we could support

@PartitionByRange({
 @RangePartition(name = "p1", start = "0"),
 @RangePartition(name = "p1", start = "10") 
})

to generate

PARTITION BY LIST (..)
(
 PARTITION p1 VALUES LESS THAN 10
 PARTITION p2 VALUES LESS THAN MAXVALUE
)

Replies: 3 comments 6 replies

gavinking
Nov 12, 2022
Maintainer

I don't hate it, but I'm still sorta struggling to see myself wanting to write this mess of annotations rather than just writing fragments of native SQL in, say, @Partition and @PartitionBy annotation.

@PartitionBy("RANGE (logdate)")
@Partition("FROM ('2006-02-01') TO ('2006-03-01')")
@Partition("FROM ('2006-03-01') TO ('2006-04-01')")

Or, alternatively:

@Partition(by="RANGE (logdate)", 
 partitions={"FROM ('2006-02-01') TO ('2006-03-01')", 
 "FROM ('2006-03-01') TO ('2006-04-01')"})

I think this would be much easier to implement, and probably also easier to use.

If there's a need for "portable" cross-dialect partitioning, which I just don't see as something that's like to be common at all, then we could easily add @PartitionBy/@Partition to @DialectOverride.

That way we would let the user customize partitioning for the particular database, and their particular needs, rather than struggle by ourselves to abstract out some sort of lowest-common-denominator partitioning facility, where the abstraction would by nature not even be very transparent.

WDYT, @beikov ?

5 replies

@beikov

beikov Nov 14, 2022
Maintainer Author

I don't mind if you want to experiment with this approach, but I guess I'd rather not use the "good name" for something that expects SQL dialect specific DDL fragments. I'd like to keep the good name for the abstraction if we anticipate to add this.

I really think that list and hash partitioning are something that can be nicely abstracted and that specifying the partitions for these partitioning schemes is actually something that make sense for people to define in their code.
For range partitioning I am not so sure. I guess most people that use range partitioning will have some sort of tool that automates the generation of new partitions. Specifying the partitions for range partitioning in code then seems a bit unnecessary. We could maybe start out with a simple @PartitionByRange annotation that doesn't accept partitions. We would simply emit a default partition in the DDL that accepts all values. If people start using this and request this feature, we can think about adding it. Wdyt @gavinking?

@gavinking

gavinking Nov 14, 2022
Maintainer

My main issue is that I think the annotations you're proposing are going to end up verbose, difficult to read, and inconvenient to write, compared to just writing fragments of SQL, and that the more verbose approach doesn't actually offer any benefits in practice because the number of systems in the intersection of 1. using partitioning, 2. doing cross-db development, and 3. using schema export to create the partitioned schema, is a vanishingly tiny set.

What i'm saying is that something like what I sketched out above is more user-friendly, significantly less complicated, and also makes it easier for people to get at every last little nuance or toggle switch of what their db offers, rather that having to target some sort of lowest-common-denominator.

@beikov

beikov Nov 14, 2022
Maintainer Author

the number of systems in the intersection of 1. using partitioning, 2. doing cross-db development, and 3. using schema export to create the partitioned schema, is a vanishingly tiny set.

This kind of makes me think we should maybe not do it at all unless someone from the community steps up and provides us with real world needs and use cases.

What i'm saying is that something like what I sketched out above is more user-friendly, significantly less complicated, and also makes it easier for people to get at every last little nuance or toggle switch of what their db offers, rather that having to target some sort of lowest-common-denominator.

Yeah that's true. Since the @Partition column wouldn't interfere with the naming I proposed, it's fine for me if you want to take a stab at that.

@BobMakhlin

BobMakhlin Feb 26, 2024

@beikov
We'd like to provide our real-world use case:
There's an account table (containing accounts and balances), every day we make a snapshot into the account_snapshot table (that contains snapshot_date, account, and balance cols). That's a huge table, and we'd like to partition this table by snapshot_date column, by range (weekly). We're restricted to use hbm2ddl to create database schema.
For us, minimal feature would be the support of partitioning in hbm2ddl, creation of partions themselves would be done via native queries.

@beikov

beikov Feb 26, 2024
Maintainer Author

Your request is noted and I understand that it is desirable to have this, but you really shouldn't rely solely on hbm2ddl for schema management.

loiclefevre
Nov 25, 2022

A few comments related to Oracle. Since 11g release, you have:

INTERVAL partitioning: same as RANGE but with partitions automatically added
REFERENCE partitioning: equipartition a child table with its partitioned parent using parent's partition key (column(s) not present inside the child table)
SYSTEM partitioning: let the application (ORM?) control the placement of data

Since 12cR2, you have:

AUTO-LIST partitioning: partition maintenance for LIST partitioned table done automatically

Also, Oracle supports 2 levels of partitioning (called sub-partitioning) such as partition by interval and then by list.

Finally, regarding point 3 above, Oracle allows partitioning a non-partitioned table online without schema export/import since 12cR2.

0 replies

HollisMeynell
Sep 18, 2024

1 reply

@beikov

beikov Sep 18, 2024
Maintainer Author

This has nothing to do with this discussion. Going to delete this. Go to https://hibernate.org/community/ and ask your question on the forums or chat.

Uh oh!

DDL support for partitioning #5360

Uh oh!

beikov Sep 30, 2022 Maintainer

Replies: 3 comments · 6 replies

Uh oh!

Uh oh!

gavinking Nov 12, 2022 Maintainer

Uh oh!

beikov Nov 14, 2022 Maintainer Author

Uh oh!

gavinking Nov 14, 2022 Maintainer

Uh oh!

beikov Nov 14, 2022 Maintainer Author

Uh oh!

Uh oh!

BobMakhlin Feb 26, 2024

Uh oh!

beikov Feb 26, 2024 Maintainer Author

Uh oh!

loiclefevre Nov 25, 2022

Uh oh!

HollisMeynell Sep 18, 2024

Uh oh!

beikov Sep 18, 2024 Maintainer Author

beikov
Sep 30, 2022
Maintainer

Replies: 3 comments 6 replies

gavinking
Nov 12, 2022
Maintainer

beikov Nov 14, 2022
Maintainer Author

gavinking Nov 14, 2022
Maintainer

beikov Nov 14, 2022
Maintainer Author

beikov Feb 26, 2024
Maintainer Author

loiclefevre
Nov 25, 2022

HollisMeynell
Sep 18, 2024

beikov Sep 18, 2024
Maintainer Author