Insert results from query, update old existing row if query doesn't still return same value

Question 1

I am trying to refactor a table which contains historical data and therefore isn't ever deleted, but has an 'active' field.

Each day we run a query (a few joins, but quite slow) to get some data to insert into this table:

if the row exists already, and is present in the query, we do nothing
if the row exists but the result isn't in todays query, we set 'active' to false
if the row doesn't exist but is in the query, we insert it (with active defaulting to true)

It is possible for an entity to be active, made inactive (same row), then a few days later become active again (would be a new row).

The table is using a surrogate key, as the natural key can be repeated as noted above, though there should only ever be a single entry that is active on the natural key.

I'm curious if there is an action or 'pattern' I can take that will enable such functionality - I realise I'm asking a lot :)

I am using a java library (jOOQ) to generate the sql, so I am only looking for sql as the solution, no PL/pgSQL and no stored procs. I doubt it is possible without 2 passes (1. perform the update where not in query result, 2. perform the insert from query result) but I'm intrigued if there are possibilities.

Edit: Example table history, where ** is an update to same row

pk_surrogate | natural_id | active| date_made_active
1 | 1 | true | 2020年01月01日 -- row for entity 1 added
1 ** | 1 | false | 2020年01月01日 -- row for entity 1 no longer present on 2020年01月02日, so marked inactive
2 | 1 | true | 2020年01月03日 -- new row for entity 1 added when it was present on 2020年01月03日
3 | 2 | true | 2020年01月03日 --new value for 2020年01月03日

Question 2

Neither pk_surrogate nor natural_id seem to be primary (or unique) keys - is that sample data the source or the target?

Question 3

pk_surrogate is primary key - it's unique per row - the row with ** is an update, didn't know how best to repesent an updated row

Question 4

Something like this maybe?

with daily_query as (
 .... this is your current query ...
), upsert as (
 -- covers first and third requirement
 insert into the_table (....)
 select ...
 from daily_query
 on conflict (pk_column) do nothing
)
-- this sets those rows to in active that exist in the target table
-- but are not in the initial query
update the_table t
 set active = false
where not exists (select * 
 from daily_query q
 where q.pk_column = t.pk_column);

Question 5

Thanks for your quick response @a_horse_with_no_name - in this example, is the pk_column the surrogate key or natural (but can repeat) key? In my model the pk would be the surrogate key and that's why I didn't think this would work, as that will always be unique. I must admit I don't think the check "unique(natural_key) where active = true" exists, but perhaps by adding it the above would also work...

Question 6

It a primary or unique key. Obviously you need to have (at least) one, otherwise you can't detect "if the row exists". As you didn't show us your table definition I had to make something up

Question 7

Thanks - I described with words but I should know better by now :) Have included sample table in the question, the unique constraint is on column 'natural_key' where active = true. It's the conditionality but I see now I can probably do on conflict (natural_key) where active = true do nothing if there is a unique index on that key - I'll give it a try. Thanks!

user1822user1822 · Answer 1 · 2020-07-28 20:49:34Z

1

Something like this maybe?

with daily_query as (
 .... this is your current query ...
), upsert as (
 -- covers first and third requirement
 insert into the_table (....)
 select ...
 from daily_query
 on conflict (pk_column) do nothing
)
-- this sets those rows to in active that exist in the target table
-- but are not in the initial query
update the_table t
 set active = false
where not exists (select * 
 from daily_query q
 where q.pk_column = t.pk_column);

Share

Improve this answer

answered Jul 28, 2020 at 20:49

user1822user1822

3

Thanks for your quick response @a_horse_with_no_name - in this example, is the pk_column the surrogate key or natural (but can repeat) key? In my model the pk would be the surrogate key and that's why I didn't think this would work, as that will always be unique. I must admit I don't think the check "unique(natural_key) where active = true" exists, but perhaps by adding it the above would also work...

Mitch Kent
– Mitch Kent

2020年07月29日 06:16:53 +00:00
Commented Jul 29, 2020 at 6:16
It a primary or unique key. Obviously you need to have (at least) one, otherwise you can't detect "if the row exists". As you didn't show us your table definition I had to make something up

user1822
– user1822

2020年07月29日 06:18:38 +00:00
Commented Jul 29, 2020 at 6:18
Thanks - I described with words but I should know better by now :) Have included sample table in the question, the unique constraint is on column 'natural_key' where active = true. It's the conditionality but I see now I can probably do on conflict (natural_key) where active = true do nothing if there is a unique index on that key - I'll give it a try. Thanks!

Mitch Kent
– Mitch Kent

2020年07月29日 07:25:47 +00:00
Commented Jul 29, 2020 at 7:25

Add a comment |

Stack Exchange Network

Insert results from query, update old existing row if query doesn't still return same value

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Insert results from query, update old existing row if query doesn't still return same value

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions