Update table using different random values for each update row

Question 1

I want to update a random 20% of the rows in a Postgres table. For each selected row, I want to set the attribute to its old value plus a random number within specific limits, and the random number must be different for each row.

I am currently doing this:

update tab_ex
set val = (SELECT val + (SELECT random()*2000 FROM generate_series(20,2000) LIMIT 1))
where id in (select id from tab_ex order by random() limit (select count(*)*0.2 from tab_ex));

This updates 20% of my table, but it applies the same random number to every row instead of generating a new random number for each updated row.

score 1 · Accepted Answer · 2016-05-10 21:45:17Z

I have seen this problem in other databases, where a subquery gets "optimized away" even though it has a volatile function in it. That may be happening here. One possibility is to remove the subquery:

update tab_ex
   set val = val + random() * 2000
 where id in (select id
                from tab_ex
               order by random()
               limit (select count(*) * 0.2 from tab_ex)
             );

This should re-run random() for every row being updated.
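If the random increment has to stay within specific limits, as the question asks, the same per-row expression can be scaled into a range. This is only a minimal sketch: the bounds 20 and 2000 are an assumption guessed from the generate_series(20,2000) call in the question, so substitute the real limits.

```sql
-- Assumption: lower bound 20 and upper bound 2000 (guessed from the question).
-- random() returns a value in [0, 1) and is evaluated for each updated row,
-- so every row gets its own increment in [20, 2000).
update tab_ex
   set val = val + (20 + random() * (2000 - 20))
 where id in (select id
                from tab_ex
               order by random()
               limit (select count(*) * 0.2 from tab_ex)
             );
```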

answered May 10, 2016 at 21:45

Gordon Linoff

3 Comments

Abelisto · 2016-05-10 21:52:34Z

This happens because the inner subquery SELECT random()*2000 FROM generate_series(20,2000) LIMIT 1 does not depend on the row being updated, so it is evaluated only once. Changing it to SELECT random()*2000 + val FROM generate_series(20,2000) LIMIT 1, for example, also solves the problem.
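The difference between the two forms can be seen without touching tab_ex. A small sketch, assuming a PostgreSQL session; the alias g and the `g * 0` term are hypothetical and exist only to make the second subquery reference the outer row:

```sql
-- Uncorrelated scalar subquery: it does not reference the outer row,
-- so it is evaluated once and r repeats on every row.
select g, (select random() * 2000) as r
  from generate_series(1, 5) as g;

-- Correlated scalar subquery: it references g, so it is re-evaluated
-- for each row and r differs from row to row.
select g, (select random() * 2000 + g * 0) as r
  from generate_series(1, 5) as g;
```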

Gordon Linoff · 2016-05-11 01:36:01Z

@Abelisto . . . But why would you use a subquery when one is not necessary?

Abelisto · 2016-05-11 05:29:52Z

I am not using a subquery, and I don't know the OP's logic at this point. I am just explaining the "why" and the "how".
