One possible way to select random rows in PostgreSQL is this:
select * from table order by random() limit 1000;
(see also here.)
My question is: what does order by random() mean exactly? Is a random number generated and then used as some kind of "seed"? Or is this special built-in syntax, so that random() has a different meaning here than in other contexts?
From some experimentation, the latter explanation seems more plausible. Consider the following:
# select random();
random
═══════════════════
0.336829286068678
(1 row)
# select * from article order by 0.336829286068678 limit 5;
ERROR: non-integer constant in ORDER BY
LINE 1: select * from article order by 0.336829286068678 limit 5;
You might find this of interest! P.S. Welcome to the forum! – Vérace, Mar 15, 2020 at 23:06
1 Answer
ORDER BY random() is not a special case. random() is evaluated once for each row, producing one random number per row, and the rows are then sorted by those numbers. The result is that the rows come back in a random order.
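One way to picture this is to write the per-row sort key out explicitly. The following is only a sketch: generate_series stands in for a real table, and the names id, sort_key and keyed are made up for the illustration. It is meant to behave like ORDER BY random(), not to show what the planner does internally.

```sql
-- Assumption: generate_series(1, 10) stands in for a real table.
-- Step 1: attach one freshly generated random() value to every row.
-- Step 2: sort by that value and keep the first three rows.
SELECT id
FROM (
    SELECT g AS id, random() AS sort_key
    FROM generate_series(1, 10) AS g
) AS keyed
ORDER BY sort_key
LIMIT 3;

-- Shorter form with the same effect:
SELECT g AS id
FROM generate_series(1, 10) AS g
ORDER BY random()
LIMIT 3;
```

Each run returns three of the ten ids, and usually a different three each time.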
The special case is actually ORDER BY <literal constant>, and it only works with integers, which is why the non-integer constant produces the error you show. An integer constant is treated as a position in the select list, and the rows are ordered by the column at that position. This lets you both select and order by an expression without having to repeat the expression in both places.
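A small sketch of the positional form, again using generate_series in place of a real table (the column names n and distance are invented for the example):

```sql
-- Assumption: generate_series(1, 9) stands in for a real table.
-- ORDER BY 2 means "the second column of the select list", i.e. distance,
-- so abs(g - 5) does not have to be repeated in the ORDER BY clause.
-- ORDER BY 1 breaks ties using the first column, n.
SELECT g AS n, abs(g - 5) AS distance
FROM generate_series(1, 9) AS g
ORDER BY 2, 1;

-- By contrast, a non-integer literal such as ORDER BY 2.5 is rejected with
-- "ERROR: non-integer constant in ORDER BY", as in the question.
```

The first row returned is n = 5, since it is the only row with distance 0.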
Thanks, I'm one step nearer but still not totally clear. (In particular, this was new info: "It generates random numbers, one for each row, and then sorts by them.") My follow-up questions:
- Normally random() generates a number between 0.0 and 1.0; when used in the context of order by, does it generate integers instead?
- If yes, then how does it know that in that place it has to return integers? And how does it know the maximum (which is the number of rows, I guess)?
- If no (i.e. it still generates a value between 0 and 1), then why doesn't such a hard-coded value work too?
– Attilio, Mar 10, 2020 at 8:58
It returns floats, and the rows are ordered by them. There is no need for it to return integers. The hard-coded (aka literal constant) value doesn't work because it triggers the special case in PostgreSQL. PostgreSQL knows the difference between a literal and a function call and can act on that difference. – jjanes, Mar 10, 2020 at 14:24
"It returns floats, which are ordered on. ... The hard coded (aka literal constant) value doesn't work because it triggers the special case in PostgreSQL." Thanks, everything is clear now. (I tried using order by dummyrand(), where dummyrand is a function hard-coded to always return the same value, and it did indeed work.) Please add this info to the answer, then I can accept it. – Attilio, Mar 10, 2020 at 15:35
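The experiment described in that last comment might look roughly like this. The comment does not show how dummyrand was defined, so the function body below is a hypothetical reconstruction, and generate_series again stands in for a real table.

```sql
-- Hypothetical reconstruction: the original definition of dummyrand is not shown.
CREATE FUNCTION dummyrand() RETURNS double precision
LANGUAGE sql
AS $$ SELECT 0.336829286068678::double precision $$;

-- Accepted: the sort key is a function call, not a literal constant,
-- so the "non-integer constant in ORDER BY" check is not triggered.
SELECT g AS id
FROM generate_series(1, 5) AS g
ORDER BY dummyrand();
```

Because every row gets the same sort key, the result is not shuffled, and PostgreSQL is free to return the rows in any order.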