I am stuck on the following task (PostgreSQL 9.3). Let's say we have the following table1 (which has 10k rows):
table1:
id
754
800
330
4
59
My goal is to create another table2 with two columns (source, target), where the values of both columns are a random selection of table1.id values. For example:
table2:
source | target
754 | 59
4 | 4
59 | 330
This is what I've done:
CREATE TABLE table2
(
id serial NOT NULL,
source integer,
target integer,
distance double precision
);
-- Select 300 random table1.id values and insert them into table2.source
INSERT INTO table2(source)
SELECT id FROM table1 ORDER BY RANDOM() LIMIT 300;
-- Select 300 random table1.id values and update table2.target
UPDATE table2 SET target = i.id
FROM (SELECT id FROM table1 ORDER BY RANDOM() LIMIT 300) i;
I got the following result:
source | target
754 | 59
330 | 59
800 | 59
Unfortunately, all the table2.target
values are the same. How can I update table2.target
with different random values (as in the example)? Or maybe UPDATE
is not the right way to do this?
1 Answer
I suggest a "data-modifying CTE":
WITH cte AS (
SELECT *, row_number() OVER () AS rn
FROM (
SELECT id
    FROM table1
ORDER BY random()
LIMIT 600 -- 2 x 300
) sub
)
INSERT INTO table2(source, target)
SELECT c1.id, c2.id
FROM cte c1
JOIN cte c2 ON c2.rn = c1.rn + 300;
In the CTE:
- Select 600 random rows (to produce 300 new rows).
- Add a row number in the outer SELECT.
Then pair two values in a self-join with an offset of 300.
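To see the pairing mechanics in isolation, here is a minimal sketch with made-up numbers (6 rows, offset 3, using generate_series in place of your table):

```sql
-- Toy version: shuffle 6 ids, then pair rows 1-3 with rows 4-6
WITH cte AS (
    SELECT id, row_number() OVER () AS rn
    FROM (
        SELECT g AS id
        FROM generate_series(1, 6) g
        ORDER BY random()
        LIMIT 6          -- 2 x 3
    ) sub
)
SELECT c1.id AS source, c2.id AS target
FROM cte c1
JOIN cte c2 ON c2.rn = c1.rn + 3;  -- offset = half the sample size
```

Each half of the shuffled sample supplies one column, so source and target are drawn independently at random.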
To get random rows from a huge table cheaply, consider:
Best way to select random rows PostgreSQL
Hello Erwin! Thanks a lot for your answer. Your solution works great. I learned from it. – Theo, Jul 3, 2014 at 15:39
Another idea would be to add
random() AS rnd
in the subquery, add
LAG(id) OVER (ORDER BY rnd) AS lagid
in the CTE, and then use
INSERT ... SELECT id, lagid FROM cte WHERE rn % 2 = 0;
– ypercubeᵀᴹ, Aug 9, 2014 at 18:35
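Spelled out, that LAG variant might look like the following (an untested sketch; it reuses the table1/table2 names from the question and assumes a 600-row sample as in the accepted answer):

```sql
-- Pair each even-numbered row with its predecessor via LAG
WITH cte AS (
    SELECT id,
           row_number() OVER (ORDER BY rnd) AS rn,
           lag(id)      OVER (ORDER BY rnd) AS lagid
    FROM (
        SELECT id, random() AS rnd
        FROM table1
        ORDER BY rnd
        LIMIT 600            -- 2 x 300
    ) sub
)
INSERT INTO table2(source, target)
SELECT id, lagid
FROM cte
WHERE rn % 2 = 0;            -- every 2nd row carries the previous id in lagid
```

Here consecutive rows of the shuffled sample form the pairs, so no self-join is needed.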