I am using PostgreSQL and have two tables:
Table_A
colA | colB | colC | colD | colE | colF | colG | colH | colI | colJ
Table_B
colA | colB | colC | columnD | colE | columnF | colG | colH | colI | colJ
I am trying to insert a number of rows from table_B
into table_A
. My problem is that Table_A
has a primary key based on colA, colB, colC, colD, and colE
. Table_B
does not have this restriction, which means a simple insert
won't work:
INSERT INTO Table_A (colA, colB, colC, colD, colE, colF, colG, colH, colI, colJ)
SELECT colA, colB, colC, columnD, colE, columnF, colG, colH, colI, colJ FROM Table_B;
I am trying to work around this by using DISTINCT
in my selection from Table_B
. However, I cannot determine the correct syntax to both select distinct
on the 5 primary key columns used in Table_A
, and select all ten columns to be inserted. I have tried
INSERT INTO Table_A (colA, colB, colC, colD, colE)
SELECT DISTINCT colA, colB, colC, columnD, colE FROM Table_B;
Which correctly pulls unique entries but does not populate columns F-J, and I have tried
INSERT INTO Table_A (colA, colB, colC, colD, colE, colF, colG, colH, colI, colJ)
SELECT DISTINCT(colA, colB, colC, columnD, colE) colA, colB, colC, columnD, colE, columnF, colG, colH, colI, colJ FROM Table_B;
But this fails, as the first column entry is a wrapped version of the 5 unique columns, and fails the insert due to column length restrictions - the SELECT DISTINCT
in parens returns '' which obviously doesn't fit:
ERROR: Value is too long for type character varying(12)
My goal would be form a query which gets all the unique combinations of colA, colB, colC, columnD, colE
from Table_B
, and inserts those full rows, including columnF, colG, colH, colI, colJ
into Table_A
.
-
What the criterion should be for selecting one row (from the many that have the same unique key)?ypercubeᵀᴹ– ypercubeᵀᴹ2015年06月26日 05:50:49 +00:00Commented Jun 26, 2015 at 5:50
-
I don't know that I care - I just need at least one valid record. but is there a way to choose?Tom Doe– Tom Doe2015年06月26日 05:52:29 +00:00Commented Jun 26, 2015 at 5:52
1 Answer 1
You're almost there. SELECT DISTINCT
returns that group of columns, as you've found out. If you use DISTINCT ON
instead though, you'll get the rows you're looking for.
INSERT INTO Table_A
(colA, colB, colC, colD, colE, colF, colG, colH, colI, colJ)
SELECT DISTINCT ON (colA, colB, colC, columnD, colE)
colA, colB, colC, columnD, colE, columnF, colG, colH, colI, colJ
FROM
Table_B
As pointed out by ypercube and in the Postgres docs, you can improve this query by adding ORDER BY
. Without it, it seems that the choice between two conflicting rows is unpredictable.
INSERT INTO Table_A
(colA, colB, colC, colD, colE, colF, colG, colH, colI, colJ)
SELECT DISTINCT ON (colA, colB, colC, columnD, colE)
colA, colB, colC, columnD, colE, columnF, colG, colH, colI, colJ
FROM
Table_B
ORDER BY
colA, colB, colC, columnD, colE -- needed as it is
-- , colX, colY -- to choose which row to pick
;
-
If Tom cares about order, yea, then 'order by' would be important, but if they just wants the first entry for whatever reason, I think this would work.Stevie– Stevie2015年06月26日 05:55:45 +00:00Commented Jun 26, 2015 at 5:55
-
Oh wow, didn't realize I was that close! This totally works (and the ORDER BY is neat, but unnecessary for what I need)Tom Doe– Tom Doe2015年06月26日 05:56:56 +00:00Commented Jun 26, 2015 at 5:56
-
@ypercube strange, I just tested this in Postgres and
order by
wasn't required.Stevie– Stevie2015年06月26日 05:57:28 +00:00Commented Jun 26, 2015 at 5:57 -
@ypercube that's definitely not true. I ran the query in postgres just now to confirm, and it ran successfully without order by.Tom Doe– Tom Doe2015年06月26日 05:59:53 +00:00Commented Jun 26, 2015 at 5:59
-
1Yes my bad, sorry. If there is an order by, the first columns must match the columns of the
distinct on
, this is the restriction.ypercubeᵀᴹ– ypercubeᵀᴹ2015年06月26日 06:07:11 +00:00Commented Jun 26, 2015 at 6:07