3

I am doing a data migration and I end up with a temporary table as follows:

curid cuid rtid cd dd rm
10 4 4 2016年01月02日 2016年07月02日 
16 4 4 2016年06月12日 2016年12月12日 Remarks Jun 12
18 5 3 2016年07月18日 2017年07月31日 
8 5 3 2015年06月21日 2016年06月30日 Add some test
11 6 4 2017年01月01日 2017年07月01日 
9 7 3 2017年01月01日 2018年01月31日 

I need to split the data into two tables. Based on that same data, it should look like this:

Table A
 id curid cuid rtid
 1 10 4 4 
 2 18 5 3 
 3 11 6 4 
 4 9 7 3

That's one row per distinct (cuid, rtid) plus a curid value picked from each set of duplicates. id is just a sequential number.

Table B
 id curid cd dd rm
 1 10 2016年01月02日 2016年07月02日 
 2 10 2016年06月12日 2016年12月12日 Remarks Jun 12
 3 18 2016年07月18日 2017年07月31日 
 4 18 2015年06月21日 2016年06月30日 Add some test
 5 11 2017年01月01日 2017年07月01日 
 6 9 2017年01月01日 2018年01月31日

The actual curid is irrelevant as long as the records in Table B match the associated record in Table A (so we could even use a temp sequence or something to set the curid).

Erwin Brandstetter
186k28 gold badges463 silver badges636 bronze badges
asked Jul 4, 2017 at 12:58
1
  • I've edited the question to make it clearer. I still think a window function should solve this problem but I can't figure out how to set and then match the curid for both tables. Commented Jul 6, 2017 at 16:08

3 Answers 3

3

Your test setup

(Best provided this way in your question - hint!)

CREATE TEMP TABLE tmp (
 curid int
, cuid int
, rtid int
, cd date
, dd date
, rm text);
INSERT INTO tmp VALUES
 (10, 4, 4, '2016-01-02', '2016-07-02', NULL) 
 ,(16, 4, 4, '2016-06-12', '2016-12-12', 'Remarks Jun 12')
 ,(18, 5, 3, '2016-07-18', '2017-07-31', NULL) 
 ,(8 , 5, 3, '2015-06-21', '2016-06-30', 'Add some test')
 ,(11, 6, 4, '2017-01-01', '2017-07-01', NULL)
 ,(9 , 7, 3, '2017-01-01', '2018-01-31', NULL);

Solution

Create target tables if they don't exist:

CREATE TEMP TABLE a (
 id serial
 , curid int -- UNIQUE?
 , cuid int
 , rtid int
);
CREATE TEMP TABLE b (
 id serial
 , curid int
 , cd date
 , dd date
 , rm text
);

Use DISTINCT ON for table A:

INSERT INTO a (curid, cuid, rtid)
SELECT DISTINCT ON (cuid, rtid)
 curid, cuid, rtid
FROM tmp
ORDER BY cuid, rtid, curid -- pick smallest curid per group
RETURNING *; 
id | curid | cuid | rtid
-: | ----: | ---: | ---:
 1 | 10 | 4 | 4
 2 | 8 | 5 | 3
 3 | 11 | 6 | 4
 4 | 9 | 7 | 3

Detailed explanation here:

Use a simple window function for table B:

INSERT INTO b (curid, cd, dd, rm)
SELECT min(curid) OVER (PARTITION BY cuid, rtid), cd, dd, rm
FROM tmp
ORDER BY cuid, rtid -- optional
RETURNING *; 
id | curid | cd | dd | rm 
-: | ----: | :--------- | :--------- | :-------------
 1 | 10 | 2016年01月02日 | 2016年07月02日 | null 
 2 | 10 | 2016年06月12日 | 2016年12月12日 | Remarks Jun 12
 3 | 8 | 2016年07月18日 | 2017年07月31日 | null 
 4 | 8 | 2015年06月21日 | 2016年06月30日 | Add some test 
 5 | 11 | 2017年01月01日 | 2017年07月01日 | null 
 6 | 9 | 2017年01月01日 | 2018年01月31日 | null 

curid is guaranteed to match since we picked the smallest per group in both queries.

dbfiddle here

answered Jul 4, 2017 at 16:06
2
  • @Lee: Consider the update. Next time, rather start a new question than changing the nature of the existing one after answers have been given - even if you realize you asked the wrong question, that's irrelevant for the general public. Commented Jul 7, 2017 at 2:21
  • @ErwinBrandstetter that you so much for teaching how to better ask questions and for teaching me how to be a better developer. This is a great solution! Commented Jul 7, 2017 at 3:30
1

I think what you're looking for is dense_rank()

create table tmp (curid int, cuid int, rtid int) ;
insert into temp
values
 (176,231,23),
 (192,231,23),
 (145,231,23),
 (122,231,23),
 (122,231,24),
 (122,231,25) ;
select dense_rank() over (order by cuid, rtid) as curid,
 cuid, 
 rtid
from temp ;
ypercubeTM
99.7k13 gold badges217 silver badges306 bronze badges
answered Jul 4, 2017 at 13:50
0

With clarification, what I'd do in Microsoft SQL is,

SELECT cuid, rtid, IDENTITY(int, 1, 1) AS curid,
COUNT(*) AS how_many
INTO another_table
FROM original_table 
GROUP BY cuid, rtid
ORDER BY cuid, rtid

That creates a new table [another_table] of every combination of (cuid, rtid), with a row id in the new table.

Now, how much of that works in PostgreSQL?

answered Jul 5, 2017 at 8:13

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.