0

I want to find rows missing from a sequence of values in my table with differents formats. For an example, look at this.

I want to find the missing lines via c_order_id and documentno. the one with '_p' at the end, and the other without '_p', just the numbers.

CREATE TABLE documents (
 "order_id" VARCHAR(22),
 "documentno" VARCHAR(20)
);
INSERT INTO documents
 ("order_id", "documentno")
VALUES
 ('100001120', 'PROFS/2021/02050_P'),
 ('100001125' ,'PROFS/2021/02055_p'),
 ('100001127', 'PROFS/2021/02056'),
 ('100001135' ,'PROFS/2021/02060'),
 ('100001139' ,'AGB/2021/02040'),
 ('100001172' ,'AGB/2021/02047')

Query used:

WITH cte as (
SELECT LEFT("documentno",12) lpart , MIN(RIGHT("documentno",5)) minpart, MaX(RIGHT("documentno",5)) maxpart FROM documents
groUP BY LEFT("documentno",12))
sELECT lpart || generate_series(minpart::INTEGER, maxpart::INTEGER) as missing_documentno FROM cte
except
select documentno from documents
ORDER bY missing_documentno;

This query generates

ERROR: invalid input syntax for type integer: "055_p"
asked Apr 6, 2021 at 7:01
0

1 Answer 1

1

You can use regexp_replace() to cleanup the numbers so that you can cast it to an integer value.

with cte as (
 select left(documentno,12) lpart, 
 regexp_replace(split_part(documentno, '/', 3), '[^0-9]', '', 'g')::int as num
 from documents
), minmax as (
 select lpart, min(num) minpart, max(num) maxpart
 from cte
 group by lpart
)
select m.lpart, t.missing_document_no
from minmax m
 cross join generate_series(m.minpart, m.maxpart) as t(missing_document_no)
order by t.missing_document_no

I have split up the "cleaning" of the data and the group into two CTEs to make the steps a bit more readable. Given your sample data, this returns:

lpart | missing_document_no
-------------+--------------------
PROFS/2021/0 | 2050
PROFS/2021/0 | 2051
PROFS/2021/0 | 2052
PROFS/2021/0 | 2053
PROFS/2021/0 | 2054
PROFS/2021/0 | 2055
PROFS/2021/0 | 2056
PROFS/2021/0 | 2057
PROFS/2021/0 | 2058
PROFS/2021/0 | 2059
PROFS/2021/0 | 2060

To find the missing values, you can extend the final select to:

with (
...
), minmax as (
...
)
select lpart||t.doc_no as missing_doc_no
from minmax m
 cross join generate_series(minpart, maxpart) as t(doc_no) 
where not exists (select * 
 from documents d
 where regexp_replace(split_part(d.documentno, '/', 3), '[^0-9]', '', 'g')::int = t.doc_no)
answered Apr 6, 2021 at 7:38
5
  • i want to get 'PROFS/2021/02051_P', 'PROFS/2021/02052_P', 'PROFS/2021/02053_P', 'PROFS/2021/02054_P', 'PROFS/2021/02057, 'PROFS/2021/02058, 'PROFS/2021/02059 Commented Apr 6, 2021 at 9:27
  • @FranckWhite: why is it PROFS/2021/02051_P and not PROFS/2021/02051? - how do you know when to append the _P and when not? Commented Apr 6, 2021 at 9:38
  • table come like this, those who have _p should be generate with _p, and the others without _p. example in the table , i have PROFS/2021/02050_P and PROFS/2021/02055_P, and 'PROFS/2021/02056' and 'PROFS/2021/02060' i want to get the missing documentno «PROFS / 2021 / 02051_P», «PROFS / 2021 / 02052_P», «PROFS / 2021 / 02053_P», «PROFS / 2021 / 02054_P» and «PROFS / 2021/02057,« PROFS / 2021/02058, 'PROFS / 2021/02059» Commented Apr 6, 2021 at 9:48
  • how can i return missing rows that lpart < 12 example dbfiddle.uk/… Commented Apr 6, 2021 at 11:05
  • @FranckWhite When can lpart be < 12? It's a string! Please ask a new question - pointing back to this one if required. Commented Apr 6, 2021 at 17:22

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.