I want to find rows missing from a sequence of values in my table with differents formats. For an example, look at this.
I want to find the missing lines via c_order_id
and documentno
. the one with '_p' at the end, and the other without '_p', just the numbers.
CREATE TABLE documents (
"order_id" VARCHAR(22),
"documentno" VARCHAR(20)
);
INSERT INTO documents
("order_id", "documentno")
VALUES
('100001120', 'PROFS/2021/02050_P'),
('100001125' ,'PROFS/2021/02055_p'),
('100001127', 'PROFS/2021/02056'),
('100001135' ,'PROFS/2021/02060'),
('100001139' ,'AGB/2021/02040'),
('100001172' ,'AGB/2021/02047')
Query used:
WITH cte as (
SELECT LEFT("documentno",12) lpart , MIN(RIGHT("documentno",5)) minpart, MaX(RIGHT("documentno",5)) maxpart FROM documents
groUP BY LEFT("documentno",12))
sELECT lpart || generate_series(minpart::INTEGER, maxpart::INTEGER) as missing_documentno FROM cte
except
select documentno from documents
ORDER bY missing_documentno;
This query generates
ERROR: invalid input syntax for type integer: "055_p"
1 Answer 1
You can use regexp_replace()
to cleanup the numbers so that you can cast it to an integer value.
with cte as (
select left(documentno,12) lpart,
regexp_replace(split_part(documentno, '/', 3), '[^0-9]', '', 'g')::int as num
from documents
), minmax as (
select lpart, min(num) minpart, max(num) maxpart
from cte
group by lpart
)
select m.lpart, t.missing_document_no
from minmax m
cross join generate_series(m.minpart, m.maxpart) as t(missing_document_no)
order by t.missing_document_no
I have split up the "cleaning" of the data and the group into two CTEs to make the steps a bit more readable. Given your sample data, this returns:
lpart | missing_document_no
-------------+--------------------
PROFS/2021/0 | 2050
PROFS/2021/0 | 2051
PROFS/2021/0 | 2052
PROFS/2021/0 | 2053
PROFS/2021/0 | 2054
PROFS/2021/0 | 2055
PROFS/2021/0 | 2056
PROFS/2021/0 | 2057
PROFS/2021/0 | 2058
PROFS/2021/0 | 2059
PROFS/2021/0 | 2060
To find the missing values, you can extend the final select to:
with (
...
), minmax as (
...
)
select lpart||t.doc_no as missing_doc_no
from minmax m
cross join generate_series(minpart, maxpart) as t(doc_no)
where not exists (select *
from documents d
where regexp_replace(split_part(d.documentno, '/', 3), '[^0-9]', '', 'g')::int = t.doc_no)
-
i want to get 'PROFS/2021/02051_P', 'PROFS/2021/02052_P', 'PROFS/2021/02053_P', 'PROFS/2021/02054_P', 'PROFS/2021/02057, 'PROFS/2021/02058, 'PROFS/2021/02059Franck White– Franck White2021年04月06日 09:27:19 +00:00Commented Apr 6, 2021 at 9:27
-
@FranckWhite: why is it
PROFS/2021/02051_P
and notPROFS/2021/02051
? - how do you know when to append the_P
and when not?user1822– user18222021年04月06日 09:38:02 +00:00Commented Apr 6, 2021 at 9:38 -
table come like this, those who have _p should be generate with _p, and the others without _p. example in the table , i have PROFS/2021/02050_P and PROFS/2021/02055_P, and 'PROFS/2021/02056' and 'PROFS/2021/02060' i want to get the missing documentno «PROFS / 2021 / 02051_P», «PROFS / 2021 / 02052_P», «PROFS / 2021 / 02053_P», «PROFS / 2021 / 02054_P» and «PROFS / 2021/02057,« PROFS / 2021/02058, 'PROFS / 2021/02059»Franck White– Franck White2021年04月06日 09:48:10 +00:00Commented Apr 6, 2021 at 9:48
-
how can i return missing rows that lpart < 12 example dbfiddle.uk/…Franck White– Franck White2021年04月06日 11:05:48 +00:00Commented Apr 6, 2021 at 11:05
-