I have data like

{"name": "a", "scope": "1", "items": [{"code": "x", "description": "xd"}, {"code": "x2", "description": "xd2"}]}
{"name": "b", "scope": "2", "items": [{"code": "x", "description": "xd"}]}
{"name": "c", "scope": "3", "items": [{"code": "x", "description": "xd"}]}
{"name": "d", "scope": "4", "items": [{"code": "x", "description": "xd"}]}

I want to filter out some fields of the JSON objects in my SELECT result, so that the result looks something like:

{"name": "a","items": [{"code": "x"}, {"code": "x2"}]}
{"name": "b","items": [{"code": "x"}]}
{"name": "c","items": [{"code": "x"}]}
{"name": "d","items": [{"code": "x"}]}
asked Feb 2, 2021 at 11:26
  • Hi and welcome to the forum! Will the text to be removed always be "scope" followed by a number in double quotes and always "description" followed by a string in double quotes? Or are the patterns more difficult than that? Commented Feb 2, 2021 at 12:06
  • Also, you have "a","items": do you require a space between the "a", and "items":? Commented Feb 2, 2021 at 12:15
  • Welcome. In order to help us help you can you follow the steps here dba.stackexchange.com/help/minimal-reproducible-example Commented Feb 2, 2021 at 12:22
  • your_jsonb_column - 'scope' would be the first step. But for the nested array elements, you will need to iterate over them, remove the items you don't want and then put them back together using jsonb_agg() Commented Feb 2, 2021 at 12:51
  • I want to filter out some fields from the json object, spaces don't matter. Commented Feb 2, 2021 at 14:14
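The jsonb approach described in the comments could be sketched roughly as follows. This is only an illustration under assumptions: the CTE name sample, the column name doc and the aliases item and ord are made up for the example, and the data is assumed to be valid JSON that can be cast to jsonb.

```sql
-- Hypothetical inline sample; in practice doc would be your jsonb column.
WITH sample(doc) AS (
  VALUES
    ('{"name": "a", "scope": "1", "items": [{"code": "x", "description": "xd"}, {"code": "x2", "description": "xd2"}]}'::jsonb),
    ('{"name": "b", "scope": "2", "items": [{"code": "x", "description": "xd"}]}'::jsonb)
)
SELECT
  -- Step 1: drop the top-level "scope" key.
  (doc - 'scope')
  -- Step 2: overwrite "items" with a rebuilt array.
  || jsonb_build_object(
       'items',
       (
         -- Unnest the array, drop "description" from each element,
         -- and aggregate the elements back in their original order.
         SELECT COALESCE(jsonb_agg(item - 'description' ORDER BY ord), '[]'::jsonb)
         FROM jsonb_array_elements(doc -> 'items') WITH ORDINALITY AS t(item, ord)
       )
     ) AS filtered
FROM sample;
```

Note that jsonb does not preserve the original key order or whitespace, and that this sketch adds an empty "items" array to any row that has no "items" key.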

1 Answer

Well, it's not pretty (see fiddle here):

CREATE TABLE test
(
 j_str TEXT NOT NULL
);

Populate it:

INSERT INTO test VALUES 
('{"name": "a", "scope": "1", "items": [{"code": "x", "description": "xd"}, {"code": "x2", "description": "xd2"}]}'),
('{"name": "b", "scope": "2", "items": [{"code": "x", "description": "xd"}]}'), 
('{"name": "c", "scope": "3", "items": [{"code": "x", "description": "xd"}]}'),
('{"name": "d", "scope": "4", "items": [{"code": "x", "description": "xd"}]}'),
('{"name": "a", "scope": "1", "items": [{"code": "x", "description": "xd"}, {"code": "x2", "description": "xd2"}, {"code": "x3", "description": "xd3"}]}');

Notice that I've added a record with 3 codes!

And the first step (see fiddle):

SELECT
REGEXP_REPLACE
(
 j_str, 
 '(^.*,)( "scope": "\d{1,3}", )("\w{5}": \[\{"\w{2,10}": "\w{1,5}")(, "\w{10,15}": "\w{1,10}")',
 '\1 \3'
) FROM test;

which gives:

regexp_replace
{"name": "a", "items": [{"code": "x"}, {"code": "x2", "description": "xd2"}]}
{"name": "b", "items": [{"code": "x"}]}
{"name": "c", "items": [{"code": "x"}]}
{"name": "d", "items": [{"code": "x"}]}
{"name": "a", "items": [{"code": "x"}, {"code": "x2", "description": "xd2"}, {"code": "x3", "description": "xd3"}]}

Explanation of the regexp:

'(^.*,)( "scope": "\d{1,3}", )("\w{5}": \[\{"\w{2,10}": "\w{1,5}")(, "\w{10,15}": "\w{1,10}")',
'\1 \3'

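The round brackets (...) are "capturing groups". Group 1, (^.*,), starts at the beginning of the string (the ^ anchor) and runs up to the comma just before " scope" (note the preceding space). Group 2 is the "scope" pair itself: the key, a colon and a space, then 1 to 3 digits (\d{1,3}) in double quotes, then a comma and another space. Group 3 matches a five-character key ("\w{5}", here "items"), the opening [{ and the first key/value pair of the first array element. Group 4 matches the key/value pair that follows it (here "description"). The replacement string \1 \3 keeps only the first and third groups, so the second and fourth are deleted. Since the pattern is anchored and applied once, only the first array element loses its "description" at this stage.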
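As a small illustration of how capturing groups and back-references behave, here is a cut-down version of the same idea on a single literal. The input string is a hypothetical shortened row, and the sketch assumes the default standard_conforming_strings = on, so that backslashes in the literal reach the regex engine unchanged.

```sql
-- Hypothetical one-row input, shortened from the sample data.
SELECT
REGEXP_REPLACE
(
 '{"name": "a", "scope": "1", "items": []}',
 '(^.*,)( "scope": "\d{1,3}",)',  -- group 1: text before scope; group 2: the scope pair
 '\1'                             -- keep group 1 only, so group 2 is dropped
);
-- Expected: {"name": "a", "items": []}
```

Anything the pattern does not match (here the trailing "items" part) is left in place, which is why only the captured span needs to be reassembled in the replacement.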

and then:

SELECT
REGEXP_REPLACE
(
 REGEXP_REPLACE
 (
 j_str, 
 '(^.*,)( "scope": "\d{1,3}", )("\w{5}": \[\{"\w{2,10}": "\w{1,5}")(, "\w{10,15}": "\w{1,10}")', 
 '\1 \3'
 ), 
 '\{("code": "\w{1,10}")(, "\w{10,20}": "\w{1,9}")(\})', 
 '{\1\3', 'g' 
)
FROM test;

Result:

regexp_replace
{"name": "a", "items": [{"code": "x"}, {"code": "x2"}]}
{"name": "b", "items": [{"code": "x"}]}
{"name": "c", "items": [{"code": "x"}]}
{"name": "d", "items": [{"code": "x"}]}
{"name": "a", "items": [{"code": "x"}, {"code": "x2"}, {"code": "x3"}]}

So, you can see that the data is now in your desired format! With a bit of work, it should be possible to do all of this with one single regexp. I can see a way of doing it which shouldn't require the specific words "scope" or "code" to be present, i.e. by formulating the capturing groups in such a way as to combine the first regex with the second one. Might be a good exercise?
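For what it's worth, one possible single-pass variant is an alternation combined with the 'g' flag. Unlike the idea above, this sketch does hard-code the key names "scope" and "description", and it assumes the exact spacing of the sample data and values made up only of word characters (a description containing spaces or quotes would not match).

```sql
-- Sketch: remove both unwanted key/value pairs in one pass over the test table.
SELECT
REGEXP_REPLACE
(
 j_str,
 ' "scope": "\d{1,3}",|, "description": "\w{1,10}"',  -- either pair, whichever is found
 '',                                                  -- replace each match with nothing
 'g'                                                  -- apply to every match in the string
) FROM test;
```

On the five sample rows this should give the same output as the nested version, since each match is simply deleted and no text needs to be stitched back together.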

answered Feb 2, 2021 at 14:45
