1

I am currently figuring out how to do a bit more complex data migration in my database and whether it is even possible to do in SQL (not very experienced SQL developer myself).

Let's say that I store JSONs in one of my text columns in a Postgres table wtih roughly the following format:

{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}]}

Now, I would like to migrate the value part to a bit more complex format:

{"type":"something","params":[{"value":{"id":"00de1be5-f75b-4072-ba30-c67e4fdf2333","path":"/hardcoded/string"}}]}

Furthermore, I also need to reason whether the value contains a UUID pattern, and if not, use slightly different structure:

{"type":"something-else","params":[{"value":"not-id"}]} ---> {"type":"something-else","params":[{"value":{"value":"not-id","path":""}}]}

I know I can define a procedure and use REGEX_REPLACE: REGEXP_REPLACE(source, pattern, replacement_string,[, flags]) but I have no idea how to approach the reasoning about whether the content contains ID or not. Could someone suggest at least some direction or hint how to do this?

asked Nov 4, 2021 at 10:41

2 Answers 2

1

You can use jsonb function for extract data and change them. At the end you should extend data.

Sample data structure and query result: dbfiddle

select
 (t.data::jsonb || jsonb_build_object(
 'params',
 jsonb_agg(
 jsonb_build_object(
 'value',
 case
 when e.value->>'value' ~* '^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$' then
 jsonb_build_object('id', e.value->>'value', 'path', '/hardcoded/string')
 else 
 jsonb_build_object('value', 'not-id', 'path', '')
 end
 )
 )
 ))::text
from
 test t
 cross join jsonb_array_elements(t.data::jsonb->'params') e
group by t.data

PS:

If your table had id or unique field you can change group by t.data to do things like that:

select
 (t.data::jsonb || jsonb_build_object(
 'params',
 jsonb_agg(
 jsonb_build_object(
 'value',
 case
 when e.value->>'value' ~* '^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$' then
 jsonb_build_object('id', e.value->>'value', 'path', '/hardcoded/string')
 else 
 jsonb_build_object('value', 'not-id', 'path', '')
 end
 )
 )
 ))::text
from
 test t
 cross join jsonb_array_elements(t.data::jsonb->'params') e
group by t.id
answered Nov 4, 2021 at 11:51
Sign up to request clarification or add additional context in comments.

2 Comments

Thank you! Exactly what I was looking for. It did not occur to me, that I could leverage jsonb structure.
Sometimes, this value field can be hidden deep in a very complex JSON, is it possible to have some automated way of fixing all the occurences in one single field?
0

To replace values at any depth, you can use a recursive CTE to run replacements for each value of a value key, using a conditional to check if the value is a UUID, and producing the proper JSON object accordingly:

with recursive cte(v, i, js) as (
 select (select array_to_json(array_agg(distinct t.i)) 
 from (select (regexp_matches(js, '"value":("[\w\-]+")', 'g'))[1] i) t), 0, js from (select '{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}, {"value":"sdfsa"}]}' js) t1
 union all
 select c.v, c.i+1, regexp_replace( 
 regexp_replace(c.js, regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'), 
 case when not ((c.v -> c.i)::text ~ '\w+\-\w+\-\w+\-\w+\-\w+') then 
 json_build_object('value', regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'), 'path', '')::text 
 else json_build_object('id', regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'), 'path', '/hardcoded/path')::text end, 'g'), 
 '(")(?=\{)|(?<=\})(")', '', 'g')
 from cte c where c.i < json_array_length(c.v)
)
select js from cte order by i desc limit 1

Output:

{"type":"something","params":[{"value":{"id" : "00de1be5-f75b-4072-ba30-c67e4fdf2333", "path" : "/hardcoded/path"}}, {"value":{"value" : "sdfsa", "path" : ""}}]}

On a more complex JSON input string:

{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}, {"value":"sdfsa"}, {"more":[{"additional":[{"value":"00f41be5-g75b-4072-ba30-c67e4fdf3777"}]}]}]}

Output:

{"type":"something","params":[{"value":{"id" : "00de1be5-f75b-4072-ba30-c67e4fdf2333", "path" : "/hardcoded/path"}}, {"value":{"value" : "sdfsa", "path" : ""}}, {"more":[{"additional":[{"value":{"id" : "00f41be5-g75b-4072-ba30-c67e4fdf3777", "path" : "/hardcoded/path"}}]}]}]}
answered Nov 4, 2021 at 16:28

Comments

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.