I'm trying to deal with a JSON column that does not have an explicit JSON schema or convention. There are varying keys within the JSON. Some keys do not exist, some keys do exist but empty string values.
For example:
CREATE TABLE json_table (
id int,
json_data json
);
INSERT INTO json_table (id, json_data) VALUES
(1, '{"foo" : "bar", "baz": "biz"}');
(2, '{"foo" : "", "baz": "biz"}');
(3, '{"hello" : "world"}');
(4, '{"hello" : "world2", "foo" : "bar2", "baz" : "" }');
Is it possible to query this table and select/yield a subset of
json_data
with only key-value pairs that are non-empty strings?Is it possible to query this table and select records that contain JSON keys that are within a given list? (I essentially want to find records with
json_data
that have the key "foo")
For #2 I do have a solution that selects only records where json_data
keys are within a regex pattern:
SELECT tmp.*
FROM (
SELECT id,
ARRAY(SELECT json_object_keys(json_data))::text AS keys
FROM json_table
) tmp
WHERE tmp.keys LIKE ANY(ARRAY['%foo%', '%bar%']);
This yields:
"id","keys"
1,"{foo,baz}"
2,"{foo,baz}"
4,"{hello,foo,baz}"
But there's a caveat! I'm returning records that DO have the keys, but empty string values! Is there a way to incorporate a solution to #1 into this and pre-filter the JSON to only "valid" key-value pairs?
-
Here's an additional postgresql JSON question you might find interesting: dba.stackexchange.com/q/281480/45101blong– blong2020年12月14日 02:26:20 +00:00Commented Dec 14, 2020 at 2:26
1 Answer 1
You can remove the empty values during aggregation. And use an array operator in the final select to get only rows that contain a specific key:
SELECT tmp.*
FROM (
SELECT id,
ARRAY(SELECT t.k
from json_each_text(json_data) as t(k,v)
where nullif(trim(t.v),'') is not null) AS keys
FROM json_table
) tmp
WHERE keys && array['foo','bar']
;
If you want the complete JSON values, rather than just the keys, you can use something like this:
SELECT tmp.*
FROM (
SELECT id,
(SELECT jsonb_object_agg(t.k, t.v)
from json_each_text(json_data) as t(k,v)
where nullif(trim(t.v),'') is not null) AS keys
FROM json_table
) tmp
WHERE keys ?| array['foo','bar']
;
(Note that I am using jsonb
for the result in this case as that is much more flexible to handle. In general it's recommended to prefer jsonb over json these days)
If you need to cleanup the json frequently, you might want to consider writing a function for that:
create function non_empty_values(p_input jsonb)
returns jsonb
as
$$
SELECT jsonb_object_agg(t.k, t.v)
from jsonb_each(jsonb_strip_nulls(p_input)) as t(k,v)
where trim(t.v::text) <> '';
$$
language sql
immutable;
-
My friend thank you so so much! Wow! I actually learned so much from this solution: jsonb, how to aggregate json like you did (and subselect), different ways of filtering with arrays. This helps me out very much and I feel like everyone deals with frustrating JSON columns like mine haha. I did get a bit of trouble with your jsonb solution. I got an error saying "The # of binded parameters < the # of parameter markers". Regardless this I have a solution that works so I can move forward! Thank you again!Andy Matuszewski– Andy Matuszewski2020年04月24日 07:53:09 +00:00Commented Apr 24, 2020 at 7:53
-
@AndyMatuszewski: that error message is not a Postgres error messageuser1822– user18222020年04月24日 08:03:42 +00:00Commented Apr 24, 2020 at 8:03
Explore related questions
See similar questions with these tags.