After some research I found an example where I can find tables and its columns with NULL values. But the function also returns true when the column is empty. It gives me errors when I try to add an or statement. How can I modify the function so it returns also true when the column contains blank values? This is the function I found:
create function has_nulls(p_schema in text, p_table in text, p_column in text)
returns boolean language plpgsql as $$
declare
b boolean;
begin
execute 'select exists(select * from '||
p_table||' where '||p_column||' is null)' into b;
return b;
end;$$;
1 Answer 1
Assuming "empty" and "blank values" means empty string (''
).
This function checks whether the passed table has any NULL
or empty values (''
) in the passed column (which must be a string type or some other type where the empty string is valid; not the case for numeric types for instance):
CREATE FUNCTION f_has_missing(_tbl regclass, _col text, OUT has_missing boolean)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE
format($$SELECT EXISTS (SELECT FROM %s WHERE %2$I = '' OR %2$I IS NULL)$,ドル _tbl, _col)
INTO has_missing;
END
$func$;
Call:
SELECT f_has_missing('tablename', 'column')
Or, optionally, schema-qualified:
SELECT f_has_missing('schema.tablename', 'column')
db<>fiddle here
Probably most important: Never concatenate parameter values into SQL code blindly. That's begging for SQL injection. I sanitized the code with format()
. See:
There are a couple of smart expressions to check for both:
(string_col = '') IS NOT FALSE
(string_col <> '') IS NOT TRUE
coalesce(string_col, '') = ''
See:
But I chose this plain and more verbose expression for two reason:
string_col = '' OR string_col IS NULL
I have grown fond of simple, obvious code, and none of the above is as clear as this. But more importantly, this expression can use an index on (string_col)
, while the above cannot - which makes a big difference for big tables. See the added demo in the fiddle!
db<>fiddle here
Obviously, we want a different name for the function than "has_nulls", now.
I use an OUT
parameter for convenience and short code. Now we can assign to it and be done.
And I use regclass
as IN
parameter for the table name. This way I can provide a schema explicitly or not. Again, see:
Aside, one might loop through all columns of a table or a whole db to find any such column at once. Related:
Explore related questions
See similar questions with these tags.
WHERE coalesce(col, '') ~ '^[[:space:]]*$'