3

For several years now, I've been hunting this extremely quick-footed little bug which knows every hiding place in my metaphorical coding house. He lurks in the walls, sometimes takes months to show himself, and BAM! He's back, without forewarning. And the hunt continues. And then he runs into a hole in the wall and I can't get him out again no matter what. And so the wait begins...

To make a long story short: PHP and PostgreSQL simply do not have a fully consistent idea of what constitutes a valid UTF-8 string. I can now say this for 100% sure.

To avoid the errors that inevitably are logged when I try to INSERT data which PHP claims is valid UTF-8 but PostgreSQL thinks is not (regardless of who is actually right), I must make some kind of pre-query whose only job is to check if the string is valid UTF-8 according to PostgreSQL. Is this possible somehow?

I'm thinking of something like this:

SELECT is_string_valid_utf8_according_to_postgresql(1ドル);

This would then return a true/false so that I can know for sure if I can go ahead and execute the real, INSERT query, or skip it.

asked Feb 27, 2020 at 23:11

1 Answer 1

4

PostgreSQL is very strict about encodings, so you normally won't be able to get an incorrectly encoded string into the database. That is, your proposed function cannot operate on text, because you won't get an incorrectly encoded string into the database as text.

But if you load the string as binary data (type bytea), you can check the string with a function like this:

CREATE FUNCTION is_valid_utf8(bytea) RETURNS boolean
 LANGUAGE plpgsql AS
$$BEGIN
 PERFORM convert_from(1,ドル 'UTF8');
 RETURN TRUE;
EXCEPTION
 WHEN character_not_in_repertoire THEN
 RAISE WARNING '%', SQLERRM;
 RETURN FALSE;
END;$$;

Here it is in action:

SELECT is_valid_utf8('\x736368c3b66e');
 is_valid_utf8 
---------------
 t
(1 row)
SELECT is_valid_utf8('\x736368c3f66e');
WARNING: invalid byte sequence for encoding "UTF8": 0xc3 0xf6
 is_valid_utf8 
---------------
 f
(1 row)
answered Feb 28, 2020 at 8:14

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.