There are two issues with your otherwise elegant approach:
1. iconv() silently truncates the string if an invalid UTF-8 character is present. The solution would be to add //IGNORE to the iconv() call, but a bug in glibc seems to prevent this from working, and the PHP developers don't seem to want to implement a work-around. An option is to remove the invalid characters yourself before calling iconv(): ini_set('mbstring.substitute_character', "none"); $text = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
2. You're not removing all characters that are present in ASCII but disallowed in a URL: see this StackOverflow answer.
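Both points can be sketched in a few lines of PHP. This is only an illustration, not the approach from the question or from the linked answer: the function names strip_invalid_utf8() and keep_url_safe_ascii() are hypothetical, and the second one assumes a conservative whitelist made of the RFC 3986 "unreserved" characters (letters, digits, "-", ".", "_", "~"), which may be stricter than what you need.

```php
<?php
// Hypothetical helper (name is illustrative): drop invalid UTF-8 byte
// sequences so that a later iconv() call has nothing to truncate on.
function strip_invalid_utf8(string $text): string
{
    // mb_substitute_character() is the runtime counterpart of the
    // mbstring.substitute_character ini setting; remember the old value
    // so the change does not leak into the rest of the script.
    $previous = mb_substitute_character();
    mb_substitute_character('none');
    $clean = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);
    return $clean;
}

// Hypothetical helper (name is illustrative): given text that is already
// ASCII, keep only the characters assumed safe here (RFC 3986 unreserved
// set) and collapse every other run of characters into a single hyphen.
function keep_url_safe_ascii(string $ascii): string
{
    $slug = preg_replace('/[^A-Za-z0-9._~-]+/', '-', $ascii);
    return trim($slug, '-');
}
```

You would call strip_invalid_utf8() on the raw input first, then run your existing iconv() transliteration, and finally pass the ASCII result through keep_url_safe_ascii(). The transliteration step is left out of the sketch on purpose, because its output depends on the iconv implementation and the locale.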