I have a dataset (a shapefile) with the same problem as the one described in this post: https://stackoverflow.com/questions/11436594/how-to-fix-double-encoded-utf8-characters-in-an-utf-8-table
"A previous LOAD DATA INFILE was run under the assumption that the CSV file is latin1-encoded. During this import the multibyte characters were interpreted as two single character and then encoded using utf-8 (again). This double-encoding created anomalies like ñ instead of ñ."
However, the solution given there is for MySQL, not Postgres; I tried it on Postgres and it didn't work (it only works on MySQL):
UPDATE tablename SET
field = CONVERT(CAST(CONVERT(field USING latin1) AS BINARY) USING utf8);
I need to import and fix this shapefile in Postgres, because I will need PostGIS to do various spatial analyses.
How can I solve this using PostGIS?
1 Answer
That is pretty similar in PostgreSQL:
convert_from(convert_to(textcol, 'LATIN1'), 'UTF8')
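Applied to a whole column, that could look like this (a sketch using the table and column names from the comments below; adjust them to your schema, and verify with the SELECT before running the UPDATE):

-- Preview the repaired values first
SELECT nome_area,
       convert_from(convert_to(nome_area, 'LATIN1'), 'UTF8') AS fixed
FROM sigef_sp_multiparte_latin1;

-- Then rewrite the column in place
UPDATE sigef_sp_multiparte_latin1
SET nome_area = convert_from(convert_to(nome_area, 'LATIN1'), 'UTF8');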
-
Thanks, but it didn't work. I tried this: SELECT convert_from(convert_to(nome_area, 'LATIN1'), 'UTF8') FROM sigef_sp_multiparte_latin1; It gives an error: ERROR: invalid byte sequence for encoding "UTF8": 0xc3 0x3f (SQL state: 22021). – hugonbg, Dec 17, 2020 at 21:40
-
Then that string is not "doubly UTF-8 encoded". – Laurenz Albe, Dec 18, 2020 at 6:56
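If only some rows are double-encoded, the round trip will fail on the rest. One way to find the offending rows is to wrap the conversion in a function that catches the encoding error (a sketch; try_fix_double_utf8 is a hypothetical name):

CREATE OR REPLACE FUNCTION try_fix_double_utf8(t text) RETURNS text AS $$
BEGIN
    -- Undo the double encoding; raises an error for values
    -- that are not cleanly double-encoded
    RETURN convert_from(convert_to(t, 'LATIN1'), 'UTF8');
EXCEPTION WHEN OTHERS THEN
    RETURN NULL;  -- conversion failed for this value
END;
$$ LANGUAGE plpgsql;

-- Rows where the round trip fails: either already-correct UTF-8
-- or inconsistently mangled data (like the 0xc3 0x3f case above)
SELECT nome_area
FROM sigef_sp_multiparte_latin1
WHERE nome_area IS NOT NULL
  AND try_fix_double_utf8(nome_area) IS NULL;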