In a Python script I'm exporting .shp files to .csv/.gpx/.kml:
import ogr, ogr2ogr
...
ogr2ogr.main(["","-f", "CSV", csvfile, shpfile, "-lco", "GEOMETRY=AS_WKT", "-lco", "SEPARATOR=SEMICOLON"])
ogr2ogr.main(["","-f", "GPX", "-dsco", "GPX_USE_EXTENSIONS=YES", gpxfile, shpfile])
ogr2ogr.main(["","-f", "KML", kmlfile, shpfile])
The files are exported, but the language-specific (Latvian) characters in string fields come out wrong. The shapefiles have a string field "LABEL" with a value such as "marķieris".
In the .kml file it becomes "markieris", and in the .csv and .gpx files it becomes "maríieris". All exported files are encoded as UTF-8 (without BOM).
So far I have had no luck finding out whether I need to pass some additional parameter or do something else to fix this. The script is executed on Windows.
1 Answer
You should try the following configuration option:
--config SHAPE_ENCODING "ISO-8859-4"
because ISO-8859-4 is the ISO name for Latin 4.
Latin 4 adds the letters needed for Estonian, Latvian, and Lithuanian.
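Since the question calls ogr2ogr from Python rather than from the command line, here is a minimal sketch of how the option could be added to the argument list. It assumes that ogr2ogr.main() handles generic GDAL options such as --config the same way the command-line tool does; with_shape_encoding is a hypothetical helper, and the file names are placeholders.

```python
def with_shape_encoding(args, encoding="ISO-8859-4"):
    """Return a copy of an ogr2ogr argument list with --config SHAPE_ENCODING added.

    args[0] is the placeholder program name (the "" in the question's calls),
    so the option is inserted right after it.
    """
    return [args[0], "--config", "SHAPE_ENCODING", encoding] + list(args[1:])


# Placeholder file names, for illustration only
kml_args = with_shape_encoding(["", "-f", "KML", "output.kml", "input.shp"])
print(kml_args)

# To run the export (needs GDAL's ogr2ogr.py sample module on the path):
# import ogr2ogr
# ogr2ogr.main(kml_args)
```

The same helper could wrap the CSV and GPX argument lists, so the encoding is set in one place if another value has to be tried.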
If I use
ogr2ogr --config SHAPE_ENCODING ISO-8859-4 -f KML output.kml input.shp
then it gives the warning "Warning 1: marĒieris is not a valid UTF-8 string. forcing it to ASCII." Forcing can be disabled with OGR_FORCE_ASCII NO,
but then the special characters become question marks. Also, when I tried ISO-8859-1 it didn't change anything, nor did it give any warning, so I assume that is the default encoding used here. – Sunder, Jan 28, 2014 at 9:07
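One way to narrow down the source encoding is to take the garbled output from the question ("maríieris"), recover the raw bytes on the assumption that they were read as ISO-8859-1, and see which code pages decode them back to "marķieris". The sketch below uses only Python's standard codecs; the candidate list is an assumption, and the result only shows which code pages are consistent with the symptom, not what the shapefile actually uses.

```python
observed = "maríieris"   # what the .csv/.gpx export shows
expected = "marķieris"   # what the LABEL field should contain

# Recover the raw bytes, assuming they were interpreted as ISO-8859-1
raw = observed.encode("iso-8859-1")

# Candidate single-byte code pages that contain Latvian letters (assumed list)
candidates = ["iso-8859-1", "iso-8859-4", "iso-8859-13", "cp1257"]

# Keep the code pages that turn the raw bytes back into the expected text
matches = [enc for enc in candidates
           if raw.decode(enc, errors="replace") == expected]
print(matches)
```

ISO-8859-4 is not among the matches because it maps that byte to "í", just as ISO-8859-1 does, which would fit the observation that switching between the two changed nothing. Any code page that does match may be worth trying as the SHAPE_ENCODING value, though that is untested here.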
SHAPE_ENCODING=ISO-8859-4, i.e. the ISO alias of Latin 4?