I am using the code below to read a CSV file:
using (StreamReader readfile = new StreamReader(FilePath, Encoding.GetEncoding("iso-8859-1")))
{
// some code will go here
}
There is a character œ in one column of the CSV file, and it is converted to ? in the output. How can I get this œ decoded correctly, so that the output contains the same œ character rather than a question mark?
A question mark always means that the character has no valid representation in the desired encoding. You must use some other encoding to get it working, or be ready to replace that character with another one. – Alejandro, Feb 1, 2019 at 12:39
iso-8859-1 has no encoding for this character (U+0153 œ LATIN SMALL LIGATURE OE), so your CSV file seems to use a different charset. – Klaus Gütter, Feb 1, 2019 at 13:05
2 Answers
This is an encoding problem. Many non-Unicode encodings are either incomplete and translate many characters to "?", or have subtly different behavior on different platforms. Consider using UTF-8 or UTF-16 as the default, if you can.
"windows-1252" agrees with "ISO-8859-1" on every printable character and additionally assigns printable characters, including œ (byte 0x9C), to the range 0x80–0x9F, where ISO-8859-1 has only control codes. Try with Encoding.GetEncoding(1252).
Demo:
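public static void Main()
{
// On .NET Core / .NET 5+, register CodePagesEncodingProvider.Instance before requesting code page 1252.
var cp1252 = System.Text.Encoding.GetEncoding(1252);
// WriteAllText overwrites the file, so repeated runs do not accumulate characters.
System.IO.File.WriteAllText("test", "œ", cp1252);
var content = System.IO.File.ReadAllText("test", cp1252);
System.Console.WriteLine(content);
}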
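Applied to the StreamReader from the question, the same idea might look like the sketch below. It assumes the CSV really is Windows-1252 (which the question does not confirm) and that you are on .NET Core / .NET 5+, where code-page encodings have to be registered first (older .NET Core versions may need the System.Text.Encoding.CodePages package). The class name, the helper GetWindows1252 and the temporary file path are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class CsvEncodingSketch
{
    // Hypothetical helper: returns an Encoding object for Windows-1252.
    public static Encoding GetWindows1252()
    {
        // Code-page encodings are not built into .NET Core / .NET 5+,
        // so the provider is registered before asking for code page 1252.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    public static void Main()
    {
        // Placeholder path; in the question this would be FilePath.
        string filePath = Path.Combine(Path.GetTempPath(), "sample-1252.csv");

        // Create a small test file. The bytes spell "cœur,1" in Windows-1252,
        // where 0x9C is the byte for 'œ'.
        File.WriteAllBytes(filePath, new byte[] { 0x63, 0x9C, 0x75, 0x72, 0x2C, 0x31 });

        using (var reader = new StreamReader(filePath, GetWindows1252()))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

If the console still prints a question mark, the string may have been decoded correctly and only the console's output encoding is at fault; inspecting the string in a debugger or writing it to a UTF-8 file can help tell the two cases apart.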
The ISO-8859-15 character set contains those symbols, as does the Windows-1252 code page. However, be aware that 8859-15 redefines six other rarely used (or ASCII-duplicate) characters that are in 8859-1. Windows-1252 also differs from 8859-1, though only in the range 0x80–0x9F, where 8859-1 has control codes. A quick web search will reveal those differences.
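To see where the three encodings disagree, one option is to decode the same byte values with each of them. The following is only a sketch: the class name and the DecodeByte helper are invented for this example, and it assumes .NET Core / .NET 5+ with the code-pages provider available.

```csharp
using System;
using System.Text;

public static class EncodingComparisonSketch
{
    // Hypothetical helper: decodes one byte with the named encoding
    // and returns the resulting UTF-16 code unit as an integer.
    public static int DecodeByte(string encodingName, byte value)
    {
        // Needed on .NET Core / .NET 5+ for iso-8859-15 and windows-1252.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        string decoded = Encoding.GetEncoding(encodingName).GetString(new[] { value });
        return decoded[0];
    }

    public static void Main()
    {
        string[] names = { "iso-8859-1", "iso-8859-15", "windows-1252" };
        // 0x9C is 'œ' in Windows-1252, 0xBD is 'œ' in ISO-8859-15,
        // and 0xA4 is one of the other positions that 8859-15 redefines.
        byte[] probes = { 0x9C, 0xBD, 0xA4 };

        foreach (byte probe in probes)
        {
            foreach (string name in names)
            {
                Console.WriteLine($"0x{probe:X2} in {name,-12} -> U+{DecodeByte(name, probe):X4}");
            }
        }
    }
}
```

A byte that decodes to U+0153 is œ. Because œ lives at 0x9C in Windows-1252 but at 0xBD in ISO-8859-15, checking which of those bytes actually appears in the CSV file is one way to decide between the two encodings.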