I am trying to port some Javascript to C# and I'm having a bit of trouble. The javascript I am porting calls this
var binary = out.map(function (c) {
return String.fromCharCode(c);
}).join("");
return btoa(binary);
out is an array of numbers. I understand that it is taking the numbers and using fromCharCode to add characters to a string. At first I wasn't sure if my C# equivalent of btoa was working correctly, but the only characters I'm having issues with are the first 6 or 8. My encoded string outputs the same except for the first few characters.
At first in C# I was doing this
String binary = "";
foreach(int val in output){
binary += ((char)val);
}
And then I tried
foreach(int val in output){
System.Text.ASCIIEncoding convertor = new System.Text.ASCIIEncoding();
char o = convertor.GetChars(new byte[] { (byte)val })[0];
binary += o;
}
Both work fine on the later characters of the String but not the start. I've researched but I don't know what I'm missing.
My array of numbers is as follows: { 10, 135, 3, 10, 182, ....}
I know the 10s are newline characters, the 3 is end of text, the 182 is ¶, but what's confusing me is that the 135 should be the double dagger ‡. The Javascript does not show it when I print the string.
So what ends up happening is when the String is converted to Base64 my string looks like Cj8DCj8CRFF.... while the Javascript String looks like CocDCrYCRFF.... The rest of the strings are the same and the int arrays used are identical.
Any ideas?
-
Are you trying to encode binary data, or text? Also, you probably need to read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)Tim S.– Tim S.2014年05月13日 18:28:15 +00:00Commented May 13, 2014 at 18:28
-
out (or output in my C# code) is an array of numbers. I'm trying to encode it to a text String.user3629996– user36299962014年05月13日 18:30:26 +00:00Commented May 13, 2014 at 18:30
1 Answer 1
It's important to understand that binary data does not always represent valid text in a given encoding, and that some encodings have variable numbers of bytes to represent different characters. In short: binary data and text are not the same at all, and you can only convert between the two in some cases and by following clear, accurate rules. Treating them incorrectly will cause pain.
That said, if you have a list of int
s, that are always within the range 0-255, that should become a base64 string, here is a way to do it:
var output = new[] { 0, 1, 2, 68, 69, 70, 254, 255 };
var binary = new List<byte>();
foreach(int val in output){
binary.Add((byte)val);
}
var result = Convert.ToBase64String(binary.ToArray());
If you have text that should be encoded as a base64 string...generally I'd recommend UTF8 encoding, unless you need it to match the JS's implementation.
var str = "Hello, world!";
var result = Convert.ToBase64String(Encoding.UTF8.GetBytes(str));
The encoding that JS uses appears to be the same as casting between byte
and char
(char
s> 255 are invalid), which isn't one of the standard Encoding
s available.
Here's how you might combine raw numbers and strings, then convert that to base64.
checked // ensures that values outside of byte's range do not fail silently
{
var output = new int[] { 10, 135, 3, 10, 182 };
var binary = output.Select(x => (byte)x)
.Concat("Hello, world".Select(c => (byte)c)).ToArray();
var result = Convert.ToBase64String(binary);
}