Skip to main content
Code Review

Return to Revisions

5 of 5
replaced http://stackoverflow.com/ with https://stackoverflow.com/

edit3: I've created a full-fledged RadixEncoding (both encoding and decoding) class that will accept any base and can handle the ending zero bytes. I posted it in Q&A form on StackOverflow

@codesparkle's implementation can be improved some more (performance wise).

edit2: I've come across an issue with using BigInt. While BigInt works fine for text, it will lose precision when dealing with raw bytes that end in more than one (or two) zero bytes (in Base36's case anyway). Eg, if you fed ToBase36String {0xFF,0xFF,0x00,0x00} you would actually lose that last byte (MSB) in the resulting Base36 encoded string ("1ekf"). Due to how BigInt works around the "sign" of a BigInt value, if you were to feed the encoder with {0xFF,0x7F,0x00} the zero byte will be lost ("pa7"). See: http://msdn.microsoft.com/en-us/library/system.numerics.biginteger.aspx (search for "positive values in which the most significant bit of the last byte in the byte array would ordinarily be set should include an additional byte whose value is 0")

Instead of using a StringBuilder, Inserting each digit at index 0, and having to deal with byte swaping the input, one can use a generic List of char, which we Add to and Reverse at the end of the procedure.

edit: Whoops. When I was running this on actual big endian data, I realized you still had to reverse the array at one point...but the operation can be held off until the end. So if the byte array is big endian, you actually wouldn't have to Reverse the result, saving even more processing time. I've updated the code with the needed fixes

Using BigInteger.IsZero is also less computationally intensive (simple == condition under the hood) than BigInteger's greater-than operator, which calls CompareTo(BigInteger) underneath the hood.

using System;
using System.Numerics;
const int kByteBitCount= 8; // number of bits in a byte
const string kBase36Digits= "0123456789abcdefghijklmnopqrstuvwxyz";
// constants that we use in ToBase36CharArray
static readonly double kBase36CharsLengthDivisor= Math.Log(kBase36Digits.Length, 2);
static readonly BigInteger kBigInt36= new BigInteger(36);
public static string ToBase36String(this byte[] bytes, bool bigEndian=false)
{
 // Estimate the result's length so we don't waste time realloc'ing
 int result_length= (int)
 Math.Ceiling(bytes.Length * kByteBitCount / kBase36CharsLengthDivisor);
 // We use a List so we don't have to CopyTo a StringBuilder's characters
 // to a char[], only to then Array.Reverse it later
 var result= new System.Collections.Generic.List<char>(result_length);
 var dividend= new BigInteger(bytes);
 // IsZero's computation is less complex than evaluating "dividend > 0"
 // which invokes BigInteger.CompareTo(BigInteger)
 while (!dividend.IsZero)
 {
 BigInteger remainder;
 dividend= BigInteger.DivRem(dividend, kBigInt36, out remainder);
 int digit_index= Math.Abs((int)remainder);
 result.Add(kBase36Digits[digit_index]);
 }
 // orientate the characters in big-endian ordering
 if(!bigEndian)
 result.Reverse();
 // ToArray will also trim the excess chars used in length prediction
 return new string(result.ToArray());
}

Compared to @codesparkle's "A test 1234", 1,000,000 iterations:

His: 3.593306
Mine: 2.6857364

Compared to @codesparkle's "A test 1234. Made slightly larger!", 100,000 iterations:

His: 1.5366169
Mine: 1.1814889

My CPU is an i7 860 @2.80GHz. The code was tested in a Release x64 build (ran by itself, not under a debugger) using Stopwatch for timing.

kornman00
  • 171
  • 1
  • 5
default

AltStyle によって変換されたページ (->オリジナル) /