Recently I was thinking about how Google, Microsoft Outlook on Windows Phone (and now many others) give you a coloured icon with a letter in the centre of it.
Essentially, given some sort of name, they generate a colour based on that name. The kicker is that the same name always generates the same colour. The icon is then generated with the 'initials' (dependent upon implementation) of the sender of the email within it. A sample is below.
(Sorry for the size, it was enormous at native resolution, feel free to click to view original.)
In the image you can see that both "WO" icons are the same colour, but the two "G" icons are different colours. This is because the "G" icons come from different email addresses and the "WO" icons come from the same one.
So of course I wanted to write a system that could do a similar thing: based upon "some string", generate a colour that would be mostly unique to that string. (I don't particularly care if multiple strings return the same colour, but similar strings should produce substantially different colours.)
This should be entirely predictable. Every single string should result in the exact same output every time. There should be no deviation (unless the underlying implementation of `Random` changes).
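The predictability requirement rests on `Random` producing the same sequence whenever it is given the same seed. A minimal sketch of that property, assuming both generators run on the same runtime (the seed value 12345 is arbitrary and only for illustration):

```csharp
using System;
using System.Linq;

// Two generators built from the same (arbitrary) seed.
var first = new Random(12345);
var second = new Random(12345);

// Draw three values from each; the sequences should match element for element.
double[] a = Enumerable.Range(0, 3).Select(_ => first.NextDouble()).ToArray();
double[] b = Enumerable.Range(0, 3).Select(_ => second.NextDouble()).ToArray();

Console.WriteLine(a.SequenceEqual(b)); // True
```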
For the solution of this I went purely LINQ / method calls, and I want to keep it that way.
new string[]
{
    "EBrown"
}
.Select(name =>
    new Random(
        Encoding
            .UTF8
            .GetBytes(name)
            .Select((b, i) =>
                b << (i % 4) * 8
            )
            .Aggregate(0, (acc, i) =>
                acc + i
            )
    )
)
.Select(r =>
    new double[3]
    {
        r.NextDouble(),
        r.NextDouble(),
        r.NextDouble()
    }
)
I did not return a `Color`, mostly because I'm not referencing any libraries that have a `Color` (this was built in the C# Interactive window), but also because it should be generic enough to just return a list of three values for the color.
To address one of the comments:
> What purpose of this: `.Aggregate(0, (acc, i) => acc + i)`? It is just the same as `Sum()` ...
As the two of us discovered (thanks Maxim for starting this discussion), the `Sum` method performs its operation within a `checked` context, and will throw an `OverflowException` on the last test case I provided. Wrapping the entire thing in an `unchecked` context does not result in a solution, as the local `checked` context in `Sum` takes precedence.
We could define a `SumOverflow` method that would do the summation inside an `unchecked` context, but that takes us out of vanilla-LINQ mode, and requires an additional dependency on a new method. (I literally want zero additional dependencies; I only want vanilla / naked LINQ.)
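The difference between the two calls can be reproduced with a small sketch. The two-element array is an assumption chosen only to force an overflow, and the `unchecked(...)` inside the lambda is written out explicitly so that project-level overflow settings do not affect the result:

```csharp
using System;
using System.Linq;

// Adding 1 to int.MaxValue overflows a 32-bit int.
int[] values = { int.MaxValue, 1 };

// Aggregate with an unchecked '+' simply wraps around.
int wrapped = values.Aggregate(0, (acc, i) => unchecked(acc + i));

// Sum adds inside its own checked block, so an outer unchecked block does not help.
bool sumThrew = false;
try
{
    unchecked { values.Sum(); }
}
catch (OverflowException)
{
    sumThrew = true;
}

Console.WriteLine($"wrapped to int.MinValue: {wrapped == int.MinValue}, Sum threw: {sumThrew}");
```

Because `checked` and `unchecked` apply lexically, to the code written inside the block, they have no effect on arithmetic performed inside a method that is merely called from that block.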
Regarding:
> I also think this `b << (i % 4) * 8` requires encapsulation too. It's a magic formula. Perhaps you could explain what you are calculating here?
Shout out to t3chb0t: you're not wrong; however, that, once again, goes against my intent. I'm trying to design this purely LINQ-only, as I want to be able to copy/pasta it and modify as necessary for the specific implementation (if I am even doing so), but I don't want to use any additional dependencies. That said, if someone comes up with a way to encapsulate it and keep it simple, I'm more than happy to use it.
The formula itself is rather simple though: the goal is to take the [i]ndex of the [b]yte and move the [b]yte left `8` bits times the rotational [i]ndex across a rotation of `4`. So if the [b]yte is `4`, and it sits at index `7`, then `7 % 4` is `3` and it gets shifted left 24 bits so that it's the highest section of the `int` (`0x04000000`). (Since the next step is to add all the integers together, I shift beforehand to make sure that we're not just adding the lowest byte of integers together.)
As an example:
String: EBrown
Byte Array: 0x45, 0x42, 0x72, 0x6f, 0x77, 0x6e
Shifted Int Array: 0x45, 0x4200, 0x720000, 0x6f000000, 0x77, 0x6e00
Sum: 0x6f72b0bc
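The worked example above can be reproduced step by step. This is only a sketch that splits the original expression into named intermediate values (`bytes`, `shifted`, `seed` are names introduced here for illustration):

```csharp
using System;
using System.Linq;
using System.Text;

byte[] bytes = Encoding.UTF8.GetBytes("EBrown");

// Shift each byte left by 0, 8, 16 or 24 bits depending on its index modulo 4.
int[] shifted = bytes.Select((b, i) => b << (i % 4) * 8).ToArray();

// Add the shifted values without overflow checking to get the seed.
int seed = shifted.Aggregate(0, (acc, v) => unchecked(acc + v));

// Prints: 0x45, 0x4200, 0x720000, 0x6f000000, 0x77, 0x6e00
Console.WriteLine(string.Join(", ", shifted.Select(v => "0x" + v.ToString("x"))));

// Prints: 0x6f72b0bc
Console.WriteLine("0x" + seed.ToString("x"));
```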
Test cases:
EBrown: double[3] { 0.93380133292349121, 0.61282262793407893, 0.99123903596365781 }
Elliott: double[3] { 0.46952278049174828, 0.71652967376472876, 0.041705311295439168 }
Brown: double[3] { 0.45409023782894492, 0.70977234454349258, 0.61635322292165518 }
Elliott Brown: double[3] { 0.033125448521750721, 0.84549407793464793, 0.41680685589872618 }
3 Answers
You can use some encapsulation if you turn certain parts of your expressions into LINQ queries. This will allow you to use the `let` keyword to define `Func`s. I think with all the helper `let`s it's both easier to maintain and to understand.
var colors =
    (from name in new[] { "EBrown", "t3chb0t" }
     let shiftColorComponent = new Func<byte, int, int>((b, i) => b << (i % 4) * 8)
     let safeSum = new Func<IEnumerable<int>, int>(values => values.Aggregate(0, (acc, i) => acc + i))
     let encodedChars =
         (from t in Encoding.UTF8.GetBytes(name).Select((b, i) => (b: b, i: i))
          select shiftColorComponent(t.b, t.i))
     let nameSeed = safeSum(encodedChars)
     let nameRandom = new Random(nameSeed)
     select Enumerable.Range(0, 3).Select(x => nameRandom.NextDouble()).ToArray());
- I was actually just thinking about this as I was reading the other answer. :) +1 (Der Kommissar, Aug 1, 2017 at 17:29)
- I believe it is a bad idea to use `let` for variables that will never be changed and don't rely on elements of the iterated collection. The initial LINQ query is prettier in my opinion. This one looks like overengineering. But I like the last statement of your solution, which eliminates three identical instructions `nameRandom.NextDouble()`. (Maxim, Aug 2, 2017 at 2:37)
- @Maxim "it is a bad idea to use `let` for variables that will never be changed and don't rely on elements of iterated collection": I find it silly to impose such constraints, sorry ;-) Maybe it's not optimal because the two `Func`s will be recreated for each name, but putting a large LINQ expression inside of the random-constructor doesn't seem to be right either. But who cares, the question already constrains what improvements are possible by forbidding the usage of additional functions, which is against OO anyway because the reusability here works by copy/paste and not via a library. (t3chb0t, Aug 2, 2017 at 4:07)
- @t3chb0t Yes, recreation of delegates for each element of an input collection makes me silly as you said :) (Maxim, Aug 2, 2017 at 5:17)
- @EBrown consider using a local function to gain some encapsulation and naming of the function. Alternatively, if you want the shift function to be specifiable from "the outside", make it a delegate and force the client to specify the function. You can always specify a default and allow the client to override it too. (RubberDuck, Aug 4, 2017 at 0:46)
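RubberDuck's suggestion of a local function could look roughly like this. The names `ColorFor` and `ShiftByte` are hypothetical, and wrapping the logic in a method steps outside the "pure LINQ" constraint from the question, so treat it as a sketch of the idea and not a drop-in replacement:

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical wrapper: returns three doubles in [0, 1) derived from the name.
double[] ColorFor(string name)
{
    // Local function that gives the "magic formula" a name.
    int ShiftByte(byte b, int index) => b << (index % 4) * 8;

    int seed = Encoding.UTF8.GetBytes(name)
        .Select((b, i) => ShiftByte(b, i))
        .Aggregate(0, (acc, v) => unchecked(acc + v));

    var random = new Random(seed);
    return new[] { random.NextDouble(), random.NextDouble(), random.NextDouble() };
}

Console.WriteLine(string.Join(", ", ColorFor("EBrown")));
```

The shift logic now has a name and a single definition, at the cost of no longer being one copy/paste-able expression.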
There are an infinite number of ways to do this, but one possible approach would be to use a dedicated hash function to compress your inputs; out of sheer laziness and convenience I chose to use the built-in `MD5CryptoServiceProvider` for my example.
To improve the performance characteristics of your code I hoisted the construction of the hashing class up a level and used `SelectMany` so that we only have to instantiate it once per call instead of once per string. I then chose to sample the first three bytes of the hash result in order to come up with values for `Red`, `Green`, and `Blue`.
// Needs: using System.Linq; using System.Security.Cryptography; using System.Text;
new[] { new MD5CryptoServiceProvider() }
    .SelectMany(
        // The collection selector is handed the hasher, which it does not need.
        _ => new[] {
            "A",
            "B",
            "C",
            "D",
            "E",
        },
        (hasher, value) => new {
            StringValue = value,
            StringHash = hasher.ComputeHash(Encoding.Unicode.GetBytes(value))
        }
    )
    .Select(a => new {
        Input = a.StringValue,
        r = a.StringHash[0],
        g = a.StringHash[1],
        b = a.StringHash[2]
    });
One could get doubles for RGB by using a provider that returns at least 24 bytes (such as `SHA256CryptoServiceProvider`, which returns 32; the 16 bytes that MD5 returns are too few for the offsets read here) combined with some altered sampling logic:
.Select(a => new {
    Input = a.StringValue,
    // "magic" reference: https://www.doornik.com/research/randomdouble.pdf
    r = (0.5d + (2.22044604925031308085e-016d / 2) + (BitConverter.ToInt32(a.StringHash, 0) * 2.32830643653869628906e-010d) + ((BitConverter.ToInt32(a.StringHash, 4) & 0x000FFFFF) * 2.22044604925031308085e-016d)),
    g = (0.5d + (2.22044604925031308085e-016d / 2) + (BitConverter.ToInt32(a.StringHash, 8) * 2.32830643653869628906e-010d) + ((BitConverter.ToInt32(a.StringHash, 12) & 0x000FFFFF) * 2.22044604925031308085e-016d)),
    b = (0.5d + (2.22044604925031308085e-016d / 2) + (BitConverter.ToInt32(a.StringHash, 16) * 2.32830643653869628906e-010d) + ((BitConverter.ToInt32(a.StringHash, 20) & 0x000FFFFF) * 2.22044604925031308085e-016d)),
});
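The two long constants in that snippet are 2^-32 and 2^-52. As a rough sketch of what each channel computes, the hypothetical helper `ToUnitDouble` below takes two 32-bit integers and maps them to a double strictly between 0 and 1. It consumes 8 bytes per channel, which is why three channels read 24 bytes of hash in total:

```csharp
using System;

const double TwoPowMinus32 = 1.0 / 4294967296.0;        // 2^-32
const double TwoPowMinus52 = 1.0 / 4503599627370496.0;  // 2^-52

// Hypothetical helper: 'high' supplies 32 bits, the low 20 bits of 'low' supply the rest.
double ToUnitDouble(int high, int low) =>
    0.5 + TwoPowMinus52 / 2
        + high * TwoPowMinus32
        + (low & 0x000FFFFF) * TwoPowMinus52;

// The extremes stay inside the open interval (0, 1).
Console.WriteLine(ToUnitDouble(int.MinValue, 0) > 0.0);
Console.WriteLine(ToUnitDouble(int.MaxValue, -1) < 1.0);
```

The signed `high` term contributes a value in [-0.5, 0.5), the 0.5 offset recentres it, and the masked `low` term fills in the remaining 20 bits of the mantissa.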
- Realistically, each color only has resolution for the three primary components for `[0, 255]`, so it wouldn't be necessary to use a higher-entropy hash, as `a.StringHash[i] / 255.0` would suffice. (Der Kommissar, Aug 1, 2017 at 17:31)
- @EBrown You're absolutely right, the example is a bit more generalized than needed for this particular setup; a more efficient implementation would ideally use your logic along with a compression function that outputs 24 bits instead of the 128 that MD5 does. (Kittoes0124, Aug 1, 2017 at 18:00)
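The simplification discussed in these comments, dividing the first three hash bytes by 255.0, might be sketched as follows. `MD5.Create()` stands in for the provider class used in the answer, and `ColorFromHash` is a hypothetical name:

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: hash the name, keep three bytes, scale each to [0, 1].
double[] ColorFromHash(string name)
{
    using (var md5 = MD5.Create())
    {
        byte[] hash = md5.ComputeHash(Encoding.Unicode.GetBytes(name));
        return hash.Take(3).Select(b => b / 255.0).ToArray();
    }
}

double[] rgb = ColorFromHash("EBrown");
Console.WriteLine(string.Join(", ", rgb));
```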
It's hard to get nice bright colors if you just pick random rgb values.
Instead I would recommend using HSV (or HSL) where it is easier to manage the brightness and saturation.
First result I got when searching.
If your only requirement is that it's copy/paste-able I would suggest using code blocks and writing proper functions:
.Select(name =>
{
    var hash = HashName(name);
    // Divide by 256 so the hue stays below 1, as HsvToRgb requires.
    var hue = hash[0] / 256.0;
    var saturation = 0.7 + hash[1] / 255.0 / 8;
    var value = 0.92 + hash[2] / 255.0 / 16;
    var color = HsvToRgb(hue, saturation, value);
    return (name, r: color.r, g: color.g, b: color.b);

    byte[] HashName(string toHash)
    {
        using (var hasher = new System.Security.Cryptography.SHA256Managed())
        {
            return hasher.ComputeHash(System.Text.Encoding.Unicode.GetBytes(toHash));
        }
    }

    /* https://martin.ankerl.com/2009/12/09/how-to-create-random-colors-programmatically/ */
    (double r, double g, double b) HsvToRgb(double h, double s, double v)
    {
        CheckInRange(h, nameof(h));
        CheckInRange(s, nameof(s));
        CheckInRange(v, nameof(v));
        var h_i = (int)Math.Floor(h * 6);
        var f = h * 6 - h_i;
        var p = v * (1 - s);
        var q = v * (1 - f * s);
        var t = v * (1 - (1 - f) * s);
        switch (h_i)
        {
            case 0: return (v, t, p);
            case 1: return (q, v, p);
            case 2: return (p, v, t);
            case 3: return (p, q, v);
            case 4: return (t, p, v);
            case 5: return (v, p, q);
            default: throw new InvalidOperationException();
        }

        void CheckInRange(double parameter, string parameterName)
        {
            if (double.IsNaN(parameter)
                || parameter < 0
                || parameter >= 1)
            {
                throw new ArgumentOutOfRangeException(parameterName, parameter, $"Expected range [0,1[ was: {parameter}");
            }
        }
    }
})
You can then easily tweak the saturation and brightness in constrained ranges.
If you use LinqPad you can view the results with the following script:
foreach (var x in
    Enumerable.Range(0, 100)
        .Select(x => x.ToString())
        .Select(name =>
        {
            var hash = HashName(name);
            // Divide by 256 so the hue stays below 1, as HsvToRgb requires.
            var hue = hash[0] / 256.0;
            var saturation = 0.7 + hash[1] / 255.0 / 8;
            var value = 0.92 + hash[2] / 255.0 / 16;
            var color = HsvToRgb(hue, saturation, value);
            return (name, r: color.r, g: color.g, b: color.b);

            byte[] HashName(string toHash)
            {
                using (var hasher = new System.Security.Cryptography.SHA256Managed())
                {
                    return hasher.ComputeHash(System.Text.Encoding.Unicode.GetBytes(toHash));
                }
            }

            /* https://martin.ankerl.com/2009/12/09/how-to-create-random-colors-programmatically/ */
            (double r, double g, double b) HsvToRgb(double h, double s, double v)
            {
                CheckInRange(h, nameof(h));
                CheckInRange(s, nameof(s));
                CheckInRange(v, nameof(v));
                var h_i = (int)Math.Floor(h * 6);
                var f = h * 6 - h_i;
                var p = v * (1 - s);
                var q = v * (1 - f * s);
                var t = v * (1 - (1 - f) * s);
                switch (h_i)
                {
                    case 0: return (v, t, p);
                    case 1: return (q, v, p);
                    case 2: return (p, v, t);
                    case 3: return (p, q, v);
                    case 4: return (t, p, v);
                    case 5: return (v, p, q);
                    default: throw new InvalidOperationException();
                }

                void CheckInRange(double parameter, string parameterName)
                {
                    if (double.IsNaN(parameter)
                        || parameter < 0
                        || parameter >= 1)
                    {
                        throw new ArgumentOutOfRangeException(parameterName, parameter, $"Expected range [0,1[ was: {parameter}");
                    }
                }
            }
        })
)
{
    var nameLabel = new System.Windows.Controls.Label
    {
        Content = x.Item1,
        Background =
            new System.Windows.Media.SolidColorBrush(
                System.Windows.Media.Color.FromRgb((byte)(x.Item2 * 255), (byte)(x.Item3 * 255), (byte)(x.Item4 * 255))
            )
    };
    PanelManager.StackWpfElement(nameLabel);
}
(Add PresentationCore.dll and PresentationFramework.dll to Additional References)
- Using HSV is a little excessive, you can achieve a very similar manner of output by using clamping: `(int)((v * clampScale + clampOffset) * 255);`. For a `clampScale` of `0.35` and a `clampOffset` of `0.6`, it delivers the colors as very appropriately muted. (Der Kommissar, Aug 3, 2017 at 14:43)
- With clamping you're going to get a lot of grayish colors. If that's not a problem then go for it. HSV/HSL makes it a lot easier to get a proper rainbow of colors with a nice saturation and brightness. (Johnbot, Aug 4, 2017 at 9:40)
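For comparison, the clamping approach from the first comment could be sketched like this, using the `clampScale` of 0.35 and `clampOffset` of 0.6 quoted there. The helper name `ClampChannel` and the seed 12345 are assumptions made for the demonstration:

```csharp
using System;
using System.Linq;

const double clampScale = 0.35;
const double clampOffset = 0.6;

// Squeeze a value from [0, 1) into a narrower, brighter band, then scale it to 0-255.
int ClampChannel(double v) => (int)((v * clampScale + clampOffset) * 255);

var random = new Random(12345); // arbitrary seed, for illustration only
int[] rgb = Enumerable.Range(0, 3).Select(_ => ClampChannel(random.NextDouble())).ToArray();

Console.WriteLine(string.Join(", ", rgb));
```

Every channel ends up between roughly 153 and 242, which is why the result looks muted. Johnbot's objection is that narrowing all three channels to the same band also pulls the colours toward gray, whereas HSV lets the hue vary freely while saturation and brightness stay fixed.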
- [...] `(int)(maxColorVal * randomValue)`, so that if your colour system runs `0-255`, you just `(int)(255 * value)`; if it's 0-100 you `(int)(100 * randomValue)`. It's not dependent on any one color scale.
- [...] `HashCode`?
- [...] `#if FEATURE_RANDOMIZED_STRING_HASHING` block, I need to find out if that feature is enabled or disabled for my current build.
- [...] `FEATURE_RANDOMIZED_STRING_HASHING` is enabled, that could be problematic. That would mean that if two strings are in different application domains they would have a different hash-code result. I'd prefer not to have that become an issue, so I'm not going to use `GetHashCode` for it (unless someone can prove that it's a non-issue).