LINQ-ifying Colour Generation from Strings

Question 1

Recently I was thinking about how the coloured icons with a letter in the centre work, the ones used by Google, Microsoft Outlook on Windows Phone, and now many others.

Essentially, given some sort of name, they generate a colour based on that name. The kicker is that the same name always generates the same colour. The icon is then rendered with the 'initials' (dependent upon implementation) of the sender of the email within it. A sample is below.

Example images


In the image you can see that both "WO" icons are the same colour, but the two "G" icons are different colours. This is because the "G" icons come from different email addresses, while the "WO" icons come from the same one.

So of course I wanted to write a system that does something similar: based upon "some string", it generates a colour that is mostly unique to that string. (I don't particularly care if multiple strings return the same colour, but similar strings should produce substantially different colours.)

This should be entirely predictable. Every single string should result in the exact same output every time. There should be no deviation (unless the underlying implementation of Random changes).

For the solution I went purely with LINQ / method calls, and I want to keep it that way.

new string[]
{
    "EBrown"
}
.Select(name =>
    new Random(
        Encoding
            .UTF8
            .GetBytes(name)
            .Select((b, i) =>
                b << (i % 4) * 8
            )
            .Aggregate(0, (acc, i) =>
                acc + i
            )
    )
)
.Select(r =>
    new double[3]
    {
        r.NextDouble(),
        r.NextDouble(),
        r.NextDouble()
    }
)

I did not return a Color mostly because I'm not referencing any libraries that have a Color (this was built in the C# Interactive window), but also because it should be generic enough to just return a list of three values for the color.

To address one of the comments:

What purpose of this: .Aggregate(0, (acc, i) => acc + i)? It is just the same as Sum()...

As the two of us discovered (thanks Maxim for starting this discussion), the Sum method performs its operation within a checked context, and will throw an OverflowException on the last test case I provided. Wrapping the entire thing in an unchecked context does not solve this, as the local checked context in Sum takes precedence.

We could define a SumOverflow method that would do the summation inside an unchecked context, but that takes us out of vanilla-LINQ mode, and requires an additional dependency on a new method. (I literally want zero additional dependencies, I only want vanilla / naked LINQ.)
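The difference between the two can be reproduced in isolation. The following is a minimal sketch, not the code from the question: the two-element array is just an illustrative input whose total does not fit in an int, and the explicit unchecked(...) inside the lambda is an addition to make the wrap-around independent of compiler settings.

```csharp
using System;
using System.Linq;

// Two values whose sum does not fit in a 32-bit int.
var values = new[] { int.MaxValue, 1 };

// Enumerable.Sum adds inside a checked block, so the overflow throws.
bool sumThrew = false;
try
{
    values.Sum();
}
catch (OverflowException)
{
    sumThrew = true;
}

// Aggregate runs our own lambda; unchecked(...) makes the wrap-around explicit
// regardless of how the project is compiled.
int wrapped = values.Aggregate(0, (acc, i) => unchecked(acc + i));

Console.WriteLine(sumThrew);                // True
Console.WriteLine(wrapped == int.MinValue); // True
```

For seeding Random the wrapped value is perfectly usable, since any int is a valid seed.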

Regarding:

I also think this b << (i % 4) * 8 requires encapsulation too. It's a magic formula. Perhaps you could explain what you are calculating here?

Shout out to t3chb0t: you're not wrong; however, that, once again, goes against my intent. I'm trying to design this purely LINQ-only, as I want to be able to copy/pasta it and modify as necessary for the specific implementation (if I am even doing so), but I don't want to use any additional dependencies. That said, if someone comes up with a way to encapsulate it and keep it simple, I'm more than happy to use it.

The formula itself is rather simple though: the goal is to take the [i]ndex of the [b]yte and shift the [b]yte left by 8 bits times that [i]ndex modulo 4. So if the [b]yte is 4, and it sits at index 7 (the eighth byte), then 7 % 4 = 3 and it gets shifted left 24 bits so that it's the highest section of the int. (Since the next step is to add all the integers together, I shift beforehand to make sure that we're not just adding the lowest byte of integers together.)

As an example:

String: EBrown
Byte Array: 0x45, 0x42, 0x72, 0x6f, 0x77, 0x6e
Shifted Int Array: 0x45, 0x4200, 0x720000, 0x6f000000, 0x77, 0x6e00
Sum: 0x6f72b0bc
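The worked example above can be checked with a few lines. This is a sketch that isolates only the seed calculation from the question; the extra parentheses around the shift amount and the unchecked(...) are additions for readability and do not change the result.

```csharp
using System;
using System.Linq;
using System.Text;

// Fold the UTF-8 bytes of a name into an int seed, as described above:
// byte i is shifted into byte lane (i % 4) before everything is added up.
int seed = Encoding.UTF8.GetBytes("EBrown")
    .Select((b, i) => b << ((i % 4) * 8))
    .Aggregate(0, (acc, x) => unchecked(acc + x));

Console.WriteLine(seed.ToString("x8")); // 6f72b0bc
```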

Test cases:

EBrown: double[3] { 0.93380133292349121, 0.61282262793407893, 0.99123903596365781 }
Elliott: double[3] { 0.46952278049174828, 0.71652967376472876, 0.041705311295439168 }
Brown: double[3] { 0.45409023782894492, 0.70977234454349258, 0.61635322292165518 }
Elliott Brown: double[3] { 0.033125448521750721, 0.84549407793464793, 0.41680685589872618 }

Question 2

What color control takes 3 double for input?

Question 3

@Paparazzi Anything that uses ARGB with a 100% alpha. The idea is to be able to (int)(maxColorVal * randomValue), so that if your colour system runs 0-255, you just (int)(255 * value), if it's 0-100 you (int)(100 * randomValue). It's not dependent on any one color scale.
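As a sketch of that scaling, using the three doubles listed for "EBrown" in the test cases. The helper name toChannel is hypothetical and not part of the original code.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not from the original post): scale a value in [0, 1)
// onto an integer colour scale that runs from 0 to max.
Func<double, int, int> toChannel = (value, max) => (int)(max * value);

// The three doubles listed for "EBrown" in the test cases.
var ebrown = new[] { 0.93380133292349121, 0.61282262793407893, 0.99123903596365781 };

int[] rgb255 = ebrown.Select(v => toChannel(v, 255)).ToArray(); // 0-255 scale
int[] rgb100 = ebrown.Select(v => toChannel(v, 100)).ToArray(); // 0-100 scale

Console.WriteLine(string.Join(", ", rgb255)); // 238, 156, 252
Console.WriteLine(string.Join(", ", rgb100)); // 93, 61, 99
```

Because NextDouble returns values strictly below 1.0, the truncating cast never yields max itself; multiplying by max + 1 instead would cover the full range if that matters.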

Question 4

ohai HashCode?

Question 5

@Vogel612 It's possible to utilize that. I had assumed it would not be string-specific, but a review of the source shows that it takes the current string into account. (It does the hash-code on all chars of the string from what I just read of the source.) I am a bit concerned about the #if FEATURE_RANDOMIZED_STRING_HASHING block, though; I need to find out if that feature is enabled or disabled for my current build.

Question 6

@Vogel612 So as it turns out, if FEATURE_RANDOMIZED_STRING_HASHING is enabled, that could be problematic. That would mean that if two strings are in different application domains they would have a different hash-code result. I'd prefer not to have that become an issue, so I'm not going to use GetHashCode for it (unless someone can prove that it's a non-issue).
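If a hash-style seed is wanted without relying on GetHashCode, one option is a small, fully specified hash written as a LINQ fold. The sketch below uses FNV-1a (32-bit); this is a swapped-in technique that nobody in the thread proposed, and stableHash is a hypothetical name.

```csharp
using System;
using System.Linq;
using System.Text;

// FNV-1a (32-bit) as a LINQ fold. Swapped-in technique, not from the original
// post: 2166136261 is the FNV offset basis and 16777619 the FNV prime.
Func<string, uint> stableHash = s =>
    Encoding.UTF8.GetBytes(s)
        .Aggregate(2166136261u, (h, b) => unchecked((h ^ b) * 16777619u));

// The result depends only on the bytes of the input, not on the runtime.
Console.WriteLine(stableHash("a").ToString("x8")); // e40c292c

// Random takes an int seed, so reinterpret the 32 bits.
var random = new Random(unchecked((int)stableHash("EBrown")));
```

This keeps the zero-dependency constraint, since it is still a single Aggregate call over the UTF-8 bytes.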

Question 7

You can use some encapsulation if you turn certain parts of your expressions into LINQ queries. This will allow you to use the let keyword to define Funcs. I think with all the helper lets it's both easier to maintain and to understand.

var colors =
    (from name in new[] { "EBrown", "t3chb0t" }
     let shiftColorComponent = new Func<byte, int, int>((b, i) => b << (i % 4) * 8)
     let safeSum = new Func<IEnumerable<int>, int>(values => values.Aggregate(0, (acc, i) => acc + i))
     let encodedChars =
         (from t in Encoding.UTF8.GetBytes(name).Select((b, i) => (b: b, i: i))
          select shiftColorComponent(t.b, t.i))
     let nameSeed = safeSum(encodedChars)
     let nameRandom = new Random(nameSeed)
     select Enumerable.Range(0, 3).Select(x => nameRandom.NextDouble()).ToArray());

Question 8

I was actually just thinking about this as I was reading the other answer. :) +1

Question 9

I believe it is a bad idea to use let for variables that will never be changed and don't rely on elements of the iterated collection. The initial LINQ query is prettier in my opinion. This one looks like overengineering. But I like the last statement of your solution, which eliminates three identical nameRandom.NextDouble() instructions.

Question 10

@Maxim "it is a bad idea to use let for variables that will never be changed and don't rely on elements of iterated collection": I find it silly to impose such constraints, sorry ;-) Maybe it's not optimal because the two Funcs will be recreated for each name, but putting a large LINQ expression inside of the Random constructor doesn't seem right either. But who cares, the question already constrains what improvements are possible by forbidding the usage of additional functions, which is against OO anyway because the reusability here works by copy/paste and not via a library.

Question 11

@t3chb0t Yes, recreation of delegates for each element of an input collection makes me silly as you said :)

Question 12

@EBrown consider using a local function to gain some encapsulation and naming of the function. Alternatively, if you want the shift function to be specifiable from "the outside", make it a delegate and force the client to specify the function. You can always specify a default and allow the client to override it too.
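A rough sketch of what that suggestion could look like, assuming C# 7 or later for local functions. The names ShiftIntoLane and SeedFor are illustrative and do not appear in the original code.

```csharp
using System;
using System.Linq;
using System.Text;

// Local function: gives the "magic formula" a name without adding a dependency.
int ShiftIntoLane(byte b, int index) => b << ((index % 4) * 8);

// The shift is a delegate parameter; leaving it out falls back to the default.
// SeedFor and ShiftIntoLane are illustrative names, not from the original post.
int SeedFor(string name, Func<byte, int, int> shift = null)
{
    if (shift == null)
    {
        shift = ShiftIntoLane;
    }

    return Encoding.UTF8.GetBytes(name)
        .Select(shift)
        .Aggregate(0, (acc, x) => unchecked(acc + x));
}

int defaultSeed = SeedFor("EBrown");           // uses ShiftIntoLane
int plainSum = SeedFor("EBrown", (b, i) => b); // caller-supplied: no shifting

Console.WriteLine(defaultSeed.ToString("x8")); // 6f72b0bc
Console.WriteLine(plainSum);                   // 589
```

The trade-off is the one already discussed in the thread: the snippet is no longer a single expression, but it still copies and pastes as one unit.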

Question 13

There are an infinite number of ways to do this but one possible approach would be to use a dedicated hash function to compress your inputs; out of sheer laziness and convenience I chose to use the built-in MD5CryptoServiceProvider for my example.

To improve the performance characteristics of your code I hoisted the construction of the hashing class up a level and used SelectMany so that we only have to instantiate it once per call instead of once per string. I then chose to sample the first three bytes of the hash result in order to come up with values for Red, Green, and Blue.

new[] { new MD5CryptoServiceProvider() }
.SelectMany(
    // The single hasher is the outer element; the strings are projected from it.
    hasher => new[] {
        "A",
        "B",
        "C",
        "D",
        "E",
    },
    (hasher, value) => new {
        StringValue = value,
        StringHash = hasher.ComputeHash(Encoding.Unicode.GetBytes(value))
    }
)
.Select(a => new {
    Input = a.StringValue,
    r = a.StringHash[0],
    g = a.StringHash[1],
    b = a.StringHash[2]
});

One could get doubles for RGB by using a provider that returns at least 24 bytes (such as SHA256CryptoServiceProvider, which returns 32; MD5's 16 bytes are too few for the offsets read here) combined with some altered sampling logic:

.Select(a => new {
    Input = a.StringValue,
    // "magic" reference: https://www.doornik.com/research/randomdouble.pdf
    r = (0.5d + (2.22044604925031308085e-016d / 2) + (BitConverter.ToInt32(a.StringHash, 0) * 2.32830643653869628906e-010d) + ((BitConverter.ToInt32(a.StringHash, 4) & 0x000FFFFF) * 2.22044604925031308085e-016d)),
    g = (0.5d + (2.22044604925031308085e-016d / 2) + (BitConverter.ToInt32(a.StringHash, 8) * 2.32830643653869628906e-010d) + ((BitConverter.ToInt32(a.StringHash, 12) & 0x000FFFFF) * 2.22044604925031308085e-016d)),
    b = (0.5d + (2.22044604925031308085e-016d / 2) + (BitConverter.ToInt32(a.StringHash, 16) * 2.32830643653869628906e-010d) + ((BitConverter.ToInt32(a.StringHash, 20) & 0x000FFFFF) * 2.22044604925031308085e-016d)),
});

Question 14

Realistically, each of the three primary components of a color only has a resolution of [0, 255], so it wouldn't be necessary to use a higher-entropy hash, as a.StringHash[i] / 255.0 would suffice.
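A sketch of that simpler sampling, assuming MD5 via MD5.Create() and the same Encoding.Unicode bytes as the answer; the input string is only an example.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

double[] rgb;
using (var md5 = MD5.Create())
{
    // MD5 yields 16 bytes; the first three are taken as red, green and blue.
    rgb = md5.ComputeHash(Encoding.Unicode.GetBytes("EBrown"))
        .Take(3)
        .Select(channel => channel / 255.0) // 0..255 mapped onto [0, 1]
        .ToArray();
}

Console.WriteLine(string.Join(", ", rgb));
```

Note that a byte value of 255 maps to exactly 1.0 here, so the range is closed at the top, unlike the [0, 1) range of Random.NextDouble.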

Question 15

@EBrown You're absolutely right, the example is a bit more generalized than needed for this particular setup; a more efficient implementation would ideally use your logic along with a compression function that outputs 24 bits instead of the 128 that MD5 does.

Question 16

It's hard to get nice bright colors if you just pick random rgb values.

Instead I would recommend using HSV (or HSL) where it is easier to manage the brightness and saturation.

First result I got when searching.

If your only requirement is that it's copy/paste-able I would suggest using code blocks and writing proper functions:

.Select(name =>
{
    var hash = HashName(name);
    // Divide by 256 so hue stays in [0, 1); 255 / 255.0 == 1.0 would fail CheckInRange.
    var hue = hash[0] / 256.0;
    var saturation = 0.7 + hash[1] / 255.0 / 8;
    var value = 0.92 + hash[2] / 255.0 / 16;
    var color = HsvToRgb(hue, saturation, value);
    return (name, r: color.r, g: color.g, b: color.b);
    byte[] HashName(string toHash)
    {
        using(var hasher = new System.Security.Cryptography.SHA256Managed())
        {
            return hasher.ComputeHash(System.Text.Encoding.Unicode.GetBytes(toHash));
        }
    }
    /* https://martin.ankerl.com/2009/12/09/how-to-create-random-colors-programmatically/ */
    (double r, double g, double b) HsvToRgb(double h, double s, double v)
    {
        CheckInRange(h, nameof(h));
        CheckInRange(s, nameof(s));
        CheckInRange(v, nameof(v));
        var h_i = (int)Math.Floor(h * 6);
        var f = h * 6 - h_i;
        var p = v * (1 - s);
        var q = v * (1 - f * s);
        var t = v * (1 - (1 - f) * s);
        switch(h_i)
        {
            case 0: return (v, t, p);
            case 1: return (q, v, p);
            case 2: return (p, v, t);
            case 3: return (p, q, v);
            case 4: return (t, p, v);
            case 5: return (v, p, q);
            default: throw new InvalidOperationException();
        }
        void CheckInRange(double parameter, string parameterName)
        {
            if (double.IsNaN(parameter)
                || parameter < 0
                || parameter >= 1)
            {
                throw new ArgumentOutOfRangeException(parameterName, parameter, $"Expected range [0,1[ was: {parameter}");
            }
        }
    }
})

You can then easily tweak the saturation and brightness in constrained ranges.

If you use LinqPad you can view the results with the following script:

foreach(var x in
    Enumerable.Range(0, 100)
    .Select(x => x.ToString())
    .Select(name =>
    {
        var hash = HashName(name);
        // Divide by 256 so hue stays in [0, 1); 255 / 255.0 == 1.0 would fail CheckInRange.
        var hue = hash[0] / 256.0;
        var saturation = 0.7 + hash[1] / 255.0 / 8;
        var value = 0.92 + hash[2] / 255.0 / 16;
        var color = HsvToRgb(hue, saturation, value);
        return (name, r: color.r, g: color.g, b: color.b);
        byte[] HashName(string toHash)
        {
            using(var hasher = new System.Security.Cryptography.SHA256Managed())
            {
                return hasher.ComputeHash(System.Text.Encoding.Unicode.GetBytes(toHash));
            }
        }
        /* https://martin.ankerl.com/2009/12/09/how-to-create-random-colors-programmatically/ */
        (double r, double g, double b) HsvToRgb(double h, double s, double v)
        {
            CheckInRange(h, nameof(h));
            CheckInRange(s, nameof(s));
            CheckInRange(v, nameof(v));
            var h_i = (int)Math.Floor(h * 6);
            var f = h * 6 - h_i;
            var p = v * (1 - s);
            var q = v * (1 - f * s);
            var t = v * (1 - (1 - f) * s);
            switch(h_i)
            {
                case 0: return (v, t, p);
                case 1: return (q, v, p);
                case 2: return (p, v, t);
                case 3: return (p, q, v);
                case 4: return (t, p, v);
                case 5: return (v, p, q);
                default: throw new InvalidOperationException();
            }
            void CheckInRange(double parameter, string parameterName)
            {
                if (double.IsNaN(parameter)
                    || parameter < 0
                    || parameter >= 1)
                {
                    throw new ArgumentOutOfRangeException(parameterName, parameter, $"Expected range [0,1[ was: {parameter}");
                }
            }
        }
    })
)
{
    var nameLabel = new System.Windows.Controls.Label
    {
        Content = x.Item1,
        Background =
            new System.Windows.Media.SolidColorBrush(
                System.Windows.Media.Color.FromRgb((byte)(x.Item2 * 255), (byte)(x.Item3 * 255), (byte)(x.Item4 * 255))
            )
    };
    PanelManager.StackWpfElement(nameLabel);
}

(Add PresentationCore.dll and PresentationFramework.dll to Additional References)

Question 17

Using HSV is a little excessive; you can achieve very similar output by clamping: (int)((v * clampScale + clampOffset) * 255). With a clampScale of 0.35 and a clampOffset of 0.6, it delivers appropriately muted colors.
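As a sketch of that formula with the stated constants. The name toMutedChannel and the three sample inputs are illustrative only.

```csharp
using System;
using System.Linq;

const double clampScale = 0.35;
const double clampOffset = 0.6;

// Squeeze a value from [0, 1) into [0.6, 0.95) before scaling to a byte range,
// using the formula and constants from the comment above.
Func<double, int> toMutedChannel = v => (int)((v * clampScale + clampOffset) * 255);

// Sample inputs spanning the range that Random.NextDouble can return.
var channels = new[] { 0.0, 0.5, 0.999 }.Select(toMutedChannel).ToArray();

Console.WriteLine(string.Join(", ", channels));
```

Every channel ends up in roughly the upper 40% of the byte range, which is what produces the muted look: the three components can never be far apart.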

Question 18

With clamping you're going to get a lot of grayish colors. If that's not a problem then go for it. HSV/HSL makes it a lot easier to get a proper rainbow of colors with a nice saturation and brightness.

Accepted answer: t3chb0t, answered 2017-08-01 17:27:57Z (the query-syntax answer using let, quoted in full above).


Comments on the accepted answer (texts as quoted earlier), in order:
Der Kommissar, commented Aug 1, 2017 at 17:29
Maxim, commented Aug 2, 2017 at 2:37
t3chb0t, commented Aug 2, 2017 at 4:07
Maxim, commented Aug 2, 2017 at 5:17
RubberDuck, commented Aug 4, 2017 at 0:46
