Again, just for fun on a Saturday night, I decided to solve the following problem from CodeEval.
The challenge is finding the index of the lowest unique number.
There is a game where each player picks a number from 1 to 9, writes it on a piece of paper and gives it to a guide. A player wins if his number is the lowest unique one. We may have 10-20 players in our game.
Input sample:
Your program should accept as its first argument a path to a filename. You're a guide and you're given a set of numbers from players for a round of the game. E.g. 2 rounds of the game look this way:
3 3 9 1 6 5 8 1 5 3
9 2 9 9 1 8 8 8 2 1 1
Output sample:
Print the winner's position, or 0 in case there is no winner. In the first line of the input sample the lowest unique number is 6, so player 5 wins (it was player number 5 who threw number 6). In the second line there is no unique number, so the output is 0:
5
0
I solved this problem correctly using LINQ in the following code, but the performance is not that great: about 255 ms for about 100 data points. Any suggestions for improving performance, or perhaps another novel solution? The data have to be read from a text file on CodeEval, so my code is structured around that.
static void Main(string[] args)
{
    using (StreamReader reader = File.OpenText(args[0]))
    {
        while (!reader.EndOfStream)
        {
            var linevalue = reader.ReadLine();
            if (!string.IsNullOrEmpty(linevalue))
            {
                var numbers = linevalue.Split(' ');
                var groups = numbers.GroupBy(x => x).Select(g => new
                {
                    number = g.Key,
                    freq = g.Count()
                });
                var nonRepeated = groups.Where(x => x.freq == 1).ToList();
                if (nonRepeated.Count() != 0)
                {
                    var singles = nonRepeated.Select(x => x.number).ToList();
                    var smallest = singles.Min();
                    var playerNumber = Array.IndexOf(numbers, smallest);
                    Console.WriteLine(playerNumber + 1);
                }
                else
                {
                    Console.WriteLine("0");
                }
            }
        }
    }
}
- The question is written in a very confusing way. – ANeves, Oct 6, 2014 at 9:33
- Do you need to use LINQ? – IEatBagels, Oct 6, 2014 at 13:34
- @TopinFrassi You are allowed to use .Net 4.0, so LINQ is allowed. – Payam, Oct 7, 2014 at 0:27
2 Answers
I threw your code through a profiler and it told me that ToList was responsible for quite a bit of your execution time. So why is that? You have 2 ToList calls in your code:
var nonRepeated = groups.Where(x => x.freq == 1).ToList();
if (nonRepeated.Count() != 0)
{
    var singles = nonRepeated.Select(x => x.number).ToList();
    var smallest = singles.Min();
So what happens here? You start by iterating through groups to find those with freq equal to 1. Count is optimized for lists, so that one should be constant time. Then you iterate through nonRepeated to select number, and then finally you iterate through it to select the minimum value. That's a lot of iterations. Let's try to cut that down:
var nonRepeated = groups.Where(x => x.freq == 1)
                        .Min(x => x.number);
if (nonRepeated != null)
{
    var playerNumber = Array.IndexOf(numbers, nonRepeated);
By comparing the execution time of this and the other one (on my system), it seems we have achieved a small improvement. Not enough to brag about, though. Since we don't have any ToList calls anymore, I profiled the code once again. This time it is Min that takes up the time (not really surprising - it is forcing the GroupBy, Select and Where to be evaluated). So our bottleneck is (still) the LINQ queries. Can we speed it up?
Hint: LINQ usually has a slight overhead compared to doing things 'manually'.
Let's try to do these things ourselves, instead of relying on LINQ. Let's replace that GroupBy and Select stuff:
var values = new Dictionary<string, int>();
foreach (var number in numbers)
{
    if (values.ContainsKey(number)) // No curly brackets to keep it short.
        values[number]++;
    else
        values.Add(number, 1);
}
A benchmark of the code told me that this was a clear improvement. Nice! But the profiler still says that Where and Min are expensive. Let's do those manually too:
string minValue = null;
foreach (var value in values)
{
    if (value.Value == 1)
    {
        if (minValue == null || String.Compare(value.Key, minValue, StringComparison.InvariantCulture) == -1)
            minValue = value.Key;
    }
}
if (minValue != null)
...
Once again, it seems we got an overall improvement in performance. Interestingly, the bottlenecks now seem to be WriteLine and OpenText. We could try to experiment with other ways of reading from the file / writing to the console, but I don't really feel like doing that right now (and I don't think the 'easy' methods like File.ReadAllLines and File.ReadLines are much faster than your way, if at all). You could try to play with StringBuilder and only write to the console once, but in my experience it is rarely an improvement when the method is only executed once.
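For illustration, here is a minimal sketch of that single-write idea, assuming the per-round results have already been computed (the roundResults array is just a placeholder):
using System;
using System.Text;

class Program
{
    static void Main()
    {
        // Sketch only: buffer every round's result and hit the console once,
        // instead of calling Console.WriteLine once per round.
        var output = new StringBuilder();
        int[] roundResults = { 5, 0 };       // placeholder results, one per round
        foreach (var result in roundResults)
        {
            output.AppendLine(result.ToString());
        }
        Console.Write(output.ToString());    // single write to the console
    }
}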
But maybe you can do some other things to squeeze just a little bit more out of your method.
First of all, is the IsNullOrEmpty really necessary? It's a safety mechanism and in real life we would probably want to have it. But this is a coding exercise and we are all about performance right now.
Another thing you might want to consider is that strings are expensive. As it is, you are currently working with all your numbers as strings. Split creates a bunch of new strings, and every time we compare values we do a string comparison. It might be worth converting your values to int or char instead.
HINT: If you want to convert to int values, it can be done using char ASCII values. You know that the values will always be in the range [1-9]. The ASCII value for '1' is 49, '2' is 50, etc. So if you do value[0] - 48, you can convert your strings to ints with very little overhead.
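A minimal sketch of that conversion, using the first line from the problem statement as sample input:
using System;

class Program
{
    static void Main()
    {
        // Sketch only: each token is a single digit in [1-9], so its first
        // character minus '0' (ASCII 48) already is the int value.
        var tokens = "3 3 9 1 6 5 8 1 5 3".Split(' ');
        var numbers = new int[tokens.Length];
        for (int i = 0; i < tokens.Length; i++)
        {
            numbers[i] = tokens[i][0] - '0';          // e.g. '6' (54) - '0' (48) == 6
        }
        Console.WriteLine(string.Join(" ", numbers)); // 3 3 9 1 6 5 8 1 5 3
    }
}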
When you want to optimize something it is always a good idea to measure and profile it and see just where the time is spent. Sometimes you can get big improvements with very few changes. But you can also make things worse with very few changes.
After the changes mentioned in this post (except the string to int conversion) are applied, the code will look something like this:
using (StreamReader reader = File.OpenText(args[0]))
{
    while (!reader.EndOfStream)
    {
        var linevalue = reader.ReadLine();
        var numbers = linevalue.Split(' ');
        var values = new Dictionary<string, int>();
        string minValue = null;
        foreach (var number in numbers)
        {
            if (values.ContainsKey(number))
                values[number]++;
            else
                values.Add(number, 1);
        }
        foreach (var value in values)
        {
            if (value.Value == 1)
            {
                if (minValue == null || String.Compare(value.Key, minValue, StringComparison.InvariantCulture) == -1)
                    minValue = value.Key;
            }
        }
        if (minValue != null)
        {
            var playerNumber = Array.IndexOf(numbers, minValue);
            Console.WriteLine(playerNumber + 1);
        }
        else
        {
            Console.WriteLine("0");
        }
    }
}
As you can see, you have 2 loops within the while now. It might be possible to merge them into one by doing something clever.
I tried the code on CodeEval and got a 181 ms execution time. That is a 74 ms improvement. To motivate you a bit, I know it can be done (at least a little) faster. ;)
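As a hedged illustration of the 'merge the loops' idea above (this sketch is not the code that was timed), a single-pass version over one sample round could look like this:
using System;

class Program
{
    static void Main()
    {
        // Sketch only: count occurrences and remember first positions in one
        // pass over the round, then scan the digits 1-9 in ascending order.
        var numbers = "3 3 9 1 6 5 8 1 5 3".Split(' ');  // one sample round
        var counts = new int[10];
        var firstPosition = new int[10];
        for (int i = 0; i < numbers.Length; i++)
        {
            int n = numbers[i][0] - '0';       // tokens are single digits 1-9
            if (counts[n]++ == 0)
                firstPosition[n] = i + 1;      // remember 1-based player position
        }

        int winner = 0;
        for (int n = 1; n <= 9; n++)           // lowest value first
        {
            if (counts[n] == 1)
            {
                winner = firstPosition[n];
                break;
            }
        }
        Console.WriteLine(winner);             // prints 5 for this sample round
    }
}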
- Thank you for spending time and writing a good answer. I put the code on CodeEval as well; it ran at 213 ms, but of course there is a decent improvement. – Payam, Oct 7, 2014 at 1:34
- @user843681 There can be a little luck involved with a managed language like C#. The same solution can sometimes have different performance results, simply because we do not control the runtime (that is why, when benchmarking, we typically run the code multiple times). It does seem like a big difference though and I don't know what the reason is. – MAV, Oct 7, 2014 at 17:48
Using LINQ might not be the best solution for something relatively simple, like this. Your code, for instance, involves a lot of iterating through different collections. Using a SortedDictionary to keep track of the unique numbers, the lowest being the first one, and a list to keep track of which numbers have been considered, greatly cuts down the number of iterations:
static void Main(string[] args)
{
    using (StreamReader reader = File.OpenText(args[0]))
    {
        while (!reader.EndOfStream)
        {
            var numbers = reader.ReadLine().Split(' ').Select<string, int>(x => int.Parse(x)).ToArray();
            int numcount = numbers.Length;
            if (numcount > 0)
            {
                SortedDictionary<int, int> freqtable = new SortedDictionary<int, int>();
                var usedints = Enumerable.Range(0, 10).Select(x => x = 0).ToList();
                for (int i = 0; i < numcount; i++)
                {
                    if (freqtable.ContainsKey(numbers[i]))
                    {
                        freqtable.Remove(numbers[i]);
                        usedints[numbers[i]] = i;
                    }
                    else
                    {
                        if (usedints[numbers[i]] == 0)
                            freqtable.Add(numbers[i], i + 1);
                    }
                }
                if (freqtable.Count > 0)
                    Console.WriteLine(freqtable.First().Value);
                else
                    Console.WriteLine(0);
            }
        }
    }
}
EDIT: Re-worked the solution. Did some more thinking on it and got the time down to 151 ms. Changed the stream to binary, the loop to only read 2 bytes at a time and the collections to be arrays. This cut down on time and memory:
using System;
using System.Collections.Generic;
using System.Linq;
using System.IO;

namespace LowestUniqueNum_cs
{
    class Program
    {
        static void Main(string[] args)
        {
            using (BinaryReader reader = new BinaryReader(File.OpenRead(args[0])))
            {
                while (reader.PeekChar() > 0)
                {
                    KeyValuePair<int, int>[] minvalue = new KeyValuePair<int, int>[10];
                    int[] usedints = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
                    int number = 0;
                    for (int i = 1; ; i++)
                    {
                        byte[] temp = reader.ReadBytes(2);
                        number = temp[0] - 48;
                        if (usedints[number] == 0)
                        {
                            minvalue[number] = new KeyValuePair<int, int>(number, i);
                            usedints[number] = number;
                        }
                        else if (minvalue[number].Key != 0)
                        {
                            minvalue[number] = new KeyValuePair<int, int>();
                        }
                        if (temp.Length != 2 || temp[1] != 32)
                        {
                            break;
                        }
                    }
                    Console.WriteLine(minvalue.FirstOrDefault(x => x.Key > 0).Value);
                }
            }
        }
    }
}
- Your solution takes about 266 ms for the same amount of input data; it just has slightly better memory usage than my solution. There must be a better and faster solution. – Payam, Oct 5, 2014 at 16:07
- Select(x => x = 0) - what is this trying to do? Because it's equivalent to Select(x => 0) and I doubt that was the intention. – svick, Oct 5, 2014 at 17:38
- It sets the list to default values. – user33306, Oct 5, 2014 at 23:53
- There is no need for the last Clear and reassignment of usedints since you recreate both in the next iteration of the loop. – MAV, Oct 6, 2014 at 0:10
- @user843681 - Made some revisions that might interest you. – user33306, Oct 6, 2014 at 2:42