Im running a photocontest and making a list of winners with one winner each day. My loop is running very slow. It seams to be the toList() that is time consuming. Any suggestions on how to make it run faster?
foreach (var date in dates)
{
var voteList = _images.Select(image => new ImageWinnerVotes
{
ID = image.ID,
Image = image.ImagePath,
Thumbnail = image.ThumbnailImagePath,
Title = image.Name,
Url = GetFriendlyUrl(image.ID),
Date = image.CreatedDate,
WinnerDateString = date.Date.ToString("dd/MM"),
WinnerDate = date.Date,
TotalVotes =
(votes.Where(vote => vote.ItemId == image.ID)).Count(),
Name = image.Fields["Name"].ToString(),
Votes =
(votes.Where(
vote =>
vote.ItemId == image.ID &&
vote.Date.DayOfYear == date.Date.DayOfYear)).Count()
}).ToList();
voteList = voteList.OrderByDescending(v => v.Votes).ToList();
var j = 0;
var findingWinner = true;
while(findingWinner)
{
var inWinnersList = false;
if (winnersList.Count() != 0)
{
if(!winnersList.Contains(voteList.ElementAt(j)))
{
winnersList.Add(voteList.ElementAt(j));
findingWinner = false;
}
}
else
{
winnersList.Add(voteList.ElementAt(j));
findingWinner = false;
}
j++;
}
}
3 Answers 3
First up calls to .ToList()
are not inherently slow - it's only when each element in the list is slow to compute that .ToList()
appears slow.
So then, what makes the time to compute each element slow?
Well, you're querying votes
twice in each query - and computing TotalVotes
for each iteration of the foreach
loop.
After the query you are then sorting all the candidates by daily votes and then searching through the candidates and picking the first candidate image that hasn't previously won. The search for the winner uses Count
, Contains
, and ElementAt
- all of which can cause the list to be enumerated each time.
All of this is surrounded by a foreach
loop to iterate through each date.
Lots of nested iterating - and most of which is unnecessary.
I've got a solution which I've timed to be 135 times faster.
Here it is:
var query =
from image in _images
join tvote in votes on image.ID equals tvote.ItemId into tvs
let TotalVotes = tvs.Count()
join vote in votes on image.ID equals vote.ItemId
join date in dates on vote.Date.DayOfYear equals date.Date.DayOfYear
group vote by new
{
image,
date.Date,
TotalVotes,
} into dvs
let Date = dvs.Key.Date
let Image = dvs.Key.image
let DailyVotes = dvs.Count()
orderby DailyVotes descending
orderby Date
group new
{
Image = dvs.Key.image,
dvs.Key.TotalVotes,
DailyVotes = dvs.Count(),
} by Date;
var winners = query.Aggregate(new ImageWinnerVotes[] { }, (iwvs, gs) =>
{
var previousWinners = iwvs.Select(iwv => iwv.ID).ToArray();
var winner = gs
.Where(g => !previousWinners.Contains(g.Image.ID))
.Select(g => new ImageWinnerVotes
{
ID = g.Image.ID,
Image = g.Image.ImagePath,
Thumbnail = g.Image.ThumbnailImagePath,
Title = g.Image.Name,
Url = GetFriendlyUrl(g.Image.ID),
Date = g.Image.CreatedDate,
WinnerDateString = gs.Key.ToString("dd/MM"),
WinnerDate = gs.Key,
TotalVotes = g.TotalVotes,
Name = g.Image.Fields["Name"].ToString(),
Votes = g.DailyVotes,
})
.FirstOrDefault();
return winner == null ? iwvs.ToArray() : iwvs.Concat(new [] { winner, }).ToArray();
}).ToArray();
This is purely a LINQ solution. It's a good idea not to mix LINQ & non-LINQ.
Now, if you didn't have the requirement to eliminate previous winners then I have a solution that is twice as fast again:
var query =
from image in _images
join tvote in votes on image.ID equals tvote.ItemId into tvs
let TotalVotes = tvs.Count()
join vote in votes on image.ID equals vote.ItemId
join date in dates on vote.Date.DayOfYear equals date.Date.DayOfYear
group vote by new
{
image,
date.Date,
TotalVotes,
} into dvs
group new
{
Image = dvs.Key.image,
dvs.Key.TotalVotes,
DailyVotes = dvs.Count(),
} by dvs.Key.Date into gxs
let winners =
from x in gxs
orderby x.DailyVotes descending
select x
let winner = winners.First()
let image = winner.Image
let winnerDate = gxs.Key
orderby winnerDate
select new ImageWinnerVotes
{
ID = image.ID,
Image = image.ImagePath,
Thumbnail = image.ThumbnailImagePath,
Title = image.Name,
Url = GetFriendlyUrl(image.ID),
Date = image.CreatedDate,
WinnerDateString = winnerDate.ToString("dd/MM"),
WinnerDate = winnerDate,
TotalVotes = winner.TotalVotes,
Name = image.Fields["Name"].ToString(),
Votes = winner.DailyVotes,
};
var winners = query.ToArray();
Let me know if these help and/or if you need more info.
You have a lot of iterations in there. the votes.Where's are both a complete iteration of the votes enumerable, and you are doing those once for each Image. As mentioned earlier, everything in that linq query is just a definition of a query, the query itself isn't executed until you call .ToList()
which is why it seems that's the time sink.
Save yourself the trouble and cache your votes results:
Tuple<int, int> numberOfVotesPerImageId =
votes.GroupBy(vote => vote.ItemId)
.Select(voteGroup => Tuple.Create(voteGroup.Key, voteGroup.Count()))
.ToArray();
Tuple<int, int, int> numberOfVotesPerImageIdAndDayOfYear =
votes.GroupBy(vote => new { vote.ItemId, vote.Date.DayOfYear })
.Select(voteGroup => Tuple.Create(voteGroup.Key.Item1, voteGroup.Key.Item2, voteGroup.Count()))
.ToArray();
Then you're select will have far fewer iterations as you'll have these aggregates on the votes to match up against the itemid where these tuples have: .Item1 == image.ID
and in the second one Item2 == date.Date.DayOfYear
, and in the first one .Item2 == TotalVotes
while the second one .Item3 = Votes
Also, put the order by in the first clause before calling .ToList()
so it doesn't need to reiterate twice.
Also, you're while loop should be a for loop because the iterator, you need to test against length or risk going out of index. And anytime you see an if block nested in an if block with no space leftover, that means you should modify your clauses.. actually I would do this (doing the count and the contains is two iterations when the contains alone is all you need) also your inWinnersList
variable is unused.. oh and the .ElementAt is another iteration across the list which makes it O(j) when lists directly expose the array index which is O(1):
for (int j = 0; j < voteList.Count; j++)
{
inWinnersList = winnersList.Contains(voteList[j]);
if (inWinnersList)
{
continue;
}
winnersList.Add(voteList[j]);
break;
}
There's probably more things I'm missing and details I'm not going into, but I think this is a good start for improving performance and readability. Once you get these pieces put together, other improvements may become obvious as the intent becomes clearer.
The toList() executes everything in the select-clause. My guess is that the votes-collection that you are quering against, without knowing how it operates, is the performance thief.
Try running the example again without setting the Votes-variable. If it runs smoothly, then you know what to optimize.
What DAL are you using?
GetFriendlyUrl(image.ID)
do ? \$\endgroup\$votes
and a list ofimage.ID
s to assemble your winners list long before constructing theImageWinnerVotes
, once you have thatwinnersList
ofimage.ID
then construct theImageWinnerVotes
from that so you don't have to create any extras that end up throw away. \$\endgroup\$