I have a list of records containing Id, DateFrom, DateTo
. For the sake of this question we can use this one:
List<(int, DateTime, DateTime)> data = new List<(int, DateTime, DateTime)>
{
(1, new DateTime(2012, 5, 16), new DateTime(2018, 1, 25)),
(2, new DateTime(2009, 1, 1), new DateTime(2011, 4, 27)),
(3, new DateTime(2014, 1, 1), new DateTime(2016, 4, 27)),
(4, new DateTime(2015, 1, 1), new DateTime(2015, 1, 3)),
};
In my real case the List could be a lot bigger, for now I assume that there is only one entry for a certain Id
.
The task is to find the two records that has the longest time overlap and to return the ids and the number of days overlapped.
Which in this sample case means that these should be records 1 and 3, because 2 is not overlapping with anyone, and 4 overlaps for shorter time.
My implementation of this is the following:
public (int, int, int) GetLongestElapsedPeriod(List<(int, DateTime, DateTime)> periods)
{
int firstId = -1;
int secondId = -1;
int periodInDays = 0;
foreach (var period in periods)
{
var Id = period.Item1;
var dateFrom = period.Item2;
var dateTo = period.Item3;
for (int i = 0; i < periods.Count; i++)
{
if (Id != periods[i].Item1)
{
var tempId = periods[i].Item1;
var tempDateFrom = periods[i].Item2;
var tempDateTo = periods[i].Item3;
if (tempDateFrom < dateTo && tempDateTo > dateFrom)
{
DateTime commonStartingDate = tempDateFrom > dateFrom ? tempDateFrom : dateFrom;
DateTime commonEndDate = tempDateTo > dateTo ? dateTo : tempDateTo;
var elapsedPeriod = commonEndDate - commonStartingDate;
if ((int)elapsedPeriod.TotalDays > periodInDays)
{
periodInDays = (int)elapsedPeriod.TotalDays;
firstId = Id;
secondId = tempId;
}
}
}
}
}
return (firstId, secondId, periodInDays);
}
But I think this way is not very efficient and I'm looking for ideas to optimize this logic. Also, I suspect that this is variation of some more general algorithmic problem.
1 Answer 1
I am not a c# programmer.
Firstly, you use a for each
and a nested for
loop, but both of these are iterating through the same List
. For consistency (and for the next comment to make sense), I suggest that you use for
loops.
for (int i = 0; i < periods.Count; i++)
You could start i
not from 0. but from your current location + 1
in the outer loop. The logic behind this is that you have already compared the pairs of periods previous to this - no need to double handle. Hence why using similar for
loops will make this easier - you will be counting where you are in the List
in the outer loop.
Making this small change means that if (Id != periods[i].Item1
is no longer needed - reducing one nesting level in the code.
As a 'meaningful name thing', I would call commonStartingDate
and commonEndDate
overlapStart
and overlapEnd
respectively - this is more descriptive of what you mean.
overlapStart = max(period1.Start, period2.Start)
overlapEnd = min(period1.End,period2.End)
Error checking: check that Item2
is always less than Item3
(and fix if not), otherwise your logic will be broken.
You could simplify var elapsedPeriod = commonEndDate - commonStartingDate
and the following code by using:
int elapsedPeriod = (int)(commonEndDate - commonStartingDate).TotalDays
May not look simpler above, but in the next block:
if (elapsedPeriod > periodInDays)
{
periodInDays = elapsedPeriod;
firstId = Id;
secondId = tempId;
You don't use elapsedPeriod
for anything else.
The main logic is good - I remember having to solve this problem 25 years ago when developing an accommodation booking database.
-
\$\begingroup\$ Thanks for the review. I will most certainly keep in mind your recommendations when I refactor my code. The reason why I won't accept it just yet is because I am also looking for performance optimization suggestions. The nested for-loops makes the complexity too big and I feel there must be a better way. \$\endgroup\$Leron– Leron2018年01月26日 07:50:04 +00:00Commented Jan 26, 2018 at 7:50
-
\$\begingroup\$ @Leron: A double
for
is not excessively complex for the posed problem. You're already nesting afor
in aforeach
, so I don't quite see how this answer would be more complex than your solution. Also, just to make sure in case that's what you meant with your comment, code complexity and application performance are not related to each other. Simple code can have a slow performance, complex code can run very efficiently. \$\endgroup\$Flater– Flater2018年01月26日 08:22:38 +00:00Commented Jan 26, 2018 at 8:22 -
\$\begingroup\$ @Flater You are correct, and maybe my comment wasn't clear enough. I don't mean that
AJD
answer has some additional performance issue. As you pointed out, both mine idea, and whatAJD
suggest is relaying on nested for loops which makes the complexityn * n
and even though I am not too much into algorithms I know that in general this is not that great and usually there's a way to optimize such code. Asvnp
commented in the original post, ordering by dates may be the key to reduce the complexity. From what I am trying right now, it seems it's possible to have only one for loop. \$\endgroup\$Leron– Leron2018年01月26日 09:27:18 +00:00Commented Jan 26, 2018 at 9:27 -
\$\begingroup\$ And for this choice to use
foreach
and thenfor
I will most certainly follow the recommendations ofAJD
and refactor my code. But essentially in terms of faster execution changing theforeach
withfor
won't help a lot. As he mentioned I will be able to avoid theif
check but that's about it. \$\endgroup\$Leron– Leron2018年01月26日 09:30:44 +00:00Commented Jan 26, 2018 at 9:30 -
\$\begingroup\$ @Leron: I'm not convinced you can compare every element to every other element with a single for loop (unless you mean to obfuscate the second loop, e.g. LINQ or convoluted preprocessing of data). The first loop defines the left operand for comparison, the second loop defines the right operand for comparison. A single loop can't define both operands at the same time. You're not comparing the items themselves (which could be simplified by ordering and keeping track of the highest/lowest encountered value), but rather the overlap between two items, which must be calculated first. \$\endgroup\$Flater– Flater2018年01月26日 09:39:58 +00:00Commented Jan 26, 2018 at 9:39
Item2
perhaps? \$\endgroup\$