1
\$\begingroup\$

I have a list of records containing Id, DateFrom, DateTo. For the sake of this question we can use this one:

List<(int, DateTime, DateTime)> data = new List<(int, DateTime, DateTime)>
 {
 (1, new DateTime(2012, 5, 16), new DateTime(2018, 1, 25)),
 (2, new DateTime(2009, 1, 1), new DateTime(2011, 4, 27)),
 (3, new DateTime(2014, 1, 1), new DateTime(2016, 4, 27)),
 (4, new DateTime(2015, 1, 1), new DateTime(2015, 1, 3)),
 };

In my real case the List could be a lot bigger, for now I assume that there is only one entry for a certain Id.

The task is to find the two records that has the longest time overlap and to return the ids and the number of days overlapped.

Which in this sample case means that these should be records 1 and 3, because 2 is not overlapping with anyone, and 4 overlaps for shorter time.

My implementation of this is the following:

 public (int, int, int) GetLongestElapsedPeriod(List<(int, DateTime, DateTime)> periods)
 {
 int firstId = -1;
 int secondId = -1;
 int periodInDays = 0;
 foreach (var period in periods)
 {
 var Id = period.Item1;
 var dateFrom = period.Item2;
 var dateTo = period.Item3;
 for (int i = 0; i < periods.Count; i++)
 {
 if (Id != periods[i].Item1)
 {
 var tempId = periods[i].Item1;
 var tempDateFrom = periods[i].Item2;
 var tempDateTo = periods[i].Item3;
 if (tempDateFrom < dateTo && tempDateTo > dateFrom)
 {
 DateTime commonStartingDate = tempDateFrom > dateFrom ? tempDateFrom : dateFrom;
 DateTime commonEndDate = tempDateTo > dateTo ? dateTo : tempDateTo;
 var elapsedPeriod = commonEndDate - commonStartingDate;
 if ((int)elapsedPeriod.TotalDays > periodInDays)
 {
 periodInDays = (int)elapsedPeriod.TotalDays;
 firstId = Id;
 secondId = tempId;
 }
 }
 }
 }
 }
 return (firstId, secondId, periodInDays);
 }

But I think this way is not very efficient and I'm looking for ideas to optimize this logic. Also, I suspect that this is variation of some more general algorithmic problem.

200_success
146k22 gold badges190 silver badges479 bronze badges
asked Jan 25, 2018 at 23:07
\$\endgroup\$
2
  • 1
    \$\begingroup\$ Sort them by Item2 perhaps? \$\endgroup\$ Commented Jan 26, 2018 at 3:55
  • \$\begingroup\$ @vnp Yes, it seems that sorting by date can reduce the overall complexity but I would like to see concrete example. \$\endgroup\$ Commented Jan 26, 2018 at 7:53

1 Answer 1

3
\$\begingroup\$

I am not a c# programmer.

Firstly, you use a for each and a nested for loop, but both of these are iterating through the same List. For consistency (and for the next comment to make sense), I suggest that you use for loops.

for (int i = 0; i < periods.Count; i++)

You could start i not from 0. but from your current location + 1 in the outer loop. The logic behind this is that you have already compared the pairs of periods previous to this - no need to double handle. Hence why using similar for loops will make this easier - you will be counting where you are in the List in the outer loop.

Making this small change means that if (Id != periods[i].Item1 is no longer needed - reducing one nesting level in the code.

As a 'meaningful name thing', I would call commonStartingDate and commonEndDate overlapStart and overlapEnd respectively - this is more descriptive of what you mean. overlapStart = max(period1.Start, period2.Start) overlapEnd = min(period1.End,period2.End)

Error checking: check that Item2 is always less than Item3 (and fix if not), otherwise your logic will be broken.

You could simplify var elapsedPeriod = commonEndDate - commonStartingDate and the following code by using:

int elapsedPeriod = (int)(commonEndDate - commonStartingDate).TotalDays 

May not look simpler above, but in the next block:

 if (elapsedPeriod > periodInDays)
 {
 periodInDays = elapsedPeriod;
 firstId = Id;
 secondId = tempId;

You don't use elapsedPeriod for anything else.

The main logic is good - I remember having to solve this problem 25 years ago when developing an accommodation booking database.

answered Jan 26, 2018 at 4:45
\$\endgroup\$
7
  • \$\begingroup\$ Thanks for the review. I will most certainly keep in mind your recommendations when I refactor my code. The reason why I won't accept it just yet is because I am also looking for performance optimization suggestions. The nested for-loops makes the complexity too big and I feel there must be a better way. \$\endgroup\$ Commented Jan 26, 2018 at 7:50
  • \$\begingroup\$ @Leron: A double for is not excessively complex for the posed problem. You're already nesting a for in a foreach, so I don't quite see how this answer would be more complex than your solution. Also, just to make sure in case that's what you meant with your comment, code complexity and application performance are not related to each other. Simple code can have a slow performance, complex code can run very efficiently. \$\endgroup\$ Commented Jan 26, 2018 at 8:22
  • \$\begingroup\$ @Flater You are correct, and maybe my comment wasn't clear enough. I don't mean that AJD answer has some additional performance issue. As you pointed out, both mine idea, and what AJD suggest is relaying on nested for loops which makes the complexity n * n and even though I am not too much into algorithms I know that in general this is not that great and usually there's a way to optimize such code. As vnp commented in the original post, ordering by dates may be the key to reduce the complexity. From what I am trying right now, it seems it's possible to have only one for loop. \$\endgroup\$ Commented Jan 26, 2018 at 9:27
  • \$\begingroup\$ And for this choice to use foreach and then for I will most certainly follow the recommendations of AJD and refactor my code. But essentially in terms of faster execution changing the foreach with for won't help a lot. As he mentioned I will be able to avoid the if check but that's about it. \$\endgroup\$ Commented Jan 26, 2018 at 9:30
  • \$\begingroup\$ @Leron: I'm not convinced you can compare every element to every other element with a single for loop (unless you mean to obfuscate the second loop, e.g. LINQ or convoluted preprocessing of data). The first loop defines the left operand for comparison, the second loop defines the right operand for comparison. A single loop can't define both operands at the same time. You're not comparing the items themselves (which could be simplified by ordering and keeping track of the highest/lowest encountered value), but rather the overlap between two items, which must be calculated first. \$\endgroup\$ Commented Jan 26, 2018 at 9:39

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.