Matching items in different arrays and performing operations on them

Question 1

I have a data structure that uses a Dictionary<ID,TimeSeries> to hold time series for different products. TimeSeries is implemented as something holding an array of Point, which contain a DateTime and a double.

Its interface is:

TimeSeries{
 int Length{ get; }
 Point ElementAt(int index);
 DateTime StartTime { get; }
 DateTime EndTime { get; } 
}

Given a pair of elements of such data structure I need to find the first time instant where they diverge. For my definition of divergence, if a dictionary contains an element and the other doesn't I need to return the time of the first point of the TimeSeries I have.

This is the code I'm using to compute the difference and that I'd like you to review and comment:

bool TryFindFirstDivergenceTime(Dictionary<ID, TimeSeries> baseline, Dictionary<ID, TimeSeries> other, out DateTime firstDivergenceTime)
{
 firstDivergenceTime = default(DateTime);
 foreach(var baselineIDSeriesPair in baseline)
 {
 if(other.ContainsKey(baselineIDSeriesPair.Key))
 {
 firstDivergenceTime = Min(FindFirstDivergenceTime(baselineIDSeriesPair.Value, other[baselineIDSeriesPair.Key]), firstDivergenceTime);
 }
 else
 {
 firstDivergenceTime = Min(baselineIDSeriesPair.Value.Start, firstDivergenceTime);
 }
 }
 foreach(var otherIDSeriesPair in other)
 {
 if(!baseline.ContainsKey(otherIDSeriesPair.Key))
 {
 firstDivergenceTime = Min(otherIDSeriesPair.Value.Start, firstDivergenceTime);
 }
 }
 return firstDivergenceTime != default(DateTime);
}
public DateTime Min(DateTime first, DateTime second)
{
 if(first == default(DateTime))
 return second;
 if(second == default(DateTime))
 return first;
 return new DateTime(Math.Min(first.Ticks, second.Ticks));
}
DateTime FindFirstDivergenceTime(TimeSeries first, TimeSeries second)
{
 if(first.Length != second.Length)
 {
 throw new ArgumentException();
 }
 for(var i=0; i < first.Length; i++)
 {
 if(! first.ElementAt(i).Equals(second.ElementAt(i)))
 {
 return Min(first.ElementAt(i).Time, second.ElementAt(i).Time);
 }
 }
 return default(DateTime);
}

I have the feeling that I am not approaching the problem in the right way as the code contains the repetition of the logic to find the time of the first difference when I don't have the same product in both the dictionaries.

The other think that I don't particularly like is that I find myself breaking th symmetry and using for on dictionary Key and Value from its IEnumerable, while from the other I need to perform a lookup.

How would you address those issues?

Question 2

I'm having a hard time telling what this is supposed to do without Min and FindFirstDivergenceTime, so IMO they are quite relevant. Also, this doesn't look like it would compile (see duplicate StartTime definitions in TimeSeries and first use of otherIDSeriesPair in the TryFindFirstDivergenceTime method.

Question 3

I fixed the compilation issues and I added the Min and FindFirstDivergenceTime method implementations

Question 4

I'd probably try to use a first.Zip(second, (a, b) => { . . . }) of some kind instead of the foreaches. That does require the order to be the same, though.

Question 5

Before getting to your question about reducing the duplication, there's another issue I'd like to point out first. The use of default(DateTime) as a check for the absence of a value is incorrect and misleading. default(DateTime) is a valid date. However unlikely it is that you will have a date that equals that value, it is misleading to use because anyone looking at your code that sees a DateTime variable will assume that it holds a proper date value because it has to. As a struct, it can't be represented as null, so it always holds a value. Enter nullable types. DateTime? clearly expresses that the variable can be in a state where it doesn't hold a date value.

That said, it appears you're trying to join the two dictionary and compare in these ways:

When there are matching keys.
When only one of the dictionaries holds a key.

Sounds like a Full Outer Join, which someone has kindly shared a Linq-esque extension method for over on Stack Overflow. Using this, you can concisely and expressively rewrite your methods like so to avoid duplication (note the use of DateTime?):

// Note: I think this is a bad method name (it's the same as the one below,
// which I think is better suited for the name). It should probably express
// the comparison between these Dictionary maps.
public static DateTime? GetEarliestDivergenceTime(Dictionary<int, TimeSeries> baseline, Dictionary<int, TimeSeries> other, DateTime? defaultEarliestDivergenceTime = null)
{
 // Omitting 'FullOuterJoin' code in StackOverflow link:
 var earliestDivergenceTimes = baseline.FullOuterJoin(
 other,
 b => b.Key,
 o => o.Key,
 (b,o) => GetEarliestDivergenceTimeOrDefault(b, o));
 return earliestDivergenceTimes.Min() ?? defaultEarliestDivergenceTime;
}
public static DateTime? GetEarliestDivergenceTime(TimeSeries ts1, TimeSeries ts2, DateTime? defaultEarliestDivergenceTime)
{
 DateTime? earliestDivergenceTime = null;
 if(ts1 != null && ts2 != null)
 {
 // Compare.
 }
 else if(ts1 != null && ts2 == null)
 {
 // Use ts1.
 }
 else if(ts1 == null && ts2 != null)
 {
 // Use ts2.
 }
 // earliestDivergenceTime will be null if both t1 and t2 are null.
 return earliestDivergenceTime ?? defaultEarliestDivergenceTime;
}

Question 6

I agree, although as you can see in my code comment I had a hard time with method/argument names since I'm still not 100% on how this would get used. Suggestions for improvement are welcome.

Ocelot20 Ocelot20 1,89311 silver badges18 bronze badges · Accepted Answer · 2014-04-25 14:07:26Z

Before getting to your question about reducing the duplication, there's another issue I'd like to point out first. The use of default(DateTime) as a check for the absence of a value is incorrect and misleading. default(DateTime) is a valid date. However unlikely it is that you will have a date that equals that value, it is misleading to use because anyone looking at your code that sees a DateTime variable will assume that it holds a proper date value because it has to. As a struct, it can't be represented as null, so it always holds a value. Enter nullable types. DateTime? clearly expresses that the variable can be in a state where it doesn't hold a date value.

That said, it appears you're trying to join the two dictionary and compare in these ways:

When there are matching keys.
When only one of the dictionaries holds a key.

Sounds like a Full Outer Join, which someone has kindly shared a Linq-esque extension method for over on Stack Overflow. Using this, you can concisely and expressively rewrite your methods like so to avoid duplication (note the use of DateTime?):

// Note: I think this is a bad method name (it's the same as the one below,
// which I think is better suited for the name). It should probably express
// the comparison between these Dictionary maps.
public static DateTime? GetEarliestDivergenceTime(Dictionary<int, TimeSeries> baseline, Dictionary<int, TimeSeries> other, DateTime? defaultEarliestDivergenceTime = null)
{
 // Omitting 'FullOuterJoin' code in StackOverflow link:
 var earliestDivergenceTimes = baseline.FullOuterJoin(
 other,
 b => b.Key,
 o => o.Key,
 (b,o) => GetEarliestDivergenceTimeOrDefault(b, o));
 return earliestDivergenceTimes.Min() ?? defaultEarliestDivergenceTime;
}
public static DateTime? GetEarliestDivergenceTime(TimeSeries ts1, TimeSeries ts2, DateTime? defaultEarliestDivergenceTime)
{
 DateTime? earliestDivergenceTime = null;
 if(ts1 != null && ts2 != null)
 {
 // Compare.
 }
 else if(ts1 != null && ts2 == null)
 {
 // Use ts1.
 }
 else if(ts1 == null && ts2 != null)
 {
 // Use ts2.
 }
 // earliestDivergenceTime will be null if both t1 and t2 are null.
 return earliestDivergenceTime ?? defaultEarliestDivergenceTime;
}

I agree, although as you can see in my code comment I had a hard time with method/argument names since I'm still not 100% on how this would get used. Suggestions for improvement are welcome.

Stack Exchange Network

Matching items in different arrays and performing operations on them

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Matching items in different arrays and performing operations on them

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions