This is my Result
class:
public class Result
{
public DateTime Checked { get; set; }
public bool IsWorking { get; set; }
}
Let's say I have theese results in list:
Date: ..., IsWorking = true;
Date: ..., IsWorking = true;
Date: ..., IsWorking = false;
Date: ..., IsWorking = false;
Date: ..., IsWorking = true;
Date: ..., IsWorking = false;
Date: ..., IsWorking = true;
Date: ..., IsWorking = true;
I want to split them as following:
// group 1
Date: ..., IsWorking = true;
Date: ..., IsWorking = true;
// group 2
Date: ..., IsWorking = false;
Date: ..., IsWorking = false;
// group 3
Date: ..., IsWorking = true;
// group 4
Date: ..., IsWorking = false;
// group 5
Date: ..., IsWorking = true;
Date: ..., IsWorking = true;
I've managed something like this:
public static List<List<T>> SplitByEqualProperty<T>(this IEnumerable<T> inputs, string property)
{
List<List<T>> temp = new List<List<T>>();
temp.Add(new List<T>());
object previousSelector = null;
for (int i = 0; i < inputs.Count(); i++)
{
var current = inputs.ElementAt(i);
Type t = current.GetType();
PropertyInfo prop = t.GetProperty(property);
object currentSelector = prop.GetValue(current);
if (previousSelector == null)
{
temp.LastOrDefault().Add(current);
}
else
{
if (currentSelector.Equals(previousSelector))
{
temp.LastOrDefault().Add(current);
}
else
{
temp.Add(new List<T>() { current });
}
}
previousSelector = currentSelector;
}
return temp;
}
And this is working, but I don't like the code. Expecially part with string property
. How can I improve that?
2 Answers 2
You've tagged this linq so you're clearly aware of Linq, but this is not a Linq-like approach. The closest approximation in Linq is GroupBy
, so that can serve as a model:
- Instead of
string property
takeFunc<TSource, TKey>
. - Instead of
List<List<T>>
(which, apart from anything else, violates the principle of coding to the interface instead of the implementation) returnIEnumerable<IGrouping<TKey, TSource>>
(or, if you don't care about the value itself,IEnumerable<IEnumerable<TSource>>
).
Also, in Linq style, favour lazy implementation with yield return
.
I would advise you to rewrite from scratch with those principles in mind.
But I must address one major problem in this code, lest you repeat it in the rewrite:
for (int i = 0; i < inputs.Count(); i++) { var current = inputs.ElementAt(i);
This is absolutely the wrong way to iterate over an IEnumerable
. The right way is
foreach (var current in inputs)
{
Otherwise if inputs
is lazy you do far more work than necessary, and potentially execute side-effects more times than you might expect.
Also,
if (previousSelector == null)
is problematic. What if this method is used with objects which actually have null
as a value of the relevant property? I think the best approach is to use the current IGrouping
: either initialise that to null
and test whether it's null, or initialise it to a grouping with key default(T)
and replace the special case with a special case which doesn't return an empty grouping.
Here's a Linq
approach that allows you to perform an analytical function on a collection, given a order by clause, and a predicate to determine whether the current item and the previous item (analytical lag) are in the same group. It can be extended to also take into account a partition, but that's out of scope to serve your purpose.
This is a more generalized approach for the OP's SplitByEqualProperty method. I have augmented the problem into not just equal property check, but any kind of property check. Because of this generalization, I opt to use IEnumerable over IGrouping.
Usage
using System;
using System.Linq;
using System.Text;
using System.Collections.Generic;
using System.Globalization;
public class Program
{
public static void Main()
{
var results = new List<Result>
{
Create("JAN", true), Create("FEB", true),
Create("MAR", false),
Create("APR", true),
Create("MAY", false), Create("JUN", false), Create("JUL", false),
Create("AUG", true),
Create("SEP", true), Create("OCT", true),
Create("NOV", false), Create("DEC", false),
};
var grouped = results.JoinBy(
x => x.Checked,
x => x.IsWorking,
(previous, current) => previous == current);
}
internal static Result Create(string month, bool isWorking) {
return new Result {
Checked = DateTime.ParseExact("2019" + month + "01", "yyyyMMMdd", CultureInfo.InvariantCulture),
IsWorking = isWorking
};
}
public class Result
{
public DateTime Checked { get; set; }
public bool IsWorking { get; set; }
}
}
Linq
public static class LinqExtension
{
public static IEnumerable<IEnumerable<TSource>> JoinBy<TSource, TOrderKey, TKey>(
this IEnumerable<TSource> source,
Func<TSource, TOrderKey> orderBy,
Func<TSource, TKey> keySelector,
Func<TKey, TKey, bool> join) {
var results = new List<List<TSource>>();
var orderedSource = new List<TSource>(source).OrderBy(orderBy).ToArray();
if (orderedSource.Length > 0) {
var group = new List<TSource> { orderedSource[0] };
results.Add(group);
if (orderedSource.Length > 1) {
for (int i = 1; i < orderedSource.Length; i++) {
var lag = orderedSource[i - 1];
var current = orderedSource[i];
if (join(keySelector(lag), keySelector(current))) {
group.Add(current);
}
else {
group = new List<TSource> { current };
results.Add(group);
}
}
}
}
return results;
}
}
-
\$\begingroup\$ Since this is about grouping, I think it would be better if the return value of
JoinBy
wasIEnumerable<IGrouping<TKey, TSource>>
. Oh, one more thing, you have presented an alternative version without mentioning anything in particular about OP's code... On Code Review this is usually required... \$\endgroup\$t3chb0t– t3chb0t2019年05月18日 12:29:50 +00:00Commented May 18, 2019 at 12:29 -
1\$\begingroup\$ @t3ch0t Thanks for the tip about community guidelines. About the Grouping vs Enumerable dilemma: my code isn't necessarely a grouped-by-key collection, instead the 'join' clause can also group on any function between comparing two adjacent keys. \$\endgroup\$dfhwze– dfhwze2019年05月18日 12:36:45 +00:00Commented May 18, 2019 at 12:36