This is my first time writing a generic method. It takes a List<T> and an int value and returns a List<List<T>>, where each member List<T> holds at most the number of items given by the int input.
No doubt there are easier, less verbose ways of handling this via LINQ, but I wanted to try my hand at a generic method. Please let me know of any ways I could improve it.
static List<List<T>> SplitIntoChunks<T>(List<T> fullBatch, int chunkSize)
{
    if (chunkSize <= 0)
    {
        throw new ArgumentOutOfRangeException("Chunk size cannot be less than or equal to zero.");
    }
    if (fullBatch == null)
    {
        throw new ArgumentNullException("Input to be split cannot be null.");
    }
    int numOfChunks = fullBatch.Count / chunkSize;
    // Handles an uneven number of items in the full batch so that none at the end are missed.
    if (fullBatch.Count % chunkSize > 0)
    {
        numOfChunks++;
    }
    // Index of the next item to copy from fullBatch; it carries on across chunks.
    int cellCounter = 0;
    List<List<T>> splitChunks = new List<List<T>>();
    for (int chunkNum = 0; chunkNum < numOfChunks; chunkNum++)
    {
        var chunk = new List<T>();
        for (int index = 0; index < chunkSize; index++)
        {
            if (cellCounter < fullBatch.Count)
            {
                chunk.Add(fullBatch[cellCounter]);
                cellCounter++;
            }
        }
        splitChunks.Add(chunk);
    }
    return splitChunks;
}
2 Answers
Your implementation is not bad for a non-LINQ solution, but there's always room for improvement. First, here is a LINQ solution that returns a chunked list cleanly:
public static List<List<T>> Split<T>(List<T> collection, int size)
{
    // Skip, Take and ToList require: using System.Linq;
    var chunks = new List<List<T>>();
    var chunkCount = collection.Count / size;
    if (collection.Count % size > 0)
        chunkCount++;
    for (var i = 0; i < chunkCount; i++)
        chunks.Add(collection.Skip(i * size).Take(size).ToList());
    return chunks;
}
Basically it comes down to this:
- calculate the number of chunks that are needed
- loop over that number of chunks
- use the Enumerable.Skip and Enumerable.Take methods to get chunks
- return the list of chunks
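As a hedged aside to the steps above: on .NET 6 or later, the framework's own Enumerable.Chunk covers the same use case, so the chunk count does not have to be calculated by hand. The sketch below assumes .NET 6+; the class name ChunkSketches and the method name SplitWithChunk are hypothetical and only there for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkSketches
{
    // Assumption: targets .NET 6 or later, where Enumerable.Chunk exists.
    // Chunk yields arrays (T[]), so each one is copied into a List<T>
    // to keep the same return type as the Split method above.
    public static List<List<T>> SplitWithChunk<T>(IEnumerable<T> collection, int size)
    {
        return collection
            .Chunk(size)                      // throws if collection is null or size < 1
            .Select(chunk => chunk.ToList())  // T[] -> List<T>
            .ToList();
    }
}
```

Calling ChunkSketches.SplitWithChunk(new[] { 1, 2, 3, 4, 5 }, 2) should give three lists: { 1, 2 }, { 3, 4 } and { 5 }.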
Now, you implemented a non-LINQ solution, so I created one myself too. My implementation doesn't have to calculate the number of chunks or use two loops to create the list of chunks:
public static List<List<T>> SplitNoLinq<T>(List<T> collection, int size)
{
    var chunks = new List<List<T>>();
    var count = 0;
    var temp = new List<T>();
    foreach (var element in collection)
    {
        // The current chunk is full: store it and start a new one.
        if (count++ == size)
        {
            chunks.Add(temp);
            temp = new List<T>();
            count = 1;
        }
        temp.Add(element);
    }
    // Add the last (possibly partial) chunk; skip it when the input was empty.
    if (temp.Count > 0)
        chunks.Add(temp);
    return chunks;
}
The code iterates over the collection and keeps a counter, adding each item to a temporary list. When the counter reaches the desired chunk length, the full temporary list is added to the return list and a new temporary list is started. At the end, the last chunk is added.
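The same idea can also be sketched without a separate counter, by checking the temporary list's own Count after each Add. This is only an illustrative variant, not the code from the answer: the names CounterFreeSketch and SplitByListCount are hypothetical, and the argument checks are an added assumption.

```csharp
using System;
using System.Collections.Generic;

public static class CounterFreeSketch
{
    public static List<List<T>> SplitByListCount<T>(IEnumerable<T> collection, int size)
    {
        // Assumed guard clauses; the answer's version does not validate its input.
        if (collection == null) throw new ArgumentNullException(nameof(collection));
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size), size, "Size must be positive.");

        var chunks = new List<List<T>>();
        var temp = new List<T>();
        foreach (var element in collection)
        {
            temp.Add(element);
            // The temporary list doubles as the counter.
            if (temp.Count == size)
            {
                chunks.Add(temp);
                temp = new List<T>();
            }
        }
        // Only a non-empty remainder becomes the last chunk.
        if (temp.Count > 0)
        {
            chunks.Add(temp);
        }
        return chunks;
    }
}
```

With the input { 1, 2, 3, 4, 5 } and a size of 2, the loop stores { 1, 2 } and { 3, 4 } as it goes and adds the remainder { 5 } after the loop.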
The var keyword:
From the C# Programming Guide:
The var keyword can also be useful when the specific type of the variable is tedious to type on the keyboard, or is obvious, or does not add to the readability of the code.
So lines like:
List<List<T>> splitChunks = new List<List<T>>();
would become:
var splitChunks = new List<List<T>>();
Furthermore, you could place the code in an extension method that takes an IEnumerable<T> instead of a List<T>:
public static class Extensions
{
    public static List<List<T>> Split<T>(this IEnumerable<T> collection, int size)
    {
        var chunks = new List<List<T>>();
        var count = 0;
        var temp = new List<T>();
        foreach (var element in collection)
        {
            // The current chunk is full: store it and start a new one.
            if (count++ == size)
            {
                chunks.Add(temp);
                temp = new List<T>();
                count = 1;
            }
            temp.Add(element);
        }
        // Add the last (possibly partial) chunk; skip it when the input was empty.
        if (temp.Count > 0)
            chunks.Add(temp);
        return chunks;
    }
}
// Usage:
var numbers = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 0 };
var chunked = numbers.Split(5);
- if you're going to make the parameter IEnumerable<T>, might as well make the return type IEnumerable<IEnumerable<T>> too. Probably an opportunity for lazy evaluation with yield return as well. – Jesse C. Slicer, May 8, 2015 at 19:29
- Excellent suggestions, thank you. I'll keep this question open for a while to see what others come up with. – Michael McGriff, May 8, 2015 at 19:30
Here's a version using techniques I referenced in my comment on this answer:
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> fullBatch, int chunkSize)
{
    if (chunkSize <= 0)
    {
        throw new ArgumentOutOfRangeException(
            "chunkSize",
            chunkSize,
            "Chunk size cannot be less than or equal to zero.");
    }
    if (fullBatch == null)
    {
        throw new ArgumentNullException("fullBatch", "Input to be split cannot be null.");
    }
    var cellCounter = 0;
    var chunk = new List<T>(chunkSize);
    foreach (var element in fullBatch)
    {
        // The current chunk is full: hand it to the caller and start a new one.
        if (cellCounter++ == chunkSize)
        {
            yield return chunk;
            chunk = new List<T>(chunkSize);
            cellCounter = 1;
        }
        chunk.Add(element);
    }
    // Yield the last (possibly partial) chunk; skip it when the input was empty.
    if (chunk.Count > 0)
    {
        yield return chunk;
    }
}
Note I'm doing the following:
- Pre-allocating list size to be the chunk size (i.e. minimizes re-allocations while adding to the list)
- Using the "state machine" of yield return so that the evaluation is lazy (can be effectively used in LINQ)
- Extension method on IEnumerable<T> so that it plays nicely with LINQ
- Use the proper overloads on the exception constructors as to provide all the pertinent information
- And because it should be mentioned: stackoverflow.com/questions/30176121/… – Jesse C. Slicer, May 12, 2015 at 21:11
- Not the shortest source code, but I love how nicely optimized it is. I assume Skip() can be pretty slow with lazy evaluation and an expensive element getter; this one offers a reasonable balance between memory usage and speed. BTW, sometimes LINQ is faster than eager evaluation, especially with huge amounts of data. – Harry, Apr 2, 2020 at 14:19