I needed a method to randomly sample an IEnumerable collection without replacement. I'm using this for writing behavioral acceptance tests.
For example, in the test code, I write:
GetElements("#SearchResults > li").Sample(3);
Here's how I implemented it:
I welcome any feedback or optimizations!
/// <summary>
/// returns a random sample of the elements in an IEnumerable
/// </summary>
public static IEnumerable<T> Sample<T>(this IEnumerable<T> population, int sampleSize)
{
    List<T> localPopulation = population.ToList();
    // Fewer elements than requested: return the whole population.
    if (localPopulation.Count < sampleSize) return localPopulation;
    List<T> sample = new List<T>(sampleSize);
    Random random = new Random();
    while (sample.Count < sampleSize)
    {
        // Draw one element and remove it so it cannot be picked again.
        int i = random.Next(0, localPopulation.Count);
        sample.Add(localPopulation[i]);
        localPopulation.RemoveAt(i);
    }
    return sample;
}
1 Answer
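Make the Random generator a class-level member so you're not creating a new one with the default seed on every call. On the .NET Framework the default seed comes from the system clock, so generators created in quick succession can produce the same sequence. Also, check your population for null: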
/// <summary>
/// random number generator for the enumerable sampler.
/// </summary>
private static readonly Random random = new Random();

/// <summary>
/// returns a random sample of the elements in an IEnumerable
/// </summary>
public static IEnumerable<T> Sample<T>(this IEnumerable<T> population, int sampleSize)
{
    if (population == null)
    {
        return null;
    }
    List<T> localPopulation = population.ToList();
    // Fewer elements than requested: return the whole population.
    if (localPopulation.Count < sampleSize) return localPopulation;
    List<T> sample = new List<T>(sampleSize);
    while (sample.Count < sampleSize)
    {
        // Draw one element and remove it so it cannot be picked again.
        int i = random.Next(0, localPopulation.Count);
        sample.Add(localPopulation[i]);
        localPopulation.RemoveAt(i);
    }
    return sample;
}
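Since the question asks about optimizations: RemoveAt on a List<T> shifts every element after the removed index, so each draw costs time proportional to the population size. The sketch below shows one possible alternative, a partial Fisher-Yates shuffle that swaps each chosen element to the front of an array instead of removing it. This is a different technique from the code above, not a correction of it; the names EnumerableSampling and SampleBySwapping are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class and method names, used only for this sketch.
public static class EnumerableSampling
{
    // Shared generator, following the advice in the answer.
    // Note: Random instances are not thread-safe.
    private static readonly Random random = new Random();

    public static IEnumerable<T> SampleBySwapping<T>(this IEnumerable<T> population, int sampleSize)
    {
        if (population == null)
        {
            return null;
        }

        T[] items = population.ToArray();
        // Fewer elements than requested: return the whole population.
        if (items.Length < sampleSize) return items;

        for (int i = 0; i < sampleSize; i++)
        {
            // Pick an index from the not-yet-chosen range [i, Length)
            // and swap that element into position i.
            int j = random.Next(i, items.Length);
            T chosen = items[j];
            items[j] = items[i];
            items[i] = chosen;
        }

        // The first sampleSize slots now hold the sample.
        return items.Take(sampleSize).ToArray();
    }
}
```

Each draw here is a constant-time swap, so the loop does work proportional to sampleSize rather than to sampleSize times the population size. The up-front copy of the population is still needed, as in the original.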
Just a question, but does .Count enumerate over all its internals on each call, or does it contain a local record count? Just interested in the performance of including it in the while loop rather than storing to a local etc. – dreza, May 10, 2012 at 19:17
.Count() does, but .Count does not. Subtle distinction there. .Count on a List<T> does not enumerate, while .Count() is a LINQ extension on IEnumerable<T> which will enumerate. – Jesse C. Slicer, May 10, 2012 at 19:23
@JesseC.Slicer even Count() doesn't do that, if the source is ICollection<T>. – svick, May 10, 2012 at 19:29
@svick so it does type checking every time it's called and picks the best counting method available? – Jesse C. Slicer, May 10, 2012 at 19:31
@JesseC.Slicer, exactly, see Jon Skeet's article about Count(). – svick, May 10, 2012 at 19:32
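The point made in the last few comments can be checked with a small sketch. It uses a hypothetical collection, NonEnumerableCollection, that reports a Count but throws if anything tries to enumerate it, alongside a lazy sequence that records how often it is enumerated. Both types and the CountDemo class are invented for illustration only.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection: it knows its Count but refuses to be enumerated.
public sealed class NonEnumerableCollection : ICollection<int>
{
    public int Count { get { return 3; } }
    public bool IsReadOnly { get { return true; } }
    public void Add(int item) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public bool Contains(int item) { return false; }
    public void CopyTo(int[] array, int arrayIndex) { throw new NotSupportedException(); }
    public bool Remove(int item) { throw new NotSupportedException(); }
    public IEnumerator<int> GetEnumerator() { throw new InvalidOperationException("enumerated"); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class CountDemo
{
    // Incremented each time the lazy sequence below starts enumerating.
    public static int Enumerations;

    // A lazy sequence that is not an ICollection<T>.
    public static IEnumerable<int> Lazy()
    {
        Enumerations++;
        yield return 1;
        yield return 2;
        yield return 3;
    }

    public static void Run()
    {
        // Count() sees ICollection<int> and reads the Count property,
        // so GetEnumerator is never called and nothing throws.
        int fromCollection = new NonEnumerableCollection().Count();

        // For a plain lazy sequence, Count() has to walk every element.
        int fromLazy = Lazy().Count();

        Console.WriteLine("{0} {1} {2}", fromCollection, fromLazy, Enumerations);
    }
}
```

For the loop in the answer this makes no practical difference, since localPopulation and sample are both List<T> and the Count property is a stored value in either case.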