I'm working through Sedgewick's Algorithms book for fun and educational purposes. I am implementing a simple Stack<T>
, using System.Collections.Generic.Stack<T>
as a guide:
[System.Diagnostics.DebuggerDisplay("Count = {Count}, Size = {_values.Length}")]
public class Stack<T>
{
// Constructors
public Stack(int capacity)
{
_values = new T[capacity];
_count = 0;
}
public Stack() :
this(DEFAULT_SIZE)
{
}
// Static variables
private static readonly int DEFAULT_SIZE = 4;
private static readonly float DEFAULT_TRIM_THRESHOLD = 0.90f;
// Variables
private T[] _values;
private int _count;
// Properties
public int Count
{
get { return _count; }
}
// Methods
public void Push(T value)
{
if (_count == _values.Length)
Array.Resize(ref _values, _values.Length << 1);
_values[_count++] = value;
}
public T Pop()
{
T result = _values[--_count];
_values[_count] = default(T);
return result;
}
public T Peek()
{
return _values[_count - 1];
}
public void Clear()
{
for (int i = 0; i < _count; i++)
_values[i] = default(T);
_count = 0;
}
public void Trim()
{
if (_count / (float)_values.Length < DEFAULT_TRIM_THRESHOLD)
Array.Resize(ref _values, Math.Max(_count, DEFAULT_SIZE));
}
}
A few things:
There were
#region
and<summary>
tags on everything, but I removed them to post here, since it seemed like it added a lot of visual noise and made the code otherwise harder to read.The class does not currently implement
IEnumerable<T>
orIEnumerable
, but will.I don't want to implement
ICollection<T>
, since I think Add() and Remove() would be inappropriate methods for this class. Doing something like hiding them behind an explicit interface implementation would be misrepresenting the class's intent.Performance. Premature optimization should be avoided obviously, but I would like the classes I implement from the book to be useful in future projects. It doesn't need to outperform the default implementation, but it should be comparable.
I compared my
Push()
method against the default implementation by adding 1,000,000 items to the stack. It takes ~68,000 ticks for mine, and ~25,000 ticks for the default. Even if I stripPush()
down to just the array assignment, using compile on Release with optimized code enabled, the default is still running at twice the speed mine is, and I have no idea why.
Any feedback or insight is appreciated!
Edit: As requested, here is the code I am using to benchmark the speed. This is what I usually do to get a sense of which of two things is faster; not sure how hackish it's considered:
int n = 10000000;
long r1, r2;
var stack1 = new Stack<int>();
{
Stopwatch w = new Stopwatch();
w.Start();
{
for (int i = 0; i < n; i++)
stack1.Push(i);
}
w.Stop();
r1 = w.ElapsedTicks;
}
var stack2 = new System.Collections.Generic.Stack<int>();
{
Stopwatch w = new Stopwatch();
w.Start();
{
for (int i = 0; i < n; i++)
stack2.Push(i);
}
w.Stop();
r2 = w.ElapsedTicks;
}
As a baseline, it takes r2 = ~37,015 ticks
for the default implementation to add 1,000,000 elements.
r1 = ~76,892 ticks
(compile to debug, w/ resizing)r1 = ~66,638 ticks
(compile to debug, no resize, keep resize logic)r1 = ~57,357 ticks
(compile to debug, no resize, resize logic omitted)r1 = ~51,420 ticks
(compile to release, w/ resizing)r1 = ~40,909 ticks
(compile to release, no resize, keep resize logic)r1 = ~33,821 ticks
(compile to release, no resize, resize logic omitted)
So, looking at the different results, it seems that I can achieve the same speed as the default implementation, but at the cost of making the class brittle and inflexible.
1 Answer 1
Your code looks very good, except for one bug in the constructor:
public Stack(int capacity)
You should really do a bounds check on capacity
. It should not be allowed to be negative, of course.
But also, if capacity
is zero, you will get an exception when you call Push
. It will try to resize to count * 2, which is zero.
_values.Length << 1
Code should be clear. If you mean to multiply by 2, say * 2
, not << 1
I would really implement a GetEnumerator()
method, so you can iterate over the Stack items, even if you're not going to implement any collection interface. Implementing IEnumerable<T>
is handy if you want to use LINQ.
-
\$\begingroup\$ I see that the default implementation throws an
ArgumentOutOfRangeException
if capacity is negative, so I suppose I'll do that too, as well as fixing the resize logic. Using the debugger, I see that they internally use a static array of size 0 - would you happen to know why, or is that a question better suited for SO? Also, doesICollection<T>
have any "specialness" attached to it likeIEnumerable<T>
does (LINQ, foreach)? I was going to make my ownICustomCollection<T>
interface, which has all the same functionality, but without the Add() and Remove() methods. \$\endgroup\$Kyle Baran– Kyle Baran2016年04月17日 05:45:56 +00:00Commented Apr 17, 2016 at 5:45 -
1\$\begingroup\$ They use a static array of size 0 to avoid having to create a new one every time a Stack is initiated with size 0. The only thing special about
ICollection<T>
(andICollection
) is that some LINQ-methods, likeCount()
check if the source implementsICollection
. If it does, it means they can simply access the Count property instead of having to enumerate the entire collection. I think it's pretty common to implementICollection<T>
and use explicit implementation for the methods you don't need. \$\endgroup\$Dennis_E– Dennis_E2016年04月17日 09:47:04 +00:00Commented Apr 17, 2016 at 9:47
Stack<T>
is also implemented using an array. ALinkedList<T>
will create aLinkedListNode<T>
object for every entry. \$\endgroup\$