I am developing a statistics library in .NET. I have developed a Set<T> data structure. Now I want to derive a data structure named DescriptiveStatisticalSet<T>, and I want this set to operate only on integer and double types.
Problem:
I want to implement a generic type that is able to work with integer and double data types only.
Say, I have the following interfaces and classes:
public interface IIntegerDataType
{
    int Data { get; set; }
    int Add(int other);
}
public interface IDoubleDataType
{
    double Data { get; set; }
    double Add(double other);
}
public class IntegerDataType : IIntegerDataType
{
    public int Data { get; set; }
    public int Add(int other)
    {
        return Data + other;
    }
}
public class DoubleDataType : IDoubleDataType
{
    public double Data { get; set; }
    public double Add(double other)
    {
        return Data + other;
    }
}
Is it possible to create a generic type DataType<T> so that both (and only) IntegerDataType and DoubleDataType could be accessed through that generic type?
Solution:
I have devised the following solution:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace DataTypeNamespace
{
public interface IDataType
{
object Add(object other);
void SetVal(object other);
}
public interface IDataType<T> where T : IDataType, new()
{
T Add(T other);
}
class IntegerDataType : IDataType
{
public int Data { get; set; }
public object Add(object other)
{
int o = ((IntegerDataType)other).Data;
return Data + o;
}
public void SetVal(object other)
{
Data = (int)other;
}
}
class DoubleDataType : IDataType
{
public double Data { get; set; }
public object Add(object other)
{
double o = ((DoubleDataType)other).Data;
return Data + o;
}
public void SetVal(object other)
{
Data = (double)other;
}
}
public class DataType<T> : IDataType<T> where T : IDataType, new()//https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/new-constraint
{
IDataType _item;
public DataType(IDataType item)
{
_item = item;
}
public T Add(T other)
{
object o = _item.Add(other);
T t = new T();
t.SetVal(o);
return t;
}
}
public class MainClass
{
public static void Main(string[] args)
{
//IntegerDataType item1 = new IntegerDataType();
//item1.Data = 10;
//IntegerDataType item2 = new IntegerDataType();
//item2.Data = 20;
//DataType<IntegerDataType> l1 = new DataType<IntegerDataType>(item1);
//IntegerDataType sum1 = l1.Add(item2);
DoubleDataType item3 = new DoubleDataType();
item3.Data = 10.5;
DoubleDataType item4 = new DoubleDataType();
item4.Data = 20.5;
DataType<DoubleDataType> l2 = new DataType<DoubleDataType>(item3);
DoubleDataType sum2 = l2.Add(item4);
}
}
}
Can someone review this, or help me improve it?
2 Answers
I may be wrong, but I think you're trying to solve an XY problem. If I understand the term "DescriptiveStatisticalSet" correctly, you want to provide a data set class that exposes some significant statistical properties, and therefore the generic type parameter of the set must be a numeric type of some kind (int, double, ??). To achieve that, you are trying to develop dedicated int and double data types with which you can constrain the type parameter of DescriptiveStatisticalSet<T>. I think you'll tire of that design rather quickly, because you'll have to convert to and from this "intermediate" type whenever you want to use the set.
I think I would go in another direction and only allow the creation of DescriptiveStatisticalSet<T> for certain data types, which can be done in the following way:
// For this illustration, I've just implemented the Set<T> as a generic list sub class. Yours is surely more sophisticated.
public class Set<T> : List<T>
{
public Set()
{
}
public Set(IEnumerable<T> data) : base(data)
{
}
}
public abstract class DescriptiveStatisticalSet<T> : Set<T>
{
protected DescriptiveStatisticalSet(IEnumerable<T> data) : base(data)
{
}
public virtual T Average => default;
public virtual T Median => default;
public virtual T StdDev => default;
// TODO public other key values...
}
public static class DescriptiveStatisticalSet
{
private class IntDescriptiveStatisticalSet : DescriptiveStatisticalSet<int>
{
public IntDescriptiveStatisticalSet(IEnumerable<int> data) : base(data)
{
}
public override int Median
{
get
{
var ordered = this.OrderBy(v => v).ToArray();
// Even count: average the two middle elements (zero-based indexes Count / 2 - 1 and Count / 2).
if (Count % 2 == 0) return (ordered[Count / 2 - 1] + ordered[Count / 2]) / 2;
return ordered[Count / 2];
}
}
}
private class DoubleDescriptiveStatisticalSet : DescriptiveStatisticalSet<double>
{
public DoubleDescriptiveStatisticalSet(IEnumerable<double> data) : base(data)
{
}
public override double Median
{
get
{
var ordered = this.OrderBy(v => v).ToArray();
// Even count: average the two middle elements (zero-based indexes Count / 2 - 1 and Count / 2).
if (Count % 2 == 0) return (ordered[Count / 2 - 1] + ordered[Count / 2]) / 2.0;
return ordered[Count / 2];
}
}
}
public static DescriptiveStatisticalSet<int> Create(IEnumerable<int> data)
{
return new IntDescriptiveStatisticalSet(data);
}
public static DescriptiveStatisticalSet<double> Create(IEnumerable<double> data)
{
return new DoubleDescriptiveStatisticalSet(data);
}
}
Used as:
var doubleStatSet = DescriptiveStatisticalSet.Create(new[] { 1.2, 3.4, 5.6 });
Console.WriteLine(doubleStatSet.GetType().Name);
Console.WriteLine(doubleStatSet.Median);
var intStatSet = DescriptiveStatisticalSet.Create(new[] { 1, 2, 3, 4, 5 });
Console.WriteLine(intStatSet.GetType().Name);
Console.WriteLine(intStatSet.Median);
var decimalStatSet = DescriptiveStatisticalSet.Create(new[] { 1.2m, 3.4m, 5.6m }); // ERROR: won't compile, there is no Create overload for decimal
A simpler construct that builds on the same principles is to always operate on double in DescriptiveStatisticalSet and then only provide two constructors: one that takes a double data set and another that takes an int data set:
public class DescriptiveStatisticalSet : Set<double>
{
public DescriptiveStatisticalSet(IEnumerable<double> data) : base(data)
{
}
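// Select with an explicit conversion; Cast<double>() would throw InvalidCastException when unboxing an int as a double.
public DescriptiveStatisticalSet(IEnumerable<int> data) : this(data.Select(v => (double)v))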
{
}
public double Average => ((IEnumerable<double>)this).Average();
public double Median => default;
public double StdDev => default;
// TODO public other key values...
}
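As a rough usage sketch of this simpler variant (the thin Set<T> stand-in, the SimpleSetDemo class and the sample values are assumptions made for illustration): the int constructor converts element by element with Select, because Cast<double>() on a sequence of int would attempt to unbox each int directly as a double and fail at run time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the real Set<T>; assumed here to be a thin List<T> wrapper.
public class Set<T> : List<T>
{
    public Set(IEnumerable<T> data) : base(data) { }
}

public class DescriptiveStatisticalSet : Set<double>
{
    public DescriptiveStatisticalSet(IEnumerable<double> data) : base(data) { }

    // Convert each int explicitly; a numeric cast int -> double is always valid.
    public DescriptiveStatisticalSet(IEnumerable<int> data)
        : this(data.Select(v => (double)v)) { }

    // The cast to IEnumerable<double> avoids binding to this property itself.
    public double Average => ((IEnumerable<double>)this).Average();
}

// Hypothetical demo class, not part of the answer's code.
public static class SimpleSetDemo
{
    public static void Run()
    {
        var fromInts = new DescriptiveStatisticalSet(new[] { 1, 2, 3, 4 });
        var fromDoubles = new DescriptiveStatisticalSet(new[] { 1.5, 2.5 });
        Console.WriteLine(fromInts.Average);    // mean of 1, 2, 3, 4
        Console.WriteLine(fromDoubles.Average); // mean of 1.5, 2.5
    }
}
```

Callers never see an intermediate wrapper type here: they pass plain int or double sequences and always read double results.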
As has been said in the comments, it's unclear why you feel you need to restrict the types used. There's an interesting discussion around constraining types here, which includes a link to example code.
So, there are a couple of questions. Why do you need to restrict it to double / integer types? Is what you really want to restrict it to types that support certain operations? What is the actual downside to using it for Decimal / float?
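If the real requirement turns out to be "any type that supports arithmetic", one possible route on .NET 7 or later is the generic math interface System.Numerics.INumber<T>. The following is only a sketch under that assumption; NumericStats and Mean are hypothetical names, not part of the code under review. Note that it deliberately accepts decimal and float too, so it does not limit callers to int and double.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical helper: constrains T by the operations it supports
// rather than by an explicit list of allowed types.
public static class NumericStats
{
    public static T Mean<T>(IEnumerable<T> values) where T : INumber<T>
    {
        T sum = T.Zero;
        int count = 0;
        foreach (T value in values)
        {
            sum += value;
            count++;
        }
        if (count == 0)
        {
            throw new InvalidOperationException("Sequence contains no elements.");
        }
        // For integer types this is integer division, as with plain ints.
        return sum / T.CreateChecked(count);
    }
}
```

With this shape, NumericStats.Mean(new[] { 1.0, 2.0, 4.5 }) and NumericStats.Mean(new[] { 1, 2, 4 }) both compile without any wrapper types, while a call with a string sequence is rejected at compile time.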
You've presented an example of IDataType, but suggested that your real goal is to put these items into a set. This raises the question: are you planning on creating DescriptiveStatisticalSet<IntegerDataType> or DescriptiveStatisticalSet<IDataType>? If it's IntegerDataType, what are you gaining over Int32, for example? If it's IDataType, are you expecting to have both IntegerDataType and DoubleDataType present in the same set? If so, what are you planning on doing if an Integer has the same value as a Double? Do you keep both because they are different types, or keep whichever one was there first, or keep the Double because it's more flexible, or the Integer because it's faster?
Looking at your actual code, there's some aspects which are worth mentioning.
object Add(object other);
If we look at IntegerDataType, this method takes in an other of type IntegerDataType, but returns an int. This is not at all obvious from the method signature. You are working around this to an extent in your DataType class, which does some recasting by creating a new instance of the returned object, but this seems unnecessarily complex, particularly since the implementation of Add relies on casting other to the correct type. Calling Add with a DoubleDataType, for example, throws a cast exception.
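The failure mode can be reproduced with a condensed copy of the classes from the question (expression-bodied here for brevity; CastFailureDemo is a hypothetical name added for this sketch). Because the parameter type is object, the compiler accepts the mismatched call and the problem only shows up at run time.

```csharp
using System;

public interface IDataType
{
    object Add(object other);
    void SetVal(object other);
}

public class IntegerDataType : IDataType
{
    public int Data { get; set; }
    public object Add(object other) => Data + ((IntegerDataType)other).Data;
    public void SetVal(object other) => Data = (int)other;
}

public class DoubleDataType : IDataType
{
    public double Data { get; set; }
    public object Add(object other) => Data + ((DoubleDataType)other).Data;
    public void SetVal(object other) => Data = (double)other;
}

// Hypothetical demo class, not part of the question's code.
public static class CastFailureDemo
{
    public static bool AddMixedThrows()
    {
        IDataType left = new IntegerDataType { Data = 10 };
        IDataType right = new DoubleDataType { Data = 20.5 };
        try
        {
            left.Add(right); // compiles, because the parameter is just object
            return false;
        }
        catch (InvalidCastException)
        {
            return true; // the type mismatch only surfaces at run time
        }
    }
}
```

A strongly typed signature would turn this run-time exception into a compile-time error.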
public int Data { get; set; }
Both your Integer and Double data types declare a Data property that has both a public getter and a public setter. Since you're implementing an interface that provides SetVal, does it really make sense to declare this public setter? It feels like it would be better for the set to be private, in order to encourage the client to use the interface method, in case you decide you need to do additional checking in the future.
Your DataType class seems to exist purely to create a new item from the object version of Add. This seems to add unnecessary complexity over just having the initial Add return the correct datatype. So, it's a bit unclear to me what advantage your current approach has over simply doing this:
public interface IOtherDataType<T>
{
T Data { get;}
IOtherDataType<T> Add(IOtherDataType<T> other);
void SetVal(T other);
}
public class IntType : IOtherDataType<int>
{
public IntType(int data)
{
Data = data;
}
public int Data { get; private set; }
public IOtherDataType<int> Add(IOtherDataType<int> other)
{
return new IntType(Data + other.Data);
}
public void SetVal(int other)
{
Data = other;
}
}
Which can be used in a similar way...
IntType five = new IntType(5);
IOtherDataType<int> eleven = five.Add(new IntType(6));
[…] object other) introduces a lot of unnecessary memory pressure (allocating a value type on the heap). Generics would be a much more preferable approach.