My class takes functions as parameters. There are 6 of them, along with numerous other parameters, and almost all of the fields have default values.
It also has a class method that creates an instance of the class with 2 of the fields initialized randomly, and a subclass that cheats a bit by making 2 of the function parameters unnecessary.
All of this involves a lot of redundant information. I have found bugs in my code caused by mistakes in the constructor calls because they were so long; indeed, I found another while writing this question.
How can I improve this?
Parent class
class NeuralNet:
    def __init__(self,
                 weights,
                 biases,
                 learning_rate=0.01,
                 momentum=0.9,  # Set to zero to not use momentum
                 post_process=lambda i: i,  # Post-process is applied to the final output only (not used when training), but is used when checking the error rate
                 topErrorFunc=squaredMean, d_topErrorFunc=dSquaredMean,
                 actFunc=sigmoid, d_actFunc=dSigmoid,
                 topActFunc=sigmoid, d_topActFunc=dSigmoid):
        self.weights = weights
        self.biases = biases
        assert len(self.biases) == len(self.weights), "Must have as many bias vectors as weight matrixes"
        self._prevDeltaWs = [np.zeros_like(w) for w in self.weights]
        self._prevDeltaBs = [np.zeros_like(b) for b in self.biases]
        self.learning_rate = learning_rate
        self.momentum = momentum
        self.post_process = post_process
        self.topErrorFunc = topErrorFunc
        self.d_topErrorFunc = d_topErrorFunc
        self.actFunc = actFunc
        self.d_actFunc = d_actFunc
        self.topActFunc = topActFunc
        self.d_topActFunc = d_topActFunc

    @classmethod
    def random_init(cls,
                    layer_sizes,
                    learning_rate=0.01,
                    momentum=0.9,  # Set to zero to not use momentum
                    post_process=lambda i: i,  # Post-process is applied to the final output only (not used when training), but is used when checking the error rate
                    topErrorFunc=squaredMean, d_topErrorFunc=dSquaredMean,
                    actFunc=sigmoid, d_actFunc=dSigmoid,
                    topActFunc=sigmoid, d_topActFunc=dSigmoid):
        weights = [np.random.normal(0, 0.01, (szNext, szThis)) for (szThis, szNext) in pairwise(layer_sizes)]
        biases = [np.random.normal(0, 0.01, sz) for sz in layer_sizes[1:]]
        return cls(weights, biases, learning_rate, momentum, post_process,
                   topErrorFunc, d_topErrorFunc, actFunc, d_actFunc, topActFunc, d_topActFunc)

    # ... class methods
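(To make the problem concrete: a fully positional call ends up looking something like this hypothetical one, and it is easy to pass two of the function arguments in the wrong order without noticing.)

# Hypothetical call site: every argument passed positionally. Swapping,
# say, actFunc and d_actFunc here still runs without error.
net = NeuralNet(weights, biases, 0.01, 0.9, lambda i: i,
                squaredMean, dSquaredMean,
                sigmoid, dSigmoid,
                softmax, dSoftmax)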
Subclass
class FixedTopErrorSignalNeuralNet(NeuralNet):
    def __init__(self,
                 weights,
                 biases,
                 learning_rate=0.01,
                 momentum=0.9,  # Set to zero to not use momentum
                 post_process=lambda i: i,  # Post-process is applied to the final output only (not used when training), but is used when checking the error rate
                 actFunc=sigmoid, d_actFunc=dSigmoid,
                 topActFunc=softmax, d_topActFunc=dSoftmax):
        NeuralNet.__init__(self,
                           weights=weights,
                           biases=biases,
                           momentum=momentum,
                           post_process=post_process,
                           topErrorFunc=None, d_topErrorFunc=None,
                           actFunc=actFunc, d_actFunc=d_actFunc,
                           topActFunc=topActFunc, d_topActFunc=d_topActFunc)

    @classmethod
    def random_init(cls,
                    layer_sizes,
                    learning_rate=0.01,
                    momentum=0.9,  # Set to zero to not use momentum
                    post_process=lambda i: i,  # Post-process is applied to the final output only (not used when training), but is used when checking the error rate
                    actFunc=sigmoid, d_actFunc=dSigmoid,
                    topActFunc=softmax, d_topActFunc=dSoftmax):
        weights = [np.random.normal(0, 0.01, (szNext, szThis)) for (szThis, szNext) in pairwise(layer_sizes)]
        biases = [np.random.normal(0, 0.01, sz) for sz in layer_sizes[1:]]
        return cls(weights, biases, learning_rate, momentum, post_process,
                   actFunc, d_actFunc, topActFunc, d_topActFunc)

    # ... overrides etc.
What the parameters do
It has been suggested that this class is doing too many things, because of its large number of parameters. Here is what each of them does, which may provide some insight.
The functional parameters are mostly there to avoid hard-coding the functions in. They are all very simple mathematical (pure) 1-2 liners (roughly sketched below the list).
- actFunc is a mathematical activation function
- d_actFunc is its derivative
- top_actFunc is an alternative activation function used at the top of the neural net (a very common technique)
- d_top_actFunc is its derivative
- ErrorFunc is used to calculate the error during training
- d_errorFunc is its derivative
- post_process is a simple mathematical transform used to get the final output, when used on non-training data
They are all parameters of the model.
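(For completeness: these helpers are defined elsewhere in my module. The sketch below is not my exact code, but it is representative of the kind of 1-2 liners being passed in.)

import numpy as np
from itertools import tee

def pairwise(iterable):
    # (s0, s1), (s1, s2), (s2, s3), ...
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dSigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def squaredMean(outputs, targets):
    return 0.5 * np.mean((outputs - targets) ** 2)

def dSquaredMean(outputs, targets):
    return (outputs - targets) / len(outputs)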
One alternative could be that, rather than passing them in as parameters, I make them abstract methods and then override them for each variation. This doesn't feel right to me, though something like Java's anonymous classes might work.
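(For illustration only: the nearest Python equivalent of a Java anonymous class that I can see would be a throwaway subclass defined right where it is needed. The sketch assumes the functions became overridable methods rather than constructor parameters; the names are hypothetical.)

# Hypothetical sketch: assumes NeuralNet calls self.top_act_func(x) /
# self.d_top_act_func(x) internally instead of taking the functions as
# constructor parameters.
class _SoftmaxTopNet(NeuralNet):  # throwaway subclass, defined at the use site
    def top_act_func(self, x):
        return softmax(x)

    def d_top_act_func(self, x):
        return dSoftmax(x)

net = _SoftmaxTopNet(weights, biases)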
- Mathieu Guindon (Feb 12, 2014): I know nothing of Python, but in other languages having a class with lots of constructor parameters is often a sign our object is breaking SRP and basically does too many things.
- Frames Catherine White (Feb 12, 2014): I don't think it is breaking the SRP, as it has one responsibility: being a neural net. It is just highly parametrised.
- Mathieu Guindon (Feb 12, 2014): Well, +1 anyway, looks like a good CR question... the code works, right?
- Frames Catherine White (Feb 12, 2014): It does, and indeed (/however) even with the terrible constructor bugs I had, it still worked, because neural nets are super fault tolerant; so I was using the wrong derivative and it was just fine...
2 Answers
There are three ways I could see this going.
Stronger hierarchy
Create what amounts to an abstract base class for NeuralNet with stubs for the math functions and then make subclasses to override the methods.
class NeuralNetBase(object):
    def __init__(self, weights, biases, learning_rate=0.01, momentum=0.9):
        self.weights = weights
        self.biases = biases
        assert len(self.biases) == len(self.weights), "Must have as many bias vectors as weight matrixes"
        self._prevDeltaWs = [np.zeros_like(w) for w in self.weights]
        self._prevDeltaBs = [np.zeros_like(b) for b in self.biases]
        self.learning_rate = learning_rate
        self.momentum = momentum

    def act_func(self, x):
        raise NotImplementedError

    def d_act_func(self, x):
        raise NotImplementedError

    def top_act_func(self, x):
        raise NotImplementedError

    def d_top_act_func(self, x):
        raise NotImplementedError


class SigmoidNeuralNet(NeuralNetBase):
    def act_func(self, x):
        # magic here
        pass

    def d_act_func(self, x):
        # more magic
        pass

    def top_act_func(self, x):
        # even more...
        pass

    def d_top_act_func(self, x):
        # like hogwarts!
        pass
This would work well if there is a high correlation between the optional functions: if they tend to cluster together they'd make a natural class hierarchy (and you'd have an easy way to see which nodes were using which function sets just by looking at their concrete classes). OTOH this won't work well if the functions are not correlated.
Kwargs to the rescue
You can simplify the constructor logic by using kwargs and by including the defaults in the __init__ method (for sanity's sake I'd move the pure data into required parameters, but that's just aesthetics):
class NeuralNet(object):
    def __init__(self, weights, biases, learning_rate, momentum, **kwargs):
        self.weights = weights
        self.biases = biases
        assert len(self.biases) == len(self.weights), "Must have as many bias vectors as weight matrixes"
        self._prevDeltaWs = [np.zeros_like(w) for w in self.weights]
        self._prevDeltaBs = [np.zeros_like(b) for b in self.biases]
        self.learning_rate = learning_rate
        self.momentum = momentum
        # assuming the default implementations are 'private' methods
        # defined below
        self.act = kwargs.get('act_func', self._default_act_func)
        self.top_act = kwargs.get('top_act_func', self._default_top_act_func)
        self.d_act = kwargs.get('d_act_func', self._default_d_act_func)
        self.d_top_act = kwargs.get('d_top_act_func', self._default_d_top_act_func)
        self.postprocess = kwargs.get('post', self._default_postprocess)
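Call sites then name only the functions they actually override, and everything else falls back to the defaults. A hedged example of what that could look like (softmax/dSoftmax from the question stand in for whatever you want to swap in):

# Only the overridden functions appear at the call site; anything not named
# falls back to the _default_* methods picked up in __init__.
net = NeuralNet(weights, biases, learning_rate=0.01, momentum=0.9,
                top_act_func=softmax, d_top_act_func=dSoftmax)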
Function components
If a lot of reasoning goes into the choice or relationship of the functions, you could just put them all into an object and work with that object instead of loose funcs:
class NeuralNet(object):
    def __init__(self, weights, biases, learning_rate, momentum, funcs=DefaultFuncs):
        self.weights = weights
        self.biases = biases
        # ... etc.
        self.Functions = funcs(self)


class DefaultFuncs(object):
    # defaults so that funcs(self) above works when no custom set is passed
    def __init__(self, target, act_func=sigmoid, top_func=sigmoid,
                 d_act_func=dSigmoid, d_top_func=dSigmoid):
        self.Target = target
        self._act_func = act_func
        self._top_act_func = top_func
        self._d_act_func = d_act_func
        self._d_top_act_func = d_top_func

    def act_func(self):
        return self._act_func(self.Target)

    def d_act_func(self):
        return self._d_act_func(self.Target)

    def top_act_func(self):
        return self._top_act_func(self.Target)

    def d_top_act_func(self):
        return self._d_top_act_func(self.Target)
This would let you compose a collection of funcs into a DefaultFuncs and then reuse it -- DefaultFuncs is really just an elaborate tuple, tricked out so you can call into the functions from the owning NeuralNet instance.
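As a rough usage sketch (SoftmaxTopFuncs is a made-up name; softmax and dSoftmax come from the question):

# A named, reusable function set: just DefaultFuncs with different members.
class SoftmaxTopFuncs(DefaultFuncs):
    def __init__(self, target):
        DefaultFuncs.__init__(self, target,
                              top_func=softmax, d_top_func=dSoftmax)

# The net is constructed with the set, not with six loose functions.
net = NeuralNet(weights, biases, learning_rate=0.01, momentum=0.9,
                funcs=SoftmaxTopFuncs)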
All of these are functionally identical approaches (it sounds like you've already got the thing working and just want to clean it up). The main reason for choosing among them amounts to where you want to put the work: #1 is good if the functions correlate and you want to easily tell which function set a given net instance is using; #2 is just syntactic sugar on what you've already got; #3 is really just #2, except that you compose the function set as a class (perhaps with error checking or more sophisticated reasoning) instead of as a dictionary.
- Erik Aronesty (Feb 25, 2020): This is a great answer, useful to consider in many contexts. Bundling args into a helper class, using base/subs, etc. I like.
A few suggestions:
- You can simplify random_init so that it takes **kwargs and simply does return cls(weights, biases, **kwargs); this allows you to pass any provided arguments straight through to __init__ without explicitly specifying them. This prevents the current duplication of default values, which could lead to problems if you change one and forget to change the other (see the sketch after this list).
- random_init in FixedTopErrorSignalNeuralNet appears identical to that in NeuralNet, so just use the inherited version; don't define it again.
- More broadly, FixedTopErrorSignalNeuralNet appears to only provide two different arguments to NeuralNet.__init__ (note: it should access this via super(FixedTopErrorSignalNeuralNet, self).__init__); unless it has different methods too, you could just use another @classmethod to create this variant. Again, you could use **kwargs to reduce duplication.
- Could you pass in only the standard functions (e.g. actFunc), then differentiate them locally? That would save passing pairs of functions.
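A minimal sketch of the first two suggestions (plus the super call from the third), assuming NeuralNet inherits from object so that super() works and keeps the __init__ signature from the question:

class NeuralNet(object):
    # ... __init__ exactly as in the question ...

    @classmethod
    def random_init(cls, layer_sizes, **kwargs):
        # Everything except layer_sizes is forwarded untouched, so the
        # defaults live in __init__ only.
        weights = [np.random.normal(0, 0.01, (szNext, szThis))
                   for (szThis, szNext) in pairwise(layer_sizes)]
        biases = [np.random.normal(0, 0.01, sz) for sz in layer_sizes[1:]]
        return cls(weights, biases, **kwargs)


class FixedTopErrorSignalNeuralNet(NeuralNet):
    # No random_init override needed: the inherited one builds an instance
    # of whichever class it is called on.
    def __init__(self, weights, biases, **kwargs):
        kwargs.setdefault('topActFunc', softmax)
        kwargs.setdefault('d_topActFunc', dSoftmax)
        super(FixedTopErrorSignalNeuralNet, self).__init__(
            weights, biases, topErrorFunc=None, d_topErrorFunc=None, **kwargs)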
- Frames Catherine White (Feb 13, 2014): Re point 3: FixedTopErrorSignalNeuralNet has other methods (as does NeuralNet), not shown.
- Frames Catherine White (Feb 13, 2014): Re point 4: No, it is not worth integrating a CAS system just to skip a few arguments. CAS systems would struggle with the differentiation.
- Frames Catherine White (Feb 13, 2014): Re super: I thought super was not to be used in Python 2, stackoverflow.com/a/5066411/179081