I'm trying to write a neural network that only requires the user to specify the dimensionality of the network. Concretely, the user might define a network like this:
nn = NN([2, 10, 20, 15, 2]) # 2 input, 2 output, 10 in hidden 1, 20 in hidden 2...
To do this, I'm trying to adapt some basic code. Please let me know what improvements can be made to readability (for example, I've considered cleaning up the distinction between wDims (weight dimensions) and dims (layer dimensions), because these variables seem redundant).
I would also appreciate any tips on how to implement a neural network as a graph. I've tried this but can't decide what classes need to be defined or even how the graph should be stored (as a Python dictionary?). Basically, some suggestions about a good representation would be much appreciated.
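For example, the kind of dictionary-based representation I've been sketching looks roughly like this (purely an illustration; the neuron ids and weights are made up):

# each neuron gets an id; the graph maps a neuron's id to a list of
# (target id, weight) pairs for its outgoing connections
graph = {
    'in_0':  [('h_0', 0.12), ('h_1', -0.30)],
    'in_1':  [('h_0', 0.07), ('h_1', 0.45)],
    'h_0':   [('out_0', 0.20)],
    'h_1':   [('out_0', -0.10)],
    'out_0': [],
}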
First, the auxiliary functions the network uses:
import random, math
random.seed(0)
def r_matrix(m, n, a = -0.5, b = 0.5):
    return [[random.uniform(a,b) for j in range(n)] for i in range(m)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def d_sigmoid(y):
    return y * (1.0 - y)
Definition and construction of the network:
class NN:
    def __init__(self, dims):
        self.dims = dims
        self.nO = self.dims[-1]
        self.nI = self.dims[0]
        self.nLayers = len(self.dims)
        self.wDims = [ (self.dims[i-1], self.dims[i])\
                       for i in range(1, self.nLayers) ]
        self.nWeights = len(self.wDims)
        self.__initNeurons()
        self.__initWeights()

    def __initWeights(self):
        self.weights = [0.0] * self.nWeights
        for i in range(self.nWeights):
            n_in, n_out = self.wDims[i]
            self.weights[i] = r_matrix(n_in, n_out)

    def __initNeurons(self):
        self.layers = [0.0] * self.nLayers
        for i in range(self.nLayers):
            self.layers[i] = [0.0] * self.dims[i]
Implementation of back propagation and forward propagation:
    def __activateLayer(self, i):
        prev = self.layers[i-1]
        n_in, n_out = self.dims[i-1], self.dims[i]
        for j in range(n_out):
            total = 0.0
            for k in range(n_in):
                total += prev[k] * self.weights[i-1][k][j] # num weights is always one less than num layers
            self.layers[i][j] = sigmoid(total)

    def __backProp(self, i, delta):
        n_out, n_in = self.dims[i], self.dims[i+1]
        next_delta = [0.0] * n_out
        for j in range(n_out):
            error = 0.0
            for k in range(n_in):
                error += delta[k] * self.weights[i][j][k]
            pred = self.layers[i][j]
            next_delta[j] = d_sigmoid(pred) * error
        return next_delta
    def __updateWeights(self, i, delta, alpha = .7):
        n_in, n_out = self.wDims[i]
        for j in range(n_in):
            for k in range(n_out):
                change = delta[k] * self.layers[i][j]
                self.weights[i][j][k] += alpha * change

    def feedForward(self, x):
        if len(x) != self.nI:
            raise ValueError('length of x must be same as num input units')
        for i in range(self.nI):
            self.layers[0][i] = x[i]
        for i in range(1, self.nLayers):
            self.__activateLayer(i)

    def backPropLearn(self, y):
        if len(y) != self.nO:
            raise ValueError('length of y must be same as num output units')
        delta_list = []
        delta = [0.0] * self.nO
        for k in range(self.nO):
            pred = self.layers[-1][k]
            error = y[k] - pred
            delta[k] = d_sigmoid(pred) * error
        delta_list.append(delta)
        for i in reversed(range(1, self.nLayers-1)):
            next_delta = self.__backProp(i, delta)
            delta = next_delta
            delta_list = [delta] + delta_list
        # now perform the update
        for i in range(self.nWeights):
            self.__updateWeights(i, delta_list[i])
Predict and train methods. Yes, I'm planning on improving train to allow the user to specify the maximum number of iterations and alpha:
    def predict(self, x):
        self.feedForward(x)
        return self.layers[-1]

    def train(self, T):
        i, MAX = 0, 5000
        while i < MAX:
            for t in T:
                x, y = t
                self.feedForward(x)
                self.backPropLearn(y)
            i += 1
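Roughly, the improved train I have in mind would look like this (max_iter and alpha are placeholder names, and alpha would still need to be passed through to the weight update):

    def train(self, T, max_iter=5000, alpha=0.7):
        for _ in range(max_iter):
            for x, y in T:
                self.feedForward(x)
                self.backPropLearn(y)  # alpha not yet wired through to __updateWeights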
Sample data. Teach the network the numbers 0 through 4:
# no. 0
t0 = [ 0,1,1,1,0,\
0,1,0,1,0,\
0,1,0,1,0,\
0,1,0,1,0,\
0,1,1,1,0 ]
# no. 1
t1 = [ 0,1,1,0,0,\
0,0,1,0,0,\
0,0,1,0,0,\
0,0,1,0,0,\
1,1,1,1,1 ]
# no. 2
t2 = [ 0,1,1,1,0,\
0,0,0,1,0,\
0,1,1,1,0,\
0,1,0,0,0,\
0,1,1,1,0 ]
# no. 3
t3 = [ 0,1,1,1,0,\
0,0,0,1,0,\
0,1,1,1,0,\
0,0,0,1,0,\
0,1,1,1,0 ]
# no. 4
t4 = [ 0,1,0,1,0,\
0,1,0,1,0,\
0,1,1,1,0,\
0,0,0,1,0,\
0,0,0,1,0 ]
T = [(t0, [1,0,0,0,0]), (t1, [0,1,0,0,0]), (t2, [0,0,1,0,0]), (t3, [0,0,0,1,0]), (t4, [0,0,0,0,1])]
nn = NN([25, 50, 50, 20, 5])
nn.train(T)
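After training, I check the output on one of the patterns, for example:

print(nn.predict(t3))  # ideally close to [0, 0, 0, 1, 0]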
2 Answers
1. Comments on your code
1. There's no documentation! What do these functions do? How do I call them? If someone has to maintain this code in a couple of years' time, how will they know what to do?

2. alpha should be a property of the class (and thus an optional argument to the constructor), not an optional argument to the __updateWeights method (where the user has no way to adjust it). See the sketch after this list.

3. You don't need to write \ at the end of a line if you're in the middle of a parenthesis (because the statement can't end until the parenthesis is closed). So there's no need for the \ here:

    self.wDims = [ (self.dims[i-1], self.dims[i])\
                   for i in range(1, self.nLayers) ]

nor in the definitions of t0 and so on.

4. The names could use a lot of work. NN should be NeuralNetwork. MAX should be something like rounds. dims should be something like layer_sizes.

5. You use lots of private method names (starting with __). Why do you do that? The purpose of private names in Python is to "avoid name clashes of names with names defined by subclasses", but that's obviously not necessary here. All that the private names achieve here is to make your code a bit harder to debug:

    >>> network = NN((1,1))
    >>> network.__backProp(0, [0.5])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NN' object has no attribute '__backProp'
    >>> network._NN__backProp(0, [0.5])
    [0.0]

If you want to indicate that a method is for internal use by a class, then the convention is to prefix it with a single underscore.
6. Generally in Python you should prefer iterating over sequences rather than over indexes. So instead of:

    self.wDims = [ (self.dims[i-1], self.dims[i])\
                   for i in range(1, self.nLayers) ]

you should write something like:

    self.wDims = list(zip(self.dims[:-1], self.dims[1:]))

But in practice you'd do better just to drop the wDims array altogether, since it just contains the same information as the dims array. Whenever you write:

    n_in, n_out = self.wDims[i]

you could write instead:

    n_in, n_out = self.dims[i:i+2]

which seems clearer to me.
7. In the train method, surely you should pass in MAX as a parameter?

8. Instead of looping using while:

    i = 0
    while i < MAX:
        # ...
        i += 1

prefer looping using for:

    for i in range(MAX):
        # ...

(But actually here you don't use i in the loop body, so it would be conventional to write _ instead.) And instead of:

    for t in T:
        x, y = t
        # ...

write:

    for x, y in T:
        # ...
9. The array layers is not actually a permanent property of the neural network. It's only used temporarily in feedForward and backPropLearn. And in fact only one layer is really used at a time. It would be better for layer to be a local variable in these methods.

10. The random seed used to initialize the weights should surely be something that you choose each time you create an instance, not just once.
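To make items 1, 2 and 10 concrete, here is roughly what the start of the class might look like (only a sketch: the docstring wording and the default values are illustrative, not prescriptive):

    import random

    class NN:
        """Fully connected feed-forward neural network with sigmoid units."""

        def __init__(self, dims, alpha=0.7, seed=None):
            """dims gives the size of each layer, e.g. NN([2, 10, 2]) has two
            inputs, one hidden layer of ten units, and two outputs. alpha is
            the learning rate used by the weight updates; seed, if supplied,
            makes the random weight initialization reproducible.
            """
            self.dims = dims
            self.alpha = alpha            # item 2: a property, settable per instance
            rng = random.Random(seed)     # item 10: per-instance random state
            self.weights = [[[rng.uniform(-0.5, 0.5) for _ in range(n_out)]
                             for _ in range(n_in)]
                            for n_in, n_out in zip(dims[:-1], dims[1:])]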
2. Rewriting using NumPy
This code would benefit enormously from using NumPy, a library for fast numerical array operations.
Your function r_matrix could be replaced with numpy.random.uniform. Here I create a 4×4 array of random numbers chosen uniformly in the half-open range [2, 3):

    >>> numpy.random.uniform(2, 3, (4, 4))
    array([[ 2.95552338,  2.75158213,  2.22088904,  2.95417241],
           [ 2.59129035,  2.29089095,  2.16007186,  2.64646486],
           [ 2.39729966,  2.96208642,  2.12305994,  2.68911969],
           [ 2.64394815,  2.21609217,  2.69556204,  2.35376118]])

so the whole of your __initWeights function could become:

    self.weights = [numpy.random.uniform(-0.5, 0.5, size)
                    for size in zip(self.dims[:-1], self.dims[1:])]
Similarly, the whole of your __initNeurons function could become:

    self.layers = [numpy.zeros((size,)) for size in self.dims]

But as explained in §1.9 above, we don't actually want self.layers, so __initNeurons can simply be omitted.

The _activateLayer method multiplies the vector in self.layers[i-1] by the matrix self.weights[i-1] and then applies the sigmoid function to each element in the result. So in NumPy, _activateLayer becomes:

    def _activateLayer(self, i):
        self.layers[i] = sigmoid(numpy.dot(self.layers[i-1], self.weights[i-1]))

We don't even need the check on the length of the input any more, because if we pass in an input array of the wrong length, we'll get:

    ValueError: matrices are not aligned
Similarly, the _backProp method multiplies the vector delta by the transpose of the matrix self.weights[i], and then applies the d_sigmoid function to each element. So in NumPy this becomes:

    def _backProp(self, i, delta):
        return d_sigmoid(self.layers[i]) * numpy.dot(delta, self.weights[i].T)
Similarly, the _updateWeights method becomes:

    def _updateWeights(self, i, delta, alpha=0.7):
        self.weights[i] += alpha * numpy.outer(self.layers[i], delta)
And so on. See the revised code below for more improvements.
3. Revised code
This answer was getting quite long, so you'll need to add the documentation yourself.
from itertools import product
import numpy
def sigmoid(x):
    return 1 / (1 + numpy.exp(-x))

def d_sigmoid(y):
    return y * (1 - y)

class NeuralNetwork(object):
    def __init__(self, layer_sizes, alpha=0.7, seed=None):
        self.alpha = alpha
        state = numpy.random.RandomState(seed)
        self.weights = [state.uniform(-0.5, 0.5, size)
                        for size in zip(layer_sizes[:-1], layer_sizes[1:])]

    def _feed_forward(self, x):
        yield x
        for w in self.weights:
            x = sigmoid(numpy.dot(x, w))
            yield x

    def _deltas(self, layers, output):
        delta = d_sigmoid(layers[-1]) * (output - layers[-1])
        for layer, w in zip(layers[-2::-1], self.weights[::-1]):
            yield delta
            delta = d_sigmoid(layer) * numpy.dot(delta, w.T)

    def _learn(self, layers, output):
        deltas = reversed(list(self._deltas(layers, output)))
        return [w + self.alpha * numpy.outer(layer, delta)
                for w, layer, delta in zip(self.weights, layers, deltas)]

    def train(self, training_data, rounds=5000):
        for _, (input, output) in product(range(rounds), training_data):
            layers = self._feed_forward(numpy.array(input))
            self.weights = self._learn(list(layers), numpy.array(output))

    def predict(self, input):
        for layer in self._feed_forward(input):
            pass
        return layer
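For example, with the digit data T from the question, the revised class could be used like this (the seed value is arbitrary):

    network = NeuralNetwork([25, 50, 50, 20, 5], seed=0)
    network.train(T)
    print(network.predict(numpy.array(t2)))  # after training, index 2 should carry the largest activation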
4. Finally
Have you considered using neurolab instead of writing your own?
- I think my code could be improved further (there's a bit too much fiddling about with list slices) but I ran out of time; maybe some other reviewer here can clean it up. – Gareth Rees, Dec 21, 2013 at 14:40
- Thank you Gareth! I wrote this code mainly to figure out how neural networks work. I have considered using numpy (having installation issues for python 3) and in any professional application I would surely use a more stable implementation. – rookie, Dec 22, 2013 at 7:44
Here are a bunch of relatively minor comments:
- put your imports on separate lines.
- call random.seed in an if __name__ == '__main__': section, not in the global section.
- r_matrix is not a very obvious name; random_matrix seems better. The parameters of that function also have pretty non-obvious names; rows, cols, min, max seem to be more descriptive.
- self.layers = [0.0] * self.nLayers is odd when self.layers is a list of lists. Actually, this is a perfect case to use a list comprehension: self.layers = [[0.0] * dim for dim in self.dims]. Similarly for weights.
- In __activateLayer:

    self.layers[i][j] = sigmoid(sum(prev[k] * self.weights[i-1][k][j] for k in range(n_in)))
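That renaming might look roughly like this (just a sketch; note that min and max shadow the built-in functions of the same name):

    import random

    def random_matrix(rows, cols, min=-0.5, max=0.5):
        return [[random.uniform(min, max) for _ in range(cols)] for _ in range(rows)]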
- Thanks ruds. Are any of the ideas in the code really unclear? Could you please also offer some ideas about how I might use Neuron and Weight objects, to make the implementation cleaner? – rookie, Dec 21, 2013 at 14:02