I'm trying to write a neural network that only requires the user to specify the dimensionality of the network. Concretely, the user might define a network like this:
nn = NN([2, 10, 20, 15, 2]) # 2 input, 2 output, 10 in hidden 1, 20 in hidden 2...
To do this, I'm trying to adapt some basic code. Please let me know what improvements can be made to readability (for example, I've considered cleaning up the distinction between wDims (weight dimensions) and dims (layer dimensions), because these variables seem redundant).
I would also appreciate any tips on how to implement a neural network as a graph. I've tried this but can't decide what classes need to be defined or even how the graph should be stored (as a Python dictionary?). Basically, some suggestions about a good representation would be much appreciated.
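For example, the kind of dictionary-based representation I've been sketching looks roughly like this (purely an illustration; the neuron ids and weights are made up):

# each neuron gets an id; the graph maps a neuron's id to a list of
# (target id, weight) pairs for its outgoing connections
graph = {
    'in_0':  [('h_0', 0.12), ('h_1', -0.30)],
    'in_1':  [('h_0', 0.07), ('h_1', 0.45)],
    'h_0':   [('out_0', 0.20)],
    'h_1':   [('out_0', -0.10)],
    'out_0': [],
}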
First, the auxiliary functions the network uses:
import random, math
random.seed(0)
def r_matrix(m, n, a = -0.5, b = 0.5):
    return [[random.uniform(a,b) for j in range(n)] for i in range(m)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def d_sigmoid(y):
    return y * (1.0 - y)
Definition and construction of the network:
class NN:
    def __init__(self, dims):
        self.dims = dims
        self.nO = self.dims[-1]
        self.nI = self.dims[0]
        self.nLayers = len(self.dims)
        self.wDims = [ (self.dims[i-1], self.dims[i])\
                       for i in range(1, self.nLayers) ]
        self.nWeights = len(self.wDims)
        self.__initNeurons()
        self.__initWeights()

    def __initWeights(self):
        self.weights = [0.0] * self.nWeights
        for i in range(self.nWeights):
            n_in, n_out = self.wDims[i]
            self.weights[i] = r_matrix(n_in, n_out)

    def __initNeurons(self):
        self.layers = [0.0] * self.nLayers
        for i in range(self.nLayers):
            self.layers[i] = [0.0] * self.dims[i]
Implementation of back propagation and forward propagation:
    def __activateLayer(self, i):
        prev = self.layers[i-1]
        n_in, n_out = self.dims[i-1], self.dims[i]
        for j in range(n_out):
            total = 0.0
            for k in range(n_in):
                total += prev[k] * self.weights[i-1][k][j] # num weights is always one less than num layers
            self.layers[i][j] = sigmoid(total)

    def __backProp(self, i, delta):
        n_out, n_in = self.dims[i], self.dims[i+1]
        next_delta = [0.0] * n_out
        for j in range(n_out):
            error = 0.0
            for k in range(n_in):
                error += delta[k] * self.weights[i][j][k]
            pred = self.layers[i][j]
            next_delta[j] = d_sigmoid(pred) * error
        return next_delta
    def __updateWeights(self, i, delta, alpha = .7):
        n_in, n_out = self.wDims[i]
        for j in range(n_in):
            for k in range(n_out):
                change = delta[k] * self.layers[i][j]
                self.weights[i][j][k] += alpha * change

    def feedForward(self, x):
        if len(x) != self.nI:
            raise ValueError('length of x must be same as num input units')
        for i in range(self.nI):
            self.layers[0][i] = x[i]
        for i in range(1, self.nLayers):
            self.__activateLayer(i)

    def backPropLearn(self, y):
        if len(y) != self.nO:
            raise ValueError('length of y must be same as num output units')
        delta_list = []
        delta = [0.0] * self.nO
        for k in range(self.nO):
            pred = self.layers[-1][k]
            error = y[k] - pred
            delta[k] = d_sigmoid(pred) * error
        delta_list.append(delta)
        for i in reversed(range(1, self.nLayers-1)):
            next_delta = self.__backProp(i, delta)
            delta = next_delta
            delta_list = [delta] + delta_list
        # now perform the update
        for i in range(self.nWeights):
            self.__updateWeights(i, delta_list[i])
Predict and train methods. Yes, I'm planning on improving train to allow the user to specify the maximum number of iterations and alpha:
    def predict(self, x):
        self.feedForward(x)
        return self.layers[-1]

    def train(self, T):
        i, MAX = 0, 5000
        while i < MAX:
            for t in T:
                x, y = t
                self.feedForward(x)
                self.backPropLearn(y)
            i += 1
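Roughly, the improved train I have in mind would look like this (max_iter and alpha are placeholder names, and alpha would still need to be passed through to the weight update):

    def train(self, T, max_iter=5000, alpha=0.7):
        for _ in range(max_iter):
            for x, y in T:
                self.feedForward(x)
                self.backPropLearn(y)  # alpha not yet wired through to __updateWeights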
Sample data. Teach the network the numbers 0 through 4:
# no. 0
t0 = [ 0,1,1,1,0,\
0,1,0,1,0,\
0,1,0,1,0,\
0,1,0,1,0,\
0,1,1,1,0 ]
# no. 1
t1 = [ 0,1,1,0,0,\
0,0,1,0,0,\
0,0,1,0,0,\
0,0,1,0,0,\
1,1,1,1,1 ]
# no. 2
t2 = [ 0,1,1,1,0,\
0,0,0,1,0,\
0,1,1,1,0,\
0,1,0,0,0,\
0,1,1,1,0 ]
# no. 3
t3 = [ 0,1,1,1,0,\
0,0,0,1,0,\
0,1,1,1,0,\
0,0,0,1,0,\
0,1,1,1,0 ]
# no. 4
t4 = [ 0,1,0,1,0,\
0,1,0,1,0,\
0,1,1,1,0,\
0,0,0,1,0,\
0,0,0,1,0 ]
T = [(t0, [1,0,0,0,0]), (t1, [0,1,0,0,0]), (t2, [0,0,1,0,0]), (t3, [0,0,0,1,0]), (t4, [0,0,0,0,1])]
nn = NN([25, 50, 50, 20, 5])
nn.train(T)
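After training, I check the output on one of the patterns, for example:

print(nn.predict(t3))  # ideally close to [0, 0, 0, 1, 0]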
2 Answers
1. Comments on your code
1. There's no documentation! What do these functions do? How do I call them? If someone has to maintain this code in a couple of years' time, how will they know what to do?

2. alpha should be a property of the class (and thus an optional argument to the constructor), not an optional argument to the __updateWeights method (where the user has no way to adjust it). See the sketch after this list.

3. You don't need to write \ at the end of a line if you're in the middle of a parenthesis (because the statement can't end until the parenthesis is closed). So there's no need for the \ here:

    self.wDims = [ (self.dims[i-1], self.dims[i])\
                   for i in range(1, self.nLayers) ]

nor in the definitions of t0 and so on.

4. The names could use a lot of work. NN should be NeuralNetwork. MAX should be something like rounds. dims should be something like layer_sizes.

5. You use lots of private method names (starting with __). Why do you do that? The purpose of private names in Python is to "avoid name clashes of names with names defined by subclasses", but that's obviously not necessary here. All that the private names achieve here is to make your code a bit harder to debug:

    >>> network = NN((1,1))
    >>> network.__backProp(0, [0.5])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NN' object has no attribute '__backProp'
    >>> network._NN__backProp(0, [0.5])
    [0.0]

If you want to indicate that a method is for internal use by a class, then the convention is to prefix it with a single underscore.
6. Generally in Python you should prefer iterating over sequences rather than over indexes. So instead of:

    self.wDims = [ (self.dims[i-1], self.dims[i])\
                   for i in range(1, self.nLayers) ]

you should write something like:

    self.wDims = list(zip(self.dims[:-1], self.dims[1:]))

But in practice you'd do better just to drop the wDims array altogether, since it just contains the same information as the dims array. Whenever you write:

    n_in, n_out = self.wDims[i]

you could write instead:

    n_in, n_out = self.dims[i:i+2]

which seems clearer to me.
7. In the train method, surely you should pass in MAX as a parameter?

8. Instead of looping using while:

    i = 0
    while i < MAX:
        # ...
        i += 1

prefer looping using for:

    for i in range(MAX):
        # ...

(But actually here you don't use i in the loop body, so it would be conventional to write _ instead.) And instead of:

    for t in T:
        x, y = t
        # ...

write:

    for x, y in T:
        # ...
9. The array layers is not actually a permanent property of the neural network. It's only used temporarily in feedForward and backPropLearn. And in fact only one layer is really used at a time. It would be better for layer to be a local variable in these methods.

10. The random seed used to initialize the weights should surely be something that you choose each time you create an instance, not just once.
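To make items 1, 2 and 10 concrete, here is roughly what the start of the class might look like (only a sketch: the docstring wording and the default values are illustrative, not prescriptive):

    import random

    class NN:
        """Fully connected feed-forward neural network with sigmoid units."""

        def __init__(self, dims, alpha=0.7, seed=None):
            """dims gives the size of each layer, e.g. NN([2, 10, 2]) has two
            inputs, one hidden layer of ten units, and two outputs. alpha is
            the learning rate used by the weight updates; seed, if supplied,
            makes the random weight initialization reproducible.
            """
            self.dims = dims
            self.alpha = alpha            # item 2: a property, settable per instance
            rng = random.Random(seed)     # item 10: per-instance random state
            self.weights = [[[rng.uniform(-0.5, 0.5) for _ in range(n_out)]
                             for _ in range(n_in)]
                            for n_in, n_out in zip(dims[:-1], dims[1:])]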
2. Rewriting using NumPy
This code would benefit enormously from using NumPy, a library for fast numerical array operations.
Your function r_matrix could be replaced with numpy.random.uniform. Here I create a 4×4 array of random numbers chosen uniformly in the half-open range [2, 3):

    >>> numpy.random.uniform(2, 3, (4, 4))
    array([[ 2.95552338,  2.75158213,  2.22088904,  2.95417241],
           [ 2.59129035,  2.29089095,  2.16007186,  2.64646486],
           [ 2.39729966,  2.96208642,  2.12305994,  2.68911969],
           [ 2.64394815,  2.21609217,  2.69556204,  2.35376118]])

so the whole of your __initWeights function could become:

    self.weights = [numpy.random.uniform(-0.5, 0.5, size)
                    for size in zip(self.dims[:-1], self.dims[1:])]
Similarly, the whole of your __initNeurons function could become:

    self.layers = [numpy.zeros((size,)) for size in self.dims]

But as explained in §1.9 above, we don't actually want self.layers, so __initNeurons can simply be omitted.

The _activateLayer method multiplies the vector in self.layers[i-1] by the matrix self.weights[i-1] and then applies the sigmoid function to each element in the result. So in NumPy, _activateLayer becomes:

    def _activateLayer(self, i):
        self.layers[i] = sigmoid(numpy.dot(self.layers[i-1], self.weights[i-1]))

We don't even need the check on the length of the input any more, because if we pass in an input array of the wrong length, we'll get:

    ValueError: matrices are not aligned
Similarly, the _backProp method multiplies the vector delta by the transpose of the matrix self.weights[i], and then applies the d_sigmoid function to each element. So in NumPy this becomes:

    def _backProp(self, i, delta):
        return d_sigmoid(self.layers[i]) * numpy.dot(delta, self.weights[i].T)
Similarly, the _updateWeights method becomes:

    def _updateWeights(self, i, delta, alpha=0.7):
        self.weights[i] += alpha * numpy.outer(self.layers[i], delta)
And so on. See the revised code below for more improvements.
3. Revised code
This answer was getting quite long, so you'll need to add the documentation yourself.
from itertools import product
import numpy
def sigmoid(x):
    return 1 / (1 + numpy.exp(-x))

def d_sigmoid(y):
    return y * (1 - y)

class NeuralNetwork(object):
    def __init__(self, layer_sizes, alpha=0.7, seed=None):
        self.alpha = alpha
        state = numpy.random.RandomState(seed)
        self.weights = [state.uniform(-0.5, 0.5, size)
                        for size in zip(layer_sizes[:-1], layer_sizes[1:])]

    def _feed_forward(self, x):
        yield x
        for w in self.weights:
            x = sigmoid(numpy.dot(x, w))
            yield x

    def _deltas(self, layers, output):
        delta = d_sigmoid(layers[-1]) * (output - layers[-1])
        for layer, w in zip(layers[-2::-1], self.weights[::-1]):
            yield delta
            delta = d_sigmoid(layer) * numpy.dot(delta, w.T)

    def _learn(self, layers, output):
        deltas = reversed(list(self._deltas(layers, output)))
        return [w + self.alpha * numpy.outer(layer, delta)
                for w, layer, delta in zip(self.weights, layers, deltas)]

    def train(self, training_data, rounds=5000):
        for _, (input, output) in product(range(rounds), training_data):
            layers = self._feed_forward(numpy.array(input))
            self.weights = self._learn(list(layers), numpy.array(output))

    def predict(self, input):
        for layer in self._feed_forward(input):
            pass
        return layer
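For example, with the digit data T from the question, the revised class could be used like this (the seed value is arbitrary):

    network = NeuralNetwork([25, 50, 50, 20, 5], seed=0)
    network.train(T)
    print(network.predict(numpy.array(t2)))  # after training, index 2 should carry the largest activation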
4. Finally
Have you considered using neurolab instead of writing your own?
- I think my code could be improved further (there's a bit too much fiddling about with list slices) but I ran out of time; maybe some other reviewer here can clean it up. – Gareth Rees, Dec 21, 2013 at 14:40
- Thank you Gareth! I wrote this code mainly to figure out how neural networks work. I have considered using numpy (having installation issues for python 3) and in any professional application I would surely use a more stable implementation. – rookie, Dec 22, 2013 at 7:44
Here are a bunch of relatively minor comments:
- put your imports on separate lines.
- call random.seed in an if __name__ == '__main__': section, not in the global section.
- r_matrix is not a very obvious name; random_matrix seems better. The parameters of that function also have pretty non-obvious names; rows, cols, min, max seem to be more descriptive.
- self.layers = [0.0] * self.nLayers is odd when self.layers is a list of lists. Actually, this is a perfect case to use a list comprehension: self.layers = [[0.0] * dim for dim in self.dims]. Similarly for weights.
- In __activateLayer:

    self.layers[i][j] = sigmoid(sum(prev[k] * self.weights[i-1][k][j] for k in range(n_in)))
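That renaming might look roughly like this (just a sketch; note that min and max shadow the built-in functions of the same name):

    import random

    def random_matrix(rows, cols, min=-0.5, max=0.5):
        return [[random.uniform(min, max) for _ in range(cols)] for _ in range(rows)]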
- Thanks ruds. Are any of the ideas in the code really unclear? Could you please also offer some ideas about how I might use Neuron and Weight objects, to make the implementation cleaner? – rookie, Dec 21, 2013 at 14:02