
I am trying to build a very simple multilayer perceptron (MLP) in Keras:

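# Imports assumed for the 2015-era Keras API used below
from keras.models import Sequential
from keras.layers.core import Dense, Dropout
from keras.optimizers import SGD

model = Sequential()
model.add(Dense(16, 8, init='uniform', activation='tanh'))  # 16 inputs -> 8 hidden units
model.add(Dropout(0.5))
model.add(Dense(8, 2, init='uniform', activation='tanh'))   # 8 hidden units -> 2 outputs
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=1000, batch_size=50)
score = model.evaluate(X_test, y_test, batch_size=50)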

My training data shape: X_train.shape gives (34180, 16)

The labels are binary (two classes): y_train.shape gives (34180,)

So my Keras code should produce a network with the following connections: 16x8 => 8x2

Instead, it produces this shape mismatch error:

ValueError: Input dimension mis-match. (input[0].shape[1] = 2, input[1].shape[1] = 1)
Apply node that caused the error: Elemwise{sub,no_inplace}(Elemwise{Composite{tanh((i0 + i1))}}[(0, 0)].0, <TensorType(float64, matrix)>)
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
Inputs shapes: [(50, 2), (50, 1)]
Inputs strides: [(16, 8), (8, 8)]

The error occurs at epoch 0, on the line model.fit(X_train, y_train, nb_epoch=1000, batch_size=50). Am I overlooking something obvious in Keras?

EDIT: I have gone through the question here, but it does not solve my problem.

asked Aug 13, 2015 at 20:00

1 Answer


I had the same problem and then found this thread:

https://github.com/fchollet/keras/issues/68

It appears that to use a final output layer of 2 units (or of any number of categories), the labels need to be categorical, that is, one-hot encoded: each observation's label becomes a binary vector with a single 1 in the position of its class. For example, the 3-class label vector [0,2,1,0,1,0] becomes [[1,0,0],[0,0,1],[0,1,0],[1,0,0],[0,1,0],[1,0,0]].
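To make the encoding concrete, here is a minimal NumPy sketch of the conversion. It only illustrates the idea and is not Keras's own implementation; the helper name one_hot is hypothetical, and it assumes the labels are integers starting at 0.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """Turn integer class labels into one-hot rows (hypothetical helper)."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        # Assumption: classes are numbered 0 .. max(labels)
        num_classes = labels.max() + 1
    encoded = np.zeros((labels.shape[0], num_classes))
    # Set a single 1 per row, in the column given by that row's label
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

y = [0, 2, 1, 0, 1, 0]
Y = one_hot(y)
print(Y.shape)  # one row per observation, one column per class
```

With the question's labels of shape (34180,) and two classes, the same conversion would give an array of shape (34180, 2), which lines up with a 2-unit output layer.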

The np_utils.to_categorical function solved this for me:

from keras.utils import np_utils

# One-hot encode the integer labels so they match the 2-unit output layer
y_train, y_test = [np_utils.to_categorical(x) for x in (y_train, y_test)]
Michael Crook
answered Oct 4, 2015 at 20:26

1 Comment

Another option, which would help you "unmap" your one-hot vectors, is to use sklearn.preprocessing.LabelBinarizer. scikit-learn.org/stable/modules/generated/…
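A minimal sketch of what that comment suggests, assuming scikit-learn is installed. The variable names are illustrative, and the example labels are the 3-class vector from the answer rather than the question's data.

```python
from sklearn.preprocessing import LabelBinarizer

labels = [0, 2, 1, 0, 1, 0]

binarizer = LabelBinarizer()
# fit_transform learns the set of classes and returns one indicator column per class
one_hot = binarizer.fit_transform(labels)
# inverse_transform maps the indicator rows back to the original labels
recovered = binarizer.inverse_transform(one_hot)

print(one_hot.shape)
print(recovered.tolist())
```

One caveat: with only two classes, LabelBinarizer returns a single column rather than two, so for a 2-unit output layer like the one in the question, to_categorical is still the more direct fit.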
