
Selecting a single column of a DataFrame gives me the y (dependent) variable as a one-dimensional object (strictly a pandas Series rather than a NumPy array).

y = train['Survived']

But printing the .shape of the variable y (y.shape) outputs (891,) (notice it's not (891, 1), a column vector).

I would like to matrix-multiply y with a variable of shape (1, 10) using np.matmul, but it throws this error:

Exception: Dot product shape mismatch, (891,) vs (1, 10)

How can I force the y variable to be a column vector with shape (891, 1) instead of just (891,)?

NoDataDumpNoContribution
asked Nov 17, 2020 at 5:33
  • just use y=train['Survived'].values[:,None] Commented Nov 17, 2020 at 5:35
  • or use y=train['Survived'].to_numpy().reshape(-1, 1) Commented Nov 17, 2020 at 5:36
  • So the goal is a (891, 10) array? Commented Nov 17, 2020 at 5:49
  • y = train[['Survived']] will also have that column vector shape. However, that is quite a bit slower, since it makes a new DataFrame (as opposed to extracting a Series/column). Commented Nov 17, 2020 at 5:53
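
The suggestions in these comments can be compared side by side. The following is a minimal sketch, assuming a made-up `train` DataFrame with 891 rows standing in for the asker's data; the variable names other than `train` and `'Survived'` are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `train` DataFrame (891 rows).
train = pd.DataFrame({"Survived": np.arange(891) % 2})

# Single brackets return a 1-D pandas Series.
y_series = train["Survived"]

# Adding a new axis to the underlying NumPy array gives a column vector.
y_none = train["Survived"].values[:, None]

# reshape(-1, 1) lets NumPy infer the row count and forces one column.
y_reshape = train["Survived"].to_numpy().reshape(-1, 1)

# Double brackets return a one-column DataFrame rather than an array.
y_frame = train[["Survived"]]

print(y_series.shape, y_none.shape, y_reshape.shape, y_frame.shape)
```

The first two conversions yield NumPy arrays, while the double-bracket form stays a DataFrame, which matters if the next step expects a plain array.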

1 Answer


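Just use y[:, None]. This adds a trailing axis, so the result has the correct shape, (891, 1). If y is still a pandas Series, convert it first, e.g. y.to_numpy()[:, None], since recent pandas versions no longer support multi-dimensional indexing directly on a Series.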
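
To see why the extra axis matters for the multiplication, here is a small sketch using plain NumPy arrays. The arrays `y` and `w` are placeholders filled with ones, assumed only to match the shapes in the question, not the asker's actual data.

```python
import numpy as np

y = np.ones(891)       # 1-D stand-in for the extracted column, shape (891,)
w = np.ones((1, 10))   # hypothetical (1 x 10) operand from the question

# With a 1-D y, matmul treats it as a row of length 891, which does not
# line up with the single row of w, so NumPy raises ValueError.
try:
    np.matmul(y, w)
    mismatch = False
except ValueError:
    mismatch = True

# Adding a trailing axis makes y an (891, 1) column vector.
y_col = y[:, None]
product = np.matmul(y_col, w)   # (891, 1) @ (1, 10)

print(mismatch, product.shape)
```

The product of an (891, 1) column and a (1, 10) row is an (891, 10) array, which matches the shape one of the commenters asked about.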

answered Nov 17, 2020 at 5:37