
I wanted to know the difference between these two lines of code

 X_train = training_dataset.iloc[:, 1].values
 X_train = training_dataset.iloc[:, 1:2].values

My guess is that the latter is a 2-D NumPy array and the former is a 1-D NumPy array. For inputs to a neural network, the latter seems to be the proper form for input data. Is there a specific reason for that?

Please help!

asked Apr 3, 2020 at 11:49

2 Answers

Your guess is right. You can check by printing ndim for each result:

X_train.ndim

The first gives 1 and the second gives 2. The difference is that the first one has no defined second dimension: its shape is (R,), while the second has shape (R, 1). To see the difference between these shapes, I suggest reading: Difference between numpy.array shape (R, 1) and (R,)
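As a rough sketch of that check, the snippet below prints ndim and shape for both selections so they can be compared side by side. The DataFrame contents are made up for illustration; only the two iloc expressions come from the question.

```python
import pandas as pd

# Made-up data standing in for the asker's training_dataset (an assumption).
training_dataset = pd.DataFrame(
    {"Height": [1, 14, 2, 1, 5], "Width": [15, 25, 2, 20, 27]}
)

# Integer column selector, as in the first line of the question.
from_int = training_dataset.iloc[:, 1].values
# Slice column selector, as in the second line of the question.
from_slice = training_dataset.iloc[:, 1:2].values

# Print the number of dimensions and the shape of each array.
print("iloc[:, 1]   ->", from_int.ndim, from_int.shape)
print("iloc[:, 1:2] ->", from_slice.ndim, from_slice.shape)
```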

answered Apr 3, 2020 at 12:07


The difference is that iloc returns a Series when a single row or column is selected, but a DataFrame when a row or column range is selected.

Although they both refer to column 1, 1 and 1:2 are different types, with 1 representing an int and 1:2 representing a slice.

With,

X_train = training_dataset.iloc[:, 1].values

You specify a single column, so training_dataset.iloc[:, 1] is a Pandas Series and .values is a 1D NumPy array.

Vs.,

X_train = training_dataset.iloc[:, 1:2].values

Although it selects only one column, 1:2 is a slice that represents a column range, so training_dataset.iloc[:, 1:2] is a Pandas DataFrame. Thus, .values is a 2D NumPy array.

Test as follows:

Create training_dataset Dataframe

import pandas as pd

data = {'Height': [1, 14, 2, 1, 5], 'Width': [15, 25, 2, 20, 27]}
training_dataset = pd.DataFrame(data)

Using .iloc[:, 1]

print(type(training_dataset.iloc[:, 1]))
print(training_dataset.iloc[:, 1].values)
# Result is:
<class 'pandas.core.series.Series'>
# .values returns a 1D NumPy array
[15 25  2 20 27]

Using .iloc[:, 1:2]

print(type(training_dataset.iloc[:, 1:2]))
print(training_dataset.iloc[:, 1:2].values)
# Result is:
<class 'pandas.core.frame.DataFrame'>
# .values is a 2D NumPy array (the values of a Pandas DataFrame)
[[15]
 [25]
 [ 2]
 [20]
 [27]]

In both cases, type(X_train) is <class 'numpy.ndarray'>; only the number of dimensions differs.
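On the question of why the 2D form is usually wanted: libraries such as scikit-learn expect feature arrays shaped (n_samples, n_features), so a single feature is passed as a one-column 2D array. The sketch below shows a few ways to move between the two forms, assuming the same made-up data as above; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Same made-up data as in the test above (an assumption, not the asker's data).
training_dataset = pd.DataFrame(
    {"Height": [1, 14, 2, 1, 5], "Width": [15, 25, 2, 20, 27]}
)

# 1D array from an integer selector: shape (5,)
flat = training_dataset.iloc[:, 1].values

# Turn the 1D array into a single-column 2D array: shape (5, 1)
column = flat.reshape(-1, 1)

# A list selector also keeps the result a DataFrame, so .values is 2D.
same_column = training_dataset.iloc[:, [1]].values

# Flatten the 2D column back to 1D: shape (5,)
back_to_flat = column.ravel()

print(flat.shape, column.shape, same_column.shape, back_to_flat.shape)
```

Either reshape(-1, 1) or a slice or list selector gives the one-column layout; which one to use is mostly a matter of taste.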
answered Apr 3, 2020 at 13:49
