import numpy as np
from sklearn.linear_model import LinearRegression
X = np.array([1, 2, 3, 4, 5]) # Features
y = np.array([2, 4, 6, 8, 10]) # Target
model = LinearRegression()
model.fit(X, y) # <-- Error occurs here
When I try to fit my LinearRegression model, I get the following error:
ValueError: Expected 2D array, got 1D array instead:
array=[1. 2. 3. 4. 5.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature
or array.reshape(1, -1) if it contains a single sample.
I understand the error is about the input shape, but I’m not sure why it happens here and what’s the proper way to fix it.
Why does scikit-learn expect a 2D array for X?
What’s the correct way to reshape the data in this case?
2 Answers
Use .reshape(-1, 1) on X so that it becomes a 2D array with one column:
import numpy as np
from sklearn.linear_model import LinearRegression
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1) # Features
y = np.array([2, 4, 6, 8, 10]).reshape(-1, 1) # Target (reshaping y is optional; a 1D y is also accepted)
model = LinearRegression()
model.fit(X, y)
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
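If X is a one-dimensional array, this reshape is mandatory. Reshaping y is optional: fit() also accepts a 1D target, in which case coef_ and intercept_ come back without the extra dimension.
Result (the intercept is zero up to floating-point rounding):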
> Coefficients: [[2.]]
> Intercept: [-1.77635684e-15]
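As a minimal sketch of the variant where only X is reshaped and y stays 1D (same data as above; the names coef_shape and prediction are only for illustration), note that anything passed to predict() also has to be 2D:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Only X needs to be 2D: 5 samples, 1 feature each.
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)   # shape (5, 1)
y = np.array([2, 4, 6, 8, 10])                 # shape (5,), left as 1D

model = LinearRegression()
model.fit(X, y)

# With a 1D target, coef_ has shape (n_features,) instead of (1, n_features).
coef_shape = model.coef_.shape

# New inputs for predict() must follow the same (n_samples, n_features) layout.
prediction = model.predict(np.array([[6]]))

print("coef_ shape:", coef_shape)
print("prediction for x=6:", prediction)
```

For this perfectly linear data the prediction for x=6 should come out at approximately 12.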
To answer the "why" part of the question:
Scikit-learn's LinearRegression expects the input features (X) to be a 2D array of shape (n_samples, n_features), even for a simple linear regression with one feature (i.e., a single x variable predicting y). Scikit-learn is designed to handle any number of features, so each row is one sample and each column is one feature, and the input must always be 2D. A 1D array is ambiguous: it could be five samples of one feature or one sample with five features, which is why the error message offers both reshape(-1, 1) and reshape(1, -1).
Because the fit() API is designed for the general case rather than only a one-dimensional X and a one-dimensional y, it is slightly more awkward to use in the simplest case.
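To make the shape convention concrete, here is a small NumPy-only sketch of the two reshapes named in the error message. The second feature x2 and the variable names are made up for illustration; they are not part of the question's data.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])       # shape (5,): 1D, ambiguous for scikit-learn

# Five samples with one feature each: the layout wanted in this question.
as_samples = x.reshape(-1, 1)       # shape (5, 1)

# One sample with five features: the other reading of the same data.
as_features = x.reshape(1, -1)      # shape (1, 5)

# Hypothetical second feature, to show why the 2D layout is the general case.
x2 = np.array([10, 20, 30, 40, 50])
X_two = np.column_stack([x, x2])    # shape (5, 2): 5 samples, 2 features

print(x.shape, as_samples.shape, as_features.shape, X_two.shape)
```

The -1 in reshape tells NumPy to infer that dimension from the array's length, so the same call works for any number of samples.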