I am having trouble working with dataframe(X.inverseTransform(data_out), columns=data_out.columns)
I have been working on this linear regression case and ran into trouble when validating my work. To validate it I have to use:
sns.regplot(x=X_2["pk"], y=y_2)
scaler_2 = StandardScaler()
scaler_2.fit(df)
# type(scaler_2)
X_2 = df.drop(['prijs'], axis=1)
# print(X_2.shape)
# type(X_2)
y_2 = df['prijs']
# print(y_2.shape)
# type(y_2)
#======================
test_data = 0.30
X_train_2, X_test_2, y_train_2, y_test_2 = train_test_split(X_2,y_2, test_size=test_data, random_state=12)
# print(f"formaat X_train_2 {X_train_2.shape}")
# print(f"formaat y_train_2 {y_train_2.shape}")
# print(f"formaat X_test_2 {X_test_2.shape}")
# print(f"formaat y_test_2 {y_test_2.shape}")
# X_train_2 = None
# X_test_2 = None
# y_train_2 = None
# y_test_2 = None
model_2 = LinearRegression()
X_train_simpel = X_train_2[['pk']]
X_test_simpel = X_test_2[['pk']]
fit_2 = model_2.fit(X_train_simpel, y_train_2)
uitkomst_2 = fit_2.predict(X_train_simpel)
uitkomst_3 = fit_2.predict(X_test_simpel)
data_out = X_train_2
data_out = pd.DataFrame(scaler_2.inverse_transform(data_out),columns=data_out.columns)
data_out['groep'] = uitkomst_2
data_out.head(5)
But when I run the last two lines of code, I get this error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [136]
      1 #haal de originele ongeschaalde waardes terug
----> 2 data_out = pd.DataFrame(scaler_2.inverse_transform(data_out),columns=data_out.columns)
      3 data_out['groep'] = uitkomst_2
      4 data_out.head(5)

File C:\Python310\lib\site-packages\sklearn\preprocessing\_data.py:1035, in StandardScaler.inverse_transform(self, X, copy)
   1033 else:
   1034     if self.with_std:
-> 1035         X *= self.scale_
   1036     if self.with_mean:
   1037         X += self.mean_

ValueError: operands could not be broadcast together with shapes (11484,7) (8,) (11484,7)
Masoud commented
This answer was accepted!
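'scaler_2' was fitted on all columns of df (8 columns, including 'prijs'), but 'scaler_2.inverse_transform(data_out)' is asked to transform a DataFrame with fewer columns (7).
I mean that the 'prijs' column is dropped after 'scaler_2' has been fitted, which later causes the error at 'scaler_2.inverse_transform(data_out)'. So you must drop the 'prijs' column first and then fit scaler_2 on the remaining data.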
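The mismatch described above can be reproduced on a small scale. This is a minimal sketch on synthetic data, not the asker's dataset: the names df_demo, scaler_demo and X_demo and the columns f0..f6 are made up for illustration, and only 'prijs' comes from the question.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for df: 7 feature columns plus the target 'prijs'.
rng = np.random.default_rng(0)
cols = [f"f{i}" for i in range(7)] + ["prijs"]
df_demo = pd.DataFrame(rng.normal(size=(20, 8)), columns=cols)

# Same order of operations as in the question: fit first, drop afterwards.
scaler_demo = StandardScaler().fit(df_demo)   # learns 8 means and 8 scales
X_demo = df_demo.drop(["prijs"], axis=1)      # only 7 columns remain

n_fitted = scaler_demo.n_features_in_         # columns seen during fit
n_given = X_demo.shape[1]                     # columns passed in later
print(f"fitted on {n_fitted} columns, given {n_given}")

# Applying 8 per-column scales to 7 columns cannot work.
try:
    scaler_demo.inverse_transform(X_demo)
    raised = False
except ValueError as exc:
    raised = True
    print(type(exc).__name__, exc)
```

Comparing n_features_in_ with the number of columns of the frame you pass in is a quick way to spot this kind of problem before calling transform or inverse_transform.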
The following code should solve your problem:
...
scaler_2 = StandardScaler()
X_2 = df.drop(['prijs'], axis=1)
scaler_2.fit(X_2)
...
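For a fuller picture, here is a hedged end-to-end sketch of that fix on synthetic data. The frame df_demo and the column 'bouwjaar' are hypothetical; only 'pk' and 'prijs' appear in the question. It also adds an explicit transform step as an assumption, because the code shown in the question never scales the data, and inverse_transform only returns the original values when it is given data that was scaled by the same scaler.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical data; 'bouwjaar' is an invented extra feature column.
rng = np.random.default_rng(12)
df_demo = pd.DataFrame({
    "pk": rng.uniform(50, 300, size=50),
    "bouwjaar": rng.integers(1995, 2022, size=50).astype(float),
    "prijs": rng.uniform(1000, 50000, size=50),
})

# Drop the target first, then fit the scaler on the features only.
X_demo = df_demo.drop(["prijs"], axis=1)
scaler_demo = StandardScaler()
scaler_demo.fit(X_demo)

# Assumed step (not in the question): actually scale the features.
X_scaled = pd.DataFrame(
    scaler_demo.transform(X_demo),
    columns=X_demo.columns,
    index=X_demo.index,
)

# The inverse transform now sees the same columns the scaler was fitted on.
data_out = pd.DataFrame(
    scaler_demo.inverse_transform(X_scaled),
    columns=X_scaled.columns,
    index=X_scaled.index,
)

# Round trip check: the unscaled values match the original features.
round_trip_ok = np.allclose(data_out.to_numpy(), X_demo.to_numpy())
print(round_trip_ok)
```

Passing index= keeps the row labels of the original frame, which matters if the result is later combined with a pandas Series such as the target column.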
2 years ago