I am having trouble working with dataframe(X.inverseTransform(data_out), columns=data_out.columns)

I have been working on this linear regression case and ran into trouble when validating my work. To validate it I have to use:

sns.regplot(x=X_2["pk"], y=y_2)

scaler_2 = StandardScaler()
scaler_2.fit(df)
# type(scaler_2)

X_2 = df.drop(['prijs'], axis=1)
# print(X_2.shape)
# type(X_2)

y_2 = df['prijs']
# print(y_2.shape)
# type(y_2)

#======================
test_data = 0.30
X_train_2, X_test_2, y_train_2, y_test_2 = train_test_split(X_2,y_2, test_size=test_data, random_state=12)
# print(f"formaat X_train_2 {X_train_2.shape}")
# print(f"formaat y_train_2 {y_train_2.shape}")
# print(f"formaat X_test_2  {X_test_2.shape}")
# print(f"formaat y_test_2  {y_test_2.shape}")

# X_train_2 = None
# X_test_2 = None
# y_train_2 = None
# y_test_2 = None
model_2 = LinearRegression()
X_train_simpel = X_train_2[['pk']]
X_test_simpel = X_test_2[['pk']]
fit_2 = model_2.fit(X_train_simpel, y_train_2)
uitkomst_2 = fit_2.predict(X_train_simpel)
uitkomst_3 = fit_2.predict(X_test_simpel)

data_out = X_train_2
data_out = pd.DataFrame(scaler_2.inverse_transform(data_out),columns=data_out.columns)
data_out['groep'] = uitkomst_2

data_out.head(5)

But when I run the last two lines of code I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [136], in
      1 #haal de originele ongeschaalde waardes terug
----> 2 data_out = pd.DataFrame(scaler_2.inverse_transform(data_out),columns=data_out.columns)
      3 data_out['groep'] = uitkomst_2
      4 data_out.head(5)

File C:\Python310\lib\site-packages\sklearn\preprocessing\_data.py:1035, in StandardScaler.inverse_transform(self, X, copy)
   1033 else:
   1034     if self.with_std:
-> 1035         X *= self.scale_
   1036     if self.with_mean:
   1037         X += self.mean_

ValueError: operands could not be broadcast together with shapes (11484,7) (8,) (11484,7)

Original link: https://stackoverflow.com//questions/71451083/i-am-having-trouble-working-with-dataframex-inversetransformdata-out-columns

Answers

  • Masoud commented:

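    'scaler_2' was fitted on all columns, but 'scaler_2.inverse_transform(data_out)' is asked to transform a DataFrame with fewer columns.

    What I mean is that the 'prijs' column is dropped after 'scaler_2' has been fitted. The scaler has therefore stored 8 means and scales while 'data_out' has only 7 columns, which is the (11484,7) versus (8,) mismatch in the error raised at 'scaler_2.inverse_transform(data_out)'. So you have to drop the 'prijs' column first and then fit scaler_2 on the remaining data.

    The following code should solve your problem: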

    ...
    scaler_2 = StandardScaler()
    # Drop the target column first, then fit the scaler on the features only
    X_2 = df.drop(['prijs'], axis=1)
    scaler_2.fit(X_2)
    ...
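
The column-count mismatch and the fix can be reproduced on a small scale. The following is a minimal sketch, not the asker's actual data: the DataFrame is synthetic, the 'km' column is a hypothetical stand-in for the other features, and the names scaler_all, scaler_X, X_scaled and X_restored are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the question's df (values and the 'km' column are hypothetical).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "pk": rng.uniform(50, 300, size=100),
    "km": rng.uniform(0, 200_000, size=100),
    "prijs": rng.uniform(1_000, 50_000, size=100),
})
X = df.drop(columns=["prijs"])

# Fitted on all three columns, so it stores three means and three scales.
scaler_all = StandardScaler().fit(df)
try:
    # Two columns in, three scales stored: this is the mismatch from the question.
    scaler_all.inverse_transform(X)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# Fitted on the features only, so the column counts agree.
scaler_X = StandardScaler().fit(X)
X_scaled = scaler_X.transform(X)
X_restored = pd.DataFrame(scaler_X.inverse_transform(X_scaled), columns=X.columns)

print(mismatch_raised)
print(scaler_all.scale_.shape, scaler_X.scale_.shape)
print(np.allclose(X_restored.to_numpy(), X.to_numpy()))
```

In this sketch inverse_transform is applied to data that was first passed through transform. In the question's code X_train_2 never appears to be scaled, so even after the fix, calling inverse_transform on it would probably not return the original values; whether scaling is needed at all depends on the rest of the asker's pipeline.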
    