
Analysis and Prediction of Customer Clothing Purchases

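[Experiment Content]

Use the decision tree algorithm to analyze a dataset recording whether customers bought clothing during the "Double Eleven" (November 11) shopping festival, and predict the purchase decision.

Customer clothing purchase dataset: contains review (product review rating), discount (discount level), needed (whether the item is a necessity), shipping (whether shipping is free), and buy (whether the customer bought the item).

[Experiment Requirements]

1. Read the customer clothing purchase dataset (dataset path: data/data76088/3_buy.csv) and explore the data.

2. Using the ID3 algorithm and the CART algorithm in turn, configure, train, predict with, and evaluate a decision tree model.

3. Extension (optional): visualize the decision tree structures produced by the different algorithms.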

import pandas as pd
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import numpy as np
from sklearn import tree  # decision tree module
from jupyterthemes import jtplot
jtplot.style(theme='monokai')  # choose a plotting theme

Read the customer clothing purchase dataset

data = pd.read_csv("./datasets/3_buy.csv")
data

review	discount	needed	shipping	buy
0	3	3	0	1	1
1	3	3	0	0	1
2	2	3	0	1	0
3	1	2	0	1	0
4	1	1	1	1	0
5	1	1	1	0	1
6	2	1	1	0	0
7	3	2	0	1	1
8	3	1	1	1	0
9	1	2	1	1	0
10	3	2	1	0	0
11	2	2	0	0	1
12	2	3	1	1	0
13	1	2	0	0	1
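Requirement 1 also asks for the data to be explored. A minimal sketch of what that could look like is given here; it rebuilds the 14 rows shown above from an inline string (an assumption made so the snippet runs without the CSV file) and checks the class balance and the purchase rate for each value of needed. The names CSV_TEXT, buy_counts, and buy_rate_by_needed are illustrative only.

```python
import io

import pandas as pd

# The 14 rows printed above, inlined so the sketch runs without the CSV file.
CSV_TEXT = """review,discount,needed,shipping,buy
3,3,0,1,1
3,3,0,0,1
2,3,0,1,0
1,2,0,1,0
1,1,1,1,0
1,1,1,0,1
2,1,1,0,0
3,2,0,1,1
3,1,1,1,0
1,2,1,1,0
3,2,1,0,0
2,2,0,0,1
2,3,1,1,0
1,2,0,0,1
"""
data = pd.read_csv(io.StringIO(CSV_TEXT))

# Basic shape and missing-value check.
print("shape:", data.shape)
print("missing values:", int(data.isna().sum().sum()))

# Class balance of the label column.
buy_counts = data["buy"].value_counts().sort_index()
print(buy_counts)

# Share of rows with buy == 1 for each value of `needed`.
buy_rate_by_needed = data.groupby("needed")["buy"].mean()
print(buy_rate_by_needed)
```

On these rows the label is slightly imbalanced (8 rows with buy = 0, 6 with buy = 1), and the purchase rate differs sharply between needed = 0 and needed = 1, which hints that needed is an informative feature for the tree.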

Configure, train, predict with, and evaluate decision tree models using the ID3 and CART algorithms

Split the dataset

# The first four columns are the features; the last column (buy) is the label
x, y = data.iloc[:, :4], data.iloc[:, 4:]
# print(x)
# print(y)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.30)
print("x_train.shape:", x_train.shape)
print("y_train.shape:", y_train.shape)
print("x_test.shape:", x_test.shape)
print("y_test.shape:", y_test.shape)

x_train.shape: (9, 4)
y_train.shape: (9, 1)
x_test.shape: (5, 4)
y_test.shape: (5, 1)
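The split above is drawn at random on every run, so with only 14 rows the test set (and the accuracy reported later) can change between runs. A hedged sketch of one way to make the split repeatable and class-balanced follows. It assumes the labels from the table above, uses row ids as placeholder features, and picks random_state=0 arbitrarily; none of these choices come from the original experiment.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# buy labels copied from the table above; X is a placeholder (row ids only).
y = np.array([1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1])
X = np.arange(len(y)).reshape(-1, 1)

# random_state fixes the shuffle; stratify keeps the buy / no-buy ratio
# roughly equal in the training and test parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)

print("train size:", len(y_tr), "buy=1 in train:", int(y_tr.sum()))
print("test size:", len(y_te), "buy=1 in test:", int(y_te.sum()))
```

With test_size=0.30 on 14 rows, scikit-learn rounds the test set up to 5 rows, which matches the shapes printed above.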

Configure the models

clf_CART = tree.DecisionTreeClassifier(criterion='gini', max_depth=4)  # CART: Gini impurity
clf_ID3 = tree.DecisionTreeClassifier(criterion='entropy', max_depth=4)  # ID3-style: entropy criterion (scikit-learn still builds a binary CART tree)
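To make the two criteria concrete, the following sketch computes Gini impurity and entropy by hand for the full table (8 rows with buy = 0, 6 with buy = 1) and for a split on needed. The helper names gini, entropy, and weighted_impurity are illustrative and are not part of scikit-learn; the class counts are read off the table above.

```python
import math


def gini(counts):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def entropy(counts):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def weighted_impurity(children, impurity):
    """Size-weighted average impurity of the child nodes after a split."""
    total = sum(sum(child) for child in children)
    return sum(sum(child) / total * impurity(child) for child in children)


# Class counts as [buy=0, buy=1].
root = [8, 6]
# Split on `needed`: needed=0 -> [2, 5], needed=1 -> [6, 1].
children = [[2, 5], [6, 1]]

root_gini = gini(root)
root_entropy = entropy(root)
gini_decrease = root_gini - weighted_impurity(children, gini)
info_gain = root_entropy - weighted_impurity(children, entropy)

print(f"root Gini = {root_gini:.4f}, decrease after split = {gini_decrease:.4f}")
print(f"root entropy = {root_entropy:.4f}, information gain = {info_gain:.4f}")
```

Both criteria measure how mixed the labels in a node are; the tree chooses the split that lowers the chosen measure the most.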

Train the models

clf_CART.fit(x_train, y_train)  # train the CART model
clf_ID3.fit(x_train, y_train)  # train the ID3-style model

DecisionTreeClassifier(criterion='entropy', max_depth=4)
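For the optional visualization requirement, one lightweight option is to print each fitted tree as text with sklearn.tree.export_text. The sketch below is an illustration under assumptions: it fits on all 14 rows from the table (inlined as CSV_TEXT) rather than on the random training split, and fixes random_state=0 so that ties between equally good splits are broken the same way on each run.

```python
import io

import pandas as pd
from sklearn import tree

# The 14 rows from the table above, inlined so the sketch is self-contained.
CSV_TEXT = """review,discount,needed,shipping,buy
3,3,0,1,1
3,3,0,0,1
2,3,0,1,0
1,2,0,1,0
1,1,1,1,0
1,1,1,0,1
2,1,1,0,0
3,2,0,1,1
3,1,1,1,0
1,2,1,1,0
3,2,1,0,0
2,2,0,0,1
2,3,1,1,0
1,2,0,0,1
"""
data = pd.read_csv(io.StringIO(CSV_TEXT))
features = ["review", "discount", "needed", "shipping"]
X, y = data[features], data["buy"]

tree_texts = {}
for name, criterion in (("CART (gini)", "gini"), ("ID3-style (entropy)", "entropy")):
    clf = tree.DecisionTreeClassifier(
        criterion=criterion, max_depth=4, random_state=0).fit(X, y)
    # export_text renders the split rules as an indented text outline.
    tree_texts[name] = tree.export_text(clf, feature_names=features)
    print(name)
    print(tree_texts[name])
```

For a graphical version, tree.plot_tree(clf, feature_names=features, filled=True) followed by plt.show() draws the same structure with matplotlib.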

Model prediction

predictions_CART = clf_CART.predict(x_test)  # predict on the test set
print("predictions_CART", predictions_CART)
predictions_ID3 = clf_ID3.predict(x_test)  # predict on the test set
print("predictions_ID3", predictions_ID3)

predictions_CART [0 0 1 0 0]
predictions_ID3 [0 0 1 0 0]

Model evaluation

from sklearn.metrics import accuracy_score  # accuracy metric
print('Accuracy of CART: %s' % accuracy_score(y_test, predictions_CART))
print('Accuracy of ID3: %s' % accuracy_score(y_test, predictions_ID3))

Accuracy of CART: 0.8
Accuracy of ID3: 0.8
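An accuracy measured on five test rows moves in steps of 0.2, so a single figure such as 0.8 says little about how the two criteria compare. As a rough complement, the sketch below runs leave-one-out cross-validation over all 14 rows. It is not part of the original experiment: the inlined CSV_TEXT, the loo_accuracy dictionary, and random_state=0 are assumptions for illustration.

```python
import io

import pandas as pd
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# The 14 rows from the table above, inlined so the sketch is self-contained.
CSV_TEXT = """review,discount,needed,shipping,buy
3,3,0,1,1
3,3,0,0,1
2,3,0,1,0
1,2,0,1,0
1,1,1,1,0
1,1,1,0,1
2,1,1,0,0
3,2,0,1,1
3,1,1,1,0
1,2,1,1,0
3,2,1,0,0
2,2,0,0,1
2,3,1,1,0
1,2,0,0,1
"""
data = pd.read_csv(io.StringIO(CSV_TEXT))
X, y = data.drop(columns="buy"), data["buy"]

loo_accuracy = {}
for name, criterion in (("CART", "gini"), ("ID3-style", "entropy")):
    clf = DecisionTreeClassifier(criterion=criterion, max_depth=4, random_state=0)
    # Each of the 14 folds trains on 13 rows and tests on the one left out.
    scores = cross_val_score(clf, X, y, cv=LeaveOneOut(), scoring="accuracy")
    loo_accuracy[name] = scores.mean()
    print(f"{name}: leave-one-out accuracy = {loo_accuracy[name]:.3f}")
```

Every row serves as a test case exactly once, so the estimate uses all the data, although with 14 rows it remains a coarse figure.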
