Issue with presenting understandable predictions on a Python Keras CNN model
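Apologies in advance if this is in the wrong place or formatted incorrectly.
I've run into a problem I can't find an answer to, possibly because I'm wording my searches badly. I have built a model and it works well, reaching 91.5% accuracy across 6 classes. To summarise my problem:
The goal is to classify images of waste; the model must predict which type of waste it sees. There are 6 classes: clear and coloured plastic bottles, clear and coloured plastic bags, cans, and glass bottles. My expected result is to retrieve what the model predicts it sees across the 6 classes, e.g. 67% sure it's a coloured bottle, 21% sure it's a can, and so on.
The actual result I get is a row of 6 floats in exponential notation, which is not ideal and doesn't really show which class each value belongs to! I'm not getting any errors. Is there a problem with the way I wrote the classification code that prevents a more readable result, or am I missing something?
I'm using Google Colab as my IDE, and my model is DenseNet-201.
Thanks in advance, Jack
Here is the code I use to classify real-world collected data with the trained model. Below that is the code showing the labels assigned to the waste arrays. My problem is that I can't be sure these labels are in the same order as the floats I receive! Note also that the images are fed in through a loop from a folder on Google Drive. I tried a single image but got the same result.
Test image classification code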
# Morning Test
import cv2
import numpy as np
from keras.preprocessing import image

width = 100
height = 100
new_dimensions = (width, height)
counter = 0
print("Morning Test - Experiment 1 - Clear Bottle \n")
# Cycle through images
for x in range(0, 10):
    exp1_morning_waste = cv2.imread('/content/gdrive/My Drive/Rivers V2/Test Set/New Images/Exp 1/Morn/' + 'MorningBottleClExp1_' + str(x+1) + '.jpg')
    # Check for existence (cv2.imread returns None when the file cannot be read)
    if exp1_morning_waste is not None:
        # Count the classifications, add one
        counter += 1
        # Resize
        exp1_morning_waste = cv2.resize(exp1_morning_waste, new_dimensions)
        # Convert image to array
        exp1_morning_waste = image.img_to_array(exp1_morning_waste)
        # Add a leading batch axis
        exp1_morning_waste = np.expand_dims(exp1_morning_waste, axis=0)
        # Scale pixel values to [0, 1]
        exp1_morning_waste = exp1_morning_waste / 255
        # Predict image (model is the trained network loaded earlier)
        prediction_prob = model.predict(exp1_morning_waste)
        # Print predictions
        print(f'Probability that image is a: {prediction_prob} ')
        # Image number
        print("Waste Item No." + str(x+1) + "\n")
    # No directory or image present
    else:
        print("File not Contacted")
        break
Output >> Morning Test - Experiment 1 - Clear Bottle
Probability that image is a: [[9.9152815e-01 1.2046337e-03 1.4043533e-03 5.7380428e-03 6.7023984e-06 1.1799879e-04]]
Waste Item No.1
and so on...
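The values in this output are ordinary probabilities; NumPy prints them in scientific notation because some of them are very small. As a display-only sketch (the array below is copied from the output above and stands in for `prediction_prob`), `np.array2string` with `suppress_small=True` renders the same numbers in fixed-point form:

```python
import numpy as np

# Stand-in for the array returned by model.predict: values copied from the output above.
prediction_prob = np.array([[9.9152815e-01, 1.2046337e-03, 1.4043533e-03,
                             5.7380428e-03, 6.7023984e-06, 1.1799879e-04]])

# precision limits the number of decimals; suppress_small avoids scientific notation.
readable = np.array2string(prediction_prob, precision=4, suppress_small=True)
print(readable)
```

This changes only how the numbers are shown; it does not say which class each column belongs to.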
Original dataset labels used to train the model
import os
import random
import cv2
from keras.preprocessing.image import img_to_array

# Create dataset and label arrays
wastedata = []
labels = []
# Seed the random number generator
random.seed(42)
# Access waste images directory (one sub-folder per class)
wasteDirectory = sorted(list(os.listdir("/content/gdrive/My Drive/Rivers V2/Datasets/Waste Dataset - Pre-processing (image resizing 100x100 (Aspect Ratio + Augmentation)(V2))/")))
# Shuffle the directory
random.shuffle(wasteDirectory)
# Print directory class names
print(wasteDirectory)
# Resize and sort images in directory in the case they haven't already
for img in wasteDirectory:
    pathDir = sorted(list(os.listdir("/content/gdrive/My Drive/Rivers V2/Datasets/Waste Dataset - Pre-processing (image resizing 100x100 (Aspect Ratio + Augmentation)(V2))/" + img)))
    for i in pathDir:
        imagewaste = cv2.imread("/content/gdrive/My Drive/Rivers V2/Datasets/Waste Dataset - Pre-processing (image resizing 100x100 (Aspect Ratio + Augmentation)(V2))/" + img + '/' + i)
        imagewaste = cv2.resize(imagewaste, (100, 100))
        imagewaste = img_to_array(imagewaste)
        # Assign dataset to data array
        wastedata.append(imagewaste)
        # The class label is the name of the sub-folder
        label = img
        # Append to labels array
        labels.append(label)
Output >> ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags', 'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']
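The worry about label order can be removed by writing the class order down once in an explicit list and deriving the training targets from it, instead of relying on whatever order the shuffled directory listing produces. The question does not show how the string labels are turned into training targets, so the following is only a sketch under that assumption; `CLASS_NAMES`, `class_to_index`, and `one_hot` are hypothetical names, not part of the original code:

```python
# Fixed class order, written out once (assumption: the order printed above).
CLASS_NAMES = ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags',
               'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']

# Map each class name to its column index in the model output.
class_to_index = {name: idx for idx, name in enumerate(CLASS_NAMES)}

def one_hot(label, mapping=class_to_index):
    """Return a one-hot list whose single 1.0 sits at the label's index."""
    vec = [0.0] * len(mapping)
    vec[mapping[label]] = 1.0
    return vec

# Hypothetical string labels, as collected by the loading loop.
labels = ['Cans', 'Clear Plastic Bottle', 'Cans']
targets = [one_hot(label) for label in labels]
print(targets[0])
```

If the model is trained on targets built this way, column `i` of its output corresponds to `CLASS_NAMES[i]`, and the same list can be reused at prediction time.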
Answers
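Michael Hodel commented
The model's predictions, i.e. those floats, are the probabilities of the individual classes (e.g. a value of 6.734e-1 = 6.734 * 10 ** (-1) means a probability of 67.34%). Your prediction is then the element of your class array at the index of the largest value in your probability array; in other words, you want the class to which the model assigns the highest probability. For example: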
classes = ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags', 'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']
probabilities = [9.9152815e-01, 1.2046337e-03, 1.4043533e-03, 5.7380428e-03, 6.7023984e-06, 1.1799879e-04]
# Highest probability and the class at the same index
max_prob = max(probabilities)
pred = classes[probabilities.index(max_prob)]
print(f'Model predicts a {max_prob*100:.2f}% probability of the item on the image being "{pred}".')
Output
Model predicts a 99.15% probability of the item on the image being "Clear Plastic Bottle".
2 years ago
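To get the full breakdown the question asks for (every class with its percentage, not only the top one), the same idea extends to the 2-D array that `model.predict` returns. A minimal sketch, assuming the class order printed in the question matches the order of the model's output columns, which the question does not confirm; `describe_prediction` is a hypothetical helper name:

```python
import numpy as np

# Assumed class order (taken from the directory listing printed in the question).
CLASS_NAMES = ['Clear Plastic Bottle', 'Clear Glass Bottle', 'Clear Plastic Bags',
               'Coloured Plastic Bags', 'Cans', 'Coloured Plastic Bottle']

def describe_prediction(prediction_prob, class_names=CLASS_NAMES):
    """Return (class name, percentage) pairs for one image, highest probability first."""
    probs = np.asarray(prediction_prob, dtype=float)[0]  # drop the batch axis
    order = np.argsort(probs)[::-1]                      # indices by descending probability
    return [(class_names[i], float(probs[i]) * 100) for i in order]

# Stand-in for the model.predict(...) output: the values from the question.
prediction_prob = np.array([[9.9152815e-01, 1.2046337e-03, 1.4043533e-03,
                             5.7380428e-03, 6.7023984e-06, 1.1799879e-04]])

ranked = describe_prediction(prediction_prob)
for name, pct in ranked:
    print(f'{pct:6.2f}%  {name}')
```

Inside the prediction loop, `describe_prediction(prediction_prob)` could replace the raw `print` of the array.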