Why does inputting a tensor into a neural network fail to get an output?

I'm new to deep learning and am trying to reproduce a neural rendering program.

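FCN() is a neural renderer network that renders 10-dimensional stroke parameters into strokes on a 128×128 canvas. renderer.pkl holds the network parameters trained by the author. The input size is batchsize x 10, and here I assume batchsize == 1. In the end, five strokes are rendered onto the canvas in one pass.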
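The sizes described above can be checked without renderer.pkl. The following is a minimal sketch, not the author's code: it builds untrained layers with the same sizes as the convolutional half of the FCN class in the question and records the tensor shape after each PixelShuffle step. The weights are random, so only the shapes are meaningful. PyTorch lays tensors out as (batch, channels, height, width).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Untrained layers with the same sizes as conv1..conv6 in the question.
shuffle = nn.PixelShuffle(2)
convs = [
    nn.Conv2d(16, 32, 3, 1, 1), nn.Conv2d(32, 32, 3, 1, 1),
    nn.Conv2d(8, 16, 3, 1, 1), nn.Conv2d(16, 16, 3, 1, 1),
    nn.Conv2d(4, 8, 3, 1, 1), nn.Conv2d(8, 4, 3, 1, 1),
]

# Random stand-in for the 4096-dimensional output of fc4, batch size 1.
x = torch.rand(1, 4096).view(-1, 16, 16, 16)
shapes = [tuple(x.shape)]

with torch.no_grad():
    # Each stage is conv + ReLU, then conv + PixelShuffle(2), which trades
    # 4 channels for a 2x larger height and width.
    for first, second in zip(convs[0::2], convs[1::2]):
        x = F.relu(first(x))
        x = shuffle(second(x))
        shapes.append(tuple(x.shape))

for shape in shapes:
    print(shape)
```

The last recorded shape has a single channel at 128×128, which matches the canvas size stated in the question.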

I generate the strokes randomly. Calling print(action) shows that it is a non-zero tensor of shape [5, 10].
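The random stroke generation described above can be sketched in a few lines. This is an illustrative variant, not the code from the question: the seeded generator `rng` and the name `flat` are assumptions introduced here. It also shows where the [5, 10] shape comes from: the tensor is 1 x 5 x 10 after `unsqueeze(0)` and becomes 5 x 10 once it is flattened with `view(-1, 10)`, as `decode` does.

```python
import numpy as np
import torch

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

# 5 strokes, each with 10 parameters drawn uniformly from [0, 1).
strokes = rng.uniform(0.0, 1.0, size=(5, 10))

# Convert to float32 and add a batch dimension: 1 x 5 x 10.
action = torch.from_numpy(strokes).float().unsqueeze(0)

# Flatten to (number of strokes) x 10, the layout the renderer expects.
flat = action.view(-1, 10)

print(action.shape, flat.shape)
```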

But for some reason canvas1 is still in the same state as canvas0 (all zeros), which suggests the renderer is not working at all?

I would appreciate any help.

import cv2
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
from PIL import Image  # needed for Image.fromarray below

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class FCN(nn.Module): 
    def __init__(self):
        super(FCN, self).__init__()
        self.fc1 = (nn.Linear(10, 512))
        self.fc2 = (nn.Linear(512, 1024))
        self.fc3 = (nn.Linear(1024, 2048))
        self.fc4 = (nn.Linear(2048, 4096))
        self.conv1 = (nn.Conv2d(16, 32, 3, 1, 1))
        self.conv2 = (nn.Conv2d(32, 32, 3, 1, 1))
        self.conv3 = (nn.Conv2d(8, 16, 3, 1, 1))
        self.conv4 = (nn.Conv2d(16, 16, 3, 1, 1))
        self.conv5 = (nn.Conv2d(4, 8, 3, 1, 1))
        self.conv6 = (nn.Conv2d(8, 4, 3, 1, 1))
        self.pixel_shuffle = nn.PixelShuffle(2)

    def forward(self, x): # b x 10
        x = F.relu(self.fc1(x)) #512
        x = F.relu(self.fc2(x)) #1024
        x = F.relu(self.fc3(x)) #2048
        x = F.relu(self.fc4(x)) #4096
        x = x.view(-1, 16, 16, 16)  #reshape b x16x16x16
        x = F.relu(self.conv1(x))        # b x16x16x32
        x = self.pixel_shuffle(self.conv2(x))  # b x16x16x32 (8x2x2) -> b x32x32x8
        # PixelShuffle(r): (*, C×r^2, H, W) -> (*, C, H×r, W×r)
        x = F.relu(self.conv3(x))        # b x32x32x16
        x = self.pixel_shuffle(self.conv4(x))  # b x32x32x16 -> b x64x64x4
        x = F.relu(self.conv5(x))        # b x64x64x8
        x = self.pixel_shuffle(self.conv6(x))  # b x64x64x4 -> b x128x128x1
        x = torch.sigmoid(x)
        return 1 - x.view(-1, 128, 128)

Decoder = FCN() 
Decoder.load_state_dict(torch.load('renderer.pkl'))

def decode(x, canvas): # b * 10
    x = x.view(-1, 10)
    stroke = 1 - Decoder(x[:, :10]) 
    stroke = stroke.view(-1, 128, 128, 1) #bsz x 128 x 128 x 1
    stroke = stroke.permute(0, 3, 1, 2) # b x1x128x128
    stroke = stroke.view(-1, 5, 1, 128, 128) 
    for i in range(5):
        canvas = canvas * (1 - stroke[:, i])
    return canvas

width = 128 
canvas0 = torch.zeros([ 1, width, width], dtype=torch.uint8).to(device)
action=[]
for i in range(5): # 5 x 10
    f = np.random.uniform(0,1,10)
    action.append(f)
action = torch.tensor(action).float()
action = action.unsqueeze(0)  # 1 x 5 x 10

canvas1 = decode(action, canvas0)  # 1 x 1 x 128 x 128
canvas1 = torch.squeeze(canvas1, dim=0)
canvas1 = torch.squeeze(canvas1, dim=0) # 128 x 128
canvas1 = canvas1.detach().numpy()
Image.fromarray(np.uint8(canvas1))

print(canvas1)
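One thing that can be checked independently of the trained renderer: `decode` above updates the canvas only by multiplication, so a canvas that starts at all zeros stays at zero whatever the strokes are. The sketch below is an assumption-laden illustration, not the author's code: random tensors stand in for rendered strokes, and `composite` is a hypothetical helper that repeats the same update rule. Whether the canvas was meant to start at a different value is not stated in the question.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

# Random stand-ins for 5 rendered strokes with values in [0, 1).
strokes = torch.rand(5, 1, 128, 128)

def composite(canvas, strokes):
    # Same update rule as decode in the question: multiplication only.
    for stroke in strokes:
        canvas = canvas * (1 - stroke)
    return canvas

from_zeros = composite(torch.zeros(1, 128, 128), strokes)
from_ones = composite(torch.ones(1, 128, 128), strokes)

print(from_zeros.abs().max().item())  # stays at zero
print(from_ones.min().item(), from_ones.max().item())  # no longer constant
```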

Original link: https://stackoverflow.com//questions/71955311/why-does-inputting-a-tensor-into-a-neural-network-fail-to-get-an-output

Replies

  • DerekG commented:

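    I suspect the problem comes from the np.uint8 line. Most neural networks are parameterized so that their outputs fall in the range [0, 1], and yours is one of them (the sigmoid activation has range [0, 1]). Casting a float in this range to an integer truncates it to 0 (only exactly 1.0 survives as 1). Try multiplying by 255 first so that the output covers a non-trivial range after the conversion to integers.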

    ...
    canvas1 = decode(action, canvas0)  # 1 x 1 x 128 x 128
    canvas1 = torch.squeeze(canvas1, dim=0)
    canvas1 = torch.squeeze(canvas1, dim=0) # 128 x 128
    canvas1 = canvas1.detach().numpy()  * 255
    Image.fromarray(np.uint8(canvas1))
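The truncation described in this answer can be seen on a handful of values. This is a minimal sketch with made-up sample values in `stroke`, using only NumPy:

```python
import numpy as np

# Hypothetical renderer outputs in [0, 1].
stroke = np.array([0.0, 0.25, 0.5, 0.999, 1.0], dtype=np.float32)

# Casting directly drops the fractional part: everything below 1.0 becomes 0.
direct = np.uint8(stroke)

# Scaling to [0, 255] first keeps the gradations after the cast.
scaled = np.uint8(stroke * 255)

print(direct.tolist())  # [0, 0, 0, 0, 1]
print(scaled.tolist())  # [0, 63, 127, 254, 255]
```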
    