Why does inputting a tensor into a neural network fail to get an output?
I am new to deep learning and am trying to reproduce a neural rendering program.

FCN() is a neural renderer network that renders 10-dimensional stroke parameters into a stroke on a 128×128 canvas. renderer.pkl holds the network weights trained by the original author. The input has shape batchsize x 10; here I assume batchsize == 1. Finally, five strokes are rendered onto the canvas in one pass.

The strokes are generated randomly, and when I print(action) I can see it is a non-zero tensor of shape [5, 10].

But for some reason canvas1 is still in the same state as canvas0 (all zeros), which suggests the renderer is not doing anything at all???

Any help would be greatly appreciated.
import cv2
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
from PIL import Image  # needed for Image.fromarray below
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
class FCN(nn.Module):
    def __init__(self):
        super(FCN, self).__init__()
        self.fc1 = nn.Linear(10, 512)
        self.fc2 = nn.Linear(512, 1024)
        self.fc3 = nn.Linear(1024, 2048)
        self.fc4 = nn.Linear(2048, 4096)
        self.conv1 = nn.Conv2d(16, 32, 3, 1, 1)
        self.conv2 = nn.Conv2d(32, 32, 3, 1, 1)
        self.conv3 = nn.Conv2d(8, 16, 3, 1, 1)
        self.conv4 = nn.Conv2d(16, 16, 3, 1, 1)
        self.conv5 = nn.Conv2d(4, 8, 3, 1, 1)
        self.conv6 = nn.Conv2d(8, 4, 3, 1, 1)
        self.pixel_shuffle = nn.PixelShuffle(2)
    def forward(self, x):                      # b x 10
        x = F.relu(self.fc1(x))                # b x 512
        x = F.relu(self.fc2(x))                # b x 1024
        x = F.relu(self.fc3(x))                # b x 2048
        x = F.relu(self.fc4(x))                # b x 4096
        x = x.view(-1, 16, 16, 16)             # reshape to b x 16 x 16 x 16
        x = F.relu(self.conv1(x))              # b x 32 x 16 x 16
        x = self.pixel_shuffle(self.conv2(x))  # b x 32 x 16 x 16 -> b x 8 x 32 x 32
        # PixelShuffle(r): (*, C*r^2, H, W) -> (*, C, H*r, W*r)
        x = F.relu(self.conv3(x))              # b x 16 x 32 x 32
        x = self.pixel_shuffle(self.conv4(x))  # b x 16 x 32 x 32 -> b x 4 x 64 x 64
        x = F.relu(self.conv5(x))              # b x 8 x 64 x 64
        x = self.pixel_shuffle(self.conv6(x))  # b x 4 x 64 x 64 -> b x 1 x 128 x 128
        x = torch.sigmoid(x)                   # values in (0, 1)
        return 1 - x.view(-1, 128, 128)        # invert: strokes are dark on white
Decoder = FCN()
Decoder.load_state_dict(torch.load('renderer.pkl', map_location=device))
Decoder.to(device)
def decode(x, canvas):                        # x: b x 5 x 10 (five strokes)
    x = x.view(-1, 10)                        # (b*5) x 10
    stroke = 1 - Decoder(x[:, :10])           # (b*5) x 128 x 128
    stroke = stroke.view(-1, 128, 128, 1)     # (b*5) x 128 x 128 x 1
    stroke = stroke.permute(0, 3, 1, 2)       # (b*5) x 1 x 128 x 128
    stroke = stroke.view(-1, 5, 1, 128, 128)  # b x 5 x 1 x 128 x 128
    for i in range(5):
        # multiplicative compositing: each stroke darkens the canvas
        canvas = canvas * (1 - stroke[:, i])
    return canvas
width = 128
canvas0 = torch.zeros([1, width, width], dtype=torch.uint8).to(device)
action = []
for i in range(5):                            # 5 strokes x 10 parameters each
    f = np.random.uniform(0, 1, 10)
    action.append(f)
action = torch.tensor(np.array(action)).float()
action = action.unsqueeze(0).to(device)       # 1 x 5 x 10
canvas1 = decode(action, canvas0)             # 1 x 1 x 128 x 128
canvas1 = torch.squeeze(canvas1, dim=0)
canvas1 = torch.squeeze(canvas1, dim=0)       # 128 x 128
canvas1 = canvas1.detach().cpu().numpy()      # move to CPU before converting to numpy
Image.fromarray(np.uint8(canvas1))
print(canvas1)
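As a quick sanity check of the shape bookkeeping in forward above, here is a minimal sketch using a randomly initialized FCN (no trained weights, so the pixel values are meaningless but the shapes are real):

net = FCN()
dummy = torch.rand(5, 10)                  # 5 strokes x 10 parameters
out = net(dummy)
print(out.shape)                           # torch.Size([5, 128, 128])
print(out.min().item(), out.max().item())  # both within [0, 1] because of the sigmoid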
Replies

DerekG commented:
I suspect this problem is caused by the np.uint8 line. Most neural networks are parameterized so that their outputs fall in the range [0, 1], and yours is no exception (the sigmoid activation's range is [0, 1]). Casting any float in this range to an int truncates it to 0. Try multiplying by 255 first, so that after the conversion to integers the output spans a non-trivial range:

...
canvas1 = decode(action, canvas0)       # 1 x 1 x 128 x 128
canvas1 = torch.squeeze(canvas1, dim=0)
canvas1 = torch.squeeze(canvas1, dim=0) # 128 x 128
canvas1 = canvas1.detach().numpy() * 255
Image.fromarray(np.uint8(canvas1))
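To see the truncation DerekG describes in isolation, a minimal standalone check (independent of the renderer):

import numpy as np
x = np.array([0.2, 0.5, 0.9])  # typical sigmoid-range outputs
print(np.uint8(x))             # [0 0 0]       -> every value truncates to 0
print(np.uint8(x * 255))       # [ 51 127 229] -> usable grayscale values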