Get the right tensor size

乘风 · 2 years ago · pytorch · 186

Original title: Get the right tensor size

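I want to get the right tensor size, because the line loss = criterion(out,target) raises the following error:

Expected input batch_size (4200) to match target batch_size (64).

How can I solve this?

My output tensor has size [4200, 2] and my target tensor has size [64, 2]. The use case is image classification with two classes. My batch size is 64, and the images are 180 x 115 px grayscale. Please don't be confused: there are a few 'break' statements so that I can test the code at this early stage of development. I load four batches, so 256 images.

I load my images with this method: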

def dataPrep(list_of_data, data_path, category, quantity):
    global train_data
    target_list = []
    train_data_list = []
    
    transform = transforms.Compose([
    transforms.ToTensor(),
        ])
    
    len_data = len(train_data)
    print('Len_data: ', len_data)
    for item in list_of_data:
        f = random.choice(list_of_data)
        list_of_data.remove(f)
        print(data_path + f)
        try:
            img = Image.open(data_path +f)
        except:
            continue
        img_crop = img.crop((310,60,425,240))
        img_tensor = transform(img_crop)
        print(img_tensor.size())
        train_data_list.append(img_tensor)
        isPseudo = 0
        isTrue = 1
        if category == True:
            target = [isPseudo,isTrue]
        else:
            isPseudo =1
            isTrue = 0        
            target = [isPseudo, isTrue]
        
        target_list.append(target)
        if len(train_data_list) >=64:
            train_data.append((torch.stack(train_data_list), target_list))
            train_data_list = []
            target_list = []
            
        if (len_data*64 + quantity) <= len(train_data)*64:
            break
    print(len(train_data) *64)    
    return list_of_data

After loading the images, I create the model and the optimizer.

model = net.Netz()
optimizer = optim.SGD(model.parameters(), lr= 0.1, momentum = 0.8)

My class "Netz" looks like this:

class Netz(nn.Module):
    def __init__(self):
        super(Netz, self).__init__()
        self.conv1 = nn.Conv2d(1,10, kernel_size=5)
        self.conv2 = nn.Conv2d(10,20, kernel_size = 5)
        self.conv_dropout = nn.Dropout2d() 
        self.fc1 = nn.Linear(320,60)
        self.fc2 = nn.Linear(60,2)
    
    def forward(self,x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)
        x = self.conv2(x)
        x = self.conv_dropout(x)
        x = F.max_pool2d(x,2)
        x = F.relu(x)
        x = x.view(-1,320)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, -1)

Finally, I train my CNN:

def trainM(epoch):
    model.train()
    batch_id = 0
    for batch_id, (data, target) in enumerate(net.train_data):
        #data = data.cuda()
        #target = target.cuda()
        target = torch.Tensor(target[64*batch_id:64*(batch_id+1)])
        data = Variable(data)
        target = Variable(target)
        optimizer.zero_grad()
        out = model(data)
        criterion = F.nll_loss
        print('Size of out:', out.size())
        print('Size of target:', target.size())
        loss = criterion(out,target)
        loss.backward()
        optimizer.step()
        print('Tain Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(epoch,batch_id*len(data), len(net.train_data.dataset), 100*batch_id/len(net.train_data), loss.item()))
        batch_id += 1
        break

for item in range(0,10):
    trainM(item)
    break

Original link: https://stackoverflow.com//questions/71950692/get-the-right-tensor-size

D. ACAR commented
This answer has been accepted!

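The main problem is the line x = x.view(-1,320) in Netz. At that point you have a batch of 64 with 20 channels and a height and width of 42 x 25, so reshaping it to (-1, 320) gives you 4200 x 320.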
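
The arithmetic behind those numbers can be checked without PyTorch. The following is a minimal sketch in plain Python; the helper names `conv_out` and `netz_feature_shape` are made up for illustration, and it assumes the 180 x 115 crop from the question, unpadded convolutions with stride 1, and 2 x 2 max pooling with stride 2.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of one spatial dimension after a convolution or pooling step."""
    return (size + 2 * padding - kernel) // stride + 1


def netz_feature_shape(height, width):
    """Follow (height, width) through conv1 -> pool -> conv2 -> pool as in Netz."""
    # (kernel, stride) per step: 5x5 conv, 2x2 pool, 5x5 conv, 2x2 pool
    for kernel, stride in [(5, 1), (2, 2), (5, 1), (2, 2)]:
        height = conv_out(height, kernel, stride)
        width = conv_out(width, kernel, stride)
    return height, width


batch_size, channels = 64, 20
height, width = netz_feature_shape(180, 115)  # the crop is 180 px high, 115 px wide
per_image = channels * height * width         # features produced for one image
total = batch_size * per_image                # elements in the whole batch tensor

# view(-1, 320) keeps the element count and infers the first dimension from it
rows_after_view = total // 320

print((height, width), per_image, rows_after_view)
```

Each image yields 20 * 42 * 25 = 21000 features rather than 320, so view(-1, 320) spreads every image over several rows and the first dimension is no longer the batch size.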

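I can suggest 3 possible options that preserve the batch size:
1. (What is usually done) Pad the input to a square and update the convolutional part so that its output before the FC layer has a large number of channels and a small height and width. For example, get x.shape = (batchsize, 128, 2, 2), then make fc1 = Linear(512, 60) and do x = x.reshape(x.shape[0], -1) before it. (You can apply a 1×1 convolution before fc1.)
2. Make the number of channels at the end of the convolutions 1, i.e. get something like x.shape = (batchsize, 1, 42, 25), and then adapt fc1 accordingly.
3. Do x = x.reshape(*x.shape[:2], -1); in other words, keep the channel and batch dimensions. Add another FC layer, fc_e = Linear(20, 1) (named fce in the code below), to squeeze your channels.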
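```
import torch.nn as nn
import torch.nn.functional as F


# Shape comments assume inputs of size (batch, 1, 180, 115), as in the question.
class Netz(nn.Module):
    def __init__(self):
        super(Netz, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv_dropout = nn.Dropout2d()
        self.fc1 = nn.Linear(1050, 60)  # 1050 = 42 * 25 spatial positions per channel
        self.fc2 = nn.Linear(60, 2)
        self.fce = nn.Linear(20, 1)     # squeezes the 20 channels down to 1

    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)
        x = self.conv2(x)
        x = self.conv_dropout(x)
        x = F.max_pool2d(x, 2)
        x = F.relu(x)                                  # (batch, 20, 42, 25)
        x = x.reshape(x.shape[0], x.shape[1], -1)      # (batch, 20, 1050)
        x = F.relu(self.fc1(x))                        # (batch, 20, 60)
        x = self.fc2(x)                                # (batch, 20, 2)
        x = self.fce(x.permute(0, 2, 1)).squeeze(-1)   # (batch, 2)
        return F.log_softmax(x, -1)
```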
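Keep in mind that there is a trade-off between the amount of information you want to represent (which should be high) and the number of inputs to the linear layer (which should not be that high). In the end, it is a question of how you choose to solve the problem. The third option is the closest to your solution, but I would recommend working out a model that follows the first approach.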
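
To get a rough feel for that trade-off, one can count the weights each variant puts into its linear layers. This is only a back-of-the-envelope sketch: `linear_weights` is a hypothetical helper, biases and fc2 are ignored, and the "flatten everything" row is an assumed baseline (fc1 fed with all 20 * 42 * 25 features) that does not appear in the answer itself.

```python
def linear_weights(in_features, out_features):
    """Number of weights (without biases) in a fully connected layer."""
    return in_features * out_features


weight_counts = {
    # assumed baseline: flatten all 20 * 42 * 25 features into fc1
    "flatten everything": linear_weights(20 * 42 * 25, 60),
    # option 1: conv output of shape (batch, 128, 2, 2), fc1 = Linear(512, 60)
    "option 1": linear_weights(128 * 2 * 2, 60),
    # option 2: conv output of shape (batch, 1, 42, 25), fc1 = Linear(1050, 60)
    "option 2": linear_weights(1 * 42 * 25, 60),
    # option 3: fc1 = Linear(1050, 60) shared across channels, plus fce = Linear(20, 1)
    "option 3": linear_weights(42 * 25, 60) + linear_weights(20, 1),
}

for name, count in weight_counts.items():
    print(f"{name}: {count}")
```

Under these assumptions the baseline needs about 1.26 million weights in fc1 alone, while the three options stay between roughly 30 and 63 thousand.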
2 years ago · 0 comments
Kilian commented
From what I can see, you need a line that slices the data, just as you do for the target in your training loop:
```
target = torch.Tensor(target[64*batch_id:64*(batch_id+1)])
data   = torch.Tensor(data[64*batch_id:64*(batch_id+1)])
```
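
One caveat before slicing by batch_id, sketched here in plain Python. It assumes that every entry of train_data already holds exactly one batch of 64 items, which is how dataPrep in the question appears to build it; `target_list` below is a hypothetical stand-in for one such entry, and `slice_for` is an illustrative helper. Under that assumption the slice is only non-empty for batch_id 0. The last line also shows one way to turn the one-hot pairs into class indices, since F.nll_loss expects a target of shape (N,) holding class indices rather than one-hot rows.

```python
BATCH = 64

# Hypothetical stand-in for the target_list stored with one batch by dataPrep:
# 64 one-hot pairs [isPseudo, isTrue].
target_list = [[0, 1] if i % 2 == 0 else [1, 0] for i in range(BATCH)]


def slice_for(batch_id, items):
    """The slice used in the training loop, applied to a single batch's list."""
    return items[BATCH * batch_id: BATCH * (batch_id + 1)]


print(len(slice_for(0, target_list)))  # the first batch keeps all 64 entries
print(len(slice_for(1, target_list)))  # any later batch_id yields an empty list

# One-hot pairs -> class indices (position of the 1 in each pair)
class_indices = [pair.index(1) for pair in target_list]
print(class_indices[:4])
```

If that assumption holds, the per-batch lists can be used as they are, and the class indices can be wrapped with torch.tensor(class_indices, dtype=torch.long) before being passed to the loss.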
2 years ago · 0 comments
