PyTorch: How to make a custom DataLoader for a CNN?

I am trying to create my own DataLoader from a custom dataset for a CNN. The original DataLoader was created as follows:

train_loader = torch.utils.data.DataLoader(mnist_data, batch_size=64)

If I check the shapes of one batch from it, I get:

i1, l1 = next(iter(train_loader))
print(i1.shape)   # torch.Size([64, 1, 28, 28]) 
print(l1.shape)   # torch.Size([64]) 

When I feed this train_loader into my CNN, it works fine. However, I have a custom dataset, which I build as follows:

import numpy as np
import torch
from torchvision import datasets, transforms

mnist_data = datasets.MNIST('data', train=True, download=True, transform=transforms.ToTensor())

trainset = mnist_data
testset = mnist_data

x_train = np.array(trainset.data)
y_train = np.array(trainset.targets)

# modify x_train/y_train

Now, how can I turn x_train and y_train into a DataLoader similar to the first one? I tried the following:

train_data = []
for i in range(len(x_train)):
    train_data.append([x_train[i], y_train[i]])

train_loader = torch.utils.data.DataLoader(train_data, batch_size=64)

for i, (images, labels) in enumerate(train_loader):
    images = images.unsqueeze(1)

However, I am still missing the channel dimension (which should be 1). How can I fix this?
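
To see what the list-based loader above actually yields, a small check like the following may help. It is only a sketch: random arrays stand in for the modified x_train and y_train (an assumption, since the real modification is not shown), using the same uint8 image layout as MNIST.

```python
import numpy as np
import torch

# Stand-in data (assumption): 128 random 28x28 uint8 images and integer labels,
# mimicking the layout of trainset.data / trainset.targets.
rng = np.random.default_rng(0)
x_train = rng.integers(0, 256, size=(128, 28, 28), dtype=np.uint8)
y_train = rng.integers(0, 10, size=128, dtype=np.int64)

# Same list-of-pairs construction as in the question.
train_data = [[x_train[i], y_train[i]] for i in range(len(x_train))]
train_loader = torch.utils.data.DataLoader(train_data, batch_size=64)

images, labels = next(iter(train_loader))
print(images.shape, images.dtype)  # batch has no channel axis and stays uint8
print(labels.shape)

# unsqueeze(1) inserts the channel axis on this local batch only;
# a float conversion is typically also needed before a conv layer.
images_fixed = images.unsqueeze(1).float()
print(images_fixed.shape)
```

In this sketch the batch comes out as (64, 28, 28), so the channel axis has to be added either per batch, as in the loop above, or once when the dataset is built.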

Original link: https://stackoverflow.com//questions/71453455/pytorch-how-to-make-a-custom-dataloader-for-cnn

Replies

  • Sadra Naddaf commented:

    I don't have access to your x_train and y_train, but this might work:

    import numpy as np
    import torch
    from torch.utils.data import TensorDataset, DataLoader

    # use x_train and y_train as NumPy arrays without further modification
    x_train = np.array(trainset.data)
    y_train = np.array(trainset.targets)

    # convert the NumPy arrays to tensors; unsqueeze(1) adds the channel axis,
    # and the labels stay integer (long), which classification losses expect
    tensor_x = torch.from_numpy(x_train).float().unsqueeze(1)  # (N, 1, 28, 28)
    tensor_y = torch.from_numpy(y_train).long()                # (N,)
    # (optionally divide tensor_x by 255.0 to match the scaling of transforms.ToTensor())
    # create the dataset
    custom_dataset = TensorDataset(tensor_x, tensor_y)
    # create your dataloader
    my_dataloader = DataLoader(custom_dataset, batch_size=1)

    # check that the shapes are as desired
    i1, l1 = next(iter(my_dataloader))
    print(i1.shape)   # torch.Size([1, 1, 28, 28])
    print(l1.shape)   # torch.Size([1])
    
    2 years ago, 0 comments
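
Another common way to get the channel axis is a small Dataset subclass that reshapes each sample in `__getitem__`. The sketch below assumes that approach: `ArrayImageDataset` is a hypothetical name, the random arrays are stand-ins for the modified x_train and y_train, and dividing by 255 is an assumption meant to mimic the scaling of `transforms.ToTensor()`.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ArrayImageDataset(Dataset):
    """Hypothetical helper: wraps (N, H, W) uint8 images and integer labels."""

    def __init__(self, images, labels):
        self.images = images
        self.labels = labels

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # (H, W) uint8 -> (1, H, W) float32 scaled to [0, 1]
        image = torch.from_numpy(self.images[idx]).float().unsqueeze(0) / 255.0
        label = torch.tensor(self.labels[idx], dtype=torch.long)
        return image, label


# Stand-in data (assumption): replace with the real modified arrays.
rng = np.random.default_rng(0)
x_train = rng.integers(0, 256, size=(128, 28, 28), dtype=np.uint8)
y_train = rng.integers(0, 10, size=128, dtype=np.int64)

train_loader = DataLoader(ArrayImageDataset(x_train, y_train), batch_size=64, shuffle=True)

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([64, 1, 28, 28])
print(labels.shape)  # torch.Size([64])
```

Compared with TensorDataset, this converts samples lazily, which may be convenient if per-sample transforms are added later; for data that already fits in memory, either approach should behave the same in the training loop.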