One approach is to implement a wrapper Dataset class that applies the transforms to the output of the ImageFolder dataset. For example:
from torch.utils.data import Dataset

class WrapperDataset(Dataset):
    """Applies transforms on top of an existing (image, label) dataset."""

    def __init__(self, dataset, transform=None, target_transform=None):
        self.dataset = dataset
        self.transform = transform
        self.target_transform = target_transform

    def __getitem__(self, index):
        # Fetch the untransformed sample from the wrapped dataset.
        image, label = self.dataset[index]
        if self.transform is not None:
            image = self.transform(image)
        if self.target_transform is not None:
            label = self.target_transform(label)
        return image, label

    def __len__(self):
        return len(self.dataset)
You can then use it in your code by wrapping the larger dataset with different transforms:
import numpy as np
import torch
from sklearn.model_selection import KFold
from torch.utils.data import SubsetRandomSampler
from torchvision import datasets

# ROOT and data_transforms come from your own code.
total_set = datasets.ImageFolder(ROOT)
# Eventually I plan to run cross-validation as such:
# Note: KFold takes n_splits (not cv) for the number of folds.
splits = KFold(n_splits=5, shuffle=True, random_state=42)
for train_idx, valid_idx in splits.split(np.arange(len(total_set))):
    train_sampler = SubsetRandomSampler(train_idx)
    valid_sampler = SubsetRandomSampler(valid_idx)
    train_loader = torch.utils.data.DataLoader(
        WrapperDataset(total_set, transform=data_transforms['train_transforms']),
        batch_size=32, sampler=train_sampler)
    valid_loader = torch.utils.data.DataLoader(
        WrapperDataset(total_set, transform=data_transforms['valid_transforms']),
        batch_size=32, sampler=valid_sampler)
    # train/validate now
I haven't tested this code because I don't have your full code/model, but the concept should be clear.
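To check the idea in isolation, here is a minimal sketch that needs neither torch nor an image folder. It assumes a plain list of (value, label) pairs as a stand-in for ImageFolder, two lambdas as stand-ins for the train and validation transforms, and a hand-written index split in place of one KFold fold. All of these stand-ins are hypothetical and only illustrate that two wrappers over the same base dataset apply different transforms without touching the base data.

```python
class WrapperDataset:
    """Same wrapper as in the answer, without the torch dependency."""

    def __init__(self, dataset, transform=None, target_transform=None):
        self.dataset = dataset
        self.transform = transform
        self.target_transform = target_transform

    def __getitem__(self, index):
        image, label = self.dataset[index]
        if self.transform is not None:
            image = self.transform(image)
        if self.target_transform is not None:
            label = self.target_transform(label)
        return image, label

    def __len__(self):
        return len(self.dataset)


# Hypothetical stand-in for ImageFolder: floats play the role of images.
base = [(float(i), i % 2) for i in range(10)]

# Hypothetical stand-ins for the train and validation transforms.
train_view = WrapperDataset(base, transform=lambda x: x * 2.0)
valid_view = WrapperDataset(base, transform=lambda x: x + 100.0)

# Hand-written split standing in for one fold produced by KFold.
train_idx = [0, 1, 2, 3, 4, 5, 6, 7]
valid_idx = [8, 9]

# Each view applies its own transform; the base dataset is left unchanged.
train_samples = [train_view[i] for i in train_idx]
valid_samples = [valid_view[i] for i in valid_idx]
```

Because both wrappers share the same underlying dataset, the samplers only decide which indices each loader visits, while the wrapper decides which transform is applied to them.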