Detailed explanation of using ImageFolder in PyTorch


ImageFolder in PyTorch

Torchvision ships with common datasets already implemented, including the previously used CIFAR-10 as well as ImageNet, COCO, MNIST and LSUN. They are available as classes in torchvision.datasets, for example torchvision.datasets.CIFAR10. This article introduces one frequently used dataset class: ImageFolder.

ImageFolder assumes that the pictures are organized into folders, one folder per category. Each folder holds the pictures of a single category, and the folder name is the class name. Its constructor is as follows:

ImageFolder(root, transform=None, target_transform=None, loader=default_loader)

It has four main parameters:

root: the directory in which to look for the pictures

transform: a transformation applied to each picture. Its input is the object returned by loader, which by default is a PIL image

target_transform: a transformation applied to the label

loader: how to read a picture given its path. By default, it returns a PIL image object in RGB format

Labels are assigned by sorting the folder names and are stored in a dictionary of the form {class name: class index (starting from 0)}. Generally speaking, it is simplest to name the folders with numbers starting from 0, so that the folder name matches the label ImageFolder assigns. Note that the names are sorted as strings, so with more than ten classes the numbers need zero-padding (00, 01, ...) to keep this match. If the folders do not follow such a naming convention, check the dataset's class_to_idx attribute to see the mapping between folder names and labels.
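The sorting rule can be sketched in a few lines of plain Python. This is an illustration of the rule, not torchvision's own code, and `sketch_class_to_idx` is a hypothetical helper name. It also shows why purely numeric folder names stop matching their labels once a name such as 10 appears.

```python
import os
import tempfile


def sketch_class_to_idx(root):
    """Map each subfolder name to an index, in sorted (string) order."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: index for index, name in enumerate(classes)}


with tempfile.TemporaryDirectory() as root:
    for name in ["dog", "cat"]:
        os.mkdir(os.path.join(root, name))
    named_mapping = sketch_class_to_idx(root)

with tempfile.TemporaryDirectory() as root:
    for name in ["0", "1", "2", "10"]:
        os.mkdir(os.path.join(root, name))
    numeric_mapping = sketch_class_to_idx(root)

print(named_mapping)    # {'cat': 0, 'dog': 1}
print(numeric_mapping)  # {'0': 0, '1': 1, '10': 2, '2': 3}
```

In the second mapping the folder named 2 receives label 3, because "10" sorts before "2" as a string.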

The following example loads pictures stored under data/dogcat_2/, which contains a cat folder and a dog folder:

from torchvision import transforms as T
import matplotlib.pyplot as plt
from torchvision.datasets import ImageFolder

dataset = ImageFolder('data/dogcat_2/')

# Pictures in the cat folder get label 0; pictures in the dog folder get label 1
print(dataset.class_to_idx)

# Paths of all pictures and their corresponding labels
print(dataset.imgs)

# No transform was given, so a PIL Image object is returned
print(dataset[0][1])  # the first index selects the picture; a second index of 1 returns the label
print(dataset[0][0])  # a second index of 0 returns the picture data

Adding a transform

normalize = T.Normalize(mean=[0.4, 0.4, 0.4], std=[0.2, 0.2, 0.2])
transform = T.Compose([
    # Assumed minimal pipeline: convert the PIL image to a tensor, then normalize it
    T.ToTensor(),
    normalize,
])
dataset = ImageFolder('data/dogcat_2/', transform=transform)

# In deep learning, picture data is generally stored as C x H x W: channels x height x width
print(dataset[0][0].size())

to_img = T.ToPILImage()
# 0.2 and 0.4 are approximations of the standard deviation and mean
img = to_img(dataset[0][0] * 0.2 + 0.4)
