Image classification task

Time: 2021-05-04

This post is based on the "image scene classification challenge" released by the AI Institute on October 30, 2020. It is written from the perspective of how a complete beginner can run an image classification task in code, and it also documents my own learning process.

Image scene classification challenge: https://imgs.developpaper.com/imgs/lili

  • First, look at the format of the label file: it is a CSV whose two columns are the file name (e.g. '0.jpg') and the label (e.g. 'forest'); both columns are strings.

  • Then take a look at the format of the submitted result file: it is a CSV with no header row. The first column is the image serial number (e.g. 0), an int; the second column is the category name (e.g. 'street'), a string.
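To make the submission format concrete, a few hypothetical rows can be written with Python's csv module (the serial numbers and labels below are made up for the example; the contest file has no header):

```python
import csv

# Hypothetical predictions: (image serial number, predicted category name)
predictions = [(0, 'street'), (1, 'forest'), (2, 'sea')]

with open('submission.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    # No header row: the contest expects only data rows
    for serial, label in predictions:
        writer.writerow([serial, label])
```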

After analyzing the basic content and rules of the competition, we can start implementing it in code; this post uses the Colab platform. The implementation steps are as follows:

  • Set up Google Drive and Colab (Colab is a Jupyter notebook environment with Python preinstalled; it needs no setup and runs entirely in the cloud. See: https://www.cnblogs.com/lfri/p/10471852.html . Colab is currently not accessible from mainland China, so you may need to install software such as Ghelper to access it: http://googlehelper.net/ )
  • Upload the official dataset archive to Google Drive (upload the compressed package as-is; do not decompress first, since uploading the extracted files is very slow.)
  • Code implementation (preface work)
  1. Mount Google drive(load Google cloud disk in colab)
  2. Unzip file(decompress the dataset package file to the current running environment)
  3. Create a folder to store the trained model
  • Code implementation (formal work)
  1. Import packages (import everything to be used; one package was added along the way while writing the code)
  2. Check whether a GPU is available
  3. Read the label file (the training set's label file, in CSV format)
  4. Define the dataset classes (for both training and test sets)
  5. Preprocessing (preprocess the dataset)
  6. Instantiate the dataset classes (for both training and test sets)
  7. Initialize the pretrained model
  8. Define the training method
  9. Train (run the pretrained model through the training method)
  10. Test (run the trained model on the test set to obtain the CSV result file)


Mount Google Drive (load Google Drive in Colab)

from google.colab import drive
drive.mount('/content/drive')

Unzip files (decompress the dataset archive into the current runtime environment)

! cp -r /content/drive/My\ Drive/Scene/Image_Classification.zip ./  # copy the dataset archive from Google Drive into the current runtime environment
! unzip Image_Classification.zip  # decompress the archive; this produces the 'train' folder, 'test' folder and 'train.csv' file

Create a folder to store the trained model

! mkdir /content/drive/My\ Drive/Scene/checkpoint

Import packages (import everything to be used; one package was added along the way while writing the code)

import torch
import pandas as pd
from PIL import Image
from torchvision import transforms, models
from torch.utils.data import random_split, DataLoader
import os
import torch.nn as nn
import time
import torch.optim as optim

Check whether a GPU is available

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)

Read the label file (the training set's label file, in CSV format)

def readLabelFile():
  label_file = pd.read_csv('train.csv')
  return label_file['filename'],label_file['label']

filename,filelabel = readLabelFile()
map = ['buildings', 'street', 'forest', 'sea', 'mountain', 'glacier']  # note: this name shadows Python's built-in map
num_class = len(map)
#Convert string in label to number
for i in range(len(map)):
  filelabel[filelabel==map[i]] = i
#Convert the pandas Series to numpy arrays
filename = filename.values
filelabel = filelabel.values

Define the dataset classes (for both training and test sets)

class TrainDataset(torch.utils.data.Dataset):

  def __init__(self, root, img_list, label_list, transform = None):
    self.root = root
    self.img_list = img_list
    self.label_list = label_list
    self.transform = transform
  
  def __getitem__(self, index):
    img = Image.open(self.root + self.img_list[index]).convert('RGB')
    label = self.label_list[index]
    if self.transform:
      img = self.transform(img)
    return img,label
  
  def __len__(self):
    return len(self.img_list)


class TestDataset(torch.utils.data.Dataset):

  def __init__(self, img_path, transform = None):
    self.img_path = img_path
    self.transform = transform
  
  def __getitem__(self, index):
    img = Image.open(self.img_path[index]).convert('RGB')
    if self.transform:
      img = self.transform(img)
    return img,index

  def __len__(self):
    return len(self.img_path)

Preprocessing (preprocess the dataset)

transform = {
    'train': transforms.Compose([
          transforms.Resize((224, 224),interpolation=2),
          transforms.RandomHorizontalFlip(p=0.5),
          transforms.ToTensor(),
          transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
    ]),
    'val': transforms.Compose([
          
    ])
}

Instantiate the dataset classes (for both training and test sets)

train_dataset = TrainDataset('./train/', filename, filelabel, transform['train'])
tra_dataset, val_dataset = random_split(train_dataset, [10000, 3627])
test_dataset = TestDataset([x.path for x in os.scandir('./test/')], transform['train'])

tra_loader = DataLoader(tra_dataset, batch_size=64, shuffle=True, num_workers=2)
val_loader = DataLoader(val_dataset, batch_size=64, shuffle=False, num_workers=2)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False, num_workers=2)

tra_dataset_num = len(tra_dataset)

Initialize the pretrained model

def initializeModel(model_name, num_class, finetuning=False, pretrained=True):

  if model_name == 'alexnet':
    model = models.alexnet(pretrained=pretrained)
  elif model_name == 'vgg11':
    model = models.vgg11(pretrained=pretrained)
  elif model_name == 'vgg11_bn':
    model = models.vgg11_bn(pretrained=pretrained)
  elif model_name == 'vgg13':
    model = models.vgg13(pretrained=pretrained)
  elif model_name == 'vgg13_bn':
    model = models.vgg13_bn(pretrained=pretrained)
  elif model_name == 'vgg16':
    model = models.vgg16(pretrained=pretrained)
  elif model_name == 'vgg16_bn':
    model = models.vgg16_bn(pretrained=pretrained)
  elif model_name == 'vgg19':
    model = models.vgg19(pretrained=pretrained)
  elif model_name == 'vgg19_bn':
    model = models.vgg19_bn(pretrained=pretrained)
  elif model_name == 'resnet18':
    model = models.resnet18(pretrained=pretrained)
  elif model_name == 'resnet34':
    model = models.resnet34(pretrained=pretrained)
  elif model_name == 'resnet50':
    model = models.resnet50(pretrained=pretrained)
  elif model_name == 'resnet101':
    model = models.resnet101(pretrained=pretrained)
  elif model_name == 'resnet152':
    model = models.resnet152(pretrained=pretrained)
  elif model_name == 'squeezenet1_0':
    model = models.squeezenet1_0(pretrained=pretrained)
  elif model_name == 'squeezenet1_1':
    model = models.squeezenet1_1(pretrained=pretrained)
  elif model_name == 'densenet121':
    model = models.densenet121(pretrained=pretrained)
  elif model_name == 'densenet169':
    model = models.densenet169(pretrained=pretrained)
  elif model_name == 'densenet161':
    model = models.densenet161(pretrained=pretrained)
  elif model_name == 'densenet201':
    model = models.densenet201(pretrained=pretrained)
  elif model_name == 'inception_v3':
    model = models.inception_v3(pretrained=pretrained)
  elif model_name == 'googlenet':
    model = models.googlenet(pretrained=pretrained)
  elif model_name == 'shufflenet_v2_x0_5':
    model = models.shufflenet_v2_x0_5(pretrained=pretrained)
  elif model_name == 'shufflenet_v2_x1_0':
    model = models.shufflenet_v2_x1_0(pretrained=pretrained)
  elif model_name == 'shufflenet_v2_x1_5':
    model = models.shufflenet_v2_x1_5(pretrained=pretrained)
  elif model_name == 'shufflenet_v2_x2_0':
    model = models.shufflenet_v2_x2_0(pretrained=pretrained)
  elif model_name == 'mobilenet_v2':
    model = models.mobilenet_v2(pretrained=pretrained)
  elif model_name == 'resnext50_32x4d':
    model = models.resnext50_32x4d(pretrained=pretrained)
  elif model_name == 'resnext101_32x8d':
    model = models.resnext101_32x8d(pretrained=pretrained)
  elif model_name == 'wide_resnet50_2':
    model = models.wide_resnet50_2(pretrained=pretrained)
  elif model_name == 'wide_resnet101_2':
    model = models.wide_resnet101_2(pretrained=pretrained)
  elif model_name == 'mnasnet0_5':
    model = models.mnasnet0_5(pretrained=pretrained)
  elif model_name == 'mnasnet0_75':
    model = models.mnasnet0_75(pretrained=pretrained)
  elif model_name == 'mnasnet1_0':
    model = models.mnasnet1_0(pretrained=pretrained)
  elif model_name == 'mnasnet1_3':
    model = models.mnasnet1_3(pretrained=pretrained)
  else:
    raise ValueError('No such Model %s' % model_name)

  if finetuning:
    for param in model.parameters():
      param.requires_grad = True
  else:
    for param in model.parameters():
      param.requires_grad = False

  fc_features = model.fc.in_features  # input feature count of the final fc layer (assumes the model has a .fc attribute, e.g. the ResNet family)
  model.fc = nn.Linear(fc_features, num_class)  # replace the final fc layer so its output size matches the dataset's number of classes
  model = model.to(device)  # move the model to the chosen device (GPU if available)
  return model

Define the training method

def traWay(model, criterion, optimizer, epochs):
  begin_time = time.time()
  once_begin_time = begin_time
  for epoch in range(epochs):
    print('Epoch {}/{}'.format(epoch+1, epochs))
    print('-' * 10)

    running_loss = 0.0
    running_corrects = 0.0

    #Traversing data sets
    for img, labels in tra_loader:
      img = img.to(device)
      labels = labels.to(device)

      optimizer.zero_grad()  # reset gradients to zero
      outputs = model(img)  # forward pass to get predictions
      preds = torch.argmax(outputs, dim=1)
      loss = criterion(outputs, labels)
      loss.backward()
      optimizer.step()  # update parameters

      running_loss += loss.item() * img.size(0)
      running_corrects += torch.sum(preds == labels.data)
    
    epoch_loss = running_loss/tra_dataset_num
    epoch_acc = running_corrects.item() / tra_dataset_num

    print('Loss: {:.4f} Acc: {:.4f}'.format(epoch_loss, epoch_acc))
    print('Training Time per Epoch {}'.format(time.time() - once_begin_time))
    once_begin_time = time.time()
  
  end_time = time.time() - begin_time
  print('Training complete in {:.0f}m {:.0f}s'.format(end_time // 60, end_time % 60))
  return model
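Note that val_loader is created above but never used. A sketch of a validation pass that could be called after each epoch (assuming the same model, criterion, val_loader and device defined above):

```python
import torch

def validate(model, criterion, val_loader, device):
    model.eval()  # disable dropout / use running batch-norm statistics
    val_loss, val_corrects, n = 0.0, 0, 0
    with torch.no_grad():  # gradients are not needed for evaluation
        for img, labels in val_loader:
            img, labels = img.to(device), labels.to(device)
            outputs = model(img)
            preds = torch.argmax(outputs, dim=1)
            val_loss += criterion(outputs, labels).item() * img.size(0)
            val_corrects += torch.sum(preds == labels).item()
            n += img.size(0)
    model.train()  # restore training mode
    print('Val Loss: {:.4f} Val Acc: {:.4f}'.format(val_loss / n, val_corrects / n))
    return val_loss / n, val_corrects / n
```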

Train

model = initializeModel('resnet152', num_class, True)
optimizer = optim.Adam(model.fc.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

pre_check_name = r'/content/drive/My Drive/Scene/checkpoint/152_state_best.tar'

if '152_state_best.tar' in os.listdir(r'/content/drive/My Drive/Scene/checkpoint'):
  print('loading previous state......')
  checkpoint = torch.load(pre_check_name)
  model.load_state_dict(checkpoint['model_state_dict'])
  optimizer.load_state_dict(checkpoint['optimizer_state_dict'])

model = traWay(model, criterion, optimizer, 1)
check_name = r'/content/drive/My Drive/Scene/checkpoint/152_state_best.tar'

torch.save({
    'model_state_dict':model.state_dict(),
    'optimizer_state_dict':optimizer.state_dict()
},check_name)

Test

model = initializeModel('resnet152', num_class, False)
check_name = r'/content/drive/My Drive/Scene/checkpoint/152_state_best.tar'
checkpoint = torch.load(check_name)
model.load_state_dict(checkpoint['model_state_dict'])

model.eval()  # switch to evaluation mode (affects dropout / batch norm)

with open('./result.csv', mode='w') as result_file:
  with torch.no_grad():  # gradients are not needed at test time
    for img, index in test_loader:
      img = img.to(device)

      outputs = model(img)
      preds = torch.argmax(outputs, dim=1)

      for i in range(index.shape[0]):
        # recover the image serial number from the file name, since os.scandir order is arbitrary
        serial = os.path.splitext(os.path.basename(test_dataset.img_path[index[i].item()]))[0]
        print(serial + ',' + map[preds[i].item()])
        result_file.write(serial + ',' + map[preds[i].item()] + '\n')