Convolution with the Sobel Operator in Python: A Detailed Explanation

Time:2021-3-3

PyTorch provides two ways to perform convolution in Python: the torch.nn.Conv2d() module and the torch.nn.functional.conv2d() function. Both perform the same convolution operation and have the same input requirements. The input must be a tensor (in older PyTorch versions, a torch.autograd.Variable) of size (batch, channel, H, W), where batch is the number of samples in the input batch and channel is the number of input channels.
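As a minimal sketch of that equivalence, the snippet below runs the same random input through both forms, using the same weight. The tensor sizes and the names x, conv_layer, out_module and out_functional are chosen only for illustration.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

# Dummy input of size (batch, channel, height, width): one 1-channel 5x5 image.
x = torch.randn(1, 1, 5, 5)

# Module form: the layer owns its weight as a learnable parameter.
conv_layer = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, bias=False)

# Functional form: the weight is passed explicitly on every call.
with torch.no_grad():
    out_module = conv_layer(x)
    out_functional = F.conv2d(x, conv_layer.weight)

# Both calls use the same weight, so the results should agree.
same = torch.allclose(out_module, out_functional)
print(out_module.shape, same)
```

The module form is convenient when the kernel is a trainable part of a network; the functional form is handy when the kernel is fixed, as with a hand-written edge-detection kernel.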

A color image typically has 3 channels and a grayscale image has 1. Inside a convolutional network, however, the number of channels is usually much larger, ranging from dozens to hundreds. H and W denote the height and width of the input image. For example, if a batch contains 32 images, each with three channels, a height of 50 and a width of 100, then the input size is (32, 3, 50, 100).
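The shape example can be checked directly. The sketch below builds a zero-filled batch of that size and passes it through a convolution. The weight, with 16 output channels and a 3x3 kernel, is a hypothetical choice made only to show how the shape changes.

```python
import torch
import torch.nn.functional as F

# A batch of 32 three-channel images, each 50 pixels high and 100 pixels wide.
batch = torch.zeros(32, 3, 50, 100)

# Hypothetical weight: 16 output channels, 3 input channels, 3x3 kernel.
weight = torch.zeros(16, 3, 3, 3)

# No padding and stride 1, so each spatial dimension shrinks by kernel_size - 1.
out = F.conv2d(batch, weight)

print(batch.shape)  # torch.Size([32, 3, 50, 100])
print(out.shape)    # torch.Size([32, 16, 48, 98])
```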

The following code uses convolution to implement an edge detection operator:

import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image
from torch import nn


def nn_conv2d(im):
  # Define the convolution with nn.Conv2d: 1 input channel, 1 output channel, 3x3 kernel, no bias
  conv_op = nn.Conv2d(1, 1, 3, bias=False)
  # Define the edge-detection kernel. The article calls it the Sobel operator;
  # strictly speaking, these weights form a Laplacian-style kernel.
  sobel_kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype='float32')
  # Reshape the kernel to (out_channels, in_channels, kH, kW), as the convolution expects
  sobel_kernel = sobel_kernel.reshape((1, 1, 3, 3))
  # Assign the kernel to the layer's weight
  conv_op.weight.data = torch.from_numpy(sobel_kernel)
  # Convolve the image (the Variable wrapper has not been needed since PyTorch 0.4)
  edge_detect = conv_op(im)
  # Drop the batch and channel dimensions and convert the result to a NumPy array
  edge_detect = edge_detect.squeeze().detach().numpy()
  return edge_detect
 
def functional_conv2d(im):
  # Same kernel as in nn_conv2d, passed directly to the functional API
  sobel_kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype='float32')
  sobel_kernel = sobel_kernel.reshape((1, 1, 3, 3))
  weight = torch.from_numpy(sobel_kernel)
  edge_detect = F.conv2d(im, weight)
  edge_detect = edge_detect.squeeze().detach().numpy()
  return edge_detect
 
def main():
  # Read the image and convert it to grayscale
  im = Image.open('./cat.jpg').convert('L')
  # Convert the image to a float32 matrix of shape (H, W)
  im = np.array(im, dtype='float32')
  # Reshape to (1, 1, H, W) and convert to a torch tensor, the layout the convolution expects
  im = torch.from_numpy(im.reshape((1, 1, im.shape[0], im.shape[1])))
  # Run edge detection (both implementations apply the same kernel)
  # edge_detect = nn_conv2d(im)
  edge_detect = functional_conv2d(im)
  # Convert the float array back to an image
  im = Image.fromarray(edge_detect)
  # Convert to 8-bit grayscale mode so it can be saved as a JPEG
  im = im.convert('L')
  # Save the result
  im.save('edge.jpg', quality=95)
 
if __name__ == "__main__":
  main()

Original picture: cat.jpg

Result picture: edge.jpg
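The kernel in the listing above weights all eight neighbours by -1 and the centre by 8, which is a Laplacian-style kernel. The classic Sobel operator instead uses a pair of 3x3 kernels, one per gradient direction. As a rough sketch of how that variant could be written with F.conv2d, the code below stacks the two kernels into one weight and combines the responses into a gradient magnitude. The helper name sobel_magnitude and the synthetic test image are assumptions made for illustration and are not part of the original code.

```python
import torch
import torch.nn.functional as F

# Classic 3x3 Sobel kernels for the horizontal and vertical gradients.
sobel_x = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]])
sobel_y = sobel_x.t()

# Stack into one weight of size (out_channels=2, in_channels=1, 3, 3).
weight = torch.stack([sobel_x, sobel_y]).unsqueeze(1)


def sobel_magnitude(im):
    """Hypothetical helper: im is a float tensor of size (batch, 1, H, W)."""
    grads = F.conv2d(im, weight)  # (batch, 2, H-2, W-2): x and y responses
    # Gradient magnitude: sqrt(gx^2 + gy^2), reduced over the channel dimension.
    return grads.pow(2).sum(dim=1, keepdim=True).sqrt()


# Synthetic image: left half dark, right half bright, i.e. one vertical edge.
im = torch.zeros(1, 1, 6, 6)
im[..., 3:] = 1.0

mag = sobel_magnitude(im)
print(mag[0, 0])
```

On this synthetic image the response is nonzero only in the output columns whose 3x3 window straddles the edge. The same function could be applied to the (1, 1, H, W) tensor built in main().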
