Convolution has two implementations in PyTorch: the module torch.nn.Conv2d() and the function torch.nn.functional.conv2d(). Both perform the same convolution operation and place the same requirements on the input. The input must be a tensor (in older PyTorch versions, one wrapped in torch.autograd.Variable()) of size (batch, channel, H, W), where batch is the number of samples in the input batch and channel is the number of input channels.
Generally, a color image has 3 channels and a grayscale image has 1. Inside a convolutional network, however, the number of channels is much larger, typically from dozens to hundreds. H and W denote the height and width of the input image. For example, if a batch has 32 images, each image has three channels, and the height and width are 50 and 100 respectively, then the input size is (32, 3, 50, 100).
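As a quick check of the input layout described above, the following minimal sketch passes a random batch of size (32, 3, 50, 100) through both interfaces. The choice of 8 output channels and the variable names are assumptions made for illustration only; they do not come from the original article.

```python
import torch
from torch import nn
import torch.nn.functional as F

# Hypothetical batch: 32 three-channel images, 50 pixels high and 100 pixels wide.
x = torch.randn(32, 3, 50, 100)

# Module form: the layer owns its weights (8 output channels, 3x3 kernel, no padding).
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, bias=False)
y_module = conv(x)

# Functional form: the same weights are passed in explicitly.
# The weight layout is (out_channels, in_channels, kernel_h, kernel_w).
y_functional = F.conv2d(x, conv.weight)

print(y_module.shape)                          # torch.Size([32, 8, 48, 98])
print(torch.allclose(y_module, y_functional))  # True
```

Without padding, a 3x3 kernel shrinks each spatial dimension by 2, which is why the output is 48 by 98. The module form is convenient when the weights are to be learned; the functional form suits fixed, hand-written kernels such as an edge detector.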
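The following code implements an edge detection operator as a convolution, once with nn.Conv2d and once with F.conv2d. Note that the 3x3 kernel it uses (8 at the center, -1 elsewhere) is strictly speaking a Laplacian-style kernel rather than the classical Sobel operator, although the code names it sobel_kernel: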
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image
from torch import nn
from torch.autograd import Variable  # optional since PyTorch 0.4; plain tensors also work


def nn_conv2d(im):
    # Use nn.Conv2d to define the convolution: 1 input channel, 1 output channel, 3x3 kernel
    conv_op = nn.Conv2d(1, 1, 3, bias=False)
    # Define the edge detection kernel
    sobel_kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype='float32')
    # Reshape the kernel to (out_channels, in_channels, H, W), as the layer expects
    sobel_kernel = sobel_kernel.reshape((1, 1, 3, 3))
    # Assign the kernel to the weights of the convolution layer
    conv_op.weight.data = torch.from_numpy(sobel_kernel)
    # Convolve the image
    edge_detect = conv_op(Variable(im))
    # Drop the batch and channel dimensions and convert the result to a NumPy array
    edge_detect = edge_detect.squeeze().detach().numpy()
    return edge_detect


def functional_conv2d(im):
    sobel_kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype='float32')
    # F.conv2d also needs a 4-D weight: (out_channels, in_channels, H, W)
    sobel_kernel = sobel_kernel.reshape((1, 1, 3, 3))
    weight = Variable(torch.from_numpy(sobel_kernel))
    edge_detect = F.conv2d(Variable(im), weight)
    edge_detect = edge_detect.squeeze().detach().numpy()
    return edge_detect


def main():
    # Read in a picture and convert it to grayscale
    im = Image.open('./cat.jpg').convert('L')
    # Convert the image data to a float32 matrix
    im = np.array(im, dtype='float32')
    # Reshape the matrix to (batch, channel, H, W) and turn it into a torch tensor
    im = torch.from_numpy(im.reshape((1, 1, im.shape[0], im.shape[1])))
    # Edge detection; either implementation can be used
    # edge_detect = nn_conv2d(im)
    edge_detect = functional_conv2d(im)
    # Convert the array data back to an image
    im = Image.fromarray(edge_detect)
    # Convert the image to 8-bit grayscale mode
    im = im.convert('L')
    # Save the picture
    im.save('edge.jpg', quality=95)


if __name__ == "__main__":
    main()
Original picture: cat.jpg
Results picture: edge.jpg
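For comparison, the classical Sobel operator uses a pair of 3x3 kernels, one for the horizontal gradient and one for the vertical gradient, and combines the two responses into a gradient magnitude. The sketch below is not taken from the original article; it applies both kernels in a single F.conv2d call to a small synthetic image (an assumption chosen so that the result is easy to verify by hand) instead of cat.jpg.

```python
import torch
import torch.nn.functional as F

# Standard 3x3 Sobel kernels for the horizontal and vertical gradients.
sobel_x = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]])
sobel_y = sobel_x.t()

# Stack into a weight of shape (out_channels=2, in_channels=1, 3, 3).
weight = torch.stack([sobel_x, sobel_y]).unsqueeze(1)

# Synthetic 6x6 grayscale image: dark left half, bright right half.
im = torch.zeros(1, 1, 6, 6)
im[..., 3:] = 1.0

grad = F.conv2d(im, weight)                # shape (1, 2, 4, 4)
gx, gy = grad[:, 0], grad[:, 1]
magnitude = torch.sqrt(gx ** 2 + gy ** 2)  # gradient magnitude per pixel

print(magnitude[0])
# Every row is [0., 4., 4., 0.]: the response peaks on the vertical edge.
```

Because the synthetic image only changes from left to right, the vertical-gradient channel is zero everywhere and the magnitude comes entirely from the horizontal kernel.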