페이지

2022년 5월 17일 화요일

1.4.1 Computer Vision

 Image classification is a common classification problem. The input of the neural network is pictures, and the output value is the probability that the current sample belongs to each category. Generally, the category with the highest probability is selected as the predicted category of the sample.

Image recognition is one of the earliest successful applications of deep learning. Classic neural network models include VGG series, Inception series, and ResNet series.

Object detection refers to the automatic detection of the approximate locationof common objects in a picture by an algorithm. It is usually represented by a bounding box and classifies the category information of objects in the bounding box, as shown in Figure 1-15. Common object detection algorithms are RCNN, Fast RCNN, Faster RCNN, Mask RCNN, SDD, and YOLO eries.

Semantic segmentation is an algorithm to automatically segment and identify the content in a picture. We can understand semantic segmentation as the classification of each pixel and analyze the category information of each pixel, as shown in Figure 1-16. Common semantic segmentationi models include FCN, U-net, SegNet, and DeepLab series.

Video Understanding. As Deep learning achieves better result on 2D picture-related tasks, 3D video understanding tasks with temporal dimention information (the third dimention is sequence of frames) are receiving more and more attention. Common video understanding tasks include video classification, behavior detection, and video subject extraction. Common models are C3D, TSN, DOVF, and TS_LSTM.

Image generation learns the distribution of real pictures and samples from the learned distribution to obtain highly realistic generated pictures. At present, common image generation models include VAE series and GAN series. Among them, the GAN series of algorithms have made great progress in recent years. The picture effect produced by the latest GAN model has reached a level where it is difficult to distingush the authenticity with the naked eye, as shown in Figure 1-17.

In addition to the preceding applications, deep learning has also achieved significant results in other areas, such as artistic style transfer(Figure 1-18), super-resolution, picture de-nosing/hazing, grayscale picture coloring and many others.


댓글 없음: