Pixel-Level Matching for Video Object Segmentation using Convolutional Neural Networks
¹UMN, Minneapolis, MN ²KAIST, South Korea
We propose a novel video object segmentation algorithm based on pixel-level matching using Convolutional Neural Networks (CNNs). Our network aims to distinguish the target area from the background on the basis of the pixel-level similarity between two object units. The proposed network represents a target object using features from different depth layers in order to take advantage of both the spatial details and the category-level semantic information. Furthermore, we propose a feature compression technique that drastically reduces the memory requirements while maintaining the capability of feature representation. Two-stage training (pre-training and fine-tuning) allows our network to handle any target object regardless of its category (even if the object’s type does not belong to the pre-training data) or of variations in its appearance through a video sequence. Experiments on large datasets demonstrate the effectiveness of our model, compared with related methods, in terms of accuracy, speed, and stability. Finally, we demonstrate the transferability of our network to different domains, such as the infrared data domain.
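To make the idea of pixel-level matching concrete, the sketch below scores every pixel of a query frame by its similarity to a set of target-object features. It is only an illustration under stated assumptions: the paper's network learns the matching function, whereas this example substitutes plain cosine similarity, and the names `pixel_similarity`, `query_feat`, and `target_feat` are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def pixel_similarity(query_feat, target_feat, eps=1e-8):
    """Score each query pixel by its best cosine match to the target.

    query_feat:  (H, W, C) per-pixel features of the frame to segment
    target_feat: (N, C) features sampled from pixels of the known target
    Returns an (H, W) map; higher values mean "more like the target".

    Assumption: cosine similarity stands in for the learned matching
    described in the abstract.
    """
    h, w, c = query_feat.shape
    q = query_feat.reshape(-1, c)
    # L2-normalise both sets so the dot product is a cosine similarity.
    q = q / (np.linalg.norm(q, axis=1, keepdims=True) + eps)
    t = target_feat / (np.linalg.norm(target_feat, axis=1, keepdims=True) + eps)
    # For every query pixel, keep the best match over all target pixels.
    sim = (q @ t.T).max(axis=1)
    return sim.reshape(h, w)

# Toy example: a 2x2 "frame" with 3-channel features and one target feature.
query = np.array([[[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                  [[1.0, 1.0, 0.0], [0.0, 0.0, 3.0]]])
target = np.array([[1.0, 0.0, 0.0]])
sim = pixel_similarity(query, target)
mask = sim > 0.5  # hypothetical threshold separating target from background
```

In practice the per-pixel features would come from several CNN layers of different depth, so that both spatial detail and semantic information contribute to the score.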
Pixel-level Matching for Video Object Segmentation using Convolutional Neural Networks
Jae Shin Yoon, François Rameau, Junsik Kim, Seokju Lee, Seunghak Shin, and In So Kweon
IEEE International Conference on Computer Vision (ICCV). 2017.
Personal Home page : Click-here
Email : jsyoon4325 [at] gmail [dot] com, jsyoon [at] umn [dot] edu