Xu Jiayu, Zhang Dongming, Jin Guoqing, Bao Xiuguo, Yuan Qingsheng, Zhang Yongdong. PNET: Pixel-wise TV Logo Recognition Network[J]. Journal of Computer-Aided Design & Computer Graphics, 2018, 30(10): 1878-1889. DOI: 10.3724/SP.J.1089.2018.16944

PNET: Pixel-wise TV Logo Recognition Network

  • Abstract: TV logo recognition is a typical problem of recognizing small, subtle objects. Logo regions are small and carry little information, and hollow or translucent logos are easily affected by the background of the video frame. To address these difficulties, this paper proposes PNET, a pixel-wise TV logo recognition network based on an end-to-end fully convolutional network. First, a pixel-wise annotated TV logo dataset is constructed: a TV logo image set is obtained by extracting and preprocessing video frames, and a binary label image set is obtained with a per-image, pixel-wise, semi-automatic annotation method. Then, starting from a typical classification network (AlexNet or VGG), fine-tuning converts the parameters learned for the classification task into the parameters that the pixel-wise recognition network needs for the logo segmentation task. Finally, a skip architecture is introduced to fuse global information from the deep layers with local information from the shallow layers. Experimental results show that PNET achieves accurate pixel-wise segmentation, with an accuracy of up to 98.3% and an inference time of no more than 1.5 s per image on an NVIDIA Tesla K80.
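The skip-architecture idea in the abstract (combining coarse, global scores from deep layers with fine, local scores from shallow layers) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the 2x2 and 4x4 score maps, the nearest-neighbour upsampling, the element-wise sum, and the zero threshold are all assumptions chosen only to make the idea concrete in plain Python.

```python
# Toy illustration of skip-style fusion. Every name, shape, and value here
# is hypothetical; the paper's actual layers and upsampling are not shown.

def upsample_nearest(score_map, factor):
    """Nearest-neighbour upsampling: repeat each value `factor` times per axis."""
    out = []
    for row in score_map:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def fuse(deep, shallow, factor):
    """Upsample the coarse deep map and add it element-wise to the shallow map."""
    up = upsample_nearest(deep, factor)
    return [[a + b for a, b in zip(r_up, r_sh)] for r_up, r_sh in zip(up, shallow)]

def to_mask(scores, threshold=0.0):
    """Binary label image: 1 where the fused logo score exceeds the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in scores]

# Coarse map (assumed): the top-left quadrant looks like a logo.
deep = [[1.0, -1.0],
        [-1.0, -1.0]]

# Fine map (assumed): local evidence that one pixel inside that quadrant
# is background, as with a hollow logo.
shallow = [[0.5, 0.5, 0.0, 0.0],
           [0.5, -2.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0]]

mask = to_mask(fuse(deep, shallow, factor=2))
for row in mask:
    print(row)
```

In this toy case the deep map alone would label the whole top-left quadrant as logo, while the fused result removes the single pixel that the shallow map marks as background. A real network would typically learn the upsampling and fusion weights rather than use fixed repetition and a plain sum.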

     
