Residual Enhanced Image Descriptor

魏本昌; 郑丽; 管涛

doi:10.3724/SP.J.1089.2019.17134
Abstract

Enlarging the visual codebook improves the accuracy of the vector of locally aggregated descriptors (VLAD), a global image descriptor, but also increases its storage cost. To resolve this trade-off between search quality and memory usage, a residual-enhanced global image descriptor called EVLAD is proposed, built on a hierarchical visual codebook with a two-layer structure. In the offline codebook generation stage, the first-layer visual codebook is learned with K-means in the local descriptor space, and the visual sub-codebooks of the second layer are then generated non-uniformly according to a quantization residual minimization criterion. In the online generation stage, residual generation and residual accumulation are tied to different layers: for each local descriptor, the residual is obtained by subtracting its nearest visual word in the fine-grained second layer, and this residual is summed into the sub-vector associated with a first-layer visual word. EVLAD is the concatenation of all sub-vectors. To suppress the burstiness phenomenon in feature space, L2 normalization is applied to each sub-vector and to the final concatenated vector. Experimental results show that EVLAD is more accurate than VLAD and various other improved methods.
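The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions: the function names (`kmeans`, `nearest`, `train_codebooks`, `evlad`) and the `sub_sizes` parameter are hypothetical; each sub-codebook is trained with plain K-means on the descriptors of its first-layer cell; the nearest second-layer word is searched only inside the sub-codebook of the nearest first-layer word; and the rule that decides how many words each sub-codebook receives is not reproduced, so the sizes are passed in by the caller.

```python
import numpy as np


def nearest(x, centers):
    """Index of the closest centroid (Euclidean) for each row of x."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns a (k, d) array of centroids."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = nearest(x, centers)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centroid if a cell goes empty
                centers[j] = members.mean(axis=0)
    return centers


def l2_normalize(v, eps=1e-12):
    """Scale v to unit L2 norm; all-zero vectors are returned unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > eps else v


def train_codebooks(train, k1, sub_sizes):
    """Offline stage: first-layer codebook plus one sub-codebook per cell.

    sub_sizes[i] is the number of second-layer words for cell i. It is a
    hypothetical input: the non-uniform allocation rule is NOT reproduced.
    """
    train = np.asarray(train, dtype=float)
    layer1 = kmeans(train, k1)
    labels = nearest(train, layer1)
    layer2 = []
    for i in range(k1):
        cell = train[labels == i]
        k2 = min(sub_sizes[i], len(cell))
        # Assumption: an empty cell falls back to its first-layer word.
        layer2.append(kmeans(cell, k2) if k2 > 0 else layer1[i:i + 1].copy())
    return layer1, layer2


def evlad(descs, layer1, layer2):
    """Online stage: aggregate second-layer residuals per first-layer word."""
    descs = np.asarray(descs, dtype=float)
    k1, d = layer1.shape
    sub = np.zeros((k1, d))
    labels = nearest(descs, layer1)
    for i in range(k1):
        cell = descs[labels == i]
        if len(cell) == 0:
            continue
        fine = layer2[i][nearest(cell, layer2[i])]  # nearest second-layer words
        sub[i] = l2_normalize((cell - fine).sum(axis=0))  # per-sub-vector L2
    return l2_normalize(sub.reshape(-1))  # L2 over the concatenation
```

Under these assumptions the descriptor has K1 x d components, where K1 is the first-layer codebook size and d the local descriptor dimension, so refining the second layer does not change the descriptor length. For example, `layer1, layer2 = train_codebooks(train, 3, [2, 4, 3])` followed by `evlad(descs, layer1, layer2)` yields a vector of length 3 x d for d-dimensional descriptors.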
