高级检索
苗建辉, 栗志扬, 周泽艳, 杨传福, 刘朝斌, 刘卫江. 比特串划分多索引的近邻搜索算法[J]. 计算机辅助设计与图形学学报, 2019, 31(5): 771-779. DOI: 10.3724/SP.J.1089.2019.17341
引用本文: 苗建辉, 栗志扬, 周泽艳, 杨传福, 刘朝斌, 刘卫江. 比特串划分多索引的近邻搜索算法[J]. 计算机辅助设计与图形学学报, 2019, 31(5): 771-779. DOI: 10.3724/SP.J.1089.2019.17341
Miao Jianhui, Li Zhiyang, Zhou Zeyan, Yang Chuanfu, Liu Zhaobin, Liu Weijiang. Nearest Neighbor Search Based on Bit String Partition and Multiple Index[J]. Journal of Computer-Aided Design & Computer Graphics, 2019, 31(5): 771-779. DOI: 10.3724/SP.J.1089.2019.17341
Citation: Miao Jianhui, Li Zhiyang, Zhou Zeyan, Yang Chuanfu, Liu Zhaobin, Liu Weijiang. Nearest Neighbor Search Based on Bit String Partition and Multiple Index[J]. Journal of Computer-Aided Design & Computer Graphics, 2019, 31(5): 771-779. DOI: 10.3724/SP.J.1089.2019.17341

比特串划分多索引的近邻搜索算法

Nearest Neighbor Search Based on Bit String Partition and Multiple Index

  • 摘要: 哈希表示的比特串是解决海量数据相似性搜索问题最有效的方法之一.针对比特串索引方式导致搜索效果低下的问题,提出一种基于比特串划分多索引的近邻搜索算法.首先由于比特串划分本质是一个组合优化问题,采用贪婪的思想给出该问题的近似解;其次在近邻查询阶段,结合多索引结构提出新的查询扩展和融合机制;最后通过采用一种查询自适应的办法优化多索引之间的不平衡性.在MNIST, CIFAR-10, SIFT-1M和GIST-1M数据集上使用Matlab软件进行实验的结果表明,该算法在基于哈希表示的索引结构以及在近邻搜索方面具有有效性和通用性.

     

    Abstract: The bit string represented by hash was one of the most effective methods to solve the similarity search problem of massive data. In view of the problem that the bit string indexing method lead to low search effect, a neighbor search algorithm based on bit string partitioning and multiple index is proposed.Firstly, the essence of bit string partitioning is a combinatorial optimization problem. In this paper, a greedy idea is used to give an approximate solution to the problem. Secondly, in the neighborhood query phase, a new query expansion and fusion mechanism is proposed based on the multi-index structure. Finally, a query adaptive method is used to optimize the imbalance between multiple indexes. On the MNIST, CIFAR-10,SIFT-1M and GIST-1M datasets, experiments and results are presented using Matlab software. The results demonstrate that the proposed method is effective and general in neighborhood search on the index structure based on hash representation.

     

/

返回文章
返回