A Dimension Reduction Algorithm Based on Divergence Balance
Abstract: Most existing supervised dimension reduction algorithms select a discriminative subspace by maximizing the sum of the between-class divergences or a closely related criterion. Because such a criterion tends to be dominated by the largest divergences, classes that lie close together in the original space are easily neglected, and their samples tend to merge after projection onto the subspace. To address this, this paper proposes a dimension reduction algorithm based on divergence balance, called divergence balance projection (DBP). DBP measures the divergence between classes with the symmetric Kullback-Leibler (KL) divergence, also known as symmetric relative entropy, and combines it with the concept of divergence balance, so that the projection preserves large between-class divergences while giving more weight to small ones, thereby balancing the between-class divergences. To exploit the abundant unlabeled samples available in practice, the paper further proposes semi-supervised divergence balance projection (SDBP), which preserves the Laplacian graph structure over all samples. Dimension reduction experiments on the standard datasets Soybean, Isolet, and COIL20 show that the proposed algorithms achieve good dimension reduction performance.
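To make the divergence measure concrete, the following is a minimal sketch of the symmetric KL divergence between two classes. It is not the paper's implementation: it assumes each class is modeled as a multivariate Gaussian (the abstract does not state the class model), and the names `symmetric_kl_gaussian`, `pairwise_class_divergences`, and the regularization parameter `reg` are hypothetical choices for illustration. It only computes the pairwise divergences that a balancing criterion would operate on; the DBP projection itself is not shown.

```python
import numpy as np


def symmetric_kl_gaussian(mu_p, cov_p, mu_q, cov_q):
    """Symmetric KL divergence KL(p||q) + KL(q||p) between two Gaussians.

    Assumption: classes are Gaussian; the abstract does not specify this.
    """
    mu_p = np.asarray(mu_p, dtype=float)
    mu_q = np.asarray(mu_q, dtype=float)
    cov_p = np.atleast_2d(np.asarray(cov_p, dtype=float))
    cov_q = np.atleast_2d(np.asarray(cov_q, dtype=float))
    k = mu_p.shape[0]
    diff = mu_q - mu_p
    inv_p = np.linalg.inv(cov_p)
    inv_q = np.linalg.inv(cov_q)
    # The log-determinant terms of the two directed divergences cancel,
    # leaving only the trace terms and the mean-difference term.
    trace_term = np.trace(inv_q @ cov_p) + np.trace(inv_p @ cov_q) - 2 * k
    mean_term = diff @ (inv_p + inv_q) @ diff
    return 0.5 * (trace_term + mean_term)


def pairwise_class_divergences(X, y, reg=1e-6):
    """Return the class labels and the matrix of pairwise symmetric KL values.

    `reg` (hypothetical) adds a small ridge so each covariance is invertible.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    d = X.shape[1]
    stats = []
    for c in classes:
        Xc = X[y == c]
        cov = np.atleast_2d(np.cov(Xc, rowvar=False)) + reg * np.eye(d)
        stats.append((Xc.mean(axis=0), cov))
    D = np.zeros((len(classes), len(classes)))
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            D[i, j] = D[j, i] = symmetric_kl_gaussian(*stats[i], *stats[j])
    return classes, D
```

Under this Gaussian assumption, with identity covariances the measure reduces to the squared Euclidean distance between the class means, so `symmetric_kl_gaussian([0, 0], np.eye(2), [3, 4], np.eye(2))` evaluates to 25. In a matrix returned by `pairwise_class_divergences`, the small off-diagonal entries correspond to the close class pairs that a sum-maximizing criterion would tend to neglect.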