Dimensionality Reduction Method for Manifold Learning Based on Variational Autoencoder
-
Abstract
Given the rapidly growing scale and complexity of scientific datasets, existing dimensionality reduction methods suffer from the “crowding problem” and cannot embed new samples. To address these issues, a dimensionality reduction method that combines a variational autoencoder with uniform manifold approximation and projection (VAE-UMAP) is proposed. First, to reduce coupling within the high-dimensional data, the data are compressed into latent variables using a variational autoencoder (VAE). Then, uniform manifold approximation and projection (UMAP) is applied to further reduce the dimensionality of the latent variables, so that the low-dimensional embedding better preserves the similarity relationships in the original data. Finally, the proposed method is fitted on a training set and then used to embed an out-of-sample test set in order to evaluate its ability to generalize to new data. Experimental results on the MNIST and Fashion-MNIST datasets show that, compared with four prominent dimensionality reduction methods (UMAP, DensMAP, VAE, and AE), the proposed method achieves trustworthiness scores of 0.9944 and 0.9939, surpassing the best-performing baseline, UMAP, by 0.0316 and 0.0141, respectively. There are also significant improvements in visualization quality, Kendall rank correlation coefficient, and classification accuracy.
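For readers unfamiliar with the trustworthiness score reported above, the following is a minimal, dependency-free sketch of the standard definition: it penalizes points that appear among the k nearest neighbors of a sample in the embedding but not in the original space. The function names, the default `k=5`, and the use of Euclidean distance are illustrative assumptions; this is not the evaluation code used in the experiments, and the neighborhood size used there is not stated here.

```python
import math


def _neighbor_ranks(points):
    """For each point i, map every other index j to its rank (1 = nearest)."""
    ranks = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance to point i.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        ranks.append({j: r for r, j in enumerate(order, start=1)})
    return ranks


def trustworthiness(x_high, x_low, k=5):
    """Trustworthiness of the embedding x_low with respect to x_high.

    Returns a value in [0, 1]; 1.0 means no point enters a k-neighborhood
    in the embedding that it did not belong to in the original space.
    """
    n = len(x_high)
    if len(x_low) != n:
        raise ValueError("x_high and x_low must contain the same number of points")
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n / 2")

    high_ranks = _neighbor_ranks(x_high)
    low_ranks = _neighbor_ranks(x_low)

    penalty = 0
    for i in range(n):
        for j, rank_low in low_ranks[i].items():
            rank_high = high_ranks[i][j]
            # j is a neighbor of i in the embedding but not in the original space.
            if rank_low <= k and rank_high > k:
                penalty += rank_high - k

    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))
```

In an out-of-sample setting such as the one described above, `x_high` would be the held-out test samples and `x_low` their embeddings produced by the already fitted model. The pairwise ranking here is quadratic in the number of samples, so it is only suitable for small illustrative inputs.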