
Scatterplot Overlap Removal Method by Integrating Data Aggregation and Force Repulsion

  • Abstract: Scatterplots are a common visual analysis tool. As data scale grows, overlapping points become a key obstacle to extracting knowledge from scatterplots. Existing overlap removal algorithms tend to lose data information and reduce visual clarity when handling large-scale, high-density datasets. To address this, we propose DAFR, a scatterplot overlap removal method that integrates data aggregation and force repulsion. First, the space is divided into a regular grid, and the points to be aggregated are determined from the number of grid cells, the data scale, the point categories, and the point distribution. Then, each point to be aggregated is merged into its nearest neighbor of the same category, and the size of each visual unit is assigned according to the number of points it aggregates. Finally, the positions of the data points are iteratively updated according to the force-repulsion principle under displacement constraints until the overlap rate falls below the overlap threshold. Experiments on 29 datasets of varying scale and distribution show that DAFR performs well on five objective metrics: displacement minimization, K-nearest-neighbor preservation, shape preservation, density preservation, and overall similarity. Case studies and subjective experiments further confirm the practicality and effectiveness of the method in visual analysis tasks.
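The three stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so every rule below is an assumption made for illustration. The function names (`aggregate`, `unit_radii`, `overlap_rate`, `repel`) and parameters (`cell`, `max_per_cell`, `base`, `max_shift`, `eps`) are hypothetical. A fixed per-cell, per-category cap stands in for the paper's selection of points to aggregate, the nearest same-category neighbor is searched only within the same cell, area is assumed proportional to the aggregated count, and the overlap rate is assumed to be the fraction of units that overlap at least one other unit.

```python
import math
from collections import defaultdict

def aggregate(points, labels, cell, max_per_cell):
    """Assumed rule: per grid cell and category, keep the first max_per_cell
    points and merge each surplus point into its nearest kept point."""
    cells = defaultdict(list)
    for i, (x, y) in enumerate(points):
        cells[(math.floor(x / cell), math.floor(y / cell), labels[i])].append(i)
    units = []  # each unit is [x, y, label, aggregated_count]
    for idxs in cells.values():
        kept = idxs[:max_per_cell]
        counts = {i: 1 for i in kept}
        for j in idxs[max_per_cell:]:
            nearest = min(kept, key=lambda i: math.dist(points[i], points[j]))
            counts[nearest] += 1
        units.extend([points[i][0], points[i][1], labels[i], counts[i]] for i in kept)
    return units

def unit_radii(units, base):
    """Assumed sizing: circle area proportional to the aggregated count."""
    return [base * math.sqrt(u[3]) for u in units]

def overlap_rate(pos, radii):
    """Assumed definition: share of units overlapping at least one other unit."""
    n, hit = len(pos), set()
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pos[i], pos[j]) < radii[i] + radii[j]:
                hit.update((i, j))
    return len(hit) / n if n else 0.0

def repel(points, radii, threshold=0.05, max_shift=float("inf"),
          max_iter=500, eps=1e-6):
    """Push overlapping circles apart until the overlap rate drops below
    threshold, clamping each unit's displacement from its origin to max_shift."""
    origin = [tuple(p) for p in points]
    pos = [list(p) for p in points]
    n = len(pos)
    for _ in range(max_iter):
        if overlap_rate(pos, radii) < threshold:
            break
        move = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dx, dy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
                d = math.hypot(dx, dy)
                gap = radii[i] + radii[j] - d
                if gap <= 0:
                    continue  # this pair does not overlap
                if d == 0:  # coincident centers: pick a deterministic direction
                    angle = 2 * math.pi * (i * n + j) / (n * n)
                    dx, dy, d = math.cos(angle), math.sin(angle), 1.0
                step = gap / 2 + eps  # each unit takes half of the overlap
                move[i][0] -= dx / d * step
                move[i][1] -= dy / d * step
                move[j][0] += dx / d * step
                move[j][1] += dy / d * step
        for i in range(n):
            sx = pos[i][0] + move[i][0] - origin[i][0]
            sy = pos[i][1] + move[i][1] - origin[i][1]
            s = math.hypot(sx, sy)
            scale = max_shift / s if s > max_shift else 1.0  # displacement constraint
            pos[i] = [origin[i][0] + sx * scale, origin[i][1] + sy * scale]
    return pos
```

A typical call chain would be `units = aggregate(points, labels, cell, max_per_cell)`, then `radii = unit_radii(units, base)`, then `repel([(u[0], u[1]) for u in units], radii)`. The pairwise loops are quadratic in the number of units, which is acceptable for a sketch but would need a spatial index at the data scales the abstract targets. A tight `max_shift` can prevent the overlap rate from reaching the threshold, so `max_iter` bounds the loop.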

     
