
Business Rule-Driven Simplification Method for Data Asset Graph

  • Abstract: Data asset graphs are large and have complex business structures, so their visualizations often suffer from visual clutter such as clustered, overlapping, and occluded nodes and edges. Graph sampling can simplify a data asset graph at the data level and thereby reduce visual clutter, but existing graph sampling methods ignore the business characteristics of data asset graphs, so the sampled results tend to lose important assets and break business structures. To address these problems, this paper proposes a business rule-driven simplification method for data asset graphs. First, it distills five business rules that data asset graph sampling should follow and gives a computable form of each rule. Next, it designs a new graph sampling method that uses the computable business rules in a new seed node selection strategy and a new biased random walk sampling strategy. Finally, it designs three evaluation metrics that assess how well a sampled graph preserves the business characteristics of the data asset graph. Experiments on 12 graph datasets show that the proposed method clearly outperforms 15 reference methods in preserving the business characteristics of data asset graphs, while also performing well in preserving traditional graph statistical properties.
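The two sampling steps named in the abstract (seed node selection followed by a biased random walk) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not state the five business rules or their computable forms, so the sketch assumes a single hypothetical per-node `importance` score as a stand-in for them. The function name `sample_graph` and all of its parameters are likewise assumptions made for illustration.

```python
import random


def sample_graph(adj, importance, n_seeds, target_size, rng=None):
    """Seeded, importance-biased random-walk sampling of an undirected graph.

    adj:        dict mapping each node to a list of its neighbors
    importance: dict mapping each node to a positive weight; a HYPOTHETICAL
                stand-in for the business-rule scores, which the abstract
                does not specify
    Returns (sampled_nodes, induced_edges).
    """
    if n_seeds < 1:
        raise ValueError("n_seeds must be at least 1")
    rng = rng or random.Random(0)
    target_size = min(target_size, len(adj))
    n_seeds = min(n_seeds, target_size)

    # Seed selection: start the walks from the highest-scoring nodes.
    seeds = sorted(adj, key=lambda n: importance[n], reverse=True)[:n_seeds]
    sampled = set(seeds)
    current = list(seeds)

    # Biased random walk: each walker moves to a neighbor with probability
    # proportional to that neighbor's importance score.
    max_steps = 100 * len(adj)  # safety bound for disconnected graphs
    for _ in range(max_steps):
        if len(sampled) >= target_size:
            break
        for i, node in enumerate(current):
            neighbors = adj[node]
            if not neighbors:
                current[i] = rng.choice(seeds)  # restart a stuck walker
                continue
            weights = [importance[n] for n in neighbors]
            nxt = rng.choices(neighbors, weights=weights, k=1)[0]
            sampled.add(nxt)
            current[i] = nxt
            if len(sampled) >= target_size:
                break

    # Keep every original edge whose endpoints were both sampled
    # (assumes node ids are orderable, so each edge is stored once).
    edges = {(u, v) for u in sampled for v in adj[u] if v in sampled and u < v}
    return sampled, edges


# Toy graph with made-up asset names and scores.
adj = {
    "db": ["orders", "users"],
    "orders": ["db", "order_id"],
    "users": ["db", "user_id"],
    "order_id": ["orders"],
    "user_id": ["users"],
}
importance = {"db": 5, "orders": 3, "users": 3, "order_id": 1, "user_id": 1}
nodes, edges = sample_graph(adj, importance, n_seeds=1, target_size=3,
                            rng=random.Random(42))
```

Because the highest-scoring node is always chosen as a seed, it is guaranteed to appear in the sample. That mirrors, in a very simplified way, the stated goal of not losing important assets.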

     
