Business Rule-Driven Simplification Method for Data Asset Graph
-
Abstract
Because data asset graphs are large and their business structures are complex, their visualizations often suffer from visual clutter such as node overlap and edge crossings. Graph sampling methods can simplify a data asset graph at the node and edge levels to reduce this clutter, but existing sampling methods do not consider the business characteristics of data assets. As a result, important assets may be omitted and business structures may be disrupted. To address these issues, this paper proposes a business rule-driven simplification method for data asset graphs. First, it identifies five business rules that data asset graph sampling should follow and gives computational forms of these rules. Next, it designs a new graph sampling method that integrates the computable business rules, a seed-node selection strategy, and a biased random walk sampling strategy. Finally, it designs three new metrics to assess how well a sampled graph preserves the business characteristics of a data asset graph. Comparative experiments on 12 graph datasets show that the new method outperforms 15 reference methods in preserving the business characteristics of a data asset graph, while also performing well in maintaining traditional graph statistical properties.
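To make the general idea of seed-based, biased random walk sampling concrete, the following is a minimal illustrative sketch and not the method proposed in this paper. The abstract does not specify the five business rules or the seed selection strategy, so the sketch assumes a single hypothetical per-node `importance` score as a stand-in for the computable rules, picks the highest-scoring nodes as seeds, and biases each walk step toward higher-scoring neighbours. All names (`biased_random_walk_sample`, `importance`, `n_seeds`, `target_size`) are assumptions introduced for illustration.

```python
import random


def biased_random_walk_sample(adj, importance, n_seeds, target_size, rng=None):
    """Sample a node-induced subgraph with importance-biased random walks.

    adj: dict mapping each node to a list of its neighbours (undirected graph).
    importance: dict mapping each node to a positive score. This score is a
        hypothetical stand-in for business rules, not the paper's definition.
    Returns (sampled_nodes, sampled_edges).
    """
    rng = rng or random.Random(0)
    target_size = min(target_size, len(adj))
    # Assumed seed strategy: start from the highest-scoring nodes.
    ranked = sorted(adj, key=lambda v: importance[v], reverse=True)
    seeds = ranked[:min(n_seeds, target_size)]
    sampled = set(seeds)
    walkers = list(seeds)  # one walker per seed
    max_rounds = 100 * len(adj)  # bound the work if the walks get stuck
    rounds = 0
    while len(sampled) < target_size and rounds < max_rounds:
        for i, v in enumerate(walkers):
            nbrs = adj[v]
            if not nbrs:
                # Dead end: restart this walker from a random seed.
                walkers[i] = rng.choice(seeds)
                continue
            # Biased step: neighbours are chosen in proportion to their score.
            weights = [importance[u] for u in nbrs]
            nxt = rng.choices(nbrs, weights=weights, k=1)[0]
            sampled.add(nxt)
            walkers[i] = nxt
            if len(sampled) >= target_size:
                break
        rounds += 1
    # Keep every original edge whose endpoints were both sampled.
    # The u < v test assumes node labels are mutually comparable.
    edges = {(u, v) for u in sampled for v in adj[u] if v in sampled and u < v}
    return sampled, edges


# Toy graph with made-up scores, for illustration only.
adj = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b", "e"],
    "e": ["d"],
}
importance = {"a": 5.0, "b": 1.0, "c": 1.0, "d": 3.0, "e": 1.0}
nodes, edges = biased_random_walk_sample(adj, importance, n_seeds=1, target_size=3)
```

In this sketch the sample always contains the seed nodes, and the bias makes high-scoring neighbours more likely to be kept. A rule-driven method would replace the single score with its own rule computations and seed strategy.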
-
-