Citation: Ai Zhiwei, Leng Juelin, Xia Fang, Wang Huawei, Cao Yi. Error-Controlled Data Reduction Approach for Large-Scale Structured Datasets[J]. Journal of Computer-Aided Design & Computer Graphics, 2021, 33(12): 1795-1802. DOI: 10.3724/SP.J.1089.2021.19263

Error-Controlled Data Reduction Approach for Large-Scale Structured Datasets

  • Abstract: Scientific and engineering simulations now produce datasets of terabyte (TB) or even petabyte (PB) scale, and data reduction has become an important means of cutting I/O time and storage cost. To support high-precision scientific visualization and data analysis, an error-controlled data reduction approach is proposed for large-scale structured datasets. First, taking the interpolation error between the reduced data and the original data as the constraint, a multi-level nested, adaptively refined background grid is constructed according to the spatial distribution of the underlying physical fields. Second, the original data is interpolated onto this lightweight background grid, which yields a dataset with far fewer cells and avoids storing redundant data. Finally, the reduced dataset is written in parallel, in real time, to storage-efficient visualization files on the parallel file system. The algorithm is implemented on the parallel programming framework JASMIN, so it couples directly with numerical simulation programs developed with JASMIN. Tests show that the parallel algorithm scales to tens of thousands of CPU cores. The approach has been applied to an electromagnetic simulation of unmanned aerial vehicle irradiation: with a relative error of no more than 10%, the size of a structured-grid dataset with one hundred billion cells was reduced by 99.8%. The peak signal-to-noise ratio between the images rendered from the reduced data and from the original data is 47.08 dB, indicating high similarity and satisfying the precision requirement of visual analysis.
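The error-constrained refinement idea can be illustrated with a minimal serial sketch. It is not the authors' JASMIN implementation: it assumes a 2D NumPy array, a quadtree-style split, bilinear interpolation from the four block corners, and a relative error measured against the field's maximum magnitude. The abstract does not define the error norm, and all function names here are hypothetical.

```python
import numpy as np

def bilinear_from_corners(corners, shape):
    """Fill a block of the given shape by bilinear interpolation of 2x2 corners."""
    ny, nx = shape
    ty = np.linspace(0.0, 1.0, ny)[:, None]
    tx = np.linspace(0.0, 1.0, nx)[None, :]
    (c00, c01), (c10, c11) = corners
    return ((1 - ty) * (1 - tx) * c00 + (1 - ty) * tx * c01
            + ty * (1 - tx) * c10 + ty * tx * c11)

def _refine(field, y0, y1, x0, x1, tol, scale, min_size, leaves):
    """Keep a block coarse if its interpolation error is within tol, else split it."""
    block = field[y0:y1, x0:x1]
    corners = block[np.ix_([0, -1], [0, -1])]
    approx = bilinear_from_corners(corners, block.shape)
    err = np.max(np.abs(approx - block)) / scale
    small = (y1 - y0) <= min_size or (x1 - x0) <= min_size
    if err <= tol and block.size > 4:
        # Coarse leaf: only the four corner values are stored.
        leaves.append((y0, y1, x0, x1, corners.copy(), True))
    elif err <= tol or small:
        # Raw leaf: the original cells are kept unchanged.
        leaves.append((y0, y1, x0, x1, block.copy(), False))
    else:
        ym, xm = (y0 + y1) // 2, (x0 + x1) // 2
        for a, b in ((y0, ym), (ym, y1)):
            for c, d in ((x0, xm), (xm, x1)):
                _refine(field, a, b, c, d, tol, scale, min_size, leaves)

def reduce_field(field, tol, min_size=2):
    """Return the leaves of an adaptive block decomposition bounded by tol."""
    scale = float(np.max(np.abs(field))) or 1.0  # assumed normalization
    leaves = []
    _refine(field, 0, field.shape[0], 0, field.shape[1],
            tol, scale, min_size, leaves)
    return leaves

def reconstruct(shape, leaves):
    """Rebuild a full-resolution field from the stored leaves."""
    out = np.empty(shape)
    for y0, y1, x0, x1, data, is_coarse in leaves:
        if is_coarse:
            out[y0:y1, x0:x1] = bilinear_from_corners(data, (y1 - y0, x1 - x0))
        else:
            out[y0:y1, x0:x1] = data
    return out

def stored_values(leaves):
    """Count the values that would be written to disk."""
    return sum(leaf[4].size for leaf in leaves)

if __name__ == "__main__":
    # Synthetic field: one localized peak on an otherwise flat background.
    y, x = np.mgrid[0:256, 0:256]
    field = np.exp(-((x - 80.0) ** 2 + (y - 150.0) ** 2) / 200.0)
    leaves = reduce_field(field, tol=0.10)
    rec = reconstruct(field.shape, leaves)
    rel_err = np.max(np.abs(rec - field)) / np.max(np.abs(field))
    print(f"stored fraction: {stored_values(leaves) / field.size:.4f}")
    print(f"max relative error: {rel_err:.4f}")
```

In this sketch, blocks stay fine only where the field varies quickly, so the stored fraction depends on how localized the features are. The tolerance holds by construction, because a block is kept coarse only after its own interpolation error has been checked.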

     
