Citation: Yifan Duan, Mingjun Xiao. Fourier-Based Global-Local Joint Perception Spatiotemporal Prediction Model[J]. Journal of Computer-Aided Design & Computer Graphics. DOI: 10.3724/SP.J.1089.2023-00812

Fourier-Based Global-Local Joint Perception Spatiotemporal Prediction Model

  • Abstract: Predictive learning on spatiotemporal sequences aims to generate future image data by learning from the context of historical data. The challenge lies in capturing the complex spatial correlations and temporal evolution of the physical world. Existing work is often tailored to specific tasks, tends to focus on overall trends while neglecting local details, and relies on recurrent units that cannot be parallelized, which limits efficiency. To address these issues, this paper presents a general-purpose model for spatiotemporal data prediction. In the spatial dimension, the model uses a Fourier-based transform to capture global dependencies and fuses them with local relationships extracted by a Swin Transformer, achieving joint global-local spatial perception. In the temporal dimension, a multi-scale fully convolutional module extracts features at different scales, and a second Fourier transform converts the time domain into the frequency domain to capture features across the continuously evolving temporal stack. This design preserves the long-term dependencies of the data while improving computational efficiency. Experimental results on the SEVIR, KTH, and MovingMNIST datasets demonstrate the model's excellent generality, effectiveness, and scalability.
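The idea of capturing global spatial dependencies with a Fourier transform can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fourier_global_mix`, the tensor layout (H, W, C), and the per-frequency complex filter `weight` are assumptions made for illustration, and the local Swin Transformer branch and the fusion step are omitted. The sketch only shows why filtering in the frequency domain gives every output position access to every input position.

```python
import numpy as np

def fourier_global_mix(x, weight):
    """Mix spatial information globally through the frequency domain.

    x:      real array of shape (H, W, C), a single feature map (assumed layout)
    weight: complex array of shape (H, W // 2 + 1, C), a hypothetical
            learnable frequency-domain filter
    """
    # 2D real FFT over the two spatial axes
    freq = np.fft.rfft2(x, axes=(0, 1))
    # Element-wise filtering in the frequency domain corresponds to a
    # circular convolution over the whole spatial grid
    freq = freq * weight
    # Back to the spatial domain, restoring the original H x W size
    return np.fft.irfft2(freq, s=x.shape[:2], axes=(0, 1))

rng = np.random.default_rng(0)
H, W, C = 8, 8, 2
x = rng.standard_normal((H, W, C))

# An all-ones filter leaves the feature map unchanged
identity = np.ones((H, W // 2 + 1, C), dtype=complex)
same = fourier_global_mix(x, identity)

# With a generic filter, changing one input pixel affects every output pixel
weight = rng.standard_normal((H, W // 2 + 1, C)) \
    + 1j * rng.standard_normal((H, W // 2 + 1, C))
x_perturbed = x.copy()
x_perturbed[0, 0, 0] += 1.0
diff = fourier_global_mix(x_perturbed, weight) - fourier_global_mix(x, weight)
```

In this sketch a single frequency-domain multiplication reaches the whole grid at O(HW log HW) cost, whereas a windowed attention or small convolution kernel sees only a neighborhood. That contrast is the motivation the abstract gives for combining the two kinds of features.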

     
