Fourier-Based Global-Local Joint Perception Spatiotemporal Prediction Model

Duan Yifan (段逸凡); Xiao Mingjun (肖明军)

doi:10.3724/SP.J.1089.2023-00812

Abstract: Predictive learning for spatiotemporal sequences aims to generate future image frames by learning from historical context. The task is challenging because it requires capturing the complex spatial correlations and temporal evolution of the physical world. Existing methods are mostly designed for specific tasks, tend to focus on overall trends while neglecting local details, and often rely on recurrent units that cannot be parallelized, which makes them inefficient. To address these issues, this paper proposes a general-purpose spatiotemporal prediction model. The model first combines local and global spatial features: a Fourier-based transform captures global dependencies, which are fused with the local relations extracted by a Swin Transformer to achieve joint global-local spatial perception. A multi-scale fully convolutional module then extracts temporal features at different scales, and a second Fourier transform converts the time domain into the frequency domain so that features across the continuously evolving temporal stack are fully captured. Experimental results on the SEVIR, KTH, and MovingMNIST datasets show that the proposed model has strong generality, effectiveness, and scalability. The model preserves long-term dependencies in the data while also improving computational efficiency.
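The global-local idea in the abstract can be illustrated with a toy one-dimensional sketch in plain Python. This is not the paper's architecture: the function names, the fixed frequency weights (which would normally be learned), the additive fusion, and the sliding-window mean standing in for Swin Transformer window attention are all assumptions made for illustration. It only shows why a frequency-domain operation mixes information from every position, while a windowed operation sees only nearby positions.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a list of numbers."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse of dft()."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def global_mix(x, weights):
    """Global branch (assumed form): scale each frequency bin, then invert.

    Every frequency bin is a sum over all inputs, so each output value
    depends on the whole sequence. Real, symmetric weights keep the
    result real; the real part is taken to drop rounding residue.
    """
    spectrum = dft(x)
    mixed = idft([w * c for w, c in zip(weights, spectrum)])
    return [v.real for v in mixed]


def local_mix(x, window=3):
    """Local branch (stand-in for window attention): sliding-window mean."""
    n, half = len(x), window // 2
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def fuse(x, weights, window=3):
    """Hypothetical fusion: element-wise sum of the two branches."""
    g = global_mix(x, weights)
    l = local_mix(x, window)
    return [a + b for a, b in zip(g, l)]


if __name__ == "__main__":
    signal = [0.0, 0.0, 0.0, 8.0, 0.0, 0.0, 0.0, 0.0]
    low_pass = [1.0] + [0.0] * 7  # keep only the zero-frequency bin
    print(global_mix(signal, low_pass))  # the spike spreads to every position
    print(local_mix(signal))             # the spike reaches only its neighbours
```

With the low-pass weights, the single spike influences every output position of the global branch, whereas the local branch changes only the positions adjacent to the spike. A practical model would use a fast Fourier transform on multi-dimensional feature maps rather than this naive transform.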
