
Architectural Image Editing Based on Residual Spatial Networks

Abstract: The complexity of the hierarchical structure of architectural images poses challenges for image editing. Some works combine reconstruction and editing for architectural image editing; however, because the image is encoded into a low-dimensional latent space, spatial information is easily lost, so the edited architectural images become distorted and structural consistency is hard to maintain. To address these issues, we propose ABEditor, an architectural image editing model based on StyleGAN. Specifically, we introduce a multi-level residual spatial network that learns multi-layer spatial and style features and connects them to the corresponding dimensions of StyleGAN's high-dimensional feature space; the input image is accurately reconstructed through layer-wise correction while rich spatial information is preserved. In the later layers of the synthesis network, closer to the output, an encoder maps the image to a low-dimensional latent space to preserve color details. In addition, to evaluate editing performance more accurately, we propose an average pixel edit distance based on semantic segmentation (APEDS) as an evaluation metric. Comparative experiments on the LSUN Church dataset show that ABEditor better preserves architectural image details and spatial structure, outperforming related methods by 28.64% in SSIM, 20.74% in FID, 13.67% in L2, 1.67% in LPIPS, 10.01% in PSNR, and 21.26% in APEDS.
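The abstract only outlines the architecture, so the following is a minimal, hypothetical PyTorch sketch of the general idea it describes: an encoder produces per-layer spatial residuals that are added to the feature maps of the early (structure) synthesis layers, i.e. layer-wise correction, while a separate low-dimensional latent code drives the later, color-oriented layers. Every module name, shape, and the synthesis_layers interface below are assumptions for illustration, not ABEditor's actual implementation.

    # Hypothetical sketch of the layer-wise correction idea described in the abstract.
    # This is NOT the authors' ABEditor code; all names and shapes are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ResidualSpatialEncoder(nn.Module):
        """Toy encoder emitting one spatial residual per corrected synthesis layer."""
        def __init__(self, in_ch=3, layer_channels=(512, 512, 256)):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            )
            # One head per synthesis layer that will receive a spatial correction.
            self.heads = nn.ModuleList(nn.Conv2d(128, c, 1) for c in layer_channels)

        def forward(self, img):
            h = self.backbone(img)
            return [head(h) for head in self.heads]

    def corrected_synthesis(synthesis_layers, const_input, latent_codes, residuals):
        """Add a spatial residual to the feature map of each early (structure) layer;
        later (color) layers are driven only by the low-dimensional latent codes.
        `synthesis_layers` is a hypothetical list of callables (feature, w) -> feature
        standing in for StyleGAN synthesis blocks."""
        feat = const_input
        for i, layer in enumerate(synthesis_layers):
            feat = layer(feat, latent_codes[i])
            if i < len(residuals):  # structure layers get a layer-wise correction
                res = F.interpolate(residuals[i], size=feat.shape[-2:],
                                    mode="bilinear", align_corners=False)
                feat = feat + res
        return feat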
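APEDS is only named in the abstract and its exact definition is not given here. The sketch below shows one plausible reading, purely for illustration: the average per-pixel color change restricted to regions that a semantic segmentation marks as outside the intended edit, so lower values mean the edit leaked less into untouched areas. The function name, the choice of L2 distance, and the masking convention are all assumptions, not the paper's actual metric.

    # Hedged sketch of a segmentation-guided average pixel edit distance.
    # The abstract does not define APEDS precisely; this is one plausible reading.
    import numpy as np

    def average_pixel_edit_distance(original, edited, seg_mask, edited_classes):
        """original, edited: HxWx3 float arrays in [0, 1].
        seg_mask: HxW integer semantic labels of the original image.
        edited_classes: set of label ids the edit was supposed to change.
        Returns the mean per-pixel L2 distance over pixels that should stay unchanged."""
        keep = ~np.isin(seg_mask, list(edited_classes))  # pixels outside edited regions
        if keep.sum() == 0:
            return 0.0
        diff = np.linalg.norm(original[keep] - edited[keep], axis=-1)
        return float(diff.mean())

    # Example with dummy data (assumed shapes):
    # orig = np.random.rand(256, 256, 3); edit = orig.copy()
    # mask = np.zeros((256, 256), dtype=int)
    # print(average_pixel_edit_distance(orig, edit, mask, {1}))  # -> 0.0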
