Survey on Cross-Domain Object Detection in Open Environment

  • Abstract: Traditional object detection models assume that training and test data come from the same or similar scenes. This assumption rarely holds in practice: detection models are often required to work across different environments or scenarios, and detection accuracy drops significantly as a result. To address this issue, cross-domain object detection has received widespread attention in recent years. This survey reviews the recent development of cross-domain object detection and the associated methods, grouping them into three main categories: methods based on transfer learning, self-learning, and image generation. Transfer learning-based methods combine domain adaptation with object detection to improve the model's adaptability to different environments. Self-learning-based methods use pseudo-labels to improve the model's transferability to the target domain. Image generation-based methods use generative adversarial networks to generate relevant images that assist model training, thereby improving performance on the target domain. The survey also introduces the datasets used for cross-domain object detection and the performance of representative methods. Finally, it summarizes the current categorization of cross-domain object detection and its shortcomings, and points out new directions, including exploring generalization to unknown domains, resolving data privacy issues, and applying visual prompt techniques.

     
