Robust Model Watermarking Scheme Based on Feature Combination and Weight Adversarial Training

Gao Guangyong (高光勇); Xu Ziqi (徐梓棋); Fang Wei (方巍)

doi:10.3724/SP.J.1089.2023-00805

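Abstract (Chinese version): Model watermarking is an information-hiding technique that protects the ownership rights of deep neural network model owners. To address the inability of existing model watermarking schemes to effectively resist watermark removal attacks, a model watermarking scheme based on feature combination and weight adversarial training is proposed. The scheme constructs the trigger set by fusing images from the original training dataset from the perspective of feature combination. During watermark embedding, the model undergoes weight adversarial training, in which perturbations are injected into the model weights to simulate a watermark attack environment, thereby effectively improving watermark robustness. Experimental results on the CIFAR-10 dataset show that, compared with traditional black-box backdoor watermarking schemes, the proposed scheme achieves a watermark extraction rate of 100% under fine-tuning and overwriting attacks. Across all types of watermark removal attacks, its watermark extraction rate remains above 80%.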

Abstract: The construction of deep neural network models requires not only the intellectual efforts of designers but also the support of annotated datasets and computational resources. Protecting the intellectual property of these models has become an urgent issue, leading to growing research interest in effectively verifying model ownership through model watermarking. To address the limitations of existing model watermarking schemes, this paper proposes a novel watermarking scheme based on feature combination and weight adversarial training. First, the scheme constructs a trigger set by fusing images from the original training dataset from a feature combination perspective. Then, using weight adversarial training, it adds perturbations to the model weights during the watermark embedding stage to simulate a watermark removal attack environment, thereby enhancing watermark robustness. Finally, the watermarking performance is tested, and comparative experiments are conducted against state-of-the-art and classical approaches in the same domain. Experimental results and analysis demonstrate that the proposed scheme achieves a remarkably high watermark extraction rate and effectively withstands various watermark removal attacks. Moreover, the proposed approach significantly surpasses the comparative approaches in terms of robustness while maintaining high fidelity.
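The two ideas named in the abstract, a trigger set built by fusing training images and training under perturbed weights, can be illustrated with a toy sketch. The abstract does not specify the fusion operator, the perturbation rule, or the network, so everything below is an assumption made for illustration: `blend` uses a pixelwise convex combination, `adversarial_step` uses a single normalized gradient-ascent step on the weights, the model is a small logistic regression rather than a deep network, and the data are synthetic vectors rather than CIFAR-10 images. All function names and hyperparameters are hypothetical.

```python
import math
import random

def blend(img_a, img_b, alpha=0.5):
    """Assumed fusion rule: pixelwise convex combination of two flat images."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(img_a, img_b)]

def predict(w, x):
    """Probability of class 1 under a logistic-regression stand-in model."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def grad(w, data):
    """Mean gradient of the cross-entropy loss with respect to the weights."""
    g = [0.0] * len(w)
    for x, y in data:
        err = predict(w, x) - y
        for i, xi in enumerate(x):
            g[i] += err * xi / len(data)
    return g

def adversarial_step(w, data, lr=0.3, gamma=0.1):
    """One update computed at adversarially perturbed weights (assumed rule)."""
    g = grad(w, data)
    norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0
    # Inject a perturbation of size gamma in the direction that increases the loss.
    w_adv = [wi + gamma * gi / norm for wi, gi in zip(w, g)]
    # Take the descent direction at the perturbed weights,
    # then apply it to the unperturbed weights.
    g_adv = grad(w_adv, data)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

def extraction_rate(w, trigger_set):
    """Fraction of trigger samples classified with their watermark label."""
    hits = sum((predict(w, x) > 0.5) == (y == 1) for x, y in trigger_set)
    return hits / len(trigger_set)

random.seed(0)
dim = 8
# Synthetic two-class "images"; a constant 1.0 is appended as a bias feature.
class0 = [[random.gauss(-1.0, 0.1) for _ in range(dim)] for _ in range(20)]
class1 = [[random.gauss(1.0, 0.1) for _ in range(dim)] for _ in range(20)]
train = [(x + [1.0], 0) for x in class0] + [(x + [1.0], 1) for x in class1]

# Trigger set: fused pairs of training images, all given watermark label 1.
triggers = [(blend(a, b) + [1.0], 1) for a, b in zip(class0[:10], class1[:10])]

# Embed the watermark by training on the clean data and the trigger set together.
w = [0.0] * (dim + 1)
for _ in range(500):
    w = adversarial_step(w, train + triggers)

rate = extraction_rate(w, triggers)
print(f"watermark extraction rate: {rate:.2f}")
```

Computing the update at the perturbed weights is meant to steer training toward solutions whose trigger-set behavior survives small weight changes, which is the intuition behind simulating a removal attack during embedding. The sketch makes no claim about reproducing the reported extraction rates.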
