Abstract:
Deep learning models are vulnerable to adversarial attacks: adding a small, perceptually indistinguishable perturbation to an input can easily degrade classification performance. To address the low efficiency and low success rate of existing black-box adversarial attacks, a black-box adversarial attack based on topology-adaptive particle swarm optimization is proposed. Firstly, a population of initial adversarial examples is randomly generated from the original image. Secondly, the perturbation of each sample is updated according to neighborhood information as the particles iterate through the search space, and a dynamic penalty-term coefficient is computed to control the fitness. When the fitness of the population does not improve over multiple iterations, each sample performs a neighborhood redistribution operation and adjusts its state according to its evolution trajectory. Finally, redundant perturbations are pruned to obtain the final adversarial example. Taking classification models such as InceptionV3 as attack targets, untargeted and targeted adversarial attack experiments are conducted on the MNIST, CIFAR-10, and ImageNet datasets under the same sample-count and model-query constraints. Compared with existing methods, the proposed attack requires fewer model queries and achieves a higher attack success rate: the average number of queries to the InceptionV3 model is 2,502, and the attack success rate is 94.30%.
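The query-based search loop summarized above can be sketched as a minimal score-based PSO attack. Everything here is an illustrative assumption rather than the paper's method: the function names, the global-best swarm topology, and the simple L-infinity clipping are placeholders, and the paper's topology-adaptive variant additionally uses neighborhood redistribution, a dynamic penalty coefficient, and perturbation pruning, none of which are reproduced in this sketch.

```python
import numpy as np

def pso_blackbox_attack(model_fn, x, true_label, pop_size=20, iters=50,
                        eps=0.1, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal untargeted score-based PSO attack (illustrative sketch).

    model_fn(batch) -> class-probability array of shape (batch, classes).
    Particles are perturbations confined to an L_inf ball of radius eps.
    """
    rng = np.random.default_rng(seed)
    dim = x.size
    pos = rng.uniform(-eps, eps, size=(pop_size, dim))
    pos[0] = 0.0                      # seed swarm with the zero perturbation
    vel = np.zeros_like(pos)

    def fitness(pert):
        # Lower probability of the true class = better (untargeted attack).
        adv = np.clip(x.ravel() + pert, 0.0, 1.0)
        probs = model_fn(adv.reshape((-1,) + x.shape))
        return probs[:, true_label]

    fit = fitness(pos)
    pbest, pbest_fit = pos.copy(), fit.copy()
    g = np.argmin(fit)
    gbest, gbest_fit = pos[g].copy(), fit[g]

    for _ in range(iters):
        r1 = rng.random((pop_size, dim))
        r2 = rng.random((pop_size, dim))
        # Standard PSO velocity update toward personal and global bests.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -eps, eps)
        fit = fitness(pos)
        improved = fit < pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        g = np.argmin(pbest_fit)
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    return np.clip(x.ravel() + gbest, 0.0, 1.0).reshape(x.shape)
```

Because the swarm is seeded with the zero perturbation, the returned example can never have a higher true-class probability than the clean input; a targeted variant would instead maximize the probability of a chosen target class.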