Cross-modal Image Aesthetics Assessment with Scene Features

Yuzhen Niu; Shanshan Chen; Yuezhou Li; Wenxi Liu

doi:10.3724/SP.J.1089..2023-00477

Yuzhen Niu, Shanshan Chen, Yuezhou Li, Wenxi Liu. Cross-modal Image Aesthetics Assessment with Scene Features[J]. Journal of Computer-Aided Design & Computer Graphics. DOI: 10.3724/SP.J.1089..2023-00477

Abstract

Image aesthetics assessment aims to computationally simulate human perception and cognition of beauty so that computers can automatically evaluate the aesthetic quality of images. Images on social media are typically accompanied by comments, yet existing image aesthetics assessment methods generally focus only on the images and ignore user comments, which limits their performance. Because user comments carry rich semantic information about images, some recent works have used them to assist image aesthetics assessment. However, these methods neither fully exploit image features nor adequately model the complex relationship between image features and text features, so image information is underused and the interaction between image and text information is only partially modeled. To address these problems, this paper proposes a cross-modal image aesthetics assessment method that integrates scene features (CIAASF). Since the scene of an image usually affects how humans judge its aesthetics, the method first extracts scene features and aesthetic features from the image and deeply fuses them with a multi-scale feature fusion module. Second, considering the intrinsic correlation between image features and text features, it uses a multi-head cross-attention mechanism to compute cross-attention between them, allowing image and text information to interact and fuse. Finally, the fused cross-modal features are used for the aesthetics assessment tasks. Extensive experiments on AVA, a large-scale general-purpose image aesthetics assessment dataset, show that the proposed CIAASF model outperforms state-of-the-art cross-modal image aesthetics assessment methods on both the classification and score prediction tasks.
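The cross-modal step described in the abstract can be illustrated with a minimal NumPy sketch of multi-head cross-attention between image tokens and text tokens. This is not the CIAASF implementation: the feature dimension, token counts, random projection weights, the use of attention in both directions, and the mean-pool-and-concatenate fusion are all assumptions chosen only to show the mechanism, and the names `multi_head_cross_attention` and `init_params` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def init_params(d, rng):
    # Random projection matrices stand in for learned weights (assumption).
    return {name: rng.standard_normal((d, d)) / np.sqrt(d)
            for name in ("Wq", "Wk", "Wv", "Wo")}

def multi_head_cross_attention(queries, context, params, num_heads):
    """Queries from one modality attend to keys/values from the other.

    queries: (n_q, d), context: (n_c, d).
    Returns the attended features (n_q, d) and the attention weights
    (num_heads, n_q, n_c).
    """
    n_q, d = queries.shape
    n_c = context.shape[0]
    d_head = d // num_heads

    # Project, then split the feature dimension into heads: (heads, n, d_head).
    q = (queries @ params["Wq"]).reshape(n_q, num_heads, d_head).transpose(1, 0, 2)
    k = (context @ params["Wk"]).reshape(n_c, num_heads, d_head).transpose(1, 0, 2)
    v = (context @ params["Wv"]).reshape(n_c, num_heads, d_head).transpose(1, 0, 2)

    # Scaled dot-product attention per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, n_q, n_c)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (heads, n_q, d_head)

    # Merge the heads back and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(n_q, d)
    return merged @ params["Wo"], weights

rng = np.random.default_rng(0)
d, num_heads = 64, 4                           # assumed sizes
image_tokens = rng.standard_normal((49, d))    # e.g. a 7x7 grid of fused image features
text_tokens = rng.standard_normal((20, d))     # e.g. 20 comment token embeddings

# Image tokens attend to the text, and text tokens attend to the image.
img2txt, w_it = multi_head_cross_attention(image_tokens, text_tokens, init_params(d, rng), num_heads)
txt2img, w_ti = multi_head_cross_attention(text_tokens, image_tokens, init_params(d, rng), num_heads)

# Assumed fusion: mean-pool each stream and concatenate into one vector
# that a classification or score-regression head could consume.
fused = np.concatenate([img2txt.mean(axis=0), txt2img.mean(axis=0)])
print(fused.shape)  # (128,)
```

In this sketch each modality queries the other, so the fused vector carries both text-conditioned image features and image-conditioned text features. A trained model would replace the random matrices with learned parameters.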
