PromptVis: Prompt-Based Interactive Visual Analysis Method for Text-to-Image Creation

卢裕弘; 封颖超杰; 朱琳; 周海怡; 朱航; 喻晨昊; 陈为

doi:10.3724/SP.J.1089.2023-00344

Abstract

Using prompts efficiently for text-to-image generation is a current research hotspot for large models. To address the shortcomings of existing work on prompt engineering, we propose PromptVis, an interactive visual analysis method for text-to-image prompts that helps users evaluate and iteratively refine their prompts to improve image quality. First, the system parses the components of the user's prompt and offers suggestions for improving it, such as recommending related prompts. Second, the user's input and the system-recommended prompts are clustered and visualized together to support interactive exploration. Third, the system automatically evaluates both the text prompts and the generated images along multiple dimensions, giving users a reference for revising their prompts. Fourth, based on the recommended prompts, the system locally adjusts existing images so that users can preview the effect of a prompt modification. A comparative user study, assessed from two perspectives (analysis of prompt-creation efficiency and a practicality questionnaire), demonstrates the practicality and effectiveness of the method in assisting users with prompt creation.
