
KM2D: Method for Generating Dance Animation Driven by Dance Movement Primitives and Musical Semantics

  • Abstract: The fluidity and expressiveness of dance creation are directly driven by musical rhythm and dance movement primitives. This paper proposes a dance generation method driven by dance movement primitive symbols and musical semantics. Given a music segment and a sequence of dance movement primitive symbols, the method generates natural, fluid dance movement sequences that remain coordinated with the musical rhythm. First, a dance dataset is constructed. Unlike existing datasets that carry only music and dance-type annotations, our dataset adds detailed dance movement primitive annotations on top of the music and dance sequences. These detailed annotations of musical rhythm and dance movements establish an effective mapping among music, movement primitive symbols, and actual dance movements. Second, a diffusion-model-based dance sequence generation method is proposed; the model captures smooth transitions between movements while keeping the generated movement sequences consistent with the musical rhythm. By specifying the dance style and music together with detailed movements and combinations of different movement primitives, the generated sequences match the intended style and content, reducing the inconsistency and unpredictability of purely random generation and making the resulting dances richer and more nuanced. A movement primitive symbol feature extraction module is added to the network model to achieve more efficient feature representation and fusion. To optimize generation quality, we design a loss function comprising motion smoothness, music synchronization, motion continuity, and physical plausibility terms. Experimental results show that the diffusion-model-based method generates diverse, natural, and fluid dance movement sequences that match the musical rhythm. Quantitative and qualitative evaluations validate the effectiveness and superiority of the proposed method on dance generation tasks. The method also extends to dance generation tasks of different styles and complexities.
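The abstract names four training-loss terms (motion smoothness, music synchronization, motion continuity, physical plausibility) without giving their mathematical form. The sketch below is a plain-Python illustration, not the paper's implementation: it assumes smoothness is a second-order finite difference over joint trajectories and continuity a first-order difference, while the synchronization and plausibility terms are passed in as precomputed placeholders; all function names and weights are illustrative assumptions.

```python
def smoothness_loss(poses):
    """Mean squared second-order difference of joint positions.

    `poses` is a list of frames; each frame is a list of joint coordinates.
    Penalizes acceleration spikes, i.e. jerky motion.
    """
    n, j = len(poses), len(poses[0])
    if n < 3:
        return 0.0
    total = sum(
        (poses[t + 1][k] - 2 * poses[t][k] + poses[t - 1][k]) ** 2
        for t in range(1, n - 1)
        for k in range(j)
    )
    return total / ((n - 2) * j)


def continuity_loss(poses):
    """Mean squared first-order difference; penalizes sudden jumps between frames."""
    n, j = len(poses), len(poses[0])
    if n < 2:
        return 0.0
    total = sum(
        (poses[t][k] - poses[t - 1][k]) ** 2
        for t in range(1, n)
        for k in range(j)
    )
    return total / ((n - 1) * j)


def total_loss(poses, sync_term, physical_term,
               w_smooth=1.0, w_sync=1.0, w_cont=1.0, w_phys=1.0):
    """Weighted sum of the four terms named in the abstract.

    `sync_term` and `physical_term` stand in for the music-synchronization and
    physical-plausibility losses, whose exact forms the abstract does not give.
    The weights are illustrative assumptions.
    """
    return (w_smooth * smoothness_loss(poses)
            + w_sync * sync_term
            + w_cont * continuity_loss(poses)
            + w_phys * physical_term)
```

Note that for a perfectly linear trajectory the smoothness term vanishes while the continuity term still reflects per-frame displacement, so the two penalize different failure modes.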

