Natural and expressive human motion generation is a challenging task in computer animation, and current generative solutions are either low-quality or limited in expressiveness. Diffusion models show promise in the human motion domain, but they tend to be resource-hungry and hard to control. The Motion Diffusion Model (MDM) adapts them to motion: it is a transformer-based generative model that predicts the clean sample, rather than the noise, at each diffusion step. This design choice makes it straightforward to apply geometric losses for accurate motion generation, and MDM achieves state-of-the-art results on leading text-to-motion and action-to-motion benchmarks. The framework is generic enough to support different forms of conditioning and is trained in a classifier-free manner, which lets fidelity and diversity be traded off at sampling time.
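To make the sample-prediction idea concrete, here is a minimal PyTorch-style sketch of one training step that predicts the clean motion instead of the noise and randomly drops the condition for classifier-free training. The `model(x_t, t, cond)` interface, the tensor layout, the velocity term standing in for the geometric losses, and the hyperparameter names are illustrative assumptions, not MDM's actual implementation.

```python
import torch
import torch.nn.functional as F

def diffusion_training_step(model, x0, cond, betas, lambda_geo=1.0, p_uncond=0.1):
    """One denoising-diffusion training step in which the network predicts the
    clean motion sample x0 (rather than the noise). The model interface and the
    loss weights here are illustrative assumptions."""
    B = x0.shape[0]
    num_steps = betas.shape[0]

    # Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)
    t = torch.randint(0, num_steps, (B,), device=x0.device)
    a_bar = alphas_bar[t].view(B, *([1] * (x0.dim() - 1)))
    eps = torch.randn_like(x0)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps

    # Classifier-free training: randomly drop the condition so the same network
    # learns both conditional and unconditional denoising.
    keep = (torch.rand(B, device=x0.device) > p_uncond).float()
    cond_masked = cond * keep.view(B, *([1] * (cond.dim() - 1)))

    # Predict the clean sample directly and supervise it in sample space.
    x0_hat = model(x_t, t, cond_masked)
    loss_simple = F.mse_loss(x0_hat, x0)

    # Stand-in for a geometric loss: match frame-to-frame velocities,
    # assuming x0 is shaped (batch, frames, features).
    loss_vel = F.mse_loss(x0_hat[:, 1:] - x0_hat[:, :-1], x0[:, 1:] - x0[:, :-1])
    return loss_simple + lambda_geo * loss_vel
```

Because the model sees both conditioned and unconditioned inputs during training, the conditional and unconditional predictions can be blended at sampling time (classifier-free guidance) to trade off fidelity against diversity.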
| Signal | Change | 10-year horizon | Driving force |
|---|---|---|---|
| Natural and expressive human motion generation | Improvement in quality and expressiveness | More realistic and diverse human motion generation | Desire for more realistic computer animation |
| Diffusion models for human motion generation | Resource-hungry and hard to control | More efficient and controllable diffusion models | Need for more efficient and controllable generative models |
| MDM framework for motion generation | Generic design enabling different forms of conditioning | More versatile and adaptable motion generation models | Desire for flexible and customizable motion generation |
| Text-to-motion task | Generating motion from text prompts | Improved motion generation based on textual descriptions | Advancement in text-based motion synthesis |
| Action-to-motion task | Generating motion from action classes | Improved motion generation based on action inputs | Advancement in action-based motion synthesis |
| Completion and editing of motion | Filling in gaps and editing specific body parts (see the inpainting sketch after this table) | More accurate and semantically consistent motion completion and editing | Desire for precise and controlled motion editing techniques |
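For the completion/editing row above, a common way to realise such editing with a diffusion model is inpainting-style sampling: known frames or joints are re-imposed at every reverse step while the masked-out regions are synthesised. The sketch below is a hedged illustration under that assumption; the `model(x_t, t, cond)` interface and the plain DDPM reverse step are placeholders, not the reference MDM editing code.

```python
import torch

@torch.no_grad()
def inpaint_motion(model, x_known, mask, cond, betas):
    """Diffusion-inpainting sketch for motion completion/editing.
    Entries where mask == 1 are kept from the reference motion x_known;
    masked-out frames or joints are filled in by the reverse process.
    Assumes model(x_t, t, cond) predicts the clean sample x0_hat."""
    num_steps = betas.shape[0]
    alphas = 1.0 - betas
    alphas_bar = torch.cumprod(alphas, dim=0)
    one = torch.ones((), device=betas.device)

    x_t = torch.randn_like(x_known)
    for t in reversed(range(num_steps)):
        tt = torch.full((x_known.shape[0],), t, device=x_known.device, dtype=torch.long)
        a_bar = alphas_bar[t]
        a_bar_prev = alphas_bar[t - 1] if t > 0 else one

        # DDPM posterior mean computed from the predicted clean sample.
        x0_hat = model(x_t, tt, cond)
        coef_x0 = a_bar_prev.sqrt() * betas[t] / (1.0 - a_bar)
        coef_xt = alphas[t].sqrt() * (1.0 - a_bar_prev) / (1.0 - a_bar)
        sigma = ((1.0 - a_bar_prev) / (1.0 - a_bar) * betas[t]).sqrt()
        noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
        x_t = coef_x0 * x0_hat + coef_xt * x_t + sigma * noise

        # Editing constraint: overwrite the known regions with the reference
        # motion diffused to the matching noise level.
        x_known_t = a_bar_prev.sqrt() * x_known + (1.0 - a_bar_prev).sqrt() * torch.randn_like(x_known)
        x_t = mask * x_known_t + (1.0 - mask) * x_t

    return x_t
```

The same mask mechanism covers both temporal in-betweening (masking whole frames) and part-wise editing (masking selected joint features while keeping the rest of the body fixed).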