OpenMotionLab/MotionChain

Official repo for MotionChain

MotionChain: Conversational Motion Controllers via Multimodal Prompts

arXiv Paper • Demo • FAQ • Citation

Intro MotionChain

MotionChain is a unified vision-motion-language generative pre-trained model, which performs conversational generation tasks via multi-modal inputs with language models.

Technical details

Recent advancements in language models have demonstrated their adeptness in conducting multi-turn dialogues and retaining conversational context. However, this proficiency remains largely unexplored in other multimodal generative models, particularly in human motion models. By integrating multi-turn conversations into the control of continuous virtual human movements, generative human motion models can achieve an intuitive, step-by-step process of human task execution for humanoid robotics, game agents, or other embodied systems. In this work, we present MotionChain, a conversational human motion controller that generates continuous and long-term human motion through multimodal prompts. Specifically, MotionChain consists of multi-modal tokenizers that transform various data types, such as text, images, and motion, into discrete tokens, coupled with a Vision-Motion-aware Language model. By leveraging large-scale language, vision-language, and vision-motion data to assist motion-related generation tasks, MotionChain comprehends each instruction in a multi-turn conversation and generates human motions that follow these prompts. Extensive experiments validate the efficacy of MotionChain, demonstrating state-of-the-art performance in conversational motion generation, as well as more intuitive ways of controlling and interacting with virtual humans.
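To make the idea of turning motion into discrete tokens more concrete, here is a minimal sketch of nearest-neighbor codebook lookup, the core step of vector quantization. It is an illustration only and not MotionChain's actual tokenizer: the function names `quantize` and `as_motion_tokens`, the toy 2-D codebook, and the special-token strings such as `<motion_start>` are all assumptions made for this example.

```python
from typing import List, Sequence

Vector = Sequence[float]


def quantize(features: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Return, for each feature vector, the index of its nearest codebook entry."""
    token_ids = []
    for vec in features:
        # Squared Euclidean distance from this frame to every codebook entry.
        dists = [sum((a - b) ** 2 for a, b in zip(vec, code)) for code in codebook]
        token_ids.append(dists.index(min(dists)))
    return token_ids


def as_motion_tokens(token_ids: Sequence[int]) -> List[str]:
    """Wrap code indices in special-token strings (hypothetical naming)."""
    return ["<motion_start>"] + [f"<motion_{i}>" for i in token_ids] + ["<motion_end>"]


if __name__ == "__main__":
    # Toy codebook and per-frame features; real motion features are far larger.
    codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
    frames = [[0.1, -0.1], [0.9, 1.2], [1.8, 0.1]]
    ids = quantize(frames, codebook)
    print(ids)
    print(as_motion_tokens(ids))
```

Once motion frames are expressed as token strings like these, they could in principle be placed in the same sequence as text tokens, which is the property that lets a language model read and emit motion alongside conversation.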

pipeline

🚩 News

  • [2024/07/15] Conversation dataset released.
  • [2024/04/02] Paper uploaded and project initialized 🔥🔥🔥

⚡ Quick Start

▶️ Demo

👀 Visualization

⚠️ FAQ

Question-and-Answer

📖 Citation

If you find our code or paper helpful, please consider citing:

@misc{jiang2024motionchain,
 title={MotionChain: Conversational Motion Controllers via Multimodal Prompts},
 author={Biao Jiang and Xin Chen and Chi Zhang and Fukun Yin and Zhuoyuan Li and Gang YU and Jiayuan Fan},
 year={2024},
 eprint={2404.01700},
 archivePrefix={arXiv},
 primaryClass={cs.CV}
}

Acknowledgments

Thanks to BEDLAM, TMR, vector-quantize-pytorch, Motion-GPT, Motion-latent-diffusion, T2m-gpt, TEMOS, ACTOR, HumanML3D and joints2smpl; our code partially borrows from them.

License

This code is distributed under an MIT LICENSE.

Note that our code depends on other libraries, including SMPL, SMPL-X, and PyTorch3D, and uses datasets, each of which has its own license that must also be followed.
