MP5

MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception

[CVPR 2024] This is the official implementation of MP5.

[Paper] [Project Page] [Demo] [Dataset]

News

We are currently organizing the code for MP5. If you are interested in our work, please star ⭐ our project.

  • (2024.03.29) Code is released!
  • (2024.02.28) MP5 is accepted to CVPR 2024!
  • (2023.12.12) MP5 is released on arXiv.

The process of completing the task "kill a pig with a stone sword during the daytime near the water with grass next to it."

MP5 Framework

Active Perception

Directory Structure:

.
├── README.md
├── MP5_agent
│  ├── All agents of MP5.
├── LAMM
│  ├── Scripts and models for training and testing MineLLM.

Setup

Note: We provide all the code except the human-designed interface code, which you can implement yourself or replace with MineDreamer as a low-level control module.

Install the environment

Because MineLLM is deployed on the server side, two separate virtual environments are needed: one for MineLLM and one for MP5_agent.

MineLLM

We recommend running on Linux in a conda environment with Python 3.10. You can install the environment by following the instructions here.

# create and activate the conda env for MineLLM (Python 3.10, as recommended above)
conda create -n minellm python=3.10 -y
conda activate minellm
pip install -r requirement.txt

MP5_agent

MineDojo requires Python ≥ 3.9. We have tested on Ubuntu 20.04 and macOS. Please follow this guide to install the prerequisites first, such as JDK 8 for running the Minecraft backend. We highly recommend creating a new conda virtual env to isolate dependencies. Alternatively, we provide a pre-built Docker image for easier installation.

First, installing the stable version of MineDojo is as simple as:

# create and activate the conda env for MP5_agent (MineDojo requires Python >= 3.9)
conda create -n MP5_agent python=3.9 -y
conda activate MP5_agent
pip install minedojo

or you can check MineDojo for more details.

Then install the other requirements:

pip install -r requirement.txt
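
To sanity-check the installation, a quick smoke test along these lines should work (these two commands are our suggestion, not part of the official scripts):

# check that JDK 8 is available for the Minecraft backend
java -version
# check that MineDojo imports inside the MP5_agent env
python -c "import minedojo; print('MineDojo imported successfully')"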

Training

To train MineLLM from scratch, please execute the following steps:

  1. Download the datasets from Hugging Face.
  2. Clone the LAMM codebase: cd MP5 and git clone https://github.com/IranQin/LAMM.
  3. Put the datasets into LAMM/datasets/2D_Instruct and cd LAMM.
  4. Download the MineCLIP weight mineclip_image_encoder_vit-B_196tokens.pth from here and put it into the model_zoo/mineclip_ckpt folder.
  5. Train the agent by running . src/scripts/train_lamm2d_mc_v1.5_slurm.sh (the script needs to be changed if you do not use Slurm); the steps are condensed in the sketch below.
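
Condensed into shell commands, the steps above look roughly like this (the paths follow the list; the mkdir calls are just one way to stage the downloaded files):

# steps 1-2: clone LAMM inside the MP5 repo
cd MP5
git clone https://github.com/IranQin/LAMM
# step 3: place the Hugging Face datasets here
mkdir -p LAMM/datasets/2D_Instruct
cd LAMM
# step 4: place mineclip_image_encoder_vit-B_196tokens.pth here
mkdir -p model_zoo/mineclip_ckpt
# step 5: launch training (edit the script if you do not use Slurm)
. src/scripts/train_lamm2d_mc_v1.5_slurm.sh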

Running the agent

Note: We provide all the code except the human-designed interface code, which you can implement yourself or replace with MineDreamer as a low-level control module.

MineLLM service

To start the MineLLM service, please execute the following steps:

  1. Download the checkpoints from the web.
  2. Put the checkpoints into LAMM/ckpt and cd LAMM.
  3. Start the MineLLM service by running . src/scripts/mllm_api_slurm.sh (the script needs to be changed if you do not use Slurm); see the sketch below.
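
As a shell sketch (directory names follow the list above):

# step 2: place the downloaded checkpoints here
mkdir -p LAMM/ckpt
cd LAMM
# step 3: start the MineLLM service (edit the script if you do not use Slurm)
. src/scripts/mllm_api_slurm.sh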

MP5_agent

To run the MP5_agent, execute the following commands:

cd MP5_agent/
bash scripts/run_agent.sh
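
Since MineLLM is deployed on the server side (see Setup), the service should be reachable before the agent starts. A sketch of the full workflow, with the two conda environments from the setup section:

# terminal 1 (minellm env): serve MineLLM from LAMM/
cd LAMM
. src/scripts/mllm_api_slurm.sh

# terminal 2 (MP5_agent env): run the agent
cd MP5_agent/
bash scripts/run_agent.sh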

Citation

If you find this repository useful for your work, please consider citing it as follows:

@inproceedings{qin2024mp5,
 title={MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception},
 author={Qin, Yiran and Zhou, Enshen and Liu, Qichang and Yin, Zhenfei and Sheng, Lu and Zhang, Ruimao and Qiao, Yu and Shao, Jing},
 booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
 pages={16307--16316},
 year={2024}
}
@article{zhou2024minedreamer,
 title={MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control},
 author={Zhou, Enshen and Qin, Yiran and Yin, Zhenfei and Huang, Yuzhou and Zhang, Ruimao and Sheng, Lu and Qiao, Yu and Shao, Jing},
 journal={arXiv preprint arXiv:2403.12037},
 year={2024}
}
