Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
Updated Apr 30, 2025
This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥
awesome grounding: A curated list of research papers in visual grounding
Autonomous Agents (LLMs) research papers. Updated Daily.
A curated list for vision-and-language navigation. ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"
Democratization of RT-2 "RT-2: New model translates vision and language into action"
A curated list of awesome papers on Embodied AI and related research/industry-driven resources.
RAI is a vendor-agnostic agentic framework for robotics, utilizing ROS 2 tools to perform complex actions, defined scenarios, free interface execution, log summaries, voice interaction and more.
An open source framework for research in Embodied-AI from AI2.
Odyssey: Empowering Minecraft Agents with Open-World Skills
Seamlessly integrate state-of-the-art transformer models into robotics stacks
Embodied Co-Design for Rapidly Evolving Agents: Taxonomy, Frontiers, and Challenges
[arXiv 2023] Embodied Task Planning with Large Language Models
[CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
[CVPR 2025 Highlight🔥] Official code repository for "Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning"
A collection of vision-language-action model post-training methods.
[IROS'25 Oral & NeurIPSw'24] Official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control"
[NeurIPS'25] TC-Light: Temporally Coherent Generative Rendering for Realistic World Transfer
Official Repo of LangSuitE
[NeurIPS 2024] GenRL: Multimodal-foundation world models enable grounding language and video prompts into embodied domains, by turning them into sequences of latent world model states. Latent state sequences can be decoded using the decoder of the model, allowing visualization of the expected behavior, before training the agent to execute it.