IBM/AssetOpsBench

AI Agents for Industrial Asset Operations & Maintenance

AssetOps MultiAgentBench EMNLP 2025 NeurIPS 2025 AAAI 2026

📘 Tutorials: Learn more from our detailed guides:
ReActXen IoT Agent (EMNLP 2025) | FailureSensorIQ (NeurIPS 2025) | AssetOpsBench Lab (AAAI 2026) | Spiral (AAAI 2026) | AssetOpsBench Technical Material

📄 Paper | 🤗 HF-Dataset | 📢 IBM Blog | 🤗 HF Blog | Contributors

Kaggle Hugging Face Open In Colab


📢 Call for Scenario Contribution

We are expanding AssetOpsBench to cover a broader range of industrial challenges. We invite researchers and practitioners to contribute new scenarios, particularly in the following areas:

  • Asset Classes: Turbines, HVAC Systems, Pumps, Transformers, CNC Machines, Robotics, and others.
  • Task Domains: Prognostics and Health Management, Remaining Useful Life (RUL) estimation, Root Cause Analysis (RCA), Diagnostic Analysis, and Predictive Maintenance.

How to contribute:

  1. Study the Hugging Face dataset.
  2. Define your scenario following our Guideline.
  3. Submit a Pull Request or open an Issue with the tag new-scenario.
  4. Contact us by email if you have any questions.

Resources


📑 Table of Contents

  1. Announcements
  2. Introduction
  3. Datasets
  4. AI Agents
  5. Multi-Agent Frameworks
  6. System Diagram
  7. Leaderboards
  8. Docker Setup
  9. Talks & Events
  10. External Resources
  11. Contributors

Announcements (Papers, Invited Talks, etc)

  • 📰 AAAI-2026: SPIRAL: Symbolic LLM Planning via Grounded and Reflective Search Authors
    Code

  • 🎯 AAAI-2026 Lab: From Inception to Productization: Hands-on Lab for the Lifecycle of Multimodal Agentic AI in Industry 4.0
    Website Authors AAAI 2026 Slides

  • 📰 AABA4ET/AAAI-2026: Agentic Code Generation for Heuristic Rules in Equipment Monitoring Authors

  • 📰 IAAI/AAAI-2026: Diversity Meets Relevancy: Multi-Agent Knowledge Probing for Industry 4.0 Applications Authors

  • 📰 IAAI/AAAI-2026: Deployed AI Agents for Industrial Asset Management: CodeReAct Framework for Event Analysis and Work Order Automation Authors

  • 📰 AAAI-2026 Demo: AssetOpsBench-Live: Privacy-Aware Online Evaluation of Multi-Agent Performance in Industrial Operations
    Authors Demo Video

  • 📰 NeurIPS-2025 Social: Evaluating Agentic Systems
    Talk: Building Reliable Agentic Benchmarks: Insights from AssetOpsBench
    Total Registered Users: 2000+
    Conference
    Speaker
    Attend on Luma

  • 🕓 Past Event: 2025-10-03 – 2-Hour Workshop: AI Agents and Their Role in Industry 4.0 Applications
    Event Host

  • 🏆 Accepted Papers: Parts of this work were accepted at NeurIPS 2025, the EMNLP 2025 Research Track, and the EMNLP 2025 Industry Track.

  • 🚀 2025-09-01: CODS 2025 Competition launched – Access AI Agentic Challenge AssetOpsBench-Live.

  • 📦 2025-06-01: AssetOpsBench v1.0 released with 141 industrial scenarios.

✨ Stay tuned for new tracks, competitions, and community events.


Introduction

AssetOpsBench is a unified framework for developing, orchestrating, and evaluating domain-specific AI agents in industrial asset operations and maintenance.

It provides:

  • 4 domain-specific agents
  • 2 multi-agent orchestration frameworks

Designed for maintenance engineers, reliability specialists, and facility planners, it allows reproducible evaluation of multi-step workflows in simulated industrial environments.


Datasets: 141 Scenarios

AssetOpsBench scenarios span multiple domains:

| Domain | Example Task |
|--------|--------------|
| IoT    | "List all sensors of Chiller 6 in MAIN site" |
| FMSR   | "Identify failure modes detected by Chiller 6 Supply Temperature" |
| TSFM   | "Forecast 'Chiller 9 Condenser Water Flow' for the week of 2020-04-27" |
| WO     | "Generate a work order for Chiller 6 anomaly detection" |

Some tasks focus on a single domain; others are multi-step, end-to-end workflows.
Explore all scenarios in the HF-Dataset.
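
As a rough illustration of how scenarios group by domain, the sketch below holds the four example tasks from this section in a small in-memory list. The `Scenario` class and its field names (`domain`, `text`) are assumptions made for this example and may not match the schema of the published dataset.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Scenario:
    """Hypothetical record layout; the real dataset schema may differ."""
    domain: str  # one of "IoT", "FMSR", "TSFM", "WO"
    text: str    # natural-language task given to the agent


# The four example tasks quoted in this README, one per domain.
SCENARIOS = [
    Scenario("IoT", "List all sensors of Chiller 6 in MAIN site"),
    Scenario("FMSR", "Identify failure modes detected by Chiller 6 Supply Temperature"),
    Scenario("TSFM", "Forecast 'Chiller 9 Condenser Water Flow' for the week of 2020-04-27"),
    Scenario("WO", "Generate a work order for Chiller 6 anomaly detection"),
]


def by_domain(scenarios: List[Scenario], domain: str) -> List[Scenario]:
    """Return only the scenarios tagged with the given domain."""
    return [s for s in scenarios if s.domain == domain]


# Count scenarios per domain and look up the forecasting example.
counts = Counter(s.domain for s in SCENARIOS)
tsfm_tasks = by_domain(SCENARIOS, "TSFM")
print(dict(counts))
print(tsfm_tasks[0].text)
```

The same filter could be applied to the full set of 141 scenarios once they are loaded from the dataset, assuming each record carries a domain label.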


AI Agents

Domain-Specific Agents (Important tools)

  • IoT Agent: get_sites, get_history, get_assets, get_sensors
  • FMSR Agent: get_sensors, get_failure_modes, get_failure_sensor_mapping
  • TSFM Agent: forecasting, timeseries_anomaly_detection
  • WO Agent: generate_work_order

Multi-Agent Frameworks (Blueprints)

  • MetaAgent: ReAct-based single-agent-as-tool orchestration
  • AgentHive: plan-and-execute sequential workflow
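
To make the plan-and-execute idea concrete, here is a minimal sketch, not the AgentHive implementation. The tool names are taken from the agent list above, but their signatures, the canned return values, and the `make_plan` and `execute` helpers are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for agent tools. The real tools query live
# services; these return canned placeholder values for illustration.
def get_sensors(asset: str) -> List[str]:
    return [f"{asset} Supply Temperature", f"{asset} Condenser Water Flow"]

def get_failure_modes(asset: str) -> List[str]:
    return ["placeholder failure mode"]

def generate_work_order(asset: str) -> str:
    return f"Work order drafted for {asset}"

# Registry that maps a tool name to its callable.
TOOLS: Dict[str, Callable[[str], object]] = {
    "get_sensors": get_sensors,
    "get_failure_modes": get_failure_modes,
    "generate_work_order": generate_work_order,
}

Plan = List[Tuple[str, str]]  # ordered (tool name, argument) steps


def make_plan(asset: str) -> Plan:
    """Fix the whole sequence of steps before anything runs."""
    return [
        ("get_sensors", asset),
        ("get_failure_modes", asset),
        ("generate_work_order", asset),
    ]


def execute(plan: Plan) -> List[Tuple[str, object]]:
    """Run each planned step in order and record the trajectory."""
    trajectory = []
    for tool_name, arg in plan:
        result = TOOLS[tool_name](arg)
        trajectory.append((tool_name, result))
    return trajectory


trajectory = execute(make_plan("Chiller 6"))
for step, (tool_name, result) in enumerate(trajectory, start=1):
    print(step, tool_name, result)
```

In a ReAct-style loop the next tool would instead be chosen after each observation, so the sequence is not fixed up front as it is in `make_plan`.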

MCP Environment

The mcp/ directory contains MCP servers and a plan-execute runner built on the Model Context Protocol. See INSTRUCTIONS.md for setup, usage, and testing.


Leaderboards

  • Evaluated with 7 Large Language Models
  • Trajectories scored using LLM Judge (Llama-4-Maverick-17B)
  • 6-dimensional criteria measure reasoning, execution, and data handling

Example: MetaAgent leaderboard

[Figure: MetaAgent leaderboard]


Run AssetOpsBench in Docker

  • Please Refer to the
  • Pre-built Docker Images: assetopsbench-basic (minimal) & assetopsbench-extra (full)
  • Conda environment: assetopsbench
  • Full setup guide
cd /path/to/AssetOpsBench
chmod +x benchmark/entrypoint.sh
docker-compose -f benchmark/docker-compose.yml build
docker-compose -f benchmark/docker-compose.yml up

External Resources



Contributors

Thanks go to these wonderful people ✨

