Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Fix bug with prometheus embeds #1044

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
aantn wants to merge 1 commit into master
base: master
Choose a base branch
Loading
from fix-prometheus-embedding-bug
Draft

Conversation

Copy link
Contributor

@aantn aantn commented Oct 8, 2025

Sometimes holmes tries to create an embed based on Prometheus instant queries, which is wrong. It should only create embeds based on range queries.

Copy link
Contributor

coderabbitai bot commented Oct 8, 2025
edited
Loading

Walkthrough

Updated the Prometheus toolset instructions template by removing prior guidance about embedding execution results and partial-result snippets, adding a new section on embedding Prometheus result graphs with constraints, and introducing a labels section that iterates over config.additional_labels to append labels to all PromQL queries.

Changes

Cohort / File(s) Summary
Prometheus toolset instructions
`holmes/plugins/toolsets/prometheus/prometheus_instructions.jinja2`
- Removed requirement to embed execution results and specific partial-result snippet in answers
- Added "Embedding Graphs in Responses" section with format, allowed tool (only `execute_prometheus_range_query`), max graph count, and spacing rules
- Added "Labels to ALWAYS add to PromQL queries" section using templated iteration over `config.additional_labels` to append labels to all queries

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly summarizes the primary change by stating that the pull request fixes a bug related to Prometheus embeds, directly reflecting the PR’s focus on preventing embeds for instant queries and ensuring they only occur for range queries.
Description Check ✅ Passed The description clearly states the issue of embeds being generated from instant queries and specifies the intended behavior of only embedding range query results, which aligns directly with the changes introduced in the PR.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix-prometheus-embedding-bug

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
holmes/plugins/toolsets/prometheus/prometheus_instructions.jinja2 (2)

62-62: Clarify time-range bounds to avoid future-time queries

"15 min before and after the alert start time" can imply querying the future. Suggest bounding end at min(alert end, now).

Apply:

- * When doing queries, always extend the time range, to 15 min before and after the alert start time
+ * When doing queries, extend the time range to 15 min before the alert start time and up to the earlier of (alert end time, now)

81-87: Stabilize label listing order and simplify the condition

  • Use dictsort for deterministic output.
  • Use length filter directly on the dict.
-{%- if config and config.additional_labels and config.additional_labels.keys()|list|length > 0 %}
+{%- if config and (config.additional_labels | length) > 0 %}
 ## Labels to ALWAYS add to PromQL queries
 * ALWAYS add the following additional labels to ALL PromQL queries:
-{%- for key, value in config.additional_labels.items() %}
+{%- for key, value in (config.additional_labels | dictsort) %}
 * {{ key }}="{{ value }}"
 {%- endfor -%}
 {%- endif -%}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e6174e0 and e47172e.

📒 Files selected for processing (1)
  • holmes/plugins/toolsets/prometheus/prometheus_instructions.jinja2 (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
holmes/plugins/toolsets/**/*

📄 CodeRabbit inference engine (CLAUDE.md)

Toolsets must be located as holmes/plugins/toolsets/{name}.yaml or holmes/plugins/toolsets/{name}/

Files:

  • holmes/plugins/toolsets/prometheus/prometheus_instructions.jinja2
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: llm_evals
  • GitHub Check: Pre-commit checks
🔇 Additional comments (1)
holmes/plugins/toolsets/prometheus/prometheus_instructions.jinja2 (1)

28-36: Metric hints look good

Preselected metrics guidance is clear and consistent with later rate/avg tips.

Comment on lines +63 to +72
## Embedding Graphs in Responses
* You are running in a GUI and can show users graphs so they can interpret the data themselves and confirm what you are saying
* When relevant to the user question OR if you think it can help users visually confirm what you are saying, ALWAYS embed a graph of Prometheus results into your answer so that users can see it as follows:
* Embed a graph by inserting the following format:

<< {"type": "promql", "tool_name": "execute_prometheus_range_query", "random_key": "92jf2hf"} >>

* ALWAYS include the "tool_name" and "random_key"
* You may ONLY embed graphs for execute_prometheus_range_query (and NOT execute_prometheus_instant_query)

Copy link
Contributor

@coderabbitai coderabbitai bot Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Tighten embed requirements: require type, unique random_key, and prior tool execution

  • Make "type" required (parser often relies on it).
  • State random_key must be unique per embed.
  • Require executing execute_prometheus_range_query with the exact PromQL before inserting the embed.
 ## Embedding Graphs in Responses
 * You are running in a GUI and can show users graphs so they can interpret the data themselves and confirm what you are saying
-* When relevant to the user question OR if you think it can help users visually confirm what you are saying, ALWAYS embed a graph of Prometheus results into your answer so that users can see it as follows:
+* When relevant to the user question OR if you think it can help users visually confirm what you are saying, ALWAYS embed a graph of Prometheus results into your answer so that users can see it as follows:
 * Embed a graph by inserting the following format:
 
 << {"type": "promql", "tool_name": "execute_prometheus_range_query", "random_key": "92jf2hf"} >>
 
-* ALWAYS include the "tool_name" and "random_key"
-* You may ONLY embed graphs for execute_prometheus_range_query (and NOT execute_prometheus_instant_query)
+* ALWAYS include "type", "tool_name", and a unique "random_key" per embed
+* You may ONLY embed graphs for execute_prometheus_range_query (and NOT execute_prometheus_instant_query)
+* ALWAYS execute execute_prometheus_range_query with the exact PromQL you want graphed before adding the embed placeholder

Please verify the embed parser tolerates the whitespace shown in the examples (spaces after ‘<<’ and before ‘>>’) and that no other templates allow instant-query embeds. Run:


🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo "1) Check parser patterns (tolerance for whitespace around << >> and required fields):"
rg -nP -C2 '<<\s*\{.*"type"\s*:\s*"promql".*"tool_name"\s*:\s*"execute_prometheus_range_query".*"random_key"\s*:\s*".+?"\s*\}\s*>>'
echo
echo "2) Search for any embed placeholders that reference execute_prometheus_instant_query:"
rg -nP '<<\s*\{[^}]*"tool_name"\s*:\s*"execute_prometheus_instant_query"'
echo
echo "3) Search for code that parses/prompts for promql embeds to confirm whitespace tolerance:"
rg -nP -C3 '<<\s*\{.*"type".*"tool_name".*"random_key".*\}\s*>>'

Length of output: 20947


Tighten PromQL embed requirements

  • Require each embed placeholder to include "type", "tool_name", and a unique "random_key".
  • Always execute execute_prometheus_range_query with the exact PromQL before inserting the embed placeholder.
  • Whitespace around << and >> is already tolerated—no changes needed to example spacing.
  • Remove or update any execute_prometheus_instant_query embeds (e.g. in FEATURES.md) so only range-query embeds remain.
🤖 Prompt for AI Agents
holmes/plugins/toolsets/prometheus/prometheus_instructions.jinja2 lines 63-72:
tighten embed handling so every Prometheus graph embed placeholder includes the
keys "type", "tool_name", and a unique "random_key"; ensure the code path always
executes execute_prometheus_range_query with the exact PromQL string prior to
inserting the placeholder (do not insert the placeholder unless that range query
was run); and find/replace any execute_prometheus_instant_query embeds
(including those in FEATURES.md) to use execute_prometheus_range_query
equivalents or remove them so only range-query embeds remain.

Copy link
Contributor

github-actions bot commented Oct 8, 2025

Results of HolmesGPT evals

  • ask_holmes: 34/35 test cases were successful, 0 regressions, 1 setup failures
Test suite Test case Status
ask 01_how_many_pods
ask 02_what_is_wrong_with_pod
ask 04_related_k8s_events
ask 05_image_version
ask 09_crashpod
ask 10_image_pull_backoff
ask 110_k8s_events_image_pull
ask 11_init_containers
ask 13a_pending_node_selector_basic
ask 14_pending_resources
ask 15_failed_readiness_probe
ask 17_oom_kill
ask 19_detect_missing_app_details
ask 20_long_log_file_search
ask 24_misconfigured_pvc
ask 24a_misconfigured_pvc_basic
ask 28_permissions_error 🚧
ask 39_failed_toolset
ask 41_setup_argo
ask 42_dns_issues_steps_new_tools
ask 43_current_datetime_from_prompt
ask 45_fetch_deployment_logs_simple
ask 51_logs_summarize_errors
ask 53_logs_find_term
ask 54_not_truncated_when_getting_pods
ask 59_label_based_counting
ask 60_count_less_than
ask 61_exact_match_counting
ask 63_fetch_error_logs_no_errors
ask 79_configmap_mount_issue
ask 83_secret_not_found
ask 86_configmap_like_but_secret
ask 93_calling_datadog[0]
ask 93_calling_datadog[1]
ask 93_calling_datadog[2]

Legend

  • ✅ the test was successful
  • :minus: the test was skipped
  • ⚠️ the test failed but is known to be flaky or known to fail
  • 🚧 the test had a setup failure (not a code regression)
  • 🔧 the test failed due to mock data issues (not a code regression)
  • 🚫 the test was throttled by API rate limits/overload
  • ❌ the test failed and should be fixed before merging the PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

@coderabbitai coderabbitai[bot] coderabbitai[bot] left review comments

At least 1 approving review is required to merge this pull request.

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

1 participant

AltStyle によって変換されたページ (->オリジナル) /