Exclude TodoWrite tool from numbering and display #887

Open

pavangudiwada wants to merge 1 commit into master

from fix-todowrite-tool-numbering

+78 −25

Conversation

pavangudiwada

Contributor

@pavangudiwada pavangudiwada commented Aug 21, 2025

- Skip numbering for TodoWrite tool calls in both regular and streaming modes
- Filter TodoWrite tools from /show command history
- Hide TodoWrite tools from /last command and auto-display output
- Maintain correct sequential numbering for other tools

Before

[Screenshot: CleanShot 2025-08-21 at 18:45:48]

After

[Screenshot: CleanShot 2025-08-21 at 19:04:19]

@pavangudiwada


 Exclude TodoWrite tool from numbering and display

4f02c9c

- Skip numbering for TodoWrite tool calls in both regular and streaming modes
- Filter TodoWrite tools from /show command history
- Hide TodoWrite tools from /last command and auto-display output
- Maintains correct sequential numbering for other tools

@pavangudiwada pavangudiwada requested review from aantn and arikalon1

August 21, 2025 13:35

@coderabbitai coderabbitai

Contributor

coderabbitai bot commented Aug 21, 2025 (edited)

Walkthrough

Adjusts tool numbering in tool_calling_llm to skip TodoWrite tools and uses computed tool_name for events; updates interactive UI to filter out TodoWrite tool calls from displays, history, and completions in multiple flows, including streaming and synchronous paths. No public API changes.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Tool numbering and event naming `holmes/core/tool_calling_llm.py` | Compute non_todo_write_count per batch; assign tool_number=None for TodoWrite, else a sequential number that excludes TodoWrite; update tool_number_offset by non_todo_write_count; derive tool_name from function/custom for START_TOOL and invoke; mirror the logic in call_stream. |
| Interactive UI filtering `holmes/interactive.py` | Filter out TodoWrite tool calls in handle_last_command, display_recent_tool_outputs, run_interactive_loop, and get_bottom_toolbar; update usage counts, displays, history, and completer; show an explicit message when only TodoWrite calls exist. |

Sequence Diagram(s)

sequenceDiagram
 autonumber
 participant LLM as LLM
 participant TC as ToolCaller
 participant Tool as Tool.invoke
 LLM->>TC: tools_to_call (mixed, incl. TodoWrite)
 loop For each tool t
 TC->>TC: tool_name = t.function.name or t.custom.name
 alt tool_name == "TodoWrite"
 TC->>Tool: invoke(t, tool_number=None)
 else Non-TodoWrite
 TC->>Tool: invoke(t, tool_number=offset + non_todo_write_count)
 TC->>TC: non_todo_write_count++
 end
 TC-->>LLM: START_TOOL event with tool_name
 end
 TC->>TC: offset += non_todo_write_count
 TC-->>LLM: results (streaming or sync)
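
The numbering flow in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code in holmes/core/tool_calling_llm.py: `assign_tool_numbers` is a hypothetical helper, tool calls are reduced to plain name strings, the tool names other than TodoWrite are made up, and 1-based numbering (the first visible tool gets offset + 1) is an assumption.

```python
from typing import List, Optional, Tuple

TODO_WRITE = "TodoWrite"


def assign_tool_numbers(
    tool_names: List[str], offset: int = 0
) -> Tuple[List[Optional[int]], int]:
    """Number one batch of tool calls, skipping TodoWrite.

    Returns the per-call numbers (None for TodoWrite) and the offset to
    use for the next batch. Hypothetical helper; 1-based numbering is an
    assumption.
    """
    numbers: List[Optional[int]] = []
    non_todo_write_count = 0
    for name in tool_names:
        if name == TODO_WRITE:
            numbers.append(None)  # TodoWrite never consumes a number
        else:
            non_todo_write_count += 1
            numbers.append(offset + non_todo_write_count)
    return numbers, offset + non_todo_write_count


# Two consecutive batches; the second starts where the first left off.
first, offset = assign_tool_numbers(["kubectl_get", "TodoWrite", "fetch_logs"])
second, offset = assign_tool_numbers(["TodoWrite", "kubectl_describe"], offset)
print(first, second, offset)
```

Because the offset advances only by the count of non-TodoWrite calls, the visible numbers stay contiguous across batches regardless of how many TodoWrite calls are interleaved.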

sequenceDiagram
 autonumber
 participant UI as Interactive UI
 participant Resp as LLM Response
 participant Hist as History/Completer
 Resp-->>UI: response.tool_calls (may include TodoWrite)
 UI->>UI: non_todo_write_tools = filter(tool_name != "TodoWrite")
 alt non_todo_write_tools empty
 UI-->>User: "No displayable tool calls..." (where relevant)
 else
 UI-->>User: Display panels, counts
 UI->>Hist: extend with non_todo_write_tools
 end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

- Fix tool call numbering to maintain sequential order across iterations #885: Earlier introduction of tool_number_offset; this PR changes its increment logic by excluding TodoWrite.
- fix parallel tool calling #661: Adds tool_number to tool.invoke; this PR revises how that number is computed.
- Interactive mode: add /run, /clear, /show, and /context commands + other improvements #644: Adjusts interactive UI tool-call display and indexing; this PR further filters those displays to exclude TodoWrite.

Suggested reviewers

aantn
arikalon1

coderabbitai[bot]

coderabbitai bot reviewed

Aug 21, 2025

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (6)

holmes/core/tool_calling_llm.py (4)
386-401: Skip-numbering logic for TodoWrite is correct; consider extracting a robust tool_name helper to avoid duplication and edge cases.

The conditional increment using non_todo_write_count achieves the intended numbering behavior. However, the tool_name extraction and TodoWrite check are duplicated here and in call_stream(), and the hasattr(...) else t.custom.name branch can still throw if a future SDK shape lacks both attributes. Extract a single helper and use safe getattr to make this future-proof and DRY.

Apply this diff in-place:
- for t in tools_to_call:
+ for t in tools_to_call:
      logging.debug(f"Tool to call: {t}")
-     # Check if this is a TodoWrite tool
-     tool_name = (
-         t.function.name if hasattr(t, "function") else t.custom.name
-     )
-     is_todo_write = tool_name == "TodoWrite"
+     # Derive tool name in a robust way (handles function/custom/unknown)
+     tool_name = self._get_tool_name_from_call(t)
+     is_todo_write = tool_name == "TodoWrite"
Add this helper once in the class (outside the selected lines), then reuse it in both call() and call_stream():
# Place inside class ToolCallingLLM (e.g., near other @staticmethods)
@staticmethod
def _get_tool_name_from_call(tool_call: ChatCompletionMessageToolCall) -> str:
    # Prefer .function.name when available; fall back to .custom.name; else "unknown"
    fn = getattr(tool_call, "function", None)
    if fn and getattr(fn, "name", None):
        return fn.name
    custom = getattr(tool_call, "custom", None)
    if custom and getattr(custom, "name", None):
        return custom.name
    return "unknown"
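
To see how the suggested fallback order behaves, here is a small standalone sketch. It is not the project's code: `get_tool_name_from_call` is a free-function variant of the helper above, and the `SimpleNamespace` objects are stand-ins for the SDK's tool call types, whose exact shape is assumed here.

```python
from types import SimpleNamespace
from typing import Any


def get_tool_name_from_call(tool_call: Any) -> str:
    """Same fallback order as the suggested helper: function, custom, "unknown"."""
    for attr in ("function", "custom"):
        part = getattr(tool_call, attr, None)
        name = getattr(part, "name", None)  # safe even when part is None
        if name:
            return name
    return "unknown"


# Stand-in objects (assumed shapes); real tool calls come from the LLM SDK.
function_call = SimpleNamespace(function=SimpleNamespace(name="TodoWrite"))
custom_call = SimpleNamespace(custom=SimpleNamespace(name="my_custom_tool"))
none_function = SimpleNamespace(function=None, custom=SimpleNamespace(name="x"))
empty_call = SimpleNamespace()

for call in (function_call, custom_call, none_function, empty_call):
    print(get_tool_name_from_call(call))
```

The `none_function` case is the one the `hasattr(...)` expression would not survive: the attribute exists, so `t.function.name` is evaluated on `None` and raises `AttributeError`.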
709-723: Duplicate TodoWrite detection and numbering logic in call_stream — reuse the same helper.

Same suggestion as in call(): de-duplicate and harden name extraction to avoid attribute errors and keep behavior consistent.

Apply this diff:
- for t in tools_to_call:  # type: ignore
-     # Check if this is a TodoWrite tool
-     tool_name = (
-         t.function.name if hasattr(t, "function") else t.custom.name
-     )
-     is_todo_write = tool_name == "TodoWrite"
+ for t in tools_to_call:  # type: ignore
+     tool_name = self._get_tool_name_from_call(t)
+     is_todo_write = tool_name == "TodoWrite"
724-736: Consider including tool_number in START_TOOL stream events for non-TodoWrite tools.

This can help stream consumers show consistent numbering during live updates (even if completion order differs from execution order). Keep it omitted for TodoWrite by adding only when not None.

Apply this diff:
- yield StreamMessage(
-     event=StreamEvents.START_TOOL,
-     data={"tool_name": tool_name, "id": t.id},
- )
+ event_data = {"tool_name": tool_name, "id": t.id}
+ if tool_num is not None:
+     event_data["tool_number"] = tool_num
+ yield StreamMessage(
+     event=StreamEvents.START_TOOL,
+     data=event_data,
+ )
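
For illustration, here is a minimal sketch of how a stream consumer might use the optional field. Both function names and the `[N] name` display format are hypothetical; only the payload keys (`tool_name`, `id`, `tool_number`) come from the suggestion above.

```python
from typing import Any, Dict, Optional


def build_start_tool_data(
    tool_name: str, tool_id: str, tool_num: Optional[int]
) -> Dict[str, Any]:
    """Build the START_TOOL payload; tool_number is added only when set."""
    data: Dict[str, Any] = {"tool_name": tool_name, "id": tool_id}
    if tool_num is not None:
        data["tool_number"] = tool_num
    return data


def render_start_line(data: Dict[str, Any]) -> str:
    """Hypothetical consumer: prefix the tool name with its number when present."""
    number = data.get("tool_number")
    prefix = f"[{number}] " if number is not None else ""
    return f"{prefix}{data['tool_name']}"


print(render_start_line(build_start_tool_data("fetch_logs", "call_1", 3)))
print(render_start_line(build_start_tool_data("TodoWrite", "call_2", None)))
```

Omitting the key for TodoWrite, rather than sending `None`, means existing consumers that ignore `tool_number` see no change in payload shape.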
434-435: Type-hint tool_number as Optional[int] for mypy and caller clarity.

This parameter is now semantically optional. Add the type hint to satisfy the project’s typing guideline.
- tool_number=None,
+ tool_number: Optional[int] = None,
holmes/interactive.py (2)
748-759: Good: TodoWrite calls are hidden from /last; centralize the hidden-set to avoid string duplication.

Filtering here achieves the PR goal. To prevent drift, define a constant set of hidden tool names once and reuse it across the file.

Apply this diff in place:
- non_todo_write_tools = [
-     tool_call
-     for tool_call in last_response.tool_calls
-     if tool_call.tool_name != "TodoWrite"
- ]
+ non_todo_write_tools = [
+     tool_call
+     for tool_call in last_response.tool_calls
+     if tool_call.tool_name not in HIDDEN_TOOL_NAMES
+ ]
Add this once near the imports (outside the selected lines):
# Tools hidden from numbering/display flows
HIDDEN_TOOL_NAMES = {"TodoWrite"}
785-806: DRY the TodoWrite filtering with a small helper to keep behavior consistent across code paths.

The same filter appears here and in handle_last_command and run_interactive_loop. Consider a helper filter_displayable_tool_calls(...) to reduce duplication and future mistakes.

Apply this diff in place:
- non_todo_write_tools = [
-     tool_call for tool_call in tool_calls if tool_call.tool_name != "TodoWrite"
- ]
+ non_todo_write_tools = filter_displayable_tool_calls(tool_calls)
Add this helper once (outside selected lines):
def filter_displayable_tool_calls(tool_calls: List[ToolCallResult]) -> List[ToolCallResult]:
    return [tc for tc in tool_calls if tc.tool_name not in HIDDEN_TOOL_NAMES]
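
Put together, the two suggestions for holmes/interactive.py might look like the sketch below. `ToolCallStub` is a stand-in for the project's tool call result type, `describe_last_tools` and its message strings are hypothetical, and the tool names other than TodoWrite are made up.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Tools hidden from numbering/display flows
HIDDEN_TOOL_NAMES: FrozenSet[str] = frozenset({"TodoWrite"})


@dataclass
class ToolCallStub:
    """Stand-in for the real tool call result; only tool_name matters here."""

    tool_name: str


def filter_displayable_tool_calls(tool_calls: List[ToolCallStub]) -> List[ToolCallStub]:
    return [tc for tc in tool_calls if tc.tool_name not in HIDDEN_TOOL_NAMES]


def describe_last_tools(tool_calls: List[ToolCallStub]) -> str:
    """Hypothetical /last-style summary that counts only displayable tools."""
    visible = filter_displayable_tool_calls(tool_calls)
    if not visible:
        return "No displayable tool calls"
    names = ", ".join(tc.tool_name for tc in visible)
    return f"Used {len(visible)} tools: {names}"


calls = [ToolCallStub("TodoWrite"), ToolCallStub("kubectl_get"), ToolCallStub("fetch_logs")]
print(describe_last_tools(calls))
print(describe_last_tools([ToolCallStub("TodoWrite")]))
```

With the names held in one set, hiding another tool later is a one-line change instead of an edit at every call site.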

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

MCP integration is disabled by default for public repositories
Jira integration is disabled by default for public repositories
Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9746fa6 and 4f02c9c.

📒 Files selected for processing (2)

holmes/core/tool_calling_llm.py (4 hunks)
holmes/interactive.py (3 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: ALWAYS place Python imports at the top of the file, not inside functions or methods
Type hints are required (project is type-checked with mypy)
Use Ruff for formatting and linting (configured in pyproject.toml)

Files:

holmes/core/tool_calling_llm.py
holmes/interactive.py

🧬 Code graph analysis (2)

holmes/core/tool_calling_llm.py (2)

holmes/core/tracing.py (1)

DummySpan (101-117)

holmes/utils/stream.py (2)

StreamMessage (18-20)

StreamEvents (10-15)

holmes/interactive.py (1)

holmes/core/tools.py (1)

to_emoji (36-42)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: build (3.12)
GitHub Check: build (3.10)
GitHub Check: build (3.11)
GitHub Check: llm_evals
GitHub Check: Pre-commit checks

🔇 Additional comments (4)

holmes/core/tool_calling_llm.py (2)

420-422: Offset update matches non-TodoWrite count — numbering stays contiguous across batches.

No issues. This ensures the next iteration starts numbering at the correct offset, ignoring TodoWrite calls.

402-410: Awaiting results from the tool_num lookup to confirm it’s never zero. This will verify that tool_number=None is the only falsey case and thus downstream formatting is safe.
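
The concern behind this check is the usual truthiness pitfall with `Optional[int]`: a test like `if tool_number:` treats 0 the same as None. A small sketch under assumed names (`tool_header` and the `#N name` format are hypothetical, not taken from the PR):

```python
from typing import Optional


def tool_header(tool_name: str, tool_number: Optional[int] = None) -> str:
    """Format a display header; only None suppresses the number."""
    # Explicit None check: a truthiness test would also drop a number of 0.
    if tool_number is None:
        return tool_name
    return f"#{tool_number} {tool_name}"


print(tool_header("fetch_logs", 1))
print(tool_header("fetch_logs", 0))
print(tool_header("TodoWrite"))
```

If numbering is always 1-based, the two styles behave identically; the explicit check simply removes the dependency on that assumption.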

holmes/interactive.py (2)

761-765: Usage header correctly reflects only displayable tools.

This keeps user-facing counts aligned with /show indexing. No changes needed.

1054-1061: Confirming /show Indexing Consistency

Based on the ripgrep output:

In holmes/interactive.py at line 1043, tool_number_offset=len(all_tool_calls_history) is indeed the only offset passed into the LLM invocation (corresponding to the /show command) – nothing else in this file mutates or supplies a different offset.

Immediately after receiving response.tool_calls, the code filters out "TodoWrite" and extends all_tool_calls_history only with the non-TodoWrite calls (lines 1057–1060), ensuring the history used for /show remains aligned with the numbering basis.

In holmes/core/tool_calling_llm.py, the default tool_number_offset = 0 at line 622 is only ever used when no external offset is supplied (i.e., in contexts outside interactive.py).

There are no other occurrences of tool_number_offset in the repository. This confirms that:

The numbering offset for /show is always derived from the length of the filtered history.

No hidden or alternate offset sources exist that could desynchronize the displayed indices.

No changes are required. Everything aligns with the intended non-TodoWrite numbering.
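
The invariant described above can be checked with a small simulation. This is only a sketch under stated assumptions: `run_turn` is a hypothetical stand-in for one LLM invocation, tool calls are (name, number) pairs, the tool names other than TodoWrite are made up, and numbering is assumed to be 1-based.

```python
from typing import List, Optional, Tuple

TODO_WRITE = "TodoWrite"
NumberedCall = Tuple[str, Optional[int]]


def run_turn(tool_names: List[str], offset: int) -> List[NumberedCall]:
    """Stand-in for one LLM call: number visible tools starting at offset + 1."""
    results: List[NumberedCall] = []
    visible = 0
    for name in tool_names:
        if name == TODO_WRITE:
            results.append((name, None))
        else:
            visible += 1
            results.append((name, offset + visible))
    return results


history: List[NumberedCall] = []
turns = [["TodoWrite", "kubectl_get"], ["fetch_logs", "TodoWrite", "kubectl_describe"]]
for tool_names in turns:
    # The offset is derived from the filtered history, as in the review note.
    results = run_turn(tool_names, offset=len(history))
    history.extend(call for call in results if call[0] != TODO_WRITE)

# Each stored call's number matches its 1-based position in the history.
positions = [number for _, number in history]
print(positions)
```

If the history were extended with the unfiltered results instead, `len(history)` would count TodoWrite entries and the next turn's numbers would skip ahead of the /show indices.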

@github-actions GitHub Actions

Contributor

github-actions bot commented Aug 21, 2025

Results of HolmesGPT evals

ask_holmes: 33/39 test cases were successful, 2 regressions, 2 skipped, 2 setup failures

Test suite	Test case	Status
ask	01_how_many_pods	✅
ask	02_what_is_wrong_with_pod	✅
ask	03_what_is_the_command_to_port_forward	❌
ask	04_related_k8s_events	↪️
ask	05_image_version	✅
ask	09_crashpod	✅
ask	10_image_pull_backoff	✅
ask	11_init_containers	✅
ask	14_pending_resources	✅
ask	15_failed_readiness_probe	✅
ask	17_oom_kill	✅
ask	18_crash_looping_v2	✅
ask	19_detect_missing_app_details	✅
ask	20_long_log_file_search	✅
ask	24_misconfigured_pvc	✅
ask	28_permissions_error	🚧
ask	29_events_from_alert_manager	↪️
ask	39_failed_toolset	✅
ask	41_setup_argo	✅
ask	42_dns_issues_steps_new_tools	✅
ask	43_current_datetime_from_prompt	✅
ask	45_fetch_deployment_logs_simple	✅
ask	51_logs_summarize_errors	✅
ask	53_logs_find_term	✅
ask	54_not_truncated_when_getting_pods	✅
ask	59_label_based_counting	✅
ask	60_count_less_than	🚧
ask	61_exact_match_counting	✅
ask	63_fetch_error_logs_no_errors	✅
ask	79_configmap_mount_issue	✅
ask	83_secret_not_found	✅
ask	86_configmap_like_but_secret	✅
ask	93_calling_datadog	✅
ask	93_calling_datadog	✅
ask	93_calling_datadog	✅
ask	97_logs_clarification_needed	❌
ask	110_k8s_events_image_pull	✅
ask	24a_misconfigured_pvc_basic	✅
ask	13a_pending_node_selector_basic	✅

Legend

✅ the test was successful
↪️ the test was skipped
⚠️ the test failed but is known to be flaky or known to fail
🚧 the test had a setup failure (not a code regression)
🔧 the test failed due to mock data issues (not a code regression)
❌ the test failed and should be fixed before merging the PR
