feat: multi Agent OS server support in AskUiControllerClient #276

Open
mlikasam-askui wants to merge 21 commits into main
base: main
from SOLENG-360-refactor/askui-controller-multi-target
Commits
b654464
feat: support multiple target computers in AskUiControllerClient
mlikasam-askui May 8, 2026
768653c
add reporter messgaes
mlikasam-askui May 8, 2026
01e8f4f
fix logger
mlikasam-askui May 8, 2026
cfb84f6
refactor: drop tags from target computers and prefix tool output with...
mlikasam-askui May 8, 2026
de55cf2
fixed reporter
mlikasam-askui May 8, 2026
c7f9631
add temporary_select target
mlikasam-askui May 11, 2026
aa0aba5
add device_id
mlikasam-askui May 11, 2026
78523b4
improve doc string
mlikasam-askui May 11, 2026
4ca7f80
improve error message
mlikasam-askui May 11, 2026
9379663
add unit tests
mlikasam-askui May 11, 2026
e40c02c
improve tool results
mlikasam-askui May 12, 2026
3493c4a
fix lint problems
mlikasam-askui May 12, 2026
9a475c5
improve-doc-string
mlikasam-askui May 12, 2026
7bd98be
feat: capture unhandled errors in reporter
mlikasam-askui May 12, 2026
24de514
Revert "feat: capture unhandled errors in reporter"
mlikasam-askui May 12, 2026
55cdac0
implement review remarks
mlikasam-askui May 19, 2026
40873be
Merge remote-tracking branch 'origin/main' into SOLENG-360-refactor/a...
mlikasam-askui May 19, 2026
a8e6bef
refactor: rename AgentOs and computer-target classes for clarity
mlikasam-askui Jun 11, 2026
8885012
refactor: extract ComputerTargetConnection and let targets own their ...
mlikasam-askui Jun 11, 2026
c4fc57b
rm add_remote_agent_os_target_computer
mlikasam-askui Jun 12, 2026
b9389b1
replace list with describe
mlikasam-askui Jun 12, 2026
17 changes: 8 additions & 9 deletions CLAUDE.md
@@ -125,13 +125,12 @@ models.register("my-model", custom_model_instance)

### Agent OS Abstraction

`AgentOs` provides an abstraction layer for OS-level operations:
`ComputerAgentOS` provides an abstraction layer for OS-level operations:

```
AgentOs (Abstract Interface)
├── AskUiControllerClient (gRPC to AskUI Agent OS - primary)
ComputerAgentOS (Abstract Interface)
├── MultiComputerTargetAgentOS (gRPC to AskUI Agent OS - primary)
├── PlaywrightAgentOs (Web browser automation)
└── AndroidAgentOs (Android ADB)
```

### Locator System
@@ -175,7 +174,7 @@ Tools are auto-discovered and can be dynamically loaded via MCP configurations.
- `src/askui/prompts/` - System prompts for different models

### Tools & OS
- `src/askui/tools/agent_os.py` - Abstract `AgentOs` interface
- `src/askui/tools/agent_os.py` - Abstract `ComputerAgentOS` interface
- `src/askui/tools/askui/` - gRPC client for AskUI Agent OS
- `src/askui/tools/android/` - Android-specific tools
- `src/askui/tools/playwright/` - Web automation tools
@@ -247,7 +246,7 @@ When writing or updating documentation in `docs/`:
## Important Patterns

### Composition over Inheritance
- `AgentToolbox` wraps `AgentOs` implementations
- `AgentToolbox` wraps `ComputerAgentOS` implementations
- `ModelRouter` composes multiple model providers
- `CompositeReporter` aggregates multiple reporters

@@ -261,7 +260,7 @@ When writing or updating documentation in `docs/`:
- Retry strategies with exponential backoff

### Adapter Pattern
- `AgentOs` abstraction bridges OS implementations (gRPC, Playwright, ADB)
- `ComputerAgentOS` abstraction bridges OS implementations (gRPC, Playwright, ADB)
- `ModelFacade` adapts models to `ActModel`/`GetModel`/`LocateModel` interfaces

### Dependency Injection
@@ -299,13 +298,13 @@ When writing or updating documentation in `docs/`:
### Adding Custom Tools
1. Implement `Tool` protocol in `models/shared/tools.py`
2. Register in appropriate MCP server (`api/mcp_servers/{type}.py`)
3. Use `@auto_inject_agent_os` for AgentOs dependency
3. Use `@auto_inject_agent_os` for ComputerAgentOS dependency
4. Follow Pydantic schema validation

### Adding New Agent Types
1. Inherit from `Agent`
2. Implement required abstract methods
3. Provide appropriate `AgentOs` implementation
3. Provide appropriate `ComputerAgentOS` implementation
4. Register in agent factory if needed

## Performance & Caching

2 changes: 1 addition & 1 deletion docs/07_tools.md
@@ -68,7 +68,7 @@ Work with any agent type, no special dependencies required.

#### Computer Tools (`computer/`)

Require `AgentOs` and work with `ComputerAgent` for desktop automation.
Require `ComputerAgentOS` and work with `ComputerAgent` for desktop automation.

**Examples:**
- `ComputerSaveScreenshotTool(base_dir)` - Save screenshots to disk

2 changes: 2 additions & 0 deletions mypy.ini
@@ -16,6 +16,8 @@ plugins = pydantic.mypy,sqlalchemy.ext.mypy.plugin
exclude = (?x)(
^src/askui/models/ui_tars_ep/ui_tars_api\.py$
| ^src/askui/tools/askui/askui_ui_controller_grpc/.*$
| ^venv/.*$
| ^\.venv/.*$
)
mypy_path = src:tests
explicit_package_bases = true
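
Reviewer note: the two new `exclude` alternatives are anchored with `^`, so they should only skip virtual environments at the repository root. The sketch below checks that with Python's `re` module. It assumes the pattern is applied to repo-relative paths with forward slashes; the helper name `is_excluded` is hypothetical and not part of mypy or this PR.

```python
import re

# Pattern copied from the `exclude` setting above. `(?x)` turns on verbose
# mode, so the whitespace and line breaks inside the pattern are ignored.
EXCLUDE = re.compile(
    r"""(?x)(
    ^src/askui/models/ui_tars_ep/ui_tars_api\.py$
    | ^src/askui/tools/askui/askui_ui_controller_grpc/.*$
    | ^venv/.*$
    | ^\.venv/.*$
    )"""
)


def is_excluded(path: str) -> bool:
    """Return True if the exclude pattern matches a repo-relative path."""
    return EXCLUDE.search(path) is not None


# Root-level virtual environments are skipped ...
root_venv = is_excluded("venv/lib/site-packages/pkg.py")
root_dot_venv = is_excluded(".venv/lib/site-packages/pkg.py")
# ... while ordinary sources and nested `venv` directories are still checked.
regular_source = is_excluded("src/askui/computer_agent.py")
nested_venv = is_excluded("src/venv/module.py")
```

If nested virtual environments should be skipped as well, the `^` anchors on the two new alternatives would need to be relaxed.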

3 changes: 3 additions & 0 deletions src/askui/__init__.py
@@ -45,6 +45,7 @@
from .models.types.response_schemas import ResponseSchema, ResponseSchemaBase
from .retry import ConfigurableRetry, Retry
from .tools import ModifierKey, PcKey
from .tools.askui import LocalComputerTarget, RemoteComputerTarget
from .utils.image_utils import ImageSource
from .utils.source_utils import InputSource

@@ -69,6 +70,8 @@
logging.getLogger(__name__).addHandler(logging.NullHandler())

__all__ = [
"RemoteComputerTarget",
"LocalComputerTarget",
"Agent",
"AutomationError",
"ComputerAgent",

4 changes: 2 additions & 2 deletions src/askui/agent_base.py
@@ -26,7 +26,7 @@
from askui.models.shared.truncation_strategies import TruncationStrategy
from askui.prompts.act_prompts import CACHE_USE_PROMPT, create_default_prompt
from askui.telemetry.otel import OtelSettings, setup_opentelemetry_tracing
from askui.tools.agent_os import AgentOs
from askui.tools.agent_os import ComputerAgentOS
from askui.tools.android.agent_os import AndroidAgentOs
from askui.tools.caching_tools import (
InspectCacheMetadata,
@@ -57,7 +57,7 @@ def __init__(
reporter: Reporter | None = None,
retry: Retry | None = None,
tools: list[Tool] | None = None,
agent_os: AgentOs | AndroidAgentOs | None = None,
agent_os: ComputerAgentOS | AndroidAgentOs | None = None,
settings: AgentSettings | None = None,
callbacks: list[ConversationCallback] | None = None,
truncation_strategy: TruncationStrategy | None = None,

77 changes: 68 additions & 9 deletions src/askui/computer_agent.py
@@ -17,11 +17,13 @@
create_computer_agent_prompt,
)
from askui.tools.computer import (
ComputerGetCurrentComputerTargetIdTool,
ComputerGetMousePositionTool,
ComputerGetSystemInfoTool,
ComputerKeyboardPressedTool,
ComputerKeyboardReleaseTool,
ComputerKeyboardTapTool,
ComputerListAgentOsTargetComputersTool,
ComputerListDisplaysTool,
ComputerMouseClickTool,
ComputerMouseHoldDownTool,
@@ -31,14 +33,15 @@
ComputerRetrieveActiveDisplayTool,
ComputerScreenshotTool,
ComputerSetActiveDisplayTool,
ComputerSwitchAgentOsTargetComputerTool,
ComputerTypeTool,
)
from askui.tools.exception_tool import ExceptionTool

from .reporting import CompositeReporter, Reporter
from .retry import Retry
from .tools import AgentToolbox, ComputerAgentOsFacade, ModifierKey, PcKey
from .tools.askui import AskUiControllerClient
from .tools.askui import ComputerTarget, MultiComputerTargetAgentOS

logger = logging.getLogger(__name__)

@@ -50,17 +53,39 @@ class ComputerAgent(Agent):
This agent can perform various UI interactions like clicking, typing, scrolling, and more.
It uses computer vision models to locate UI elements and execute actions on them.

A single `ComputerAgent` can drive **one or more machines** through the
`agent_os_target_computers` argument. Each entry is an Agent OS target
computer (local subprocess or remote gRPC endpoint) identified by a stable
`computer_id`. At any moment one target is *active* and receives all
explicit calls (`click`, `type`, `keyboard`, ...). The active target can be
changed at runtime via
`agent.tools.os.switch_agent_os_target_computer(computer_id)` or scoped to a
block using `agent.tools.os.temporary_select(computer_id)`. The `act()`
model is also given list/switch/get-current tools so it can orchestrate
work across machines on its own (e.g. read something on one computer and
re-enter it on another).

Args:
display (int, optional): The display number to use for screen interactions. Defaults to `1`.
display (int, optional): The display number to use for screen interactions on the default local target. Ignored when `agent_os_target_computers` is provided. Defaults to `1`.
reporters (list[Reporter] | None, optional): List of reporter instances for logging and reporting. If `None`, an empty list is used.
tools (AgentToolbox | None, optional): Custom toolbox instance. If `None`, a default one will be created with `AskUiControllerClient`.
agent_os_target_computers (list[ComputerTarget] | None, optional):
Target computers the agent can route actions to. May mix one
`LocalComputerTarget` (managing a controller subprocess on this
machine) with any number of `RemoteComputerTarget`s pointing at
controllers already running on other machines. Constraints: at
least one target, at most one local, and remote `address`es plus
all `computer_id`s must be unique. The first entry becomes the
initial active target. Defaults to a single local target bound to
`display`.
settings (AgentSettings | None, optional): Provider-based model settings. If `None`, uses the default AskUI model stack.
retry (Retry, optional): The retry instance to use for retrying failed actions. Defaults to `ConfigurableRetry` with exponential backoff. Currently only supported for `locate()` method.
act_tools (list[Tool] | None, optional): Additional tools to make available for
the `act()` method for every call. Same tools can instead be passed per call
via `act(..., tools=[...])` (see example below).

Example:
Single local machine (the default):

```python
from askui import ComputerAgent

@@ -70,6 +95,36 @@ class ComputerAgent(Agent):
agent.act("Open settings menu")
```

Example:
Research on one machine and write up the findings on another. The
first target in the list is the active one; `temporary_select`
re-routes a block of explicit calls and restores the previous
active target on exit.

```python
from askui import ComputerAgent
from askui.tools.askui import LocalComputerTarget, RemoteComputerTarget

with ComputerAgent(
agent_os_target_computers=[
LocalComputerTarget(computer_id="research-box"),
RemoteComputerTarget(
address="192.168.1.42:26000",
description="Writer box with a text editor open",
computer_id="writer-box",
),
],
) as agent:
agent.act(
"On research-box, open a browser, google 'askui', and read "
"the top results to gather key facts about what AskUI is, "
"what it does, and notable features. Then switch to "
"writer-box and write a Markdown document titled "
"'AskUI Findings' summarizing those facts as a bulleted "
"list in the open text editor."
)
```

Example (optional tools for `act()`):
Register tools from `askui.tools.store` (or your own `Tool` implementations)
either on the agent so they apply to all `act()` calls, or only for one call.
@@ -94,30 +149,31 @@ class ComputerAgent(Agent):
@telemetry.record_call(
exclude={
"reporters",
"tools",
"settings",
"act_tools",
"callbacks",
"truncation_strategy",
"agent_os_target_computers",
}
)
@validate_call(config=ConfigDict(arbitrary_types_allowed=True))
def __init__(
self,
display: Annotated[int, Field(ge=1)] = 1,
reporters: list[Reporter] | None = None,
tools: AgentToolbox | None = None,
agent_os_target_computers: list[ComputerTarget] | None = None,
settings: AgentSettings | None = None,
retry: Retry | None = None,
act_tools: list[Tool] | None = None,
callbacks: list[ConversationCallback] | None = None,
truncation_strategy: TruncationStrategy | None = None,
) -> None:
reporter = CompositeReporter(reporters=reporters)
self.tools = tools or AgentToolbox(
agent_os=AskUiControllerClient(
self.tools = AgentToolbox(
agent_os=MultiComputerTargetAgentOS(
display=display,
reporter=reporter,
agent_os_target_computers=agent_os_target_computers,
)
)
super().__init__(
@@ -500,8 +556,8 @@ def cli(

with ComputerAgent() as agent:
# Use for Windows
agent.cli(r'start "" "C:\Program Files\VideoLAN\VLC\vlc.exe"') # Start in VLC non-blocking
agent.cli(r'"C:\Program Files\VideoLAN\VLC\vlc.exe"') # Start in VLC blocking
agent.cli(r'start "" "C:\\Program Files\\VideoLAN\\VLC\\vlc.exe"') # Start in VLC non-blocking
agent.cli(r'"C:\\Program Files\\VideoLAN\\VLC\\vlc.exe"') # Start in VLC blocking

# Mac
agent.cli("open -a chrome") # Open Chrome non-blocking for mac
@@ -541,6 +597,9 @@ def get_default_tools() -> list[Tool]:
ComputerListDisplaysTool(),
ComputerRetrieveActiveDisplayTool(),
ComputerSetActiveDisplayTool(),
ComputerListAgentOsTargetComputersTool(),
ComputerSwitchAgentOsTargetComputerTool(),
ComputerGetCurrentComputerTargetIdTool(),
]


6 changes: 3 additions & 3 deletions src/askui/models/shared/android_base_tool.py
@@ -2,7 +2,7 @@

from askui.models.shared.tool_tags import ToolTags
from askui.models.shared.tools import ToolWithAgentOS
from askui.tools import AgentOs
from askui.tools import ComputerAgentOS
from askui.tools.agent_os_type_error import AgentOsTypeError
from askui.tools.android.agent_os import AndroidAgentOs

Expand Down Expand Up @@ -41,11 +41,11 @@ def agent_os(self) -> AndroidAgentOs:
return agent_os

@agent_os.setter
def agent_os(self, agent_os: AgentOs | AndroidAgentOs) -> None:
def agent_os(self, agent_os: ComputerAgentOS | AndroidAgentOs) -> None:
"""Set the agent OS.

Args:
agent_os (AgentOs | AndroidAgentOs): The agent OS instance to set.
agent_os (ComputerAgentOS | AndroidAgentOs): The agent OS instance to set.

Raises:
TypeError: If the agent OS is not an AndroidAgentOs instance.

25 changes: 13 additions & 12 deletions src/askui/models/shared/computer_base_tool.py
@@ -2,17 +2,17 @@

from askui.models.shared.tool_tags import ToolTags
from askui.models.shared.tools import ToolWithAgentOS
from askui.tools.agent_os import AgentOs
from askui.tools.agent_os import ComputerAgentOS
from askui.tools.agent_os_type_error import AgentOsTypeError
from askui.tools.android.agent_os import AndroidAgentOs


class ComputerBaseTool(ToolWithAgentOS):
"""Tool base class that has an AgentOs available."""
"""Tool base class that has a ComputerAgentOS available."""

def __init__(
self,
agent_os: AgentOs | None = None,
agent_os: ComputerAgentOS | None = None,
required_tags: list[str] | None = None,
**kwargs: Any,
) -> None:
@@ -23,33 +23,34 @@ def __init__(
)

@property
def agent_os(self) -> AgentOs:
def agent_os(self) -> ComputerAgentOS:
"""Get the agent OS.

Returns:
AgentOs: The agent OS instance.
ComputerAgentOS: The agent OS instance.
"""
agent_os = super().agent_os
if not isinstance(agent_os, AgentOs):
if not isinstance(agent_os, ComputerAgentOS):
raise AgentOsTypeError(
expected_type=AgentOs,
expected_type=ComputerAgentOS,
actual_type=type(agent_os),
)
return agent_os

@agent_os.setter
def agent_os(self, agent_os: AgentOs | AndroidAgentOs) -> None:
def agent_os(self, agent_os: ComputerAgentOS | AndroidAgentOs) -> None:
"""Set the agent OS facade.

Args:
agent_os (AgentOs | AndroidAgentOs): The agent OS facade instance to set.
agent_os (ComputerAgentOS | AndroidAgentOs): The agent OS facade
instance to set.

Raises:
TypeError: If the agent OS is not an AgentOs instance.
TypeError: If the agent OS is not a ComputerAgentOS instance.
"""
if not isinstance(agent_os, AgentOs):
if not isinstance(agent_os, ComputerAgentOS):
raise AgentOsTypeError(
expected_type=AgentOs,
expected_type=ComputerAgentOS,
actual_type=type(agent_os),
)
self._agent_os = agent_os