Canvas&A2UI support #76

geffzhang started this conversation in Ideas
Apr 26, 2026 · 2 comments · 1 reply

A core innovation of OpenClaw is that it moves beyond the text-only interaction of traditional AI assistants. Through Canvas and the A2UI (Agent-to-UI) protocol, AI agents can render and manipulate visual interfaces directly on your device's screen. This marks the evolution of AI from a "conversationalist" into an "executor" and "presenter."
I. Canvas: AI's Dedicated Visual Workspace
Canvas is a visual panel embedded within the OpenClaw client (iOS, Android, macOS). You can think of it as a dedicated "screen" for AI agents.

Nature and Positioning: Canvas is a lightweight visual workspace rendered with WKWebView (on macOS) and supports full HTML, CSS, and JavaScript. It is designed as an "Agent-driven visual workspace," so the AI can show results as well as describe them.
Core Functions: Through Canvas, agents can perform various visual operations:

Display Web Pages: Navigate and open any webpage within Canvas's WebView (navigate).
Execute Scripts: Inject and execute JavaScript code (eval) within the context of a loaded page to achieve dynamic interactions, such as modifying styles or auto-filling forms.
Capture State: Capture the current content displayed on Canvas (snapshot) at any time for user reporting or as a visual reference for the next operation.
Control Visibility: Expand (present) or hide (hide) the Canvas panel itself.
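
The five operations above can be modeled as simple request objects. The following is a minimal sketch, not OpenClaw's actual API: the action names (navigate, eval, snapshot, present, hide) come from this post, while the `CanvasAction` class, the `plan_actions` helper, and the parameter names `url` and `script` are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Action names come from the post. The required-parameter names ("url",
# "script") are assumptions for this sketch, not a documented schema.
REQUIRED_PARAMS = {
    "navigate": {"url"},
    "eval": {"script"},
    "snapshot": set(),
    "present": set(),
    "hide": set(),
}


@dataclass
class CanvasAction:
    """Hypothetical request object describing one Canvas operation."""

    name: str
    params: dict = field(default_factory=dict)

    def validate(self) -> None:
        """Raise ValueError for unknown actions or missing parameters."""
        if self.name not in REQUIRED_PARAMS:
            raise ValueError(f"unknown canvas action: {self.name}")
        missing = REQUIRED_PARAMS[self.name] - set(self.params)
        if missing:
            raise ValueError(f"{self.name} is missing: {sorted(missing)}")


def plan_actions(url: str, script: str) -> list[CanvasAction]:
    """Build one plausible sequence: show the panel, load, script, capture."""
    actions = [
        CanvasAction("present"),
        CanvasAction("navigate", {"url": url}),
        CanvasAction("eval", {"script": script}),
        CanvasAction("snapshot"),
    ]
    for action in actions:
        action.validate()
    return actions
```

Validating each request before dispatch means a malformed call (for example, a navigate without a URL) fails early instead of reaching the WebView.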

Access and Storage: Canvas content is stored in a session-specific directory on the local device (e.g., ~/Library/Application Support/OpenClaw/canvas// on macOS) and is accessed through a custom URL scheme (openclaw-canvas://), so content stays on the device and loads locally.

II. A2UI: Making AI Your Interface Designer
A2UI (Agent-to-UI) is a structured protocol that lets AI agents generate and push complete interactive user interfaces for direct rendering on Canvas. It is the most capable feature of the Canvas tool.

Working Principle: Unlike traditional web development, in A2UI mode the AI writes or assembles UI components in real time from the conversation context and pushes them through the protocol to the user's Canvas for rendering. The results of the user's interaction with the interface are then fed back to the AI, forming a closed loop. The AI's output is not HTML but structured JSONL frames (one JSON object per line), each describing a UI component (such as text, a button, or a card).
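
To make the "one JSON object per line" framing concrete, here is a minimal sketch of encoding and decoding JSONL frames. The field names used in the sample frames (`type`, `id`, `text`, `label`) are illustrative assumptions and are not the actual A2UI v0.8 schema; `encode_frames` and `decode_frames` are hypothetical helper names.

```python
import json


def encode_frames(frames: list[dict]) -> str:
    """Serialize frames as JSONL: one compact JSON object per line."""
    # json.dumps escapes newlines inside strings, so each frame stays on one line.
    return "\n".join(json.dumps(f, separators=(",", ":")) for f in frames)


def decode_frames(jsonl: str) -> list[dict]:
    """Parse JSONL back into frames, skipping blank lines."""
    frames = []
    for line_no, line in enumerate(jsonl.splitlines(), start=1):
        if not line.strip():
            continue
        frame = json.loads(line)
        if not isinstance(frame, dict):
            raise ValueError(f"line {line_no}: frame must be a JSON object")
        frames.append(frame)
    return frames


# Illustrative frames only; the real component schema is not shown in this post.
frames = [
    {"type": "text", "id": "title", "text": "This week's tasks"},
    {"type": "button", "id": "confirm", "label": "Mark all done"},
]
payload = encode_frames(frames)
```

Because every frame is a complete object on its own line, a renderer can process frames as they arrive instead of waiting for the whole document.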
Core Commands: The A2UI host engine primarily handles four core commands, forming the basic primitives for AI to control the front-end:

Push: Pushes the generated UI frames to Canvas for rendering.
Reset: Clears all A2UI-rendered content on Canvas, restoring it to a blank state.
Eval: Lets the AI send a JavaScript snippet to be executed in a secure sandbox on Canvas, enabling dynamic logic (such as countdowns or animations).
Snapshot: Captures the current state of Canvas and sends it back to the AI, so the AI can "see" the result of the interface it pushed. This supports debugging and iteration based on visual feedback, closing the visual loop.
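
The four commands can be pictured as a small dispatcher. The sketch below is a toy in-memory stand-in, not OpenClaw's host engine: the class name `SketchA2UIHost` and its `handle` method are hypothetical, Eval only records the script instead of running it, and Snapshot returns a structural copy of the pushed components rather than a screen capture.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class SketchA2UIHost:
    """Toy model of the push / reset / eval / snapshot primitives."""

    components: list = field(default_factory=list)
    recorded_scripts: list = field(default_factory=list)

    def handle(self, command: str, payload=None):
        if command == "push":
            # Append the pushed frames; return how many are now rendered.
            self.components.extend(payload or [])
            return len(self.components)
        if command == "reset":
            # Restore the blank state.
            self.components.clear()
            self.recorded_scripts.clear()
            return 0
        if command == "eval":
            # A real host would run the script in a sandbox; we only record it.
            self.recorded_scripts.append(payload)
            return None
        if command == "snapshot":
            # Return a copy so the caller cannot mutate the host's state.
            return {"components": copy.deepcopy(self.components)}
        raise ValueError(f"unsupported command: {command}")


host = SketchA2UIHost()
host.handle("push", [{"type": "text", "id": "title", "text": "Hello"}])
seen = host.handle("snapshot")
```

The push-then-snapshot pair is what makes the loop closed: the agent can compare what it sees against what it intended before sending the next push.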

Format and Compatibility: OpenClaw's Canvas currently supports primarily the A2UI v0.8 JSONL format; v0.9 and its createSurface interface are not yet supported. In practice, users usually only need to describe what they want in natural language, and the agent generates the A2UI frames and calls the a2ui_push action.
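
Since v0.9 frames are not yet supported, a pre-flight check before pushing can catch them early. This is a rough sketch under one stated assumption: that a v0.9 frame can be recognized by a top-level `createSurface` key. The post names createSurface but does not show where it appears in a frame, and `find_unsupported_lines` is a hypothetical helper.

```python
import json

# Named in the post as a v0.9 interface. Treating it as a top-level frame key
# is an assumption made for this sketch.
UNSUPPORTED_KEYS = {"createSurface"}


def find_unsupported_lines(jsonl: str) -> list[int]:
    """Return 1-based line numbers of frames that use an unsupported key."""
    flagged = []
    for line_no, line in enumerate(jsonl.splitlines(), start=1):
        if not line.strip():
            continue
        frame = json.loads(line)
        if isinstance(frame, dict) and UNSUPPORTED_KEYS.intersection(frame):
            flagged.append(line_no)
    return flagged
```

An empty result means no frame tripped the check, which is not the same as the payload being valid v0.8.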

III. Application Scenarios and Value of Canvas & A2UI
Combining Canvas and A2UI, OpenClaw can achieve rich application scenarios:

Dynamic Information Display: AI can instantly generate data dashboards, charts, to-do lists, or weather cards and push them to your phone or computer screen, making information presentation more intuitive.
Interactive Tasks: Users can interact through AI-generated interfaces, such as filling out forms or clicking buttons to confirm actions. These interaction events can trigger subsequent AI tasks.
Visual Monitoring of Automated Processes: When combined with tools like browser automation, Canvas can be used to display the progress of automated execution or screenshots of key steps, giving users a clear perception of the automation process.
Lowering the Barrier to Entry: Users do not need to write any code. They can get a customized visual interface from a natural language instruction alone (e.g., "Help me generate a dashboard for this week's task progress and push it to my phone's Canvas"), which makes the AI assistant more practical and easier to use.

Replies: 2 comments 1 reply

This is great. Let me see the best way to include this.

1 reply
please see #78

geffzhang
May 18, 2026
Maintainer Author

A2UI v0.9 surface support #117
