Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Releases: ZJU-REAL/ClawGUI

clawgui-app-v0.3.1

21 May 17:06
@github-actions github-actions

Choose a tag to compare

Fixes

  • Feishu GUI runner now drives step-by-step. The chat bubble updates with the current action on every step instead of staying frozen on "正在执行任务...". Each step has a 5-minute timeout so a wedged LLM call can't hang the loop forever.
  • Clean error surfacing for Feishu replies. SDK errors are decoded to code: message instead of the previous Error@<hash>. Pre-flight failures (missing Vision creds, no device-control auth) now reply to Feishu with a human-readable reason instead of silently bailing. Image-reply failures land in Settings → 外部通道 → 调试日志 so users can see them without logcat.
  • @_user_N mention stripping. Was hardcoded to @_user_1 ; now regex-matches @_(user|chat)_N so @bot 发朋友圈 reaches PhoneAgent as just 发朋友圈.
  • Live screenshot fallback when image-reply is enabled but trace recording is off — users still get a final-state image instead of silent nothing.
  • IME enabled check via the Android Framework API instead of ime list -s. Users who'd already enabled "ClawGUI Input" via system settings would see "未启用" in our panel because the shell command needed Shizuku/wadb auth to succeed. Settings → 输入法 also dropped the manual switcher button — agent handles the switching itself.

Chore

  • Repo cleanup: retired the v1 clawgui-app/ (Java client) and promoted clawgui-app-ng/ into the natural clawgui-app/ path. Android applicationId stays com.clawgui.ng so existing v0.3.0 installs upgrade in place.
Assets 3
Loading

clawgui-app-v0.3.0

19 May 15:37
@github-actions github-actions

Choose a tag to compare

Structured Plan + execution trace

  • PhoneAgent now maintains a user-visible task plan through a 6-state machine — PENDING / IN_PROGRESS / DONE / SKIPPED / FAILED / BLOCKED.
  • The model emits incremental <plan>{"ops":[...]}</plan> blocks each step (init / update / insert_after / remove).
  • A PlanCard renders inside the assistant bubble; status transitions fade + scale, the IN_PROGRESS row pulses.
  • New ActionTraceList: every step shown as a chip (index / name / args / status badge), with a spinner placeholder while waiting on the next LLM round-trip.

Floating Plan + Trace overlay

  • While a task runs, a semi-transparent draggable card floats above the host app so the user can keep an eye on progress without flipping back to ClawGUI.
  • 3 collapse stages: 20dp blue dot → 1-line chip → full panel. Tap cycles, long-press drags.
  • Auto-hides on finish / cancel / error. ×ばつ dismisses for the current run.
  • Settings → Floating Panel: enable toggle, alpha slider (40–100%), permission prompt.

Mid-task Ask

  • The agent now proactively asks the user when uncertain: missing parameters, multiple candidates, sensitive operations, no clear next step, fuzzy task scope.
  • The input field is embedded directly in the floating panel — no need to switch back to ClawGUI to answer. IME pops automatically.
  • The answer is injected as a system-role context turn; PhoneAgent resumes immediately.

Image attachments + multimodal

  • + button in the input bar opens a sheet: pick from gallery or take a photo. Up to 3 attachments.
  • Thumbnails render inside the user bubble; tap for a full-screen Lightbox (pinch-zoom + pan).
  • Images are treated as task inputs (not just reference): auto-exported to Pictures/ClawGUI/ so any host app's picker can find them.
  • Brain falls back to the Vision provider when it can't see images. ImageInsight runs a generic content-summary + extractable-text pre-pass.

Polish

  • Follow-up suggestions: 3 one-tap chips under the latest assistant reply that drop a draft into the input box.
  • Thinking separation: Brain reasoning_content and inline <think> tags are split off into a collapsible panel. PhoneAgent bubbles show per-step thinking when you peek a trace row.
  • Appearance: Light / Dark / System in settings now actually drives the theme (the UI was there but unwired).
  • Return-to-ClawGUI: task finish brings the chat back to the foreground via NEW_TASK + REORDER_TO_FRONT + a PendingIntent fallback that works on Honor / MIUI.

Robustness

  • AutoGLM parser hardening: tolerates stray <plan> blocks, natural-language preambles, markdown fences, full-width punctuation (【】, ,), single/double quotes, whitespace variants (do ( / do(action='), JSON-shape fallbacks, and nested quotes.
  • Step-1 action guard: if the model violates the "step 1 may only Launch / Home / Ask / finish" rule, the runtime intercepts and injects a system feedback turn — no more poking ClawGUI's own UI.
  • Plan protocol fault-tolerance: any malformed op is dropped silently; the main task still moves forward on <answer>.
  • Parse-failure retry: first parse error gets one retry with a system correction; only the second consecutive failure bails.
  • Overlay lifecycle fix: LifecycleRegistry is single-shot after ON_DESTROY, which was why the floating panel never came back on the second task — each show() now allocates a fresh owner.

Prompt overhaul

  • System prompt compressed from 200 → ~80 lines, format contract pulled to the top so the model's attention lands on it.
  • Step 1 is pure planning (no screenshot; the prompt no longer hints there's a screen), restricted to Launch / Home / Ask / finish.
  • Explicit "pick the app first" framing with concrete mappings (WeChat / Moments / Xiaohongshu / Weibo / Douyin / Meituan / ...).
  • Tells the model to ignore ClawGUI's own floating panel — it's UI for the user, not part of the task.
  • Ask reframed as "ask when uncertain, not only when something breaks", with 5 trigger scenarios + 4 counter-examples + a 3-per-task cap.
Loading

clawgui-app-v0.2.0

18 May 14:14
@github-actions github-actions

Choose a tag to compare

What's Changed

  • feat(clawgui-eval): add Kimi K2.5 evaluation support by @ZichenWen1 in #1
  • fix: handle reasoning_content in streaming for reasoning models by @ekkoitac in #8
  • fix(agent): fix issues in webui.py by @jucve in #9
  • feat: add clawgui-app on-device deployment module by @gta886 in #14
  • docs cleanup, remove diagnostic export, refine Roadmap wording by @lhppppp in #15
  • fix(clawgui-app): restore source files swallowed by root .gitignore by @lhppppp in #16
  • Remove some configs, as it causes the system to freeze. by @HangFang6 in #17

New Contributors

Full Changelog: https://github.com/ZJU-REAL/ClawGUI/commits/clawgui-app-v0.2.0

Contributors

gta886, HangFang6, and 4 other contributors
Loading
sugarandgugu reacted with thumbs up emoji
1 person reacted

AltStyle によって変換されたページ (->オリジナル) /