feat: add prototype speech-to-text support for AI agent input (experimental) #697

Open
Pradeep-Gopi-E wants to merge 3 commits into browser-use:main
from Pradeep-Gopi-E:SpeechAssist

@Pradeep-Gopi-E Pradeep-Gopi-E commented Nov 3, 2025
edited by cubic-dev-ai bot

Summary by cubic

Adds speech-to-text to the Run Agent tab with a mic button, and moves the JavaScript into interface.py for cleaner separation and easier maintenance.

  • New Features

    • Mic button next to the task textbox triggers Web Speech API recognition.
    • Fills the textbox with the transcript and dispatches an input event to update Gradio state.
    • Shows listening/error states; defaults to en-US.
  • Refactors

    • create_browser_use_agent_tab now accepts speech_js and binds it to the mic button.
    • Speech JS centralized in interface.py (js_speech_function) and passed into the tab.
    • UI tidy-up: Row layout for textbox + button, larger textbox, updated placeholders, button stored in manager.
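The wiring described in this summary can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the names `SPEECH_JS`, `build_task_row`, and the `task-input` element id are hypothetical, and it assumes Gradio 4 or later, where event listeners accept a `js` argument that runs in the browser.

```python
# Hypothetical sketch of a mic button bound to Web Speech API JavaScript.
# SPEECH_JS, build_task_row, and "task-input" are illustrative names only.

SPEECH_JS = """
() => {
  const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  if (!Recognition) {
    alert("Speech recognition is not supported in this browser.");
    return;
  }
  // Gradio renders a Textbox as a wrapper element containing a <textarea>.
  const box = document.querySelector("#task-input textarea");
  if (!box) {
    return;
  }
  const recognition = new Recognition();
  recognition.lang = "en-US";
  recognition.interimResults = false;
  recognition.onresult = (event) => {
    box.value = event.results[0][0].transcript;
    // Without an input event, Gradio's state would not see the new value.
    box.dispatchEvent(new Event("input", { bubbles: true }));
  };
  recognition.onerror = (event) => {
    console.error("Speech recognition error:", event.error);
  };
  recognition.start();
}
"""


def build_task_row(speech_js: str = SPEECH_JS):
    """Build a textbox + mic button row; the click handler runs only in the browser."""
    import gradio as gr  # imported lazily so the JS string can be inspected without Gradio

    with gr.Blocks() as demo:
        with gr.Row():
            task_box = gr.Textbox(
                label="Task",
                placeholder="Type a task or press the mic button",
                lines=4,
                elem_id="task-input",
            )
            mic_button = gr.Button("🎤", scale=0)
        # fn=None means no Python callback; only the JavaScript executes.
        mic_button.click(fn=None, inputs=None, outputs=None, js=speech_js)
    return demo
```

Calling `build_task_row().launch()` would serve the row locally. Speech recognition itself depends on browser support for the Web Speech API, so the sketch falls back to an alert where it is unavailable.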

Written for commit 53f5a8b. Summary will update automatically on new commits.


CLAassistant commented Nov 3, 2025

CLA assistant check
All committers have signed the CLA.

@Pradeep-Gopi-E Pradeep-Gopi-E changed the title from "refactor: Clean up use agent tab by moving JS to interface" to "feat: add prototype speech-to-text support for AI agent input (experimental)" Nov 3, 2025
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 2 files

Prompt for AI agents (1 issue)

Understand the root cause of the following issue and fix it.
<file name="src/webui/components/browser_use_agent_tab.py">
<violation number="1" location="src/webui/components/browser_use_agent_tab.py:1085">
These event handlers are now registered twice, so each action (submit, stop, pause, clear) will fire twice, leading to duplicate task runs and inconsistent UI state. Please remove the duplicate registrations.</violation>
</file>
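
Why a duplicated registration produces duplicate runs can be shown with a toy model. The `ToyButton` class below is a hypothetical stand-in, not Gradio's API; it only assumes that each registration call appends another listener and that nothing deduplicates them.

```python
from typing import Callable, List


class ToyButton:
    """Hypothetical stand-in for a UI button; not a Gradio class."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[], None]] = []

    def click(self, fn: Callable[[], None]) -> None:
        # Each call appends a listener; nothing deduplicates them.
        self._listeners.append(fn)

    def press(self) -> None:
        # One user action invokes every registered listener in order.
        for fn in self._listeners:
            fn()


def count_runs(registrations: int) -> int:
    """Register the same handler `registrations` times, press once, count calls."""
    runs: List[str] = []
    button = ToyButton()

    def start_task() -> None:
        runs.append("task started")

    for _ in range(registrations):
        button.click(start_task)
    button.press()
    return len(runs)


print(count_runs(2))  # duplicate registration: the handler fires twice per press
print(count_runs(1))  # single registration: the handler fires once per press
```

Under this model, removing the second registration is enough to restore one task run per click, which matches the fix requested above.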


Removed redundant event handler connections for buttons.
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 1 file

Reviewers

@cubic-dev-ai cubic-dev-ai[bot] left review comments
