Workflow prompt splitting can corrupt UTF-8 text before Codex app-server turn start #80

WSL0809 started this conversation in General

Summary

WORKFLOW.md prompt loading can corrupt valid UTF-8 text when splitting front matter from the prompt body. This can make the rendered prompt invalid UTF-8 and cause Jason.encode!/2 to fail before Symphony sends turn/start to the Codex app-server.

A ready fix branch is available here:

I attempted to open a PR directly against openai/symphony, but GitHub returned permission errors from both gh pr create and the GitHub connector (CreatePullRequest / 403). I also tried to open an issue, but Issues are disabled for this repository. The branch above is ready for maintainers to pull or use to create the PR.

Root Cause

Workflow.split_front_matter/1 currently uses:

String.split(content, ~r/\R/, trim: false)

Without the Unicode regex flag, the pattern is matched byte by byte, so \R can treat the raw byte 0x85 as NEL (next line), one of the line breaks it recognizes. Some valid UTF-8 Chinese characters contain 0x85 as a continuation byte of a multibyte sequence. For example, 者 (U+8005) is encoded as E8 80 85.

In a workflow prompt containing text like:

> 维护者:Symphony Codex 自动维护。

that split can break the byte sequence into invalid UTF-8. The rendered prompt then fails at Jason.encode!/2 inside Symphony before it reaches the Codex app-server.
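
The failure described above can be sketched in a few lines of Elixir. This is a minimal illustration, not the code from Workflow.split_front_matter/1: the `prompt` value is the sample text from this report used as a stand-in for a workflow body, and the variable names are made up for the example. It uses String.valid?/1 as a proxy for what Jason.encode!/2 would reject, so it needs no dependencies.

```elixir
# "者" is U+8005, which UTF-8 encodes as the three bytes E8 80 85.
# This match raises if the encoding were anything else.
<<0xE8, 0x80, 0x85>> = "者"

# Sample prompt text from the report (stand-in for a WORKFLOW.md body).
prompt = "维护者:Symphony Codex 自动维护。"

# Byte-mode regex (no `u` flag): \R is allowed to match a lone 0x85 byte,
# which is the last byte of "者".
byte_mode_parts = String.split(prompt, ~r/\R/, trim: false)

# Report how many pieces came back and whether each is still valid UTF-8.
IO.inspect(length(byte_mode_parts), label: "parts")
IO.inspect(Enum.map(byte_mode_parts, &String.valid?/1), label: "valid?")

# Rejoining with "\n" (as a line-based parser might) does not restore the
# dropped 0x85 byte, so the rebuilt prompt differs from the input.
rebuilt = Enum.join(byte_mode_parts, "\n")
IO.inspect(rebuilt == prompt, label: "round-trips?")
```

If the split behaves as reported, the piece that ends just before the removed 0x85 byte is left with a dangling E8 80 prefix, which is why the prompt stops being valid UTF-8.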

Fix

The prepared branch changes workflow file line splitting to only split on explicit file line endings:

String.split(content, ~r/\r\n|\n|\r/u, trim: false)

This preserves UTF-8 prompt text and still handles CRLF, LF, and CR line endings for front matter parsing.
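
As a quick check of that claim, the sketch below runs the same pattern over a small string that mixes all three line endings. The `content` value is a hypothetical front matter fragment invented for the example, not taken from the repository.

```elixir
# Same explicit line-ending pattern as the fix. \r\n comes first in the
# alternation so a CRLF pair is consumed as a single separator.
line_endings = ~r/\r\n|\n|\r/u

# Hypothetical workflow content mixing CRLF, LF, and CR endings,
# followed by text containing the 0x85 continuation byte (in "者").
content = "---\r\nname: demo\n---\r维护者"

lines = String.split(content, line_endings, trim: false)
IO.inspect(lines)
# → ["---", "name: demo", "---", "维护者"]

# Every piece is still valid UTF-8 after the split.
IO.inspect(Enum.all?(lines, &String.valid?/1), label: "all valid?")
```

Because the pattern names only CR and LF, no byte inside a multibyte sequence can be mistaken for a separator.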

Validation

Validated on the prepared branch:

  • mix format --check-formatted lib/symphony_elixir/workflow.ex test/symphony_elixir/core_test.exs
  • mix test test/symphony_elixir/core_test.exs
  • mix run -e '<UTF-8 workflow prompt Jason.encode! repro>' showed prompt_valid=true and jason_encode=:ok
  • mix test passed: 235 tests, 0 failures, 2 skipped

Note: the first full-suite run hit an existing timing-window flaky assertion in test/symphony_elixir/core_test.exs:575; rerunning that individual test passed, and the subsequent full-suite run passed.
