Related Issue
Closes #1964
Summary
This PR fixes an issue where healthy but slow gbrain engines could be incorrectly classified as `broken-config` when the probe operation exceeded the default timeout threshold.
In environments where the gbrain CLI takes longer than 5 seconds to respond (for example, when using remote databases or connection poolers), the probe would time out and fall back to the generic `broken-config` classification. As a result, `/sync-gbrain` skipped the code and memory stages even though the engine itself was healthy and operational.
Changes Made
1. Added a dedicated `slow-engine` status
Introduced a new `LocalEngineStatus` value: `"slow-engine"`
This allows timeout-related failures to be distinguished from actual configuration problems.
2. Improved timeout handling
The classifier now explicitly detects timeout conditions (such as `ETIMEDOUT`) and returns `slow-engine` instead of incorrectly falling back to `broken-config`.
3. Made probe timeout configurable
Replaced the hardcoded timeout value with an environment-variable-based configuration: `GSTACK_GBRAIN_PROBE_TIMEOUT_MS`
This allows users with slower database connections or remote infrastructure to adjust the probe timeout without modifying source code.
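As an illustration of how such an override might be resolved, here is a minimal sketch. Only the variable name `GSTACK_GBRAIN_PROBE_TIMEOUT_MS` comes from this PR; the helper name `resolveProbeTimeoutMs`, its signature, and the fallback rules for invalid values are assumptions, not the code in the diff.

```typescript
// Hypothetical helper: resolve the probe timeout from an environment-like map.
// Falls back to `fallbackMs` when the variable is unset, blank, non-numeric,
// or does not floor to a positive whole number of milliseconds.
function resolveProbeTimeoutMs(
  env: Record<string, string | undefined>,
  fallbackMs: number,
): number {
  const raw = env["GSTACK_GBRAIN_PROBE_TIMEOUT_MS"];
  if (raw === undefined || raw.trim() === "") {
    return fallbackMs;
  }
  // Number() yields NaN for non-numeric input; Math.floor keeps NaN as NaN.
  const parsed = Math.floor(Number(raw));
  if (!Number.isFinite(parsed) || parsed <= 0) {
    return fallbackMs;
  }
  return parsed;
}
```

A caller would typically pass `process.env` and the built-in default. Falling back silently on bad input is one design choice; logging a warning for an unparseable value would be a reasonable alternative.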
4. Increased the default timeout
Updated the default probe timeout from:
to:
This better accommodates slower but healthy environments while still preventing indefinite waits.
5. Added a user-friendly sync message
When a slow engine is detected, users now receive a clear explanation indicating that the engine may still be healthy and suggesting increasing the timeout or improving database connectivity, rather than being told their configuration is malformed.
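The classification and messaging described in items 2 and 5 could look roughly like the sketch below. The status strings and the `ETIMEDOUT` code are taken from this PR; the `ProbeError` shape, both function names, and the message wording are hypothetical. The `broken-db` path and any other statuses are omitted for brevity.

```typescript
// Two of the status values named in this PR; the real LocalEngineStatus
// union may contain more members (for example "broken-db").
type ProbeFailureStatus = "slow-engine" | "broken-config";

// Hypothetical shape of a failed probe; the field name is an assumption.
interface ProbeError {
  code?: string; // e.g. "ETIMEDOUT" when the probe ran out of time
}

// Timeouts map to "slow-engine"; every other failure keeps the generic
// "broken-config" fallback described in the summary.
function classifyProbeFailure(err: ProbeError): ProbeFailureStatus {
  return err.code === "ETIMEDOUT" ? "slow-engine" : "broken-config";
}

// Hypothetical wording for the sync message shown on a slow engine.
function slowEngineMessage(timeoutMs: number): string {
  return (
    `gbrain probe did not respond within ${timeoutMs} ms. ` +
    "The engine may still be healthy; try raising " +
    "GSTACK_GBRAIN_PROBE_TIMEOUT_MS or improving database connectivity."
  );
}
```

Keeping the timeout check ahead of the generic fallback is what prevents a slow but working engine from being reported as misconfigured.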
Why This Change?
The original behavior treated timeout failures as configuration failures, which could mislead users into troubleshooting the wrong component. This PR separates slow-response scenarios from actual configuration issues, making diagnostics more accurate and improving the overall user experience.
Testing
- … `slow-engine`.
- … `broken-config` and `broken-db` classifications remain unchanged.
- … `GSTACK_GBRAIN_PROBE_TIMEOUT_MS`.
Checklist