Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

fix(forge test): ensure --rerun only runs failing tests from correct contracts #10753

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
dghelm wants to merge 12 commits into foundry-rs:master
base: master
Choose a base branch
Loading
from dghelm:fix/forge-test-rerun-qualified-matching

Conversation

Copy link
Contributor

@dghelm dghelm commented Jun 10, 2025

Motivation

Fixes issue #10693 where forge test --rerun incorrectly runs passing tests that have the same name as failing tests in different contracts.

Solution

Implemented qualified test matching with contract-aware filtering while maintaining full backward compatibility:

Key Changes

  1. Extended TestFilter trait with matches_qualified_test(contract_name, test_name) method for context-aware filtering
  2. Enhanced FilterArgs system with qualified_failures field to store specific contract/test combinations
  3. Smart failure persistence that intelligently chooses between qualified format (ContractName_testName) and legacy format (testName1|testName2)
    based on whether multiple contracts are involved
  4. Updated test runner logic in is_matching_test_in_context to use qualified matching when appropriate

Architecture

  • Intelligent format detection: Uses qualified format only when test names are ambiguous across multiple contracts
  • Graceful fallback: Maintains legacy behavior for single-contract scenarios
  • Conservative approach: Minimizes format changes to preserve existing workflows

Files Modified

  • crates/common/src/traits.rs - Extended TestFilter trait
  • crates/forge/src/cmd/test/filter.rs - Added qualified matching logic and fixed test name format consistency
  • crates/forge/src/cmd/test/mod.rs - Enhanced failure persistence and loading with intelligent format selection
  • crates/forge/src/multi_runner.rs - Updated test matching logic to use qualified context
  • crates/forge/tests/cli/test_cmd.rs - Added comprehensive regression test

Backward Compatibility

All existing functionality is preserved:

  • Single-contract scenarios continue using legacy format
  • Existing test failure files work unchanged
  • All CLI arguments and behavior remain the same

Side Effects and Considerations

Potential Side Effects

  1. Test failure file format changes: Files may now contain qualified names (ContractTest_testName) in multi-contract scenarios
  2. Filter matching behavior: Slightly different matching logic when qualified failures are present
  3. Performance impact: Minimal additional processing to detect ambiguous test names

Mitigation

  • Backward compatibility preserved: Legacy format still used for single-contract scenarios
  • Graceful fallback: System falls back to legacy behavior when qualified parsing fails
  • Conservative approach: Only uses qualified format when absolutely necessary

PR Checklist

  • Added Tests
    • Added should_replay_failures_only_correct_contract regression test
    • Verified existing should_replay_failures_only test continues to pass
  • Added Documentation
    • Comprehensive inline code comments explaining the qualified matching logic
    • Updated function documentation for new matches_qualified_test method
  • Breaking changes

This contribution was made to explore the viability of bug-fixing work via Claude Code assistance, as prompted by this Tweet. If it's a mess, please don't spend too much time helping me get it over the line.

Copy link
Collaborator

@grandizzy grandizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for your PR, left some comments

// This is the key test: --rerun should NOT run both contracts with same test name
// Before the fix, this would run 2 tests (both testSameName functions)
// After the fix, this should run only 1 test (only the specific failing one)
cmd.forge_fuse().args(["test", "--rerun"]).assert_failure();
Copy link
Collaborator

@grandizzy grandizzy Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's check the stderr and make sure there's only one test executed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified test to capture output and assert stdout contains "0 tests passed, 1 failed, 0 skipped (1 total tests)" to ensure exactly one test runs. Does this work?

// Parse qualified pattern format: ContractName_testName or legacy format
if content.contains('_') && !content.contains('|') {
// Single qualified failure
if let Some(pos) = content.rfind('_') {
Copy link
Collaborator

@grandizzy grandizzy Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct as tests could have the test_ name format, so in this case a ContractName_test_Name pattern will result in contract_part being ContractName_test and test_part - Name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.

Created split_qualified_test_name() function that looks for _test patterns to properly separate contract and test names, with fallback to original behavior.

#[arg(long = "no-match-coverage", visible_alias = "nmco", value_name = "REGEX")]
pub coverage_pattern_inverse: Option<regex::Regex>,

/// Qualified test failures from --rerun (contract, test) pairs.
Copy link
Collaborator

@grandizzy grandizzy Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should make this pub arg but rather set (or reuse) the rerun flag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove pub in next commit.

Not sure of a better option. The rerun flag is insufficient as a boolean when this is holding internal state that contains the actual list of (contract, test) pairs to rerun.

The flow is:
f--rerun flag → load failure file → populate qualified_failures → use in filtering

As I see it, the alternatives would be:

  1. Pass the data separately through multiple function parameters (messier)
  2. Global state (bad practice)
  3. Reload the file multiple times (inefficient)

@dghelm dghelm requested a review from grandizzy June 11, 2025 18:00
Copy link
Collaborator

@grandizzy grandizzy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @dghelm I think this can be simplified and we can get rid of bunch of custom parsing logic if we save failed tests in different format (and not directly the filter per se) like json

[
 {
 "test_suite": "CounterTest",
 "failures": ["test_counter1", "test_counter2"]
 },
 {
 "test_suite": "AnotherCounterTest",
 "failures": ["test_counter1", "test_counter2"]
 }
]

then we load it and compose the test filter accordingly

Copy link
Contributor Author

dghelm commented Jun 16, 2025

Thank you @dghelm I think this can be simplified and we can get rid of bunch of custom parsing logic if we save failed tests in different format (and not directly the filter per se) like json

[
 {
 "test_suite": "CounterTest",
 "failures": ["test_counter1", "test_counter2"]
 },
 {
 "test_suite": "AnotherCounterTest",
 "failures": ["test_counter1", "test_counter2"]
 }
]

then we load it and compose the test filter accordingly

This makes sense to me. I assume this would be a breaking change since these files may be used by external tooling. Would you like me to implement this approach? And if so, do you prefer the changes here or that I raise a new PR?

Copy link
Collaborator

grandizzy commented Jun 16, 2025
edited
Loading

This makes sense to me. I assume this would be a breaking change since these files may be used by external tooling. Would you like me to implement this approach? And if so, do you prefer the changes here or that I raise a new PR?

Yes, please go ahead and implement in this PR. Should not be a biggie as the file is already removed on forge clean and if we cannot load the old format then we can just ignore / consider it doesn't exist. thank you!

Copy link
Collaborator

@dghelm would you still have capacity to finalize the PR or should I take it over? thank you!

Copy link
Contributor Author

dghelm commented Sep 8, 2025

  • Switch rerun cache to JSON and filter by exact (contract, test) pairs.
  • Aggregate failures per contract across files; normalize and dedup test names.
  • Contract-aware filtering in runners; avoid signature allocations by using func.name.
  • Updated CLI tests to validate new JSON format and strict rerun behavior.

Hopefully this gets us closer -- but feel free to take it over at any time!

if out.is_empty() { None } else { Some(out) }
}

// Legacy regex-based rerun cache reader removed; rerun now uses JSON-qualified pairs.
Copy link
Member

@DaniPopes DaniPopes Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment is unnecessary


/// Returns `true` if the function is a test function that matches the given filter in the context
/// of a specific contract (needed for exact rerun filtering).
pub(crate) fn is_matching_test_in_context(
Copy link
Member

@DaniPopes DaniPopes Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a optional contract_name argument to <dyn TestFilter>::matches_test_function instead of this function

}

/// Load persisted failures as qualified (contract, test) pairs from JSON.
fn last_run_failures_qualified(config: &Config) -> Option<Vec<(String, String)>> {
Copy link
Member

@DaniPopes DaniPopes Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should return Result and inspect_err just like before to log the error

let mut names = Vec::new();
for (sig, result) in &suite.test_results {
if result.status.is_failure()
&& let Some(name) = sig.split('(').next()
Copy link
Member

@DaniPopes DaniPopes Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should keep the full signature instead of just the test name, tests are matched against their signature.

if let Some(pairs) = &self.qualified_failures {
return pairs.iter().any(|(c, t)| c == contract_name && t == bare);
}
self.matches_contract(contract_name) && self.matches_test(bare)
Copy link
Member

@DaniPopes DaniPopes Sep 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, do not match against just the test name but the full signature instead

@grandizzy grandizzy self-assigned this Oct 2, 2025
@jenpaff jenpaff moved this to Ready For Review in Foundry Oct 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

@grandizzy grandizzy grandizzy requested changes

@DaniPopes DaniPopes DaniPopes requested changes

@klkvr klkvr Awaiting requested review from klkvr

@mattsse mattsse Awaiting requested review from mattsse mattsse is a code owner

@yash-atreya yash-atreya Awaiting requested review from yash-atreya

@zerosnacks zerosnacks Awaiting requested review from zerosnacks zerosnacks is a code owner

@onbjerg onbjerg Awaiting requested review from onbjerg onbjerg is a code owner

@0xrusowsky 0xrusowsky Awaiting requested review from 0xrusowsky 0xrusowsky is a code owner

Requested changes must be addressed to merge this pull request.

Projects

Status: Ready For Review

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

bug(forge test): --rerun runs passing tests of the same name in different contracts

AltStyle によって変換されたページ (->オリジナル) /