fix: fix expectations for export accuracy tests #466

Merged

nirinchev merged 1 commit into main from ni/export-accuracy

Aug 21, 2025

nirinchev (Collaborator) commented Aug 20, 2025

Proposed changes

Looks like we had some tests failing because the model was providing the export title but we weren't expecting it.

Checklist

I have signed the MongoDB CLA

Commit 1f5c8ee (nirinchev): fix: fix expectations for export accuracy tests

Copilot AI review requested due to automatic review settings (August 20, 2025 22:22)

nirinchev requested a review from a team as a code owner (August 20, 2025 22:22)

Copilot AI (Contributor) reviewed Aug 20, 2025 and left a comment:

Pull Request Overview

This PR fixes failing export accuracy tests by updating test expectations to include the exportTitle parameter that the model is now providing. The tests were failing because the model was generating an export title, but the test expectations weren't accounting for this additional parameter.

Key changes:

- Added `exportTitle: Matcher.string()` to all export tool test expectations
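
To illustrate why the old expectations failed, here is a minimal sketch. It assumes the accuracy tests compare tool-call parameters strictly, so that an unexpected parameter counts as a mismatch. The `Matcher` object, `matchesParameters`, and the database and collection names below are hypothetical stand-ins written for this example; the real helpers in the repository may look different.

```typescript
// Hypothetical stand-in for the accuracy-test matcher API; the real
// `Matcher` in the repository may have a different shape.
type Check = (actual: unknown) => boolean;

const Matcher = {
  // Accepts any string value, whatever its content.
  string: (): Check => (actual) => typeof actual === "string",
  // Accepts only a value strictly equal to `expected`.
  value: (expected: unknown): Check => (actual) => actual === expected,
};

// Assumed strict comparison: every expected key must match, and the actual
// call must not carry parameters that the expectation does not list.
function matchesParameters(
  expected: Record<string, Check>,
  actual: Record<string, unknown>,
): boolean {
  const hasNoExtraKeys = Object.keys(actual).every((key) => key in expected);
  const allExpectedMatch = Object.keys(expected).every(
    (key) => key in actual && expected[key](actual[key]),
  );
  return hasNoExtraKeys && allExpectedMatch;
}

// Illustrative tool call: the model supplies a free-form export title.
const modelCall = {
  database: "mflix",
  collection: "movies",
  exportTitle: "Export of all movies",
};

// Expectation before the fix: no entry for exportTitle.
const oldExpectation = {
  database: Matcher.value("mflix"),
  collection: Matcher.value("movies"),
};

// Expectation after the fix: any string is accepted as the title.
const newExpectation = { ...oldExpectation, exportTitle: Matcher.string() };

const oldResult = matchesParameters(oldExpectation, modelCall);
const newResult = matchesParameters(newExpectation, modelCall);
console.log(oldResult, newResult); // prints: false true
```

Matching on "any string" rather than a fixed title keeps the expectation stable, since the model is free to word the title differently on each run.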

nirinchev added the accuracy-tests label (Aug 20, 2025)

github-actions bot (Contributor) commented Aug 20, 2025

📊 Accuracy Test Results

📈 Summary

| Metric | Value |
| --- | --- |
| Commit SHA | `1c4165d16422a606bec361b87d4c4b77d485f750` |
| Run ID | `e5a0c705-e69c-48a4-adc8-f7a6f6e9ce89` |
| Status | done |
| Total Prompts Evaluated | 59 |
| Models Tested | 1 |
| Average Accuracy | 97.0% |
| Responses with 0% Accuracy | 1 |
| Responses with 75% Accuracy | 3 |
| Responses with 100% Accuracy | 55 |
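
As a sanity check, the reported average is consistent with the per-bucket counts if it is computed as a plain mean over all responses. That formula is an assumption; the report does not state how the average is derived. A minimal sketch:

```typescript
// Accuracy buckets taken from the summary table: [accuracy %, responses].
const buckets: Array<[number, number]> = [
  [0, 1],
  [75, 3],
  [100, 55],
];

// Total number of responses across all buckets.
const totalResponses = buckets.reduce((sum, [, count]) => sum + count, 0);

// Sum of per-response accuracies, then the mean (assumed formula).
const weightedSum = buckets.reduce(
  (sum, [accuracy, count]) => sum + accuracy * count,
  0,
);
const averageAccuracy = weightedSum / totalResponses;

console.log(totalResponses, averageAccuracy.toFixed(1)); // prints: 59 97.0
```

The bucket counts also add up to the 59 prompts evaluated, which matches a single tested model producing one response per prompt.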

📊 Baseline Comparison

| Metric | Value |
| --- | --- |
| Baseline Commit | `c4ba2c912f43e1b447b96bc9f9f80fa9a154de5f` |
| Baseline Run ID | `a69e95c1-915c-4d46-ae29-2722e1f5adf5` |
| Baseline Run Status | `done` |
| Responses Improved | 2 |
| Responses Regressed | 0 |