Commit 0eff9b1

Eval markdown

1 parent 1929ba1 commit 0eff9b1

File tree

1 file changed, +9 -2 lines changed

docs
- evaluation.md

`docs/evaluation.md`

Lines changed: 9 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -2,6 +2,13 @@`
`2`	`2`
`3`	`3`	`Follow these steps to evaluate the quality of the answers generated by the RAG flow.`
`4`	`4`
	`5`	`+* [Deploy a GPT-4 model](#deploy-a-gpt-4-model)`
	`6`	`+* [Setup the evaluation environment](#setup-the-evaluation-environment)`
	`7`	`+* [Generate ground truth data](#generate-ground-truth-data)`
	`8`	`+* [Run bulk evaluation](#run-bulk-evaluation)`
	`9`	`+* [Review the evaluation results](#review-the-evaluation-results)`
	`10`	`+* [Run bulk evaluation on a PR](#run-bulk-evaluation-on-a-pr)`
	`11`	`+`
`5`	`12`	`## Deploy a GPT-4 model`
`6`	`13`
`7`	`14`
`@@ -45,7 +52,7 @@ python evals/generate_ground_truth_data.py`
`45`	`52`
`46`	`53`	`Review the generated data after running that script, removing any question/answer pairs that don't seem like realistic user input.`
`47`	`54`
`48`		`-## Evaluate the RAG answer quality`
	`55`	`+## Run bulk evaluation`
`49`	`56`
`50`	`57`	Review the configuration in `evals/eval_config.json` to ensure that everything is correctly setup. You may want to adjust the metrics used. See [the ai-rag-chat-evaluator README](https://github.com/Azure-Samples/ai-rag-chat-evaluator) for more information on the available metrics.
`51`	`58`
`@@ -72,6 +79,6 @@ Compare answers across runs by running the following command:`
`72`	`79`	`python -m evaltools diff evals/results/baseline/`
`73`	`80`	```
`74`	`81`
`75`		`-## Run the evaluation on a PR`
	`82`	`+## Run bulk evaluation on a PR`
`76`	`83`
`77`	`84`	To run the evaluation on the changes in a PR, you can add a `/evaluate` comment to the PR. This will trigger the evaluation workflow to run the evaluation on the PR changes and will post the results to the PR.
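
The "Run bulk evaluation" section in this diff asks the reader to review `evals/eval_config.json` and possibly adjust the metrics used. A minimal sketch of that check in Python follows. The key name `requested_metrics` and the helper `load_requested_metrics` are assumptions made for illustration; the diff does not show the layout of the config file, so adjust the key to match the real one.

```python
import json
from pathlib import Path


def load_requested_metrics(config_path, metrics_key="requested_metrics"):
    """Return the list of metric names stored under `metrics_key` in a JSON config.

    `metrics_key` is an assumed key name, not confirmed by the diff above.
    A missing key yields an empty list so the caller can decide how to react.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    metrics = config.get(metrics_key, [])
    if not isinstance(metrics, list):
        raise ValueError(
            f"{metrics_key!r} should be a list, got {type(metrics).__name__}"
        )
    return metrics
```

Calling `load_requested_metrics("evals/eval_config.json")` before a bulk run gives a quick view of which metrics would be computed, assuming the file uses that key.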
