Pareta-AI/example-datasets


Pareta example datasets

Ready-to-run example eval sets for the task families in the Pareta model marketplace. Each folder contains an items.jsonl file (plus a documents/ folder of real source documents for document tasks) that you can browse, download, or load in-app via "Try the example set".

These are the same bundled examples that ship with the product. They are built from public benchmarks (synthetic / CC0 / licensed eval corpora), not customer data.
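As a rough illustration of the folder layout described above, the following Python sketch reads one task folder after the repository has been downloaded. It assumes that each line of items.jsonl is a standalone JSON object and that document tasks keep their source files in a documents/ subfolder. The helper names `load_items` and `list_documents`, the local path `example-datasets`, and the use of the task name as the folder name are illustrative assumptions, not part of the Pareta tooling. The per-item fields are not documented here, so the sketch only prints the keys it finds.

```python
import json
from pathlib import Path


def load_items(task_dir):
    """Read every JSON value from <task_dir>/items.jsonl, skipping blank lines."""
    items = []
    with open(Path(task_dir) / "items.jsonl", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                items.append(json.loads(line))
    return items


def list_documents(task_dir):
    """Return the sorted files under <task_dir>/documents/, or [] if the folder is absent."""
    docs_dir = Path(task_dir) / "documents"
    if not docs_dir.is_dir():
        return []
    return sorted(p for p in docs_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    # Hypothetical local checkout path and task folder name; adjust to your setup.
    task_dir = Path("example-datasets") / "doc-qa-extractive"
    if task_dir.is_dir():
        items = load_items(task_dir)
        print(len(items), "items;", len(list_documents(task_dir)), "documents")
        # The item schema is an unknown here, so inspect the keys before relying on any field.
        if items and isinstance(items[0], dict):
            print(sorted(items[0].keys()))
```

Inspecting the keys first avoids hard-coding field names that may differ between task families.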

| Task | Metric | Source | Items | Docs |
| --- | --- | --- | --- | --- |
| agent-airline | Successful task | τ-bench airline | 10 | |
| agent-retail | Successful task | τ-bench retail | 10 | |
| code-generation | pass@1 | MBPP+ | 10 | |
| contract-canonical-fields | F1 | Kleister-NDA | 10 | |
| contract-clause-enumeration | F1 | CUAD | 10 | |
| contract-key-fields | F1 | CUAD | 10 | |
| contract-long-doc-fact | F1 | Kleister-Charity | 10 | |
| contract-ma-deal-points | F1 | MAUD | 10 | |
| doc-qa-abstractive | ANLS | DUDE | 10 | 10 |
| doc-qa-extractive | ANLS | DUDE + MP-DocVQA | 10 | 10 |
| doc-qa-list | ANLS | DUDE | 10 | 10 |
| doc-qa-refusal | NA-acc | DUDE | 10 | 10 |
| emotion-classification | F1 | GoEmotions | 10 | |
| form-receipt-extraction | F1 | CORD-v2 + FUNSD + SROIE | 10 | 10 |
| function-completion | pass@1 | HumanEval+ | 10 | |
| hate-offensive | F1 | Davidson | 10 | |
| intent-classification | F1 | Banking77 | 10 | |
| intent-in-scope | F1 | CLINC150 | 10 | |
| intent-multilingual | F1 | MASSIVE | 10 | |
| invoice-extraction | F1 | katanaml + FATURA2 | 10 | 10 |
| phi-redaction | F1 | MTSamples | 10 | |
| pii-detection | F1 | ai4privacy | 10 | |
| text-to-api | Syntax Match Accuracy | BFCL v3 | 10 | |
| text-to-sql | Execution Accuracy | BIRD-SQL | 10 | |
| toxic-binary | F1 | toxic-chat | 10 | |
| toxic-content-multilabel | F1 | Jigsaw | 10 | |
| unknown-intent | AUROC | CLINC150 OOS | 10 | |

Generated by scripts/build-example-datasets.py in the Pareta repo.
