chat-quality

Here is 1 public repository matching this topic...

animesh01 / cqs-evaluation

LLM-as-a-judge evaluation demo for conversational AI: scores chats on a 4-dimension rubric, combines the results into a single 0–100 quality score, and calibrates the automated judge against human labels. Uses synthetic demo data.

nlp model-evaluation human-in-the-loop product-analytics conversational-ai streamlit chatbot-evaluation ai-evaluation llm-evaluation llm-as-a-judge rubric-scoring chat-quality

Updated Jun 10, 2026
Python
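
The description above mentions folding a 4-dimension rubric into one 0–100 score and checking the judge against human labels. A minimal sketch of that idea follows. The repository's actual dimensions, weights, rating scale, and calibration metric are not given on this page, so everything here is an assumption: the placeholder names `d1` to `d4`, the equal weights, the 1–5 rating scale, and the choice of mean absolute error.

```python
# Hypothetical sketch; not taken from the cqs-evaluation repository.

# Assumed: four rubric dimensions with equal weights (placeholder names).
WEIGHTS = {"d1": 0.25, "d2": 0.25, "d3": 0.25, "d4": 0.25}


def quality_score(scores, weights=WEIGHTS, lo=1, hi=5):
    """Combine per-dimension ratings (assumed lo..hi scale) into a 0-100 score."""
    total_weight = sum(weights.values())
    # Weighted average of the ratings, still on the lo..hi scale.
    weighted = sum(weights[d] * scores[d] for d in weights) / total_weight
    # Linearly rescale lo..hi onto 0..100.
    return 100.0 * (weighted - lo) / (hi - lo)


def mean_abs_error(judge_scores, human_scores):
    """Average absolute gap between judge scores and human labels."""
    if len(judge_scores) != len(human_scores) or not judge_scores:
        raise ValueError("need two non-empty lists of equal length")
    pairs = zip(judge_scores, human_scores)
    return sum(abs(j - h) for j, h in pairs) / len(judge_scores)


if __name__ == "__main__":
    chat = {"d1": 4, "d2": 3, "d3": 5, "d4": 4}  # made-up judge ratings
    print(quality_score(chat))
    print(mean_abs_error([50.0, 80.0], [60.0, 70.0]))
```

A lower mean absolute error would suggest the automated judge tracks human labels more closely; a real calibration would likely also look at correlation or agreement per dimension.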
