confidence-scoring

An enterprise-grade backend system that automatically classifies, resolves, and escalates customer support tickets using FastAPI, Python, and Advanced NLP, while ensuring safety through confidence-based decision making and human oversight.

nlp sqlalchemy jwt rbac tf-idf fastapi customer-support-ai confidence-scoring

Updated Apr 10, 2026
Python

laundromatic / shopgraph

Star 3

The extraction API that shows its work. Product data extraction with per-field confidence scoring and extraction provenance. REST API + MCP server. 50 free calls/month.

ecommerce ucp schema-org structured-data ai-agents product-data mcp-server confidence-scoring agent-commerce stripe-mpp shopgraph extraction-provenance

Updated Jun 5, 2026
TypeScript

obielin / llm-extract

Star 2

Extract structured data from any document — PDF, DOCX, HTML, CSV, plain text — using LLMs with Pydantic schema validation, per-field confidence scores, and source grounding.

python nlp pdf extraction structured-output pydantic llm document-parsing anthropic confidence-scoring

Updated Apr 5, 2026
Python

metareason-ai / metareason-core

Star 2

Open-source LLM evaluation engine with statistical confidence scoring

statistical-analysis bayesian-inference ai-governance llm-evaluation confidence-scoring

Updated Mar 24, 2026
Python

seljicom / selji-zero-noise

Star 1

Zero-Noise utilities for safer product research and review signal analysis.

ecommerce decision-support consumer-research review-analysis product-research confidence-scoring zero-noise buyer-tools shopping-tools

Updated Feb 7, 2026
JavaScript

dakshjain-1616 / AgentLiar

Star 1

Verification system that catches coding agents falsely claiming task completion. Runs 4 parallel checks (file integrity, test quality, scope narrowing, optional LLM judge) over task+claim+diff and returns a weighted 0-100 confidence score with evidence.

verification asyncio agents github-actions pydantic fastapi ai-evaluation openrouter coding-agents code-review-automation llm-judge test-quality confidence-scoring scope-detection agent-overclaim

Updated May 21, 2026
Python
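
A weighted 0-100 confidence score of the kind the AgentLiar description mentions could be computed roughly as below. This is a minimal sketch and not the repository's actual code: the `CheckResult` type, the check names, and the weights are all hypothetical, and it assumes that a skipped optional check is simply left out so the remaining weights are renormalized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one verification check (hypothetical structure)."""
    name: str
    score: float   # 0.0 = check failed outright, 1.0 = check fully passed
    weight: float  # relative importance; illustrative values only
    evidence: str  # short human-readable justification


def weighted_confidence(results: list[CheckResult]) -> float:
    """Combine check scores into a single 0-100 weighted average.

    Only checks that actually ran are passed in, so dividing by the
    total weight renormalizes when an optional check is skipped.
    """
    total_weight = sum(r.weight for r in results)
    if total_weight <= 0:
        raise ValueError("at least one check with positive weight is required")
    weighted_sum = sum(r.score * r.weight for r in results)
    return 100.0 * weighted_sum / total_weight


# Hypothetical example: three checks ran, the optional LLM judge was skipped.
results = [
    CheckResult("file_integrity", 1.0, 2.0, "all claimed files exist in the diff"),
    CheckResult("test_quality", 0.5, 1.0, "tests added but assertions are weak"),
    CheckResult("scope_narrowing", 0.0, 1.0, "claim covers less than the task asked"),
]
confidence = weighted_confidence(results)
```

Keeping the per-check `evidence` strings next to the scores lets a reviewer see why the aggregate came out as it did, not only the number.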

SouravUpadhyay7 / self_correcting_rag

Star 1

Research-grade Self-Correcting RAG agent built with LangGraph that retrieves knowledge, generates answers, evaluates grounding/relevance/completeness, and iteratively self-improves with confidence scoring and memory.

python rag streamlit langchain llm-agent openrouter hallucination-detection langgraph knowledge-retrieval huggingface-embeddings confidence-scoring self-correcting-ai

Updated Mar 20, 2026
Python

lorenzespinosa / n8n-ai-agent-delegator

Star 1

Multi-agent AI task delegation architecture for n8n: orchestrator routes natural-language commands to specialist agents with confidence scoring and human-in-the-loop gates.

automation mcp orchestration multi-agent openai ai-agents n8n llm llmops agent-orchestration confidence-scoring

Updated May 30, 2026

theangelofwill / CrossModel-Consensus

Star 1

System that aggregates outputs from multiple Large Language Models (GPT-4, Claude-3, custom models) to generate reliable, high-confidence results through consensus-based reasoning evaluation. Demonstrates sophisticated AI orchestration with a 92.7% accuracy improvement over a single model.

python api docker portfolio machine-learning ai deep-learning orchestration pytorch neural-networks multi-model consensus-algorithm model-comparison mlflow fastapi ai-engineering llm prompt-engineering confidence-scoring

Updated Dec 22, 2025
Python

simply-mihir / nistula-technical-assessment

Star 1

AI-powered concierge that normalises guest messages from WhatsApp, Booking.com, Airbnb, Instagram and direct channels, drafts a reply with Claude, and routes responses through a deterministic confidence-scoring pipeline. Built with FastAPI + Claude Sonnet 4.

python nlp webhook postgresql customer-support saas hospitality multi-channel operational-dashboard ai-agents claude intent-classification fastapi llm prompt-engineering anthropic ai-tooling confidence-scoring unified-messaging guest-messaging

Updated May 18, 2026
Python
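
Several entries in this listing describe routing a drafted reply according to its confidence, with a human in the loop for uncertain cases. A deterministic gate of that sort might look like the sketch below. The route names and both thresholds are assumptions chosen for illustration and are not taken from any listed repository.

```python
from enum import Enum


class Route(Enum):
    """Possible destinations for a drafted reply (hypothetical names)."""
    AUTO_SEND = "auto_send"
    HUMAN_REVIEW = "human_review"
    ESCALATE = "escalate"


def route_reply(
    confidence: float,
    auto_threshold: float = 0.85,    # assumed value, tune per deployment
    review_threshold: float = 0.50,  # assumed value, tune per deployment
) -> Route:
    """Map a confidence in [0, 1] to a route using fixed thresholds.

    The same confidence always yields the same route, which keeps the
    decision auditable.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= auto_threshold:
        return Route.AUTO_SEND
    if confidence >= review_threshold:
        return Route.HUMAN_REVIEW
    return Route.ESCALATE
```

For example, `route_reply(0.9)` returns `Route.AUTO_SEND`, while `route_reply(0.6)` sends the draft to a human reviewer.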

duke-of-beans / composite-confidence-score

Star 0

7-axis weighted confidence function for AI output quality. Evidence, reasoning, calibration, source, domain, coherence, meta.

evaluation ai-quality multi-axis reasoning-engine confidence-scoring

Updated Jun 2, 2026
TypeScript

obinexus / gating

Star 0

hotl hitl automated-workflows obinexus task-cognition protocol-enforcement confidence-scoring housng-automony strategic-exection qa-gating semantic-matrix dual-gating systematic-navigation life-work-balance verb-noun-modeling homogeneous-exection task-validation cognitvie-automation matrix-aligment human-aware-systems

Updated Oct 22, 2025

wjddusrb03 / docforge

Star 0

Smart Document Conversion for the AI Era - CPU-only, fast, with confidence scoring. Converts PDF, DOCX, PPTX, HTML, EPUB to Markdown, JSON, HTML, Text.

python markdown json streaming document-conversion html-to-markdown text-extraction batch-processing pdf-parser cli-tool docx-to-markdown document-parser rag llm rich-cli epub-to-markdown pdf-to-markdown cpu-only confidence-scoring pptx-to-markdown

Updated Mar 29, 2026
Python

selfradiance / memledger

Star 0

Append-only CLI ledger for structured agent memory claims with provenance, confidence, contestability, and immutable history.

nodejs cli typescript sqlite provenance developer-tools ai-agents audit-trail append-only zod local-first agent-memory confidence-scoring memory-integrity claim-ledger

Updated Apr 28, 2026
TypeScript

JLHC-AI-portfolio / community-fair-supplier-packet-review

Star 0

Supplier PDF-to-Excel/CSV workflow with structured extraction, confidence scoring, validation flags, and human-review cues.

nodejs express validation data-cleaning csv-export excel-automation pdf-extraction document-automation confidence-scoring ai-assisted-extraction

Updated Apr 28, 2026
JavaScript

adrianzevenster / confidence-engine-llm-extraction

Star 0

A lightweight extraction engine that returns structured document fields with token-level confidence, entropy, and model uncertainty signals for evaluation and observability.

python information-extraction uncertainty-estimation mlops fastapi llm confidence-scoring