selective-prediction

Here are 26 public repositories matching this topic...

Languages: Python (22), Jupyter Notebook (4)

goergen95 / seapig

Star 7

Uncertainty based selection of compatible inputs

deep-learning pytorch remote-sensing uncertainty-estimation selective-prediction torchgeo geospatial-ai confidence-scoring

Updated Jun 14, 2026
Python

Dan23RR / snc-core

Star 4

Behavioral Trust Clustering: a thermodynamic governance layer that reduces LLM hallucination by 52% on HumanEval. Drop-in wrapper for any decoder. MIT.

abstention openai-api selective-prediction humaneval llm ollama qwen hallucination-mitigation trust-calibration regulated-ai behavioral-clustering

Updated May 4, 2026
Python

Guardrails watch what AI says. REMORA governs what AI does. A pre-execution governance layer for AI agent tool calls: ACCEPT, VERIFY, ABSTAIN, ESCALATE, with policy, evidence, uncertainty, and an auditable DecisionEnvelope. Research-grade, open source.

mcp ml thermodynamics opa uncertainty-quantification ai-safety ai-agents audit-trail enterprise-architecture togaf conformal-prediction policy-as-code guardrails opentelemetry lyapunov-stability selective-prediction agentic-ai tool-calling llm-governance multi-oracle

Updated Jun 11, 2026
Python

cleverhans-lab / confidential-guardian

Star 1

We show that a model owner can artificially introduce uncertainty into their model and provide a corresponding detection mechanism.

machine-learning uncertainty calibration zero-knowledge rejection abstention selective-prediction

Updated Jun 2, 2025
Jupyter Notebook

JiajunChen223 / DegradeRisk-Seg

Star 1

DegradeRisk-Seg: risk-controlled semantic segmentation under degraded multi-modal remote-sensing observations

pytorch calibration remote-sensing semantic-segmentation multimodal-learning risk-control selective-prediction degradation-benchmark

Updated Jun 4, 2026
Python

Tharun2908 / mistral-medqa-abstention

Star 1

Reliable medical QA with Mistral-7B, QLoRA, selective prediction, and learned abstention via warm-start SFT + DPO.

mistral peft dpo huggingface abstention medical-qa reliable-ai selective-prediction llm medqa qlora llm-safety

Updated May 31, 2026
Python

HrxuAlbert / cherry-pick-override

Star 0

Code and data release for the paper 'Cherry-pick Override: Unsafe Directional Commitment in LLM Judges under Mixed Evidence'

nlp reproducibility fact-checking multi-agent-systems ai-safety conformal-prediction fact-verification abstention selective-prediction llm llm-evaluation llm-as-judge

Updated Jun 5, 2026
Python

AnwarDebes / Clause-Driven-LLM

Star 0

Interpretable Tsetlin Machine control layer for LLM agents: it learns clause-based routing over frozen embeddings, escalates low-margin queries to a human, and emits a SAT-verifiable receipt for every decision.

propositional-logic intent-classification interpretable-machine-learning tsetlin-machine neuro-symbolic-ai selective-prediction ai-governance out-of-scope-detection llm-routing sat-verification

Updated Jun 14, 2026
Python

cleverhans-lab / sc-gap

Star 0

Code for our paper analyzing the looseness of the upper bound on selective classification performance.

machine-learning uncertainty-quantification rejection abstention selective-classification selective-prediction

Updated Nov 18, 2025
Jupyter Notebook

AnwarDebes / RobTM

Star 0

Tsetlin Machines with a certificate on every answer: the exact number of feature flips a prediction survives, computed per sample, with predict-or-abstain when the radius is too small.

research adversarial-attacks interpretable-ml tsetlin-machine adversarial-robustness abstention certified-robustness selective-prediction

Updated Jun 12, 2026
Python

Arutselvan / selective_prediction_mtl

Star 0

Investigation of how sampling strategies affect Selective Prediction performance in Multi Task Learning

nlp swag multi-task bert-fine-tuning snli-dataset selective-prediction

Updated Jan 4, 2022
Python

mhmdaskari / llm-confidence-scoring

Star 0

Experiments on whether disclosing logarithmic scoring rules reduces LLM overconfidence in multiple-choice QA.

calibration hallucination confidence-estimation abstention selective-prediction llm mmlu proper-scoring-rules mmlu-pro simpleqa

Updated Jun 13, 2026
Python

KonkovaElena / airi-summer-school-2026

Star 0

Reproducible MEDAI deferral simulation (AIRI 2026). Synthetic research code.

python machine-learning reproducible-research monte-carlo-simulation decision-support medical-informatics human-in-the-loop fair-principles expert-in-the-loop research-software uncertainty-calibration selective-prediction learning-to-defer bootstrap-statistics

Updated May 20, 2026
Python

steverab / incerto

Star 0

A comprehensive library for uncertainty quantification in machine learning.

calibration uncertainty-quantification active-learning conformal-prediction out-of-distribution-detection distribution-shift selective-prediction llm

Updated May 17, 2026
Python

Estella-Hu / deepfake-detection-bayesian-uncertainty

Star 0

Deepfake detection with Bayesian uncertainty quantification, selective prediction, and an interactive Streamlit demo.

computer-vision pytorch uncertainty-quantification deepfake-detection streamlit selective-prediction bayesian-ml trustworth-ai

Updated Mar 18, 2026
Jupyter Notebook

musicofhel / confgate

Star 0

Free confidence gate for LLM correctness — logistic regression on (generation length, mean logprob), with cascade routing and split-conformal certificates. The pinned topo-confidence result.

machine-learning calibration cascade uncertainty-quantification confidence conformal-prediction selective-prediction llm
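
For readers new to the topic: most of the repositories above share one idea, which is to answer only when a confidence score clears a threshold and to abstain otherwise. The sketch below is a generic illustration of that idea and is not taken from any listed repository. The function names `selective_predict` and `coverage_and_risk`, the use of the maximum class probability as the confidence score, and the toy data are all assumptions made for this example.

```python
from typing import List, Optional, Tuple


def selective_predict(probs: List[float], threshold: float) -> Optional[int]:
    """Return the most likely class index, or None to abstain.

    Assumption: confidence is the maximum class probability. The listed
    projects use other scores as well (conformal sets, robustness radii, ...).
    """
    confidence = max(probs)
    if confidence < threshold:
        return None  # abstain: the model is not confident enough
    return probs.index(confidence)


def coverage_and_risk(
    all_probs: List[List[float]], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Compute coverage (fraction answered) and selective risk
    (error rate among the answered examples only)."""
    preds = [selective_predict(p, threshold) for p in all_probs]
    answered = [(p, y) for p, y in zip(preds, labels) if p is not None]
    coverage = len(answered) / len(labels)
    if not answered:
        return coverage, 0.0  # nothing answered, so no observed errors
    errors = sum(1 for p, y in answered if p != y)
    return coverage, errors / len(answered)


# Hypothetical toy data: four binary predictions and their true labels.
toy_probs = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.6, 0.4]]
toy_labels = [0, 1, 1, 1]

for t in (0.5, 0.7):
    cov, risk = coverage_and_risk(toy_probs, toy_labels, t)
    print(f"threshold={t}: coverage={cov:.2f}, selective risk={risk:.2f}")
```

Raising the threshold typically lowers coverage and, if the confidence score is informative, lowers selective risk as well. That trade-off is what the risk-coverage analyses in several of the listed projects study.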