Hritik Datta

Product @ Pre6 AI · I build production-grade AI agent systems.

Product by title, builder by craft — I design AI products and ship the engineering behind them: multi-agent orchestration, agent evaluation, and AI safety infrastructure.

What I work on

I care about the unglamorous half of AI products: the part that decides whether they survive contact with real users. Most demos make a single LLM call. Production systems need orchestration, evaluation, safety gates, and observability. That gap is where I build.

Multi-agent orchestration — supervisor/specialist architectures with typed state, tool binding, and streaming traces.
Agent reliability — measurable, auditable evaluation of agent runs across reliability, safety, latency, and cost.
LLM safety — scanning retrieval context for prompt injection, secret leakage, PII, and exfiltration before it reaches a model.
Developer tooling — sharp CLIs that turn fuzzy engineering signals into decisions teams can act on.
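
To make the LLM safety item concrete, here is a minimal sketch of scanning one retrieved chunk before it reaches a model. It is not the code from any of my repos: the three regex patterns, the `Verdict` type, and the `scan_chunk` function are hypothetical names chosen for illustration, and a real scanner would need far broader coverage.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for illustration only; real coverage must be much wider.
INJECTION = re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE)
SECRET = re.compile(r"\b(?:sk|ghp)_[A-Za-z0-9]{16,}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


@dataclass(frozen=True)
class Verdict:
    action: str      # "allow", "redact", or "quarantine"
    text: str        # text that may be passed on (empty when quarantined)
    findings: tuple  # names of the rules that fired


def scan_chunk(chunk: str) -> Verdict:
    """Decide what to do with one retrieved chunk, using only fixed rules."""
    if INJECTION.search(chunk):
        # Text that tries to instruct the model: drop the whole chunk.
        return Verdict("quarantine", "", ("prompt_injection",))

    findings = []
    cleaned = chunk
    for name, pattern in (("secret", SECRET), ("pii_email", EMAIL)):
        # Replace each match with a placeholder such as [SECRET].
        cleaned, count = pattern.subn(f"[{name.upper()}]", cleaned)
        if count:
            findings.append(name)

    action = "redact" if findings else "allow"
    return Verdict(action, cleaned, tuple(findings))
```

Because the rules are plain regular expressions, the same chunk always produces the same verdict, which keeps the decision auditable without calling a model.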

Featured work

Project	What it is	Stack	Links
nabla	A reverse-mode autograd engine you can watch think — the algorithm behind PyTorch/JAX, from scratch, with an interactive visualizer that animates backprop through the computation graph. Gradient-checked to 1e-10.	Python · CI	Live Demo · Code
mosaic	A byte-pair-encoding tokenizer you can see — train a real BPE on your own text and watch any string break into a mosaic of tokens. Zero-dependency, lossless round-trips.	Python · CI	Live Studio · Code
winnow	Budget-aware context compression for RAG and agents — BM25 relevance + MMR diversity packs the highest-signal context into a token budget. Deterministic, zero runtime deps, no API keys, with a reproducible benchmark.	Python · CI	Live Demo · Code
warren	From-scratch HNSW approximate-nearest-neighbor index — the graph algorithm behind vector databases. Recall@10 of 0.99+ while scanning ~5% of the database, measured against exact search.	Python · NumPy · CI	Live Demo · Code
stencil	Constrained decoding — compiles a JSON Schema to a DFA and masks an LLM's tokens so invalid output is impossible. 100% valid by construction vs ~0% unconstrained.	Python · CI	Live Demo · Code
mend	Repairs malformed JSON from LLMs into valid JSON — fences, single quotes, trailing commas, truncated output. Recovers 16/16 real-world defects vs stdlib's 0.	Python · CI	Live Demo · Code
gemma4-multi-agent	Multi-agent system — a Supervisor routes work across 4 specialist agents with live reasoning traces and sandboxed tool execution.	Python · LangGraph · Gemini · Streamlit	Code
agent-evals-lab	Evaluation workbench for agent reliability — typed scoring engine, policy rules, regression detection, and a trace-inspection dashboard.	TypeScript · React · CI	Live Demo · Code
verdict	Adversarial LLM red-teaming platform — runs PAIR, Crescendo, and injection attacks against any model, then reports attack-success-rate metrics with per-category breakdowns and HTML reports.	Python · CI	Code
rag-safety-gateway	AI security gateway that scans RAG context for prompt injection, secrets, PII, and exfiltration risk, producing deterministic allow/redact/quarantine decisions.	TypeScript · React · CI	Live Demo · Code
hermes	Test-time compute scaling engine — gives any LLM o1-style reasoning search via Process Reward Models, MCTS, and beam search.	Python · CI	Code

Every featured project ships with tests, CI, and documentation — clone, run, and review the design in minutes.
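
As a taste of the from-scratch style in the table above, the following is a minimal sketch of reverse-mode autograd with a numerical gradient check, the idea behind nabla. It is not nabla's actual code: the `Value` class, the test function `f`, and the `numeric_grad` helper are assumptions made for this example, and it covers only addition, multiplication, and tanh on scalars.

```python
import math


class Value:
    """A scalar node in a computation graph that records how it was produced."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents          # input nodes
        self._local_grads = local_grads  # d(self)/d(parent), one per parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def tanh(self):
        t = math.tanh(self.data)
        return Value(t, (self,), (1.0 - t * t,))

    def backward(self):
        # Order nodes so each is processed after everything that consumes it,
        # then apply the chain rule from the output back to the inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad


def f(x, y):
    return (x * y + x).tanh()


def numeric_grad(fn, args, i, h=1e-6):
    """Central-difference estimate of d(fn)/d(args[i])."""
    hi = list(args)
    lo = list(args)
    hi[i] += h
    lo[i] -= h
    up = fn(*[Value(a) for a in hi]).data
    down = fn(*[Value(a) for a in lo]).data
    return (up - down) / (2 * h)


x, y = Value(0.5), Value(-1.5)
out = f(x, y)
out.backward()
```

After `out.backward()`, `x.grad` and `y.grad` can be compared against `numeric_grad(f, (0.5, -1.5), 0)` and `numeric_grad(f, (0.5, -1.5), 1)`. The tolerance a check like this can reach depends on the step size and the function, so this sketch makes no claim about matching the 1e-10 figure quoted for nabla.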

repo-pulse, one of my CLIs, generating a real engineering-health report with no keys or config.

How I build

Typed contracts first → domain models before logic, so behavior is auditable
Deterministic by default → scoring and decisions reproducible without a live model
Measurable, then pretty → evals and telemetry before dashboards
Reviewable in 60 seconds → clone, run, understand — no API keys to start
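
The first two principles can be sketched together: a typed record for an agent run, and a pure scoring function that needs no live model. This is an illustrative sketch, not the scoring engine from agent-evals-lab. The `AgentRun` and `Budget` types, their field names, the default budget values, and the scoring formulas are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRun:
    """Hypothetical record of one agent run; field names are illustrative."""
    succeeded: bool
    policy_violations: int
    latency_ms: float
    cost_usd: float


@dataclass(frozen=True)
class Budget:
    """Hypothetical targets a run is scored against."""
    max_latency_ms: float = 5_000.0
    max_cost_usd: float = 0.05


def score_run(run: AgentRun, budget: Budget = Budget()) -> dict[str, float]:
    """Pure function: the same run and budget always yield the same scores."""
    return {
        "reliability": 1.0 if run.succeeded else 0.0,
        "safety": 1.0 if run.policy_violations == 0 else 0.0,
        # 1.0 when within budget, shrinking toward 0 as the run overshoots it.
        "latency": min(1.0, budget.max_latency_ms / max(run.latency_ms, 1e-9)),
        "cost": min(1.0, budget.max_cost_usd / max(run.cost_usd, 1e-9)),
    }


run = AgentRun(succeeded=True, policy_violations=0, latency_ms=10_000.0, cost_usd=0.01)
scores = score_run(run)
```

Defining the domain model before the logic means the scoring function can be reviewed and tested on recorded runs alone, with no API keys involved.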

Stack

Python · TypeScript · LangGraph · LangChain · React · Streamlit · Google Gemini · OpenAI · pytest · Vitest · GitHub Actions · uv

Open to conversations on AI agent engineering, evals, and LLM safety.
