I'm an AI/ML Engineer with an MS from UW-Madison, specializing in building production-grade LLM systems, agentic pipelines, and RAG architectures from scratch.
- 🔬 Currently building: self-improving LLM agents with QLoRA fine-tuning and RLAIF feedback loops
- 🏗️ I write raw API calls over frameworks — my agents don't use LangChain, AutoGen, or CrewAI
- 🤖 Deep focus on fine-tuning, knowledge distillation, and multi-modal AI
- 🎯 Open to: ML Engineer · AI Engineer · MLOps · Research Engineer roles
LLMs & Agents
Python PyTorch HuggingFace LlamaIndex Ollama
Vector DBs & RAG
pgvector ChromaDB Mistral LLaMA
Computer Vision & Fine-Tuning
Backend & Infra
FastAPI Docker PostgreSQL GitHub Actions
Grok-4 teacher → QLoRA-fine-tuned LLaMA-3.2-1B student
- Failure trace annotation pipeline → JSONL training data
- 95% inference cost reduction vs teacher at same quality
- ChromaDB strategy memory for cross-run self-improvement
QLoRA Knowledge-Distillation LLaMA-3 bitsandbytes ChromaDB
Production CLI coding agent — zero framework dependencies
- ReAct loop + 11 workspace tools, raw HTTPS to xAI/OpenAI
- RLAIF scoring via Grok 4 · JSONL tracing · cross-session memory
- ×ばつ faster than comparable LangChain baseline
ReAct RLAIF Tool-Calling xAI Python
E5-small-v2 + Mistral 7B over 10K+ indexed emails
- JWT auth with SQL-enforced per-user isolation at pgvector layer
- Embedding caching + batched retrieval · sub-80ms retrieval latency
- Precision@5 = 0.84 · Answer faithfulness = 0.79
RAG pgvector Mistral-7B FastAPI JWT
CNN vs ViT transfer learning on FER2013 (7-class)
- 5-model study: CNN → ResNet → ViT → Hybrid → Domain ViT
- 71.4% accuracy with
trpakov/vit-face-expression - Full ablation study + per-class precision/recall/F1
ViT CNN PyTorch Transfer-Learning FER2013
"Build it from scratch. Understand every layer. That's how you ship reliable AI."
I'm an AI/ML Engineer with an MS from UW-Madison, specializing in building production-grade LLM systems, agentic pipelines, and RAG architectures from scratch.
- 🔬 Currently building: self-improving LLM agents with QLoRA fine-tuning and RLAIF feedback loops
- 🏗️ I write raw API calls over frameworks — my agents don't use LangChain, AutoGen, or CrewAI
- 🤖 Deep focus on fine-tuning, knowledge distillation, and multi-modal AI
- 🎯 Open to: ML Engineer · AI Engineer · MLOps · Research Engineer roles
LLMs & Agents
Python PyTorch HuggingFace LlamaIndex Ollama
Vector DBs & RAG
pgvector ChromaDB Mistral LLaMA
Computer Vision & Fine-Tuning
Backend & Infra
FastAPI Docker PostgreSQL GitHub Actions
Grok 4 teacher → QLoRA fine-tunes LLaMA-3.2-1B student
- ChromaDB strategy memory for cross-session learning
- 90% task completion on Tau Bench benchmark
- Knowledge distillation via DPO on annotated failure traces
QLoRA DPO ChromaDB LLaMA-3 Self-Improving
Production CLI coding agent — zero framework dependencies
- ReAct loop + 11 workspace tools, raw HTTPS to xAI/OpenAI
- RLAIF scoring via Grok 4 · JSONL tracing · cross-session memory
- ×ばつ faster than comparable LangChain baseline
ReAct RLAIF Tool-Calling xAI Python
E5-small-v2 + Mistral 7B over 10K+ indexed emails
- JWT auth with SQL-enforced per-user isolation at pgvector layer
- Embedding caching + batched retrieval to cut inference latency
- Retrieval precision & answer faithfulness metrics
RAG pgvector Mistral-7B FastAPI JWT
CNN vs ViT transfer learning on FER2013 (7-class)
- Comparative study: CNN baseline vs ViT vs CNN-Transformer hybrid
trpakov/vit-face-expressionoutperforms from-scratch CNN- Full ablation study with confusion matrices
ViT CNN PyTorch Transfer-Learning FER2013
"Build it from scratch. Understand every layer. That's how you ship reliable AI."