Pipelines, monitoring loops, retrieval engines, fine-tuned models: I care about the full arc from raw data to deployed endpoint. Currently exploring distributed model serving, agentic RAG, and low-compute LLM fine-tuning.
- 🔭 Building production ML systems end-to-end — not just notebooks
- 🧪 Into MLOps, LLM fine-tuning (LoRA/PEFT), and RAG pipelines
- 🏢 Ex-ML Intern @ Fitin: shipped a churn model (ROC-AUC 0.91, 100K+ records) plus a full Airflow + FastAPI pipeline
- 🥇 Ranked 35th / 2,247 on Kaggle Spaceship Titanic
- 🔧 Open-source contributor → Giskard AI (PR #2440: Bias LLM Judge eval check, merged)
- ✍️ Writing about ML systems & RAG on Substack
Production ML Monitoring & Auto-Retraining
KS-test + chi-squared drift detection triggering Airflow-orchestrated retraining on threshold breach. 3-gate MLflow promotion with zero-downtime symlink swaps. p99 latency + prediction confidence tracked via Prometheus → Grafana.
Airflow MLflow Docker FastAPI Prometheus
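A minimal sketch of the KS-test drift gate described above, written against the standard library only so it runs anywhere. The function names and the `alpha=0.05` threshold are illustrative assumptions, not the project's actual code; in practice `scipy.stats.ks_2samp` would be the usual choice, and the chi-squared check for categorical features is not shown.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    d = 0.0
    # The maximum gap is always reached at one of the observed sample points.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / n
        cdf_cur = bisect_right(cur, x) / m
        d = max(d, abs(cdf_ref - cdf_cur))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value for the two-sample KS test."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return c_alpha * math.sqrt((n + m) / (n * m))


def drift_detected(reference, current, alpha=0.05):
    """True when the KS statistic exceeds the critical value (hypothetical gate)."""
    d = ks_statistic(reference, current)
    return d > ks_critical_value(len(reference), len(current), alpha)
```

A retraining trigger would call `drift_detected(training_feature, live_feature)` per numeric feature and kick off the orchestrated retrain on the first `True`.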
LoRA Fine-tuning of TinyLlama-1.1B on Legal Docs
4-bit quantization (bitsandbytes) + PEFT/LoRA fine-tuning for domain-specific legal QA. Optimised for low-VRAM environments. Deployed via HuggingFace Transformers inference pipeline.
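Why LoRA suits low-VRAM setups comes down to simple arithmetic: instead of updating a full weight matrix, it trains a low-rank pair whose size grows with the sum of the matrix dimensions, not their product. The sketch below shows that count; the 2048 x 2048 layer shape and rank 8 are illustrative values only, not the settings used in this project.

```python
def full_params(d_out, d_in):
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


def lora_trainable_params(d_out, d_in, rank):
    """Parameters in the low-rank pair B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Illustrative numbers only (assumed layer shape and rank).
dense = full_params(2048, 2048)
adapter = lora_trainable_params(2048, 2048, rank=8)
fraction = adapter / dense
```

With these assumed values the adapter holds 32,768 trainable parameters against 4,194,304 in the frozen matrix, under 1% of the layer.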
Multi-Domain Document Retrieval System
Retrieval Precision@5 improved from 0.62 to 0.81 on a 30-query eval set. 91% query routing accuracy via LLM decomposition + parallel sub-query execution. 50+ enterprise docs, <2s latency.
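For reference, Precision@5 is the share of the top five retrieved documents that are relevant, averaged over the eval queries. A minimal sketch, with hypothetical function names and document IDs (the project's own eval harness is not shown here):

```python
def precision_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved documents that are relevant."""
    top_k = retrieved_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


def mean_precision_at_k(runs, k=5):
    """Average Precision@k over (retrieved_ids, relevant_ids) pairs."""
    return sum(precision_at_k(ret, rel, k) for ret, rel in runs) / len(runs)


# Hypothetical two-query eval set.
runs = [
    (["a", "b", "c", "d", "e"], {"a", "c", "x"}),  # 2 of 5 relevant
    (["p", "q", "r", "s", "t"], {"p", "q", "r", "s", "t"}),  # 5 of 5 relevant
]
score = mean_precision_at_k(runs)
```

Dividing by `k` rather than by the number of results returned penalises queries that retrieve fewer than five documents, which is one common convention.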
Hospital Resource Forecasting System
Hybrid SARIMA–XGBoost pipeline, RMSE 1.28 on 7-day patient inflow forecasting. PuLP linear programming cut projected ICU overcapacity by 18% under hard resource constraints. SHAP provides explainability.
ML & Deep Learning
scikit-learn PyTorch HuggingFace
MLOps & Infra
Docker
LLMs & GenAI
Cloud & Deployment
AWS