```python
class SoftwareEngineer:
    def __init__(self):
        self.name = "Jeet Patel"
        self.role = "Software Engineer | AI/ML Engineer"
        self.education = "M.S. Data Science @ Indiana University (GPA: 3.8)"
        self.location = "Bloomington, IN"

    def current_focus(self):
        return [
            "Building production ML systems that scale",
            "Backend services with AI/ML integration",
            "Distributed systems and inference optimization",
            "MLOps and reliable model deployment",
        ]

    def technologies(self):
        return {
            "languages": ["Python", "Go", "Java", "SQL"],
            "backend": ["FastAPI", "Django", "REST APIs", "gRPC"],
            "ml_stack": ["PyTorch", "TensorFlow", "LangChain", "HuggingFace"],
            "databases": ["PostgreSQL", "Redis", "MongoDB", "Neo4j"],
            "cloud": ["AWS", "GCP", "Docker", "Kubernetes"],
            "mlops": ["MLflow", "Airflow", "CI/CD", "Monitoring"],
        }
```
Building production AI systems for nonprofit analytics
- Built LLM-powered chatbot handling 1,200+ monthly queries with 90%+ accuracy
- Deployed Text-to-SQL service using Llama 3 with LangChain and FAISS
- Implemented Mistral-7B pipeline with chain-of-thought reasoning for mission classification
- Designed distributed processing system for 175K+ nonprofit records across GPU clusters
- Built Neo4j knowledge graph revealing 78 latent nonprofit funding networks
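Surfacing latent funding networks boils down to grouping nonprofits that share funders into connected components. A minimal union-find sketch of that idea, with a toy `GRANTS` list standing in for the real 175K-record Neo4j pipeline:

```python
from collections import defaultdict

# Toy grant edges: (funder, nonprofit). Illustrative data only.
GRANTS = [
    ("FunderA", "Org1"), ("FunderA", "Org2"),
    ("FunderB", "Org2"), ("FunderB", "Org3"),
    ("FunderC", "Org9"),
]

def funding_networks(grants):
    """Group nonprofits into networks that are linked through shared funders."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for funder, org in grants:
        union(funder, org)

    networks = defaultdict(set)
    for _, org in grants:
        networks[find(org)].add(org)
    return list(networks.values())
```

In the toy data, Org1–Org3 form one network through FunderA and FunderB, while Org9 stands alone.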
Optimizing AI inference and building scalable ML systems
- Reduced inference latency by 18% through batch processing optimizations
- Built RAG-powered chatbot with vector search (FAISS/ChromaDB)
- Developed BERT-based classification model achieving 80% accuracy
- Engineered DynamoDB system handling 500K+ creator-brand records
- Implemented LoRA-based fine-tuning enabling weekly model updates
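Batching wins latency because each model call pays a fixed overhead that gets amortized across the batch. A toy cost model illustrating the effect; the millisecond figures are illustrative, not the measured numbers behind the 18% improvement:

```python
def batch_requests(requests, max_batch=8):
    """Group incoming requests so the model runs one forward pass per batch."""
    return [requests[i:i + max_batch] for i in range(0, len(requests), max_batch)]

def estimated_latency_ms(n_requests, per_item_ms=2.0, overhead_ms=30.0, max_batch=8):
    """Toy cost model: every model call pays a fixed overhead,
    so fewer, larger calls mean lower total latency."""
    n_calls = -(-n_requests // max_batch)  # ceiling division
    return n_calls * overhead_ms + n_requests * per_item_ms
```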
Production-grade API serving cached responses ~1,600x faster through intelligent caching
Highlights:
- vLLM + AWQ quantization (28GB → 14GB VRAM)
- Redis caching: 8.3s → 5ms response time
- Sliding-window rate limiting per API key
- Prometheus + Grafana observability
Stack: vLLM, FastAPI, Redis, Prometheus, Docker
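The per-key sliding-window rate limiting can be sketched as below. This in-memory version stands in for the Redis-backed one in the actual service:

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Per-API-key sliding-window rate limiter (in-memory sketch;
    a production version would back the windows with Redis)."""

    def __init__(self, limit, window_s):
        self.limit = limit
        self.window_s = window_s
        self.hits = {}  # api_key -> deque of request timestamps

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits.setdefault(key, deque())
        # Evict timestamps that have fallen out of the window.
        while q and now - q[0] >= self.window_s:
            q.popleft()
        if len(q) < self.limit:
            q.append(now)
            return True
        return False
```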
Production-grade backend demonstrating Stripe-style payment patterns
Highlights:
- Idempotency middleware preventing duplicate charges
- Database transactions with rollback guarantees
- Rate limiting with Redis
- 20ms P50 latency, 0% error rate
Stack: Go, PostgreSQL, Redis, Docker
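The core of the idempotency pattern: the first request with a given Idempotency-Key executes the charge and caches the response, and retries replay that cached response instead of charging again. A minimal Python sketch (the project itself is in Go, backed by PostgreSQL/Redis rather than a dict):

```python
class IdempotencyStore:
    """Sketch of idempotency middleware: one execution per key,
    replayed responses for retries. In-memory stand-in for a
    durable store."""

    def __init__(self):
        self._responses = {}

    def handle(self, idempotency_key, charge_fn):
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]  # replay, don't re-charge
        response = charge_fn()
        self._responses[idempotency_key] = response
        return response
```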
LangChain ReAct agents for intelligent startup pitch evaluation
Highlights:
- Multi-agent architecture with tool calling
- Real-time market analysis
- Structured evaluation framework
Stack: LangChain, OpenAI GPT-4, Streamlit, Python
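At the heart of a ReAct loop is the tool-calling step: the agent emits an action name plus input, and a registry dispatches it. A stripped-down sketch of that dispatch; the tool names and their canned outputs are hypothetical, not the agents' real tools:

```python
# Hypothetical tool registry for illustration only.
TOOLS = {
    "market_size": lambda sector: {"saas": "$200B"}.get(sector, "unknown"),
    "echo": lambda text: text,
}

def run_action(action, action_input):
    """Dispatch one agent action to its registered tool."""
    tool = TOOLS.get(action)
    if tool is None:
        return f"Unknown tool: {action}"
    return tool(action_input)
```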
End-to-end ML system for Medicare billing prediction
Highlights:
- Bio_ClinicalBERT for medical text processing
- XGBoost ensemble with SHAP explainability
- MLflow experiment tracking
- Delta Lake data versioning
Stack: PyTorch, XGBoost, MLflow, Delta Lake, SHAP
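One common way to combine a text model with a tabular model is a weighted blend of their predicted probabilities. A sketch under that assumption; the 0.4 weight is an illustrative default, not the tuned value from this project:

```python
def ensemble_predict(text_score, tabular_score, w_text=0.4):
    """Blend the text-model probability (e.g. from the clinical-text
    model) with the tabular model's probability via a weighted average."""
    return w_text * text_score + (1 - w_text) * tabular_score
```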
Scalable ETL pipeline for financial fraud detection
Highlights:
- Real-time streaming with PySpark
- SageMaker model deployment
- Redshift data warehouse
- Automated alerting system
Stack: AWS, PySpark, SageMaker, Redshift, Lambda
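A simple building block for streaming fraud alerts is a rolling z-score check: flag a transaction far above an account's recent amounts. A toy stand-in for the deployed detection logic, with an illustrative threshold:

```python
from statistics import mean, stdev

def is_anomalous(history, amount, z_threshold=3.0):
    """Flag a transaction more than z_threshold standard deviations
    above the account's recent amounts. Threshold is illustrative."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return amount != mu
    return (amount - mu) / sigma > z_threshold
```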
End-to-end data pipeline for climate insights
Highlights:
- Snowflake data warehouse with Snowpipe
- dbt transformations with CI/CD
- Interactive Tableau dashboards
- What-if scenario modeling
Stack: Snowflake, dbt, Tableau, Python
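The what-if modeling can be boiled down to compounding an assumed annual change over a horizon. A minimal sketch; in the actual pipeline the inputs would come from the Snowflake/dbt models rather than hard-coded parameters:

```python
def project_metric(baseline, annual_change_pct, years):
    """What-if scenario: compound an annual percentage change
    over a number of years. Inputs are illustrative."""
    return baseline * (1 + annual_change_pct / 100) ** years
```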
📝 I Built a Subscription Backend Like Stripe in 6 Hours: Here's What I Learned
Building something interesting? I'm always open to discussing software engineering, ML systems, or potential opportunities.
jeetp5118@gmail.com · LinkedIn · Medium