Harkiran Panesar Harkiran11

Software Engineer Intern @ TD | SWE @ McMaster | DevOps & AI Enthusiast

Harkiran11 /README.md

👋 Hello, World! I'm Harkiran Panesar

Software Engineering Student @ McMaster University

📫 Let's Connect!

LinkedIn Email Portfolio

I'm passionate about building robust, scalable software solutions with a strong focus on high-performance backend architecture, machine learning infrastructure, and GPU-accelerated computing. I enjoy turning complex problems into highly optimized, automated, and agentic systems.


πŸ› οΈ Technologies & Tools

Area Technologies
Languages & Compute
AI & Machine Learning
Cloud & Architecture
Data & Databases
Frameworks & APIs

📌 Highlighted Projects

| Project | Description | Tech Stack |
| --- | --- | --- |
| OmniDoc: Multimodal Document Intelligence | Built an inference pipeline with batching logic that runs Llama 3.2 Vision and Qwen-VL simultaneously on AMD MI300X hardware. Achieved 340 pages/min (18x faster than the CPU baseline) for complex chart-level Q&A and semantic citations. | Python, ROCm, Llama Vision, Qwen-VL, MI300X |
| ML Experiment Tracker | Developed a high-throughput polling dashboard for tracking ML training runs. Resolved read-heavy bottlenecks by placing Redis in front of PostgreSQL with cache invalidation on writes, achieving sub-100 ms reads under load. | React, Flask, Redis, PostgreSQL, Docker |
| CUDA Matrix Multiplication Engine | Engineered a low-level GPU compute kernel using shared-memory tiling and memory coalescing. Profiled extensively with NVIDIA Nsight Compute to determine whether large matrix workloads were compute-bound or memory-bound and to resolve the resulting bottlenecks. | CUDA, C++, NVIDIA Nsight, GPU Profiling |
| PathFinderAI: Agentic RAG System | Architected a multi-agent RAG system with persistent state management across conversational turns. Integrated live external data APIs and semantic vector search using LangGraph for complex reasoning chains. | Python, LangGraph, LangChain, Azure, Vector DBs |
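
The ML Experiment Tracker entry describes placing a cache in front of the database and invalidating it on writes. A minimal sketch of that cache-aside pattern follows. It is not the project's actual code: `TTLCache` is an in-memory stand-in for Redis, the `_db` dictionary is a stand-in for a PostgreSQL table, and the names `RunStore`, `get_run`, and `update_run` are hypothetical.

```python
import time
from typing import Any, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for Redis (assumption): keys map to values with an expiry."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


class RunStore:
    """Cache-aside reads with invalidation on write (hypothetical names)."""

    def __init__(self, cache: TTLCache, ttl_seconds: float = 30.0) -> None:
        self._db: Dict[str, Dict[str, Any]] = {}  # stand-in for a PostgreSQL table
        self._cache = cache
        self._ttl = ttl_seconds
        self.db_reads = 0  # counts reads that reached the "database"

    @staticmethod
    def _key(run_id: str) -> str:
        return f"run:{run_id}"

    def get_run(self, run_id: str) -> Optional[Dict[str, Any]]:
        cached = self._cache.get(self._key(run_id))
        if cached is not None:
            return dict(cached)  # cache hit: the database is not touched
        self.db_reads += 1
        row = self._db.get(run_id)
        if row is None:
            return None
        self._cache.set(self._key(run_id), dict(row), self._ttl)
        return dict(row)

    def update_run(self, run_id: str, **metrics: Any) -> None:
        # Write to the source of truth first, then invalidate the cached copy
        # so the next read repopulates it with fresh data.
        self._db.setdefault(run_id, {}).update(metrics)
        self._cache.delete(self._key(run_id))
```

Deleting the key on write, rather than updating the cached value in place, keeps the write path simple and leaves the TTL as a safety net if an invalidation is ever missed.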

📈 GitHub Stats

GitHub Streak

Pinned

  1. cuda-matmul-engine (Public)

    High-performance CUDA matrix multiplication kernels - shared memory tiling, register blocking, Roofline Model analysis. Benchmarked against cuBLAS.

    Cuda 1

  2. ml-experiment-tracker (Public)

    Full-stack ML experiment tracking dashboard. Flask + React + PostgreSQL + Redis + Docker.

    JavaScript

  3. omnidoc (Public)

    Upload complex documents (PDFs with charts, tables, and scans) and have a natural-language conversation about everything, including the visuals.

    Python

  4. PathFinderAI-RAG-chatbot (Public)

    A production-grade RAG chatbot built in n8n that helps McMaster engineering students navigate careers using grounded data from university resources, live job APIs, and government salary stats - all o...

  5. Stock_price_prediction (Public)

    Stock Price Prediction with PyTorch and LSTM

    Jupyter Notebook
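
The cuda-matmul-engine entry above mentions shared memory tiling and Roofline Model analysis. As a rough illustration of why tiling matters, the sketch below estimates the arithmetic intensity of an N x N FP32 matrix multiply and classifies it as memory-bound or compute-bound. It is a simplified model, not the repository's code: it ignores hardware caches, assumes each element of A and B is fetched from global memory N / tile times, and the device numbers (10 TFLOP/s peak, 500 GB/s bandwidth) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    peak_flops: float     # peak compute throughput, FLOP/s (hypothetical value below)
    mem_bandwidth: float  # global memory bandwidth, bytes/s (hypothetical value below)


def matmul_intensity(n: int, tile: int = 1, bytes_per_elem: int = 4) -> float:
    """Arithmetic intensity (FLOP per byte) of an n x n matmul.

    tile=1 models a naive kernel where every multiply-add reloads its
    operands; tile=T models a kernel that reuses each loaded element T
    times from shared memory. Caches are ignored (simplifying assumption).
    """
    flops = 2 * n**3           # one multiply and one add per inner-product term
    loads = 2 * n**3 / tile    # elements of A and B read from global memory
    stores = n**2              # elements of C written once
    return flops / ((loads + stores) * bytes_per_elem)


def attainable_flops(device: Device, intensity: float) -> float:
    """Roofline bound: limited by either peak compute or memory traffic."""
    return min(device.peak_flops, intensity * device.mem_bandwidth)


def classify(device: Device, intensity: float) -> str:
    """Compare intensity with the ridge point of the roofline."""
    ridge = device.peak_flops / device.mem_bandwidth
    return "compute-bound" if intensity >= ridge else "memory-bound"


if __name__ == "__main__":
    dev = Device(peak_flops=10e12, mem_bandwidth=500e9)  # hypothetical GPU
    for tile in (1, 16, 32):
        ai = matmul_intensity(4096, tile)
        print(tile, round(ai, 2), classify(dev, ai), attainable_flops(dev, ai))
```

Under these assumptions, intensity grows roughly linearly with the tile size, which is why tiling moves a kernel toward the compute-bound side of the roofline.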
