MusaIslamFahad/vision-track-AI

🎯 VisionTrack AI - Object Detection & Tracking

Production-grade object detection and tracking - YOLOv8 + BoT-SORT/ByteTrack, interactive Streamlit dashboard, Plotly analytics, MP4 export and optional W&B logging.

Built by Md. Musa Islam Fahad · CSE (Data Science) · Daffodil International University


📖 Overview

VisionTrack AI is a production-grade, multi-object detection and tracking system. It combines YOLOv8 (Ultralytics) for fast and accurate detection with BoT-SORT or ByteTrack to assign persistent IDs and render movement trails across frames for all 80 COCO object classes.

An interactive Streamlit dashboard exposes full control over models, trackers, and thresholds, and renders live Plotly analytics (FPS timeline, object count chart, class breakdown, track timeline). Sessions can optionally be logged to Weights & Biases, and the annotated output is exportable as an MP4 video.

The system works on:

  • 🎥 Live webcam streams
  • 📁 Pre-recorded video files
  • 🖼️ Static images

✨ Features

| Feature | Details |
| --- | --- |
| 🤖 Detection Model | YOLOv8 (n / s / m / l / x variants), pre-trained on the 80 COCO classes |
| Dual Tracker Support | BoT-SORT (appearance + motion) · ByteTrack (motion only), switchable from the UI |
| 🎨 Rich Visualisation | Colour-coded bounding boxes, persistent track IDs, movement trails |
| 🏷️ Class Filter | Filter any of the 80 COCO classes directly from the sidebar |
| 📊 Live Analytics | Plotly charts: FPS timeline, object count, class breakdown, track timeline |
| 📥 MP4 Export | Download the fully annotated video from the dashboard |
| 📡 W&B Logging | Per-frame FPS, object count, session summary, model config (optional) |
| 📷 Multi-source Input | Webcam, video file, or image, selectable from the UI |
| 🐳 Docker Ready | Full Dockerfile for containerised CPU or GPU deployment |
| ☁️ Multi-platform Deploy | Streamlit Cloud · HuggingFace Spaces · Docker |

🧰 Tech Stack

| Layer | Technology |
| --- | --- |
| Language | Python 3.10+ |
| Detection Model | YOLOv8 (Ultralytics) |
| Trackers | BoT-SORT (appearance + Kalman filter) · ByteTrack (motion-only) |
| Computer Vision | OpenCV |
| Deep Learning Backend | PyTorch |
| UI / Dashboard | Streamlit 1.35 |
| Analytics Charts | Plotly |
| Experiment Tracking | Weights & Biases (optional) |

📁 Project Structure

object_detection_tracker/
│
├── app.py                 # Streamlit main application - entry point
│
├── src/
│   └── tracker.py         # ObjectTracker class (YOLOv8 + BoT-SORT engine)
│
├── utils/
│   ├── analytics.py       # SessionStats + Plotly chart builders
│   ├── video_utils.py     # Video I/O helpers (read, write, frame extraction)
│   └── logger.py          # W&B experiment logger
│
├── .streamlit/
│   └── config.toml        # Streamlit theme + server configuration
│
├── requirements.txt       # Python dependencies
├── Dockerfile             # Container build definition
├── .env.example           # Environment variable template
├── .gitignore
└── README.md

⚙️ Local Installation

1. Clone the repository

git clone https://github.com/MusaIslamFahad/codealpha_tasks.git
cd codealpha_tasks/CodeAlpha_Object_Detection_and_Tracking

2. Create a virtual environment

python -m venv venv
# On Windows
venv\Scripts\activate
# On macOS / Linux
source venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

Or install manually:

pip install ultralytics opencv-python streamlit plotly wandb python-dotenv

GPU Acceleration: PyTorch is installed automatically via ultralytics. For CUDA support, install the GPU-enabled build from pytorch.org before running pip install -r requirements.txt.

4. Configure environment variables (optional - for W&B)

cp .env.example .env
# Edit .env and add: WANDB_API_KEY=your_key_here

5. Run the app

streamlit run app.py
# Open http://localhost:8501
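
The key configured in step 4 could be read from Python roughly as follows. This is a minimal sketch, not the project's actual code: the helper name `get_wandb_key` is hypothetical, and it assumes `python-dotenv` is installed, falling back to plain environment variables if it is not.

```python
import os
from typing import Optional


def get_wandb_key() -> Optional[str]:
    """Return the W&B API key from the environment, or None if it is unset."""
    try:
        # python-dotenv copies entries from a local .env file into os.environ;
        # by default it does not override variables that are already set.
        from dotenv import load_dotenv
        load_dotenv()
    except ImportError:
        pass  # no python-dotenv: rely on variables already exported in the shell
    key = os.getenv("WANDB_API_KEY", "").strip()
    return key or None  # treat an empty value as "logging disabled"
```

Returning `None` for a missing or empty key lets the caller skip W&B logging without raising, which matches the "optional" nature of this step.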

🚀 Usage

Streamlit Dashboard

streamlit run app.py

Open http://localhost:8501 in your browser. Use the sidebar to:

  • Select input source (webcam / video file / image)
  • Choose model variant (yolov8n → yolov8x)
  • Select tracker (BoT-SORT or ByteTrack)
  • Set confidence and IoU thresholds
  • Filter specific COCO classes
  • Enter your W&B API key (optional)

The main panel shows the annotated live feed with the Plotly analytics charts below it, plus a download button for the MP4 export.
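
The sidebar choices map naturally onto the arguments of the Ultralytics `YOLO.track` call. The sketch below shows one way to do that mapping; the helper names (`class_ids`, `track_kwargs`, `track_frame`) and the `TRACKER_CONFIGS` table are hypothetical and not taken from `app.py`, while `botsort.yaml` and `bytetrack.yaml` are the tracker configs that ship with Ultralytics.

```python
from typing import Optional

# UI label -> tracker config file bundled with Ultralytics
TRACKER_CONFIGS = {
    "BoT-SORT": "botsort.yaml",
    "ByteTrack": "bytetrack.yaml",
}


def class_ids(names: dict, wanted: list) -> Optional[list]:
    """Translate class names into the integer IDs that `classes=` expects.

    `names` is the id -> name mapping a loaded model exposes as `model.names`.
    Returns None (meaning "no filter") when nothing is selected.
    """
    if not wanted:
        return None
    lookup = {name: idx for idx, name in names.items()}
    return sorted(lookup[name] for name in wanted)


def track_kwargs(tracker: str, conf: float, iou: float,
                 classes: Optional[list] = None) -> dict:
    """Build keyword arguments for `YOLO.track` from the sidebar selections."""
    if tracker not in TRACKER_CONFIGS:
        raise ValueError(f"unknown tracker: {tracker!r}")
    if not (0.0 <= conf <= 1.0 and 0.0 <= iou <= 1.0):
        raise ValueError("conf and iou must lie in [0, 1]")
    return {
        "tracker": TRACKER_CONFIGS[tracker],
        "conf": conf,
        "iou": iou,
        "classes": classes,
        "persist": True,   # keep tracker state between successive frames
        "verbose": False,
    }


def track_frame(model, frame, **kwargs):
    """Run detection + tracking on one frame with an Ultralytics YOLO model."""
    return model.track(frame, **kwargs)[0]
```

With Ultralytics installed, `model = YOLO("yolov8n.pt")` followed by `track_frame(model, frame, **track_kwargs("ByteTrack", 0.25, 0.45, class_ids(model.names, ["person"])))` would return a results object whose `boxes.id` holds the track IDs (it is `None` when nothing is tracked in that frame). The threshold values here are only examples.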

Keyboard Controls (webcam mode)

| Key | Action |
| --- | --- |
| q | Quit / stop stream |

🐳 Docker Deployment

# Build the image
docker build -t visiontrack-ai .
# Run on CPU
docker run -p 8501:8501 visiontrack-ai
# Run with GPU
docker run --gpus all -p 8501:8501 visiontrack-ai
# Open http://localhost:8501

☁️ Deploy to HuggingFace Spaces

  1. Create a new Space at huggingface.co/spaces
  2. Choose Streamlit SDK
  3. Push this repository to the Space:
git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/visiontrack-ai
git push hf main
  4. (Optional) Apply for a free GPU grant in your Space settings for real-time inference

☁️ Deploy to Streamlit Cloud

  1. Fork this repository to your GitHub account
  2. Go to share.streamlit.io and sign in
  3. Select repo → app.py → Deploy
  4. Add secrets in the Streamlit Cloud dashboard if using W&B:
    • Key: WANDB_API_KEY → Value: your API key

📊 W&B Experiment Tracking

  1. Create a free account at wandb.ai
  2. Get your API key from Settings → API Keys
  3. Paste it in the sidebar W&B API key field, or add it to .env

Each processing session automatically logs:

| Metric | Description |
| --- | --- |
| Per-frame FPS | Inference speed over time |
| Object count | Number of detections per frame |
| Unique tracks | Total distinct objects tracked per session |
| Avg FPS / Peak objects | Session-level summary stats |
| Model config | Variant, confidence, IoU, tracker, and class filter settings |
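
The metrics above can be accumulated with a small stats object. The following is a hedged sketch: the class borrows the `SessionStats` name from `utils/analytics.py`, but its fields and methods here are assumptions, and it counts tracked objects per frame as a stand-in for the detection count. `log_session` uses the public `wandb.init`, `wandb.log`, and `run.summary` calls; the project name is a placeholder, and it logs after the session for brevity rather than in real time.

```python
from dataclasses import dataclass, field


@dataclass
class SessionStats:
    """Accumulates per-frame metrics (hypothetical stand-in, not the project's class)."""
    fps: list = field(default_factory=list)
    counts: list = field(default_factory=list)
    track_ids: set = field(default_factory=set)

    def update(self, frame_seconds: float, ids: list) -> dict:
        """Record one processed frame and return its per-frame metrics."""
        fps = 1.0 / frame_seconds if frame_seconds > 0 else 0.0
        self.fps.append(fps)
        self.counts.append(len(ids))
        self.track_ids.update(ids)
        return {"fps": fps, "object_count": len(ids)}

    def summary(self) -> dict:
        """Session-level summary: average FPS, peak objects, unique tracks."""
        return {
            "avg_fps": sum(self.fps) / len(self.fps) if self.fps else 0.0,
            "peak_objects": max(self.counts, default=0),
            "unique_tracks": len(self.track_ids),
        }


def log_session(stats: SessionStats, config: dict,
                project: str = "visiontrack-ai") -> None:
    """Push a finished session to W&B (requires wandb and a configured API key)."""
    import wandb  # imported lazily so the stats class works without wandb
    run = wandb.init(project=project, config=config)
    for fps, count in zip(stats.fps, stats.counts):
        wandb.log({"fps": fps, "object_count": count})
    run.summary.update(stats.summary())
    run.finish()
```

Keeping the accumulation separate from the W&B call means the same numbers can feed the Plotly charts when logging is disabled.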

🧠 Architecture

Input (Image / Video File / Webcam)
        │
        ▼
 ┌─────────────┐
 │   YOLOv8    │  ← Pre-trained on COCO (80 classes)
 │  Detector   │  ← Configurable: n / s / m / l / x variant
 └──────┬──────┘
        │  raw detections (bbox, class, confidence)
        ▼
 ┌──────────────────────────┐
 │         Tracker          │  ← BoT-SORT: appearance embedding + Kalman filter
 │   (selectable from UI)   │  ← ByteTrack: motion-only, lighter & faster
 └──────┬───────────────────┘
        │  tracked detections (bbox, class, track_id)
        ▼
 ┌─────────────────────┐
 │  Annotation Engine  │  ← Bounding boxes, colour-coded IDs, trails, FPS HUD
 └──────┬──────────────┘
        │
        ▼
 ┌───────────────────────────────┐     ┌────────────────┐
 │ Streamlit Dashboard (app.py)  │────▶│  W&B Logger    │  (optional)
 │  ├─ Live annotated feed       │     │  (logger.py)   │
 │  ├─ Plotly analytics charts   │     └────────────────┘
 │  │   ├─ FPS timeline          │
 │  │   ├─ Object count chart    │
 │  │   ├─ Class breakdown       │
 │  │   └─ Track timeline        │
 │  └─ MP4 download button       │
 └───────────────────────────────┘

Step-by-step:

  1. Detection: YOLOv8 processes each frame and returns bounding boxes, class labels, and confidence scores.
  2. Tracking: The selected tracker (BoT-SORT or ByteTrack) matches detections across frames and assigns persistent IDs.
  3. Annotation: The annotation engine draws bounding boxes, colour-coded track IDs, and movement trails on each frame.
  4. Dashboard: Streamlit renders the annotated frames live alongside Plotly charts for FPS, object count, class breakdown, and track timelines.
  5. Export: The processed video is written to disk by video_utils.py and made available as an MP4 download.
  6. Logging: If W&B is configured, logger.py pushes per-frame and session-level metrics in real time.
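
The annotation step (persistent colours and movement trails) relies on remembering where each track ID has been. Below is a minimal sketch of that bookkeeping in plain Python. `TrailBuffer` and `track_colour` are hypothetical names, the colour formula is arbitrary, and dropping a trail as soon as its track is missing from a frame is a simplifying assumption.

```python
from collections import defaultdict, deque


def track_colour(track_id: int) -> tuple:
    """Deterministic BGR colour for a track ID, so a track keeps its colour."""
    return ((37 * track_id) % 256,
            (17 * track_id + 80) % 256,
            (29 * track_id + 160) % 256)


class TrailBuffer:
    """Keeps the last `max_len` box-centre points for every active track ID."""

    def __init__(self, max_len: int = 30) -> None:
        self._trails = defaultdict(lambda: deque(maxlen=max_len))

    def update(self, tracks: dict) -> None:
        """`tracks` maps track_id -> (x1, y1, x2, y2) for the current frame."""
        for track_id, (x1, y1, x2, y2) in tracks.items():
            centre = (int((x1 + x2) / 2), int((y1 + y2) / 2))
            self._trails[track_id].append(centre)
        # Forget trails whose track is absent from this frame.
        for lost in set(self._trails) - set(tracks):
            del self._trails[lost]

    def points(self, track_id: int) -> list:
        """Trail of a track, oldest point first (empty if the ID is unknown)."""
        return list(self._trails.get(track_id, ()))
```

To render a trail, each consecutive pair of points can be joined with `cv2.line` in the colour returned by `track_colour`. The bounded `deque` keeps memory constant per track regardless of video length.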

🎯 Model Variants

from ultralytics import YOLO

# Swap the model by changing one line in app.py or src/tracker.py.
# Keep exactly one assignment active; the weights download on first use.
model = YOLO("yolov8n.pt")    # Nano - fastest, lowest VRAM
# model = YOLO("yolov8s.pt")  # Small
# model = YOLO("yolov8m.pt")  # Medium
# model = YOLO("yolov8l.pt")  # Large
# model = YOLO("yolov8x.pt")  # Extra-large - most accurate

| Model | Size | Speed (CPU) | mAP50-95 | Best For |
| --- | --- | --- | --- | --- |
| yolov8n | 6.2 MB | ~8 FPS | 37.3 | CPU / edge / Streamlit Cloud |
| yolov8s | 21.5 MB | ~5 FPS | 44.9 | Balanced speed + accuracy |
| yolov8m | 49.7 MB | ~3 FPS | 50.2 | Higher accuracy |
| yolov8l | 83.7 MB | ~2 FPS | 52.9 | High accuracy |
| yolov8x | 130.5 MB | ~1 FPS | 53.9 | Max accuracy (GPU recommended) |

Recommendation: Use yolov8n on CPU or Streamlit Cloud. Use yolov8m or larger on a dedicated GPU.
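
The recommendation can be turned into a default at start-up. This is a sketch under assumptions: `pick_variant` and `gpu_available` are hypothetical helpers, and the choice of `yolov8m.pt` as the GPU default is one reading of "yolov8m or larger".

```python
def pick_variant(has_gpu: bool, prefer_accuracy: bool = False) -> str:
    """Map the recommendation above to a weights file name."""
    if not has_gpu:
        return "yolov8n.pt"  # CPU or Streamlit Cloud: smallest model
    return "yolov8x.pt" if prefer_accuracy else "yolov8m.pt"


def gpu_available() -> bool:
    """True if PyTorch can see a CUDA device; False if torch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()
```

A caller would then write something like `YOLO(pick_variant(gpu_available()))`, while still letting the sidebar override the default.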


Tracker Comparison

| Tracker | Algorithm | Speed | ID Stability | Best For |
| --- | --- | --- | --- | --- |
| BoT-SORT | Appearance embedding + Kalman filter | Medium | ⭐⭐⭐⭐⭐ | Crowded scenes, re-identification |
| ByteTrack | Motion-only (IoU matching) | Fast | ⭐⭐⭐ | Sparse scenes, low-VRAM environments |
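
To make "IoU matching" concrete, here is a simplified illustration of associating existing tracks with new detections by box overlap. It is not the ByteTrack algorithm itself, which additionally predicts track positions with a Kalman filter, solves the assignment with the Hungarian method, and runs a second association pass for low-confidence detections. The function names and the 0.3 threshold are assumptions for the example.

```python
def iou(a: tuple, b: tuple) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks: dict, detections: list, threshold: float = 0.3) -> dict:
    """Assign detections to tracks, best-overlapping pairs first.

    `tracks` maps track_id -> last known box; `detections` is a list of boxes.
    Returns {track_id: detection_index}; unmatched items are left out.
    """
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    matches = {}
    used = set()
    for score, tid, j in pairs:
        if score < threshold:
            break  # remaining pairs overlap even less
        if tid not in matches and j not in used:
            matches[tid] = j
            used.add(j)
    return matches
```

Detections left unmatched would start new tracks, and tracks left unmatched for several frames would be retired. Motion-only matching of this kind is cheap, but it can swap IDs when objects cross paths, which is the situation where appearance cues help.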

📋 Requirements

ultralytics>=8.0.0
opencv-python>=4.8.0
torch>=2.0.0
streamlit>=1.35.0
plotly>=5.18.0
wandb>=0.16.0
python-dotenv>=1.0.0
numpy>=1.24.0

Python version: 3.10 or higher


📦 COCO Classes (80)

person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light, fire hydrant, stop sign, parking meter, bench, bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe, backpack, umbrella, handbag, tie, suitcase, frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket, bottle, wine glass, cup, fork, knife, spoon, bowl, banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake, chair, couch, potted plant, bed, dining table, toilet, tv, laptop, mouse, remote, keyboard, cell phone, microwave, oven, toaster, sink, refrigerator, book, clock, vase, scissors, teddy bear, hair drier, toothbrush


🤝 Contributing

Contributions are welcome! If you'd like to improve the project:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Commit your changes (git commit -m 'Add some feature')
  4. Push to the branch (git push origin feature/your-feature)
  5. Open a Pull Request

👤 Author

Md. Musa Islam Fahad
CSE (Data Science) · Daffodil International University, Dhaka
📧 musa.islam.fahad@gmail.com
🌐 Portfolio · GitHub · LinkedIn


📄 License

This project is licensed under the MIT License - see LICENSE for details.


🙏 Acknowledgements

  • Ultralytics for YOLOv8, BoT-SORT, and ByteTrack integration
  • OpenCV for computer vision utilities
  • Streamlit for the dashboard framework
  • Plotly for interactive analytics charts
  • Weights & Biases for experiment tracking infrastructure
  • CodeAlpha for the internship opportunity and project brief

⭐ If you found this useful or built something cool on top of it, drop a star. It helps a lot!
