Has anyone successfully trained their own data? · MultimediaTechLab/YOLO · Discussion #210

Rhroan
Jun 17, 2025

I tried v9-s, v9-m, v9-c, and rd-9c.
I used 100 images and just want to detect a sticker, but I can't train a model.

python yolo/lazy.py task=train name=test task.data.batch_size=16 model=rd-9c task.epoch=100 device=0 out_path="d:\temp\yolo"

[image]

Replies: 3 comments 3 replies

Rhroan
Jun 17, 2025
Author

https://drive.google.com/file/d/1ZjAGnYeLOWjGJ804qs0a51gsYtNDVzf1/view?usp=drive_link

0 replies

Rhroan
Jun 29, 2025
Author

I think I found the answer.

My dataset has only 1 class. When Mosaic: 1 is enabled,
it is of no use and causes more trouble.

0 replies

Answer selected by Rhroan

darinhitchings
Sep 12, 2025

You can't train anything with 100 training images. You need more than 10,000 to get started. Find a labelled dataset such as COCO 2017 (or something similar) and train on that first. Then do 'transfer learning' once training has converged 70-80% of the way on the original dataset.

3 replies

vierzwei Nov 11, 2025

Does this mean you were able to use this repo to train on a custom dataset? For me, everything results in internal errors, even with the supplied coco.yaml or mock.yaml dataset files.

darinhitchings Nov 12, 2025

The supplied coco.yaml file contains paths that point to where your dataset is on disk. The authors of this code (abbreviating this site as MMTL) can't know how your hard disk is arranged. They could have given more examples, but long story short, they're operating from the perspective of "if you've been doing this for a while, you'll already know how to set up your own coco.yaml file". So they leave it to you to customize the file for your needs. I think I asked ChatGPT to do it. It gave me extra stuff I didn't want, but it also gave me a block of code that was sufficient to get the job done after I stripped the fluff out.

The main thing is getting the path: field correct at the top of coco.yaml (I was off by a parent/child directory level in my first attempt). If you're working on custom data, you will also need to specify the train: and val: fields of coco.yaml. Each one should point to a text file; by default these are train2017.txt and val2017.txt. Each of these files is supposed to contain a long list of the names of the samples in your dataset. If you have 100k samples in your dataset, each one should be in its own image file in a directory, and you'll need a parallel set of annotation files with the same base names as the image files.

You'll want 60-70% of your data files in your 'train2017.txt' file and 20-30% in your 'val2017.txt' file. The last 10% is for testing after training is done.
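As a rough illustration of the split described above, here is a minimal Python sketch that shuffles a list of image file names and cuts it into train, validation, and test lists, then writes one name per row. The function names, the 70/20/10 default fractions, and the fixed seed are assumptions made for this example and are not part of the repo. Whether the list files should hold bare file names or relative paths depends on how the loader reads them, so check that against your setup.

```python
import random
import tempfile
from pathlib import Path


def split_names(names, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle names reproducibly and cut them into train/val/test lists."""
    names = sorted(names)                 # stable starting order
    random.Random(seed).shuffle(names)    # seeded, so the split is repeatable
    n_train = round(len(names) * train_frac)
    n_val = round(len(names) * val_frac)
    train = names[:n_train]
    val = names[n_train:n_train + n_val]
    test = names[n_train + n_val:]        # whatever is left over
    return train, val, test


def write_list(path, names):
    """Write one file name per row."""
    Path(path).write_text("".join(f"{name}\n" for name in names))


if __name__ == "__main__":
    # Hypothetical file names, only to exercise the functions.
    names = [f"{i:012d}.jpg" for i in range(1, 101)]
    train, val, test = split_names(names)
    with tempfile.TemporaryDirectory() as tmp:
        write_list(Path(tmp) / "train2017.txt", train)
        write_list(Path(tmp) / "val2017.txt", val)
    print(len(train), len(val), len(test))
```

The held-out test list is returned but not written anywhere here, since the comment above only names train2017.txt and val2017.txt.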

If the yaml file has

path: /coco

then my dir structure has the images under /coco/images and the labels under /coco/labels.

If I have the file 000000000009.jpg in /coco/images/train2017, then I should have a corresponding file with the labels (bounding boxes and segment data) in the /coco/labels directory, i.e. /coco/labels/000000000009.txt

The file train2017.txt is a list of all the JPEG file names within /coco/images/train2017, with one item per row.

If you want to use a custom data set you need to follow this structure.
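One way to sanity-check a custom dataset against this structure is to look for images that have no label file with the same base name. The sketch below is only an illustration: the helper name missing_labels is made up, and it assumes the labels sit in a per-split subdirectory (labels/train2017, labels/val2017) that mirrors the images directory, as the follow-up comment in this thread describes. Adjust the label path if your layout differs.

```python
import tempfile
from pathlib import Path


def missing_labels(root, split):
    """Return names of images in images/<split> without labels/<split>/<stem>.txt."""
    root = Path(root)
    image_dir = root / "images" / split
    label_dir = root / "labels" / split   # assumed layout, see the note above
    return sorted(
        image.name
        for image in image_dir.glob("*.jpg")
        if not (label_dir / f"{image.stem}.txt").exists()
    )


if __name__ == "__main__":
    # Build a tiny throwaway dataset so the check can run anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "coco"
        (root / "images" / "train2017").mkdir(parents=True)
        (root / "labels" / "train2017").mkdir(parents=True)
        (root / "images" / "train2017" / "000000000009.jpg").touch()
        (root / "images" / "train2017" / "000000000025.jpg").touch()
        (root / "labels" / "train2017" / "000000000009.txt").touch()
        print(missing_labels(root, "train2017"))
```

On a real dataset you would call missing_labels("/coco", "train2017") and missing_labels("/coco", "val2017") before starting a training run.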

But to answer your question: no, I'm training on COCO data, so I don't consider that to be a custom dataset.

I don't consider custom datasets to be the problem. I consider a) installing all the libraries, with their version dependencies and inter-submodule version conflicts, to be a problem, and b) while the visualizations on wandb.ai are kind of awesome, I still hate the Lightning framework and found it very hard to work with. I also found several glitches/errors in this code when trying to use it for multi-GPU training. I got decent console log output with good status info when doing single-GPU training, but not when using more than one GPU. There were bugs in the code dealing with synchronization across multiple GPUs. So that's kind of a pain in the ass.

Lastly, when training from scratch, the best mAP score I could get was about 0.428, which is a far cry from 0.53. When training from an already-trained checkpoint and switching the code base to use integer math instead of floating-point math, we were also several points below 0.53 on the mAP score. So long story short, this code doesn't train well and I don't have confidence in the Lightning framework. It's also missing all of the augmentation features that the ultralytics / WKY version of YOLO v9 has. Unfortunately that version has licensing constraints... But at least it performs up to spec. So we're writing it from scratch. Good times.

darinhitchings Nov 12, 2025

My spacing (for indentation) got crushed. The directories test2017, train2017, and val2017 are subdirectories of 'images', and in the second instance the pair train2017 and val2017 are subdirectories of 'labels'.
