I tried v9-s, v9-m, v9-c, and rd-9c.
I used 100 images and just want to detect stickers, but I can't get a model to train.
python yolo/lazy.py task=train name=test task.data.batch_size=16 model=rd-9c task.epoch=100 device=0 out_path="d:\temp\yolo"
-
I think I found the answer.
My dataset has only 1 class, and Mosaic was enabled (Mosaic: 1).
It is of no use for my data and only causes more trouble.
-
You can't train anything with 100 training images. You need more than 10,000 to get started. Find a labelled dataset such as COCO 2017 (or something similar) and train on that first. Then do 'transfer learning' once training has converged 70-80% of the way on the original dataset.
-
Does this mean you were able to use this repo to train on a custom dataset? For me, everything results in internal errors, even with the supplied coco.yaml or mock.yaml dataset files.
-
The supplied coco.yaml file contains paths that point to where your dataset lives on disk. The authors of this code (abbreviating this site as MMTL) can't know how your hard disk is arranged. They could have given more examples, but long story short, they're operating from the perspective of "if you've been doing this for a while, you'll already know how to set up your own coco.yaml file", so they leave it to you to customize the file for your needs. I think I asked ChatGPT to do it. It gave me extra stuff I didn't want, but also a block of code that was sufficient to get the job done after I stripped the fluff out.
The main thing is getting the path: field correct at the top of coco.yaml (I was off by a parent/child directory level in my first attempt). If you're working on custom data, you will also need to set the train: and val: fields of coco.yaml. Each one should point to a text file; by default these are train2017.txt and val2017.txt. Each of these files is supposed to contain a long list of the names of the samples in your dataset.
If you have 100k samples in your dataset, each one should be its own image file in a directory, and you'll need a parallel set of annotation files with the same base names as the image files. You'll want 60-70% of your samples listed in train2017.txt and 20-30% in val2017.txt. The last 10% is for testing after training is done.
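As a rough illustration of the split described above, here is a minimal Python sketch. The function names, the 70/20/10 fractions, and the fixed seed are my own assumptions for illustration and are not part of this repo; adjust them to your data.

```python
import random
from pathlib import Path


def split_samples(names, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle sample names and split them into train/val/test lists.

    The fractions are illustrative defaults (70% / 20% / remaining ~10%).
    """
    names = sorted(names)               # fixed starting order, so the split is reproducible
    random.Random(seed).shuffle(names)  # seeded shuffle
    n_train = round(len(names) * train_frac)
    n_val = round(len(names) * val_frac)
    train = names[:n_train]
    val = names[n_train:n_train + n_val]
    test = names[n_train + n_val:]      # whatever is left over
    return train, val, test


def write_list(path, names):
    """Write one sample name per line, e.g. to train2017.txt."""
    Path(path).write_text("\n".join(names) + "\n")
```

A typical use would be to collect the image file names from your image directory, call split_samples on them, and then call write_list once for train2017.txt and once for val2017.txt, keeping the test list aside for later.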
If the yaml file has
path: /coco
then my directory structure is:
/coco
|---- annotations
|---- images
|     |---- test2017
|     |---- train2017
|     |---- val2017
|---- labels
      |---- train2017
      |---- val2017
So if I have a file named 000000000009.jpg in /coco/images/train2017, then I should have a corresponding file with the labels (bounding boxes and segmentation data) in the matching directory under /coco/labels, i.e. /coco/labels/train2017/000000000009.txt
And the file train2017.txt is a list of all the jpeg file names within /coco/images/train2017 with one item per row.
If you want to use a custom data set you need to follow this structure.
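To sanity-check a layout like this before training, something along these lines may help. It is only a sketch: find_missing_labels is a hypothetical helper, not something from this repo, and it assumes labels sit in labels/<split>/ with the same base name as the image plus a .txt extension.

```python
from pathlib import Path


def find_missing_labels(root, split, image_ext=".jpg"):
    """Return base names of images in images/<split> that have no
    matching labels/<split>/<name>.txt file.

    Assumes the images/ and labels/ trees mirror each other.
    """
    root = Path(root)
    image_dir = root / "images" / split
    label_dir = root / "labels" / split
    missing = []
    for image in sorted(image_dir.glob(f"*{image_ext}")):
        if not (label_dir / f"{image.stem}.txt").is_file():
            missing.append(image.stem)
    return missing
```

For example, find_missing_labels("/coco", "train2017") would list the training images that have no label file. Whether an image with no objects should have an empty label file or none at all depends on the training code, so treat the result as a hint rather than a hard error.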
But to answer your question, no, I'm training on COCO data so I don't consider that to be a custom data set.
I don't consider custom data sets to be the problem. What I do consider a problem is: a) installing all the libraries, with their version dependencies and inter-submodule version conflicts, and b) the lightning framework. The visualizations on wandb.ai are kind of awesome, but I still hate the lightning framework and found it very hard to work with. I also found several glitches/errors in this code when trying to use it for multi-GPU training. I got decent console log output with good status info when doing single-GPU training, but not when using more than one GPU. There were bugs in the code dealing with synchronization across multiple GPUs. So that's kind of a pain in the ass.
Lastly, when training from scratch, the best mAP score I could get was about 0.428, which is a far cry from 0.53. When starting from an already trained checkpoint and switching the code base to use integer math instead of floating-point math, we were also several points below 0.53 on the mAP score. So long story short, this code doesn't train well and I don't have confidence in the lightning framework. It's also missing all of the augmentation features that the ultralytics / WKY version of YOLO v9 has. Unfortunately that version has licensing constraints... But at least it performs up to spec. So we're writing it from scratch. Good times.
-
In case the indentation in my directory listing doesn't display properly: the directories test2017, train2017, and val2017 are subdirectories of 'images', and the second train2017, val2017 pair are subdirectories of 'labels'.