
Commit abc840d
Commit message: changes to readme.md, mostly editorial
1 parent: 9b82ecf
1 file changed: README.md (66 additions, 42 deletions)
# marathon-docker-template
Template submission format for participating in Topcoder Marathon Matches.
Information in the challenge specification always overrides information in this document.

## Submission format
Our template supports both the "submit data" and "submit code" submission styles. Your submission should be a single ZIP file with the following content:
```
/solution
    solution.csv
/code
    Dockerfile
    flags.txt      // optional
    <your code>
```

The `/solution/solution.csv` is the output your algorithm generates on the provisional test set. The format of this file will be described in the challenge specification.

The `/code` directory should contain a dockerized version of your system that will be used to reproduce your results in a well-defined, standardized way. This folder must contain a `Dockerfile` that will be used to build a docker container that will host your system during final testing. How you organize the rest of the contents of the `/code` folder is up to you, as long as it satisfies the requirements listed below in the Final testing section.

#### Notes:
- During provisional testing only your `solution.csv` file will be used for scoring; however, the tester tool will verify that your submission file conforms to the required format. This means that at least the `/code/Dockerfile` must be present from day 1, even if it doesn't describe any meaningful system to be built. However, we recommend that you keep working on the dockerized version of your code as the challenge progresses, especially if you are at or close to a prize-winning rank on the provisional leaderboard.
- Make sure that your submission package is smaller than 500 MB. This means that if you use large files (external libraries, data files, pretrained model files, etc.) that won't fit into this limit, then your docker build process must download these from the net during building. There are several ways to achieve this, e.g. external libraries may be installed from a git repository, and data files may be downloaded using `wget` or `curl` from Dropbox, Google Drive or any other public file hosting service. In any case, always make sure that your build process is carefully tested end to end before you submit your package for final testing.
- During final testing your last submission file will be used to build your docker container.
- Make sure that the contents of the `/solution` and `/code` folders are in sync, i.e. your `solution.csv` file contains the exact output of the current version of your code.
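As an illustration of the download-at-build-time approach, a `Dockerfile` might fetch a large pretrained model during the build instead of shipping it in the ZIP. This is a sketch only: the base image, URL, and file names below are placeholders, not part of this template.

```dockerfile
# illustrative sketch only: base image, URL and paths are placeholders
FROM python:3.10-slim

WORKDIR /work

# install build-time download tooling
RUN apt-get update \
    && apt-get install -y --no-install-recommends wget \
    && rm -rf /var/lib/apt/lists/*

# fetch a large pretrained model from a public host instead of shipping it
RUN wget -O model.pth https://example.com/path/to/model.pth

# copy the rest of the submission's code into the image
COPY . .
```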

## Final testing

To be able to successfully submit your system for final testing, some familiarity with [Docker](https://www.docker.com/) is required. If you have not used this technology before, you may first check [this page](https://www.docker.com/why-docker) and other learning material linked from there. To install Docker, follow [these instructions](https://www.docker.com/community-edition).

In some contests you will work with GPU-accelerated systems, in which case Nvidia-docker will also be required. See how to install Nvidia-docker [here](https://github.com/NVIDIA/nvidia-docker). Note that all sample `docker` commands given below should be replaced with `nvidia-docker` in this case.

## Contents of the /code folder
The `/code` folder of your submission must contain:
1. All your code (training and inference) that is needed to reproduce your results.
2. A dockerfile (named `Dockerfile`, without extension) that will be used to build your system.
3. All data files that are needed during training and inference, with the exception of
   - the contest's own training and testing data. You may assume that the training and testing data (as described in the problem statement's "Input files" section) will be available on the machine where your docker container runs, compressed files already unpacked,
   - large data files that can be downloaded automatically either during building or running your docker script.
4. Your trained model file(s). Alternatively your build process may download your model files from the network. Either way, you must make it possible to run inference without having to execute training first.

The tester tool will unpack your submission, and the
```
docker build -t <id> .
```
command will be used to build your docker image (the final '.' is significant), where `<id>` is your TopCoder handle.

The build process must run out of the box, i.e. it should download and install all necessary 3rd party dependencies, and either download from the internet or copy from the unpacked submission all necessary external data files, your model files, etc.
Your container will be started by the
```
docker run -v <local_data_path>:/data:ro -v <local_writable_area_path>:/wdata -it <id>
```
command, where the `-v` parameter mounts the contest's data to the container's `/data` folder. This means that all the raw contest data will be available for your container within the `/data` folder. Note that your container will have read-only access to the `/data` folder. You can store large temporary files in the `/wdata` folder.

To validate the template file supplied with this repo, you can execute the following command:
```
docker run -it <id>
```

#### Custom docker options
In some cases it may be necessary to pass custom options to the `docker` or `nvidia-docker` commands. If you need such flags, you should list them in a file named `flags.txt` and place this file in the `/code` folder of your submission. The file must contain a single line only. If this file exists, then its content will be added to the options list of the `docker run` command.

Example:

If `flags.txt` contains:
```
--ipc=host --shm-size 4G
```
then the docker command will look like:
```
docker run --ipc=host --shm-size 4G -v <local_data_path>:/data:ro -v <local_writable_area_path>:/wdata -it <id>
```

## Train and test scripts

Your container must contain a train and test (a.k.a. inference) script having the following specification. See the problem statement for further, problem-specific requirements, such as the allowed time limits for these scripts.

### train.sh

`train.sh <data-folder>` should create any data files that your algorithm needs for running `test.sh` later. The supplied `<data-folder>` parameter points to a folder having training data in the same structure as is available for you during the coding phase. You may assume that the data folder path will be under `/data` within your container.

As its first step `train.sh` must delete the self-created models shipped with your submission.

Some algorithms may not need any training at all. It is a valid option to leave `train.sh` empty, but the file must exist nevertheless.

Training should be possible using only the contest's own training data and publicly available external data. This means that this script should do all the preprocessing and training steps that are necessary to reproduce your complete training workflow.

A sample call to your training script:
```
./train.sh /data/train/
```

In this case you can assume that the training data looks like this:
```
data/
    train/
        // all raw training data,
        // e.g. images and annotations
```
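One possible shape for such a script is sketched below. The real preprocessing and training commands are reduced to a placeholder, and the `models` directory name is an assumption for illustration, not a template requirement.

```shell
#!/bin/bash
# hypothetical train.sh sketch: the real pipeline commands are placeholders
set -euo pipefail

DATA_DIR="${1:-/data/train/}"   # training data folder passed by the tester

# first step: delete the self-created models shipped with the submission
rm -rf models
mkdir -p models

# the remaining steps (preprocessing, training) would go here; this
# stand-in only records what a real run would do
echo "trained on $DATA_DIR" > models/training.log
```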

### test.sh

`test.sh <data-folder> <output_path>` should run your inference code using new, unlabeled data and should generate an output CSV file, as specified by the problem statement. You may assume that the data folder path will be under `/data`.

Inference should be possible to do without running training first, i.e. using only your prebuilt model files.

It should be possible to execute your inference script multiple times on the same input data or on different input data. You must make sure that these executions don't interfere: each execution must leave your system in a state in which further executions are possible.

A sample call to your testing script (single line):
```
./test.sh /data/test/ solution.csv
```
In this case you can assume that the testing data looks like this:
```
data/
    test/
        // all raw testing data,
        // e.g. unlabeled images
```
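A matching `test.sh` sketch is shown below, again with the inference command itself replaced by a placeholder; the CSV header is illustrative only, since the real columns come from the problem statement.

```shell
#!/bin/bash
# hypothetical test.sh sketch: inference is reduced to a placeholder
set -euo pipefail

DATA_DIR="${1:-/data/test/}"      # unlabeled input data
OUTPUT_PATH="${2:-solution.csv}"  # where the output CSV must be written

# start from a clean output file so repeated executions don't interfere
rm -f "$OUTPUT_PATH"

# real inference using the prebuilt models would go here; this stand-in
# only emits an illustrative CSV header
echo "id,prediction" > "$OUTPUT_PATH"
```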

## Verification workflow
1. `test.sh` is run on the provisional test set to verify that the results of your latest online submission can be reproduced. This test run uses your home-built models.
2. `test.sh` is run on the final validation dataset, again using your home-built models. Your final score is the one that your system achieves in this step.
3. `train.sh` is run on the full training dataset to verify that your training process is reproducible. After the training process finishes, further executions of the test script must use the models generated in this step.
4. `test.sh` is run on the final validation dataset (or on a subset of that), using the models generated in the previous step, to verify that the results achieved in step #2 above can be reproduced.

A note on reproducibility: we are aware that it is not always possible to reproduce the exact same results. E.g. if you do online training, then differences in the training environments may result in a different number of iterations, meaning different models. Also, you may have no control over random number generation in certain 3rd party libraries. In any case, the results must be statistically similar, and in case of differences you must have a convincing explanation of why the same result can not be reproduced.
