To ensure the traceability, reproducibility, and standardization of all ML datasets and models generated and consumed within Toyota Research Institute (TRI), we developed the Dataset-Governance-Policy (DGP), which codifies the schema and maintenance of all of TRI's Autonomous Vehicle (AV) datasets.
- Schema: Protobuf-based schemas for raw data, annotations and dataset management.
- DataLoaders: Universal PyTorch DatasetClass to load all DGP-compliant datasets.
- CLI: Main CLI for handling DGP datasets and the entry point for visualization tools.
Please see Getting Started for environment setup.
Getting started is as simple as initializing a dataset class with the relevant dataset JSON, raw data sensor names, annotation types, and split information. Below, we show an example of initializing a PyTorch dataset for multi-modal learning from 2D and 3D bounding boxes.
    from dgp.datasets import SynchronizedSceneDataset

    # Load synchronized pairs of camera and lidar frames, with 2d and 3d
    # bounding box annotations.
    dataset = SynchronizedSceneDataset(
        '<dataset_name>_v0.0.json',
        datum_names=('camera_01', 'lidar'),
        requested_annotations=('bounding_box_2d', 'bounding_box_3d'),
        split='train')
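The constructor arguments above can be collected and sanity-checked before the dataset is built. The sketch below is only an illustration: `dataset_kwargs` and `load_synchronized` are hypothetical helpers that are not part of DGP, and the checks (rejecting a bare string, requiring at least one datum name) are assumptions rather than rules enforced by the library. Only the `SynchronizedSceneDataset` call mirrors the example above.

```python
from typing import Any, Dict, Sequence


def dataset_kwargs(datum_names: Sequence[str],
                   requested_annotations: Sequence[str],
                   split: str = "train") -> Dict[str, Any]:
    """Collect the keyword arguments from the example above (hypothetical helper)."""
    if isinstance(datum_names, str) or isinstance(requested_annotations, str):
        # tuple("lidar") would silently split the string into characters.
        raise TypeError("pass a sequence of names, not a single string")
    if not datum_names:
        # Assumption: a dataset with no sensors selected is not useful.
        raise ValueError("at least one datum name is required")
    return {
        "datum_names": tuple(datum_names),
        "requested_annotations": tuple(requested_annotations),
        "split": split,
    }


def load_synchronized(dataset_json: str, **kwargs: Any):
    """Build the dataset; requires DGP to be installed (hypothetical helper)."""
    # Imported lazily so dataset_kwargs can be used without DGP installed.
    from dgp.datasets import SynchronizedSceneDataset
    return SynchronizedSceneDataset(dataset_json, **kwargs)
```

With DGP installed, `load_synchronized('<dataset_name>_v0.0.json', **dataset_kwargs(['camera_01', 'lidar'], ['bounding_box_2d', 'bounding_box_3d']))` would make the same call as the example above.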
Starter scripts are provided in the examples directory.
- examples/load_dataset.py: Simple example script to load a multi-modal dataset based on the Getting Started section above.
You can build the base Docker image and run the tests within a Docker container via:
    make docker-build
    make docker-run-tests
Build the Python wheel:
    make build
Set up a local development environment:
    make develop
Run the tests in the local development environment:
    make test
This repository adheres to PEP 440 for versioning.
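To illustrate what PEP 440 versioning looks like in practice, here is a minimal sketch that recognizes a subset of normalized PEP 440 version strings and extracts the numeric release segment. The pattern and the `parse_release` helper are illustrative assumptions, not part of DGP. They deliberately omit epochs, local versions, and the alternate spellings that PEP 440 also accepts. For full support, `packaging.version.Version` is the usual choice.

```python
import re
from typing import Optional, Tuple

# Simplified pattern for normalized PEP 440 versions:
# N(.N)* with optional aN/bN/rcN, .postN, and .devN suffixes.
# Epochs, local versions, and non-normalized spellings are left out.
_SIMPLE_PEP440 = re.compile(
    r"^(?P<release>\d+(?:\.\d+)*)"
    r"(?:(?:a|b|rc)\d+)?"
    r"(?:\.post\d+)?"
    r"(?:\.dev\d+)?$"
)


def parse_release(version: str) -> Optional[Tuple[int, ...]]:
    """Return the release segment as a tuple of ints, or None if unmatched."""
    match = _SIMPLE_PEP440.match(version)
    if match is None:
        return None
    return tuple(int(part) for part in match.group("release").split("."))
```

Comparing the returned tuples orders releases numerically, so `parse_release("0.10.0") > parse_release("0.9.1")` holds even though a plain string comparison would rank them the other way.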
We appreciate all contributions to DGP! To learn more about making a contribution to DGP, please see Contribution Guidelines.
| Job | CI | Notes |
|---|---|---|
| docker-build | Build Status | Docker build and push to container registry |
| pre-merge | Build Status | Pre-merge testing |
| doc-gen | Build Status | GitHub Pages doc generation |
| coverage | Build Status | Code coverage metrics and badge generation |
| Type | Platforms |
|---|---|
| 🚨 Bug Reports | GitHub Issue Tracker |
| 🎁 Feature Requests | GitHub Issue Tracker |
DGP is developed and currently maintained by Quincy Chen, Arjun Bhargava, Chao Fang, Chris Ochoa, and Kuan-Hui Lee from the ML-Engineering team at Toyota Research Institute (TRI), with contributions from the ML-Research team at TRI, Woven Planet, and Parallel Domain.