Releases: NVIDIA/dcgm-exporter

4.5.3-4.8.2

07 May 00:28
@nccurry
691c927
This commit was created on GitHub.com and signed with GitHub’s verified signature.
GPG key ID: B5690EEEBB952194
Verified

  • Update to DCGM 4.5.3 and DCGM Exporter 4.8.2.
  • Improve GPU health metrics, including reporting GPU-wide health incidents such as fallen-off-bus XIDs.
  • Make /debug/pprof profiling endpoints opt-in via --enable-pprof / DCGM_EXPORTER_ENABLE_PPROF.
  • Add PodMapper informer caching for Kubernetes pod mapping (#626) (@jaeeyoungkim).
  • Add per-process GPU metrics for time-sharing and MIG (#594) (@krystiancastai).
  • Make Helm priorityClassName configurable with explicit defaults (#444) (@runzhliu).
  • Add MIG device support for HPC job labels (#602) (@jay-mckay).
  • Update go-dcgm field metadata handling, deprecated field alias resolution, health constants, policy registration handling, and version info APIs.
  • Document IPv6 address formats for remote hostengine and metrics listen addresses.
  • Refresh dependencies, container base images, Docker image references, Helm chart values, Kubernetes manifests, and tests for this release.
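The IPv6 note above concerns how host and port are combined in an address string. The release notes do not spell out the documented formats, so the following is only a minimal sketch of the general Go convention, in which an IPv6 literal is wrapped in square brackets before the port is appended. The helper name `listenAddress` and the port numbers are illustrative assumptions, not taken from the exporter's code.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// listenAddress is a hypothetical helper (not part of dcgm-exporter).
// It joins a host and a port into the host:port form used by Go's net
// package. net.JoinHostPort adds square brackets around IPv6 literals.
func listenAddress(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	// Port numbers here are placeholders chosen for illustration.
	fmt.Println(listenAddress("::", 9400))      // all IPv6 interfaces
	fmt.Println(listenAddress("::1", 5555))     // IPv6 loopback
	fmt.Println(listenAddress("0.0.0.0", 9400)) // IPv4 needs no brackets

	// The reverse direction: split a bracketed IPv6 address into its parts.
	host, port, err := net.SplitHostPort("[2001:db8::1]:5555")
	if err != nil {
		fmt.Println("invalid address:", err)
		return
	}
	fmt.Println(host, port)
}
```

Without the brackets, the colons inside an IPv6 literal are ambiguous with the port separator, which is why an unbracketed form such as `::1:5555` is rejected by `net.SplitHostPort`.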

Contributors

runzhliu, jay-mckay, and 2 other contributors

4.5.2-4.8.1

09 Feb 15:43
@glowkey
52ffa18

  • Update to DCGM 4.5.2, latest Go 1.24, and base containers
  • Fix distroless symlink issue
  • Fix for parsing blank XIDs
  • Fix for NVLink entities starting at offset 1

4.5.1-4.8.0

28 Jan 22:10
@glowkey
3cb017b

  • Update to DCGM 4.5.1
  • Enable monitoring of GPU bind/unbind events and automatic reloading (beta) (@nvvfedorov)
  • Sync default metric watchlist for Docker and Helm (@faizan-exe)
  • Fix health endpoint behavior (@Alja9)
  • Increase default memory limit to 512Mi (@faizan-exe)
  • Make scrapeTimeout configurable (@faizan-exe)
  • Fix P2P Status mappings (@wkd-woo)

NOTICE: Helm chart now uses distroless container by default

Contributors

Alja9, wkd-woo, and 2 other contributors

4.4.2-4.7.1

10 Dec 15:30
@glowkey
b921c57

  • Update Go-DCGM
  • Update XID error texts based off "XID errors v580" (#588)
  • FIX: fix for violation time us->ns (#589)
  • feat: updating readme to include OpenObserve blog and dashboards for ... (#580)

4.4.2-4.7.0

18 Nov 17:47
@glowkey
54267f2

Contributors

andrew-leung, daveoy, and 2 other contributors

4.4.1-4.6.0

13 Oct 17:54
@glowkey
13ad457

  • Add allow list for pod label filtering (#564)
  • handle uninitialized map (#563)
  • Add hostPID field to values.yaml and DaemonSet template for Helm (#503)
  • feat(dcgm-exporter): add option to fail on nvml provider init error (#557)
  • Improve support for GPU NVLink monitoring

4.4.1-4.5.2

17 Sep 15:00
@glowkey
a5a5aa8

  • Follow FHS convention for logging: #556
  • Fix: enable metrics without pods using kubernetes-enable-{dra,virtual-gpus} #554
  • Add disable startup validate flag #555
  • hpc: reduce error logging if jobs directory does not exist

4.4.0-4.5.0

19 Aug 16:10
@glowkey
4ecf9b6

  • Update to DCGM 4.4 and CUDA 13.0
  • Kubernetes UID support (@andrew-leung)
  • Create distroless container target

Contributors

andrew-leung

4.3.1-4.4.0

07 Aug 15:12
@glowkey
6949141

  • Update to DCGM 4.3.1
  • Update podapi for DRA
  • Enable DCGM_EXP_P2P_STATUS for reporting GPU peer-to-peer NVLink status
  • Fix for empty HPC directory
  • Enable InitContainer support

4.2.3-4.2.0

11 Jul 14:25
@glowkey
9378144

DCGM-Exporter 4.2.3-4.2.0

  • [ISSUE-512] Added a new debugging facility to dump runtime objects into files
  • Kubernetes pod label support