TensorFloat-32
From Wikipedia, the free encyclopedia
Floating-point number format in Nvidia hardware
TensorFloat-32 (TF32) is a numeric floating-point format designed for the Tensor Cores in certain Nvidia GPUs.
Format
The binary format is:
- 1 sign bit
- 8 exponent bits
- 10 significand bits (also called mantissa, or precision bits)
The total 19-bit format fits within a double word (32 bits), and while it lacks precision compared with a normal 32-bit IEEE 754 floating-point number, it provides much faster computation: up to 8 times faster on an A100 than FP32 on a V100.[1]
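The format can be emulated in software by rounding a standard float32 value so that only 10 significand bits remain while the sign and 8-bit exponent are kept unchanged. The following is a minimal sketch rather than Nvidia's implementation: it assumes round-to-nearest-even as the rounding rule, the helper name round_to_tf32 is illustrative only, and special values such as NaN or infinity are not handled separately.

```python
import struct

def round_to_tf32(x: float) -> float:
    # Reinterpret the value as a 32-bit IEEE 754 bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    drop = 23 - 10                    # significand bits TF32 discards
    half = 1 << (drop - 1)
    lsb = (bits >> drop) & 1          # lowest significand bit that is kept
    # Round to nearest (ties to even), then clear the dropped bits.
    bits = (bits + half - 1 + lsb) & ~((1 << drop) - 1)
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]

# Example: pi keeps only 10 significand bits, so the result
# differs from the float32 value in the third decimal place.
print(round_to_tf32(3.14159265))  # 3.140625
```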
References
- ^ https://deeprec.readthedocs.io/en/latest/NVIDIA-TF32.html. Accessed 23 May 2024.