-
Notifications
You must be signed in to change notification settings - Fork 74
Comments
Jitify2 Windows/Visual Studio 2022 fixes#146
Jitify2 Windows/Visual Studio 2022 fixes #146ptheywood wants to merge 24 commits intoNVIDIA:jitify2 from
Conversation
Robadob
commented
Aug 5, 2025
I no longer / don't currently have the ability to check with VS2019 as github actions no longer provides an image with it, and I do not have a local install at the moment.
I have it at home, not sure about in the office. Could try it later this week/next
benbarsdell
commented
Aug 6, 2025
Thanks so much for this!
I'll need to find time to take a closer look, but I quickly ran the tests on Linux and the only problem is the --build-config $<CONFIG> bit, which I had to change to --build-config "\"$<CONFIG>\"" to make it work (otherwise the value is empty and it thinks it's missing). (It may still need work; CMake is not my strong suit).
Regarding the trivially-copyable thing, it seems that trivial copyability of classes with deleted copy constructors is complicated, which may explain why the compilers have different opinions. I don't really even remember why the Arg class is made non-copyable, so you can probably just remove that.
5f57534 to
ef0b040
Compare
aedd6c5 to
132540f
Compare
ptheywood
commented
Aug 22, 2025
I've now resolved all but one of the test failures under windows (Visual Studio 2022 with CUDA 12.6), which originally were:
[ FAILED ] Jitify2Test.GetDirectoryHelpers
[ FAILED ] Jitify2Test.ReadmeExampleCode
[ FAILED ] Jitify2Test.UserGuideExampleCode
[ FAILED ] Jitify2Test.NvccSimple
[ FAILED ] Jitify2Test.EncodedQuoteIncludes
[ FAILED ] Jitify2Test.LinkCurrentExecutable
[ FAILED ] Jitify2Test.SerializationGoldensProgram
[ FAILED ] Jitify2Test.SerializationGoldensPreprocessedProgram
[ FAILED ] Jitify2Test.SerializationGoldensCompiledProgram
[ FAILED ] Jitify2Test.SerializationGoldensLinkedProgram
with equivalent failures for Jitify2TestStatic.
Most failures were related to the location of the cuda home / include directory and/or nvcc, and the presence of the host compiler on the path and could be mitigated with CUDA_HOME/JITIFY_CUDA_HOME and JITIFY_NVCC are used.
Additionally cl.exe must be added to users paths for JITIFY_ENABLE_NVCC.
These were caused by / addressed by:
- On Windows, there is no symlinked CUDA install as a fallback / default location as on Windows, so the default/fallback locaiton in
guess_cuda_homewas not valid.- Using the version from
CUDA_VERSIONdefined incuda.hseemed like a reasonable option, although this may differ from the versionnvccfound at runtime which could cause issues (if env variables are not set), although this applies to the linux fallback option as well.
- Using the version from
- On Windows,
CUDA_PATHis set by the cuda installer to point to the most recent CUDA installation. This could be checked ifJITIFY_CUDA_HOMEanCUDA_HOMEare not set / valid. - On Windows, the default CUDA installation directory contains spaces, so the path to the binary must be quoted when invoked via
_popen.- The same applies to the temporary include directory, which also includes single backslashes that need escaping or quoting too.
GetTempPath2Aon windows was returning the shortened version of the path to the temporary folder, which (potenitally) modifies the username portion of the path to include a~, i.e.C:\Users\PTHEYW~1\AppData\Local\Temp\where my local username is actuallyptheywood.- This fix may not be requried, as the single backslashes in the unquoted string were probably the real problem
- A warning is emmitted on windows about the temporary directory not being deleted - as the
detail::remove_allis not implemented for windows in C++14, and the__cplusplusvalue in MSVC tends to lie.- Switching to use
JITIFY_CPLUSPLUSinstead allowsstd::filesystem::remove_allto be used instead, and we do not require a c++14 version so I've not put the time in to address that case
- Switching to use
Jitify2Test.NvccSimple continued to fail, as is_valid_nvcc was always returning false.
_pcloseon windows (and thereforerun_system_command), returns0whennvcc --versionsucceeds or1whennvcccould not be invoked.is_valid_nvccwould only return true when a positive non-zero integer was returned, which was not the case on success on windows.- The documentation for the return value from
_pcloseis poor, stating to look at the_cwaitdocs for info on the value returned, but_cwait's documentation does not contain this...
- The documentation for the return value from
pcloseon posix/linux returns a positive integer value on success, which may include more than just the return code which could be checked viaWIFEXITED&WEXITSTATUS.- Use of
run_system_commandinis_valid_nvccwas always triggering asigpipeunder unix as nothing was being read from the pipe, leading to thepclosereturn value being a ~5 digit positive integer, and the underlying return code was141(128+13for sigpipe).
- Use of
I've resolved this by making run_system_command return:
-1if somethingn went very wrong opening / closing the pipe (as before)- the exit code of the underlying command,
0on success by convention, non zero otherwise
And where run_system_command is used the result is compared against 0 for truthy-ness.
This is covered by the test suite for is_nvcc_valid, but does not appear to be covered for use of detail::Nvcc::operator() (in NvccProgram::compile)? So I'm not 100% I've not caused any problems there / know if that behaves or not under windows.
Does this seem like an OK solution?
Jitify2Test.EncodedQuoteIncludes was failing as the index of the cuda_fp16.h header was 3 not 2 as in the test (i.e. "__jitify_I3@" instead of "__jitify_I2@").
I've resovled this by expecting a different value on windows than on linux.
Finally, Jitify2Test.LinkCurrentExecutable is continuing to fail under windows:
unknown file: error: C++ exception with description "Jitify fatal error: Linking failed: NVJITLINK_ERROR_INVALID_INPUT
Linker options: "-l. -arch=sm_86"
Linker error: Failed to add file: C:\Users\ptheywood\code\ptheywood\jitify\build-cu126\Release\jitify2_test.exe" thrown in the test body.
I'm unsure on why this doesn't work under windows / am not familiar enough with nvjitlink to know if this should work under windows or not, or how to fix it.
Any suggestions?
fc48a6d to
e5c59d6
Compare
...arget generators. Provides ctest with the --build-config only if $<CONFIG> genex is not empty
Fixed by splitting the string literal into two adjacent string literals.
...:nvvm in the test suite
This should not be neccessary, but (void)pch_verbose; does not prevent this being raised
...pyable under MSVC
... in guess_cuda_home
This allows for cases where the path to nvcc includes spaces, such as the default CUDA toolkit installation location on Windows
... path (no modified usernames)
...ws in remove_all This enables removal of the temporary directory, suppressing a warning
...st 0 on usage for success Always reads from the pipe even when not capturing output, to ensure that a sigpipe is not raised which prevents the exit code of the underlying command from being accessed
...c++ standard std::result_of was deprecated in c++17 and removed in c++20. This was identified under MSVC in c++20
e5c59d6 to
2c90d02
Compare
With CUDA 13 (after rebasing on jitify2), additional errors occur under windows with VS2022:
When attempting to build the test suite(s), an error occurs in the MSVC customisations
C:\Program Files\Microsoft Visual Studio2022円\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 13.0.targets(803,9): error MSB6001: Invalid command line switch for "". The value [CUDA_INC_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/include] contains an odd number of double-quote characters. Only even numbers of literal double-quote characters are acceptable to command line tools. [C:\Users\ptheywood\code\ptheywood\jitify\build-cu130\jitify2_test_static.vcxproj]
From the .vcxproj files(s), CUDA_INC_DIR is a quoted string of 2 include directories (as cccl has moved), separated by a ;. I would have expected this should be fine, but ; is the separator used within .vcxproj <Defines>. This feels like it may be a bug in the cuda build customisations?.
<Defines>%(Defines);_WINDOWS;_CRT_SECURE_NO_WARNINGS;NDEBUG;JITIFY_ENABLE_NVTX=1;JITIFY_LINK_CUDA_STATIC=1;JITIFY_LINK_NVRTC_STATIC=1;JITIFY_LINK_NVJITLINK_STATIC=1;CUDA_INC_DIR="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/include;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/include/cccl";CUB_DIR=;CMAKE_INTDIR="RelWithDebInfo"</Defines>
add_compile_definitions(
CUDA_INC_DIR="${CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES}"
CUB_DIR=${CUDA_INC_DIR})
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/include;C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v13.0/include/cccl
This can be worked around by passing them separately (splitting in cmake and passing as separate variables), but libcudacxx / cuda/std is used within the cuda toolkit (i.e. in curand) so where these are not explicitly passed to tests failures also occur.
Passing the CUB_INC_DIR as an additional include directory to tests which did not previsouly provide it partially works, however failures still occur due to:
-
Jitify2Test.EncodedQuoteIncludesfails as the inclusion index has changed -
Several tests fail due to CUDA 13's removal of pre-turing compute capabilities
- I've bumped these to Turing (75) which was introduced with CUDA 10.0 which seems recent enough to be OK as a minimum the test suite will run on (4 major versions), with the exception of passing multiple arch flags which would require >= CUDA 11.0 (for sm_80 to include more than one arch).
-
Thrust requires --std=c++17 for CCCL 3.0
-
Jitify2Test.LibCudaCxxAndBuiltinLimits fails due to
incompatible redefinition of macrowarnings forINTPTR_MINetc betweencuda/std/cstdintandstdint.h.-cuda/std/cstdint(105): warning #47-D: incompatible redefinition of macro \"INTPTR_MIN\" (declared at line 16 of stdint.h) - # define INTPTR_MIN INT64_MIN - ^ - -Remark: The warnings can be suppressed with \"-diag-suppress <warning-number>\" - -cuda/std/cstdint(106): warning #47-D: incompatible redefinition of macro \"INTPTR_MAX\" (declared at line 18 of stdint.h) - # define INTPTR_MAX INT64_MAX - ^ - -cuda/std/cstdint(107): warning #47-D: incompatible redefinition of macro \"UINTPTR_MAX\" (declared at line 20 of stdint.h) - # define UINTPTR_MAX UINT64_MAX - ^ - -cuda/std/cstdint(109): warning #47-D: incompatible redefinition of macro \"INTMAX_MIN\" (declared at line 17 of stdint.h) - # define INTMAX_MIN INT64_MIN - ^ - -cuda/std/cstdint(110): warning #47-D: incompatible redefinition of macro \"INTMAX_MAX\" (declared at line 19 of stdint.h) - # define INTMAX_MAX INT64_MAX - ^ - -cuda/std/cstdint(111): warning #47-D: incompatible redefinition of macro \"UINTMAX_MAX\" (declared at line 21 of stdint.h) - # define UINTMAX_MAX UINT64_MAX - ^ - -cuda/std/cstdint(113): warning #47-D: incompatible redefinition of macro \"PTRDIFF_MIN\" (declared at line 22 of stdint.h) - # define PTRDIFF_MIN INT64_MIN - ^ - -cuda/std/cstdint(114): warning #47-D: incompatible redefinition of macro \"PTRDIFF_MAX\" (declared at line 23 of stdint.h) - # define PTRDIFF_MAX INT64_MAXIn CCCL 3, these are defined in
libcudacxx/include/cuda/std/cstdint#L105-L114, as# define INTPTR_MIN INT64_MIN # define INTPTR_MAX INT64_MAX # define UINTPTR_MAX UINT64_MAX # define INTMAX_MIN INT64_MIN # define INTMAX_MAX INT64_MAX # define UINTMAX_MAX UINT64_MAX # define PTRDIFF_MIN INT64_MIN # define PTRDIFF_MAX INT64_MAX
Which is different than in
jitsafe_header_stdint_h:#define INTPTR_MIN LONG_MIN #define INTMAX_MIN LLONG_MIN #define INTPTR_MAX LONG_MAX #define INTMAX_MAX LLONG_MAX #define UINTPTR_MAX ULONG_MAX #define UINTMAX_MAX ULLONG_MAX #define PTRDIFF_MIN INTPTR_MIN #define PTRDIFF_MAX INTPTR_MAXI'm unsure on the correct resolution for this.
This PR under Windows now has 2 errors under windows, which I'm not sure how to (best) resolve:
Jitify2Test.LinkCurrentExecutablefails as mentioned in the previous comment. I do not know how to resolve this.Jitify2Test.LibCudaCxxAndBuiltinLimitsfails with CUDA 13.0, due to the `incompatible redefinition warnings. I'm not sure on how to best fix this.
(削除) I also need to re-test my MSVC fixes under linux once again prior to this being ready for full review/merging. (削除ここまで)
Edit: Slight fix required for difference in CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES values for CUDA 13 between unix/windows, but otherwise CUDA 12.9 passes all tests, CUDA 13.0 fails Jitify2Test.LibCudaCxxAndBuiltinLimits which was expected.
Any suggestions on the 2 outstanding test failures @benbarsdell?
mondus
commented
Sep 23, 2025
@benbarsdell Any chance that you could give @ptheywood some insight into the two failing tests? We are wanting to get a release of FLAME GPU out very soon but are stuck needing this PR to be merged. Could the tests even be skipped on windows to get these changes merged in?
Uh oh!
There was an error while loading. Please reload this page.
Compiling the current state of
jitify2(70783a3) under windows / MSVC 2022 was resulting in a number of compilation errors and compilation warnings when included in a separate project.getenvandfopenvia a compiler definitionstd::invoke_result_tfor >= C++17 in place ofstd::result_ofstd::result_ofwas deprecated in C++17 and removed in C++20size_ttointwarning with an explicit castjtiify2::CompilerProgramData::nvvmin the test suite#550-Dset but never used variablepch_verboseon windows(void)pch_verbose;does not prevent this being raised on windowsAdditionally, a few fixes are included for the CMakeLists.txt project in this repository for use under Windows / with Visual Studio 2022 as the generator.
-C/--build-configwith the target genex.I no longer / don't currently have the ability to check with VS2019 as github actions no longer provides an image with it, and I do not have a local install at the moment. We may also be dropping support as we cannot trivally use it via CI anymore.
There are still 2 outstanding issues I have not yet resolved, and some input would be helpful:
The trivially copyable kernel launch argument static assertion is failing for
TEST(Jitify2Test, ClassKernelArg)injitify2_test.cu, whereArgis used. I'm unsure on the correct fix for this as the static assertion looks sound to me.The test could be built with
JITIFY_IGNORE_NOT_TRIVIALLY_COPYABLE_ARGS=1to avoid this, but that feels wrong.With the above test disabled, a number of tests are failing in the jitify2 test suite. I've not yet attempted to resolve these, but will when I have time (ctest log).
I have also only tested this on Windows, so need to ensure I have not caused issues under linux either.
Todo
fd04575with GCC 12.3.0 and CUDA 12.9.86