Release 1.6.0 (2026-03-13)

Supported architectures

clangd now works, providing meaningful diagnostics for both the host and device sides of CUDA translation units, in either clang or nvcc dialect mode.
Fixed miscompiles of constexpr-evaluated expressions involving errors.
Fix an edge case of SFINAE behaving differently from NVIDIA nvcc.
Fix various compiler crashes relating to diagnostic deferral.
Fix a compiler crash caused by attempting to convert an i1 to bf16.
Various performance improvements.
Don't reject __launch_bounds__ expressions containing commas.
Added an optimisation to detect cumsums and lower them using DPP/PERMLANE.
CUDA printf format arguments are now type-checked against the format string specifiers.
PTX diagnostics now prettily underline the offending element.
Slightly improved compile times for all CUDA translation units.
The elect instruction no longer crashes the compiler.
PTX parser no longer rejects labels immediately followed by }.
Newly-supported NVCC flags:
- --gpu-architecture
- --gpu-code
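
As an illustration of two of the items above (commas inside __launch_bounds__ and printf format checking), here is a minimal sketch. The Tile helper and the scale_rows kernel are hypothetical names made up for this example, and a template argument list is only one plausible way for a comma to end up inside the __launch_bounds__ argument; the release notes do not specify which forms were previously rejected.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: a tile shape known at compile time.
template <int Rows, int Cols>
struct Tile {
    static constexpr int threads = Rows * Cols;
};

// The template argument list "16, 16" places a comma inside the
// __launch_bounds__ argument (assumed example of such an expression).
__global__ void __launch_bounds__(Tile<16, 16>::threads)
scale_rows(const float* in, float* out, int n, float factor)
{
    const unsigned i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < static_cast<unsigned>(n)) {
        out[i] = in[i] * factor;
        if (i == 0) {
            // %u matches the unsigned index; %f matches the float,
            // which is promoted to double when passed to printf.
            printf("element %u -> %f\n", i, out[i]);
            // Passing out[i] to a %d specifier instead would be the kind
            // of mismatch that format checking is meant to catch.
        }
    }
}

int main()
{
    const int n = 1024;
    float* in = nullptr;
    float* out = nullptr;
    if (cudaMalloc(&in, n * sizeof(float)) != cudaSuccess) return 1;
    if (cudaMalloc(&out, n * sizeof(float)) != cudaSuccess) return 1;
    cudaMemset(in, 0, n * sizeof(float));  // all-zero input keeps the sketch short

    // Launch with exactly the block size promised to the compiler.
    const int block = Tile<16, 16>::threads;
    const int grid = (n + block - 1) / block;
    scale_rows<<<grid, block>>>(in, out, n, 2.0f);

    const cudaError_t err = cudaDeviceSynchronize();
    cudaFree(in);
    cudaFree(out);
    return err == cudaSuccess ? 0 : 1;
}
```

Deriving the bound from the same constant that the host uses for the block size keeps the two from drifting apart, which is why such expressions tend to contain template argument lists in the first place.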