Release 1.3.1 (2025-05-12)
Compiler
- Fixed a bug in the handling of weak device-side symbols which broke the device binaries for certain projects.
- Fixed various PTX miscompilations.
- Added support for approximate-math PTX instructions (lg2.approx and friends).
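As an illustration of the kind of code the approximate-math item affects, here is a minimal sketch that issues `lg2.approx.f32` through inline PTX. The helper names (`approx_log2`, `approx_log2_kernel`, `run_approx_log2`) are hypothetical and not taken from the release notes, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: wraps the PTX lg2.approx.f32 instruction
// (approximate base-2 logarithm) using inline assembly.
__device__ float approx_log2(float x) {
    float result;
    asm("lg2.approx.f32 %0, %1;" : "=f"(result) : "f"(x));
    return result;
}

// Applies approx_log2 element-wise to n input values.
__global__ void approx_log2_kernel(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = approx_log2(in[i]);
}

// Host-side convenience wrapper that evaluates a single value on the device.
float run_approx_log2(float x) {
    float* d_in = nullptr;
    float* d_out = nullptr;
    float result = 0.0f;
    cudaMalloc(&d_in, sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, &x, sizeof(float), cudaMemcpyHostToDevice);
    approx_log2_kernel<<<1, 1>>>(d_in, d_out, 1);
    // This blocking copy is ordered after the kernel on the default stream.
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

Because the instruction is approximate, results should be compared against a tolerance rather than for exact equality.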
Library
- Fixed many small bugs in the device-side APIs.
- Per-thread-default-stream actually works now, rather than silently using the legacy stream.
- Fixed a race condition in the FFT library.
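For context on the per-thread-default-stream item, the sketch below shows the usage pattern that mode is meant for: several host threads each launching work on "the" default stream. In NVIDIA's nvcc the mode is selected at compile time with `--default-stream per-thread` (or `-DCUDA_API_PER_THREAD_DEFAULT_STREAM`); that SCALE accepts the same switch is an assumption here, not something these notes state. The names `fill`, `worker` and `run_workers` are hypothetical, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <thread>
#include <vector>

// Writes `value` into every element of `out`.
__global__ void fill(int* out, int value, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = value;
}

// Hypothetical worker: launches on the default stream of the calling thread.
// In per-thread mode each host thread gets its own default stream, so the
// workers do not serialize against each other on the legacy stream.
void worker(int value, int n, std::vector<int>* host_out) {
    int* d_out = nullptr;
    cudaMalloc(&d_out, n * sizeof(int));
    fill<<<(n + 255) / 256, 256>>>(d_out, value, n);
    host_out->resize(n);
    // Blocking copy, ordered after the kernel on this thread's default stream.
    cudaMemcpy(host_out->data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
}

// Runs num_threads workers concurrently; worker t fills its buffer with t + 1.
std::vector<std::vector<int>> run_workers(int num_threads, int n) {
    std::vector<std::vector<int>> results(num_threads);
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t)
        threads.emplace_back(worker, t + 1, n, &results[t]);
    for (auto& th : threads) th.join();
    return results;
}
```

The results are the same in either stream mode; only the degree of overlap between threads differs, which is why a silent fallback to the legacy stream is easy to miss without profiling.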
Third-party project demos
- GROMACS now works. SCALE appears to support a wider selection of AMD architectures than the HIP port, and seems to perform somewhat better (on MI210, at least!).