This formats all copyright comments according to SPDX formatting guidelines.
Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
The new Nvidia drivers have a bug where the FastReplicateTo6 function produces a lookup into the REPLICATE_TO_8 table rather than the REPLICATE_TO_6 table.
This seems to be an optimization gone wrong. Combining the logic of the FastReplicate functions seems to address the bug.
This buffer was a list of EncodingData structures sorted by their bit length, with some duplication from the cpu decoder implementation.
We can take advantage of its sorted property to optimize its usage in the shader.
Thanks to wwylele for the optimization idea.
These changes should help in reducing crashes/drivers panics that may
occur due to synchronization issues between the shader completion and
later access of the decoded texture.
Per the spec, L1 is clamped to the value 0xff if it is greater than 0xff. An oversight caused us to take the maximum of L1 and 0xff, rather than the minimum.
Huge thanks to wwylele for finding this.
Co-Authored-By: Weiyi Wang <wwylele@gmail.com>
ASTC texture decoding is currently handled by a CPU decoder for GPU's without native ASTC decoding support (most desktop GPUs). This is the cause for noticeable performance degradation in titles which use the format extensively.
This commit adds support to accelerate ASTC decoding using a compute shader on OpenGL for GPUs without native support.