This fixes an issue where the render scale array would not be updated when technically the scales on the flat array were the same, but the start index for the vertex scales was different.
* Add support for render scale to vertex stage.
Occasionally games read off textureSize on the vertex stage to inform the fragment shader what size a texture is without querying in there. Scales were not present in the vertex shader to correct the sizes, so games were providing the raw upscaled texture size to the fragment shader, which was incorrect.
One downside is that the fragment and vertex support buffer description must be identical, so the full size scales array must be defined when used. I don't think this will have an impact though. Another is that the fragment texture count must be updated when vertex shader textures are used. I'd like to correct this so that the update is folded into the update for the scales.
Also cleans up a bunch of things, like it making no sense to call CommitRenderScale for each stage.
Fixes render scale causing a weird offset bloom in Super Mario Party and Clubhouse Games. Clubhouse Games still has a pixelated look in a number of its games due to something else it does in the shader.
* Split out support buffer update, lazy updates.
* Commit support buffer before compute dispatch
* Remove unnecessary qualifier.
* Address Feedback
* Limit Custom Anisotropic Filtering to only fully mipmapped textures
There's a major flaw with the anisotropic filtering setting that causes @GamerzHell9137 to report graphical bugs that otherwise wouldn't be there, because he just won't set it to Auto. This should fix those issues, hopefully.
These bugs are generally because anisotropic filtering is enabled on something that it shouldn't be, such as a post process filter or some data texture. This PR maintains two host samplers when custom AF is enabled, and only uses the forced AF one when the texture is 2d and fully mipmapped (goes down to 1x1). This is because game textures are the ideal target for this filtering, and they are typically fully mipmapped, unlike things like screen render targets which usually have 1 or just a few levels.
This also only enables AF on mipmapped samplers where the filtering is bilinear or trilinear. This should be self explanatory.
This PR also allows the changing of Anisotropic Filtering at runtime, and you can immediately see the changes. All samplers are flushed from the cache if the setting changes, causing them to be recreated with the new custom AF value. This brings it in line with our resolution scale. 😌
* Expected minimum mip count for large textures rather than all, address feedback
* Use Target rather than Info.Target
* Retrigger build?
* Fix rebase
* Implement DrawTexture functionality
* Non-NVIDIA support
* Disable some features that should not affect draw texture (slow path)
* Remove space from shader source
* Match 2D engine names
* Fix resolution scale and add missing XML docs
* Disable transform feedback for draw texture fallback
* Only reupload the texture scale array if it changes.
Before, this would be called all the time if any shader needed a scale value. The cost of doing this has increased with threaded-gal, as the scale array is copied to a span pool, and it's was called on pretty much every draw sometimes.
This improves GPU performance in games, scaled or not. Most affected game seems to be Xenoblade Chronicles: Definitive Edition.
* Just use = instead of |=
* 3D engine now uses DeviceState too, plus new state modification tracking
* Remove old methods code
* Remove GpuState and friends
* Optimize DeviceState, force inline some functions
* This change was not supposed to go in
* Proper channel initialization
* Optimize state read/write methods even more
* Fix debug build
* Do not dirty state if the write is redundant
* The YControl register should dirty either the viewport or front face state too, to update the host origin
* Avoid redundant vertex buffer updates
* Move state and get rid of the Ryujinx.Graphics.Gpu.State namespace
* Comments and nits
* Fix rebase
* PR feedback
* Move changed = false to improve codegen
* PR feedback
* Carry RyuJIT a bit more
* Use DeviceState for compute and i2m
* Migrate 2D class, more comments
* Migrate DMA copy engine
* Remove now unused code
* Replace GpuState by GpuAccessorState on GpuAcessor, since compute no longer has a GpuState
* More comments
* Add logging (disabled)
* Add back i2m on 3D engine
* Make GPU memory manager a member of GPU channel
* Move physical memory instance to the memory manager, and the caches to the physical memory
* PR feedback
* Ground work for separate GPU channels
* Rename TextureManager to TextureCache
* Decouple texture bindings management from the texture cache
* Rename BufferManager to BufferCache
* Decouple buffer bindings management from the buffer cache
* More comments and proper disposal
* PR feedback
* Force host state update on channel switch
* Typo
* PR feedback
* Missing using
Before this, all samplers instance were leaking on exit because the
dispose method was never getting called.
This fix this issue by making TextureBindingsManager disposable and
calling the dispose method in the TextureManager.
* Pass CbufSlot when getting info from the texture descriptor
Fixes some issues with bindless textures, when CbufSlot is not equal to the current TextureBufferIndex.
Specifically fixes a random chance of full screen colour flickering in Super Mario Party.
* Apply suggestions from code review
Oops
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Improve Buffer Textures and flush Image Stores
Fixes a number of issues with buffer textures:
- Reworked Buffer Textures to create their buffers in the TextureManager, then bind them with the BufferManager later.
- Fixes an issue where a buffer texture's buffer could be invalidated after it is bound, but before use.
- Fixed width unpacking for large buffer textures. The width is now 32-bit rather than 16.
- Force buffer textures to be rebound whenever any buffer is created, as using the handle id wasn't reliable, and the cost of binding isn't too high.
Fixes vertex explosions and flickering animations in UE4 games.
* Set ImageStore flag... for ImageStore.
* Check the offset and size.
* Support for resources on non-contiguous GPU memory regions
* Implement MultiRange physical addresses, only used with a single range for now
* Actually use non-contiguous ranges
* GetPhysicalRegions fixes
* Documentation and remove Address property from TextureInfo
* Finish implementing GetWritableRegion
* Fix typo
* Initial implementation of Render Target Scaling
Works with most games I have. No GUI option right now, it is hardcoded.
Missing handling for texelFetch operation.
* Realtime Configuration, refactoring.
* texelFetch scaling on fragment shader (WIP)
* Improve Shader-Side changes.
* Fix potential crash when no color/depth bound
* Workaround random uses of textures in compute.
This was blacklisting textures in a few games despite causing no bugs. Will eventually add full support so this doesn't break anything.
* Fix scales oscillating when changing between non-native scales.
* Scaled textures on compute, cleanup, lazier uniform update.
* Cleanup.
* Fix stupidity
* Address Thog Feedback.
* Cover most of GDK's feedback (two comments remain)
* Fix bad rename
* Move IsDepthStencil to FormatExtensions, add docs.
* Fix default config, square texture detection.
* Three final fixes:
- Nearest copy when texture is integer format.
- Texture2D -> Texture3D copy correctly blacklists the texture before trying an unscaled copy (caused driver error)
- Discount small textures.
* Remove scale threshold.
Not needed right now - we'll see if we run into problems.
* All CPU modification blacklists scale.
* Fix comment.
* Support separate textures and samplers
* Add missing bindless flag, fix SNORM format on buffer textures
* Add missing separation
* Add comments about the new handles