Commit graph

97 commits

Author SHA1 Message Date
gdkchan
87f238be60
Change packed aliasing formats to UInt (#6358) 2024-02-24 19:02:20 -03:00
gdkchan
167f50bbcd
Implement virtual buffer dependencies (#6190)
* Implement virtual buffer copies

* Introduce TranslateAndCreateMultiBuffersPhysicalOnly, use it for copy and clear

* Rename VirtualBufferCache to VirtualRangeCache

* Fix potential issue where virtual range could exist in the cache, without a physical buffer

* Fix bug that could cause copy with negative size on CopyToDependantVirtualBuffer

* Remove virtual copy back for SyncAction

* GetData XML docs

* Make field readonly

* Fix virtual buffer modification tracking

* Remove CopyFromDependantVirtualBuffers from ExternalFlush

* Move things around a little to avoid perf impact

- Inline null check for CopyFromDependantVirtualBuffers
- Remove extra method call for SynchronizeMemoryWithVirtualCopyBack, prefer calling CopyFromDependantVirtualBuffers separately

* Fix up XML doc

---------

Co-authored-by: riperiperi <rhy3756547@hotmail.com>
2024-02-22 11:03:07 -03:00
riperiperi
31ed061bea
Vulkan: Improve texture barrier usage, timing and batching (#6240)
* WIP barrier batch

* Add store op to image usage barrier

* Dispose the barrier batch

* Fix encoding?

* Handle read and write on the load op barrier.

Load op consumes read accesses but does not add one, as the only other operation that can read is another load.

* Simplify null check

* Insert barriers on program change in case stale bindings are reintroduced

* Not sure how I messed this one up

* Improve location of bindings barrier update

This is also important for emergency deferred clear

* Update src/Ryujinx.Graphics.Vulkan/BarrierBatch.cs

Co-authored-by: Mary Guillemard <thog@protonmail.com>

---------

Co-authored-by: Mary Guillemard <thog@protonmail.com>
2024-02-17 00:21:37 -03:00
gdkchan
e37735ed26
Implement X8Z24 texture format (#6315) 2024-02-15 19:06:26 -03:00
gdkchan
4a6724622e
Force CPU copy for non-identity DMA remap (#6293) 2024-02-10 15:38:58 -03:00
gdkchan
609de33b0b
Implement BGR10A2 render target format (#6273) 2024-02-08 19:52:38 +01:00
gdkchan
8bb7a3fc97
Clamp vertex buffer size to mapped size if too high (#6272)
* Clamp vertex buffer size to mapped size if too high

* Update comment
2024-02-08 18:27:12 +01:00
gdkchan
8927e0669f
Revert change to skip flush when range size is 0 (#6254) 2024-02-04 18:12:12 -03:00
gdkchan
bbed3b9926
Fix depth compare value for TLD4S shader instruction with offset (#6253)
* Fix depth compare value for TLD4S shader instruction with offset

* Shader cache version bump
2024-02-04 20:58:17 +01:00
gdkchan
4df22eb867
Fix missing data for new copy dependency textures with mismatching size (#6161) 2024-01-22 17:42:26 -03:00
gdkchan
f241f88558
Add a separate device memory manager (#6153)
* Add a separate device memory manager

* Still need this

* Device writes are always tracked

* Device writes are always tracked (2)

* Rename more instances of gmm to mm
2024-01-22 17:14:46 -03:00
gdkchan
870d9599cc
Change shader cache init wait method (#6131)
* Change shader cache init wait method

* Make field readonly
2024-01-18 14:17:38 -03:00
gdkchan
f4b74e9ce1
Fix vertex buffer size when switching between inline and state draw parameters (#6101)
* Fix vertex buffer size when switching between inline and state draw parameters

* Format whitespace
2024-01-14 09:37:19 +01:00
gdkchan
1df6c07f78
Implement support for multi-range buffers using Vulkan sparse mappings (#5427)
* Pass MultiRange to BufferManager

* Implement support for multi-range buffers using Vulkan sparse mappings

* Use multi-range for remaining buffers, delete old methods

* Assume that more buffers are contiguous

* Dispose multi-range buffers after they are removed from the list

* Properly init BufferBounds for constant and storage buffers

* Do not try reading zero bytes data from an unmapped address on the shader cache + PR feedback

* Fix misaligned sparse buffer offsets

* Null check can be simplified

* PR feedback
2023-12-04 20:30:19 +01:00
TSRBerry
2989c163a8
editorconfig: Set default encoding to UTF-8 (#5793)
* editorconfig: Add default charset

* Change file encoding from UTF-8-BOM to UTF-8
2023-12-04 14:17:13 +01:00
gdkchan
21cd4c0c00
Extend bindless elimination to see through shuffle (#5958)
* Extend bindless elimination to see through shuffle

* Shader cache version bump
2023-11-23 00:51:51 +01:00
gdkchan
70d65d3d8e
Enable copy dependency between RGBA8 and RGBA32 formats (#5943)
* Enable copy dependency between RGBA8 and RGBA32 formats

* Take packed flag into account for texture formats

* Account for widths not being a multiple of each other

* Don't try to alias depth textures as color, fix log condition

* PR feedback
2023-11-19 15:27:34 -03:00
gdkchan
0b58f46266
Extend bindless elimination to see through Phis with the same results (#5957)
* Extend bindless elimination to see through Phis with the same results

* Shader cache version bump
2023-11-19 15:10:44 -03:00
gdkchan
dcf10561b9
Fix missing texture flush for draw then DMA copy sequence without render target change (#5933)
* Unbind render targets before DMA copy

* Move DirtyAction to TextureGroupHandle

* Fix lost copy dependency bug

* XML doc
2023-11-15 21:36:25 -03:00
Zoltan Csizmadia
29e192f241
Migrate to .NET 8 (#5887)
* Change TargetFramework to net8.0

* Disable info messages

* Fix warings

* Disable additional analyzer messages

* Fix typo

* Add whitespace

* Fix ref vs in warnings

* Use explicit [In] on array parameters

* No need to guard Remove with Contains

* Use 'ArgumentOutOfRangeException.ThrowIf...' instead of explicitly throwing a new exception instance

* Bump .NET SDK version

* Enable JsonSerializerIsReflectionEnabledByDefault

* Use 8.0.100 GA release

* Bump System package versions

---------

Co-authored-by: Zoltan Csizmadia <Zoltan.Csizmadia@vericast.com>
2023-11-15 17:41:31 +01:00
gdkchan
5b3662b793
Disable DMA GPU copy for block linear to linear copies (#5927)
* Disable DMA GPU copy for block linear to linear copies

* Simplify check

* PR feedback
2023-11-14 23:24:42 -03:00
gdkchan
1329c47ea4
Work around issue apparently caused by 5909 (#5926) 2023-11-14 22:24:54 -03:00
gdkchan
e6e5838916
Do not set modified flag again if texture was not modified (#5909)
* Do not set modified flag again if texture was not modified

* Formatting

* Fix copy dep regression
2023-11-13 18:07:05 -03:00
gdkchan
841dd56f4c
Implement copy dependency for depth and color textures (#4365)
* Implement copy dependency for depth and color textures

* Revert changes added because R32 <-> D32 copies were illegal

* Restore depth alias matches
2023-10-31 19:00:39 -03:00
gdkchan
9ef0be477b
Skip some invalid texture flushes (#5755) 2023-10-30 23:18:28 +01:00
riperiperi
76b53e018a
GPU: Add fallback when textureGatherOffsets is not supported (#5792)
* GPU: Add fallback when textureGatherOffsets is not supported.

This PR adds a fallback for GPUs or APIs that don't support an equivalent to the method `textureGatherOffsets`, where each of the 4 gathered texels has an individual offset. This is done by reusing the existing code to handle non-const offsets for texture instructions, though it has also been corrected as there were a few implementation issues.

MoltenVK reports support for this capability, and it didn't error when we initially released the MacOS build, but that has since changed. MVK still reports support, but spirv-cross has been fixed in a way that it _attempts_ to use this capability, but the metal compiler errors since it doesn't exist.

Some other fixes:
- textureGatherOffsets emulation has been changed significantly. It now uses 4 texture sample instructions (not gather), calculates a base texel (i=0 j=0) and adds the offsets onto it before converting into a tex coord. The final result is offset into a texel center, so it shouldn't be subject to interpolation, though this isn't perfect and could have some error with floating point formats with linear sampling. It is subject to texture wrap mode as it should be, which is why texelFetch was not used.
  - Maybe gather should be used here with component `w` (i=0, j=0), though this multiplies number of texels fetched by 4... The way it was doing this before _was_ wrong_, but doing it right would avoid issues with texel center precision.
- textureGatherOffset (singular) now performs textureGather with the offset applied to the coords, rather than the slower fallback where each texel is fetched individually.

* Increment shader cache version, remove unused arg

* Use base texture size for gather coord offset.

Implicit LOD for gather is not supported.

* Use 4 texture gathers for offsets emulation

Avoids issues with interpolation at cost of performance

(not sure how bad this is)

* Address Feedback
2023-10-20 15:05:09 +02:00
gdkchan
28dd7d80af
Enable copy between MS and non-MS textures with different height (#5801) 2023-10-18 04:47:22 +00:00
riperiperi
f460ecc182
GPU: Add HLE macros for popular NVN macros (#5761)
* GPU: Add HLE macros for popular NVN macros

* Remove non-vector equality check

The case where it's not hardware accelerated will do the check integer-wise anyways.

* Whitespace 😔

* Address Feedback
2023-10-06 19:55:07 -03:00
gdkchan
b6ac45d36d
Fix SPIR-V call out arguments regression (#5767)
* Fix SPIR-V call out arguments regression

* Shader cache version bump
2023-10-06 00:18:30 -03:00
gdkchan
0aceb534cb
Fix SPIR-V function calls (#5764)
* Fix SPIR-V function calls

* Shader cache version bump
2023-10-04 21:35:26 -03:00
gdkchan
a0af6e4d07
Use unique temporary variables for function call parameters on SPIR-V (#5757)
* Use unique temporary variables for function call parameters on SPIR-V

* Shader cache version bump
2023-10-04 19:46:11 -03:00
gdkchan
a2a97e1b11
Implement textureSamples texture query shader instruction (#5750)
* Implement textureSamples texture query shader instruction

* Shader cache version bump
2023-10-03 22:43:11 +00:00
riperiperi
e63157cc33
GPU: Don't create tracking handles for buffer textures (#5727)
* GPU: Don't create tracking handles for buffer textures

Buffer texture memory is handled by the buffer cache - the texture shouldn't create any tracking handles as they aren't used. This change simply makes them create and iterate 0 tracking handles, while keeping the rest of the texture group around.

This prevents a possible issue where many buffer textures are created as views of overlapping buffer ranges, and virtual regions have many dependant textures that don't actually contribute anything to handle state.

Should improve performance in Mortal Kombat 1, possibly certain UE4 games when FIFO raises to 100%.

* Fix interval tree bug

* Don't check view compatibility for buffer textures
2023-09-26 12:37:10 -03:00
riperiperi
f6c3f1cdfd
GPU: Discard data when getting texture before full clear (#5719)
* GPU: Discard data when getting texture before full clear

* Fix rules and order of clear checks

* Fix formatting
2023-09-25 23:07:03 +02:00
gdkchan
a745913329
Fix gl_Layer to geometry shader change not writing gl_Layer (#5682)
* Fix gl_Layer to geometry shader change not writing gl_Layer

* Shader cache version bump
2023-09-14 14:53:53 -03:00
gdkchan
e2cfe6fe44
Fix shader GlobalToStorage pass when base address comes from local or shared memory (#5668)
* Fix shader GlobalToStorage pass when base address comes from local or shared memory

* Shader cache version bump
2023-09-11 01:22:18 +00:00
gdkchan
ddb6493896
Delete ResourceAccess (#5626)
* Delete ResourceAccess

* Set write flag for vertex/geometry as compute output buffers
2023-09-05 22:59:21 +02:00
gdkchan
f09bba82b9
Geometry shader emulation for macOS (#5551)
* Implement vertex and geometry shader conversion to compute

* Call InitializeReservedCounts for compute too

* PR feedback

* Set clip distance mask for geometry and tessellation shaders too

* Transform feedback emulation only for vertex
2023-08-29 21:10:34 -03:00
riperiperi
cd7b52f995
Vulkan: Fix MoltenVK flickering (#5612)
#5576 changed where the position was declared, but forgot to add the Invariant declaration to position when the ReducedPrecision flag was enabled. This was causing weird graphical bugs in a bunch of games, mostly to do with mismatching depth between multiple draws of the same geometry.

Maybe the attempt to add it to Position in DeclareInputOrOutput can be removed now, assuming that path is never used.
2023-08-23 16:40:25 -03:00
gdkchan
6ed613a6e6
Fix vote and shuffle shader instructions on AMD GPUs (#5540)
* Move shuffle handling out of the backend to a transform pass

* Handle subgroup sizes higher than 32

* Stop using the subgroup size control extension

* Make GenerateShuffleFunction static

* Shader cache version bump
2023-08-16 21:31:07 -03:00
gdkchan
17354d59d1
Declare and use gl_PerVertex block for VTG per-vertex built-ins (#5576)
* Declare and use gl_PerVertex block for VTG per-vertex built-ins

* Shader cache version bump
2023-08-16 23:16:25 +02:00
riperiperi
511b558ddc
GPU: Add Z16RUnormGUintBUintAUint Format (#5578)
This format seems to be an alias for Z16Unorm used by OpenGL games.
2023-08-16 23:02:53 +02:00
gdkchan
effd546331
Implement scaled vertex format emulation (#5564)
* Implement scaled vertex format emulation

* Auto-format (whitespace)

* Delete ToVec4Type
2023-08-16 08:30:33 -03:00
riperiperi
492a046335
Vulkan: Buffer Mirrors for MacOS performance (#4899)
* Initial implementation of buffer mirrors

Generally slower right now, goal is to reduce render passes in games that do inline updates

Fix support buffer mirrors

Reintroduce vertex buffer mirror

Add storage buffer support

Optimisation part 1

More optimisation

Avoid useless data copies.

Remove unused cbIndex stuff

Properly set write flag for storage buffers.

Fix minor issues

Not sure why this was here.

Fix BufferRangeList

Fix some big issues

Align storage buffers rather than getting full buffer as a range

Improves mirrorability of read-only storage buffers

Increase staging buffer size, as it now contains mirrors

Fix some issues with buffers not updating

Fix buffer SetDataUnchecked offset for one of the paths when using mirrors

Fix buffer mirrors interaction with buffer textures

Fix mirror rebinding

Move GetBuffer calls on indirect draws before BeginRenderPass to avoid draws without render pass

Fix mirrors rebase

Fix rebase 2023

* Fix crash when using stale vertex buffer

Similar to `Get` with a size that's too large, just treat it as a clamp.

* Explicitly set support buffer as mirrorable

* Address feedback

* Remove unused fragment of MVK workaround

* Replace logging for staging buffer OOM

* Address format issues

* Address more format issues

* Mini cleanup

* Address more things

* Rename BufferRangeList

* Support bounding range for ClearMirrors and UploadPendingData

* Add maximum size for vertex buffer mirrors

* Enable index buffer mirrors

Enabled on all platforms for the IbStreamer.

* Feedback

* Remove mystery BufferCache change

Probably macos related?

* Fix mirrors not creating when staging buffer is empty.

* Change log level to debug
2023-08-14 14:18:47 -03:00
gdkchan
550fd4a733
Simplify resolution scale updates (#5541) 2023-08-14 13:57:39 -03:00
riperiperi
33f544fd92
GPU: Track basic buffer copies that modify texture memory (#5554)
This branch changes the buffer copy fast path to notify memory tracking for all resources that aren't buffers. This fixes cases where games would copy buffer data directly into texture memory, which before would only work if the texture did not already exist. I imagine this happens when the guest driver is moving data between allocations or uploading it.

Since this only affects the fast path, cases where the source data has been modified from GPU (fast path copy destination doesn't count) will still fail to notify the texture, though I don't imagine games will do this. This should be resolved in future.

This should fix some texture issues with guest OpenGL games on switch, such as Dragon Quest Builders.

This may also be useful in future for games that move shader data around memory, if we end up using memory tracking for those.
2023-08-14 08:41:11 +02:00
gdkchan
b423197619
Delete ShaderConfig and organize shader resources/definitions better (#5509)
* Move some properties out of ShaderConfig

* Stop using ShaderConfig on backends

* Replace ShaderConfig usages on Translator and passes

* Move remaining properties out of ShaderConfig and delete ShaderConfig

* Remove ResourceManager property from TranslatorContext

* Move Rewriter passes to separate transform pass files

* Fix TransformPasses.RunPass on cases where a node is removed

* Move remaining ClipDistancePrimitivesWritten and UsedFeatures updates to decode stage

* Reduce excessive parameter passing a bit by using structs more

* Remove binding parameter from ShaderProperties methods since it is redundant

* Replace decoder instruction checks with switch statement

* Put GLSL on the same plan as SPIR-V for input/output declaration

* Stop mutating TranslatorContext state when Translate is called

* Pass most of the graphics state using a struct instead of individual query methods

* Auto-format

* Auto-format

* Add backend logging interface

* Auto-format

* Remove unnecessary use of interpolated strings

* Remove more modifications of AttributeUsage after decode

* PR feedback

* gl_Layer is not supported on compute
2023-08-13 22:26:42 -03:00
jcm
773e239db7
Implement color space passthrough option (#5531)
Co-authored-by: jcm <butt@butts.com>
2023-08-07 18:54:05 +01:00
gdkchan
42750a74f8
Do not add more code after alpha test discard on fragment shader (#5529)
* Do not add more code after alpha test discard on fragment shader

* Shader cache version bump
2023-08-07 12:20:37 -03:00
riperiperi
6e784e0aca
GPU: Don't sync/bind index buffer when it's not in use (#5526)
* GPU: Don't sync/bind index buffer when it's not in use

Sometimes draws don't use an index buffer. It's not necessary to check or upload data for the current index buffer binding as it won't be used.

This fixes Pokemon: Legends Arceus updating a stale index buffer for every draw during its TFB pass, which was all non-indexed draws.

This probably didn't cost much on normal PCs, but it had a large impact on MacOS, which the macos1 release build avoided by mirroring index buffers (the PR currently does not). Needs buffer mirrors still for the rest of the performance.

There are additional cases where index buffers are bound or checked with non-indexed draws on the backend, but this one was straightforward to fix and has the largest impact. Testing is welcome to ensure nothing weird broke.

* Fix case with _rebind
2023-08-06 16:29:20 -03:00