Commit graph

63 commits

Author SHA1 Message Date
gdkchan
a745913329
Fix gl_Layer to geometry shader change not writing gl_Layer (#5682)
* Fix gl_Layer to geometry shader change not writing gl_Layer

* Shader cache version bump
2023-09-14 14:53:53 -03:00
gdkchan
e2cfe6fe44
Fix shader GlobalToStorage pass when base address comes from local or shared memory (#5668)
* Fix shader GlobalToStorage pass when base address comes from local or shared memory

* Shader cache version bump
2023-09-11 01:22:18 +00:00
gdkchan
ddb6493896
Delete ResourceAccess (#5626)
* Delete ResourceAccess

* Set write flag for vertex/geometry as compute output buffers
2023-09-05 22:59:21 +02:00
gdkchan
f09bba82b9
Geometry shader emulation for macOS (#5551)
* Implement vertex and geometry shader conversion to compute

* Call InitializeReservedCounts for compute too

* PR feedback

* Set clip distance mask for geometry and tessellation shaders too

* Transform feedback emulation only for vertex
2023-08-29 21:10:34 -03:00
riperiperi
cd7b52f995
Vulkan: Fix MoltenVK flickering (#5612)
#5576 changed where the position was declared, but forgot to add the Invariant declaration to position when the ReducedPrecision flag was enabled. This was causing weird graphical bugs in a bunch of games, mostly to do with mismatching depth between multiple draws of the same geometry.

Maybe the attempt to add it to Position in DeclareInputOrOutput can be removed now, assuming that path is never used.
2023-08-23 16:40:25 -03:00
gdkchan
6ed613a6e6
Fix vote and shuffle shader instructions on AMD GPUs (#5540)
* Move shuffle handling out of the backend to a transform pass

* Handle subgroup sizes higher than 32

* Stop using the subgroup size control extension

* Make GenerateShuffleFunction static

* Shader cache version bump
2023-08-16 21:31:07 -03:00
gdkchan
17354d59d1
Declare and use gl_PerVertex block for VTG per-vertex built-ins (#5576)
* Declare and use gl_PerVertex block for VTG per-vertex built-ins

* Shader cache version bump
2023-08-16 23:16:25 +02:00
riperiperi
511b558ddc
GPU: Add Z16RUnormGUintBUintAUint Format (#5578)
This format seems to be an alias for Z16Unorm used by OpenGL games.
2023-08-16 23:02:53 +02:00
gdkchan
effd546331
Implement scaled vertex format emulation (#5564)
* Implement scaled vertex format emulation

* Auto-format (whitespace)

* Delete ToVec4Type
2023-08-16 08:30:33 -03:00
riperiperi
492a046335
Vulkan: Buffer Mirrors for MacOS performance (#4899)
* Initial implementation of buffer mirrors

Generally slower right now, goal is to reduce render passes in games that do inline updates

Fix support buffer mirrors

Reintroduce vertex buffer mirror

Add storage buffer support

Optimisation part 1

More optimisation

Avoid useless data copies.

Remove unused cbIndex stuff

Properly set write flag for storage buffers.

Fix minor issues

Not sure why this was here.

Fix BufferRangeList

Fix some big issues

Align storage buffers rather than getting full buffer as a range

Improves mirrorability of read-only storage buffers

Increase staging buffer size, as it now contains mirrors

Fix some issues with buffers not updating

Fix buffer SetDataUnchecked offset for one of the paths when using mirrors

Fix buffer mirrors interaction with buffer textures

Fix mirror rebinding

Move GetBuffer calls on indirect draws before BeginRenderPass to avoid draws without render pass

Fix mirrors rebase

Fix rebase 2023

* Fix crash when using stale vertex buffer

Similar to `Get` with a size that's too large, just treat it as a clamp.

* Explicitly set support buffer as mirrorable

* Address feedback

* Remove unused fragment of MVK workaround

* Replace logging for staging buffer OOM

* Address format issues

* Address more format issues

* Mini cleanup

* Address more things

* Rename BufferRangeList

* Support bounding range for ClearMirrors and UploadPendingData

* Add maximum size for vertex buffer mirrors

* Enable index buffer mirrors

Enabled on all platforms for the IbStreamer.

* Feedback

* Remove mystery BufferCache change

Probably macos related?

* Fix mirrors not creating when staging buffer is empty.

* Change log level to debug
2023-08-14 14:18:47 -03:00
gdkchan
550fd4a733
Simplify resolution scale updates (#5541) 2023-08-14 13:57:39 -03:00
riperiperi
33f544fd92
GPU: Track basic buffer copies that modify texture memory (#5554)
This branch changes the buffer copy fast path to notify memory tracking for all resources that aren't buffers. This fixes cases where games would copy buffer data directly into texture memory, which before would only work if the texture did not already exist. I imagine this happens when the guest driver is moving data between allocations or uploading it.

Since this only affects the fast path, cases where the source data has been modified from GPU (fast path copy destination doesn't count) will still fail to notify the texture, though I don't imagine games will do this. This should be resolved in future.

This should fix some texture issues with guest OpenGL games on switch, such as Dragon Quest Builders.

This may also be useful in future for games that move shader data around memory, if we end up using memory tracking for those.
2023-08-14 08:41:11 +02:00
gdkchan
b423197619
Delete ShaderConfig and organize shader resources/definitions better (#5509)
* Move some properties out of ShaderConfig

* Stop using ShaderConfig on backends

* Replace ShaderConfig usages on Translator and passes

* Move remaining properties out of ShaderConfig and delete ShaderConfig

* Remove ResourceManager property from TranslatorContext

* Move Rewriter passes to separate transform pass files

* Fix TransformPasses.RunPass on cases where a node is removed

* Move remaining ClipDistancePrimitivesWritten and UsedFeatures updates to decode stage

* Reduce excessive parameter passing a bit by using structs more

* Remove binding parameter from ShaderProperties methods since it is redundant

* Replace decoder instruction checks with switch statement

* Put GLSL on the same plan as SPIR-V for input/output declaration

* Stop mutating TranslatorContext state when Translate is called

* Pass most of the graphics state using a struct instead of individual query methods

* Auto-format

* Auto-format

* Add backend logging interface

* Auto-format

* Remove unnecessary use of interpolated strings

* Remove more modifications of AttributeUsage after decode

* PR feedback

* gl_Layer is not supported on compute
2023-08-13 22:26:42 -03:00
jcm
773e239db7
Implement color space passthrough option (#5531)
Co-authored-by: jcm <butt@butts.com>
2023-08-07 18:54:05 +01:00
gdkchan
42750a74f8
Do not add more code after alpha test discard on fragment shader (#5529)
* Do not add more code after alpha test discard on fragment shader

* Shader cache version bump
2023-08-07 12:20:37 -03:00
riperiperi
6e784e0aca
GPU: Don't sync/bind index buffer when it's not in use (#5526)
* GPU: Don't sync/bind index buffer when it's not in use

Sometimes draws don't use an index buffer. It's not necessary to check or upload data for the current index buffer binding as it won't be used.

This fixes Pokemon: Legends Arceus updating a stale index buffer for every draw during its TFB pass, which was all non-indexed draws.

This probably didn't cost much on normal PCs, but it had a large impact on MacOS, which the macos1 release build avoided by mirroring index buffers (the PR currently does not). Needs buffer mirrors still for the rest of the performance.

There are additional cases where index buffers are bound or checked with non-indexed draws on the backend, but this one was straightforward to fix and has the largest impact. Testing is welcome to ensure nothing weird broke.

* Fix case with _rebind
2023-08-06 16:29:20 -03:00
gdkchan
f95b7c5877
Fix incorrect fragment origin when YNegate is enabled (#4673)
* Fix incorrect fragment origin when YNegate is enabled

* Shader cache version bump

* Do not update support buffer if shader does not read gl_FragCoord

* Pass unscaled viewport size to the support buffer
2023-07-29 18:47:03 -03:00
TSRBerry
eb528ae0f0
Add workflow to automatically check code style issues for PRs (#4670)
* Add workflow to perform automated checks for PRs

* Downgrade Microsoft.CodeAnalysis to 4.4.0

This is a workaround to fix issues with dotnet-format.
See:
- https://github.com/dotnet/format/issues/1805
- https://github.com/dotnet/format/issues/1800

* Adjust editorconfig to be more compatible with Ryujinx code-style

* Adjust .editorconfig line endings to match .gitattributes

* Disable 'prefer switch expression' rule

* Remove naming styles

These are the default rules, so we don't need to override them.

* Silence IDE0060 in .editorconfig

* Slightly adjust .editorconfig

* Add lost workflow changes

* Move .editorconfig comment to the top

* .editorconfig: private static readonly fields should be _lowerCamelCase

* .editorconfig: Remove alignment for declarations as well

* editorconfig: Add rule for local constants

* Disable CA1822 for HLE services

* Disable CA1822 for ViewModels

Bindings won't work with static members, but this issue is silently ignored.

* Run dotnet format for the whole solution

* Check result code of SDL_GetDisplayBounds

* Fix dotnet format style issues

* Add missing trailing commas

* Update Microsoft.CodeAnalysis.CSharp to 4.6.0

Skipping 4.5.0 since it breaks dotnet format

* Restore old default naming rules for dotnet format

* Add naming rule exception for CPU tests

* checks: Include all files before excluding paths

* Fix dotnet format issues

* Check dotnet format version

* checks: Run dotnet format with severity info again

* checks: Disable naming style rules until they won't crash the process anymore

* Remove unread private member

* checks: Attempt to run analyzers 3 times before giving up

* checks: Enable naming style rules again with the new retry logic
2023-07-24 18:35:04 +02:00
gdkchan
9c6071a645
Move support buffer update out of the backends (#5411)
* Move support buffer update out of the backends

* Fix render scale init and remove redundant state from SupportBufferUpdater

* Stop passing texture scale to the backends

* XML docs for SupportBufferUpdater
2023-07-11 14:07:41 -03:00
gdkchan
1c7a90ef35
Stop identifying shader textures with handle and cbuf, use binding instead (#5266)
* Stop identifying shader textures with handle and cbuf, use binding instead

* Remove now unused code

* Consider image operations as having accurate type information too

I don't know why that was not the case before

* Fix missing unscale on InsertCoordNormalization, stop calling SetUsageFlagsForTextureQuery when not needed

* Shader cache version bump

* Change get texture methods to return descriptors created from ResourceManager state

 This is required to ensure that reserved textures and images will not be bound as a guest texture/image

* Fix BindlessElimination.SetHandle inserting coords at the wrong place
2023-07-03 14:29:27 -03:00
TSRBerry
3b46bb73f7
[Ryujinx.Graphics.Gpu] Address dotnet-format issues (#5367)
* dotnet format style --severity info

Some changes were manually reverted.

* dotnet format analyzers --serverity info

Some changes have been minimally adapted.

* Restore a few unused methods and variables

* Silence dotnet format IDE0060 warnings

* Silence dotnet format IDE0052 warnings

* Address dotnet format CA1816 warnings

* Address or silence dotnet format CA1069 warnings

* Address or silence dotnet format CA2211 warnings

* Address remaining dotnet format analyzer warnings

* Address review comments

* Address most dotnet format whitespace warnings

* Apply dotnet format whitespace formatting

A few of them have been manually reverted and the corresponding warning was silenced

* Format if-blocks correctly

* Run dotnet format whitespace after rebase

* Run dotnet format style after rebase

* Another rebase, another dotnet format run

* Run dotnet format style after rebase

* Run dotnet format after rebase and remove unused usings

- analyzers
- style
- whitespace

* Disable 'prefer switch expression' rule

* Add comments to disabled warnings

* Remove a few unused parameters

* Replace MmeShadowScratch with Array256<uint>

* Simplify properties and array initialization, Use const when possible, Remove trailing commas

* Start working on disabled warnings

* Fix and silence a few dotnet-format warnings again

* Run dotnet format after rebase

* Address IDE0251 warnings

* Silence IDE0060 in .editorconfig

* Revert "Simplify properties and array initialization, Use const when possible, Remove trailing commas"

This reverts commit 9462e4136c0a2100dc28b20cf9542e06790aa67e.

* dotnet format whitespace after rebase

* First pass of dotnet format

* Add unsafe dotnet format changes

* Fix typos

* Add trailing commas

* Disable formatting for FormatTable

* Address review feedback
2023-07-02 02:47:54 +02:00
TSRBerry
fbaf62c230
Apply new naming rule to all projects except Vp9 (#5407) 2023-06-28 01:18:19 +02:00
Marco Carvalho
7608cb37ab
"Exists" method should be used instead of the "Any" extension (#5345) 2023-06-23 01:37:25 +02:00
Kurochi51
d604e98227
Fix regression introduced by 1.1.1733 on Intel GPUs (#5311)
* Fix regression introduced by 1.1733 on Intel iGPUs

* Should have actually figured the variable, oops.

* maybe something goes wrong here? honestly lost

* Shader cache bump
2023-06-22 21:35:06 +02:00
gdkchan
f92921a6d1
Implement Load/Store Local/Shared and Atomic shared using new instructions (#5241)
* Implement Load/Store Local/Shared and Atomic shared using new instructions

* Remove now unused code

* Fix base offset register overwrite

* Fix missing storage buffer set index when generating GLSL for Vulkan

* Shader cache version bump

* Remove more unused code

* Some PR feedback
2023-06-15 17:31:53 -03:00
Marco Carvalho
82f90704a0
Blocks should be synchronized on read-only fields (#5212)
* Blocks should be synchronized on read-only fields

* more readonlys

* fix alignment

* more

* Update ISelfController.cs

* simplify new

* simplify new
2023-06-15 00:34:55 +00:00
gdkchan
eb0bb36bbf
Implement transform feedback emulation for hardware without native support (#5080)
* Implement transform feedback emulation for hardware without native support

* Stop doing some useless buffer updates and account for non-zero base instance

* Reduce redundant updates even more

* Update descriptor init logic to account for ResourceLayout

* Fix transform feedback and storage buffers not being updated in some cases

* Shader cache version bump

* PR feedback

* SetInstancedDrawVertexCount must be always called after UpdateState

* Minor typo
2023-06-10 18:31:38 -03:00
Marco Carvalho
86de288142
Removing shift by 0 (#5249)
* Integral numbers should not be shifted by zero or more than their number of bits-1

* more
2023-06-09 11:23:44 +02:00
gdkchan
2cdcfe46d8
Remove barrier on Intel if control flow is potentially divergent (#5044)
* Remove barrier on Intel if control flow is potentially divergent

* Shader cache version bump
2023-06-08 17:43:16 -03:00
gdkchan
fe30c03cac
Implement soft float64 conversion on shaders when host has no support (#5159)
* Implement soft float64 conversion on shaders when host has no support

* Shader cache version bump

* Fix rebase error
2023-06-08 17:09:14 -03:00
gdkchan
af1906ea04
Fix wrong unaligned SB state when fetching compute shaders (#5223) 2023-06-05 14:01:33 +02:00
riperiperi
d2f3adbf69
Texture: Fix layout conversion when gobs in z is used with depth = 1 (#5220)
* Texture: Fix layout conversion when gobs in z is used with depth = 1

The size calculator methods deliberately reduce the gob size of textures if they are deemed too small for it. This is required to get correct sizes when iterating mip levels of a texture.

Rendering to a slice of a 3D texture can produce a 3D texture with depth 1, but a gob size matching a much larger texture. We _can't_ "correct" this gob size, as it is intended as a slice of a larger 3D texture. Ignoring it causes layout conversion to break on read and flush.

This caused an issue in Tears of the Kingdom where the compressed 3D texture used for the gloom would always break on OpenGL, and seemingly randomly break on Vulkan. In the first case, the data is forcibly flushed to decompress the BC4 texture on the CPU to upload it as 3D, which was broken due to the incorrect layout. In the second, the data may be randomly flushed if it falls out of the cache, but it will appear correct if it's able to form copy dependencies.

This change only allows gob sizes to be reduced once per mip level. For the purpose of aligned size, it can still be reduced infinitely as our texture cache isn't properly able to handle a view being _misaligned_.

The SizeCalculator has also been changed to reduce the size of rendered depth slices to only include the exact range a single depth slice will cover. (before, the size was way too small with gobs in z reduced to 1, and too large when using the correct value)

Gobs in Y logic remains untouched, we don't support Y slices of textures so it's fine as is.

This is probably worth testing in a few games as it also affects texture size and view logic.

* Improve wording

* Maybe a bit better
2023-06-04 20:25:57 +00:00
gdkchan
21c9ac6240
Implement shader storage buffer operations using new Load/Store instructions (#4993)
* Implement storage buffer operations using new Load/Store instruction

* Extend GenerateMultiTargetStorageOp to also match access with constant offset, and log and comments

* Remove now unused code

* Catch more complex cases of global memory usage

* Shader cache version bump

* Extend global access elimination to work with more shared memory cases

* Change alignment requirement from 16 bytes to 8 bytes, handle cases where we need more than 16 storage buffers

* Tweak preferencing to catch more cases

* Enable CB0 elimination even when host storage buffer alignment is > 16 (for Intel)

* Fix storage buffer bindings

* Simplify some code

* Shader cache version bump

* Fix typo

* Extend global memory elimination to handle shared memory with multiple possible offsets and local memory
2023-06-03 20:12:18 -03:00
riperiperi
4741a05df9
Vulkan: Include DepthMode in ProgramPipelineState (#5185) 2023-06-01 09:05:39 +02:00
riperiperi
c6676007bf
GPU: Dispose Renderer after running deferred actions (#5144)
* GAL: Dispose Renderer after running deferred actions

Deferred actions from disposing physical memory instances always dispose the resources in their caches. The renderer can't be disposed before these resources get disposed, otherwise the dispose actions will not actually run, and the ThreadedRenderer may get stuck trying to enqueue too many commands when there is nothing consuming them.

This should fix most instances of the emulator freezing on close.

* Wait for main render commands to finish, but keep RenderThread alive til dispose

* Address some feedback.

* No parameterize needed

* Set thread name as part of constructor

* Port to Ava and SDL2
2023-05-31 21:43:20 +00:00
gdkchan
c27e453fd3
Share ResourceManager vertex vertex A and B shaders (#5181) 2023-05-31 17:17:50 -03:00
cstamford
dc0dbc50ab
Add support for VK_EXT_depth_clip_control. (#5027)
* Add support for VK_EXT_depth_clip_control.

* Code review feedback

Minor formatting

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Check .DepthClipControl to make sure the host actually supports the feature.

* Review feedback: remove Vulkan platform switch, relying on QueryHostSupportsDepthClipControl to drive the behaviour - OpenGL returns true, and any future platforms that don't support the [-1, 1] depth mode can return false for the transformation.

---------

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2023-05-28 23:31:56 +02:00
gdkchan
3b375525fb
Force reciprocal operation with value biased by constant to be precise on macOS (#5110)
* Force operations to be precise in some cases on SPIR-V

* Make it a bit more strict, add comments

* Shader cache version bump
2023-05-26 15:19:37 -03:00
gdkchan
e6658c133c
Fix resolution scaling of image operation coordinates (#5102)
* Fix resolution scaling of image operation coordinates

* Shader cache version bump
2023-05-25 23:42:49 -03:00
gdkchan
8f0c89ffd6
Generate scaling helper functions on IR (#4714)
* Generate scaling helper functions on IR

* Delete unused code

* Split RewriteTextureSample and move gather bias add to an earlier pass

* Remove using

* Shader cache version bump
2023-05-25 17:46:58 -03:00
makigumo
6cb6b15612
Implement p2rc, p2ri, p2rr and r2p.cc shaders (#5031)
* implement P2rC, P2rI, P2rR shaders

* implement R2p.CC shader

* bump CodeGenVersion

* address feedback
2023-05-22 17:32:15 -03:00
gdkchan
5626f2ca1c
Replace ShaderBindings with new ResourceLayout structure for Vulkan (#5025)
* Introduce ResourceLayout

* Part 1: Use new ResourceSegments array on UpdateAndBind

* Part 2: Use ResourceLayout to build PipelineLayout

* Delete old code

* XML docs

* Fix shader cache load NRE

* Fix typo
2023-05-21 14:04:21 -03:00
gdkchan
402f05b8ef
Replace constant buffer access on shader with new Load instruction (#4646) 2023-05-20 16:19:26 -03:00
gdkchan
fb27042e01
Limit compute storage buffer size (#5028) 2023-05-20 16:15:07 +00:00
riperiperi
69a9de33d3
SPIR-V: Only allow implicit LOD sampling on fragment (#5026) 2023-05-20 15:52:26 +02:00
gdkchan
fc26189fe1
Eliminate redundant multiplications by gl_FragCoord.w on the shader (#4578)
* Eliminate redundant multiplications by gl_FragCoord.w on the shader

* Shader cache version bump
2023-05-19 11:52:31 -03:00
riperiperi
ecbf303266
GPU: Avoid using garbage size for non-cb0 storage buffers (#4999)
* GPU: Avoid using garbage size for non-cb0 storage buffers

In the depths area, Tears of the Kingdom uses a global memory access with address on constant buffer slot 6. This isn't standard and thus doesn't actually have a size 8 bytes after it, so we were reading back a garbage size that ended up very large (at least in version 1.1.0), and would synchronize a lot of data per frame.

This PR makes storage buffers created from addresses outside constant buffer slot 0 get their size as the number of bytes remaining in the GPU mapping starting at the given virtual address. This should bound the buffer to a reasonable size, and ideally stop it crossing into other memory.

* Limit max size

* Add TODO

* Feedback
2023-05-18 08:56:34 +02:00
OpaqueReptile
cb4b58052f
Start GPU performance counter at 0 instead of host GPU value (#4992)
* Start performance counter at 0 instead of host perf counter value

* whitespace

* init _firstTimestamp in constructer per feedback

* change comment

* punctuation

* address feedback

* revise comment
2023-05-17 15:38:59 -03:00
Mary
7271f1b18e
Bump shader cache codegen version
That was missing from #4892
2023-05-12 18:53:14 +02:00
riperiperi
95c06de4c1
GPU: Remove swizzle undefined matching and rework depth aliasing (#4896)
* GPU: Remove swizzle undefined matching and rework depth aliasing

@gdkchan pointed out that UI textures in TOTK seemed to be setting their texture swizzle incorrectly (texture was RGB but was sampling A, swizzle for A was wrong), so I determined that SwizzleComponentMatches was the problem and set on eliminating it. This PR combines existing work to select the most recently modified texture (now used when selecting which aliased texture to use) with some additional changes to remove the swizzle check and support aliased view creation.

The original observation (#1538) was that we wanted to match depth textures for the purposes of aliasing with color textures, but they often had different swizzle from what was sampled (as it's generally the identity swizzle once rendered). At the time, I decided to allow swizzles to match if only the defined components matched, which fixed the issue in all known cases but could easily be broken by a game _expecting_ a given swizzle, such as a 1/0 value on a component.

This error case could also occur in textures that don't even depth alias, such as R11G11B10, as the rule was created to generally apply to all cases.

The solution is now to fail this exact match test, and allow the search for an R32 texture to create a swizzled view of a D32 texture (and other such cases). This allows the creation of a view that mismatches the requested format, which wasn't present before and was the reason for the swizzle matching approach.

The exact match and view creation rules now follow the same rules over what textures to select when there are multiple options (such as a "perfect" match and an "aliased" match at the same time). It now selects the most recently modified texture, which is done with a new sequence number in the GpuContext (because we don't have enough of these).

Reportedly fixes UI having weird coloured backgrounds in TOTK. This also fixes an issue in MK8D where returning from a race resulted in the character selection cubemaps being broken. May work around issues introduced by the "short texture cache" PR due to modification ordering, though they won't be truly fixed.

Should allow (#4365) to avoid copies in more cases. Need to test that.

I tested a bunch of games #1538 originally affected and they seem to be fine. This change affects all games so it would be good to get some wide testing on it.

* Address feedback 1, fix an issue

* Workaround: Do not allow copies for format alias.

These should be removed when D32<->R32 copy dependencies become legal
2023-05-11 21:30:47 -03:00