Commit graph

86 commits

Author SHA1 Message Date
riperiperi
9f1cf6458c
Vulkan: Migrate buffers between memory types to improve GPU performance (#4540)
* Initial implementation of migration between memory heaps

- Missing OOM handling
- Missing `_map` data safety when remapping
  - Copy may not have completed yet (needs some kind of fence)
  - Map may be unmapped before it is done being used. (needs scoped access)
- SSBO accesses are all "writes" - maybe pass info in another way.
- Missing keeping map type when resizing buffers (should this be done?)

* Ensure migrated data is in place before flushing.

* Fix issue where old waitable would be signalled.

- There is a real issue where existing Auto<> references need to be replaced.

* Swap bound Auto<> instances when swapping buffer backing

* Fix conversion buffers

* Don't try move buffers if the host has shared memory.

* Make GPU methods return PinnedSpan with scope

* Storage Hint

* Fix stupidity

* Fix rebase

* Tweak rules

Attempt to sidestep BOTW slowdown

* Remove line

* Migrate only when command buffers flush

* Change backing swap log to debug

* Address some feedback

* Disallow backing swap when the flush lock is held by the current thread

* Make PinnedSpan from ReadOnlySpan explicitly unsafe

* Fix some small issues

- Index buffer swap fixed
- Allocate DeviceLocal buffers using a separate block list to images.

* Remove alternative flags

* Address feedback
2023-03-19 17:56:48 -03:00
riperiperi
6e9bd4de13
GPU: Scale counter results before addition (#4471)
* GPU: Scale counter results before addition

Counter results were being scaled on ReportCounter, which meant that the _total_ value of the counter was being scaled. Not only could this result in very large numbers and weird overflows if the game doesn't clear the counter, but it also caused the result to change drastically.

This PR changes scaling to be done when the value is added to the counter on the backend. This should evaluate the scale at the same time as before, on report counter, but avoiding the issue with scaling the total.

Fixes scaling in Warioware, at least in the demo, where it seems to compare old/new counters and broke down when scaling was enabled.

* Fix issues when result is partially uploaded.

Drivers tend to write the low half first, then the high half. Retry if the high half is FFFFFFFF.
2023-03-12 18:01:15 +01:00
Mary
d56d335c0b
misc: Some dependencies cleanup (#4507)
* Remove dependencies on libraries provided by .NET standard library

* Use System.IO.Hashing instead of Crc32.NET
2023-03-12 03:24:11 +01:00
jhorv
23c844b2aa
Misc performance tweaks (#4509)
* use Array.Empty() where instead of allocating new zero-length arrays

* structure for loops in a way that the JIT will elide array/Span bounds checking

* avoiding function calls in for loop condition tests

* avoid LINQ in a hot path

* conform with code style

* fix mistake in GetNextWaitingObject()

* fix GetNextWaitingObject() possibility of returning null if all list items have TimePoint == long.MaxValue

* make GetNextWaitingObject() behave FIFO behavior for multiple items with the same TimePoint
2023-03-11 17:05:48 -03:00
gdkchan
4f3af839be
Minor code formatting (#4498) 2023-03-04 14:43:08 +01:00
Emmanuel Hansen
80b4972139
Add Support for Post Processing Effects (#3616)
* Add Post Processing Effects

* fix events and shader issues

* fix gtk upscale slider value

* fix bgra games

* don't swap swizzle if already swapped

* restore opengl texture state after effects run

* addressed review

* use single pipeline for smaa and fsr

* call finish on all pipelines

* addressed review

* attempt fix file case

* attempt fixing file case

* fix filter level tick frequency

* adjust filter slider margins

* replace fxaa shaders with original shader

* addressed review
2023-02-27 18:11:55 -03:00
gdkchan
5d85468302
Vulkan: Support list topology primitive restart (#4483) 2023-02-26 19:19:00 -03:00
gdkchan
cedd200745
Move gl_Layer to vertex shader if geometry is not supported (#4368)
* Set gl_Layer on vertex shader if it's set on the geometry shader and it does nothing else

* Shader cache version bump

* PR feedback

* Fix typo
2023-02-25 10:39:51 +00:00
jhorv
58207685c0
Perform bounds checking before list indexer to avoid frequent exceptions (#4438)
* Perform bounds checking before list indexer to avoid frequent ArgumentOutOfRangeExceptions

* do a single compare after casting id and .Count to uint
2023-02-25 10:26:39 +00:00
gdkchan
c3a5716a95
Add copy dependency for some incompatible texture formats (#4380)
* Add copy dependency for some incompatible texture formats

* Simplify compatibility check
2023-02-21 19:21:57 -03:00
gdkchan
7aa430f1a5
Add support for advanced blend (part 1/2) (#2801)
* Add blend microcode registers

* Add advanced blend support using host extension

* Remove debug message

* Use pre-generated table for blend functions

* XML docs

* Rename AdvancedBlendMode to AdvancedBlendOp for consistency

* Remove redundant code

* Fix some advanced blend related issues on Vulkan

* Formatting
2023-02-19 22:37:37 -03:00
Mary
17078ad929
vulkan: Respect VK_KHR_portability_subset vertex stride alignment (#4419)
* vulkan: Respect VK_KHR_portability_subset vertex stride alignment

We were hardcoding alignment to 4, but by specs it can be any values that
is a power of 2.

This also enable VK_KHR_portability_subset if present as per specs
requirements.

* address gdkchan's comment

* Make NeedsVertexBufferAlignment internal
2023-02-15 08:41:48 +00:00
Mary
32450d45de
vulkan: Clean up MemoryAllocator (#4418)
This started as an attempt to remove vkGetPhysicalDeviceMemoryProperties
in FindSuitableMemoryTypeIndex (As this could have some overhead and
shouldn't change at runtime) and turned in a little bigger cleanup.
2023-02-15 07:50:26 +01:00
Mary
fe9c49949a
vulkan: Enforce Vulkan 1.2+ at instance API level and 1.1+ at device level (#4408)
* vulkan: Enforce Vulkan 1.2+ at instance API level and 1.1+ at device level

This ensure we don't end up trying to initialize with anything currently incompatible.

* Address riperiperi's comment
2023-02-13 22:04:55 +00:00
Mary
052b23c83c
vulkan: Do not call vkCmdSetViewport when viewportCount is 0 (#4406)
This fix validation error "VUID-vkCmdSetViewport-viewportCount-arraylength".
2023-02-13 20:32:20 +00:00
riperiperi
e4f68592c3
Fix partial updates for textures. (#4401)
I was forcing some types of texture to partially update when investigating performance with games that stream in data, and noticed that partially loading texture data was really broken on both backends.

Fixes Vulkan texture set by getting the correct expected size for the texture. Fixes partial upload on both backends for both Texture 2D Array and Cubemap using the wrong offset and uploading to the first layer/level for a handle. 3D might also be affected.

This might fix textures randomly having incorrect data in games that render to it - jumbled in the case of OpenGL, and outdated/black in the case of Vulkan. This case typically happens in UE4 games.
2023-02-12 10:30:26 +01:00
riperiperi
b3f0978869
Vulkan: Flush command buffers for queries less aggressively (#4387)
The AutoFlushCounter would flush command buffers on any attachment change (write mask or bindings change) if there was a pending query. This is to get query results as soon as possible for draw skips, but it's assuming that a full occlusion query _pass_ happened, that we want to flush it's data before getting onto draws, rather than the queries being randomly interspersed throughout a pass that also draws.

Xenoblade 2 repeatedly switches between performing a samples passed query and outputting to a render target on each draw, and flips the write mask to do so. Flushing the command buffer every 2 draws isn't ideal, so it's best that we only do this if the pattern matches the large block style of occlusion query.

This change makes this flush only happen after a few consecutive query reports. "Consecutive" is interrupted by attachment changes or command buffer flush.

This doesn't really solve the issue where it resets more queries than it uses, it just stops the game doing it as often. I'm not sure of the best way to do that. The cost of resetting could probably be reduced by using query pools with more than one element and resetting in bulk.
2023-02-09 02:03:41 +01:00
gdkchan
f6d5499a16
Fix some Vulkan validation errors (#4357) 2023-02-08 14:34:22 +01:00
gdkchan
f8beeeb7d3
Support safe blit on non-2D textures (#4374)
* Support safe blit on non-2D textures (except multisample)

* Change safe blit with different levels and layers to match CmdBlitImage path

* Remove now unused variables

* Multisample safe blit support
2023-02-07 13:55:59 -03:00
gdkchan
7528f94536
Implement safe depth-stencil blit using stencil export extension (#4356)
* Implement safe depth-stencil blit using stencil export extension

* Delete depth-stencil blit with buffer path
2023-02-06 00:19:31 -03:00
gdkchan
296c4a3d01
Relax Vulkan requirements (#4282)
* Relax Vulkan requirements

* Fix MaxColorAttachmentIndex

* Fix ColorBlendAttachmentStateCount value mismatch for background pipelines

* Change query capability check to check for pipeline statistics query rather than geometry shader support
2023-01-26 18:34:35 -03:00
riperiperi
e7cf4e6eaf
Vulkan: Reset queries on same command buffer (#4329)
* Reset queries on same command buffer

Vulkan seems to complain when the queries are reset on another command buffer. No idea why, the spec really could be written better in this regard. This fixes complaints, and hopefully any implementations that care extensively about them.

This change _guesses_ how many queries need to be reset and resets as many as possible at the same time to avoid splitting render passes. If it resets too many queries, we didn't waste too much time - if it runs out of resets it will batch reset 10 more.

The number of queries reset is the maximum number of queries in the last 3 frames. This has been worked into the AutoFlushCounter so that it only resets up to 32 if it is yet to force a command buffer submission in this attachment.

This is only done for samples passed queries right now, as they have by far the most resets.

* Address Feedback
2023-01-24 13:32:56 -03:00
Fliperworld
bb89e36fd8
Vulkan: Destroy old swapchain on swapchain recreation (#3889)
* Destroy old swapchain on swapchain recreation

* vkDeviceWaitIdle before DestroySwapchain

* Update Ryujinx.Graphics.Vulkan/Window.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Avoid unsafe code on RecreateSwapchain()

* Destroying old Swapchain on a queue.

* Cleanup and fix on destroying old Swapchain.

* Update Ryujinx.Graphics.Vulkan/Window.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Update Ryujinx.Graphics.Vulkan/Window.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Update Ryujinx.Graphics.Vulkan/Window.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

* Update Window.cs

Done.

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2023-01-19 21:31:25 -03:00
riperiperi
de3134adbe
Vulkan: Explicitly enable precise occlusion queries (#4292)
The only guarantee of the occlusion query type in Vulkan is that it will be zero when no samples pass, and non-zero when any samples pass. Of course, most GPUs implement this by just placing the # of samples in the result and calling it a day. However, this lax restriction means that GPUs could just report a boolean (1/0) or report a value after one is recorded, but before all samples have been counted.

MoltenVK falls in the first category - by default it only reports 1/0 for occlusion queries. Thankfully, there is a feature and flag that you can use to force compatible drivers to provide a "precise" query result, that being the real # of samples passed.

Should fix ink collision in Splatoon 2/3 on MoltenVK.
2023-01-19 00:30:42 +00:00
gdkchan
065c4e520d
Specify image view usage flags on Vulkan (#4283)
* Specify image view usage flags on Vulkan

* PR feedback
2023-01-15 23:12:52 +01:00
Ac_K
85faa9d8fa
Revert "Relax Vulkan requirements (#4228)" (#4279)
This reverts commit dca5b14493.
2023-01-13 06:04:59 +00:00
gdkchan
dca5b14493
Relax Vulkan requirements (#4228) 2023-01-13 06:09:48 +01:00
riperiperi
8fa248ceb4
Vulkan: Add workarounds for MoltenVK (#4202)
* Add MVK basics.

* Use appropriate output attribute types

* 4kb vertex alignment, bunch of fixes

* Add reduced shader precision mode for mvk.

* Disable ASTC on MVK for now

* Only request robustnes2 when it is available.

* It's just the one feature actually

* Add triangle fan conversion

* Allow NullDescriptor on MVK for some reason.

* Force safe blit on MoltenVK

* Use ASTC only when formats are all available.

* Disable multilevel 3d texture views

* Filter duplicate render targets (on backend)

* Add Automatic MoltenVK Configuration

* Do not create color attachment views with formats that are not RT compatible

* Make sure that the host format matches the vertex shader input types for invalid/unknown guest formats

* FIx rebase for Vertex Attrib State

* Fix 4b alignment for vertex

* Use asynchronous queue submits for MVK

* Ensure color clear shader has correct output type

* Update MoltenVK config

* Always use MoltenVK workarounds on MacOS

* Make MVK supersede all vendors

* Fix rebase

* Various fixes on rebase

* Get portability flags from extension

* Fix some minor rebasing issues

* Style change

* Use LibraryImport for MVKConfiguration

* Rename MoltenVK vendor to Apple

Intel and AMD GPUs on moltenvk report with the those vendors - only apple silicon reports with vendor 0x106B.

* Fix features2 rebase conflict

* Rename fragment output type

* Add missing check for fragment output types

Might have caused the crash in MK8

* Only do fragment output specialization on MoltenVK

* Avoid copy when passing capabilities

* Self feedback

* Address feedback

Co-authored-by: gdk <gab.dark.100@gmail.com>
Co-authored-by: nastys <nastys@users.noreply.github.com>
2023-01-13 01:31:21 +01:00
riperiperi
e20abbf9cc
Vulkan: Don't flush commands when creating most sync (#4087)
* Vulkan: Don't flush commands when creating most sync

When the WaitForIdle method is called, we create sync as some internal GPU method may read back written buffer data. Some games randomly intersperse compute dispatch into their render passes, which result in this happening an unbounded number of times depending on how many times they run compute.

Creating sync in Vulkan is expensive, as we need to flush the current command buffer so that it can be waited on. We have a limited number of active command buffers due to how we track resource usage, so submitting too many command buffers will force us to wait for them to return to the pool.

This PR allows less "important" sync (things which are less likely to be waited on) to wait on a command buffer's result without submitting it, instead relying on AutoFlush or another, more important sync to flush it later on.

Because of the possibility of us waiting for a command buffer that hasn't submitted yet, any thread needs to be able to force the active command buffer to submit. The ability to do this has been added to the backend multithreading via an "Interrupt", though it is not supported without multithreading.

OpenGL drivers should already be doing something similar so they don't blow up when creating lots of sync, which is why this hasn't been a problem for these games over there.

Improves Vulkan performance on Xenoblade DE, Pokemon Scarlet/Violet, and Zelda BOTW (still another large issue here)

* Add strict argument

This is technically a separate concern from whether the sync is a host syncpoint.

* Remove _interrupted variable

* Actually wait for the invoke

This is required by AMD GPUs, and also may have caused some issues on other GPUs.

* Remove unused using.

* I don't know why it added these ones.

* Address Feedback

* Fix typo
2022-12-29 15:39:04 +01:00
riperiperi
470be03c2f
GPU: Add fallback when 16-bit formats are not supported (#4108)
* Add conversion for 16 bit RGBA formats (not supported in Rosetta)

* Rebase fix

Rebase fix

* Forgot to remove this

* Fix RGBA16 format conversion

* Add RGBA4 -> RGBA8 conversion

* Handle host stride alignment

* Address Feedback Part 1

* Can't count

* Don't zero out rgb when alpha is 0

* Separate RGBA4 and 5-bit component formats

Not sure of a better way to name them...

* Add A1B5G5R5 conversion

* Put this in the right place.

* Make format naming consistent for capabilities

* Change method names
2022-12-26 15:50:27 -03:00
Hunter
c963b3c804
Added Generic Math to BitUtils (#3929)
* Generic Math Update

Updated Several functions in Ryujinx.Common/Utilities/BitUtils to use generic math

* Updated BitUtil calls

* Removed Whitespace

* Switched decrement

* Fixed changed method calls.

The method calls were originally changed on accident due to me relying too much on intellisense doing stuff for me

* Update Ryujinx.Common/Utilities/BitUtils.cs

Co-authored-by: gdkchan <gab.dark.100@gmail.com>

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
2022-12-26 14:11:05 +00:00
gdkchan
f906eb06c2
Implement a software ETC2 texture decoder (#4121)
* Implement a software ETC2 texture decoder

* Fix output size calculation for non-2D textures

* Address PR feedback
2022-12-21 20:39:58 -03:00
Georg Lehmann
0f50de72be
Vulkan: enable VK_EXT_custom_border_color features (#4116)
* Vulkan: enable VK_EXT_custom_border_color features

radv only create the border color bo if this feature is enabled, so it crashed when creating samplers with custom border colors
Fixes #4072
Fixes #3993

* Address gdkchan's comment

Co-authored-by: Mary <mary@mary.zone>
2022-12-14 20:53:33 -03:00
Andrey Sukharev
535fbec675
Use NuGet Central Package Management to manage package versions solution-wise (#4095) 2022-12-12 16:03:10 +01:00
Isaac Marovitz
851d81d24a
Fix Redundant Qualifer Warnings (#4091)
* Fix Redundant Qualifer Warnings

* Remove unnecessary using
2022-12-10 21:21:13 +01:00
riperiperi
e211c3f00a
UI: Add Metal surface creation for MoltenVK (#3980)
* Initial implementation of metal surface across UIs

* Fix SDL2 on windows

* Update Ryujinx/Ryujinx.csproj

Co-authored-by: Mary-nyan <thog@protonmail.com>

* Address Feedback

Co-authored-by: Mary-nyan <thog@protonmail.com>
2022-12-06 19:00:25 -03:00
Andrey Sukharev
4da44e09cb
Make structs readonly when applicable (#4002)
* Make all structs readonly when applicable. It should reduce amount of needless defensive copies

* Make structs with trivial boilerplate equality code record structs

* Remove unnecessary readonly modifiers from TextureCreateInfo

* Make BitMap structs readonly too
2022-12-05 14:47:39 +01:00
Mary-nyan
ae13f0ab4d
misc: Fix obsolete warnings in Ryujinx.Graphics.Vulkan (#4020)
Was caused by some merges after the Silk.NET update
2022-12-05 12:57:11 +00:00
gdkchan
17a1cab5d2
Allow SNorm buffer texture formats on Vulkan (#3957)
* Allow SNorm buffer texture formats on Vulkan

* Shader cache version bump
2022-12-04 15:36:03 -03:00
gdkchan
73aed239c3
Implement non-MS to MS copies with draws (#3958)
* Implement non-MS to MS copies with draws, simplify MS to non-MS copies and supports any host sample count

* Remove unused program
2022-12-04 15:07:11 -03:00
Andrey Sukharev
3868a00206
Use source generated regular expressions (#4005) 2022-12-04 00:43:23 +00:00
Mary-nyan
ce92e8cd04
chore: Update Silk.NET to 2.16.0 (#3953) 2022-12-01 19:11:56 +01:00
riperiperi
458452279c
GPU: Track buffer migrations and flush source on incomplete copy (#3952)
* Track buffer migrations and flush source on incomplete copy

Makes sure that the modified range list is always from the latest iteration of the buffer, and flushes earlier iterations of a buffer if the data has not been migrated yet.

* Cleanup 1

* Reduce cost for redundant signal checks on Vulkan

* Only inherit the range list if there are pending ranges.

* Fix OpenGL

* Address Feedback

* Whoops
2022-12-01 16:30:13 +01:00
gdkchan
4905101df1
Remove shader dependency on SPV_KHR_shader_ballot and SPV_KHR_subgroup_vote extensions (#3943)
* Remove shader dependency on SPV_KHR_shader_ballot and SPV_KHR_subgroup_vote extensions

* Shader cache version bump
2022-11-30 18:24:15 -03:00
Mary-nyan
d41c95dcff
chore: Update OpenTK to 4.7.5 (#3944) 2022-11-29 13:32:40 +00:00
riperiperi
1fc0f569de
GPU: Always draw polygon topology as triangle fan (#3932)
Polygon topology wasn't really supported and would only work on OpenGL on drivers that haven't removed it. As an alternative, this PR makes all cases of polygon topology use triangle fan. The topology type and transform feedback type have not been changed, as I don't think geo shader/tfb should be used with polygons.

The OpenGL spec states:
Only convex polygons are guaranteed to be drawn correctly by the GL.

For convex polygons, triangle fan is equivalent to polygon. I imagine this is probably how it works on device, as this get-out-of-jail-free card is too enticing to pass up.

This fixes the stat display in Pokemon S/V.
2022-11-28 19:18:22 -03:00
Ac_K
a1ddaa2736
ui: Fixes disposing on GTK/Avalonia and Firmware Messages on Avalonia (#3885)
* ui: Only wait on _exitEvent when MainLoop is active under GTK

This fixes a dispose issue under Horizon/GTK, we don't check if the ApplicationClient is null so it throw NCE. We don't check if the main loop is active and waiting an event which is set in the main loop... So that could lead to a freeze.

Everything works fine in GTK now.

Related issue: https://github.com/Ryujinx/Ryujinx/issues/3873

As a side note, same kind of issue appear in Avalonia UI too. Firmware's popup doesn't show anything and the emulator just freeze.

* TSRBerry's change

Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>

* Fix Avalonia crashing/freezing

* Add Avalonia OpenGL fixes

* Fix firmware popup on windows

* Fixes everything

* Add _initialized bool to VulkanRenderer and OpenGL Window

Co-authored-by: TSRBerry <20988865+TSRBerry@users.noreply.github.com>
2022-11-24 15:08:27 +01:00
riperiperi
ece36b274d
GAL: Send all buffer assignments at once rather than individually (#3881)
* GAL: Send all buffer assignments at once rather than individually

The `(int first, BufferRange[] ranges)` method call has very significant performance implications when the bindings are spread out, which they generally always are in Vulkan. This change makes it so that these methods are only called a maximum of one time per draw.

Significantly improves GPU thread performance in Pokemon Scarlet/Violet.

* Address Feedback

Removed SetUniformBuffers(int first, ReadOnlySpan<BufferRange> buffers)
2022-11-24 07:50:59 +00:00
gdkchan
2e43d01d36
Move gl_Layer from vertex to geometry if GPU does not support it on vertex (#3866)
* Move gl_Layer from vertex to geometry if GPU does not support it on vertex

* Shader cache version bump

* PR feedback
2022-11-18 23:27:54 -03:00
riperiperi
7373ec5792
Vulkan: Clear dummy texture to (0,0,0,0) on creation (#3867)
This might fix an issue with AMD gpus on linux where the data could contain random garbage data. On the switch, it always samples as 0.
2022-11-18 23:11:34 -03:00