Commit graph

235 commits

Author SHA1 Message Date
Lody
535bc61b4c vk_rasterizer: fix stencil test when two faces are disabled 2022-05-06 14:47:55 +08:00
Morph
99ceb03a1c general: Convert source file copyright comments over to SPDX
This formats all copyright comments according to SPDX formatting guidelines.
Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
2022-04-23 05:55:32 -04:00
bunnei
af04f8b8e9 Revert "Memory GPU <-> CPU: reduce infighting in the texture cache by adding CPU Cached memory." 2022-03-26 12:38:30 -07:00
Fernando Sahmkow
7a9d9e575b Texture Cache: Add Cached CPU system. 2022-03-25 04:24:05 +01:00
ameerj
1bc7d61b57 video_core: Reduce unused includes 2022-03-19 15:01:31 -04:00
Fernando Sahmkow
8a6e6465a7 Rasterizer: Refactor inlineToMemory. 2022-02-01 01:47:28 +01:00
Fernando Sahmkow
4258d515e6 Rasterizer: Implement Inline2Memory Acceleration. 2022-01-29 22:53:27 +01:00
FernandoS27
d37d10e7a7 TextureCache: fix rescaling in aliases and overlap joins. 2021-11-16 22:11:31 +01:00
Fernando Sahmkow
4ad22c7d2b Video Core: fix building for GCC. 2021-11-16 22:11:31 +01:00
FernandoS27
826a350e2b Vulkan Rasterizer: Fix clears on integer textures. 2021-11-16 22:11:31 +01:00
FernandoS27
f3ff8bdc0e TextureCache: Fix blitting filter in Vulkan and correct viewport/scissor calculation when downscaling. 2021-11-16 22:11:31 +01:00
Fernando Sahmkow
19ca0c9ab5 TextureCache: Base fixes on rescaling. 2021-11-16 22:11:30 +01:00
ameerj
122ddeb7ff vk_rasterizer: Fix scaling on Y_NEGATE 2021-11-16 22:11:30 +01:00
ReinUsesLisp
526e47f148 vk_rasterizer: Minor style change 2021-11-16 22:11:29 +01:00
Fernando Sahmkow
778700ff9d TextureCache: Modify Viewports/Scissors according to Rescale. 2021-11-16 22:11:27 +01:00
Fernando Sahmkow
ad8afaf1ef Vulkan Rasterizer: address feedback. 2021-10-23 23:46:29 +02:00
Fernando Sahmkow
60a3980561 Vulkan Rasterizer: Correct DepthBias/PolygonOffset on Vulkan. 2021-09-23 03:49:10 +02:00
ameerj
678f73069f vk_rasterizer: Fix dynamic StencilOp updating when two faces are enabled
This function was incorrectly using the stencil_two_side_enable register when dynamically updating the StencilOp.
2021-09-12 16:19:12 -04:00
ameerj
e0397f00d0 vk_rasterizer: Only clear depth and stencil buffers when set in attachment aspect mask
Silences validation errors for clearing the depth/stencil buffers of framebuffer attachments that were not specified to have depth/stencil usage.
2021-08-21 02:37:15 -04:00
yzct12345
5566f3dbc0 texture_cache: Address ameerj's review 2021-08-05 20:46:24 +00:00
ReinUsesLisp
b185567a03 vk_rasterizer: Flip viewport on Y_NEGATE
Matches OpenGL's behavior. I don't believe this register flips geometry,
but we have to try to match behavior on both backends.
2021-07-29 02:17:53 -03:00
ameerj
41493fbe89 renderers: Fix clang formatting 2021-07-22 21:51:40 -04:00
ReinUsesLisp
fba6bd92d4 vk_rasterizer: Workaround bug in VK_EXT_vertex_input_dynamic_state
Work around a potential bug in Nvidia's driver where updating only high
attributes leaves low attributes outdated.
2021-07-22 21:51:39 -04:00
ReinUsesLisp
57a8921e01 vk_graphics_pipeline: Implement line width 2021-07-22 21:51:39 -04:00
ReinUsesLisp
395bed3a0a shader: Unify shader stage types 2021-07-22 21:51:39 -04:00
ReinUsesLisp
8fb2048934 vk_rasterizer: Exit render passes on fragment barriers 2021-07-22 21:51:35 -04:00
ReinUsesLisp
ea038d6653 vulkan: Add VK_EXT_vertex_input_dynamic_state support
Reduces the number of total pipelines generated on Vulkan.
Tested on Super Smash Bros. Ultimate.
2021-07-22 21:51:35 -04:00
ReinUsesLisp
3025b2f605 vk_rasterizer: Implement first index 2021-07-22 21:51:35 -04:00
ReinUsesLisp
cffd4716c5 vk_pipeline_cache,shader_notify: Add shader notifications 2021-07-22 21:51:35 -04:00
ReinUsesLisp
2a0aeaa3d2 vk_rasterizer: Flush work on clear and dispatches 2021-07-22 21:51:34 -04:00
ReinUsesLisp
77372443c3 vulkan: Enable depth bounds and use it conditionally
Intel devices pre-Xe don't support this.
2021-07-22 21:51:34 -04:00
ReinUsesLisp
d621e96d0d shader: Initial OpenGL implementation 2021-07-22 21:51:30 -04:00
ReinUsesLisp
53acdda772 vk_scheduler: Allow command submission on worker thread
This changes how Scheduler::Flush works. It queues the current command
buffer to be sent to the GPU but does not do it immediately. The Vulkan
worker thread takes care of that. Users will have to use
Scheduler::Flush + Scheduler::WaitWorker to get the previous behavior.

Scheduler::Finish is unchanged.

To avoid waiting on work never queued, Scheduler::Wait sends the current
command buffer if that's what the caller wants to wait on.
2021-07-22 21:51:29 -04:00
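The Flush/WaitWorker split described in this commit can be sketched as follows. This is a minimal illustration, not yuzu's code: `SketchScheduler` and the `std::function` payload are hypothetical, and the callable stands in for the real command buffer submission.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for the scheduler described above; not yuzu's class.
class SketchScheduler {
public:
    SketchScheduler() : worker([this] { WorkerLoop(); }) {}

    ~SketchScheduler() {
        {
            std::lock_guard<std::mutex> lock{mutex};
            stop = true;
        }
        work_cv.notify_all();
        worker.join();
    }

    // Queues the submission and returns immediately; the worker runs it later.
    void Flush(std::function<void()> submit) {
        {
            std::lock_guard<std::mutex> lock{mutex};
            pending.push(std::move(submit));
        }
        work_cv.notify_all();
    }

    // Blocks until every queued submission has been executed by the worker.
    void WaitWorker() {
        std::unique_lock<std::mutex> lock{mutex};
        idle_cv.wait(lock, [this] { return pending.empty() && !busy; });
    }

private:
    void WorkerLoop() {
        std::unique_lock<std::mutex> lock{mutex};
        for (;;) {
            work_cv.wait(lock, [this] { return stop || !pending.empty(); });
            if (pending.empty()) {
                return; // Stop was requested and the queue is drained.
            }
            std::function<void()> task = std::move(pending.front());
            pending.pop();
            busy = true;
            lock.unlock();
            task(); // Stands in for the real queue submission.
            lock.lock();
            busy = false;
            idle_cv.notify_all();
        }
    }

    std::mutex mutex;
    std::condition_variable work_cv;
    std::condition_variable idle_cv;
    std::queue<std::function<void()>> pending;
    bool busy = false;
    bool stop = false;
    std::thread worker; // Declared last so the other members exist before it starts.
};
```

In this sketch, calling Flush followed by WaitWorker reproduces the old blocking behavior, while Flush alone lets the caller keep recording work.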
ReinUsesLisp
025b20f96a shader: Move pipeline cache logic to separate files
Move code to separate files to be able to reuse it from OpenGL. This
greatly simplifies the pipeline cache logic on Vulkan.

Transform feedback state is not yet abstracted and it's still
intrusively stored inside vk_pipeline_cache. It will be moved when
needed on OpenGL.
2021-07-22 21:51:29 -04:00
ReinUsesLisp
1030b612a3 vk_rasterizer: Request outside render pass execution context for compute 2021-07-22 21:51:27 -04:00
ReinUsesLisp
7cb2ab3585 shader: Implement SULD and SUST 2021-07-22 21:51:26 -04:00
ReinUsesLisp
5b3c6d59c2 vk_compute_pass: Fix compute passes 2021-07-22 21:51:26 -04:00
ReinUsesLisp
2fc698b040 vulkan: Build pipelines in parallel at runtime
Wait from the worker thread for a pipeline to build before binding it to
the command buffer. This allows queueing pipelines to multiple threads.
2021-07-22 21:51:25 -04:00
ReinUsesLisp
ec005be99d shader: Fix rasterizer integration order issues 2021-07-22 21:51:24 -04:00
ReinUsesLisp
f8115a6a9e vk_pipeline_cache: Add pipeline cache 2021-07-22 21:51:24 -04:00
ReinUsesLisp
260743f371 shader: Add partial rasterizer integration 2021-07-22 21:51:23 -04:00
ReinUsesLisp
ab46371247 shader: Initial support for textures and TEX 2021-07-22 21:51:23 -04:00
ReinUsesLisp
6db69990da spirv: Add lower fp16 to fp32 pass 2021-07-22 21:51:22 -04:00
ReinUsesLisp
85cce78583 shader: Primitive Vulkan integration 2021-07-22 21:51:22 -04:00
ReinUsesLisp
c67d64365a shader: Remove old shader management 2021-07-22 21:51:22 -04:00
bunnei
c53b688411 Merge pull request #6629 from FernandoS27/accel-dma-2
DMAEngine: Accelerate BufferClear [accelerateDMA Part 2]
2021-07-20 17:35:05 -04:00
ameerj
e0978931e8 vk_rasterizer: Only clear valid color attachments 2021-07-13 16:04:27 -04:00
Fernando Sahmkow
b780d5b5c5 DMAEngine: Accelerate BufferClear 2021-07-13 03:49:47 +02:00
Fernando Sahmkow
be1a3f7a0f accelerateDMA: Accelerate Buffer Copies. 2021-07-11 01:33:17 +02:00
Fernando Sahmkow
4a09517336 Fence Manager: remove reference fencing. 2021-07-09 22:20:36 +02:00
Fernando Sahmkow
cf38faee9b Fence Manager: Force ordering on WFI. 2021-07-09 22:20:36 +02:00
Fernando Sahmkow
63915bf2de Fence Manager: Add fences on Reference Count. 2021-07-09 22:20:36 +02:00
Fernando Sahmkow
38165fb7e3 Texture Cache: Initial Implementation of Sparse Textures. 2021-07-04 22:32:03 +02:00
ameerj
859ba21f6d buffer_cache: Simplify uniform disabling logic 2021-06-01 13:26:58 -04:00
bunnei
a4c6712a4b common: Move settings to common from core.
- Removes a dependency on core and input_common from common.
2021-04-14 16:24:03 -07:00
ameerj
20eb368e14 renderer_vulkan: Accelerate ASTC decoding
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-03-13 12:16:03 -05:00
ReinUsesLisp
24d0cc3ab8 vk_rasterizer: Fix loading shader addresses twice
This was recently introduced on a wrongly rebased commit.
2021-02-15 21:34:13 -03:00
ReinUsesLisp
70353649d7 fixed_pipeline_cache: Use dirty flags to lazily update key
Use dirty flags to avoid building the pipeline key from scratch on each
draw call. This saves a bit of unnecessary work on each draw call.
2021-02-13 17:44:47 -03:00
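A lazily updated key driven by dirty flags might look like the sketch below. All names (`DirtyFlag`, `GuestRegs`, `PipelineKey`, `KeyTracker`) are hypothetical and the three fields are placeholders; the actual key in the codebase is far larger.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical state groups; the names are illustrative only.
enum DirtyFlag : std::size_t { Blending, DepthStencil, Rasterizer, NumFlags };

struct GuestRegs {
    std::uint32_t blend;
    std::uint32_t depth_stencil;
    std::uint32_t rasterizer;
};

struct PipelineKey {
    std::uint32_t blend = 0;
    std::uint32_t depth_stencil = 0;
    std::uint32_t rasterizer = 0;
};

class KeyTracker {
public:
    // Everything starts dirty so the first refresh fills the whole key.
    KeyTracker() { dirty.set(); }

    // Called whenever the guest writes a register belonging to a group.
    void MarkDirty(DirtyFlag flag) { dirty.set(flag); }

    // Re-reads only the groups flagged since the previous draw.
    const PipelineKey& Refresh(const GuestRegs& regs) {
        if (dirty.test(Blending)) {
            key.blend = regs.blend;
            ++refreshed_fields;
        }
        if (dirty.test(DepthStencil)) {
            key.depth_stencil = regs.depth_stencil;
            ++refreshed_fields;
        }
        if (dirty.test(Rasterizer)) {
            key.rasterizer = regs.rasterizer;
            ++refreshed_fields;
        }
        dirty.reset();
        return key;
    }

    std::size_t refreshed_fields = 0; // Instrumentation for the sketch only.

private:
    std::bitset<NumFlags> dirty;
    PipelineKey key;
};
```

The `refreshed_fields` counter exists only to make the saved work observable; a draw that touches one group re-reads one field instead of three.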
ReinUsesLisp
82c2601555 video_core: Reimplement the buffer cache
Reimplement the buffer cache using cached bindings and page level
granularity for modification tracking. This also drops the usage of
shared pointers and virtual functions from the cache.

- Bindings are cached, allowing work to be skipped when the game changes
  few bits between draws.
- OpenGL Assembly shaders no longer copy when a region has been modified
  from the GPU to emulate constant buffers, instead GL_EXT_memory_object
  is used to alias sub-buffers within the same allocation.
- OpenGL Assembly shaders stream constant buffer data using
  glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In
  theory this should save one hash table resolve inside the driver
  compared to glBufferSubData.
- A new OpenGL stream buffer is implemented based on fences for drivers
  that are not Nvidia's proprietary, due to their low performance on
  partial glBufferSubData calls synchronized with 3D rendering (that
  some games use a lot).
- Most optimizations are shared between APIs now, allowing Vulkan to
  cache more bindings than before, skipping unnecessary work.

This commit adds the necessary infrastructure to use Vulkan objects from
OpenGL. Overall, it improves performance and fixes some bugs present in
the old cache. There are still some edge cases hit by some games that
harm performance on some vendors; these are planned to be fixed in later
commits.
2021-02-13 02:17:22 -03:00
ReinUsesLisp
72541af3bc vulkan_memory_allocator: Add "download" memory usage hint
Allow users of the allocator to hint memory usage for downloads. This
removes the non-descriptive boolean passed for "host visible" or not
host visible memory commits, and uses an enum to hint device local,
upload and download usages.
2021-01-15 16:19:39 -03:00
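Replacing the boolean with a usage enum could be sketched as below. The enum name and the mapping to property flags are assumptions made for illustration; the numeric constants mirror VkMemoryPropertyFlagBits so the example builds without the Vulkan headers.

```cpp
#include <cstdint>

// Values copied from VkMemoryPropertyFlagBits to keep the sketch header-free.
constexpr std::uint32_t kDeviceLocal = 0x1;
constexpr std::uint32_t kHostVisible = 0x2;
constexpr std::uint32_t kHostCoherent = 0x4;
constexpr std::uint32_t kHostCached = 0x8;

// Hypothetical usage hint replacing a bare "host visible" boolean.
enum class MemoryUsage { DeviceLocal, Upload, Download };

// Assumed mapping: downloads prefer cached host memory for fast CPU reads.
constexpr std::uint32_t PropertyFlags(MemoryUsage usage) {
    switch (usage) {
    case MemoryUsage::DeviceLocal:
        return kDeviceLocal;
    case MemoryUsage::Upload:
        return kHostVisible | kHostCoherent;
    case MemoryUsage::Download:
        return kHostVisible | kHostCoherent | kHostCached;
    }
    return 0;
}
```

A call site then reads `PropertyFlags(MemoryUsage::Download)` rather than passing `true`, which is the readability gain the commit message points at.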
ReinUsesLisp
c2b550987b renderer_vulkan: Rename Vulkan memory manager to memory allocator
"Memory manager" collides with the guest GPU memory manager, and a
memory allocator sounds closer to what the abstraction aims to be.
2021-01-15 16:19:39 -03:00
ReinUsesLisp
154a7653f9 vk_fence_manager: Use timeline semaphores instead of spin waits
With timeline semaphores we can avoid creating objects. Instead of
creating an event, grab the current tick from the scheduler and flush
the current command buffer. When the fence has to be queried/waited, we
can do so against the master semaphore instead of spinning on an event.

If Vulkan supported NVN like events or fences, we could signal from the
command buffer and wait for that without splitting things in two
separate command buffers.
2021-01-08 02:47:28 -03:00
bunnei
275b96a0e2 Merge pull request #5289 from ReinUsesLisp/vulkan-device
vulkan_common: Move device abstraction to the common directory and allow surfaceless devices
2021-01-05 17:44:56 -08:00
LC
2a6e6306d8 Merge pull request #5292 from ReinUsesLisp/empty-set
vk_rasterizer: Skip binding empty descriptor sets on compute
2021-01-04 21:32:57 -05:00
ReinUsesLisp
1ccf805367 vk_rasterizer: Skip binding empty descriptor sets on compute
Fixes unit tests where compute shaders had no descriptors in the set,
making Vulkan drivers crash when binding an empty set.
2021-01-04 17:56:39 -03:00
ReinUsesLisp
3753553b6a renderer_vulkan: Move device abstraction to vulkan_common 2021-01-04 02:22:22 -03:00
ReinUsesLisp
974d731926 renderer_vulkan: Rename VKDevice to Device
The "VK" prefix predates the "Vulkan" namespace. It was carried around
the codebase for consistency. "VKDevice" is currently easy to mistake
for "VkDevice" (they differ only in the case of one character), which
can cause confusion. Rename all instances of it.
2021-01-03 17:51:48 -03:00
ReinUsesLisp
d1435009ed vulkan_common: Rename renderer_vulkan/wrapper.h to vulkan_common/vulkan_wrapper.h
Allows sharing Vulkan wrapper code between different rendering backends.
2020-12-31 02:07:14 -03:00
ReinUsesLisp
9764c13d6d video_core: Rewrite the texture cache
The current texture cache has several points that hurt maintainability
and performance. It's easy to break unrelated parts of the cache
when doing minor changes. The cache can easily forget valuable
information about the cached textures by CPU writes or simply by its
normal usage.

This commit aims to address those issues.
2020-12-30 03:38:50 -03:00
Lioncash
f95602f152 video_core: Resolve more variable shadowing scenarios pt.3
Cleans out the rest of the occurrences of variable shadowing and makes
any further occurrences of shadowing compiler errors.
2020-12-05 16:02:23 -05:00
Lioncash
414a87a4f4 video_core: Resolve more variable shadowing scenarios pt.2
Migrates the video core code closer to enabling variable shadowing
warnings as errors.

This primarily sorts out shadowing occurrences within the Vulkan code.
2020-12-05 06:39:35 -05:00
ReinUsesLisp
e4e0abc418 vk_graphics_pipeline: Manage primitive topology as fixed state
Vulkan has requirements for primitive topologies that don't play nicely
with yuzu's. Since it's only 4 bits, we can move it to fixed state
without changing the size of the pipeline key.

- Fixes a regression on recent Nvidia drivers on Fire Emblem: Three
  Houses.
2020-10-13 04:08:33 -03:00
ReinUsesLisp
58b0ae84b5 renderer_vulkan: Make unconditional use of VK_KHR_timeline_semaphore
This reworks how host<->device synchronization works on the Vulkan
backend. Instead of "protecting" resources with a fence and signalling
these as free when the fence is known to be signalled by the host GPU,
use timeline semaphores.

Vulkan timeline semaphores allow us to work with a subset of D3D12
fences. As far as we are concerned, timeline semaphores are a value set
by the host or the device that can be waited on by either of them.

Taking advantage of this, we can have a monotonically increasing
atomic value for each submission to the graphics queue. Instead of
protecting resources with a fence, we simply store the current logical
tick (the atomic value stored in CPU memory). When we want to know if a
resource is free, it can be compared to the current GPU tick.

This greatly simplifies resource management code and the free status of
resources should have fewer false negatives.

To work around bugs in validation layers, when these are attached there's
a thread waiting for timeline semaphores.
2020-09-19 01:46:37 -03:00
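The tick comparison described in this commit can be sketched with two counters. `TickTracker` and its members are hypothetical names; in a real backend the GPU-side value would be read from the timeline semaphore (for example with vkGetSemaphoreCounterValue) rather than stored by hand.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch of tick-based resource tracking; not yuzu's class.
class TickTracker {
public:
    // Tick that the next submission will signal.
    std::uint64_t CurrentTick() const {
        return current_tick.load(std::memory_order_relaxed);
    }

    // Called on each queue submission; returns the tick that submission signals.
    std::uint64_t NextTick() {
        return current_tick.fetch_add(1, std::memory_order_relaxed);
    }

    // Stand-in for querying the timeline semaphore's counter value.
    void OnGpuProgress(std::uint64_t completed_tick) {
        gpu_tick.store(completed_tick, std::memory_order_release);
    }

    // A resource last used at `tick` is free once the GPU has reached it.
    bool IsFree(std::uint64_t tick) const {
        return gpu_tick.load(std::memory_order_acquire) >= tick;
    }

private:
    std::atomic<std::uint64_t> current_tick{1};
    std::atomic<std::uint64_t> gpu_tick{0};
};
```

A resource only needs to remember one integer, the tick of its last use, instead of owning a fence object.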
ReinUsesLisp
9e87193725 video_core: Remove all Core::System references in renderer
Now that the GPU is initialized when video backends are initialized,
it's no longer needed to query components once the game is running: it
can be done when yuzu is booting.

This allows us to pass components between constructors and in the
process remove all Core::System references in the video backend.
2020-09-06 05:28:48 -03:00
ReinUsesLisp
aed6011d7c vk_state_tracker: Fix primitive topology
Track the current primitive topology with a regular comparison
instead of using dirty flags.

This fixes a bug in dirty flags for this particular state and it also
avoids unnecessary state changes as this property is stored in a
frequently changed bit field.
2020-08-20 23:07:30 -03:00
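Tracking a value by comparison rather than by a dirty flag might be sketched like this. The class and method names are illustrative assumptions, and the topology is reduced to a plain integer.

```cpp
#include <cstdint>

// Hypothetical tracker: remembers the last topology that was bound.
class TopologyTracker {
public:
    // Returns true only when the value differs from the last one seen,
    // so callers can skip redundant state changes.
    bool Change(std::uint32_t new_topology) {
        const bool changed = current != new_topology;
        current = new_topology;
        return changed;
    }

private:
    // Sentinel that no valid topology is assumed to use.
    std::uint32_t current = 0xFFFFFFFFu;
};
```

Because the comparison looks at the value itself, unrelated writes to the same bit field do not trigger a state change.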
ameerj
1b829fbd7a move thread 1/4 count computation into allocate workers method 2020-08-16 12:02:22 -04:00
ameerj
31a76410e8 Address feedback, add shader compile notifier, update setting text 2020-08-16 12:02:22 -04:00
ameerj
4539073ce1 Address feedback. Bruteforce delete duplicates 2020-08-16 12:02:22 -04:00
ameerj
6ac97405df Vk Async pipeline compilation 2020-08-16 12:02:22 -04:00
Lioncash
06809ad7bc vulkan: Silence more -Wmissing-field-initializer warnings 2020-08-03 12:28:57 -04:00
Lioncash
4b369126c4 vk_rasterizer: Remove unused variable in Clear()
The relevant values are already assigned further down in the lambda, so
this can be removed entirely.
2020-07-21 00:49:10 -04:00
Lioncash
01f297f2e0 vk_rasterizer: Make use of designated initializers where applicable 2020-07-16 18:49:42 -04:00
ReinUsesLisp
fca26980a2 vk_rasterizer: Pass <pSizes> to CmdBindVertexBuffers2EXT
This has been fixed in Nvidia's public beta driver 451.74. The previous
beta driver will be broken, people using these will have to update.
2020-07-10 18:15:32 -03:00
ReinUsesLisp
9d55e5586f vk_rasterizer: Use nullptr for <pSizes> in CmdBindVertexBuffers2EXT
Disable this temporarily.
2020-06-26 20:57:22 -03:00
ReinUsesLisp
8584a77eb2 vk_pipeline_cache: Avoid hashing and comparing dynamic state when possible
With extended dynamic states, some bytes don't have to be collected from
the pipeline key, hence we can avoid hashing and comparing them on
lookups.
2020-06-26 20:57:22 -03:00
ReinUsesLisp
c94b398f14 vk_rasterizer: Use VK_EXT_extended_dynamic_state 2020-06-26 20:57:22 -03:00
ReinUsesLisp
c387a72c76 fixed_pipeline_state: Add requirements for VK_EXT_extended_dynamic_state
This moves dynamic state present in VK_EXT_extended_dynamic_state to a
separate structure in FixedPipelineState. This structure is placed at
the bottom, allowing us to hash and memcmp it only when the extension is
not supported.
2020-06-26 20:55:15 -03:00
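Placing the dynamic state at the end of the key means the hashed and compared byte range can simply be shortened. The sketch below assumes a made-up four-field `FixedState` and uses FNV-1a as a placeholder hash; neither is taken from the codebase.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical key layout: static fields first, dynamic tail last.
struct FixedState {
    std::uint32_t topology;
    std::uint32_t polygon_mode;
    // Dynamic tail: only part of the key when the extension is unavailable.
    std::uint32_t cull_mode;
    std::uint32_t depth_test;
};
static_assert(std::is_trivially_copyable<FixedState>::value, "memcmp needs this");
static_assert(sizeof(FixedState) == 4 * sizeof(std::uint32_t), "no padding expected");

// Number of leading bytes that take part in hashing and comparison.
inline std::size_t KeySize(bool has_extended_dynamic_state) {
    return has_extended_dynamic_state ? offsetof(FixedState, cull_mode)
                                      : sizeof(FixedState);
}

inline bool Equal(const FixedState& a, const FixedState& b, bool has_eds) {
    return std::memcmp(&a, &b, KeySize(has_eds)) == 0;
}

// FNV-1a over the relevant prefix; a placeholder for the real hash function.
inline std::uint64_t Hash(const FixedState& state, bool has_eds) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&state);
    std::uint64_t hash = 14695981039346656037ULL;
    for (std::size_t i = 0; i < KeySize(has_eds); ++i) {
        hash ^= bytes[i];
        hash *= 1099511628211ULL;
    }
    return hash;
}
```

With the extension available, two states that differ only in the tail map to the same pipeline, which is the intended effect.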
bunnei
78d3b54ea7 Merge pull request #4111 from ReinUsesLisp/preserve-contents-vk
vk_rasterizer: Don't preserve contents on full screen clears
2020-06-26 18:48:12 -04:00
ReinUsesLisp
32485917ba gl_buffer_cache: Mark buffers as resident
Mark the stream buffer and cached buffers as resident and query their
addresses. This allows us to use GPU addresses for several proprietary
Nvidia extensions.
2020-06-24 02:36:14 -03:00
Rodrigo Locatti
406d298457 Merge pull request #4110 from ReinUsesLisp/direct-upload-sets
vk_update_descriptor: Upload descriptor sets data directly
2020-06-22 05:02:13 -03:00
ReinUsesLisp
cf137ea40b vk_rasterizer: Don't preserve contents on full screen clears
There's no need to load contents from the CPU when a clear resets all
the contents of the underlying memory. This is already implemented on
OpenGL and the texture cache.
2020-06-18 18:18:33 -03:00
ReinUsesLisp
7d763f060e vk_update_descriptor: Upload descriptor sets data directly
Instead of copying to a temporary payload before sending the update task
to the worker thread, insert elements to the payload directly.
2020-06-18 17:47:19 -03:00
MerryMage
69f38355ed vk_rasterizer: BindTransformFeedbackBuffersEXT accepts a size of type VkDeviceSize 2020-06-18 15:47:44 +01:00
bunnei
c2ea1e1bcb Merge pull request #4049 from ReinUsesLisp/separate-samplers
shader/texture: Join separate image and sampler pairs offline
2020-06-13 13:48:27 -04:00
bunnei
5633887569 Merge pull request #3986 from ReinUsesLisp/shader-cache
shader_cache: Implement a generic runtime shader cache
2020-06-12 23:14:48 -04:00
ReinUsesLisp
c95c254f3e texture_cache: Implement rendering to 3D textures
This allows rendering to 3D textures with more than one slice.
Applications are allowed to render to more than one slice of a texture
using gl_Layer from a VTG shader.

This also requires reworking how 3D texture collisions are handled. For
now, this commit allows rendering to slices but not to miplevels. When a
render target attempts to write to a mipmap, we fall back to the previous
implementation (copying or flushing as needed).

- Fixes color correction 3D textures on UE4 games (rainbow effects).
- Allows Xenoblade games to render to 3D textures directly.
2020-06-08 05:01:00 -03:00
Rodrigo Locatti
2293e8a11a Merge pull request #4034 from ReinUsesLisp/storage-texels
vk_rasterizer: Implement storage texels and atomic image operations
2020-06-07 18:43:24 -03:00
ReinUsesLisp
678f95e4f8 vk_pipeline_cache: Use generic shader cache
Trivially port the generic shader cache to Vulkan.
2020-06-07 04:32:57 -03:00
bunnei
98671b4cfe Merge pull request #4013 from ReinUsesLisp/skip-no-xfb
vk_rasterizer: Skip transform feedbacks when extension is unavailable
2020-06-05 11:14:36 -04:00
ReinUsesLisp
5b2b6d594c shader/texture: Join separate image and sampler pairs offline
Games using D3D idioms can join images and samplers when a shader
executes, instead of baking them into a combined sampler image. This is
also possible on Vulkan.

One approach to this solution would be to use separate samplers on
Vulkan and leave this unimplemented on OpenGL, but we can't do this
because there's no consistent way of determining which constant buffer
holds a sampler and which one an image. We could in theory find the
first bit and if it's in the TIC area, it's an image; but this falls
apart when an image or sampler handle uses an index of zero.

The approach used is to look for a LOP.OR operation (at the IR level,
not the ISA level), track the constant buffers used as its sources, and
store this pair. Then, outside of shader execution, join the sampler and
image pair with a bitwise OR operation.

This approach won't work on games that truly use separate samplers in a
meaningful way. For example, pooling textures in a 2D array and
determining at runtime what sampler to use.

This invalidates OpenGL's disk shader cache :)

- Used mostly by D3D ports to Switch
2020-06-05 00:24:51 -03:00
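The host-side join step can be sketched as below. The types `CbufRef`, `SeparatePair` and `ConstBuffers` are hypothetical, and the example assumes the image and sampler words occupy disjoint bits so that OR-ing them yields a combined handle.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical location of one handle word inside a constant buffer.
struct CbufRef {
    std::size_t buffer; // Constant buffer index.
    std::size_t word;   // 32-bit word offset inside that buffer.
};

// Pair recorded while analyzing the shader's LOP.OR sources.
struct SeparatePair {
    CbufRef image;
    CbufRef sampler;
};

using ConstBuffers = std::vector<std::vector<std::uint32_t>>;

inline std::uint32_t ReadWord(const ConstBuffers& cbufs, CbufRef ref) {
    return cbufs.at(ref.buffer).at(ref.word);
}

// Performs on the host the OR that the shader would have executed.
inline std::uint32_t JoinHandle(const ConstBuffers& cbufs, const SeparatePair& pair) {
    return ReadWord(cbufs, pair.image) | ReadWord(cbufs, pair.sampler);
}
```

For instance, an image word of 0x00000005 and a sampler word of 0x00300000 join into 0x00300005 under this assumption.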