Commit graph

5239 commits

Author SHA1 Message Date
Morph
81b1b71993 common: fs: Remove [[nodiscard]] attribute on Remove* functions
There are a lot of scenarios where we don't particularly care whether or not the removal operation and just simply attempt a removal.

As such, removing the [[nodiscard]] attribute is best for these functions.
2021-06-22 13:36:24 -04:00
ReinUsesLisp
4009ae1da2 bootmanager: Use std::stop_source for stopping emulation
Use its std::stop_token to abort shader cache loading.

Using std::stop_token instead of std::atomic_bool allows the usage of
other utilities like std::stop_callback.
2021-06-22 00:04:57 -03:00
ReinUsesLisp
cf116a28a6 vk_master_semaphore: Use jthread for debug thread 2021-06-21 19:56:07 -03:00
lat9nq
a01459df3d gl_device: Expand on Mesa driver names
Makes this list a bit more capable at identifying Mesa drivers. Tries to
deal with two of the overloaded vendor strings in a more generic
fashion.
2021-06-20 23:04:07 -04:00
ameerj
fb16cbb17e video_core: Add GPU vendor name to window title bar 2021-06-20 23:04:07 -04:00
Fernando Sahmkow
569a1962c0 Reaper: Guarantee correct deletion. 2021-06-20 19:11:41 +02:00
ameerj
851c76233d util_shaders: Specify ASTC decoder memory barrier bits 2021-06-19 11:16:25 -04:00
ameerj
ace20ba4a4 astc_decoder.comp: Remove unnecessary LUT SSBOs
We can move them to instead be compile time constants within the shader.
2021-06-19 10:56:13 -04:00
ameerj
31b125ef57 astc: Various robustness enhancements for the gpu decoder
These changes should help in reducing crashes/drivers panics that may
occur due to synchronization issues between the shader completion and
later access of the decoded texture.
2021-06-19 09:00:33 -04:00
ameerj
0b172d12c0 vulkan_debug_callback: Skip logging known false-positive validation errors
Avoids overwhelming the log with validation errors that are not applicable
2021-06-17 22:16:32 -04:00
Fernando Sahmkow
719a6dd5a1 Reaper: Correct size calculation on Vulkan. 2021-06-17 08:48:41 +02:00
Ameer J
c5b517aa5f
Merge pull request #6469 from ReinUsesLisp/blit-view-compat
texture_cache/util: Avoid relaxed image views on different bytes per block
2021-06-16 21:08:07 -04:00
Fernando Sahmkow
ca6f47c686 Reaper: Change memory restrictions on TC depending on host memory on VK. 2021-06-17 00:29:48 +02:00
Fernando Sahmkow
0dd98842bf Reaper: Address Feedback. 2021-06-16 21:35:03 +02:00
Fernando Sahmkow
954ad2a61e Reaper: Setup settings and final tuning. 2021-06-16 21:35:03 +02:00
Fernando Sahmkow
d8ad6aa187 Reaper: Tune it up to be an smart GC. 2021-06-16 21:35:02 +02:00
ReinUsesLisp
a11bc4a382 Initial Reaper Setup
WIP
2021-06-16 21:35:02 +02:00
ReinUsesLisp
5b1efe522e vulkan_memory_allocator: Release allocations with no commits 2021-06-16 21:35:01 +02:00
ameerj
5fc8393125 astc_decoder: Fix LDR CEM1 endpoint calculation
Per the spec, L1 is clamped to the value 0xff if it is greater than 0xff. An oversight caused us to take the maximum of L1 and 0xff, rather than the minimum.

Huge thanks to wwylele for finding this.

Co-Authored-By: Weiyi Wang <wwylele@gmail.com>
2021-06-15 20:19:01 -04:00
ameerj
b2955479e5 configure_graphics: Add Accelerate ASTC decoding setting 2021-06-15 20:19:00 -04:00
ameerj
c4ff7ecf51 textures: Reintroduce CPU ASTC decoder
Users may want to fall back to the CPU ASTC texture decoder due to hangs
and crashes that may be caused by keeping the GPU under compute heavy
loads for extended periods of time. This is especially the case in games
such as Astral Chain which make extensive use of ASTC textures.
2021-06-15 20:19:00 -04:00
ReinUsesLisp
3d89398b84 texture_cache/util: Avoid relaxed image views on different bytes per pixel
Avoids API usage errors on UE4 titles leading to crashes.
2021-06-14 21:03:57 -03:00
lat9nq
932c0184a7 cmake: Fix find_program usage for 3.15
yuzu requires CMake 3.15 yet find_program was using REQUIRED, which is
only available on 3.18 and later. Instead, we check for
"<VAR>-NOTFOUND".

In addition, check for additional requirements before building libusb or
FFmpeg with autotools. Otherwise, CMake configuration will pass yet
compilation will fail.
2021-06-13 01:15:54 -04:00
Fernando Sahmkow
588ab44470 GPUTHread: Remove async reads from Normal Accuracy. 2021-06-11 17:27:17 +02:00
ReinUsesLisp
7b0d8bd1fb rasterizer: Update pages in batches 2021-06-11 17:27:17 +02:00
Markus Wick
6755025310 Fix GCC undefined behavior sanitizer.
* Wrong alignment in u64 LOG_DEBUG -> memcpy.
* Huge shift exponent in stride calculation for linear buffer, unused result -> skipped.
* Large shift in buffer cache if word = 0, skip checking for set bits.

Non of those were critical, so this should not change any behavior.
At least with the assumption, that the last one used masking behavior, which always yield continuous_bits = 0.
2021-06-10 21:07:27 +02:00
bunnei
df91c9f5e6
Merge pull request #6410 from lat9nq/avoid-oob
decoders: Avoid out-of-bounds access
2021-06-07 10:51:17 -07:00
lat9nq
287a0f72a5 decoders: Break instead of continue
continue causes a memory leak in A Hat in Time.
2021-06-04 05:12:14 -04:00
lat9nq
1feefabeba decoders: Avoid out-of-bounds access
This is not a real fix, so assert here and continue before crashing.
2021-06-04 05:03:54 -04:00
ameerj
859ba21f6d buffer_cache: Simplify uniform disabling logic 2021-06-01 13:26:58 -04:00
bunnei
0a6f685ad0
Merge pull request #6367 from ReinUsesLisp/vma-host
vulkan_memory_allocator: Allow textures to be allocated in host memory
2021-05-31 23:35:11 -07:00
bunnei
8592f8a2b4 video_core: gpu: WaitFence: Do not block threads during shutdown.
- Fixes a hang on shutdown when NVFlinger thread is waiting on a syncpoint that will never occur.
- Commonly observed when stopping emulation in Super Mario Odyssey.
2021-05-29 01:06:04 -07:00
Markus Wick
5a8cd1b118 Fix two GCC 11 warnings: Unneeded copies.
std::move created an unneeded copy.
iterating without reference also created copies.
2021-05-29 08:57:44 +02:00
bunnei
4b95b0df97 video_core: rasterizer_cache: Use u16 for cached page count.
- Greatly reduces the risk of overflow, at the cost of doubling the size of this array.
2021-05-27 14:47:24 -07:00
ReinUsesLisp
19454e71d8 vulkan_memory_allocator: Allow textures to be allocated in host memory
Allow Vulkan's allocator to use host memory when there's no more device
local memory. This delays OOM, but it will eventually still happen.
2021-05-27 05:50:48 -03:00
Morph
065867e2c2
common: fs: Rework the Common Filesystem interface to make use of std::filesystem (#6270)
* common: fs: fs_types: Create filesystem types

Contains various filesystem types used by the Common::FS library

* common: fs: fs_util: Add std::string to std::u8string conversion utility

* common: fs: path_util: Add utlity functions for paths

Contains various utility functions for getting or manipulating filesystem paths used by the Common::FS library

* common: fs: file: Rewrite the IOFile implementation

* common: fs: Reimplement Common::FS library using std::filesystem

* common: fs: fs_paths: Add fs_paths to replace common_paths

* common: fs: path_util: Add the rest of the path functions

* common: Remove the previous Common::FS implementation

* general: Remove unused fs includes

* string_util: Remove unused function and include

* nvidia_flags: Migrate to the new Common::FS library

* settings: Migrate to the new Common::FS library

* logging: backend: Migrate to the new Common::FS library

* core: Migrate to the new Common::FS library

* perf_stats: Migrate to the new Common::FS library

* reporter: Migrate to the new Common::FS library

* telemetry_session: Migrate to the new Common::FS library

* key_manager: Migrate to the new Common::FS library

* bis_factory: Migrate to the new Common::FS library

* registered_cache: Migrate to the new Common::FS library

* xts_archive: Migrate to the new Common::FS library

* service: acc: Migrate to the new Common::FS library

* applets/profile: Migrate to the new Common::FS library

* applets/web: Migrate to the new Common::FS library

* service: filesystem: Migrate to the new Common::FS library

* loader: Migrate to the new Common::FS library

* gl_shader_disk_cache: Migrate to the new Common::FS library

* nsight_aftermath_tracker: Migrate to the new Common::FS library

* vulkan_library: Migrate to the new Common::FS library

* configure_debug: Migrate to the new Common::FS library

* game_list_worker: Migrate to the new Common::FS library

* config: Migrate to the new Common::FS library

* configure_filesystem: Migrate to the new Common::FS library

* configure_per_game_addons: Migrate to the new Common::FS library

* configure_profile_manager: Migrate to the new Common::FS library

* configure_ui: Migrate to the new Common::FS library

* input_profiles: Migrate to the new Common::FS library

* yuzu_cmd: config: Migrate to the new Common::FS library

* yuzu_cmd: Migrate to the new Common::FS library

* vfs_real: Migrate to the new Common::FS library

* vfs: Migrate to the new Common::FS library

* vfs_libzip: Migrate to the new Common::FS library

* service: bcat: Migrate to the new Common::FS library

* yuzu: main: Migrate to the new Common::FS library

* vfs_real: Delete the contents of an existing file in CreateFile

Current usages of CreateFile expect to delete the contents of an existing file, retain this behavior for now.

* input_profiles: Don't iterate the input profile dir if it does not exist

Silences an error produced in the log if the directory does not exist.

* game_list_worker: Skip parsing file if the returned VfsFile is nullptr

Prevents crashes in GetLoader when the virtual file is nullptr

* common: fs: Validate paths for path length

* service: filesystem: Open the mod load directory as read only
2021-05-25 19:32:56 -04:00
bunnei
5068279f23
Merge pull request #6248 from A-w-x/intelmesa
gl_device: Intel: Disable texture view formats workaround on mesa
2021-05-20 23:47:14 -07:00
bunnei
7d86a6ff02
Merge pull request #6317 from ameerj/fps-fix
perf_stats: Rework FPS counter to be more accurate
2021-05-18 19:56:29 -07:00
bunnei
93bc59b62d
Merge pull request #6322 from ameerj/fast-null-buffer
buffer_cache: Ensure null buffers cannot take the fast uniform bind path
2021-05-17 15:45:36 -07:00
ameerj
acf22336ec buffer_cache: Ensure null buffers cannot take the fast uniform bind path
Fixes a crash in New Pokemon Snap
2021-05-16 07:43:40 -04:00
bunnei
a1138028a8
Merge pull request #6289 from ameerj/oob-blit
texture_cache: Handle out of bound texture blits
2021-05-15 21:32:37 -07:00
ameerj
5bef54618a perf_stats: Rework FPS counter to be more accurate
The FPS counter was based on metrics in the nvdisp swapbuffers call. This metric would be accurate if the gpu thread/renderer were synchronous with the nvdisp service, but that's no longer the case.

This commit moves the frame counting responsibility onto the concrete renderers after their frame draw calls. Resulting in more meaningful metrics.
The displayed FPS is now made up of the average framerate between the previous and most recent update, in order to avoid distracting FPS counter updates when framerate is oscillating between close values.

The status bar update frequency was also changed from 2 seconds to 500ms.
2021-05-15 20:34:20 -04:00
ameerj
3671fd0a97 texture_cache: Handle out of bound texture blits
Some games interleave a texture blit using regions which are out-of-bounds. This addresses the interleaving to avoid oob reads from the src texture.
2021-05-07 22:14:21 -04:00
bunnei
2a7eff57a8 hle: kernel: Rename Process to KProcess. 2021-05-05 16:40:52 -07:00
A-w-x
6a2084a204 gl_device: Intel: Disable texture view formats workaround on mesa 2021-04-26 18:14:10 +02:00
bunnei
3c5fb53634
Merge pull request #6237 from ameerj/nvdec-end-fix
nvhost_vic: Fix device closure
2021-04-25 23:05:58 -07:00
ameerj
ae758a236f vk_texture_cache: Swap R and B channels of color flipped format
Swaps the Red and Blue channels of the A1B5G5R5_UNORM texture format, which was being incorrectly rendered.
2021-04-24 23:59:42 -04:00
ameerj
75e0d16caa nvhost_vic: Fix device closure
Implements the OnClose method of the nvhost_vic device, and removes the remnants of an older implementation.

Also cleans up some of the surrounding code.
2021-04-24 19:22:09 -04:00
Lioncash
17b7f0389a texture_cache/util: Fix src being used instead of dst within DeduceBlitImages
This line can only ever be reached if src is null, so dereferencing it
here is a logic bug that slipped through.

Instead, we dereference dst instead which is guaranteed to be valid.
2021-04-19 13:01:50 -04:00
bunnei
9ad77ba6d3
Merge pull request #6125 from ogniK5377/nvdec-close-dev
nvdrv: Cleanup CDMA Processor on device closure
2021-04-16 23:14:44 -07:00
Chloe Marcec
edb1d5d242 Address issues 2021-04-16 13:52:32 +10:00
bunnei
de5bf640b7
Merge pull request #6196 from bunnei/asserts-setting
core: settings: Add setting for debug assertions and disable by default.
2021-04-14 17:47:18 -07:00
bunnei
a4c6712a4b common: Move settings to common from core.
- Removes a dependency on core and input_common from common.
2021-04-14 16:24:03 -07:00
bunnei
8146c8c5e7
Merge pull request #6191 from lioncash/vdtor
engine_interface: Add missing virtual destructor
2021-04-13 19:59:10 -07:00
bunnei
12a343ed8d
Merge pull request #6190 from lioncash/constfn2
vk_master_semaphore: Add missing const qualifier for IsFree()
2021-04-13 17:52:38 -07:00
bunnei
62b560e8e3
Merge pull request #6188 from lioncash/bits
vk_texture_cache: Make use of bit_cast where applicable
2021-04-13 16:44:49 -07:00
bunnei
154eb3cfbe
Merge pull request #6187 from lioncash/sign-conv
texure_cache/util: Resolve implicit sign conversions with std::reduce
2021-04-13 09:46:32 -07:00
Lioncash
31932904c5 engine_interface: Add missing virtual destructor
Eliminates a potential bug vector related to inheritance. Plus, we
should generally be specifying the destructor as virtual within purely
virtual interfaces to begin with.
2021-04-12 09:53:55 -04:00
Lioncash
9b331a5fb5 vk_master_semaphore: Deduplicate atomic access within IsFree()
We can just reuse the already existing KnownGpuTick() to deduplicate the
access.
2021-04-12 09:41:55 -04:00
Lioncash
c5f5d6e7f6 vk_master_semaphore: Add missing const qualifier for IsFree()
This member function doesn't modify class state.
2021-04-12 09:41:23 -04:00
Lioncash
4198c92ed0 vk_texture_cache: Make use of Common::BitCast where applicable
Also clarify the TODO comment a little more on the lacking
implementations for std::bit_cast.
2021-04-12 09:17:36 -04:00
Lioncash
fddb278aa3 texure_cache/util: Resolve implicit sign conversions with std::reduce
Amends implicit sign conversions occurring with usages of std::reduce
and also relocates it to its own utility function to reduce verbosity a
little bit.
2021-04-12 05:21:53 -04:00
Lioncash
4209588505 query_cache: Make use of std::erase_if
Same behavior, but much more straightforward to read.
2021-04-12 04:51:18 -04:00
Rodrigo Locatti
ddbd1387aa
Merge pull request #6181 from Joshua-Ashton/robustness_features
vulkan_device: Enable EXT_robustness2 features
2021-04-11 20:42:14 -03:00
Joshua Ashton
0ec6cb942d
vk_buffer_cache: Fix offset for NULL vertex buffers
The Vulkan spec states:
If an element of pBuffers is VK_NULL_HANDLE, then the corresponding element of pOffsets must be zero.

https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/vkCmdBindVertexBuffers2EXT.html#VUID-vkCmdBindVertexBuffers2EXT-pBuffers-04112
2021-04-11 10:34:52 +01:00
Joshua Ashton
08337a492d
vulkan_device: Enable EXT_robustness2 features
When this was being made mandatory, these enablement of these features was removed, but this is still needed.

Fixes: 757fd1e917 ("vulkan_device: Require VK_EXT_robustness2")
2021-04-11 09:48:38 +01:00
Joshua Ashton
bcf58c8210
renderer_vulkan: Check return value of AcquireNextImage
We can get into a really bad state by ignoring this
leading to device loss and using incorrect resources.
2021-04-11 09:27:50 +01:00
Markus Wick
e8bd9aed8b video_core: Use a CV for blocking commands.
There is no need for a busy loop here. Let's just use a condition variable to save some power.
2021-04-07 22:38:52 +02:00
Markus Wick
e6fb49fa4b video_core/gpu_thread: Keep the write lock for allocating the fence.
Else the fence might get submited out-of-order into the queue, which makes testing them pointless.
Overhead should be tiny as the mutex is just moved from the queue to the writing code.
2021-04-07 22:38:52 +02:00
Markus Wick
5145133a60 video_core/gpu_thread: Implement a ShutDown method.
This was implicitly done by `is_powered_on = false`, however the explicit method allows us to block until the GPU is actually gone.

This should fix a race condition while removing the other subsystems while the GPU is still active.
2021-04-07 22:38:52 +02:00
Markus Wick
4aec060f6d common/threadsafe_queue: Provide Wait() method.
It shall block until there is something to consume in the queue.

And use it for the GPU emulation instead of the spin loop.
This is only in booting the emulator, however in BOTW this is the case for about 1 second.
2021-04-07 22:38:52 +02:00
lat9nq
a60653dcd3 vp9: Avoid memcpy with null pointers
Avoid sending null pointer to memcpy as reported by Undefined Behaviour
Sanitizer. Replaces the std::memcpy calls in SpliceVectors with
std::copy calls. Opting to replace all the memcpy's with copy's.

Co-authored-by: LC <mathew1800@gmail.com>
2021-04-05 00:44:38 -04:00
Rodrigo Locatti
5ee669466f
Merge pull request #5927 from ameerj/astc-compute
video_core: Accelerate ASTC texture decoding using compute shaders
2021-03-30 19:31:52 -03:00
Chloe Marcec
bf1c1788ca nvdrv: Cleanup CDMA Processor on device closure
Brings us a step closer to unifying all channels to share a common interface.
2021-03-30 20:37:40 +11:00
Jan Beich
9b50b23a50 vulkan_common: enable OpenGL interop on other Unices 2021-03-30 00:25:25 +00:00
ameerj
2f83d9a61b astc_decoder: Refactor for style and more efficient memory use 2021-03-25 16:53:51 -04:00
Jan Beich
8c016b02e7 gl_device: unblock async shaders on other Unix systems
Mesa is the primary OpenGL provider on all FreeDesktop systems.
For example, iris is used on Intel GPU + FreeBSD by default.
2021-03-24 19:59:20 +00:00
lat9nq
538f097f97 gl_device: Block async shaders on AMD and Intel
Currently, the Windows versions of the Intel OpenGL driver and the AMD
proprietary OpenGL driver do not properly support (or in fact degrade)
when asynchronous shader compilation is enabled. This blocks
specifically those drivers from using this feature. This affects
AMDGPU-PRO on Linux, and AMD's and Intel's OpenGL drivers on Windows.
2021-03-21 01:25:45 -04:00
Rodrigo Locatti
2f30c10584 astc_decoder: Reimplement Layers
Reimplements the approach to decoding layers in the compute shader. Fixes multilayer astc decoding when using Vulkan.
2021-03-13 12:16:03 -05:00
ameerj
c7553abe89 astc_decoder: Fix out of bounds memory access
resolves a crash with some anamolous textures found in Astral Chain.
2021-03-13 12:16:03 -05:00
ameerj
20eb368e14 renderer_vulkan: Accelerate ASTC decoding
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-03-13 12:16:03 -05:00
ameerj
f6566338eb host_shaders: Modify shader cmake integration to allow for larger shaders
using a raw string to encapsulate the entire shader code limits us to shaders of size less than 2KB. This change overcomes this limitation.
2021-03-13 12:16:03 -05:00
ameerj
2985e5e94c renderer_opengl: Accelerate ASTC texture decoding with a compute shader
ASTC texture decoding is currently handled by a CPU decoder for GPU's without native ASTC decoding support (most desktop GPUs). This is the cause for noticeable performance degradation in titles which use the format extensively.

This commit adds support to accelerate ASTC decoding using a compute shader on OpenGL for GPUs without native support.
2021-03-13 12:16:03 -05:00
bunnei
4735d18bb9
Merge pull request #6028 from bunnei/raster-cache
video_core: rasterizer_accelerated: Use a flat array instead of interval_map for cached pages.
2021-03-12 21:57:27 -08:00
bunnei
a9d24b0df3 video_core: rasterizer_accelerated: Fix un/signed mismatch. 2021-03-12 21:52:49 -08:00
Rodrigo Locatti
daf5c5060b
Merge pull request #5891 from ameerj/bgra-ogl
renderer_opengl: Use compute shaders to swizzle BGR textures on copy
2021-03-09 02:47:51 -03:00
bunnei
d1a7b2eca7
Merge pull request #6021 from ReinUsesLisp/skip-cache-heuristic
buffer_cache: Heuristically decide to skip cache on uniform buffers
2021-03-08 17:48:55 -08:00
ameerj
5213f70230 texture_cache: Blacklist BGRA8 copies and views on OpenGL
In order to force the BGRA8 conversion on Nvidia using OpenGL, we need to forbid texture copies and views with other formats.

This commit also adds a boolean relating to this, as this needs to be done only for the OpenGL api, Vulkan must remain unchanged.
2021-03-04 14:14:49 -05:00
ameerj
0639244d85 renderer_opengl: Swizzle BGR textures on copy
OpenGL does not natively support BGR internal formats, which causes many BGR textures to render incorrectly, with Red and Blue channels swapped.

This commit aims to address this by swizzling the blue and red channels on texture copies when a BGR format is encountered.
2021-03-04 14:14:19 -05:00
bunnei
b8b5891585
Merge pull request #5989 from ReinUsesLisp/cmdpool
vk_command_pool: Reduce the command pool size from 4096 to 4
2021-03-04 11:07:31 -08:00
bunnei
50ee9c46ab video_core: rasterizer_accelerated: Fix delta check ordering. 2021-03-02 17:48:02 -08:00
bunnei
6ab839462c video_core: rasterizer_accelerated: Improve error handling & fix implicit conversion. 2021-03-02 17:44:02 -08:00
bunnei
94da1e8a7e video_core: rasterizer_accelerated: Use a flat array instead of interval_map for cached pages.
- Uses a fixed 64MB for the cache instead of an ever growing map.
- Slightly faster by using atomics instead of a single mutex for access.
- Thanks for Rodrigo for the idea.
2021-03-02 16:57:53 -08:00
ReinUsesLisp
5ad62e7bfc buffer_cache: Heuristically decide to skip cache on uniform buffers
Some games benefit from skipping caches (Pokémon Sword), and others
don't (Animal Crossing: New Horizons). Add an heuristic to decide this
at runtime.

The cache hit ratio has to be ~98% or better to not skip the cache.
There are 16 frames of buffer.
2021-03-02 02:44:19 -03:00
ameerj
52e9d7fa49 gpu_thread: Remove Async NVDEC placeholders
This commit removes early placeholders for an implementation of async nvdec. With recent changes to the source code, the placeholders are no longer accurate, and can cause a nullptr dereference due to the nature of the cdma_pusher lifetime.
2021-02-28 22:03:00 -05:00
bunnei
55f556c53e
Merge pull request #5984 from jbeich/gcc-freebsd
common,video-core: unbreak GCC 11 build on FreeBSD 13
2021-02-27 14:15:00 -07:00
bunnei
09f7c355c6
Merge pull request #5953 from bunnei/memory-refactor-1
Kernel Rework: Memory updates and refactoring (Part 1)
2021-02-27 12:48:35 -07:00
Kelebek1
d31dbb1bc1 Implement glDepthRangeIndexeddNV 2021-02-24 22:26:53 +00:00
ReinUsesLisp
aae399c1a8 vk_command_pool: Reduce the command pool size from 4096 to 4
This allows drivers to reuse memory more easily and preallocate less.
The optimal number has been measured booting Pokémon Sword.
2021-02-23 19:08:24 -03:00
Jan Beich
1841ca4b9b video_core: add missing header after 468bd9c1b0
src/video_core/shader_notify.cpp: In member function 'void VideoCore::ShaderNotify::MarkShaderComplete()':
src/video_core/shader_notify.cpp:33:10: error: 'unique_lock' is not a member of 'std'
   33 |     std::unique_lock lock{mutex};
      |          ^~~~~~~~~~~
src/video_core/shader_notify.cpp:6:1: note: 'std::unique_lock' is defined in header '<mutex>'; did you forget to '#include <mutex>'?
    5 | #include "video_core/shader_notify.h"
  +++ |+#include <mutex>
    6 |
src/video_core/shader_notify.cpp: In member function 'void VideoCore::ShaderNotify::MarkSharderBuilding()':
src/video_core/shader_notify.cpp:38:10: error: 'unique_lock' is not a member of 'std'
   38 |     std::unique_lock lock{mutex};
      |          ^~~~~~~~~~~
src/video_core/shader_notify.cpp:38:10: note: 'std::unique_lock' is defined in header '<mutex>'; did you forget to '#include <mutex>'?
2021-02-23 00:04:36 +00:00
bunnei
20245e660f
Merge pull request #5936 from Kelebek1/Offsets
Offsets for TexelFetch and TextureGather in Vulkan
2021-02-21 21:23:45 -07:00
Morph
1a5d4d7840 gl_disk_shader_cache: Log total shader entries count on game load 2021-02-20 11:08:19 -05:00
bunnei
728ee181eb
Merge pull request #5924 from ReinUsesLisp/inline-bindings
vk_update_descriptor: Inline and improve code for binding buffers
2021-02-19 12:27:10 -08:00
bunnei
93e20867b0 hle: kernel: Migrate PageHeap/PageTable to KPageHeap/KPageTable. 2021-02-18 16:16:25 -08:00
bunnei
9cae3e6e90
Merge pull request #4973 from ameerj/nvdec-opt
nvdec: Reuse allocated buffers and general cleanup
2021-02-18 15:12:07 -08:00
ReinUsesLisp
24d0cc3ab8 vk_rasterizer: Fix loading shader addresses twice
This was recently introduced on a wrongly rebased commit.
2021-02-15 21:34:13 -03:00
bunnei
cffa6f4e62
Merge pull request #5923 from ReinUsesLisp/vk-dirty-pipeline
fixed_pipeline_cache: Use dirty flags to lazily update key
2021-02-15 13:17:27 -08:00
Kelebek1
9d8f793969 Review 1 2021-02-15 05:26:28 +00:00
Kelebek1
fb54c38631 Implement texture offset support for TexelFetch and TextureGather and add offsets for Tlds
Formatting
2021-02-15 00:36:37 +00:00
bunnei
eae9f2e440 yuzu: Various frontend improvements to avoid crashes and improve experience on Linux. 2021-02-14 00:20:41 -08:00
ReinUsesLisp
b8ffdbb167 vk_resource_pool: Load GPU tick once and compare with it
Other minor style improvements. Rename free_iterator to hint_iterator,
to describe better what it does.
2021-02-13 17:53:58 -03:00
ReinUsesLisp
21b40de318 vk_update_descriptor: Inline and improve code for binding buffers
Allow compilers with our settings inline hot code.
2021-02-13 17:46:24 -03:00
ReinUsesLisp
70353649d7 fixed_pipeline_cache: Use dirty flags to lazily update key
Use dirty flags to avoid building pipeline key from scratch on each draw
call. This saves a bit of unnecesary work on each draw call.
2021-02-13 17:44:47 -03:00
ameerj
c7325c6a4c gl_texture_cache: Lazily create non-sRGB texture views for sRGB formats
This creates non-sRGB texture views for sRGB texture formats to allow for interfacing with these views in compute shaders using imageLoad and imageStore.

Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-02-13 13:27:50 -05:00
ameerj
b675c44e49 rebase, fix name shadowing, more const 2021-02-13 13:07:56 -05:00
ameerj
3c37d66c28 Address PR feedback
Co-Authored-By: LC <712067+lioncash@users.noreply.github.com>
2021-02-13 13:07:56 -05:00
ameerj
09722cb4a7 streamline cdma_pusher/command_classes 2021-02-13 13:07:56 -05:00
ameerj
77564f987c streamline cdma_pusher/command_classes 2021-02-13 13:07:53 -05:00
ameerj
ac265a72ce nvdec cleanup 2021-02-13 13:07:31 -05:00
Morph
83227ad981
Merge pull request #5919 from ReinUsesLisp/stream-buffer-tragic
gl_stream_buffer/vk_staging_buffer_pool: Fix size check
2021-02-13 21:25:45 +08:00
ReinUsesLisp
dd9caf9aa0 vk_master_semaphore: Mark gpu_tick atomic operations with relaxed order 2021-02-13 05:57:28 -03:00
ReinUsesLisp
6171566296 vk_staging_buffer_pool: Inline tick tests
Load the current tick to a local variable, moving it out of an atomic
and allowing us to compare the value without going through a pointer
each time. This should make the loop more optimizable.
2021-02-13 05:14:11 -03:00
ReinUsesLisp
682d82faf3 gl_stream_buffer/vk_staging_buffer_pool: Fix size check
Fix a tragic off-by-one condition that causes Vulkan's stream buffer to
think it's always full, using fallback memory. The OpenGL was also
affected by this bug to a lesser extent.
2021-02-13 05:11:48 -03:00
LC
6f1ad6aa9f
Merge pull request #5916 from ameerj/maxwell-gl-unused
maxwell_to_gl: Remove unused code
2021-02-13 02:55:59 -05:00
ReinUsesLisp
757fd1e917 vulkan_device: Require VK_EXT_robustness2
We are already using robustness2 features without requiring it
explicitly, causing potential crashes on drivers without the extension.
Requiring this at boot allows better diagnostics for it and formalizes
our usage on the extension.
2021-02-13 03:31:50 -03:00
ReinUsesLisp
5b35b01070 video_core: Fix clang build issues 2021-02-13 02:26:47 -03:00
ReinUsesLisp
025fe458ae vk_staging_buffer_pool: Fix softlock when stream buffer overflows
There was still a code path that could wait on a timeline semaphore tick
that would never be signalled.

While we are at it, make use of more STL algorithms.
2021-02-13 02:18:38 -03:00
ReinUsesLisp
3a2eefb16c vk_buffer_cache: Add support for null index buffers
Games can bind a null index buffer (size=0) where all indices are
evaluated as zero. VK_EXT_robustness2 doesn't support this and all
drivers segfault when a null index buffer is passed to
vkCmdBindIndexBuffer.

Workaround this by creating a 4 byte buffer and filling it with zeroes.
If it's read out of bounds, robustness takes care of returning zeroes as
indices.
2021-02-13 02:18:38 -03:00
ReinUsesLisp
0b8b961442 buffer_cache: Add extra bytes to guest SSBOs
Bind extra bytes beyond the guest API's bound range.
This is due to some games like Astral Chain operating out of bounds.
Binding the whole map range would be technically correct, but games
have large maps that make this approach unaffordable for now.
2021-02-13 02:18:38 -03:00
ReinUsesLisp
93a69b6cc8 Merge branch 'bytes-to-map-end' into new-bufcache-wip 2021-02-13 02:18:35 -03:00
ReinUsesLisp
7402442442 vk_staging_buffer_pool: Get a staging buffer instead of waiting
Avoids waiting idle while the GPU finishes to do work, and fixes an
issue where we'd wait forever if a single command buffer (logic tick)
all the data.
2021-02-13 02:18:05 -03:00
ReinUsesLisp
0b631f22fc renderer_opengl: Remove interop
Remove unused interop code from the OpenGL backend.
2021-02-13 02:18:04 -03:00
ReinUsesLisp
3da87d3f12 gl_buffer_cache: Drop interop based parameter buffer workarounds
Sacrify runtime performance to avoid generating kernel exceptions on
Windows due to our abusive aliasing of interop buffer objects.
2021-02-13 02:17:24 -03:00
ReinUsesLisp
2b95c137ff buffer_cache: Heuristically detect stream buffers
Detect when a memory region has been joined several times and increase
the size of the created buffer on those instances. The buffer is assumed
to be a "stream buffer", increasing its size should stop us from
constantly recreating it and fragmenting memory.
2021-02-13 02:17:24 -03:00
ReinUsesLisp
ec9354d6d9 buffer_cache: Split CreateBuffer in separate functions
Allow adding functionality to each function without making CreateBuffer
more complex.
2021-02-13 02:17:24 -03:00
ReinUsesLisp
a02b4e1df6 buffer_cache: Skip cache on small uploads on Vulkan
Ports from OpenGL the optimization to skip small 3D uniform buffer
uploads. This will take advantage of the previously introduced stream
buffer.

Fixes instances where the staging buffer offset was being ignored.
2021-02-13 02:17:24 -03:00
ReinUsesLisp
35df1d1864 vk_staging_buffer_pool: Add stream buffer for small uploads
This uses a ring buffer similar to OpenGL's stream buffer for small
uploads. This stops us from allocating several small buffers, reducing
memory fragmentation and cache locality.

It uses dedicated allocations when possible.
2021-02-13 02:17:24 -03:00
ReinUsesLisp
8fd518ec40 vulkan_device: Enable robustBufferAccess
Fix regression on Pascal on Animal Crossing: New Horizons, fixing a
validation error.
2021-02-13 02:17:23 -03:00
ReinUsesLisp
82c2601555 video_core: Reimplement the buffer cache
Reimplement the buffer cache using cached bindings and page level
granularity for modification tracking. This also drops the usage of
shared pointers and virtual functions from the cache.

- Bindings are cached, allowing to skip work when the game changes few
  bits between draws.
- OpenGL Assembly shaders no longer copy when a region has been modified
  from the GPU to emulate constant buffers, instead GL_EXT_memory_object
  is used to alias sub-buffers within the same allocation.
- OpenGL Assembly shaders stream constant buffer data using
  glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In
  theory this should save one hash table resolve inside the driver
  compared to glBufferSubData.
- A new OpenGL stream buffer is implemented based on fences for drivers
  that are not Nvidia's proprietary, due to their low performance on
  partial glBufferSubData calls synchronized with 3D rendering (that
  some games use a lot).
- Most optimizations are shared between APIs now, allowing Vulkan to
  cache more bindings than before, skipping unnecesarry work.

This commit adds the necessary infrastructure to use Vulkan object from
OpenGL. Overall, it improves performance and fixes some bugs present on
the old cache. There are still some edge cases hit by some games that
harm performance on some vendors, this are planned to be fixed in later
commits.
2021-02-13 02:17:22 -03:00
ReinUsesLisp
a39d9c5194 vulkan_common: Expose interop and headless devices 2021-02-13 02:16:21 -03:00
ReinUsesLisp
47d5ec6cfc vulkan_common: Make interop extensions mandatory 2021-02-13 02:16:21 -03:00
ReinUsesLisp
40ed0cb920 vulkan_device: Enable robust buffers 2021-02-13 02:16:21 -03:00
ReinUsesLisp
1a987054c5 vulkan_device: Use designated initializers for features 2021-02-13 02:16:21 -03:00
ReinUsesLisp
79afdeaf08 vulkan_wrapper: Add memory barrier pipeline barrier helper 2021-02-13 02:16:21 -03:00
ReinUsesLisp
004a8d6a7a vulkan_device: Fix formatting of constants 2021-02-13 02:16:21 -03:00
ReinUsesLisp
16f97ded21 vulkan_wrapper: Add interop functions 2021-02-13 02:16:21 -03:00
ReinUsesLisp
9735c34f5d vulkan_instance: Initialize Vulkan instance in a separate thread
Workaround an issue on Nvidia where creating a Vulkan instance from an
active OpenGL thread disables threaded optimization on the driver.
This optimization is important to have good performance on Nvidia
OpenGL.
2021-02-13 02:16:21 -03:00
ReinUsesLisp
dde19e7d75 vulkan_wrapper: Pull Windows symbols 2021-02-13 02:16:21 -03:00
ReinUsesLisp
75ccd9959c gpu: Report renderer errors with exceptions
Instead of using a two step initialization to report errors, initialize
the GPU renderer and rasterizer on the constructor and report errors
through std::runtime_error.
2021-02-13 02:16:19 -03:00
ReinUsesLisp
9d8ca6cc4a buffer_base: Add support for cached CPU writes
Some games usually write memory pages currently used by the GPU, causing
rendering issues (e.g. flashing geometry and shadows on Link's
Awakening). To workaround this issue, Guest CPU writes are delayed until
the command buffer finishes processing, but the pages are updated
immediately.

The overall behavior is:
- CPU writes are cached until they are flushed, they update the page
  state, but don't change the modification state. Cached writes stop
  pages from being flushed, in case games have meaningful data in it.
- Command processing writes (e.g. push constants) update the page state
  and are marked to the command processor as dirty. They don't remove
  the state of cached writes.
2021-02-13 02:15:29 -03:00