Commit graph

247 commits

Author SHA1 Message Date
ReinUsesLisp
82c2601555 video_core: Reimplement the buffer cache
Reimplement the buffer cache using cached bindings and page level
granularity for modification tracking. This also drops the usage of
shared pointers and virtual functions from the cache.

- Bindings are cached, allowing to skip work when the game changes few
  bits between draws.
- OpenGL Assembly shaders no longer copy when a region has been modified
  from the GPU to emulate constant buffers, instead GL_EXT_memory_object
  is used to alias sub-buffers within the same allocation.
- OpenGL Assembly shaders stream constant buffer data using
  glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In
  theory this should save one hash table resolve inside the driver
  compared to glBufferSubData.
- A new OpenGL stream buffer is implemented based on fences for drivers
  that are not Nvidia's proprietary, due to their low performance on
  partial glBufferSubData calls synchronized with 3D rendering (that
  some games use a lot).
- Most optimizations are shared between APIs now, allowing Vulkan to
  cache more bindings than before, skipping unnecesarry work.

This commit adds the necessary infrastructure to use Vulkan object from
OpenGL. Overall, it improves performance and fixes some bugs present on
the old cache. There are still some edge cases hit by some games that
harm performance on some vendors, this are planned to be fixed in later
commits.
2021-02-13 02:17:22 -03:00
Ameer J
26669d9e13
Merge pull request #5880 from lat9nq/ffmpeg-external
cmake: FFmpeg linking rework
2021-02-08 21:13:10 -05:00
Chloe Marcec
c5f109bc50 video_core: Delete morton
moron.h & morton.cpp are not used anywhere and are just empty files
2021-02-08 10:20:21 +11:00
lat9nq
1d19eac415 CMake: Port citra-emu/citra FindFFmpeg.cmake
Also renames related CMake variables to match both the Find*FFmpeg* and
variables defined within the file. Fixes odd errors produced by the old
FindFFmpeg.

Citra's FindFFmpeg is slightly modified here: adds Citra's copyright at
the beginning, renames FFmpeg_INCLUDES to FFmpeg_INCLUDE_DIR, disables a
few components in _FFmpeg_ALL_COMPONENTS, and adds the missing avutil
component to the comment above.
2021-02-05 15:39:19 -05:00
lat9nq
47401016bf CMake: Implement YUZU_USE_BUNDLED_FFMPEG
For Linux, instructs CMake to use the FFmpeg submodule in externals.
This is HEAVILY based on our usage of the late Unicorn.  Minimal change
to MSVC as it uses the yuzu-emu/ext-windows-bin. MinGW now targets the
same ext-windows-bin libraries as MSVC for FFmpeg. Adds FFMPEG_LIBRARIES
to WIN32 and simplifies video_core/CMakeLists.txt a bit.
2021-02-05 14:49:51 -05:00
ReinUsesLisp
966896daad video_core/cmake: Properly generate fatal errors on Aftermath
Fix "message(ERROR ..." to "message(FATAL_ERROR ..." to properly stop
cmake when Nsight Aftermath can't be configured.
2021-01-23 04:15:30 -03:00
Rodrigo Locatti
fd873fd369
Merge pull request #5262 from ReinUsesLisp/buffer-base
buffer_cache/buffer_base: Add a range tracking buffer container and tests
2021-01-16 19:48:26 -03:00
ReinUsesLisp
fade63b58e vulkan_common: Move allocator to the common directory
Allow using the abstraction from the OpenGL backend.
2021-01-15 16:19:39 -03:00
ReinUsesLisp
cc2c3e447f video_core/cmake: Remove Werror flags already defined code-base wide
These flags are already defined in src/cmake.
2021-01-15 03:37:34 -03:00
ReinUsesLisp
a4bfae1b55 buffer_cache/buffer_base: Add a range tracking buffer container
It keeps track of the modified CPU and GPU ranges on a CPU page
granularity, notifying the given rasterizer about state changes
in the tracking behavior of the buffer.

Use a small vector optimization to store buffers smaller than 256 KiB
locally instead of using free store memory allocations.
2021-01-13 04:14:58 -03:00
ReinUsesLisp
d235cf3933 renderer_vulkan/nsight_aftermath_tracker: Move to vulkan_common 2021-01-04 02:22:22 -03:00
ReinUsesLisp
3753553b6a renderer_vulkan: Move device abstraction to vulkan_common 2021-01-04 02:22:22 -03:00
Rodrigo Locatti
7265e80c12
Merge pull request #5230 from ReinUsesLisp/vulkan-common
vulkan_common: Move reusable Vulkan abstractions to a separate directory
2021-01-03 17:38:29 -03:00
bunnei
25d607f5f6
Merge pull request #5208 from bunnei/service-threads
Service threads
2020-12-30 22:06:05 -08:00
ReinUsesLisp
11f0f7598d renderer_vulkan: Initialize surface in separate file
Move surface initialization code to a separate file. It's unlikely to
use this code outside of Vulkan, but keeping platform-specific code
(Win32, Xlib, Wayland) in its own translation unit keeps things cleaner.
2020-12-31 02:07:33 -03:00
ReinUsesLisp
47843b4f09 renderer_vulkan: Create debug callback on separate file and throw
Initialize debug callbacks (messenger) from a separate file. This allows
sharing code with different backends.

Change our Vulkan error handling to use exceptions instead of error
codes, simplifying the initialization process.
2020-12-31 02:07:33 -03:00
ReinUsesLisp
25f88d99ce renderer_vulkan: Move instance initialization to a separate file
Simplify Vulkan's backend initialization code by moving it to a separate
file, allowing us to initialize a Vulkan instance from different
backends.
2020-12-31 02:07:33 -03:00
ReinUsesLisp
d1435009ed vulkan_common: Rename renderer_vulkan/wrapper.h to vulkan_common/vulkan_wrapper.h
Allows sharing Vulkan wrapper code between different rendering backends.
2020-12-31 02:07:14 -03:00
ReinUsesLisp
d937421422 vulkan_common: Move dynamic library load to a separate file
Allows us to initialize a Vulkan dynamic library from different backends
without duplicating code.
2020-12-31 02:02:48 -03:00
ReinUsesLisp
9764c13d6d video_core: Rewrite the texture cache
The current texture cache has several points that hurt maintainability
and performance. It's easy to break unrelated parts of the cache
when doing minor changes. The cache can easily forget valuable
information about the cached textures by CPU writes or simply by its
normal usage.The current texture cache has several points that hurt
maintainability and performance. It's easy to break unrelated parts
of the cache when doing minor changes. The cache can easily forget
valuable information about the cached textures by CPU writes or simply
by its normal usage.

This commit aims to address those issues.
2020-12-30 03:38:50 -03:00
ReinUsesLisp
9106ac1e6b video_core: Add a delayed destruction ring abstraction 2020-12-30 02:10:19 -03:00
bunnei
14c825bd1c video_core: gpu: Refactor out synchronous/asynchronous GPU implementations.
- We must always use a GPU thread now, even with synchronous GPU.
2020-12-28 16:33:48 -08:00
Rodrigo Locatti
0dc4ab42cc
Merge pull request #5226 from ReinUsesLisp/c4715-vc
video_core: Enforce C4715 (not all control paths return a value)
2020-12-25 03:11:47 -03:00
ReinUsesLisp
1b9e08ab78 cmake: Always enable Vulkan
Removes the unnecesary burden of maintaining separate #ifdef paths and
allows us sharing generic Vulkan code across APIs.
2020-12-24 21:07:24 -03:00
ReinUsesLisp
1e191cc837 video_core: Enforce C4715 (not all control paths return a value)
Most of the time people write code that always returns a value,
terminates execution, throws an exception, or uses an unconventional
jump primitive.

This is not always true when we build without asserts on mainline builds.
To avoid introducing undefined behavior on our most used builds, enforce
this warning signalling an error and stopping the build from shipping.
2020-12-24 21:01:23 -03:00
Lioncash
f95602f152 video_core: Resolve more variable shadowing scenarios pt.3
Cleans out the rest of the occurrences of variable shadowing and makes
any further occurrences of shadowing compiler errors.
2020-12-05 16:02:23 -05:00
LC
978e7897a3
Merge pull request #4848 from ReinUsesLisp/type-limits
video_core: Enforce -Werror=type-limits
2020-10-28 03:16:10 -04:00
ReinUsesLisp
79da90cea8 video_core: Enforce -Wredundant-move and -Wpessimizing-move
Silence three warnings and make them errors to avoid introducing more in the future.
2020-10-28 02:44:50 -03:00
ReinUsesLisp
4a451e5849 video_core: Enforce -Werror=type-limits
Silences one warning and avoids introducing more in the future.
2020-10-28 02:37:47 -03:00
ameerj
eb67a45ca8 video_core: NVDEC Implementation
This commit aims to implement the NVDEC (Nvidia Decoder) functionality, with video frame decoding being handled by the FFmpeg library.

The process begins with Ioctl commands being sent to the NVDEC and VIC (Video Image Composer) emulated devices. These allocate the necessary GPU buffers for the frame data, along with providing information on the incoming video data. A Submit command then signals the GPU to process and decode the frame data.

To decode the frame, the respective codec's header must be manually composed from the information provided by NVDEC, then sent with the raw frame data to the ffmpeg library.

Currently, H264 and VP9 are supported, with VP9 having some minor artifacting issues related mainly to the reference frame composition in its uncompressed header.

Async GPU is not properly implemented at the moment.

Co-Authored-By: David <25727384+ogniK5377@users.noreply.github.com>
2020-10-26 23:07:36 -04:00
Lioncash
678d012c2c video_core: Conditially activate relevant compiler warnings
These compiler flags aren't shared with clang, so specifying these flags
unconditionally can lead to a bit of warning spam.

While we're in the area, we can also enable -Wunused-but-set-parameter
given this is almost always a bug.
2020-10-20 20:28:25 -04:00
ReinUsesLisp
e1600b0962 video_core: Enforce -Wclass-memaccess 2020-10-09 16:46:11 -03:00
ReinUsesLisp
2a24b1c973 video_core: Enforce -Wunused-variable and -Wunused-but-set-variable 2020-10-02 21:19:35 -03:00
ReinUsesLisp
58b0ae84b5 renderer_vulkan: Make unconditional use of VK_KHR_timeline_semaphore
This reworks how host<->device synchronization works on the Vulkan
backend. Instead of "protecting" resources with a fence and signalling
these as free when the fence is known to be signalled by the host GPU,
use timeline semaphores.

Vulkan timeline semaphores allow use to work on a subset of D3D12
fences. As far as we are concerned, timeline semaphores are a value set
by the host or the device that can be waited by either of them.

Taking advantange of this, we can have a monolithically increasing
atomic value for each submission to the graphics queue. Instead of
protecting resources with a fence, we simply store the current logical
tick (the atomic value stored in CPU memory). When we want to know if a
resource is free, it can be compared to the current GPU tick.

This greatly simplifies resource management code and the free status of
resources should have less false negatives.

To workaround bugs in validation layers, when these are attached there's
a thread waiting for timeline semaphores.
2020-09-19 01:46:37 -03:00
ReinUsesLisp
eb914b6c50 video_core: Enforce -Werror=switch
This forces us to fix all -Wswitch warnings in video_core.
2020-09-16 17:48:01 -03:00
ReinUsesLisp
91df2beee3 video_core/host_shaders: Add CMake integration for string shaders
Add the necessary CMake code to copy the contents in a string source
shader (GLSL or GLASM) to a header file then consumed by video_core
files.

This allows editting GLSL in its own files without having to maintain
them in source files.

For now, only OpenGL presentation shaders are moved, but we can add
GLASM presentation shaders and static SPIR-V generation through
glslangValidator in the future.
2020-08-23 21:37:20 -03:00
David Marcec
468bd9c1b0 async shaders 2020-07-17 14:24:57 +10:00
ReinUsesLisp
1d6be9febf video_core/compatible_formats: Table to test if two formats are legal to view or copy
Add a flat table to test if it's legal to create a texture view between
two formats or copy betweem them.

This table is based on ARB_copy_image and ARB_texture_view. Copies are
more permissive than views.
2020-06-26 19:28:11 -03:00
David Marcec
6ce5f3120b Macro HLE support 2020-06-24 12:09:01 +10:00
bunnei
798ec003ce
Merge pull request #4041 from ReinUsesLisp/arb-decomp
gl_arb_decompiler: Implement an assembly shader decompiler
2020-06-16 14:56:23 -04:00
ReinUsesLisp
a63a0daa5e gl_arb_decompiler: Implement an assembly shader decompiler
Emit code compatible with NV_gpu_program5.
This should emit code compatible with Fermi, but it wasn't tested on
that architecture. Pascal has some issues not present on Turing GPUs.
2020-06-11 22:12:07 -03:00
ReinUsesLisp
abcea1bb18 rasterizer_cache: Remove files and includes
The rasterizer cache is no longer used. Each cache has its own generic
implementation optimized for the cached data.
2020-06-07 04:32:57 -03:00
ReinUsesLisp
dc27252352 shader_cache: Implement a generic shader cache
Implement a generic shader cache for fast lookups and invalidations.
Invalidations are cheap but expensive when a shader is invalidated.

Use two mutexes instead of one to avoid locking invalidations for
lookups and vice versa. When a shader has to be removed, lookups are
locked as expected.
2020-06-07 04:32:32 -03:00
David Marcec
b032ebdfee Implement macro JIT 2020-05-30 11:40:04 +10:00
David Marcec
d0bdd26c26 Add xbyak external 2020-05-30 10:55:27 +10:00
ReinUsesLisp
a2dcc642c1 map_interval: Add interval allocator and drop hack
Drop the std::list hack to allocate memory indefinitely.

Instead use a custom allocator that keeps references valid until
destruction. This allocates fixed chunks of memory and puts pointers in
a free list. When an allocation is no longer used put it back to the
free list, this doesn't heap allocate because std::vector doesn't change
the capacity. If the free list is empty, allocate a new chunk.
2020-05-21 16:44:00 -03:00
bunnei
41682e0888
Merge pull request #3815 from FernandoS27/command-list-2
GPU: More optimizations to GPU Command List Processing and DMA Copy Optimizations
2020-05-05 17:12:42 -04:00
Fernando Sahmkow
9df67b2095 Clang Format and Documentation. 2020-04-28 14:02:51 -04:00
ReinUsesLisp
ddd82ef42b shader/memory_util: Deduplicate code
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as
well as shader decoder code.

While we are at it, fix a bug in gl_shader_cache where compute shaders
had an start offset of a stage shader.
2020-04-26 01:38:51 -03:00
bunnei
bf2ddb8fd5
Merge pull request #3677 from FernandoS27/better-sync
Introduce Predictive Flushing and Improve ASYNC GPU
2020-04-22 22:09:38 -04:00