This file is intended to be included instead of vulkan/vulkan.hpp. It
includes declarations of unique handlers using a dynamic dispatcher
instead of a static one (which would require linking to a Vulkan
library).
When I originally added the compute assert I used the wrong
documentation. This addresses that.
The dispatch register was tested with homebrew against hardware and is
triggered by some games (e.g. Super Mario Odyssey). What exactly is
missing to get a valid program bound by this engine requires more
investigation.
Those implementations are quite costly, so there is no need to inline them to the caller.
Ressource deletion is often a performance bug, so in this way, we support to add breakpoints to them.
The idea of this cache is to avoid redundant uploads. So we are going
to cache the uploaded buffers within the stream_buffer and just reuse
the old pointers.
The next step is to implement a VBO cache on GPU memory, but for now,
I want to check the overhead of the cache management. Fetching the
buffer over PCI-E should be quite fast.
The Ryujinx macro interpreter and envydis were used as reference.
Macros are programs that are uploaded by the games during boot and can later be called by writing to their method id in a GPU command buffer.
The geometry pipeline manages data transfer between VS, GS and primitive assembler. It has known four modes:
- no GS mode: sends VS output directly to the primitive assembler (what citra currently does)
- GS mode 0: sends VS output to GS input registers, and sends GS output to primitive assembler
- GS mode 1: sends VS output to GS uniform registers, and sends GS output to primitive assembler. It also takes an index from the index buffer at the beginning of each primitive for determine the primitive size.
- GS mode 2: similar to mode 1, but doesn't take the index and uses a fixed primitive size.
hwtest shows that immediate mode also supports GS (at least for mode 0), so the geometry pipeline gets refactored into its own class for supporting both drawing mode.
In the immediate mode, some games don't set the pipeline registers to a valid value until the first attribute input, so a geometry pipeline reset flag is set in `pipeline.vs_default_attributes_setup.index` trigger, and the actual pipeline reconfigure is triggered in the first attribute input.
In the normal drawing mode with index buffer, the vertex cache is a little bit modified to support the geometry pipeline. Instead of OutputVertex, it now holds AttributeBuffer, which is the input to the geometry pipeline. The AttributeBuffer->OutputVertex conversion is done inside the pipeline vertex handler. The actual hardware vertex cache is believed to be implemented in a similar way (because this is the only way that makes sense).
Both geometry pipeline and GS unit rely on states preservation across drawing call, so they are put into the global state. In the future, the other three vertex shader units should be also placed in the global state, and a scheduler should be implemented on top of the four units. Note that the current gs_unit already allows running VS on it in the future.
Modules didn't correctly define their dependencies before, which relied
on the frontends implicitly including every module for linking to
succeed.
Also changed every target_link_libraries call to specify visibility of
dependencies to avoid leaking definitions to dependents when not
necessary.
This removes explicit checks sprinkled all over the codebase to instead
just have the SW rasterizer expose an implementation with no-ops for
most operations.
The main advantage of switching to glad from glLoadGen is that, apart
from being actively maintained, it supports a customizable entrypoint
loader function, which makes it possible to also support OpenGL ES.
Several cleanups to the buildsystem:
- Do better factoring of common libs between platforms.
- Add support to building on Windows.
- Remove Qt4 support.
- Re-sort file lists and add missing headers.
This should fix the GL loading errors that occur in some drivers due to
the use of deprecated functions by GLEW. Side benefits are more accurate
auto-completion (deprecated function and symbols don't exist) and faster
pointer loading (less entrypoints to load). In addition it removes an
external library depency, simplifying the build system a bit and
eliminating one set of binary libraries for Windows.
Screen contents are now displayed using textured quads. This can be updated to expose an FBO once an OpenGL backend for when Pica rendering is being worked on. That FBO's texture can then be applied to the quads.
Previously, FBO blitting was used in order to display screen contents, which did not work on OS X. The new textured quad approach is less of a compatibility risk.
I wrote most of this for ppsspp, so I hold full copyright over it.
In addition to the original release in ppsspp, this provides functionality to easily extend e.g. two-dimensional vectors to three-dimensional vectors.