Previously, we were mixing the raw CPU frequency and CNTFRQ.
The raw CPU frequency (1020 MHz) should've never been used as CNTPCT (whose frequency is CNTFRQ) is the only counter available.
The precision of sleep_for and wait_for is limited to 1-1.5ms on Windows.
Using SleepForOneTick() allows us to sleep for exactly one interval of the current timer resolution.
This allows us to take advantage of systems that have a timer resolution of 0.5ms to reduce CPU overhead in the event loop.
This formats all copyright comments according to SPDX formatting guidelines.
Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
- Fixes a hang on shutdown when NVFlinger thread is waiting on a syncpoint that will never occur.
- Commonly observed when stopping emulation in Super Mario Odyssey.
The FPS counter was based on metrics in the nvdisp swapbuffers call. This metric would be accurate if the gpu thread/renderer were synchronous with the nvdisp service, but that's no longer the case.
This commit moves the frame counting responsibility onto the concrete renderers after their frame draw calls. Resulting in more meaningful metrics.
The displayed FPS is now made up of the average framerate between the previous and most recent update, in order to avoid distracting FPS counter updates when framerate is oscillating between close values.
The status bar update frequency was also changed from 2 seconds to 500ms.
Implements the OnClose method of the nvhost_vic device, and removes the remnants of an older implementation.
Also cleans up some of the surrounding code.
This was implicitly done by `is_powered_on = false`, however the explicit method allows us to block until the GPU is actually gone.
This should fix a race condition while removing the other subsystems while the GPU is still active.
This commit removes early placeholders for an implementation of async nvdec. With recent changes to the source code, the placeholders are no longer accurate, and can cause a nullptr dereference due to the nature of the cdma_pusher lifetime.
Instead of using a two step initialization to report errors, initialize
the GPU renderer and rasterizer on the constructor and report errors
through std::runtime_error.
fmt now automatically prints the numeric value of an enum class member
by default, so we don't need to use casts any more.
Reduces the line noise a bit.
- Use .at() instead of raw indexing when dealing with untrusted indices.
- For the special case of WaitFence with syncpoint id UINT32_MAX,
instead of crashing, log an error and ignore. This is what I get when
running Super Mario Maker 2.
This commit aims to implement the NVDEC (Nvidia Decoder) functionality, with video frame decoding being handled by the FFmpeg library.
The process begins with Ioctl commands being sent to the NVDEC and VIC (Video Image Composer) emulated devices. These allocate the necessary GPU buffers for the frame data, along with providing information on the incoming video data. A Submit command then signals the GPU to process and decode the frame data.
To decode the frame, the respective codec's header must be manually composed from the information provided by NVDEC, then sent with the raw frame data to the ffmpeg library.
Currently, H264 and VP9 are supported, with VP9 having some minor artifacting issues related mainly to the reference frame composition in its uncompressed header.
Async GPU is not properly implemented at the moment.
Co-Authored-By: David <25727384+ogniK5377@users.noreply.github.com>