The RDTSC frequency reported by CPUID is not accurate to its true frequency.
We will spawn a separate thread to calculate the true RDTSC frequency after a measurement period of 30 seconds has elapsed.
The precision of sleep_for and wait_for is limited to 1-1.5ms on Windows.
Using SleepForOneTick() allows us to sleep for exactly one interval of the current timer resolution.
This allows us to take advantage of systems that have a timer resolution of 0.5ms to reduce CPU overhead in the event loop.
This implementation provides a consistent, high performance, and high resolution clock where/when std::chrono::steady_clock does not provide sufficient precision.
StoppableTimedWait allows for a timed wait to be stopped immediately after a stop is requested.
This is useful in cases where long duration thread sleeps are needed and allows for immediate joining of waiting threads after a stop is requested.
Co-Authored-By: liamwhite <liamwhite@users.noreply.github.com>
As an optional feature which can be enabled in the advanced graphics configuration, all pipelines that get built at the initial shader loading are stored in a VkPipelineCache object and are dumped to the disk.
These vendor specific pipeline cache files are located at `/shader/GAME_ID/vulkan_pipelines.bin`. This feature was mainly added because of an issue with the AMD driver (see yuzu-emu#8507) causing invalidation of the cache files the driver builds automatically.
Narrows the include in the header to <cstddef>, since that's what houses
size_t's definition, meanwhile the <cstdint> include can be moved into
the cpp file.
Visual Studio has an option to search all files in a solution, so I
did a search in there for "default:" looking for any missing break
statements.
I've left out default statements that return something, and that throw
something, even if via ThrowInvalidType. UNREACHABLE leads towards throw
R_THROW macro leads towards a return
This also covers std::span, which does not have a const iterator.
Also renames IsSTLContainer to IsContiguousContainer to explicitly convey its semantics.
Ensures that a fixed-point value is always initialized
This likely also fixes several cases of uninitialized values being
operated on, since we have multiple areas in the codebase where the
default constructor is being used like:
Common::FixedPoint<50, 14> current_sample{};
and is then followed up with an arithmetic operation like += or
something else, which operates directly on FixedPoint's internal data
member, which would previously be uninitialized.
This calls round_up(), which is a non-const member function, so if a
fixed-point instantiation ever calls to_uint(), it'll result in a
compiler error.
This allows the member function to work.
While we're at it, we can actually mark to_long_floor() as const, since
it's not modifying any member state.
Right now this looks like a distro specific problem, but we'll have to see.
Over on Gentoo: with lz4 1.9.3 there is a lz4::lz4 library target, with 1.9.4 it's no longer
mentioned in the cmake files provided by the package. (/usr/lib64/cmake/lz4)
arch and openSUSE have lz4 1.9.4 available so I checked there,
they only have .pc files for pkg-config, so asking for "lz4::lz4" works as usual
MSVC does require "lz4::lz4" to be asked for
The startup check apparently confuses other programs when yuzu launches
2 processes and then quickly closes one of them. Though this isn't
really our issues it's also not a big deal for me to add an option to
work around that issue.
Some header files, specifically for OSX and Musl libc define PAGE_SIZE to be a number
This is great except in yuzu we're using PAGE_SIZE as a variable
Specific example
`static constexpr u64 PAGE_SIZE = u64(1) << PAGE_BITS;`
PAGE_SIZE PAGE_BITS PAGE_MASK are all similar variables.
Simply deleted the underscores, and then added YUZU_ prefix
Might be worth noting that there are multiple uses in different classes/namespaces
This list may not be exhaustive
Core::Memory 12 bits (4096)
QueryCacheBase 12 bits
ShaderCache 14 bits (16384)
TextureCache 20 bits (1048576, or 1MB)
Fixes#8779
[REUSE] is a specification that aims at making file copyright
information consistent, so that it can be both human and machine
readable. It basically requires that all files have a header containing
copyright and licensing information. When this isn't possible, like
when dealing with binary assets, generated files or embedded third-party
dependencies, it is permitted to insert copyright information in the
`.reuse/dep5` file.
Oh, and it also requires that all the licenses used in the project are
present in the `LICENSES` folder, that's why the diff is so huge.
This can be done automatically with `reuse download --all`.
The `reuse` tool also contains a handy subcommand that analyzes the
project and tells whether or not the project is (still) compliant,
`reuse lint`.
Following REUSE has a few advantages over the current approach:
- Copyright information is easy to access for users / downstream
- Files like `dist/license.md` do not need to exist anymore, as
`.reuse/dep5` is used instead
- `reuse lint` makes it easy to ensure that copyright information of
files like binary assets / images is always accurate and up to date
To add copyright information of files that didn't have it I looked up
who committed what and when, for each file. As yuzu contributors do not
have to sign a CLA or similar I couldn't assume that copyright ownership
was of the "yuzu Emulator Project", so I used the name and/or email of
the commit author instead.
[REUSE]: https://reuse.software
Follow-up to 01cf05bc75
Between packages breaking, Conan always being a moving target for
minimum required CMake support, and now their moves to Conan 2.0 causing
existing packages to break, I suppose this was a long time coming. vcpkg
isn't without its drawbacks, but at the moment it seems easier on the
project to use for external packages.
Mostly removes the logic for Conan from the root CMakeLists file,
leaving basic find_package()'s in its place. Sets only the
find_package()'s that require CONFIG mode as necessary. clang and linux
CI now use the vcpkg toolchain file configured in the Docker container
when possible.
mingw CI turns off YUZU_TESTS because there's no way on the container to
run Windows executables on a Linux host anyway, and it's not easy to get
Catch2 there.
- Avoids new GCC 12 warnings when Type is of form std::optional<T>
- Makes more sense this way, because ranged is not a property which would change over time
The latest git version of GCC has issues with my diamond inheritance
shenanigans. Since that's now two compilers that don't like it I thought
it'd be best to just axe all of it and just have the two templates like
before.
This rolls the features of BasicRangedSetting into BasicSetting, and
likewise RangedSetting into Setting. It also renames them from
BasicSetting and Setting to Setting and SwitchableSetting respectively.
Now longer name corresponds to more complex thing.
While this is the primary change, we also:
- Remove the mpsc namespace and rename Queue to MPSCQueue
- Make Slot a private struct within MPSCQueue
- Remove the AlignedAllocator template argument, as we use std::allocator
- Replace instances of mask + 1 with capacity, and mask + 2 with capacity + 1
GCC/Clang treats variables within lambdas as potentially shadowing those outside the lambda, despite them not being captured inside the lambda's capture list.
Clang (rightfully) warns that we are checking for the existence of
pointer to something just allocated on the stack, which is always true.
Instead, check whether GetModuleFileNameW failed.
Co-authored-by: Mai M <mathew1800@gmail.com>
Qt's QString::toStdU16String doesn't work when compiling against the
latest libstdc++, at least when using Clang. This function effectively
does the same thing as the aforementioned one.
This formats all copyright comments according to SPDX formatting guidelines.
Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
Adds detection of additional CPU flags to cpu_detect and additions to telemetry output.
This is not exhaustive but guided by features that [dynarmic utilizes](bcfe377aaa/src/dynarmic/backend/x64/host_feature.h (L12-L33)) as well as features that are currently utilized but not reported to telemetry(invariant_tsc). This is intended to guide future optimizations.
AVX512 in particular is broken up into its individual subsets and some other processor features such as [sha](https://en.wikipedia.org/wiki/Intel_SHA_extensions) and [gfni](https://en.wikipedia.org/wiki/AVX-512#GFNI) are added to have some forward-facing data-points.
What used to be a single `CPU_Extension_x64_AVX512` telemetry field
is also broken up into individual `CPU_Extension_x64_AVX512{F,VL,CD,...}` fields.
Set the zero-enum value to Unknown
Move the Manufacterer enum into the CPUCaps structure namespace
Add "ParseManufacturer" utility-function
Fix cpu/brand string buffer sizes(!)
It is possible for virtual_offset to not be 0 when the iterator is at the beginning, and thus, std::prev(it) may be evaluated, leading to a crash in debug mode.
Co-Authored-By: Fernando S. <1731197+FernandoS27@users.noreply.github.com>
Was getting an unhandled `invalid_argument` [exception](https://en.cppreference.com/w/cpp/thread/thread/join) during
shutdown on my linux machine. This removes the need for a `StopBackendThread` function entirely since `jthread`
[automatically handles both checking if the thread is joinable and stopping the token before attempting to join](https://en.cppreference.com/w/cpp/thread/jthread/~jthread) in the case that `StartBackendThread` was never called.
Inlines implementation of exclusive instructions into JITted code,
improving performance of applications relying heavily on these
instructions.
We also fastmem these instructions for additional speed, with
support for appropriate recompilation on fastmem failure.
An unsafe optimization to disable the intercore global_monitor is also
provided, should one wish to rely solely on cmpxchg semantics for
safety.
See also: merryhime/dynarmic#664
Addresses https://github.com/yuzu-emu/yuzu/issues/7881 to fix linux
builds.
`YUZU_NON_COPYABLE` deletes the `T(const T&)` constructor which will
cause the implicitly defined default ctor/dtor to no-longer generate.
These functions allow to construct a string view from an input buffer, avoiding the copy done by the non string view counterparts. However, callers must be cognizant of the viewed buffer's lifetime to avoid a use-after-free.
The string constructor of UUID states:
Should the input string not meet the above requirements, an assert will be triggered and an invalid UUID is set instead.
This is a fixed and revised implementation of UUID that uses an array of bytes as its internal representation of a UUID instead of a u128 (which was an array of 2 u64s).
In addition to this, the generation of RFC 4122 Version 4 compliant UUIDs is also implemented.
In addition to requiring nanosecond precision, using the native clock requires that the hardware TSC has a precision greater than the emulated CPU and its clock counter.
If the build is from a non-repository, these functions will return empty. This
patch allows using defines to CMake to set version info such as
-DGIT_BRANCH=master.
CallbackStatus instances aren't the cheapest things to copy around
(relative to everything else), given that they're currently 520 bytes in
size and are currently copied numerous times when callbacks are invoked.
Instead, we can pass the status by const reference to avoid all the
copying.
In my testing, waiting for 200ms provided the same level of precision as the previous implementation when estimating the RDTSC frequency.
This significantly improves the yuzu executable launch times since we reduced the wait time from 3 seconds to 200 milliseconds.
Loop on stop_token and remove final_entry in Entry.
Move Backend thread out of Impl Constructor to its own function.
Add Start function for backend thread.
Use stop token in PopWait and check if entry filename is nullptr before logging.
VS2022 seems to introduce an optimization when moving vectors to check for equality of the element values. AlignmentAllocator needed to overload the equality operator to fix compilation of its usage in vector moving.
To keep the TAS inputs synced to the game speed even through lag spikes and loading zones, deeper access is required.
First, the `TAS::UpdateThread` has to be executed exactly once per frame. This is done by connecting it to the service method the game calls to pass parameters to the GPU: `Service::VI::QueueBuffer`.
Second, the loading time of new subareas and/or kingdoms (SMO) can vary. To counteract that, the `CPU_BOOST_MODE` can be detected: In the `APM`-interface, the call to enabling/disabling the boost mode can be caught and forwarded to the TASing system, which can pause the script execution if neccessary and enabled in the settings.
First of all, TASing requires a script to play back. The user can select the parent directory at `System -> Filesystem`, next to an option to pause TAS during loads: This requires a "hacky" setup deeper in the code and will be added in the last commit.
Also, Hotkeys are being introduced: CTRL+F5 for playback start/stop, CTRL+F6 for re-reading the script and CTRL+F7 for recording a new script.
The base playback system supports up to 8 controllers (specified by `PLAYER_NUMBER` in `tas_input.h`), which all change their inputs simulataneously when `TAS::UpdateThread` is called.
The recording system uses the controller debugger to read the state of the first controller and forwards that data to the TASing system for recording. Currently, this process sadly is not frame-perfect and pixel-accurate.
Co-authored-by: Naii-the-Baf <sfabian200@gmail.com>
Co-authored-by: Narr-the-Reg <juangerman-13@hotmail.com>
The log filter was being ignored on initialization due to the logging instance being initialized before the config instance, so the log filter was set to its default value.
This fixes that oversight, along with using descriptive exceptions instead of abort() calls.
Some system configurations may see visual regressions or lower performance using GPU decoding compared to CPU decoding. This setting provides the option for users to specify their decoding preference.
Co-Authored-By: yzct12345 <87620833+yzct12345@users.noreply.github.com>
This fixes a lost wakeup in SPSCQueue. If the reader is in just the right position, the writer's notification will be lost and this will be a problem if the writer then does something to wait on the reader.
This was discovered to affect my upcoming stacktrace PR. I don't think any performance decrease will be noticeable because an uncontended mutex is smart enough to skip the syscall. This PR might also resolve some rare deadlocks but I don't know of any examples.
This implements backtraces so we don't have to tell users how to use gdb anymore.
This prints a backtrace after abort or segfault is detected. It also fixes the log getting cut off with the last line containing only a bracket. This change lets us know what caused a crash not just what happened the few seconds before it.
I only know how to add support for Linux with GCC. Also this doesn't work outside of C/C++ such as in dynarmic or certain parts of graphics drivers. The good thing is that it'll try and just crash again but the stack frames are still there so the core dump will work just like before.
This simplifies the logging system.
This also fixes some lost messages on startup.
The simplification is simple. I removed unused functions and moved most things in the .h to the .cpp. I replaced the unnecessary linked list with its contents laid out as three member variables. Anything that went through the linked list now directly accesses the backends. Generic functions are replaced with those for each specific use case and there aren't many. This change increases coupling but we gain back more KISS and encapsulation.
With those changes it was easy to make it thread-safe. I just removed the mutex and turned a boolean atomic. I was planning to use this thread-safety in my next PR about stacktraces. It was actually async-signal-safety at first but I ended up using a different approach. Anyway getting rid of the linked list is important for that because have the list of backends constantly changing complicates things.