There are still some other issues not addressed here, but it's a start.
Workarounds for false-positive reports:
- `RasterizerAccelerated`: Put a gigantic array behind a `unique_ptr`,
because UBSan has a [hardcoded limit](https://stackoverflow.com/questions/64531383/c-runtime-error-using-fsanitize-undefined-object-has-a-possibly-invalid-vp)
of how big it thinks objects can be, specifically when dealing with
offset-to-top values used with multiple inheritance. Hopefully this
doesn't have a performance impact.
- `QueryCacheBase::QueryCacheBase`: Avoid an operation that UBSan thinks
is UB even though it at least arguably isn't. See the link in the
comment for more information.
Fixes for correct reports:
- `PageTable`, `Memory`: Use `uintptr_t` values instead of pointers to
avoid UB from pointer overflow (when pointer arithmetic wraps around
the address space).
- `KScheduler::Reload`: `thread->GetOwnerProcess()` can be `nullptr`;
avoid calling methods on it in this case. (The existing code returns
a garbage reference to a field, which is then passed into
`LoadWatchpointArray`, and apparently it's never used, so it's
harmless in practice but still triggers UBSan.)
- `KAutoObject::Close`: This function calls `this->Destroy()`, which
overwrites the beginning of the object with junk (specifically a free
list pointer). Then it calls `this->UnregisterWithKernel()`. UBSan
complains about a type mismatch because the vtable has been
overwritten, and I believe this is indeed UB. `UnregisterWithKernel`
also loads `m_kernel` from the 'freed' object, which seems to be
technically safe (the overwriting doesn't extend as far as that
field), but seems dubious. Switch to a `static` method and load
`m_kernel` in advance.
MinGW's strftime implementation does not work and cannot be used to
determine the time zone. Besides that, the string operations are
actually unnecessary since we can get the offset from
std::localtime.
Compare localtime to gmtime to find the zone offset on all platforms.
Even though it compiles and runs fine on the latest Windows versions,
older LTSC builds will crash due to lacking support somewhere in the OS.
For now just disable it for MSVC until either Microsoft fixes this or we
no longer support 1809 LTSC.
Adds <version> since we are looking at C++ implementation version
details. Also moves exception header includes into the if preprocessor
command since we only use it there.
Windows will let you select time zones that will fail in their
own C++ implementation library. Evidently from the stack trace, we get a
runtime error to work with, so catch it and use the fallback.
This implements some missing network APIs including a large chunk of the SSL
service, enough for Mario Maker (with an appropriate mod applied) to connect to
the fan server [Open Course World](https://opencourse.world/).
Connecting to first-party servers is out of scope of this PR and is a
minefield I'd rather not step into.
## TLS
TLS is implemented with multiple backends depending on the system's 'native'
TLS library. Currently there are two backends: Schannel for Windows, and
OpenSSL for Linux. (In reality Linux is a bit of a free-for-all where there's
no one 'native' library, but OpenSSL is the closest it gets.) On macOS the
'native' library is SecureTransport but that isn't implemented in this PR.
(Instead, all non-Windows OSes will use OpenSSL unless disabled with
`-DENABLE_OPENSSL=OFF`.)
Why have multiple backends instead of just using a single library, especially
given that Yuzu already embeds mbedtls for cryptographic algorithms? Well, I
tried implementing this on mbedtls first, but the problem is TLS policies -
mainly trusted certificate policies, and to a lesser extent trusted algorithms,
SSL versions, etc.
...In practice, the chance that someone is going to conduct a man-in-the-middle
attack on a third-party game server is pretty low, but I'm a security nerd so I
like to do the right security things.
My base assumption is that we want to use the host system's TLS policies. An
alternative would be to more closely emulate the Switch's TLS implementation
(which is based on NSS). But for one thing, I don't feel like reverse
engineering it. And I'd argue that for third-party servers such as Open Course
World, it's theoretically preferable to use the system's policies rather than
the Switch's, for two reasons
1. Someday the Switch will stop being updated, and the trusted cert list,
algorithms, etc. will start to go stale, but users will still want to
connect to third-party servers, and there's no reason they shouldn't have
up-to-date security when doing so. At that point, homebrew users on actual
hardware may patch the TLS implementation, but for emulators it's simpler to
just use the host's stack.
2. Also, it's good to respect any custom certificate policies the user may have
added systemwide. For example, they may have added custom trusted CAs in
order to use TLS debugging tools or pass through corporate MitM middleboxes.
Or they may have removed some CAs that are normally trusted out of paranoia.
Note that this policy wouldn't work as-is for connecting to first-party
servers, because some of them serve certificates based on Nintendo's own CA
rather than a publicly trusted one. However, this could probably be solved
easily by using appropriate APIs to adding Nintendo's CA as an alternate
trusted cert for Yuzu's connections. That is not implemented in this PR
because, again, first-party servers are out of scope.
(If anything I'd rather have an option to _block_ connections to Nintendo
servers, but that's not implemented here.)
To use the host's TLS policies, there are three theoretical options:
a) Import the host's trusted certificate list into a cross-platform TLS
library (presumably mbedtls).
b) Use the native TLS library to verify certificates but use a cross-platform
TLS library for everything else.
c) Use the native TLS library for everything.
Two problems with option a). First, importing the trusted certificate list at
minimum requires a bunch of platform-specific code, which mbedtls does not have
built in. Interestingly, OpenSSL recently gained the ability to import the
Windows certificate trust store... but that leads to the second problem, which
is that a list of trusted certificates is [not expressive
enough](https://bugs.archlinux.org/task/41909) to express a modern certificate
trust policy. For example, Windows has the concept of [explicitly distrusted
certificates](https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-r2-and-2012/dn265983(v=ws.11)),
and macOS requires Certificate Transparency validation for some certificates
with complex rules for when it's required.
Option b) (using native library just to verify certs) is probably feasible, but
it would miss aspects of TLS policy other than trusted certs (like allowed
algorithms), and in any case it might well require writing more code, not less,
compared to using the native library for everything.
So I ended up at option c), using the native library for everything.
What I'd *really* prefer would be to use a third-party library that does option
c) for me. Rust has a good library for this,
[native-tls](https://docs.rs/native-tls/latest/native_tls/). I did search, but
I couldn't find a good option in the C or C++ ecosystem, at least not any that
wasn't part of some much larger framework. I was surprised - isn't this a
pretty common use case? Well, many applications only need TLS for HTTPS, and they can
use libcurl, which has a TLS abstraction layer internally but doesn't expose
it. Other applications only support a single TLS library, or use one of the
aforementioned larger frameworks, or are platform-specific to begin with, or of
course are written in a non-C/C++ language, most of which have some canonical
choice for TLS. But there are also many applications that have a set of TLS
backends just like this; it's just that nobody has gone ahead and abstracted
the pattern into a library, at least not a widespread one.
Amusingly, there is one TLS abstraction layer that Yuzu already bundles: the
one in ffmpeg. But it is missing some features that would be needed to use it
here (like reusing an existing socket rather than managing the socket itself).
Though, that does mean that the wiki's build instructions for Linux (and macOS
for some reason?) already recommend installing OpenSSL, so no need to update
those.
## Other APIs implemented
- Sockets:
- GetSockOpt(`SO_ERROR`)
- SetSockOpt(`SO_NOSIGPIPE`) (stub, I have no idea what this does on Switch)
- `DuplicateSocket` (because the SSL sysmodule calls it internally)
- More `PollEvents` values
- NSD:
- `Resolve` and `ResolveEx` (stub, good enough for Open Course World and
probably most third-party servers, but not first-party)
- SFDNSRES:
- `GetHostByNameRequest` and `GetHostByNameRequestWithOptions`
- `ResolverSetOptionRequest` (stub)
## Fixes
- Parts of the socket code were previously allocating a `sockaddr` object on
the stack when calling functions that take a `sockaddr*` (e.g. `accept`).
This might seem like the right thing to do to avoid illegal aliasing, but in
fact `sockaddr` is not guaranteed to be large enough to hold any particular
type of address, only the header. This worked in practice because in
practice `sockaddr` is the same size as `sockaddr_in`, but it's not how the
API is meant to be used. I changed this to allocate an `sockaddr_in` on the
stack and `reinterpret_cast` it. I could try to do something cleverer with
`aligned_storage`, but casting is the idiomatic way to use these particular
APIs, so it's really the system's responsibility to avoid any aliasing
issues.
- I rewrote most of the `GetAddrInfoRequest[WithOptions]` implementation. The
old implementation invoked the host's getaddrinfo directly from sfdnsres.cpp,
and directly passed through the host's socket type, protocol, etc. values
rather than looking up the corresponding constants on the Switch. To be
fair, these constants don't tend to actually vary across systems, but
still... I added a wrapper for `getaddrinfo` in
`internal_network/network.cpp` similar to the ones for other socket APIs, and
changed the `GetAddrInfoRequest` implementation to use it. While I was at
it, I rewrote the serialization to use the same approach I used to implement
`GetHostByNameRequest`, because it reduces the number of size calculations.
While doing so I removed `AF_INET6` support because the Switch doesn't
support IPv6; it might be nice to support IPv6 anyway, but that would have to
apply to all of the socket APIs.
I also corrected the IPC wrappers for `GetAddrInfoRequest` and
`GetAddrInfoRequestWithOptions` based on reverse engineering and hardware
testing. Every call to `GetAddrInfoRequestWithOptions` returns *four*
different error codes (IPC status, getaddrinfo error code, netdb error code,
and errno), and `GetAddrInfoRequest` returns three of those but in a
different order, and it doesn't really matter but the existing implementation
was a bit off, as I discovered while testing `GetHostByNameRequest`.
- The new serialization code is based on two simple helper functions:
```cpp
template <typename T> static void Append(std::vector<u8>& vec, T t);
void AppendNulTerminated(std::vector<u8>& vec, std::string_view str);
```
I was thinking there must be existing functions somewhere that assist with
serialization/deserialization of binary data, but all I could find was the
helper methods in `IOFile` and `HLERequestContext`, not anything that could
be used with a generic byte buffer. If I'm not missing something, then
maybe I should move the above functions to a new header in `common`...
right now they're just sitting in `sfdnsres.cpp` where they're used.
- Not a fix, but `SocketBase::Recv`/`Send` is changed to use `std::span<u8>`
rather than `std::vector<u8>&` to avoid needing to copy the data to/from a
vector when those methods are called from the TLS implementation.
Previously, we were mixing the raw CPU frequency and CNTFRQ.
The raw CPU frequency (1020 MHz) should've never been used as CNTPCT (whose frequency is CNTFRQ) is the only counter available.
Moves it from Settings to Common::TimeZone, since this algorithm doesn't
depend on the setting. It also lets us use it in other libraries.
common: Various fixes
time_zone: Don't double up the std::abs
Too many absolute values were causing mirrored time zones to resolve
as the same.
Prevents needing to deduce the non-Switch setting in core. Instead, we
deduce the meaning of this setting where the heresy is committed, in
common.
settings: Remove strftime usage
GetTimeZoneString: Use standard features
Also forces GMT on MinGW due to broken strftime.
Track the private anonymous placeholder mappings created by Unmap() and
wherever possible, replace existing placeholders with larger ones
instead of creating many small ones.
This helps with the buildup of mappings in /proc/YUZU_PID/maps after a
longer gaming session, improving stability without having to increase
vm.max_map_count to a ridiculous value. The amount of placeholder
mappings will no longer outgrow the amount of actual memfd mappings in
cases of high memory fragmentation.
Previously, yuzu would try and guess which vsync mode to use given
different scenarios, but apparently we didn't always get it right. This
exposes the separate modes in a drop-down the user can select.
If a mode isn't available in Vulkan, it defaults to FIFO.
MicroSleep allows the processor to pause for a "short" amount of time (in the microsecond range). This is useful for spin-waiting that does not require nanosecond precision.
This uses the new TPAUSE instruction introduced on Intel's newest processors as part of the waitpkg instructions. For CPUs that do not support waitpkg instructions, this is equivalent to yield().
Co-Authored-By: liamwhite <liamwhite@users.noreply.github.com>
Adds the PushModes Try and Wait to allow producers to specify how they want to push their data to the queue if the queue is full.
If the queue is full:
- Try will fail to push to the queue, returning false. Try only returns true if it successfully pushes to the queue. This may result in items not being pushed into the queue.
- Wait will wait until a slot is available to push to the queue, resulting in potential for deadlock if a consumer is not running.
The RDTSC frequency reported by CPUID is not accurate to its true frequency.
We will spawn a separate thread to calculate the true RDTSC frequency after a measurement period of 30 seconds has elapsed.
The precision of sleep_for and wait_for is limited to 1-1.5ms on Windows.
Using SleepForOneTick() allows us to sleep for exactly one interval of the current timer resolution.
This allows us to take advantage of systems that have a timer resolution of 0.5ms to reduce CPU overhead in the event loop.
This implementation provides a consistent, high performance, and high resolution clock where/when std::chrono::steady_clock does not provide sufficient precision.
StoppableTimedWait allows for a timed wait to be stopped immediately after a stop is requested.
This is useful in cases where long duration thread sleeps are needed and allows for immediate joining of waiting threads after a stop is requested.
Co-Authored-By: liamwhite <liamwhite@users.noreply.github.com>
As an optional feature which can be enabled in the advanced graphics configuration, all pipelines that get built at the initial shader loading are stored in a VkPipelineCache object and are dumped to the disk.
These vendor specific pipeline cache files are located at `/shader/GAME_ID/vulkan_pipelines.bin`. This feature was mainly added because of an issue with the AMD driver (see yuzu-emu#8507) causing invalidation of the cache files the driver builds automatically.
Narrows the include in the header to <cstddef>, since that's what houses
size_t's definition, meanwhile the <cstdint> include can be moved into
the cpp file.
Visual Studio has an option to search all files in a solution, so I
did a search in there for "default:" looking for any missing break
statements.
I've left out default statements that return something, and that throw
something, even if via ThrowInvalidType. UNREACHABLE leads towards throw
R_THROW macro leads towards a return
This also covers std::span, which does not have a const iterator.
Also renames IsSTLContainer to IsContiguousContainer to explicitly convey its semantics.
Ensures that a fixed-point value is always initialized
This likely also fixes several cases of uninitialized values being
operated on, since we have multiple areas in the codebase where the
default constructor is being used like:
Common::FixedPoint<50, 14> current_sample{};
and is then followed up with an arithmetic operation like += or
something else, which operates directly on FixedPoint's internal data
member, which would previously be uninitialized.
This calls round_up(), which is a non-const member function, so if a
fixed-point instantiation ever calls to_uint(), it'll result in a
compiler error.
This allows the member function to work.
While we're at it, we can actually mark to_long_floor() as const, since
it's not modifying any member state.
Right now this looks like a distro specific problem, but we'll have to see.
Over on Gentoo: with lz4 1.9.3 there is a lz4::lz4 library target, with 1.9.4 it's no longer
mentioned in the cmake files provided by the package. (/usr/lib64/cmake/lz4)
arch and openSUSE have lz4 1.9.4 available so I checked there,
they only have .pc files for pkg-config, so asking for "lz4::lz4" works as usual
MSVC does require "lz4::lz4" to be asked for