* kernel: Implement thread pinning support
This commit adds support for the 8.x thread pinning changes and implements the SynchronizePreemptionState syscall.
Based on reverse engineering of the 13.x kernel.
* Address gdkchan's comment
* kernel: fix missing critical section leave in SetActivity
Fix Unity games
* Implement missing bits on the interrupt handler and inline update pinning function as it cannot be generic
* Fix some bugs in SetActivity and SetCoreAndAffinityMask
* Address gdkchan's comments
* kernel: Define InfoType and make it less obscure when reading GetInfo
Also map ThreadTickCount to 25 instead of 0xF0000002, as the 13.x kernel does.
* kernel: Implement GetInfo IsApplication
* kernel: Implement GetInfo FreeThreadCount
* kernel: Fix sleep timing accuracy
This commit corrects some mistakes found while comparing the reverse-engineered
13.x kernel with our own implementation.
WaitAndCheckScheduledObjects timing accuracy was also improved.
* Make KTimeManager.WaitAndCheckScheduledObjects spin wait for sub-millisecond timeouts
Fix performance regression on Pokemon Let's Go games and possibly
others.
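A minimal sketch of the sleep-then-spin idea described above, not the actual KTimeManager code. The 1 ms threshold, the function name and the injectable `now`/`sleep` callables are assumptions made for illustration:

```python
import time

# Hypothetical cutoff: waits shorter than this are spun instead of slept.
SPIN_THRESHOLD_NS = 1_000_000  # 1 ms

def wait_until(deadline_ns, now=time.perf_counter_ns, sleep=time.sleep):
    """Block until now() >= deadline_ns and return how many coarse sleeps were used."""
    sleeps = 0
    while True:
        remaining = deadline_ns - now()
        if remaining <= 0:
            return sleeps
        if remaining > SPIN_THRESHOLD_NS:
            # Coarse phase: sleep, but leave the last millisecond to the spin phase,
            # because OS sleeps are not accurate below roughly a millisecond.
            sleep((remaining - SPIN_THRESHOLD_NS) / 1e9)
            sleeps += 1
        # Otherwise spin: the loop simply re-reads the clock until the deadline passes.
```

Spinning burns CPU time, which is why the sketch only does it for the final sub-millisecond part of a wait.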
* Address rip's comment
* kernel: Fix issues with timeout of -1 (0xFFFFFFFF)
Fixes possible hang on Pokemon DP and possibly others
Add basic support for the CFI value being passed in X18 since 11.0.0 by the official kernel.
We do not implement any random number generator in the kernel at the moment, so the KSystemControl.GenerateRandom function is stubbed.
* kernel: Clear pages allocated with SetHeapSize
Before this commit, new pages allocated by SetHeapSize were not
cleared by the kernel.
This would cause undefined data to be passed to userland, possibly
resulting in weird memory corruption.
This commit also adds support for custom heap and IPC fill values (which are
also supported by the official kernel).
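A rough sketch of the behaviour described above, assuming a heap modelled as a `bytearray`. The function name, the 4 KiB page size constant and the example fill value are illustrative and are not taken from the kernel code:

```python
PAGE_SIZE = 0x1000  # assumed page size for this sketch

def set_heap_size(heap: bytearray, new_size: int, fill_value: int = 0) -> None:
    """Resize the heap in place; newly exposed bytes are always set to fill_value."""
    if new_size % PAGE_SIZE != 0:
        raise ValueError("heap size must be page aligned")
    old_size = len(heap)
    if new_size > old_size:
        # Newly allocated pages are explicitly filled, so the guest never
        # observes stale data left behind by a previous owner of the memory.
        heap.extend(bytes([fill_value]) * (new_size - old_size))
    else:
        del heap[new_size:]

heap = bytearray()
set_heap_size(heap, 2 * PAGE_SIZE)                     # default fill: zeroed pages
set_heap_size(heap, 3 * PAGE_SIZE, fill_value=0x5A)    # hypothetical custom fill value
```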
* Remove dots at the end of KPageTableBase.MapPages new documentation
* Remove unused _stackFillValue
* kernel: Add resource limit related syscalls
This commit implements all resource limit related syscalls.
* Fix register mapping being wrong for SetResourceLimitLimitValue
* Address gdkchan's comment
* kernel: Implement SetMemoryPermission syscall
This commit implements the SetMemoryPermission syscall accurately.
This also fixes KMemoryPermission not being an unsigned 32-bit type and
adds the "DontCare" bit (used by shared memory, currently unused in
Ryujinx).
* Update MemoryPermission mask
* Address gdkchan's comments
* Fix a nit
* Address gdkchan's comment
* Add a "Pause Emulation" option and hotkey
Closes Ryujinx#1604
* Refactoring how pause is handled
* Applied suggested changes from review
* Applied suggested fixes
* Pass correct suspend type to threads for suspend/resume
* Fix NRE after stopping emulation
* Removing SimulateWakeUpMessage call after resuming emulation
* Skip suspending non-game processes
* Pause the tickCounter in the ExecutionContext
* Refactoring tickCounter pause/resume as suggested
* Fix Config migration to add pause hotkey
* Fixed pausing only application threads
* Fix exiting emulator while paused
* Avoid pause/resume while already paused/resumed
* Cleanup unused code
* Avoid restarting audio if stopping emulation while in pause.
* Added suggested changes
* Fix ConfigurationState
* Make GPU memory manager a member of GPU channel
* Move physical memory instance to the memory manager, and the caches to the physical memory
* PR feedback
* Add AddressTable<T>
* Use AddressTable<T> for dispatch
* Remove JumpTable & co.
* Add fallback for out of range addresses
* Add PPTC support
* Add documentation to `AddressTable<T>`
* Make AddressTable<T> configurable
* Fix table walk
* Fix IsMapped check
* Remove CountTableCapacity
* Add PPTC support for fast path
* Rename IsMapped to IsValid
* Remove stale comment
* Change format of address in exception message
* Add TranslatorStubs
* Split DispatchStub
Avoids recompilation of stubs during tests.
* Add hint for 64bit or 32bit
* Add documentation to `Symbol`
* Add documentation to `TranslatorStubs`
Make `TranslatorStubs` disposable as well.
* Add documentation to `SymbolType`
* Add `AddressTableEventSource` to monitor function table size
Add an EventSource which measures the amount of unmanaged bytes
allocated by AddressTable<T> instances.
dotnet-counters monitor -n Ryujinx --counters ARMeilleure
* Add `AllowLcqInFunctionTable` optimization toggle
This is to reduce the impact this change has on the test duration.
Previously, every time a test was run, the FunctionTable would be
initialized and populated so that the newly compiled test would get
registered to it.
* Implement unmanaged dispatcher
Uses the DispatchStub to dispatch into the next translation, which
allows execution to stay in unmanaged for longer and skips a
ConcurrentDictionary look up when the target translation has been
registered to the FunctionTable.
* Remove redundant null check
* Tune levels of FunctionTable
Uses 5 levels instead of 4 and changes the unit of AddressTableEventSource
from KB to MB.
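For readers unfamiliar with the structure, here is a simplified sketch of a multi-level address table with lazily allocated sub-tables and a fallback value for unmapped or out-of-range addresses. It is not the `AddressTable<T>` implementation; the class layout, the `(shift, bits)` level description and the level split used in the example are assumptions chosen for illustration:

```python
class AddressTable:
    """Sparse multi-level table mapping guest addresses to values."""

    def __init__(self, levels, fill=None):
        self.levels = list(levels)          # (shift, bits) pairs, top level first
        self.fill = fill                    # value returned for unmapped addresses
        top_shift, top_bits = self.levels[0]
        self.limit = 1 << (top_shift + top_bits)
        self.low_mask = (1 << self.levels[-1][0]) - 1  # alignment bits below the leaf
        self.allocated_slots = 0
        self.root = self._new_table(0)

    def _new_table(self, level):
        size = 1 << self.levels[level][1]
        is_leaf = level == len(self.levels) - 1
        self.allocated_slots += size
        return [self.fill if is_leaf else None] * size

    def _index(self, address, level):
        shift, bits = self.levels[level]
        return (address >> shift) & ((1 << bits) - 1)

    def is_valid(self, address):
        return 0 <= address < self.limit and (address & self.low_mask) == 0

    def set(self, address, value):
        if not self.is_valid(address):
            raise ValueError(f"address 0x{address:X} is out of range or misaligned")
        table = self.root
        for level in range(len(self.levels) - 1):
            i = self._index(address, level)
            if table[i] is None:
                table[i] = self._new_table(level + 1)  # allocate sub-table on demand
            table = table[i]
        table[self._index(address, len(self.levels) - 1)] = value

    def get(self, address):
        if not self.is_valid(address):
            return self.fill                # fallback for out-of-range addresses
        table = self.root
        for level in range(len(self.levels) - 1):
            table = table[self._index(address, level)]
            if table is None:
                return self.fill            # nothing registered under this prefix
        return table[self._index(address, len(self.levels) - 1)]

# Hypothetical 3-level split of a 24-bit, 4-byte-aligned address space.
table = AddressTable([(20, 4), (12, 8), (2, 10)], fill="stub")
table.set(0x123454, "translated function")
```

More levels mean smaller sub-tables and less memory committed per registered address, at the cost of one extra indirection per lookup, which is the trade-off being tuned here.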
* Use 64-bit function table
Improves codegen for direct branches:
  mov qword [rax+0x408],0x10603560
- mov rcx,sub_10603560_OFFSET
- mov ecx,[rcx]
- mov ecx,ecx
- mov rdx,JIT_CACHE_BASE
- add rdx,rcx
+ mov rcx,sub_10603560
+ mov rdx,[rcx]
  mov rcx,rax
Improves codegen for dispatch stub:
  and rax,byte +0x1f
- mov eax,[rcx+rax*4]
- mov eax,eax
- mov rcx,JIT_CACHE_BASE
- lea rax,[rcx+rax]
+ mov rax,[rcx+rax*8]
  mov rcx,rbx
* Remove `JitCacheSymbol` & `JitCache.Offset`
* Turn `Translator.Translate` into an instance method
This way we do not have to add more parameters to this method and related
ones as new structures are added and needed for translation.
* Add symbol only when PTC is enabled
Address LDj3SNuD's feedback
* Change `NativeContext.Running` to a 32-bit integer
* Fix PageTable symbol for host mapped
* Refactoring of KMemoryManager class
* Replace some trivial uses of DRAM address with VA
* Get rid of GetDramAddressFromVa
* Abstracting more operations on derived page table class
* Run auto-format on KPageTableBase
* Managed to make TryConvertVaToPa private, few uses remain now
* Implement guest physical pages ref counting, remove manual freeing
* Make DoMmuOperation private and call new abstract methods only from the base class
* Pass pages count rather than size on Map/UnmapMemory
* Change memory managers to take host pointers
* Fix a guest memory leak and simplify KPageTable
* Expose new methods for host range query and mapping
* Some refactoring of MapPagesFromClientProcess to allow proper page ref counting and mapping without KPageLists
* Remove more uses of AddVaRangeToPageList, now only one remains (shared memory page checking)
* Add a SharedMemoryStorage class, will be useful for host mapping
* Sayonara AddVaRangeToPageList, you served us well
* Start to implement host memory mapping (WIP)
* Support memory tracking through host exception handling
* Fix some access violations from HLE service guest memory access and CPU
* Fix memory tracking
* Fix mapping list bugs, including a race and an error adding mapping ranges
* Simple page table for memory tracking
* Simple "volatile" region handle mode
* Update UBOs directly (experimental, rough)
* Fix the overlap check
* Only set non-modified buffers as volatile
* Fix some memory tracking issues
* Fix possible race in MapBufferFromClientProcess (block list updates were not locked)
* Write uniform update to memory immediately, only defer the buffer set.
* Fix some memory tracking issues
* Pass correct pages count on shared memory unmap
* ARMeilleure Signal Handler v1 + Unix changes
Unix currently behaves like Windows, rather than remapping physical
* Actually check if the host platform is unix
* Fix decommit on linux.
* Implement Windows 10 placeholder shared memory, fix a buffer issue.
* Make PTC version something that will never match with master
* Remove testing variable for block count
* Add reference count for memory manager, fix dispose
Can still deadlock with OpenAL
* Add address validation, use page table for mapped check, add docs
Might clean up the page table traversing routines.
* Implement batched mapping/tracking.
* Move documentation, fix tests.
* Cleanup uniform buffer update stuff.
* Remove unnecessary assignment.
* Add unsafe host mapped memory switch
On by default. Would be good to turn this off for untrusted code (homebrew, exefs mods) and give the user the option to turn it on manually, though that requires some UI work.
* Remove C# exception handlers
They have issues due to current .NET limitations, so the meilleure one fully replaces them for now.
* Fix MapPhysicalMemory on the software MemoryManager.
* Null check for GetHostAddress, docs
* Add configuration for setting memory manager mode (not in UI yet)
* Add config to UI
* Fix type mismatch on Unix signal handler code emit
* Fix 6GB DRAM mode.
The size can be greater than `uint.MaxValue` when the DRAM is >4GB.
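A small arithmetic illustration of why a 32-bit size breaks here. The helper names are made up for this example; only the 6GB figure and the `uint.MaxValue` bound come from the text above:

```python
UINT32_MAX = 0xFFFF_FFFF   # same value as uint.MaxValue in C#
GIB = 1 << 30

def fits_in_u32(size: int) -> bool:
    """True if the size can be stored in an unsigned 32-bit integer."""
    return 0 <= size <= UINT32_MAX

def truncate_to_u32(size: int) -> int:
    """What a silent narrowing cast to 32 bits would keep."""
    return size & UINT32_MAX

dram_6gib = 6 * GIB                         # 0x1_8000_0000
wrapped = truncate_to_u32(dram_6gib)        # 0x8000_0000, i.e. only 2 GiB
```

Storing the size in a 64-bit unsigned type avoids the wrap-around entirely.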
* Address some feedback.
* More detailed error if backing memory cannot be mapped.
* SetLastError on all OS functions for consistency
* Force pages dirty with UBO update instead of setting them directly.
Seems to be much faster across a few games. Needs retesting.
* Rebase, configuration rework, fix mem tracking regression
* Fix race in FreePages
* Set memory managers null after decrementing ref count
* Remove readonly keyword, as this is now modified.
* Use a local variable for the signal handler rather than a register.
* Fix bug with buffer resize, and index/uniform buffer binding.
Should fix flickering in games.
* Add InvalidAccessHandler to MemoryTracking
Doesn't do anything yet
* Call invalid access handler on unmapped read/write.
Same rules as the regular memory manager.
* Make unsafe mapped memory its own MemoryManagerType
* Move FlushUboDirty into UpdateState.
* Buffer dirty cache, rather than ubo cache
Much cleaner, may be reusable for Inline2Memory updates.
* This doesn't return anything anymore.
* Add sigaction remove methods, correct a few function signatures.
* Return empty list of physical regions for size 0.
* Also on AddressSpaceManager
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Add CPU register printout when guest crashes/breaks execution
* Print out registers when undefined instruction is hit
* Apply suggestions from code review
Co-authored-by: Ac_K <Acoustik666@gmail.com>
* Fixes after rebase
* Address gdkchan's comments
Co-authored-by: Ac_K <Acoustik666@gmail.com>
Co-authored-by: Mary <me@thog.eu>
* Make all title id instances unsigned
* Replace address and size with ulong instead of signed types
Long overdue change.
Also change some logic here and there to optimize with the new memory
manager.
* Address Ac_K's comments
* Remove unneeded casts all around
* Fix some other misalignments
* Add initial implementation of the Tamper Machine
* Implement Atmosphere opcodes 0, 4 and 9
* Add missing TamperCompilationException class
* Implement Atmosphere conditional and loop opcodes 1, 2 and 3
* Implement input conditional opcode 8
* Add register store opcode A
* Implement extended pause/resume opcodes FF0 and FF1
* Implement extended log opcode FFF
* Implement extended register conditional opcode C0
* Refactor TamperProgram to an interface
* Moved Atmosphere classes to a separate subdirectory
* Fix OpProcCtrl class not setting process
* Implement extended register save/restore opcodes C1, C2 and C3
* Refactor code emitters to separate classes
* Suppress memory access errors from the Tamper Machine
* Add debug information to tamper register and memory writes
* Add block stack check to Atmosphere Cheat compiler
* Add handheld input support to Tamper Machine
* Fix code styling
* Fix build id and cheat case mismatch
* Fix invalid immediate size selection
* Print build ids of the title
* Prevent Tamper Machine from changing code regions
* Remove Atmosphere namespace
* Remove empty cheats from the list
* Prevent code modification without disabling the tampering
* Fix missing addressing mode in LoadRegisterWithMemory
* Fix wrong addressing in RegisterConditional
* Add name to the tamper machine thread
* Fix code styling
* Rewrite scheduler context switch code
* Fix race in UnmapIpcRestorePermission
* Fix thread exit issue that could leave the scheduler in an invalid state
* Change context switch method to not wait on guest thread, remove spin wait, use SignalAndWait to pass control
* Remove multi-core setting (it is always on now)
* Re-enable assert
* Remove multicore from default config and schema
* Fix race in KTimeManager
* IPC refactor part 2: Use ReplyAndReceive on HLE services and remove special handling from kernel
* Fix for applet transfer memory + some nits
* Keep handles if possible to avoid server handle table exhaustion
* Fix IPC ZeroFill bug
* am: Correctly implement CreateManagedDisplayLayer and implement CreateManagedDisplaySeparableLayer
CreateManagedDisplaySeparableLayer is required since 10.x when appletResourceUserId != 0
* Make it exit properly
* Make ServiceNotImplementedException show the full message again
* Allow yielding execution to avoid starving other threads
* Only wait if active
* Merge IVirtualMemoryManager and IAddressSpaceManager
* Fix Ro loading data from the wrong process
Co-authored-by: Thog <me@thog.eu>
* Changes to allow explicit management of service threads
* Remove now unused code
* Remove ThreadCounter, it's no longer needed
* Allow and use separate server per service, also fix exit issues
* New policy change: PTC version now uses PR number
* Logger class changes only
Now compile-time checking is possible with the help of Nullable Value
types.
* Misc formatting
* Manual optimizations
PrintGuestLog
PrintGuestStackTrace
Surfaceflinger DequeueBuffer
* Reduce SendVibrationXX log level to Debug
* Add Notice log level
This level is always enabled and used to print system info, etc...
Also, rewrite LogColor to switch expression as colors are static
* Unify unhandled exception event handlers
* Print enabled LogLevels during init
* Re-add App Exit disposes in proper order
nit: switch case spacing
* Revert PrintGuestStackTrace to Info logs due to #1407
PrintGuestStackTrace is now called in some critical error handlers
so revert to old behavior as KThread isn't part of Guest.
* Batch replace Logger statements
* Implement Modding Support
* Executables: Rewrite to use contiguous mem and Spans
* Reorder ExeFs, Npdm, ControlData and SaveData calls
After discussion with gdkchan, it was decided it's best to call
LoadExeFs after all other loads are done as it starts the guest process.
* Build RomFs manually instead of Layering FS
Layered FS approach has considerable latency when building the final
romfs. So, we manually replace files in a single romfs instance.
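The difference from layering can be sketched as a single merge pass over a file map. This is only an illustration: RomFs contents are modelled as a plain path-to-bytes dictionary, and the rule that later mods win over earlier ones is an assumption, not something stated above:

```python
def build_romfs(base_files, mods):
    """Return one flat file map with mod files replacing base files by path."""
    merged = dict(base_files)       # start from a copy of the base RomFs entries
    for mod_files in mods:
        merged.update(mod_files)    # replace or add files in place, no layering
    return merged

base = {"/data/a.bin": b"base-a", "/data/b.bin": b"base-b"}
mods = [
    {"/data/a.bin": b"mod1-a"},
    {"/data/a.bin": b"mod2-a", "/data/c.bin": b"mod2-c"},
]
romfs = build_romfs(base, mods)
```

Because the result is a single flat map, every later file lookup is one step, instead of probing each layer in turn.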
* Add RomFs modding via storage file
* Fix and cleanup MemPatch
* Add dynamically loaded NRO patching
* Support exefs file replacement
* Rewrite ModLoader to use mods-search architecture
* Disable PPTC when exefs patches are detected
Disable PPTC on exefs replacements too
* Rewrite ModLoader, again
* Increased maintainability and matches Atmosphere closely
* Creates base mods structure if it doesn't exist
* Add Exefs partition replacement
* IPSwitch: Fix nsobid parsing
* Move mod logs to new LogClass
* Allow custom suffixes to title dirs again
* Address nits
* Add a per-App "Open Mods Directory" context menu item
Creates the path if not present.
* Normalize tooltips verbiage
* Use LocalStorage and remove unused namespaces
* Implement a new physical memory manager and replace DeviceMemory
* Proper generic constraints
* Fix debug build
* Add memory tests
* New CPU memory manager and general code cleanup
* Remove host memory management from CPU project, use Ryujinx.Memory instead
* Fix tests
* Document exceptions on MemoryBlock
* Fix leak on unix memory allocation
* Proper disposal of some objects on tests
* Fix JitCache not being set as initialized
* GetRef without checks for 8-bits and 16-bits CAS
* Add MemoryBlock destructor
* Throw in separate method to improve codegen
* Address PR feedback
* QueryModified improvements
* Fix memory write tracking not marking all pages as modified in some cases
* Simplify MarkRegionAsModified
* Remove XML doc for ghost param
* Add back optimization to avoid useless buffer updates
* Add Ryujinx.Cpu project, move MemoryManager there and remove MemoryBlockWrapper
* Some nits
* Do not perform address translation when size is 0
* Address PR feedback and format NativeInterface class
* Remove ghost parameter description
* Update Ryujinx.Cpu to .NET Core 3.1
* Address PR feedback
* Fix build
* Return a well defined value for GetPhysicalAddress with invalid VA, and do not return unmapped ranges as modified
* Typo
* Implement Jump Table for Native Calls
NOTE: this slows down rejit considerably! Not recommended to be used
without codegen optimisation or AOT.
- Does not work on Linux
- A32 needs an additional commit.
* A32 Support
(WIP)
* Actually write Direct Call pointers to the table
That would help.
* Direct Calls: Rather than returning to the translator, attempt to keep within the native stack frame.
A return to the translator can still happen, but only by exceptionally
bubbling up to it.
Also:
- Always translate lowCq as a function. Faster interop with the direct
jumps, and this will be useful in future if we want to do speculative
translation.
- Tail Call Detection: after the decoding stage, detect if we do a tail
call, and avoid translating into it. Detected if a jump is made to an
address outwith the contiguous sequence of blocks surrounding the entry
point. The goal is to reduce code touched by jit and rejit.
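The tail call heuristic described above can be sketched as follows. This is an approximation for illustration only: blocks are modelled as half-open `(start, end)` address ranges, and the function names are hypothetical rather than taken from the translator:

```python
def contiguous_range(blocks, entry):
    """Return (start, end) of the run of back-to-back blocks that contains entry."""
    blocks = sorted(blocks)
    idx = next((i for i, (s, e) in enumerate(blocks) if s <= entry < e), None)
    if idx is None:
        raise ValueError("entry point is not inside any decoded block")
    lo = hi = idx
    # Grow the run downwards and upwards while neighbouring blocks touch.
    while lo > 0 and blocks[lo - 1][1] == blocks[lo][0]:
        lo -= 1
    while hi + 1 < len(blocks) and blocks[hi][1] == blocks[hi + 1][0]:
        hi += 1
    return blocks[lo][0], blocks[hi][1]

def is_tail_call(blocks, entry, target):
    """A jump leaving the contiguous run around the entry point is treated as a tail call."""
    start, end = contiguous_range(blocks, entry)
    return not (start <= target < end)

# Two adjacent blocks form the function body; the third lies elsewhere.
decoded = [(0x1000, 0x1020), (0x1020, 0x1040), (0x2000, 0x2010)]
```

With this input, a jump from the function at `0x1000` to `0x1030` stays inside the function, while a jump to `0x2000` would be classified as a tail call and not translated as part of it.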
* A32 Support
* Use smaller max function size for lowCq, fix exceptional returns
When a return has an unexpected value and there is no code block
following this one, we now return the value rather than continuing.
* CompareAndSwap (buggy)
* Ensure CompareAndSwap does not get optimized away.
* Use CompareAndSwap to make the dynamic table thread safe.
* Tail call for linux, throw on too many arguments.
* Combine CompareAndSwap 128 and 32/64.
They emit different IR instructions since their PreAllocator behaviour
is different, but now they just have one function on EmitterContext.
* Fix issues separating from optimisations.
* Use a stub to find and execute missing functions.
This allows us to skip doing many runtime comparisons and branches, and reduces the amount of code we need to emit significantly.
For the indirect call table, this stub also does the work of moving in the highCq address to the table when one is found.
* Make Jump Tables and Jit Cache dynamically resize
Reserve virtual memory, commit as needed.
* Move TailCallRemover to its own class.
* Multithreaded Translation (based on heuristic)
A poor one, at that. Need to get core count for a better one, which
means a lot of OS specific garbage.
* Better priority management for background threads.
* Bound core limit a bit more
Past a certain point the load is not parallelizable and starts stealing from the main thread. Likely due to GC, memory, heap allocation thread contention. Reduce by one core until optimisations come to improve the situation.
* Fix memory management on linux.
* Temporary solution to some sync problems.
This will make sure threads exit correctly, most of the time. There is a potential race where setting the sync counter to 0 does nothing (counter stays at what it was before, thread could take too long to exit), but we need to find a better way to do this anyways. Synchronization frequency has been tightened as we never enter blockwise segments of code. Essentially this means, check every x functions or loop iterations, before lowcq blocks existed and were worth just as much. Ideally it should be done in a better way, since functions can be anywhere from 1 to 5000 instructions. (maybe based on host timer, or an interrupt flag from a scheduler thread)
* Address feedback minus CompareAndSwap change.
* Use default ReservedRegion granularity.
* Merge CompareAndSwap with its V128 variant.
* We already got the source, no need to do it again.
* Make sure all background translation threads exit.
* Fix CompareAndSwap128
Detection criteria was a bit scuffed.
* Address Comments.
* Fix a crash when closing the main UI
Also make sure to dispose the OpenAL context to not leak memory when
unloading the emulation context.
* Improve keys and 'game already running' dialogs
* Make sure to dispose the page table and ThreadContext
Fewer memory leaks!
* Fix tests
* Address gdk's comments