* ARMeilleure: Add `GFNI` detection
This is intended for utilizing the `gf2p8affineqb` instruction
* ARMeilleure: Add `gf2p8affineqb`
Not using the VEX or EVEX-form of this instruction is intentional. There
are `GFNI`-chips that do not support AVX(so no VEX encoding) such as
Tremont(Lakefield) chips as well as Jasper Lake.
13df339fe7/GenuineIntel/GenuineIntel00806A1_Lakefield_LC_InstLatX64.txt (L1297-L1299)13df339fe7/GenuineIntel/GenuineIntel00906C0_JasperLake_InstLatX64.txt (L1252-L1254)
* ARMeilleure: Add `gfni` acceleration of `Rbit_V`
Passes all `Rbit_V*` unit tests on my `i9-11900k`
* ARMeilleure: Add `gfni` acceleration of `S{l,r}i_V`
Also added a fast-path for when the shift amount is greater than the
size of the element.
* ARMeilleure: Add `gfni` acceleration of `Shl_V` and `Sshr_V`
* ARMeilleure: Increment InternalVersion
* ARMeilleure: Fix Intrinsic and Assembler Table alignment
`gf2p8affineqb` is the longest instruction name I know of. It shouldn't
get any wider than this.
* ARMeilleure: Remove SSE2+SHA requirement for GFNI
* ARMeilleure Add `X86GetGf2p8LogicalShiftLeft`
Used to generate GF(2^8) 8x8 bit-matrices for bit-shifting for the `gf2p8affineqb` instruction.
* ARMeilleure: Append `FeatureInfo7Ecx` to `FeatureInfo`
* Implemented in IR the managed methods of the Saturating region ...
... of the SoftFallback class (the SatQ ones).
The need to natively manage the Fpcr and Fpsr system registers is still a fact.
Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones).
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
* Ptc.InternalVersion = 3665
* Addressed PR feedback.
* Implemented in IR the managed methods of the ShlReg region of the SoftFallback class.
It also includes the last two SatQ ones (following up on https://github.com/Ryujinx/Ryujinx/pull/3665).
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
* Fpsr and Fpcr freed.
Handling/isolation of Fpsr and Fpcr via register for IR and via memory for Tests and Threads, with synchronization to context exchanges (explicit for SoftFloat); without having to call managed methods. Thanks to the inlining work of the previous two PRs and others in this.
Tests performed locally in both release and debug modes, in both lowcq and highcq, with FastFP to true and false (explicit FP tests included). Tested with the title Tony Hawk's PS.
Depends on shlreg.
* Update InstEmitSimdHelper.cs
* De-magic Masks.
Remove the Stride and Len flags; Fpsr.NZCV are A32 only, then moved to Fpscr: this leads to emitting less IR in reference to Get/Set Fpsr/Fpcr/Fpscr methods in reference to Mrs/Msr (A64) and Vmrs/Vmsr (A32) instructions.
* Addressed PR feedback.
* Implemented in IR the managed methods of the Saturating region ...
... of the SoftFallback class (the SatQ ones).
The need to natively manage the Fpcr and Fpsr system registers is still a fact.
Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones).
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
* Ptc.InternalVersion = 3665
* Addressed PR feedback.
* Implemented in IR the managed methods of the ShlReg region of the SoftFallback class.
It also includes the last two SatQ ones (following up on https://github.com/Ryujinx/Ryujinx/pull/3665).
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
* Update InstEmitSimdHelper.cs
* OpCodeTable: Implement Hint instructions (CSDB, SEV, SEVL, WFE, WFI, YIELD)
* A64: Remove catch-all Hint instruction
* T16: Handle unallocated hint instructions
Some thumb tests execute these assuming that they're nops.
* T32: Fill out other Hint instructions
* A32: Fill out other hint instructions
* Implement Thumb (32-bit) memory (ordered), multiply and bitfield instructions
* Remove public from interface
* Fix T32 BL immediate and implement signed and unsigned extend instructions
* Implement VRSRA, VRSHRN, VQSHRUN, VQMOVN, VQMOVUN, VQADD, VQSUB, VRHADD, VPADDL, VSUBL, VQDMULH and VMLAL Arm32 NEON instructions
* PPTC version
* Fix VQADD/VQSUB
* Improve MRC/MCR handling and exception messages
In case data is being recompiled as code, we don't want to throw at emit stage, instead we should only throw if it actually tries to execute
* Implemented in IR the managed methods of the Saturating region ...
... of the SoftFallback class (the SatQ ones).
The need to natively manage the Fpcr and Fpsr system registers is still a fact.
Contributes to https://github.com/Ryujinx/Ryujinx/issues/2917 ; I will open another PR to implement in Intrinsics-branchless the methods of the Saturation region as well (the SatXXXToXXX ones).
All instructions involved have been tested locally in both release and debug modes, in both lowcq and highcq.
* Ptc.InternalVersion = 3665
* Addressed PR feedback.
* Implement intrusive red-black tree, use it for HLE kernel block manager
* Implement TreeDictionary using IntrusiveRedBlackTree
* Implement IntervalTree using IntrusiveRedBlackTree
* Implement IntervalTree (on Ryujinx.Memory) using IntrusiveRedBlackTree
* Make PredecessorOf and SuccessorOf internal, expose Predecessor and Successor properties on the node itself
* Allocation free tree node lookup
* Initial commit with a lot of testing stuff.
* Partial Unmap Cleanup Part 1
* Fix some minor issues, hopefully windows tests.
* Disable partial unmap tests on macos for now
Weird issue.
* Goodbye magic number
* Add COMPlus_EnableAlternateStackCheck for tests
`COMPlus_EnableAlternateStackCheck` is needed for NullReferenceException handling to work on linux after registering the signal handler, due to how dotnet registers its own signal handler.
* Address some feedback
* Force retry when memory is mapped in memory tracking
This case existed before, but returning `false` no longer retries, so it would crash immediately after unprotecting the memory... Now, we return `true` to deliberately retry.
This case existed before (was just broken by this change) and I don't really want to look into fixing the issue right now. Technically, this means that on guest code partial unmaps will retry _due to this_ rather than hitting the handler. I don't expect this to cause any issues.
This should fix random crashes in Xenoblade Chronicles 2.
* Use IsRangeMapped
* Suppress MockMemoryManager.UnmapEvent warning
This event is not signalled by the mock memory manager.
* Remove 4kb mapping
* Refactor CPU interface
* Use IExecutionContext interface on SVC handler, change how CPU interrupts invokes the handlers
* Make CpuEngine take a ITickSource rather than returning one
The previous implementation had the scenario where the CPU engine had to implement the tick source in mind, like for example, when we have a hypervisor and the game can read CNTPCT on the host directly. However given that we need to do conversion due to different frequencies anyway, it's not worth it. It's better to just let the user pass the tick source and redirect any reads to CNTPCT to the user tick source
* XML docs for the public interfaces
* PPTC invalidation due to NativeInterface function name changes
* Fix build of the CPU tests
* PR feedback
* Back to the origins: Make memory manager take guest PA rather than host address once again
* Direct mapping with alias support on Windows
* Fixes and remove more of the emulated shared memory
* Linux support
* Make shared and transfer memory not depend on SharedMemoryStorage
* More efficient view mapping on Windows (no more restricted to 4KB pages at a time)
* Handle potential access violations caused by partial unmap
* Implement host mapping using shared memory on Linux
* Add new GetPhysicalAddressChecked method, used to ensure the virtual address is mapped before address translation
Also align GetRef behaviour with software memory manager
* We don't need a mirrorable memory block for software memory manager mode
* Disable memory aliasing tests while we don't have shared memory support on Mac
* Shared memory & SIGBUS handler for macOS
* Fix typo + nits + re-enable memory tests
* Set MAP_JIT_DARWIN on x86 Mac too
* Add back the address space mirror
* Only set MAP_JIT_DARWIN if we are mapping as executable
* Disable aliasing tests again (still fails on Mac)
* Fix UnmapView4KB (by not casting size to int)
* Use ref counting on memory blocks to delay closing the shared memory handle until all blocks using it are disposed
* Address PR feedback
* Make RO hold a reference to the guest process memory manager to avoid early disposal
Co-authored-by: nastys <nastys@users.noreply.github.com>
* Collapse AsSpan().Slice(..) calls into AsSpan(..)
Less code and a bit faster
* Collapse an Array.Clear(array, 0, array.Length) call to Array.Clear(array)