2018-04-20 17:40:15 +02:00
|
|
|
#define SimdReg
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
using ARMeilleure.State;
|
|
|
|
|
2018-04-20 17:40:15 +02:00
|
|
|
using NUnit.Framework;
|
|
|
|
|
2018-09-17 06:54:05 +02:00
|
|
|
using System.Collections.Generic;
|
2018-05-12 01:10:27 +02:00
|
|
|
|
2018-04-20 17:40:15 +02:00
|
|
|
namespace Ryujinx.Tests.Cpu
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
[Category("SimdReg")]
|
2018-04-20 17:40:15 +02:00
|
|
|
public sealed class CpuTestSimdReg : CpuTest
|
|
|
|
{
|
|
|
|
#if SimdReg
|
|
|
|
|
2018-09-17 06:54:05 +02:00
|
|
|
#region "ValueSource (Types)"
|
2018-08-04 21:58:54 +02:00
|
|
|
private static ulong[] _1B1H1S1D_()
|
|
|
|
{
|
|
|
|
return new ulong[] { 0x0000000000000000ul, 0x000000000000007Ful,
|
|
|
|
0x0000000000000080ul, 0x00000000000000FFul,
|
|
|
|
0x0000000000007FFFul, 0x0000000000008000ul,
|
|
|
|
0x000000000000FFFFul, 0x000000007FFFFFFFul,
|
|
|
|
0x0000000080000000ul, 0x00000000FFFFFFFFul,
|
|
|
|
0x7FFFFFFFFFFFFFFFul, 0x8000000000000000ul,
|
|
|
|
0xFFFFFFFFFFFFFFFFul };
|
|
|
|
}
|
|
|
|
|
2018-04-21 21:15:04 +02:00
|
|
|
private static ulong[] _1D_()
|
2018-04-20 17:40:15 +02:00
|
|
|
{
|
|
|
|
return new ulong[] { 0x0000000000000000ul, 0x7FFFFFFFFFFFFFFFul,
|
|
|
|
0x8000000000000000ul, 0xFFFFFFFFFFFFFFFFul };
|
|
|
|
}
|
|
|
|
|
2018-08-10 19:27:15 +02:00
|
|
|
private static ulong[] _1H1S_()
|
|
|
|
{
|
|
|
|
return new ulong[] { 0x0000000000000000ul, 0x0000000000007FFFul,
|
|
|
|
0x0000000000008000ul, 0x000000000000FFFFul,
|
|
|
|
0x000000007FFFFFFFul, 0x0000000080000000ul,
|
|
|
|
0x00000000FFFFFFFFul };
|
|
|
|
}
|
|
|
|
|
|
|
|
private static ulong[] _4H2S_()
|
|
|
|
{
|
|
|
|
return new ulong[] { 0x0000000000000000ul, 0x7FFF7FFF7FFF7FFFul,
|
|
|
|
0x8000800080008000ul, 0x7FFFFFFF7FFFFFFFul,
|
|
|
|
0x8000000080000000ul, 0xFFFFFFFFFFFFFFFFul };
|
|
|
|
}
|
|
|
|
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
private static ulong[] _4H2S1D_()
|
|
|
|
{
|
|
|
|
return new ulong[] { 0x0000000000000000ul, 0x7FFF7FFF7FFF7FFFul,
|
|
|
|
0x8000800080008000ul, 0x7FFFFFFF7FFFFFFFul,
|
|
|
|
0x8000000080000000ul, 0x7FFFFFFFFFFFFFFFul,
|
|
|
|
0x8000000000000000ul, 0xFFFFFFFFFFFFFFFFul };
|
|
|
|
}
|
|
|
|
|
|
|
|
private static ulong[] _8B_()
|
|
|
|
{
|
|
|
|
return new ulong[] { 0x0000000000000000ul, 0x7F7F7F7F7F7F7F7Ful,
|
|
|
|
0x8080808080808080ul, 0xFFFFFFFFFFFFFFFFul };
|
|
|
|
}
|
|
|
|
|
2021-01-04 23:45:54 +01:00
|
|
|
private static ulong[] _8B1D_()
|
|
|
|
{
|
|
|
|
return new ulong[] { 0x0000000000000000ul, 0x7F7F7F7F7F7F7F7Ful,
|
|
|
|
0x8080808080808080ul, 0x7FFFFFFFFFFFFFFFul,
|
|
|
|
0x8000000000000000ul, 0xFFFFFFFFFFFFFFFFul };
|
|
|
|
}
|
|
|
|
|
2018-04-20 17:40:15 +02:00
|
|
|
private static ulong[] _8B4H2S_()
|
|
|
|
{
|
|
|
|
return new ulong[] { 0x0000000000000000ul, 0x7F7F7F7F7F7F7F7Ful,
|
|
|
|
0x8080808080808080ul, 0x7FFF7FFF7FFF7FFFul,
|
|
|
|
0x8000800080008000ul, 0x7FFFFFFF7FFFFFFFul,
|
|
|
|
0x8000000080000000ul, 0xFFFFFFFFFFFFFFFFul };
|
|
|
|
}
|
|
|
|
|
2018-04-21 21:15:04 +02:00
|
|
|
private static ulong[] _8B4H2S1D_()
|
2018-04-20 17:40:15 +02:00
|
|
|
{
|
|
|
|
return new ulong[] { 0x0000000000000000ul, 0x7F7F7F7F7F7F7F7Ful,
|
|
|
|
0x8080808080808080ul, 0x7FFF7FFF7FFF7FFFul,
|
|
|
|
0x8000800080008000ul, 0x7FFFFFFF7FFFFFFFul,
|
|
|
|
0x8000000080000000ul, 0x7FFFFFFFFFFFFFFFul,
|
|
|
|
0x8000000000000000ul, 0xFFFFFFFFFFFFFFFFul };
|
|
|
|
}
|
2018-09-17 06:54:05 +02:00
|
|
|
|
|
|
|
private static IEnumerable<ulong> _1S_F_()
|
|
|
|
{
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0x00000000FF7FFFFFul; // -Max Normal (float.MinValue)
|
2018-09-17 06:54:05 +02:00
|
|
|
yield return 0x0000000080800000ul; // -Min Normal
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0x00000000807FFFFFul; // -Max Subnormal
|
|
|
|
yield return 0x0000000080000001ul; // -Min Subnormal (-float.Epsilon)
|
|
|
|
yield return 0x000000007F7FFFFFul; // +Max Normal (float.MaxValue)
|
2018-09-17 06:54:05 +02:00
|
|
|
yield return 0x0000000000800000ul; // +Min Normal
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0x00000000007FFFFFul; // +Max Subnormal
|
|
|
|
yield return 0x0000000000000001ul; // +Min Subnormal (float.Epsilon)
|
2018-09-17 06:54:05 +02:00
|
|
|
|
|
|
|
if (!NoZeros)
|
|
|
|
{
|
|
|
|
yield return 0x0000000080000000ul; // -Zero
|
|
|
|
yield return 0x0000000000000000ul; // +Zero
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!NoInfs)
|
|
|
|
{
|
|
|
|
yield return 0x00000000FF800000ul; // -Infinity
|
|
|
|
yield return 0x000000007F800000ul; // +Infinity
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!NoNaNs)
|
|
|
|
{
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0x00000000FFC00000ul; // -QNaN (all zeros payload) (float.NaN)
|
|
|
|
yield return 0x00000000FFBFFFFFul; // -SNaN (all ones payload)
|
|
|
|
yield return 0x000000007FC00000ul; // +QNaN (all zeros payload) (-float.NaN) (DefaultNaN)
|
|
|
|
yield return 0x000000007FBFFFFFul; // +SNaN (all ones payload)
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
for (int cnt = 1; cnt <= RndCnt; cnt++)
|
2018-09-17 06:54:05 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong grbg = TestContext.CurrentContext.Random.NextUInt();
|
|
|
|
ulong rnd1 = GenNormalS();
|
|
|
|
ulong rnd2 = GenSubnormalS();
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
yield return (grbg << 32) | rnd1;
|
|
|
|
yield return (grbg << 32) | rnd2;
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
private static IEnumerable<ulong> _2S_F_()
|
|
|
|
{
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0xFF7FFFFFFF7FFFFFul; // -Max Normal (float.MinValue)
|
2018-09-17 06:54:05 +02:00
|
|
|
yield return 0x8080000080800000ul; // -Min Normal
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0x807FFFFF807FFFFFul; // -Max Subnormal
|
|
|
|
yield return 0x8000000180000001ul; // -Min Subnormal (-float.Epsilon)
|
|
|
|
yield return 0x7F7FFFFF7F7FFFFFul; // +Max Normal (float.MaxValue)
|
2018-09-17 06:54:05 +02:00
|
|
|
yield return 0x0080000000800000ul; // +Min Normal
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0x007FFFFF007FFFFFul; // +Max Subnormal
|
|
|
|
yield return 0x0000000100000001ul; // +Min Subnormal (float.Epsilon)
|
2018-09-17 06:54:05 +02:00
|
|
|
|
|
|
|
if (!NoZeros)
|
|
|
|
{
|
|
|
|
yield return 0x8000000080000000ul; // -Zero
|
|
|
|
yield return 0x0000000000000000ul; // +Zero
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!NoInfs)
|
|
|
|
{
|
|
|
|
yield return 0xFF800000FF800000ul; // -Infinity
|
|
|
|
yield return 0x7F8000007F800000ul; // +Infinity
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!NoNaNs)
|
|
|
|
{
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0xFFC00000FFC00000ul; // -QNaN (all zeros payload) (float.NaN)
|
|
|
|
yield return 0xFFBFFFFFFFBFFFFFul; // -SNaN (all ones payload)
|
|
|
|
yield return 0x7FC000007FC00000ul; // +QNaN (all zeros payload) (-float.NaN) (DefaultNaN)
|
|
|
|
yield return 0x7FBFFFFF7FBFFFFFul; // +SNaN (all ones payload)
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
for (int cnt = 1; cnt <= RndCnt; cnt++)
|
2018-09-17 06:54:05 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong rnd1 = GenNormalS();
|
|
|
|
ulong rnd2 = GenSubnormalS();
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
yield return (rnd1 << 32) | rnd1;
|
|
|
|
yield return (rnd2 << 32) | rnd2;
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
private static IEnumerable<ulong> _1D_F_()
|
|
|
|
{
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0xFFEFFFFFFFFFFFFFul; // -Max Normal (double.MinValue)
|
2018-09-17 06:54:05 +02:00
|
|
|
yield return 0x8010000000000000ul; // -Min Normal
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0x800FFFFFFFFFFFFFul; // -Max Subnormal
|
|
|
|
yield return 0x8000000000000001ul; // -Min Subnormal (-double.Epsilon)
|
|
|
|
yield return 0x7FEFFFFFFFFFFFFFul; // +Max Normal (double.MaxValue)
|
2018-09-17 06:54:05 +02:00
|
|
|
yield return 0x0010000000000000ul; // +Min Normal
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0x000FFFFFFFFFFFFFul; // +Max Subnormal
|
|
|
|
yield return 0x0000000000000001ul; // +Min Subnormal (double.Epsilon)
|
2018-09-17 06:54:05 +02:00
|
|
|
|
|
|
|
if (!NoZeros)
|
|
|
|
{
|
|
|
|
yield return 0x8000000000000000ul; // -Zero
|
|
|
|
yield return 0x0000000000000000ul; // +Zero
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!NoInfs)
|
|
|
|
{
|
|
|
|
yield return 0xFFF0000000000000ul; // -Infinity
|
|
|
|
yield return 0x7FF0000000000000ul; // +Infinity
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!NoNaNs)
|
|
|
|
{
|
2018-10-06 03:45:59 +02:00
|
|
|
yield return 0xFFF8000000000000ul; // -QNaN (all zeros payload) (double.NaN)
|
|
|
|
yield return 0xFFF7FFFFFFFFFFFFul; // -SNaN (all ones payload)
|
|
|
|
yield return 0x7FF8000000000000ul; // +QNaN (all zeros payload) (-double.NaN) (DefaultNaN)
|
|
|
|
yield return 0x7FF7FFFFFFFFFFFFul; // +SNaN (all ones payload)
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
for (int cnt = 1; cnt <= RndCnt; cnt++)
|
2018-09-17 06:54:05 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong rnd1 = GenNormalD();
|
|
|
|
ulong rnd2 = GenSubnormalD();
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
yield return rnd1;
|
|
|
|
yield return rnd2;
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
#endregion
|
|
|
|
|
|
|
|
#region "ValueSource (Opcodes)"
|
2019-06-30 01:02:48 +02:00
|
|
|
private static uint[] _F_Abd_Add_Div_Mul_Mulx_Nmul_Sub_S_S_()
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2018-12-26 18:11:36 +01:00
|
|
|
0x7EA2D420u, // FABD S0, S1, S2
|
2018-10-06 03:45:59 +02:00
|
|
|
0x1E222820u, // FADD S0, S1, S2
|
|
|
|
0x1E221820u, // FDIV S0, S1, S2
|
|
|
|
0x1E220820u, // FMUL S0, S1, S2
|
|
|
|
0x5E22DC20u, // FMULX S0, S1, S2
|
2019-06-30 01:02:48 +02:00
|
|
|
0x1E228820u, // FNMUL S0, S1, S2
|
2018-10-06 03:45:59 +02:00
|
|
|
0x1E223820u // FSUB S0, S1, S2
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2019-06-30 01:02:48 +02:00
|
|
|
private static uint[] _F_Abd_Add_Div_Mul_Mulx_Nmul_Sub_S_D_()
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2018-12-26 18:11:36 +01:00
|
|
|
0x7EE2D420u, // FABD D0, D1, D2
|
2018-10-06 03:45:59 +02:00
|
|
|
0x1E622820u, // FADD D0, D1, D2
|
|
|
|
0x1E621820u, // FDIV D0, D1, D2
|
|
|
|
0x1E620820u, // FMUL D0, D1, D2
|
|
|
|
0x5E62DC20u, // FMULX D0, D1, D2
|
2019-06-30 01:02:48 +02:00
|
|
|
0x1E628820u, // FNMUL D0, D1, D2
|
2018-10-06 03:45:59 +02:00
|
|
|
0x1E623820u // FSUB D0, D1, D2
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2019-06-30 01:02:48 +02:00
|
|
|
private static uint[] _F_Abd_Add_Div_Mul_Mulx_Sub_P_V_2S_4S_()
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2018-12-26 18:11:36 +01:00
|
|
|
0x2EA0D400u, // FABD V0.2S, V0.2S, V0.2S
|
2018-10-06 03:45:59 +02:00
|
|
|
0x0E20D400u, // FADD V0.2S, V0.2S, V0.2S
|
2018-12-26 18:11:36 +01:00
|
|
|
0x2E20D400u, // FADDP V0.2S, V0.2S, V0.2S
|
2018-10-06 03:45:59 +02:00
|
|
|
0x2E20FC00u, // FDIV V0.2S, V0.2S, V0.2S
|
|
|
|
0x2E20DC00u, // FMUL V0.2S, V0.2S, V0.2S
|
|
|
|
0x0E20DC00u, // FMULX V0.2S, V0.2S, V0.2S
|
|
|
|
0x0EA0D400u // FSUB V0.2S, V0.2S, V0.2S
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2019-06-30 01:02:48 +02:00
|
|
|
private static uint[] _F_Abd_Add_Div_Mul_Mulx_Sub_P_V_2D_()
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2018-12-26 18:11:36 +01:00
|
|
|
0x6EE0D400u, // FABD V0.2D, V0.2D, V0.2D
|
2018-10-06 03:45:59 +02:00
|
|
|
0x4E60D400u, // FADD V0.2D, V0.2D, V0.2D
|
2018-12-26 18:11:36 +01:00
|
|
|
0x6E60D400u, // FADDP V0.2D, V0.2D, V0.2D
|
2018-10-06 03:45:59 +02:00
|
|
|
0x6E60FC00u, // FDIV V0.2D, V0.2D, V0.2D
|
|
|
|
0x6E60DC00u, // FMUL V0.2D, V0.2D, V0.2D
|
|
|
|
0x4E60DC00u, // FMULX V0.2D, V0.2D, V0.2D
|
|
|
|
0x4EE0D400u // FSUB V0.2D, V0.2D, V0.2D
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2020-02-29 21:51:17 +01:00
|
|
|
private static uint[] _F_AcCm_EqGeGt_S_S_()
|
2018-12-26 18:11:36 +01:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2020-02-29 21:51:17 +01:00
|
|
|
0x7E22EC20u, // FACGE S0, S1, S2
|
|
|
|
0x7EA2EC20u, // FACGT S0, S1, S2
|
2018-12-26 18:11:36 +01:00
|
|
|
0x5E22E420u, // FCMEQ S0, S1, S2
|
|
|
|
0x7E22E420u, // FCMGE S0, S1, S2
|
|
|
|
0x7EA2E420u // FCMGT S0, S1, S2
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2020-02-29 21:51:17 +01:00
|
|
|
private static uint[] _F_AcCm_EqGeGt_S_D_()
|
2018-12-26 18:11:36 +01:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2020-02-29 21:51:17 +01:00
|
|
|
0x7E62EC20u, // FACGE D0, D1, D2
|
|
|
|
0x7EE2EC20u, // FACGT D0, D1, D2
|
2018-12-26 18:11:36 +01:00
|
|
|
0x5E62E420u, // FCMEQ D0, D1, D2
|
|
|
|
0x7E62E420u, // FCMGE D0, D1, D2
|
|
|
|
0x7EE2E420u // FCMGT D0, D1, D2
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2020-02-29 21:51:17 +01:00
|
|
|
private static uint[] _F_AcCm_EqGeGt_V_2S_4S_()
|
2018-12-26 18:11:36 +01:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2020-02-29 21:51:17 +01:00
|
|
|
0x2E20EC00u, // FACGE V0.2S, V0.2S, V0.2S
|
|
|
|
0x2EA0EC00u, // FACGT V0.2S, V0.2S, V0.2S
|
2018-12-26 18:11:36 +01:00
|
|
|
0x0E20E400u, // FCMEQ V0.2S, V0.2S, V0.2S
|
|
|
|
0x2E20E400u, // FCMGE V0.2S, V0.2S, V0.2S
|
|
|
|
0x2EA0E400u // FCMGT V0.2S, V0.2S, V0.2S
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2020-02-29 21:51:17 +01:00
|
|
|
private static uint[] _F_AcCm_EqGeGt_V_2D_()
|
2018-12-26 18:11:36 +01:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2020-02-29 21:51:17 +01:00
|
|
|
0x6E60EC00u, // FACGE V0.2D, V0.2D, V0.2D
|
|
|
|
0x6EE0EC00u, // FACGT V0.2D, V0.2D, V0.2D
|
2018-12-26 18:11:36 +01:00
|
|
|
0x4E60E400u, // FCMEQ V0.2D, V0.2D, V0.2D
|
|
|
|
0x6E60E400u, // FCMGE V0.2D, V0.2D, V0.2D
|
|
|
|
0x6EE0E400u // FCMGT V0.2D, V0.2D, V0.2D
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2018-11-18 03:41:16 +01:00
|
|
|
private static uint[] _F_Cmp_Cmpe_S_S_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x1E222020u, // FCMP S1, S2
|
|
|
|
0x1E222030u // FCMPE S1, S2
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
private static uint[] _F_Cmp_Cmpe_S_D_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x1E622020u, // FCMP D1, D2
|
|
|
|
0x1E622030u // FCMPE D1, D2
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2019-06-30 01:02:48 +02:00
|
|
|
private static uint[] _F_Madd_Msub_Nmadd_Nmsub_S_S_()
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2019-06-30 01:02:48 +02:00
|
|
|
0x1F020C20u, // FMADD S0, S1, S2, S3
|
|
|
|
0x1F028C20u, // FMSUB S0, S1, S2, S3
|
|
|
|
0x1F220C20u, // FNMADD S0, S1, S2, S3
|
|
|
|
0x1F228C20u // FNMSUB S0, S1, S2, S3
|
2018-10-06 03:45:59 +02:00
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2019-06-30 01:02:48 +02:00
|
|
|
private static uint[] _F_Madd_Msub_Nmadd_Nmsub_S_D_()
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2019-06-30 01:02:48 +02:00
|
|
|
0x1F420C20u, // FMADD D0, D1, D2, D3
|
|
|
|
0x1F428C20u, // FMSUB D0, D1, D2, D3
|
|
|
|
0x1F620C20u, // FNMADD D0, D1, D2, D3
|
|
|
|
0x1F628C20u // FNMSUB D0, D1, D2, D3
|
2018-10-06 03:45:59 +02:00
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2018-09-17 06:54:05 +02:00
|
|
|
private static uint[] _F_Max_Min_Nm_S_S_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x1E224820u, // FMAX S0, S1, S2
|
|
|
|
0x1E226820u, // FMAXNM S0, S1, S2
|
|
|
|
0x1E225820u, // FMIN S0, S1, S2
|
|
|
|
0x1E227820u // FMINNM S0, S1, S2
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
private static uint[] _F_Max_Min_Nm_S_D_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x1E624820u, // FMAX D0, D1, D2
|
|
|
|
0x1E626820u, // FMAXNM D0, D1, D2
|
|
|
|
0x1E625820u, // FMIN D0, D1, D2
|
|
|
|
0x1E627820u // FMINNM D0, D1, D2
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
private static uint[] _F_Max_Min_Nm_P_V_2S_4S_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2020-07-13 13:08:47 +02:00
|
|
|
0x0E20F400u, // FMAX V0.2S, V0.2S, V0.2S
|
|
|
|
0x0E20C400u, // FMAXNM V0.2S, V0.2S, V0.2S
|
|
|
|
0x2E20C400u, // FMAXNMP V0.2S, V0.2S, V0.2S
|
|
|
|
0x2E20F400u, // FMAXP V0.2S, V0.2S, V0.2S
|
|
|
|
0x0EA0F400u, // FMIN V0.2S, V0.2S, V0.2S
|
|
|
|
0x0EA0C400u, // FMINNM V0.2S, V0.2S, V0.2S
|
|
|
|
0x2EA0C400u, // FMINNMP V0.2S, V0.2S, V0.2S
|
|
|
|
0x2EA0F400u // FMINP V0.2S, V0.2S, V0.2S
|
2018-09-17 06:54:05 +02:00
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
private static uint[] _F_Max_Min_Nm_P_V_2D_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
2020-07-13 13:08:47 +02:00
|
|
|
0x4E60F400u, // FMAX V0.2D, V0.2D, V0.2D
|
|
|
|
0x4E60C400u, // FMAXNM V0.2D, V0.2D, V0.2D
|
|
|
|
0x6E60C400u, // FMAXNMP V0.2D, V0.2D, V0.2D
|
|
|
|
0x6E60F400u, // FMAXP V0.2D, V0.2D, V0.2D
|
|
|
|
0x4EE0F400u, // FMIN V0.2D, V0.2D, V0.2D
|
|
|
|
0x4EE0C400u, // FMINNM V0.2D, V0.2D, V0.2D
|
|
|
|
0x6EE0C400u, // FMINNMP V0.2D, V0.2D, V0.2D
|
|
|
|
0x6EE0F400u // FMINP V0.2D, V0.2D, V0.2D
|
2018-09-17 06:54:05 +02:00
|
|
|
};
|
|
|
|
}
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-18 03:41:16 +01:00
|
|
|
private static uint[] _F_Mla_Mls_V_2S_4S_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x0E20CC00u, // FMLA V0.2S, V0.2S, V0.2S
|
|
|
|
0x0EA0CC00u // FMLS V0.2S, V0.2S, V0.2S
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
private static uint[] _F_Mla_Mls_V_2D_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x4E60CC00u, // FMLA V0.2D, V0.2D, V0.2D
|
|
|
|
0x4EE0CC00u // FMLS V0.2D, V0.2D, V0.2D
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
private static uint[] _F_Recps_Rsqrts_S_S_()
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x5E22FC20u, // FRECPS S0, S1, S2
|
|
|
|
0x5EA2FC20u // FRSQRTS S0, S1, S2
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
private static uint[] _F_Recps_Rsqrts_S_D_()
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x5E62FC20u, // FRECPS D0, D1, D2
|
|
|
|
0x5EE2FC20u // FRSQRTS D0, D1, D2
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
private static uint[] _F_Recps_Rsqrts_V_2S_4S_()
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x0E20FC00u, // FRECPS V0.2S, V0.2S, V0.2S
|
|
|
|
0x0EA0FC00u // FRSQRTS V0.2S, V0.2S, V0.2S
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
private static uint[] _F_Recps_Rsqrts_V_2D_()
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x4E60FC00u, // FRECPS V0.2D, V0.2D, V0.2D
|
|
|
|
0x4EE0FC00u // FRSQRTS V0.2D, V0.2D, V0.2D
|
|
|
|
};
|
|
|
|
}
|
2018-10-28 23:27:50 +01:00
|
|
|
|
2019-03-13 09:23:52 +01:00
|
|
|
private static uint[] _Mla_Mls_Mul_V_8B_4H_2S_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x0E209400u, // MLA V0.8B, V0.8B, V0.8B
|
|
|
|
0x2E209400u, // MLS V0.8B, V0.8B, V0.8B
|
|
|
|
0x0E209C00u // MUL V0.8B, V0.8B, V0.8B
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
private static uint[] _Mla_Mls_Mul_V_16B_8H_4S_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x4E209400u, // MLA V0.16B, V0.16B, V0.16B
|
|
|
|
0x6E209400u, // MLS V0.16B, V0.16B, V0.16B
|
|
|
|
0x4E209C00u // MUL V0.16B, V0.16B, V0.16B
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2018-10-28 23:27:50 +01:00
|
|
|
private static uint[] _Sha1c_Sha1m_Sha1p_Sha1su0_V_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x5E000000u, // SHA1C Q0, S0, V0.4S
|
|
|
|
0x5E002000u, // SHA1M Q0, S0, V0.4S
|
|
|
|
0x5E001000u, // SHA1P Q0, S0, V0.4S
|
|
|
|
0x5E003000u // SHA1SU0 V0.4S, V0.4S, V0.4S
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
private static uint[] _Sha256h_Sha256h2_Sha256su1_V_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x5E004000u, // SHA256H Q0, Q0, V0.4S
|
|
|
|
0x5E005000u, // SHA256H2 Q0, Q0, V0.4S
|
|
|
|
0x5E006000u // SHA256SU1 V0.4S, V0.4S, V0.4S
|
|
|
|
};
|
|
|
|
}
|
2018-11-18 03:41:16 +01:00
|
|
|
|
2019-06-30 01:02:48 +02:00
|
|
|
private static uint[] _SU_Max_Min_P_V_()
|
2018-11-18 03:41:16 +01:00
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x0E206400u, // SMAX V0.8B, V0.8B, V0.8B
|
|
|
|
0x0E20A400u, // SMAXP V0.8B, V0.8B, V0.8B
|
|
|
|
0x0E206C00u, // SMIN V0.8B, V0.8B, V0.8B
|
2019-06-30 01:02:48 +02:00
|
|
|
0x0E20AC00u, // SMINP V0.8B, V0.8B, V0.8B
|
|
|
|
0x2E206400u, // UMAX V0.8B, V0.8B, V0.8B
|
|
|
|
0x2E20A400u, // UMAXP V0.8B, V0.8B, V0.8B
|
|
|
|
0x2E206C00u, // UMIN V0.8B, V0.8B, V0.8B
|
|
|
|
0x2E20AC00u // UMINP V0.8B, V0.8B, V0.8B
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
private static uint[] _SU_Mlal_Mlsl_Mull_V_8B8H_4H4S_2S2D_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x0E208000u, // SMLAL V0.8H, V0.8B, V0.8B
|
|
|
|
0x0E20A000u, // SMLSL V0.8H, V0.8B, V0.8B
|
|
|
|
0x0E20C000u, // SMULL V0.8H, V0.8B, V0.8B
|
|
|
|
0x2E208000u, // UMLAL V0.8H, V0.8B, V0.8B
|
|
|
|
0x2E20A000u, // UMLSL V0.8H, V0.8B, V0.8B
|
|
|
|
0x2E20C000u // UMULL V0.8H, V0.8B, V0.8B
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
private static uint[] _SU_Mlal_Mlsl_Mull_V_16B8H_8H4S_4S2D_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x4E208000u, // SMLAL2 V0.8H, V0.16B, V0.16B
|
|
|
|
0x4E20A000u, // SMLSL2 V0.8H, V0.16B, V0.16B
|
|
|
|
0x4E20C000u, // SMULL2 V0.8H, V0.16B, V0.16B
|
|
|
|
0x6E208000u, // UMLAL2 V0.8H, V0.16B, V0.16B
|
|
|
|
0x6E20A000u, // UMLSL2 V0.8H, V0.16B, V0.16B
|
|
|
|
0x6E20C000u // UMULL2 V0.8H, V0.16B, V0.16B
|
2018-11-18 03:41:16 +01:00
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2020-07-13 13:08:47 +02:00
|
|
|
private static uint[] _ShlReg_S_D_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x5EE04400u, // SSHL D0, D0, D0
|
|
|
|
0x7EE04400u // USHL D0, D0, D0
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
2018-12-02 01:34:43 +01:00
|
|
|
private static uint[] _ShlReg_V_8B_4H_2S_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x0E205C00u, // SQRSHL V0.8B, V0.8B, V0.8B
|
|
|
|
0x0E204C00u, // SQSHL V0.8B, V0.8B, V0.8B
|
|
|
|
0x0E205400u, // SRSHL V0.8B, V0.8B, V0.8B
|
|
|
|
0x0E204400u, // SSHL V0.8B, V0.8B, V0.8B
|
|
|
|
0x2E205C00u, // UQRSHL V0.8B, V0.8B, V0.8B
|
|
|
|
0x2E204C00u, // UQSHL V0.8B, V0.8B, V0.8B
|
|
|
|
0x2E205400u, // URSHL V0.8B, V0.8B, V0.8B
|
|
|
|
0x2E204400u // USHL V0.8B, V0.8B, V0.8B
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
private static uint[] _ShlReg_V_16B_8H_4S_2D_()
|
|
|
|
{
|
|
|
|
return new uint[]
|
|
|
|
{
|
|
|
|
0x4E205C00u, // SQRSHL V0.16B, V0.16B, V0.16B
|
|
|
|
0x4E204C00u, // SQSHL V0.16B, V0.16B, V0.16B
|
|
|
|
0x4E205400u, // SRSHL V0.16B, V0.16B, V0.16B
|
|
|
|
0x4E204400u, // SSHL V0.16B, V0.16B, V0.16B
|
|
|
|
0x6E205C00u, // UQRSHL V0.16B, V0.16B, V0.16B
|
|
|
|
0x6E204C00u, // UQSHL V0.16B, V0.16B, V0.16B
|
|
|
|
0x6E205400u, // URSHL V0.16B, V0.16B, V0.16B
|
|
|
|
0x6E204400u // USHL V0.16B, V0.16B, V0.16B
|
|
|
|
};
|
|
|
|
}
|
2018-04-20 17:40:15 +02:00
|
|
|
#endregion
|
|
|
|
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
private const int RndCnt = 2;
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-17 06:54:05 +02:00
|
|
|
private static readonly bool NoZeros = false;
|
|
|
|
private static readonly bool NoInfs = false;
|
|
|
|
private static readonly bool NoNaNs = false;
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("ADD <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Add_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong b)
|
2018-04-20 17:40:15 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x5EE08400; // ADD D0, D0, D0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("ADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Add_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E208400; // ADD V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("ADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Add_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E208400; // ADD V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("ADDHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Addhn_V_8H8B_4S4H_2D2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8H8B, 4S4H, 2D2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E204000; // ADDHN V0.8B, V0.8H, V0.8H
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("ADDHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Addhn_V_8H16B_4S8H_2D4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8H16B, 4S8H, 2D4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E204000; // ADDHN2 V0.16B, V0.8H, V0.8H
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("ADDP <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Addp_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-04-21 21:15:04 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E20BC00; // ADDP V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-21 21:15:04 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-04-21 21:15:04 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-21 21:15:04 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-21 21:15:04 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("ADDP <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Addp_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-21 21:15:04 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E20BC00; // ADDP V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-21 21:15:04 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-21 21:15:04 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-21 21:15:04 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-21 21:15:04 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("AND <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void And_V_8B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E201C00; // AND V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("AND <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void And_V_16B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E201C00; // AND V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("BIC <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Bic_V_8B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E601C00; // BIC V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("BIC <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Bic_V_16B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E601C00; // BIC V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("BIF <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Bif_V_8B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2EE01C00; // BIF V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("BIF <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Bif_V_16B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6EE01C00; // BIF V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("BIT <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Bit_V_8B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2EA01C00; // BIT V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("BIT <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Bit_V_16B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6EA01C00; // BIT V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("BSL <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Bsl_V_8B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E601C00; // BSL V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("BSL <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Bsl_V_16B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E601C00; // BSL V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMEQ <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmeq_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong b)
|
2018-06-18 19:55:26 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x7EE08C00; // CMEQ D0, D0, D0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMEQ <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmeq_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E208C00; // CMEQ V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("CMEQ <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmeq_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E208C00; // CMEQ V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMGE <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmge_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong b)
|
2018-06-18 19:55:26 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x5EE03C00; // CMGE D0, D0, D0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMGE <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmge_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E203C00; // CMGE V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("CMGE <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmge_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E203C00; // CMGE V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMGT <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmgt_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong b)
|
2018-06-18 19:55:26 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x5EE03400; // CMGT D0, D0, D0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMGT <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmgt_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E203400; // CMGT V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("CMGT <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmgt_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E203400; // CMGT V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMHI <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmhi_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong b)
|
2018-06-18 19:55:26 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x7EE03400; // CMHI D0, D0, D0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMHI <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmhi_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E203400; // CMHI V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("CMHI <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmhi_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E203400; // CMHI V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMHS <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmhs_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong b)
|
2018-06-18 19:55:26 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x7EE03C00; // CMHS D0, D0, D0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMHS <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmhs_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E203C00; // CMHS V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("CMHS <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmhs_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E203C00; // CMHS V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMTST <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmtst_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong b)
|
2018-06-18 19:55:26 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x5EE08C00; // CMTST D0, D0, D0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("CMTST <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmtst_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E208C00; // CMTST V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("CMTST <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Cmtst_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-06-18 19:55:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E208C00; // CMTST V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("EOR <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Eor_V_8B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
2018-06-18 19:55:26 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E201C00; // EOR V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-18 19:55:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("EOR <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Eor_V_16B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
2018-06-18 19:55:26 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E201C00; // EOR V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-18 19:55:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit]
|
2019-06-30 01:02:48 +02:00
|
|
|
public void F_Abd_Add_Div_Mul_Mulx_Nmul_Sub_S_S([ValueSource("_F_Abd_Add_Div_Mul_Mulx_Nmul_Sub_S_S_")] uint opcodes,
|
|
|
|
[ValueSource("_1S_F_")] ulong a,
|
|
|
|
[ValueSource("_1S_F_")] ulong b)
|
2018-09-17 06:54:05 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong z = TestContext.CurrentContext.Random.NextULong();
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-11-01 05:22:09 +01:00
|
|
|
|
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Dzc | Fpsr.Idc);
|
2018-10-06 03:45:59 +02:00
|
|
|
}
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit]
|
2019-06-30 01:02:48 +02:00
|
|
|
public void F_Abd_Add_Div_Mul_Mulx_Nmul_Sub_S_D([ValueSource("_F_Abd_Add_Div_Mul_Mulx_Nmul_Sub_S_D_")] uint opcodes,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b)
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong z = TestContext.CurrentContext.Random.NextULong();
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE1(z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Dzc | Fpsr.Idc);
|
2018-10-06 03:45:59 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit]
|
2019-06-30 01:02:48 +02:00
|
|
|
public void F_Abd_Add_Div_Mul_Mulx_Sub_P_V_2S_4S([ValueSource("_F_Abd_Add_Div_Mul_Mulx_Sub_P_V_2S_4S_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_2S_F_")] ulong z,
|
|
|
|
[ValueSource("_2S_F_")] ulong a,
|
|
|
|
[ValueSource("_2S_F_")] ulong b,
|
|
|
|
[Values(0b0u, 0b1u)] uint q) // <2S, 4S>
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((q & 1) << 30);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a * q);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b * q);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Dzc | Fpsr.Idc);
|
2018-10-06 03:45:59 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit]
|
2019-06-30 01:02:48 +02:00
|
|
|
public void F_Abd_Add_Div_Mul_Mulx_Sub_P_V_2D([ValueSource("_F_Abd_Add_Div_Mul_Mulx_Sub_P_V_2D_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_F_")] ulong z,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b)
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Dzc | Fpsr.Idc);
|
2018-10-06 03:45:59 +02:00
|
|
|
}
|
|
|
|
|
2018-12-26 18:11:36 +01:00
|
|
|
[Test, Pairwise] [Explicit]
|
2020-02-29 21:51:17 +01:00
|
|
|
public void F_AcCm_EqGeGt_S_S([ValueSource("_F_AcCm_EqGeGt_S_S_")] uint opcodes,
|
|
|
|
[ValueSource("_1S_F_")] ulong a,
|
|
|
|
[ValueSource("_1S_F_")] ulong b)
|
2018-12-26 18:11:36 +01:00
|
|
|
{
|
|
|
|
ulong z = TestContext.CurrentContext.Random.NextULong();
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-12-26 18:11:36 +01:00
|
|
|
|
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
|
|
|
|
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Idc);
|
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise] [Explicit]
|
2020-02-29 21:51:17 +01:00
|
|
|
public void F_AcCm_EqGeGt_S_D([ValueSource("_F_AcCm_EqGeGt_S_D_")] uint opcodes,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b)
|
2018-12-26 18:11:36 +01:00
|
|
|
{
|
|
|
|
ulong z = TestContext.CurrentContext.Random.NextULong();
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE1(z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-12-26 18:11:36 +01:00
|
|
|
|
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
|
|
|
|
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Idc);
|
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise] [Explicit]
|
2020-02-29 21:51:17 +01:00
|
|
|
public void F_AcCm_EqGeGt_V_2S_4S([ValueSource("_F_AcCm_EqGeGt_V_2S_4S_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_2S_F_")] ulong z,
|
|
|
|
[ValueSource("_2S_F_")] ulong a,
|
|
|
|
[ValueSource("_2S_F_")] ulong b,
|
|
|
|
[Values(0b0u, 0b1u)] uint q) // <2S, 4S>
|
2018-12-26 18:11:36 +01:00
|
|
|
{
|
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((q & 1) << 30);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a * q);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b * q);
|
2018-12-26 18:11:36 +01:00
|
|
|
|
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
|
|
|
|
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Idc);
|
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise] [Explicit]
|
2020-02-29 21:51:17 +01:00
|
|
|
public void F_AcCm_EqGeGt_V_2D([ValueSource("_F_AcCm_EqGeGt_V_2D_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_F_")] ulong z,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b)
|
2018-12-26 18:11:36 +01:00
|
|
|
{
|
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-12-26 18:11:36 +01:00
|
|
|
|
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
|
|
|
|
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Idc);
|
|
|
|
}
|
|
|
|
|
2018-11-18 03:41:16 +01:00
|
|
|
[Test, Pairwise] [Explicit]
|
|
|
|
public void F_Cmp_Cmpe_S_S([ValueSource("_F_Cmp_Cmpe_S_S_")] uint opcodes,
|
|
|
|
[ValueSource("_1S_F_")] ulong a,
|
|
|
|
[ValueSource("_1S_F_")] ulong b)
|
|
|
|
{
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-11-18 03:41:16 +01:00
|
|
|
|
|
|
|
bool v = TestContext.CurrentContext.Random.NextBool();
|
|
|
|
bool c = TestContext.CurrentContext.Random.NextBool();
|
|
|
|
bool z = TestContext.CurrentContext.Random.NextBool();
|
|
|
|
bool n = TestContext.CurrentContext.Random.NextBool();
|
|
|
|
|
|
|
|
SingleOpcode(opcodes, v1: v1, v2: v2, overflow: v, carry: c, zero: z, negative: n);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc);
|
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise] [Explicit]
|
|
|
|
public void F_Cmp_Cmpe_S_D([ValueSource("_F_Cmp_Cmpe_S_D_")] uint opcodes,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b)
|
|
|
|
{
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-11-18 03:41:16 +01:00
|
|
|
|
|
|
|
bool v = TestContext.CurrentContext.Random.NextBool();
|
|
|
|
bool c = TestContext.CurrentContext.Random.NextBool();
|
|
|
|
bool z = TestContext.CurrentContext.Random.NextBool();
|
|
|
|
bool n = TestContext.CurrentContext.Random.NextBool();
|
|
|
|
|
|
|
|
SingleOpcode(opcodes, v1: v1, v2: v2, overflow: v, carry: c, zero: z, negative: n);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc);
|
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit] // Fused.
|
2019-06-30 01:02:48 +02:00
|
|
|
public void F_Madd_Msub_Nmadd_Nmsub_S_S([ValueSource("_F_Madd_Msub_Nmadd_Nmsub_S_S_")] uint opcodes,
|
|
|
|
[ValueSource("_1S_F_")] ulong a,
|
|
|
|
[ValueSource("_1S_F_")] ulong b,
|
|
|
|
[ValueSource("_1S_F_")] ulong c)
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong z = TestContext.CurrentContext.Random.NextULong();
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
|
|
|
V128 v3 = MakeVectorE0(c);
|
2018-11-01 05:22:09 +01:00
|
|
|
|
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, v3: v3, fpcr: fpcr);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(Fpsr.Ioc | Fpsr.Idc, FpSkips.IfUnderflow, FpTolerances.UpToOneUlpsS);
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit] // Fused.
|
2019-06-30 01:02:48 +02:00
|
|
|
public void F_Madd_Msub_Nmadd_Nmsub_S_D([ValueSource("_F_Madd_Msub_Nmadd_Nmsub_S_D_")] uint opcodes,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b,
|
|
|
|
[ValueSource("_1D_F_")] ulong c)
|
2018-09-17 06:54:05 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong z = TestContext.CurrentContext.Random.NextULong();
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE1(z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
|
|
|
V128 v3 = MakeVectorE0(c);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, v3: v3, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(Fpsr.Ioc | Fpsr.Idc, FpSkips.IfUnderflow, FpTolerances.UpToOneUlpsD);
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void F_Max_Min_Nm_S_S([ValueSource("_F_Max_Min_Nm_S_S_")] uint opcodes,
|
|
|
|
[ValueSource("_1S_F_")] ulong a,
|
|
|
|
[ValueSource("_1S_F_")] ulong b)
|
2018-09-17 06:54:05 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong z = TestContext.CurrentContext.Random.NextULong();
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-11-01 05:22:09 +01:00
|
|
|
|
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Idc);
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void F_Max_Min_Nm_S_D([ValueSource("_F_Max_Min_Nm_S_D_")] uint opcodes,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b)
|
2018-09-17 06:54:05 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong z = TestContext.CurrentContext.Random.NextULong();
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE1(z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Idc);
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void F_Max_Min_Nm_P_V_2S_4S([ValueSource("_F_Max_Min_Nm_P_V_2S_4S_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_2S_F_")] ulong z,
|
|
|
|
[ValueSource("_2S_F_")] ulong a,
|
|
|
|
[ValueSource("_2S_F_")] ulong b,
|
|
|
|
[Values(0b0u, 0b1u)] uint q) // <2S, 4S>
|
2018-09-17 06:54:05 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((q & 1) << 30);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a * q);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b * q);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Idc);
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void F_Max_Min_Nm_P_V_2D([ValueSource("_F_Max_Min_Nm_P_V_2D_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_F_")] ulong z,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b)
|
2018-09-17 06:54:05 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-09-17 06:54:05 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Ioc | Fpsr.Idc);
|
2018-10-06 03:45:59 +02:00
|
|
|
}
|
|
|
|
|
2018-11-18 03:41:16 +01:00
|
|
|
[Test, Pairwise] [Explicit] // Fused.
|
|
|
|
public void F_Mla_Mls_V_2S_4S([ValueSource("_F_Mla_Mls_V_2S_4S_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_2S_F_")] ulong z,
|
|
|
|
[ValueSource("_2S_F_")] ulong a,
|
|
|
|
[ValueSource("_2S_F_")] ulong b,
|
|
|
|
[Values(0b0u, 0b1u)] uint q) // <2S, 4S>
|
|
|
|
{
|
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((q & 1) << 30);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a * q);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b * q);
|
2018-11-18 03:41:16 +01:00
|
|
|
|
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
|
|
|
|
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(Fpsr.Ioc | Fpsr.Idc, FpSkips.IfUnderflow, FpTolerances.UpToOneUlpsS);
|
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise] [Explicit] // Fused.
|
|
|
|
public void F_Mla_Mls_V_2D([ValueSource("_F_Mla_Mls_V_2D_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_F_")] ulong z,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b)
|
|
|
|
{
|
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-11-18 03:41:16 +01:00
|
|
|
|
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
|
|
|
|
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(Fpsr.Ioc | Fpsr.Idc, FpSkips.IfUnderflow, FpTolerances.UpToOneUlpsD);
|
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit] // Fused.
|
2018-11-01 05:22:09 +01:00
|
|
|
public void F_Recps_Rsqrts_S_S([ValueSource("_F_Recps_Rsqrts_S_S_")] uint opcodes,
|
|
|
|
[ValueSource("_1S_F_")] ulong a,
|
|
|
|
[ValueSource("_1S_F_")] ulong b)
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong z = TestContext.CurrentContext.Random.NextULong();
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-11-01 05:22:09 +01:00
|
|
|
|
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(Fpsr.Ioc | Fpsr.Idc, FpSkips.IfUnderflow, FpTolerances.UpToOneUlpsS);
|
2018-10-06 03:45:59 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit] // Fused.
|
2018-11-01 05:22:09 +01:00
|
|
|
public void F_Recps_Rsqrts_S_D([ValueSource("_F_Recps_Rsqrts_S_D_")] uint opcodes,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b)
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
ulong z = TestContext.CurrentContext.Random.NextULong();
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE1(z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(Fpsr.Ioc | Fpsr.Idc, FpSkips.IfUnderflow, FpTolerances.UpToOneUlpsD);
|
2018-10-06 03:45:59 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit] // Fused.
|
2018-11-01 05:22:09 +01:00
|
|
|
public void F_Recps_Rsqrts_V_2S_4S([ValueSource("_F_Recps_Rsqrts_V_2S_4S_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_2S_F_")] ulong z,
|
|
|
|
[ValueSource("_2S_F_")] ulong a,
|
|
|
|
[ValueSource("_2S_F_")] ulong b,
|
|
|
|
[Values(0b0u, 0b1u)] uint q) // <2S, 4S>
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((q & 1) << 30);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a * q);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b * q);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(Fpsr.Ioc | Fpsr.Idc, FpSkips.IfUnderflow, FpTolerances.UpToOneUlpsS);
|
2018-10-06 03:45:59 +02:00
|
|
|
}
|
|
|
|
|
2018-10-14 04:35:16 +02:00
|
|
|
[Test, Pairwise] [Explicit] // Fused.
|
2018-11-01 05:22:09 +01:00
|
|
|
public void F_Recps_Rsqrts_V_2D([ValueSource("_F_Recps_Rsqrts_V_2D_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_F_")] ulong z,
|
|
|
|
[ValueSource("_1D_F_")] ulong a,
|
|
|
|
[ValueSource("_1D_F_")] ulong b)
|
2018-10-06 03:45:59 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int rnd = (int)TestContext.CurrentContext.Random.NextUInt();
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
int fpcr = rnd & (1 << (int)Fpcr.Fz);
|
|
|
|
fpcr |= rnd & (1 << (int)Fpcr.Dn);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2, fpcr: fpcr);
|
2018-10-06 03:45:59 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(Fpsr.Ioc | Fpsr.Idc, FpSkips.IfUnderflow, FpTolerances.UpToOneUlpsD);
|
2018-09-17 06:54:05 +02:00
|
|
|
}
|
|
|
|
|
2019-03-13 09:23:52 +01:00
|
|
|
[Test, Pairwise]
|
|
|
|
public void Mla_Mls_Mul_V_8B_4H_2S([ValueSource("_Mla_Mls_Mul_V_8B_4H_2S_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((size & 3) << 22);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2019-03-13 09:23:52 +01:00
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn();
|
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise]
|
|
|
|
public void Mla_Mls_Mul_V_16B_8H_4S([ValueSource("_Mla_Mls_Mul_V_16B_8H_4S_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((size & 3) << 22);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2019-03-13 09:23:52 +01:00
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn();
|
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("ORN <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Orn_V_8B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0EE01C00; // ORN V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("ORN <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Orn_V_16B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4EE01C00; // ORN V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("ORR <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Orr_V_8B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0EA01C00; // ORR V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("ORR <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Orr_V_16B([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B_")] [Random(RndCnt)] ulong b)
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4EA01C00; // ORR V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
|
2021-01-04 23:45:54 +01:00
|
|
|
CompareAgainstUnicorn();
|
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("PMULL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
|
|
|
public void Pmull_V([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource(nameof(_8B1D_))] [Random(RndCnt)] ulong z0,
|
|
|
|
[ValueSource(nameof(_8B1D_))] [Random(RndCnt)] ulong z1,
|
|
|
|
[ValueSource(nameof(_8B1D_))] [Random(RndCnt)] ulong a0,
|
|
|
|
[ValueSource(nameof(_8B1D_))] [Random(RndCnt)] ulong a1,
|
|
|
|
[ValueSource(nameof(_8B1D_))] [Random(RndCnt)] ulong b0,
|
|
|
|
[ValueSource(nameof(_8B1D_))] [Random(RndCnt)] ulong b1,
|
|
|
|
[Values(0b00u, 0b11u)] uint size, // Q0: <8B, 1D> => <8H, 1Q>
|
|
|
|
[Values(0b0u, 0b1u)] uint q) // Q1: <16B, 2D> => <8H, 1Q>
|
|
|
|
{
|
|
|
|
uint opcode = 0x0E20E000; // PMULL V0.8H, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
|
|
|
opcode |= ((q & 1) << 30);
|
|
|
|
|
|
|
|
V128 v0 = MakeVectorE0E1(z0, z1);
|
|
|
|
V128 v1 = MakeVectorE0E1(a0, a1);
|
|
|
|
V128 v2 = MakeVectorE0E1(b0, b1);
|
|
|
|
|
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Cls_V, Clz_V, Orn_V instructions. Add 18 Tests: And_V, Bic_V, Bif_V, Bit_V, Bsl_V, Cls_V, Clz_V, Orn_V, Orr_V. (#104)
* Update AOpCodeTable.cs
* Update AInstEmitSimdLogical.cs
* Update AInstEmitSimdArithmetic.cs
* Update ASoftFallback.cs
* Update AInstEmitAlu.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
2018-04-26 04:20:22 +02:00
|
|
|
}
|
|
|
|
|
2018-04-20 17:40:15 +02:00
|
|
|
[Test, Pairwise, Description("RADDHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Raddhn_V_8H8B_4S4H_2D2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8H8B, 4S4H, 2D2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E204000; // RADDHN V0.8B, V0.8H, V0.8H
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("RADDHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Raddhn_V_8H16B_4S8H_2D4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8H16B, 4S8H, 2D4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E204000; // RADDHN2 V0.16B, V0.8H, V0.8H
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("RSUBHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Rsubhn_V_8H8B_4S4H_2D2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8H8B, 4S4H, 2D2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E206000; // RSUBHN V0.8B, V0.8H, V0.8H
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("RSUBHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Rsubhn_V_8H16B_4S8H_2D4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8H16B, 4S8H, 2D4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E206000; // RSUBHN2 V0.16B, V0.8H, V0.8H
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("SABA <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Saba_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-30 17:40:41 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E207C00; // SABA V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SABA <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Saba_V_16B_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-30 17:40:41 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E207C00; // SABA V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SABAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sabal_V_8B8H_4H4S_2S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H, 4H4S, 2S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E205000; // SABAL V0.8H, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SABAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sabal_V_16B8H_8H4S_4S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H, 8H4S, 4S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E205000; // SABAL2 V0.8H, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE1(a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("SABD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sabd_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-30 17:40:41 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E207400; // SABD V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SABD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sabd_V_16B_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-30 17:40:41 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E207400; // SABD V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("SABDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sabdl_V_8B8H_4H4S_2S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H, 4H4S, 2S2D>
|
2018-06-30 17:40:41 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E207000; // SABDL V0.8H, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("SABDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sabdl_V_16B8H_8H4S_4S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H, 8H4S, 4S2D>
|
2018-06-30 17:40:41 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E207000; // SABDL2 V0.8H, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE1(a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Test, Pairwise, Description("SADDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Saddl_V_8B8H_4H4S_2S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H, 4H4S, 2S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E200000; // SADDL V0.8H, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SADDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Saddl_V_16B8H_8H4S_4S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H, 8H4S, 4S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E200000; // SADDL2 V0.8H, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE1(a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
}
|
|
|
|
|
2018-07-19 02:06:28 +02:00
|
|
|
[Test, Pairwise, Description("SADDW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Saddw_V_8B8H8H_4H4S4S_2S2D2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-19 02:06:28 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H8H, 4H4S4S, 2S2D2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E201000; // SADDW V0.8H, V0.8H, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-19 02:06:28 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SADDW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Saddw_V_16B8H8H_8H4S4S_4S2D2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-19 02:06:28 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H8H, 8H4S4S, 4S2D2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E201000; // SADDW2 V0.8H, V0.8H, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-17 02:44:44 +02:00
|
|
|
}
|
|
|
|
|
2018-10-28 23:27:50 +01:00
|
|
|
[Test, Pairwise]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sha1c_Sha1m_Sha1p_Sha1su0_V([ValueSource("_Sha1c_Sha1m_Sha1p_Sha1su0_V_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[Random(RndCnt / 2)] ulong z0, [Random(RndCnt / 2)] ulong z1,
|
|
|
|
[Random(RndCnt / 2)] ulong a0, [Random(RndCnt / 2)] ulong a1,
|
|
|
|
[Random(RndCnt / 2)] ulong b0, [Random(RndCnt / 2)] ulong b1)
|
2018-08-17 02:44:44 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-08-17 02:44:44 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z0, z1);
|
|
|
|
V128 v1 = MakeVectorE0E1(a0, a1);
|
|
|
|
V128 v2 = MakeVectorE0E1(b0, b1);
|
2018-09-08 19:23:07 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2);
|
2018-08-17 02:44:44 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-17 02:44:44 +02:00
|
|
|
}
|
|
|
|
|
2018-10-28 23:27:50 +01:00
|
|
|
[Test, Pairwise]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sha256h_Sha256h2_Sha256su1_V([ValueSource("_Sha256h_Sha256h2_Sha256su1_V_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[Random(RndCnt / 2)] ulong z0, [Random(RndCnt / 2)] ulong z1,
|
|
|
|
[Random(RndCnt / 2)] ulong a0, [Random(RndCnt / 2)] ulong a1,
|
|
|
|
[Random(RndCnt / 2)] ulong b0, [Random(RndCnt / 2)] ulong b1)
|
2018-08-17 02:44:44 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-08-17 02:44:44 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z0, z1);
|
|
|
|
V128 v1 = MakeVectorE0E1(a0, a1);
|
|
|
|
V128 v2 = MakeVectorE0E1(b0, b1);
|
2018-09-08 19:23:07 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2);
|
2018-08-17 02:44:44 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-19 02:06:28 +02:00
|
|
|
}
|
|
|
|
|
2018-08-27 08:44:01 +02:00
|
|
|
[Test, Pairwise, Description("SHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Shadd_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E200400; // SHADD V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Shadd_V_16B_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E200400; // SHADD V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SHSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Shsub_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E202400; // SHSUB V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SHSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Shsub_V_16B_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E202400; // SHSUB V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
2018-11-18 03:41:16 +01:00
|
|
|
[Test, Pairwise]
|
2019-06-30 01:02:48 +02:00
|
|
|
public void SU_Max_Min_P_V([ValueSource("_SU_Max_Min_P_V_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size, // Q0: <8B, 4H, 2S>
|
|
|
|
[Values(0b0u, 0b1u)] uint q) // Q1: <16B, 8H, 4S>
|
2018-11-18 03:41:16 +01:00
|
|
|
{
|
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((size & 3) << 22);
|
|
|
|
opcodes |= ((q & 1) << 30);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a * q);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b * q);
|
2018-11-18 03:41:16 +01:00
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn();
|
|
|
|
}
|
|
|
|
|
2019-06-30 01:02:48 +02:00
|
|
|
[Test, Pairwise]
|
|
|
|
public void SU_Mlal_Mlsl_Mull_V_8B8H_4H4S_2S2D([ValueSource("_SU_Mlal_Mlsl_Mull_V_8B8H_4H4S_2S2D_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H, 4H4S, 2S2D>
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
{
|
2019-06-30 01:02:48 +02:00
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((size & 3) << 22);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2019-06-30 01:02:48 +02:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
}
|
|
|
|
|
2019-06-30 01:02:48 +02:00
|
|
|
[Test, Pairwise]
|
|
|
|
public void SU_Mlal_Mlsl_Mull_V_16B8H_8H4S_4S2D([ValueSource("_SU_Mlal_Mlsl_Mull_V_16B8H_8H4S_4S2D_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H, 8H4S, 4S2D>
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
{
|
2019-06-30 01:02:48 +02:00
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((size & 3) << 22);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE1(a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2019-06-30 01:02:48 +02:00
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
}
|
|
|
|
|
2018-08-04 21:58:54 +02:00
|
|
|
[Test, Pairwise, Description("SQADD <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqadd_S_B_H_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <b, H, S, D>
|
2018-08-04 21:58:54 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x5E200C00; // SQADD B0, B0, B0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SQADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqadd_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-04 21:58:54 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E200C00; // SQADD V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SQADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqadd_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-08-04 21:58:54 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E200C00; // SQADD V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
2018-08-10 19:27:15 +02:00
|
|
|
[Test, Pairwise, Description("SQDMULH <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqdmulh_S_H_S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1H1S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1H1S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1H1S_")] [Random(RndCnt)] ulong b,
|
2018-08-10 19:27:15 +02:00
|
|
|
[Values(0b01u, 0b10u)] uint size) // <H, S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x5E20B400; // SQDMULH B0, B0, B0 (RESERVED)
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-10 19:27:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SQDMULH <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqdmulh_V_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-10 19:27:15 +02:00
|
|
|
[Values(0b01u, 0b10u)] uint size) // <4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E20B400; // SQDMULH V0.8B, V0.8B, V0.8B (RESERVED)
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-10 19:27:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SQDMULH <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqdmulh_V_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-10 19:27:15 +02:00
|
|
|
[Values(0b01u, 0b10u)] uint size) // <8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E20B400; // SQDMULH V0.16B, V0.16B, V0.16B (RESERVED)
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-10 19:27:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SQRDMULH <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqrdmulh_S_H_S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1H1S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1H1S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1H1S_")] [Random(RndCnt)] ulong b,
|
2018-08-10 19:27:15 +02:00
|
|
|
[Values(0b01u, 0b10u)] uint size) // <H, S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x7E20B400; // SQRDMULH B0, B0, B0 (RESERVED)
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-10 19:27:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SQRDMULH <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqrdmulh_V_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-10 19:27:15 +02:00
|
|
|
[Values(0b01u, 0b10u)] uint size) // <4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E20B400; // SQRDMULH V0.8B, V0.8B, V0.8B (RESERVED)
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-10 19:27:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SQRDMULH <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqrdmulh_V_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-10 19:27:15 +02:00
|
|
|
[Values(0b01u, 0b10u)] uint size) // <8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E20B400; // SQRDMULH V0.16B, V0.16B, V0.16B (RESERVED)
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-10 19:27:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-10 19:27:15 +02:00
|
|
|
}
|
|
|
|
|
2018-08-04 21:58:54 +02:00
|
|
|
[Test, Pairwise, Description("SQSUB <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqsub_S_B_H_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <b, H, S, D>
|
2018-08-04 21:58:54 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x5E202C00; // SQSUB B0, B0, B0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SQSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqsub_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-04 21:58:54 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E202C00; // SQSUB V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SQSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sqsub_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-08-04 21:58:54 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E202C00; // SQSUB V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
2018-08-27 08:44:01 +02:00
|
|
|
[Test, Pairwise, Description("SRHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Srhadd_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E201400; // SRHADD V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SRHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Srhadd_V_16B_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E201400; // SRHADD V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
2020-07-13 13:08:47 +02:00
|
|
|
[Test, Pairwise]
|
|
|
|
public void ShlReg_S_D([ValueSource("_ShlReg_S_D_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1D_")] [Random(0ul, 255ul, RndCnt)] ulong b)
|
|
|
|
{
|
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
|
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
|
|
|
}
|
|
|
|
|
2018-12-02 01:34:43 +01:00
|
|
|
[Test, Pairwise]
|
|
|
|
public void ShlReg_V_8B_4H_2S([ValueSource("_ShlReg_V_8B_4H_2S_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(0ul, 255ul, RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((size & 3) << 22);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-12-02 01:34:43 +01:00
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise]
|
|
|
|
public void ShlReg_V_16B_8H_4S_2D([ValueSource("_ShlReg_V_16B_8H_4S_2D_")] uint opcodes,
|
|
|
|
[Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(0ul, 255ul, RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
|
|
|
opcodes |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcodes |= ((size & 3) << 22);
|
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-12-02 01:34:43 +01:00
|
|
|
|
|
|
|
SingleOpcode(opcodes, v0: v0, v1: v1, v2: v2);
|
|
|
|
|
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
|
|
|
}
|
|
|
|
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Test, Pairwise, Description("SSUBL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Ssubl_V_8B8H_4H4S_2S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H, 4H4S, 2S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E202000; // SSUBL V0.8H, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SSUBL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Ssubl_V_16B8H_8H4S_4S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H, 8H4S, 4S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E202000; // SSUBL2 V0.8H, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE1(a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
}
|
|
|
|
|
2018-07-19 02:06:28 +02:00
|
|
|
[Test, Pairwise, Description("SSUBW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Ssubw_V_8B8H8H_4H4S4S_2S2D2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-19 02:06:28 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H8H, 4H4S4S, 2S2D2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E203000; // SSUBW V0.8H, V0.8H, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-19 02:06:28 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SSUBW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Ssubw_V_16B8H8H_8H4S4S_4S2D2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-19 02:06:28 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H8H, 8H4S4S, 4S2D2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E203000; // SSUBW2 V0.8H, V0.8H, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-19 02:06:28 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("SUB <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sub_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1D_")] [Random(RndCnt)] ulong b)
|
2018-04-20 17:40:15 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x7EE08400; // SUB D0, D0, D0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("SUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sub_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E208400; // SUB V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Sub_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E208400; // SUB V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SUBHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Subhn_V_8H8B_4S4H_2D2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8H8B, 4S4H, 2D2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E206000; // SUBHN V0.8B, V0.8H, V0.8H
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("SUBHN{2} <Vd>.<Tb>, <Vn>.<Ta>, <Vm>.<Ta>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Subhn_V_8H16B_4S8H_2D4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-04-20 17:40:15 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8H16B, 4S8H, 2D4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E206000; // SUBHN2 V0.16B, V0.8H, V0.8H
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-04-20 17:40:15 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("TRN1 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Trn1_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E002800; // TRN1 V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("TRN1 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Trn1_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E002800; // TRN1 V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, ~z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, ~a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, ~b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("TRN2 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Trn2_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E006800; // TRN2 V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("TRN2 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Trn2_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E006800; // TRN2 V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, ~z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, ~a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, ~b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("UABA <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uaba_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-30 17:40:41 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E207C00; // UABA V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UABA <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uaba_V_16B_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-30 17:40:41 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E207C00; // UABA V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UABAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uabal_V_8B8H_4H4S_2S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H, 4H4S, 2S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E205000; // UABAL V0.8H, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UABAL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uabal_V_16B8H_8H4S_4S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H, 8H4S, 4S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E205000; // UABAL2 V0.8H, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE1(a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("UABD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uabd_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-30 17:40:41 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E207400; // UABD V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UABD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uabd_V_16B_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-06-30 17:40:41 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E207400; // UABD V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("UABDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uabdl_V_8B8H_4H4S_2S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H, 4H4S, 2S2D>
|
2018-06-30 17:40:41 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E207000; // UABDL V0.8H, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-06-30 17:40:41 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("UABDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uabdl_V_16B8H_8H4S_4S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H, 8H4S, 4S2D>
|
2018-06-30 17:40:41 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E207000; // UABDL2 V0.8H, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE1(a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-06-30 17:40:41 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-04-20 17:40:15 +02:00
|
|
|
}
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Test, Pairwise, Description("UADDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uaddl_V_8B8H_4H4S_2S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H, 4H4S, 2S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E200000; // UADDL V0.8H, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UADDL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uaddl_V_16B8H_8H4S_4S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H, 8H4S, 4S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E200000; // UADDL2 V0.8H, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE1(a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
}
|
|
|
|
|
2018-07-19 02:06:28 +02:00
|
|
|
[Test, Pairwise, Description("UADDW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uaddw_V_8B8H8H_4H4S4S_2S2D2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-19 02:06:28 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H8H, 4H4S4S, 2S2D2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E201000; // UADDW V0.8H, V0.8H, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-19 02:06:28 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UADDW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uaddw_V_16B8H8H_8H4S4S_4S2D2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-19 02:06:28 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H8H, 8H4S4S, 4S2D2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E201000; // UADDW2 V0.8H, V0.8H, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-19 02:06:28 +02:00
|
|
|
}
|
|
|
|
|
2018-08-27 08:44:01 +02:00
|
|
|
[Test, Pairwise, Description("UHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uhadd_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E200400; // UHADD V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uhadd_V_16B_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E200400; // UHADD V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UHSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uhsub_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E202400; // UHSUB V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UHSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uhsub_V_16B_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E202400; // UHSUB V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
2018-08-04 21:58:54 +02:00
|
|
|
[Test, Pairwise, Description("UQADD <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uqadd_S_B_H_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <b, H, S, D>
|
2018-08-04 21:58:54 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x7E200C00; // UQADD B0, B0, B0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UQADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uqadd_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-04 21:58:54 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E200C00; // UQADD V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UQADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uqadd_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-08-04 21:58:54 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E200C00; // UQADD V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UQSUB <V><d>, <V><n>, <V><m>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uqsub_S_B_H_S_D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_1B1H1S1D_")] [Random(RndCnt)] ulong b,
|
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <b, H, S, D>
|
2018-08-04 21:58:54 +02:00
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x7E202C00; // UQSUB B0, B0, B0
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UQSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uqsub_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-04 21:58:54 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E202C00; // UQSUB V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UQSUB <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uqsub_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-08-04 21:58:54 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E202C00; // UQSUB V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-04 21:58:54 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
CompareAgainstUnicorn(fpsrMask: Fpsr.Qc);
|
2018-08-04 21:58:54 +02:00
|
|
|
}
|
|
|
|
|
2018-08-27 08:44:01 +02:00
|
|
|
[Test, Pairwise, Description("URHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Urhadd_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E201400; // URHADD V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("URHADD <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Urhadd_V_16B_8H_4S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-08-27 08:44:01 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B, 8H, 4S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E201400; // URHADD V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, b);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-08-27 08:44:01 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-08-27 08:44:01 +02:00
|
|
|
}
|
|
|
|
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Test, Pairwise, Description("USUBL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Usubl_V_8B8H_4H4S_2S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H, 4H4S, 2S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E202000; // USUBL V0.8H, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("USUBL{2} <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Usubl_V_16B8H_8H4S_4S2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H, 8H4S, 4S2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E202000; // USUBL2 V0.8H, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE1(a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
|
2018-09-08 19:23:07 +02:00
|
|
|
CompareAgainstUnicorn();
|
Add Fcvtns_S, Fcvtns_V, Fcvtnu_S, Fcvtnu_V (AOpCodeSimd) FP & Umlal_V, Umlsl_V, Saddl_V, Ssubl_V, Usubl_V instructions; add 8 FP & 16 S/Umlal_V, S/Umlsl_V, S/Uaddl_V, S/Usubl_V Tests. (#390)
* Update AOpCodeTable.cs
* Update AInstEmitSimdCvt.cs
* Update Pseudocode.cs
* Update Instructions.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Update CpuTestSimd.cs
* Update AOpCodeTable.cs
* Update AInstEmitSimdArithmetic.cs
* Update Instructions.cs
* Update CpuTestSimdReg.cs
* Add QCFlagBit.
* Add QCFlagBit.
2018-09-01 16:52:51 +02:00
|
|
|
}
|
|
|
|
|
2018-07-19 02:06:28 +02:00
|
|
|
[Test, Pairwise, Description("USUBW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Usubw_V_8B8H8H_4H4S4S_2S2D2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-19 02:06:28 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B8H8H, 4H4S4S, 2S2D2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x2E203000; // USUBW V0.8H, V0.8H, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-19 02:06:28 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("USUBW{2} <Vd>.<Ta>, <Vn>.<Ta>, <Vm>.<Tb>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Usubw_V_16B8H8H_8H4S4S_4S2D2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-19 02:06:28 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <16B8H8H, 8H4S4S, 4S2D2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x6E203000; // USUBW2 V0.8H, V0.8H, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, a);
|
|
|
|
V128 v2 = MakeVectorE1(b);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-19 02:06:28 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-19 02:06:28 +02:00
|
|
|
}
|
|
|
|
|
2018-07-15 05:53:26 +02:00
|
|
|
[Test, Pairwise, Description("UZP1 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uzp1_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E001800; // UZP1 V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UZP1 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uzp1_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E001800; // UZP1 V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, ~z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, ~a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, ~b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UZP2 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uzp2_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E005800; // UZP2 V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("UZP2 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Uzp2_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E005800; // UZP2 V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, ~z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, ~a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, ~b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("ZIP1 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Zip1_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E003800; // ZIP1 V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("ZIP1 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Zip1_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E003800; // ZIP1 V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, ~z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, ~a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, ~b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("ZIP2 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Zip2_V_8B_4H_2S([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u)] uint size) // <8B, 4H, 2S>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x0E007800; // ZIP2 V0.8B, V0.8B, V0.8B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, z);
|
|
|
|
V128 v1 = MakeVectorE0(a);
|
|
|
|
V128 v2 = MakeVectorE0(b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
|
|
|
|
|
|
|
[Test, Pairwise, Description("ZIP2 <Vd>.<T>, <Vn>.<T>, <Vm>.<T>")]
|
2018-11-01 05:22:09 +01:00
|
|
|
public void Zip2_V_16B_8H_4S_2D([Values(0u)] uint rd,
|
|
|
|
[Values(1u, 0u)] uint rn,
|
|
|
|
[Values(2u, 0u)] uint rm,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong z,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong a,
|
|
|
|
[ValueSource("_8B4H2S1D_")] [Random(RndCnt)] ulong b,
|
2018-07-15 05:53:26 +02:00
|
|
|
[Values(0b00u, 0b01u, 0b10u, 0b11u)] uint size) // <16B, 8H, 4S, 2D>
|
|
|
|
{
|
2018-11-01 05:22:09 +01:00
|
|
|
uint opcode = 0x4E007800; // ZIP2 V0.16B, V0.16B, V0.16B
|
|
|
|
opcode |= ((rm & 31) << 16) | ((rn & 31) << 5) | ((rd & 31) << 0);
|
|
|
|
opcode |= ((size & 3) << 22);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
Add a new JIT compiler for CPU code (#693)
* Start of the ARMeilleure project
* Refactoring around the old IRAdapter, now renamed to PreAllocator
* Optimize the LowestBitSet method
* Add CLZ support and fix CLS implementation
* Add missing Equals and GetHashCode overrides on some structs, misc small tweaks
* Implement the ByteSwap IR instruction, and some refactoring on the assembler
* Implement the DivideUI IR instruction and fix 64-bits IDIV
* Correct constant operand type on CSINC
* Move division instructions implementation to InstEmitDiv
* Fix destination type for the ConditionalSelect IR instruction
* Implement UMULH and SMULH, with new IR instructions
* Fix some issues with shift instructions
* Fix constant types for BFM instructions
* Fix up new tests using the new V128 struct
* Update tests
* Move DIV tests to a separate file
* Add support for calls, and some instructions that depends on them
* Start adding support for SIMD & FP types, along with some of the related ARM instructions
* Fix some typos and the divide instruction with FP operands
* Fix wrong method call on Clz_V
* Implement ARM FP & SIMD move instructions, Saddlv_V, and misc. fixes
* Implement SIMD logical instructions and more misc. fixes
* Fix PSRAD x86 instruction encoding, TRN, UABD and UABDL implementations
* Implement float conversion instruction, merge in LDj3SNuD fixes, and some other misc. fixes
* Implement SIMD shift instruction and fix Dup_V
* Add SCVTF and UCVTF (vector, fixed-point) variants to the opcode table
* Fix check with tolerance on tester
* Implement FP & SIMD comparison instructions, and some fixes
* Update FCVT (Scalar) encoding on the table to support the Half-float variants
* Support passing V128 structs, some cleanup on the register allocator, merge LDj3SNuD fixes
* Use old memory access methods, made a start on SIMD memory insts support, some fixes
* Fix float constant passed to functions, save and restore non-volatile XMM registers, other fixes
* Fix arguments count with struct return values, other fixes
* More instructions
* Misc. fixes and integrate LDj3SNuD fixes
* Update tests
* Add a faster linear scan allocator, unwinding support on windows, and other changes
* Update Ryujinx.HLE
* Update Ryujinx.Graphics
* Fix V128 return pointer passing, RCX is clobbered
* Update Ryujinx.Tests
* Update ITimeZoneService
* Stop using GetFunctionPointer as that can't be called from native code, misc. fixes and tweaks
* Use generic GetFunctionPointerForDelegate method and other tweaks
* Some refactoring on the code generator, assert on invalid operations and use a separate enum for intrinsics
* Remove some unused code on the assembler
* Fix REX.W prefix regression on float conversion instructions, add some sort of profiler
* Add hardware capability detection
* Fix regression on Sha1h and revert Fcm** changes
* Add SSE2-only paths on vector extract and insert, some refactoring on the pre-allocator
* Fix silly mistake introduced on last commit on CpuId
* Generate inline stack probes when the stack allocation is too large
* Initial support for the System-V ABI
* Support multiple destination operands
* Fix SSE2 VectorInsert8 path, and other fixes
* Change placement of XMM callee save and restore code to match other compilers
* Rename Dest to Destination and Inst to Instruction
* Fix a regression related to calls and the V128 type
* Add an extra space on comments to match code style
* Some refactoring
* Fix vector insert FP32 SSE2 path
* Port over the ARM32 instructions
* Avoid memory protection races on JIT Cache
* Another fix on VectorInsert FP32 (thanks to LDj3SNuD
* Float operands don't need to use the same register when VEX is supported
* Add a new register allocator, higher quality code for hot code (tier up), and other tweaks
* Some nits, small improvements on the pre allocator
* CpuThreadState is gone
* Allow changing CPU emulators with a config entry
* Add runtime identifiers on the ARMeilleure project
* Allow switching between CPUs through a config entry (pt. 2)
* Change win10-x64 to win-x64 on projects
* Update the Ryujinx project to use ARMeilleure
* Ensure that the selected register is valid on the hybrid allocator
* Allow exiting on returns to 0 (should fix test regression)
* Remove register assignments for most used variables on the hybrid allocator
* Do not use fixed registers as spill temp
* Add missing namespace and remove unneeded using
* Address PR feedback
* Fix types, etc
* Enable AssumeStrictAbiCompliance by default
* Ensure that Spill and Fill don't load or store any more than necessary
2019-08-08 20:56:22 +02:00
|
|
|
V128 v0 = MakeVectorE0E1(z, ~z);
|
|
|
|
V128 v1 = MakeVectorE0E1(a, ~a);
|
|
|
|
V128 v2 = MakeVectorE0E1(b, ~b);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-11-01 05:22:09 +01:00
|
|
|
SingleOpcode(opcode, v0: v0, v1: v1, v2: v2);
|
2018-07-15 05:53:26 +02:00
|
|
|
|
2018-09-01 16:24:05 +02:00
|
|
|
CompareAgainstUnicorn();
|
2018-07-15 05:53:26 +02:00
|
|
|
}
|
2018-04-20 17:40:15 +02:00
|
|
|
#endif
|
|
|
|
}
|
|
|
|
}
|