Commit graph

945 commits

Author SHA1 Message Date
ReinUsesLisp
e3534700d7 shader_ir/conversion: Split int and float selector and implement F2F H1 2019-08-28 16:09:33 -03:00
ReinUsesLisp
b13fbc25b8 shader_ir/conversion: Implement F2I F16 Ra.H1 2019-08-27 23:40:40 -03:00
ReinUsesLisp
6207751b00 float_set_predicate: Add missing negation bit for the second operand 2019-08-27 21:57:43 -03:00
ReinUsesLisp
4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
bunnei
cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
ReinUsesLisp
2ff8044806 shader_ir: Implement NOP 2019-08-04 03:02:55 -03:00
bunnei
52f54c728d
Merge pull request #2592 from FernandoS27/sync1
Implement GPU Synchronization Mechanisms & Correct NVFlinger
2019-07-26 14:26:44 -04:00
Fernando Sahmkow
a452ff983d MaxwellDMA: Fixes, corrections and relaxations.
This commit fixes offsets on Linear -> Tiled copies, corrects z pos
fortiled->linear copies, corrects bytes_per_pixel calculation in tiled
-> linear copies and relaxes some limitations set by latest dma fixes
refactors.
2019-07-25 20:41:42 -04:00
bunnei
31e8a61527
Merge pull request #2743 from FernandoS27/surpress-assert
Downgrade and suppress a series of GPU asserts and debug messages.
2019-07-25 12:34:36 -04:00
bunnei
9be9600bdc
Merge pull request #2704 from FernandoS27/conditional
maxwell3d: Implement Conditional Rendering
2019-07-24 17:07:57 -04:00
bunnei
f601f25bcc
Merge pull request #2734 from ReinUsesLisp/compute-shaders
gl_rasterizer: Implement compute shaders
2019-07-22 11:12:55 -04:00
bunnei
27e10e0442
Merge pull request #2735 from FernandoS27/pipeline-rework
Rework Dirty Flags in GPU Pipeline, Optimize CBData and Redo Clearing mechanism
2019-07-21 00:59:52 -04:00
Fernando Sahmkow
11f4e739bd Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
Fernando Sahmkow
7a35178ee2 Maxwell3D: Reorganize and address feedback 2019-07-20 10:18:35 -04:00
ReinUsesLisp
6c4985edc9 shader/half_set_predicate: Implement missing HSETP2 variants 2019-07-19 22:20:47 -03:00
Fernando Sahmkow
3a3fee5abf MaxwellDMA/KeplerCopy: Downgrade DMA log message to Trace.
This log was just to know which games used DMA. It's no longer 
important.
2019-07-18 08:31:38 -04:00
Fernando Sahmkow
4be61013a1 GL_State: Feedback and fixes 2019-07-17 17:29:56 -04:00
Fernando Sahmkow
5ad889f6fd Maxwell3D: Address Feedback 2019-07-17 17:29:55 -04:00
Fernando Sahmkow
8cdbfe69b1 GL_Rasterizer: Corrections to Clearing. 2019-07-17 17:29:54 -04:00
Fernando Sahmkow
0ff4a5fa39 Maxwell3D: Correct marking dirtiness on CB upload 2019-07-17 17:29:53 -04:00
Fernando Sahmkow
fec32fed18 GL_Rasterizer: Rework RenderTarget/DepthBuffer clearing 2019-07-17 17:29:52 -04:00
Fernando Sahmkow
a081dea8ab Maxwell3D: Implement State Dirty Flags. 2019-07-17 17:29:51 -04:00
Fernando Sahmkow
0d3db58657 Maxwell3D: Rework CBData Upload 2019-07-17 17:29:50 -04:00
Fernando Sahmkow
f2e7b29c14 Maxwell3D: Rework the dirty system to be more consistant and scaleable 2019-07-17 17:29:49 -04:00
Fernando Sahmkow
e42bcf2314 maxwell3d: Implement Conditional Rendering
Conditional Rendering takes care of conditionaly clearing or drawing
depending on a set of queries. This PR implements the query checks to
stablish if things can be rendered or not.
2019-07-17 17:13:19 -04:00
ReinUsesLisp
725ba6cf63 gl_rasterizer: Implement compute shaders 2019-07-15 17:38:25 -03:00
Fernando Sahmkow
1bdb59fc6e
Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
bunnei
3477b92289
Merge pull request #2675 from ReinUsesLisp/opengl-buffer-cache
buffer_cache: Implement a generic buffer cache and its OpenGL backend
2019-07-14 19:03:43 -04:00
Fernando Sahmkow
0ec9da2f9f
Merge pull request #2692 from ReinUsesLisp/tlds-f16
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
Fernando Sahmkow
8a6fc529a9 shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
ReinUsesLisp
c9d886c84e gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
ReinUsesLisp
d0966b9f7c shader/texture: Add F16 support for TLDS 2019-07-07 16:05:56 -03:00
ReinUsesLisp
7ecf64257a gl_rasterizer: Minor style changes 2019-07-06 00:37:55 -03:00
Fernando Sahmkow
82b829625b video_core: Implement GPU side Syncpoints 2019-07-05 15:49:11 -04:00
ReinUsesLisp
4d63f97945 shader_bytecode: Include missing <array> 2019-06-24 01:51:02 -03:00
Fernando Sahmkow
082740d34d surface: Correct format S8Z24 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
7232a1ed16 decoders: correct block calculation 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
cb728797b0 fermi2d: Correct Origin Mode 2019-06-20 21:38:34 -03:00
Fernando Sahmkow
175aa343ff texture_cache: Fermi2D reform and implement View Mirage
This also does some fixes on compressed textures reinterpret and on the
Fermi2D engine in general.
2019-06-20 21:38:33 -03:00
ReinUsesLisp
06c4ce8645 shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp
b8c75a845b maxwell_3d: Partially implement texture buffers as 1D textures 2019-06-20 21:36:12 -03:00
ReinUsesLisp
4e81fc8296 shader: Implement texture buffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
d267948a73 texture_cache: loose TryReconstructSurface when accurate GPU is not on.
Also corrects some asserts.
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
6bd034eae9 engine_upload: Addapt to new Texture Cache 2019-06-20 21:36:12 -03:00
ReinUsesLisp
345e73f2fe video_core: Use un-shifted block sizes to avoid integer divisions
Instead of storing all block width, height and depths in their shifted
form:

block_width = 1U << block_shift;

Store them like they are provided by the emulated hardware (their
block_shift form). This way we can avoid doing the costly
Common::AlignUp operation to align texture sizes and drop CPU integer
divisions with bitwise logic (defined in Common::AlignBits).
2019-06-20 21:36:12 -03:00
bunnei
c7b5c245e1
Merge pull request #2562 from ReinUsesLisp/split-cbuf-upload
video_core/engines: Move ConstBufferInfo out of Maxwell3D
2019-06-17 22:35:04 -04:00
ReinUsesLisp
528c15051c kepler_compute: Use std::array for cbuf info 2019-06-07 20:36:22 -03:00
ReinUsesLisp
17d5fb6d06 kepler_compute: Fix block_dim_x encoding 2019-06-07 20:35:46 -03:00
ReinUsesLisp
2f2a61887a video_core/engines: Move ConstBufferInfo out of Maxwell3D 2019-06-07 19:47:15 -03:00
Fernando Sahmkow
a32c52b1d8 shader_bytecode: Mark EXIT as flow instruction 2019-06-04 12:18:35 -04:00
ReinUsesLisp
75e7b45d69 shader/memory: Implement ST (generic memory) 2019-05-20 22:41:53 -03:00
ReinUsesLisp
f78ef617b6 shader/memory: Implement LD (generic memory) 2019-05-20 22:38:59 -03:00
bunnei
d49efbfb4a
Merge pull request #2441 from ReinUsesLisp/al2p
shader: Implement AL2P and ALD.PHYS
2019-05-19 14:02:58 -04:00
Hexagon12
b54bd3f018
Merge pull request #2472 from FernandoS27/tic
maxwell_3d: reduce severity of different component formats assert.
2019-05-19 15:04:47 +01:00
Hexagon12
3bd5f01240
Merge pull request #2469 from lioncash/copyable
video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
2019-05-19 15:02:17 +01:00
Sebastian Valle
a6ed792ac4
Merge pull request #2470 from lioncash/ranged-for
video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
2019-05-19 09:01:19 -05:00
Fernando Sahmkow
fc975e9021 maxwell_3d: reduce sevirity of different component formats assert.
This was reduced due to happening on most games and at such constant
rate that it affected performance heavily for the end user. In general,
we are well aware of the assert and an implementation is already
planned.
2019-05-14 17:12:54 -04:00
Lioncash
b01cce716e video_core/engines/engine_upload: Amend constructor initializer list order
Silences a -Wreorder warning.
2019-05-14 13:43:28 -04:00
Lioncash
9b6d993e52 video_core/engines/engine_upload: Default destructor in the cpp file
Avoids inlining destruction logic where applicable, and also makes
forward declarations not cause unexpected compilation errors depending
on where the State class is used.
2019-05-14 13:41:41 -04:00
Lioncash
ec1c69258a video_core/engines/engine_upload: Remove unnecessary const on parameters in function declarations
These only apply in the definition of the function. They can be omitted
from the declaration.
2019-05-14 13:40:09 -04:00
Lioncash
0f83c8dffa video_core/engines/engine_upload: Remove unnecessary includes 2019-05-14 13:39:04 -04:00
Lioncash
5db1b54b58 video_core/engines/maxwell3d: Get rid of three magic values in CallMethod()
We can use the named constant instead of using 32 directly.
2019-05-14 09:02:47 -04:00
Lioncash
48ce5880a0 video_core/engines/maxwell_3d: Simplify for loops into ranged for loops within InitializeRegisterDefaults()
Lessens the amount of code that needs to be read, and gets rid of the
need to introduce an indexing variable. Instead, we just operate on the
objects directly.
2019-05-14 08:53:19 -04:00
Lioncash
c212fc9b2c video_core/engines/maxwell_3d: Add is_trivially_copyable_v check for Regs
std::memset is used to clear the entire register structure, which
requires that the Regs struct be trivially copyable (otherwise undefined
behavior is invoked). This prevents the case where a non-trivial type is
potentially added to the struct.
2019-05-14 08:47:56 -04:00
bunnei
c27b81cb85
Merge pull request #2429 from FernandoS27/compute
Corrections and Implementation on GPU Engines
2019-05-09 13:19:22 -04:00
ReinUsesLisp
d4df803b2b shader_ir/other: Implement IPA.IDX 2019-05-02 21:46:37 -03:00
ReinUsesLisp
71aa9d0877 shader_ir/memory: Implement physical input attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp
bd81a03d9d gl_shader_decompiler: Declare all possible varyings on physical attribute usage 2019-05-02 21:46:25 -03:00
ReinUsesLisp
7632a7d6d2 shader_bytecode: Add AL2P decoding 2019-05-02 21:46:25 -03:00
Fernando Sahmkow
e64c41efe8 Refactors and name corrections. 2019-05-01 15:31:39 -04:00
bunnei
c52233ec8b
Merge pull request #2322 from ReinUsesLisp/wswitch
video_core: Silent -Wswitch warnings
2019-04-28 22:24:58 -04:00
Fernando Sahmkow
b3118ee316 Fixes and Corrections to DMA Engine 2019-04-23 15:28:18 -04:00
Fernando Sahmkow
f1e5314f1a Add Swizzle Parameters to the DMA engine 2019-04-23 11:21:00 -04:00
Fernando Sahmkow
e140e2ebc6 Add Documentation Headers to all the GPU Engines 2019-04-23 08:44:52 -04:00
Fernando Sahmkow
021d28c9b8 Corrections and styling 2019-04-23 08:02:24 -04:00
Fernando Sahmkow
701ce1c9d0 Implement Maxwell3D Data Upload 2019-04-22 19:27:36 -04:00
Fernando Sahmkow
e4ff140b99 Introduce skeleton of the GPU Compute Engine. 2019-04-22 19:05:43 -04:00
Fernando Sahmkow
a91d3fc639 Revamp Kepler Memory to use a subegine to manage uploads 2019-04-22 18:50:56 -04:00
bunnei
68b707711a
Merge pull request #2411 from FernandoS27/unsafe-gpu
GPU Manager: Implement ReadBlockUnsafe and WriteBlockUnsafe
2019-04-22 17:09:00 -04:00
bunnei
01100f8afd
Merge pull request #2400 from FernandoS27/corret-kepler-mem
Implement Kepler Memory on both Linear and BlockLinear.
2019-04-22 16:47:05 -04:00
bunnei
da0c3bc658
Merge pull request #2407 from FernandoS27/f2f
Do some corrections in conversion shader instructions.
2019-04-20 00:42:34 -04:00
ReinUsesLisp
fbe8d1ceaa video_core: Silent -Wswitch warnings 2019-04-18 15:54:39 -03:00
bunnei
5bd5140bde
Merge pull request #2348 from FernandoS27/guest-bindless
Implement Bindless Textures on Shader Decompiler and GL backend
2019-04-17 20:59:49 -04:00
bunnei
0cfbd3325b
Merge pull request #2315 from ReinUsesLisp/severity-decompiler
shader_ir/decode: Reduce the severity of common assertions
2019-04-16 22:21:19 -04:00
Fernando Sahmkow
ef381e6924 Use ReadBlockUnsafe on TIC and TSC reading
Use ReadBlockUnsafe on TIC and TSC reading as memory is never flushed
from host GPU there.
2019-04-15 23:10:24 -04:00
Fernando Sahmkow
3e96c367bd Use WriteBlock and ReadBlock. 2019-04-15 22:42:34 -04:00
Fernando Sahmkow
bec28d692d Implement Block Linear copies in Kepler Memory. 2019-04-15 21:22:16 -04:00
Fernando Sahmkow
aa471274d9 Do some corrections in conversion shader instructions.
Corrects encodings for I2F, F2F, I2I and F2I
Implements Immediate variants of all four conversion types.
Add assertions to unimplemented stuffs.
2019-04-15 19:16:27 -04:00
Fernando Sahmkow
8a099ac99f Correct Kepler Memory on Linear Pushes. 2019-04-15 14:51:36 -04:00
ReinUsesLisp
5c280e6ff0 shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
bunnei
353a099481
Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
Fernando Sahmkow
5c55ae4e18 Correct LOP_IMN encoding 2019-04-08 13:39:12 -04:00
Fernando Sahmkow
16adc735a5 Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 -04:00
Fernando Sahmkow
492040bd9c Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. 2019-04-08 11:36:11 -04:00
Fernando Sahmkow
4841440382 Implement TXQ_B 2019-04-08 11:29:52 -04:00
Fernando Sahmkow
ac3ba9a33e Corrections to TEX_B 2019-04-08 11:28:44 -04:00
Fernando Sahmkow
7af82ca022 Implement Bindless Handling on SetupTexture 2019-04-08 11:23:46 -04:00
Fernando Sahmkow
e28fd3d0a5 Implement Bindless Samplers and TEX_B in the IR. 2019-04-08 11:23:42 -04:00
ReinUsesLisp
ddcb711ee8 maxwell_3d: Reduce severity of ProcessSyncPoint 2019-04-06 02:18:20 -03:00
bunnei
864280fabc
Merge pull request #2317 from FernandoS27/sync
Implement SyncPoint Register in the GPU.
2019-04-05 23:50:54 -04:00
Fernando Sahmkow
fc91e21206 Implement SyncPoint Register in the GPU. 2019-04-05 19:19:30 -04:00
Lioncash
22f02076c6 video_core/engines: Make memory manager members private
These aren't used externally by anything, so they can be made private
data members.
2019-04-05 18:26:43 -04:00
Lioncash
26223f8124 video_core/engines: Remove unnecessary inclusions where applicable
Replaces header inclusions with forward declarations where applicable
and also removes unused headers within the cpp file. This reduces a few
more dependencies on core/memory.h
2019-04-05 18:26:32 -04:00
ReinUsesLisp
04979560fb shader_ir/memory: Reduce severity of LD_L cache management and log it 2019-04-03 17:12:44 -03:00
ReinUsesLisp
24abeb9a67 shader_ir/memory: Reduce severity of ST_L cache management and log it 2019-04-03 17:12:44 -03:00
bunnei
19330f45d3 maxwell_dma: Check for valid source in destination before copy.
- Avoid a crash in Octopath Traveler.
2019-03-20 22:36:03 -04:00
bunnei
22d3dfbcd4 gpu: Rewrite virtual memory manager using PageTable. 2019-03-20 22:36:02 -04:00
bunnei
574e89d924 video_core: Refactor to use MemoryManager interface for all memory access.
# Conflicts:
#	src/video_core/engines/kepler_memory.cpp
#	src/video_core/engines/maxwell_3d.cpp
#	src/video_core/morton.cpp
#	src/video_core/morton.h
#	src/video_core/renderer_opengl/gl_global_cache.cpp
#	src/video_core/renderer_opengl/gl_global_cache.h
#	src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
2019-03-16 00:38:48 -04:00
bunnei
2eaf6c41a4 gpu: Use host address for caching instead of guest address. 2019-03-14 22:34:42 -04:00
bunnei
633ce92908
Merge pull request #2147 from ReinUsesLisp/texture-clean
shader_ir: Remove "extras" from the MetaTexture
2019-03-10 17:28:36 -04:00
bunnei
7b574f406b gpu: Move command processing to another thread. 2019-03-06 21:48:57 -05:00
Lioncash
f9ee0dc7ee video_core/engines: Remove unnecessary includes
Removes a few unnecessary dependencies on core-related machinery, such
as the core.h and memory.h, which reduces the amount of rebuilding
necessary if those files change.

This also uncovered some indirect dependencies within other source
files. This also fixes those.
2019-03-05 20:35:32 -05:00
bunnei
f15e2dd881
Merge pull request #2163 from ReinUsesLisp/bitset-dirty
maxwell_3d: Use std::bitset to manage dirty flags
2019-02-27 20:50:08 -05:00
Lioncash
b9238edd0d common/math_util: Move contents into the Common namespace
These types are within the common library, so they should be within the
Common namespace.
2019-02-27 03:38:39 -05:00
ReinUsesLisp
5219edd715 maxwell_3d: Use std::bitset to manage dirty flags 2019-02-26 03:01:48 -03:00
ReinUsesLisp
5ca63d0675 shader/decode: Remove extras from MetaTexture 2019-02-26 00:11:30 -03:00
ReinUsesLisp
48e6f77c03 shader/decode: Split memory and texture instructions decoding 2019-02-26 00:11:30 -03:00
bunnei
c07987dfab
Merge pull request #2118 from FernandoS27/ipa-improve
shader_decompiler: Improve Accuracy of Attribute Interpolation.
2019-02-24 23:04:22 -05:00
Lioncash
a8fa5019b5 video_core: Remove usages of System::GetInstance() within the engines
Avoids the use of the global accessor in favor of explicitly making the
system a dependency within the interface.
2019-02-15 22:06:23 -05:00
Lioncash
bd983414f6 core_timing: Convert core timing into a class
Gets rid of the largest set of mutable global state within the core.
This also paves a way for eliminating usages of GetInstance() on the
System class as a follow-up.

Note that no behavioral changes have been made, and this simply extracts
the functionality into a class. This also has the benefit of making
dependencies on the core timing functionality explicit within the
relevant interfaces.
2019-02-15 21:50:25 -05:00
Fernando Sahmkow
10682ad7e0 shader_decompiler: Improve Accuracy of Attribute Interpolation. 2019-02-14 03:25:07 -04:00
bunnei
8135f4bfce
Merge pull request #2110 from lioncash/namespace
core_timing: Rename CoreTiming namespace to Core::Timing
2019-02-12 19:26:37 -05:00
bunnei
c440ecfafe
Merge pull request #2104 from ReinUsesLisp/compute-assert
kepler_compute: Fixup assert and rename the engine
2019-02-12 19:24:34 -05:00
Lioncash
48d9d66dc5 core_timing: Rename CoreTiming namespace to Core::Timing
Places all of the timing-related functionality under the existing Core
namespace to keep things consistent, rather than having the timing
utilities sitting in its own completely separate namespace.
2019-02-12 12:42:17 -05:00
Fernando Sahmkow
f5ec165e8c Corrected F2I None mode to RoundEven. 2019-02-11 18:46:45 -04:00
ReinUsesLisp
1ddcd0e6f0 kepler_compute: Fixup assert and rename engines
When I originally added the compute assert I used the wrong
documentation. This addresses that.

The dispatch register was tested with homebrew against hardware and is
triggered by some games (e.g. Super Mario Odyssey). What exactly is
missing to get a valid program bound by this engine requires more
investigation.
2019-02-10 19:29:33 -03:00
bunnei
dd1aab5446 gl_rasterizer: Implement a more accurate fermi 2D copy.
- This is a blit, use the blit registers.
2019-02-06 21:54:21 -05:00
bunnei
10ab714fe0
Merge pull request #2042 from ReinUsesLisp/nouveau-tex
maxwell_3d: Allow texture handles with TIC id zero
2019-02-06 20:19:20 -05:00
bunnei
72c70d6808
Merge pull request #2081 from ReinUsesLisp/lmem-64
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
bunnei
bb4549a73d
Merge pull request #2082 from FernandoS27/txq-stl
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00
Mat M
a568cd805b
Update src/video_core/engines/shader_bytecode.h
Co-Authored-By: FernandoS27 <fsahmkow27@gmail.com>
2019-02-03 21:27:26 -04:00
Fernando Sahmkow
0306c50339 Fix TXQ not using the component mask. 2019-02-03 18:17:18 -04:00
ReinUsesLisp
2bdbb90af7 video_core: Assert on invalid GPU to CPU address queries 2019-02-03 04:58:40 -03:00
ReinUsesLisp
04e68e9738 maxwell_3d: Allow sampler handles with TSC id zero 2019-02-03 04:58:40 -03:00
ReinUsesLisp
390721a561 maxwell_3d: Allow texture handles with TIC id zero
Also remove "enabled" field from Tegra::Texture::FullTextureInfo because
it would become unused.
2019-02-03 04:58:24 -03:00
ReinUsesLisp
9feb68085d shader_bytecode: Rename BytesN enums to BitsN 2019-02-03 00:25:40 -03:00
ReinUsesLisp
477d616f7d shader_ir: Unify constant buffer offset values
Constant buffer values on the shader IR were using different offsets if
the access direct or indirect. cbuf34 has a non-multiplied offset while
cbuf36 does. On shader decoding this commit multiplies it by four on
cbuf34 queries.
2019-01-30 02:45:50 -03:00
ReinUsesLisp
3b84e04af1 shader_decode: Implement LDG and basic cbuf tracking 2019-01-30 00:00:15 -03:00
bunnei
1f4ca1e841
Merge pull request #1927 from ReinUsesLisp/shader-ir
video_core: Replace gl_shader_decompiler with an IR based decompiler
2019-01-25 23:42:14 -05:00
ReinUsesLisp
9a82dec74a maxwell_3d: Set rt_separate_frag_data to 1 by default
Commercial games assume that this value is 1 but they never set it. On
the other hand nouveau manually sets this register. On
ConfigureFramebuffers we were asserting for what we are actually
implementing (according to envytools).
2019-01-22 04:14:29 -03:00
ReinUsesLisp
a1b845b651 shader_decode: Implement VMAD and VSETP 2019-01-15 17:54:53 -03:00
ReinUsesLisp
dd91650aaf shader_decode: Implement HFMA2 2019-01-15 17:54:52 -03:00
ReinUsesLisp
4316eaf75c shader_decode: Fixup clang-format 2019-01-15 17:54:52 -03:00
ReinUsesLisp
15a0e1481d shader_ir: Initial implementation 2019-01-15 17:54:49 -03:00
ReinUsesLisp
294df41b86 shader_bytecode: Fixup encoding 2019-01-15 17:54:49 -03:00
ReinUsesLisp
a0c8c16d07 shader_header: Make local memory size getter constant 2019-01-15 17:54:49 -03:00
ReinUsesLisp
b683e41fca gl_rasterizer_cache: Use dirty flags for the depth buffer 2019-01-07 16:22:28 -03:00
ReinUsesLisp
179ee963db gl_rasterizer_cache: Use dirty flags for color buffers 2019-01-07 16:20:39 -03:00
ReinUsesLisp
0ab17ab406 gl_shader_cache: Use dirty flags for shaders 2019-01-07 16:13:12 -03:00
ReinUsesLisp
aaa0e6c346 shader_bytecode: Fixup TEXS.F16 encoding 2018-12-26 01:35:44 -03:00
David Marcec
fdd649e2ef Fixed uninitialized memory due to missing returns in canary
Functions which are suppose to crash on non canary builds usually don't return anything which lead to uninitialized memory being used.
2018-12-19 12:52:32 +11:00
ReinUsesLisp
ef061481c5 shader_bytecode: Fixup half float's operator B encoding 2018-12-18 04:28:50 -03:00
heapo
72599cc667 Implement postfactor multiplication/division for fmul instructions 2018-12-17 07:56:25 -08:00
ReinUsesLisp
59a8df1b14 gl_shader_decompiler: Implement TEXS.F16 2018-12-05 02:06:34 -03:00
ReinUsesLisp
2908d30274 gl_rasterizer: Enable clip distances when set in register and in shader 2018-11-29 16:58:20 -03:00
bunnei
5a9a84994a
Merge pull request #1808 from Tinob/master
Fix clip distance and viewport
2018-11-28 17:47:28 -05:00
bunnei
3fe8ab0d99
Merge pull request #1786 from Tinob/DepthClamp
Add Depth Clamp Support
2018-11-28 17:46:55 -05:00
bunnei
6f849887c9
Merge pull request #1792 from bunnei/dma-pusher
gpu: Rewrite GPU command list processing with DmaPusher class.
2018-11-28 10:12:37 -05:00
bunnei
881f5ad70f
Merge pull request #1735 from FernandoS27/tex-spacing
Texture decoder: Implemented Tile Width Spacing
2018-11-27 19:21:17 -05:00
bunnei
abea6fa90c gpu: Rewrite GPU command list processing with DmaPusher class.
- More accurate impl., fixes Undertale (among other games).
2018-11-26 23:14:01 -05:00
Rodolfo Bogado
dfdbfa69e5 Implement depth clamp 2018-11-26 20:56:32 -03:00
Rodolfo Bogado
8e971f5062 Add support for Clip Distance enabled register 2018-11-26 20:45:21 -03:00
bunnei
1856d0ee8a
Merge pull request #1794 from Tinob/master
Add support for viewport_transfom_enable register
2018-11-26 18:34:09 -05:00
bunnei
67a154e23d
Merge pull request #1723 from degasus/dirty_flags
gl_rasterizer: Skip VB upload if the state is clean.
2018-11-26 18:33:22 -05:00
Marcos
cb8d51e37e GPU States: Implement Polygon Offset. This is used in SMO all the time. (#1784)
* GPU States: Implement Polygon Offset. This is used in SMO all the time.

* Clang Format fixes.

* Initialize polygon_offset in the constructor.
2018-11-26 18:31:44 -05:00
bunnei
a41943dc55
Merge pull request #1798 from ReinUsesLisp/y-direction
gl_shader_decompiler: Implement S2R's Y_DIRECTION
2018-11-26 18:25:42 -05:00
FernandoS27
ddfbe0b58d Implemented Tile Width Spacing 2018-11-26 09:05:12 -04:00
bunnei
f9a211220c
Merge pull request #1763 from ReinUsesLisp/bfi
gl_shader_decompiler: Implement BFI_IMM_R
2018-11-25 23:04:57 -05:00
bunnei
d7d1ab15b6
Merge pull request #1760 from ReinUsesLisp/r2p
gl_shader_decompiler: Implement R2P_IMM
2018-11-25 22:38:42 -05:00
bunnei
8ce90a4f0b
Merge pull request #1783 from ReinUsesLisp/clip-distances
gl_shader_decompiler: Implement clip distances
2018-11-25 22:35:30 -05:00
ReinUsesLisp
924e834b8f gl_shader_decompiler: Implement S2R's Y_DIRECTION 2018-11-25 04:37:29 -03:00
Rodolfo Bogado
13f6a603c2 Add support for viewport_transfom_enable register 2018-11-24 13:17:48 -03:00
bunnei
e23543918b
Merge pull request #1785 from Tinob/master
Add support for clear_flags register
2018-11-23 23:55:56 -05:00
bunnei
b6b78203cc
Merge pull request #1769 from ReinUsesLisp/cc
gl_shader_decompiler: Rename cc to condition code and name internal flags
2018-11-23 23:31:04 -05:00
Rodolfo Bogado
54c2a4cafc Add support for clear_flags register 2018-11-24 00:16:33 -03:00
Hexagon12
3135dbc29c Added predicate comparison LessEqualWithNan (#1736)
* Added predicate comparison LessEqualWithNan

* oops

* Clang fix
2018-11-23 08:51:32 -08:00
ReinUsesLisp
b3853403b7 gl_shader_decompiler: Implement clip distances 2018-11-23 02:14:43 -03:00
bunnei
0e6a608245 maxwell_3d: Implement alternate blend equations.
- Used by Undertale.
2018-11-22 00:51:01 -05:00
ReinUsesLisp
8a5e6fce07 gl_shader_decompiler: Rename control codes to condition codes 2018-11-21 22:31:16 -03:00
ReinUsesLisp
642dfeda2a gl_shader_decompiler: Implement BFI_IMM_R 2018-11-21 16:12:30 -03:00
ReinUsesLisp
d92afc7493 gl_shader_decompiler: Implement R2P_IMM 2018-11-21 04:56:00 -03:00
bunnei
1a543723ab maxwell_3d: Initialize rasterizer color mask registers as enabled.
- Fixes rendering regression with Sonic Mania.
2018-11-20 19:58:06 -05:00
Rodolfo Bogado
5297495c87 small fix for alphaToOne bit location 2018-11-17 19:59:34 -03:00
Rodolfo Bogado
e69eb3c760 fix for gcc compilation 2018-11-17 19:59:34 -03:00
Rodolfo Bogado
53b4a1af0f add AlphaToCoverage and AlphaToOne 2018-11-17 19:59:34 -03:00
Rodolfo Bogado
8ed7e1af2c add support for fragment_color_clamp 2018-11-17 19:59:33 -03:00
Rodolfo Bogado
6a2aa6dbdb set default value for point size register 2018-11-17 19:59:33 -03:00
Rodolfo Bogado
1881e86c43 fix viewport and scissor behavior 2018-11-17 19:59:32 -03:00
Markus Wick
97f5c4ffd3 gl_rasterizer: Skip VB upload if the state is clean. 2018-11-17 14:28:54 +01:00
Frederic L
ab362aa7e5
gl_rasterizer: Minor cleanup
Minor code cleanup from unaddressed feedback in #1654
2018-11-13 14:07:23 +01:00
Rodolfo Bogado
4a6eff3b7b Try to fix problems with stencil test in some games, relax translation to opengl enums to avoid crashing and only generate logs of the errors. 2018-11-11 16:31:00 -03:00
bunnei
8ea6261547
Merge pull request #1654 from degasus/dirty_flags
gl_rasterizer: Skip VAO binding if the state is clean.
2018-11-11 08:17:57 -08:00
Markus Wick
359db6a673 gl_rasterizer: Skip VAO binding if the state is clean. 2018-11-06 22:31:33 +01:00
Rodolfo Bogado
19038db489 Add support to color mask to avoid issues in blending caused by wrong values in the alpha channel in some render targets. 2018-11-05 00:24:19 -03:00
Rodolfo Bogado
145ae36963 Implement multi-target viewports and blending 2018-11-04 20:49:48 -03:00
bunnei
9afcbba8e4
Merge pull request #1527 from FernandoS27/assert-flow
Assert Control Flow Instructions using Control Codes
2018-11-01 00:34:56 -04:00
bunnei
de0ab806df maxwell_3d: Restructure macro upload to use a single macro code memory.
- Fixes an issue where macros could be skipped.
- Fixes rendering of distant objects in Super Mario Odyssey.
2018-10-31 23:29:21 -04:00
bunnei
86e70cf302
Merge pull request #1528 from FernandoS27/assert-control-codes
Assert Control Codes Generation on Shader Instructions
2018-10-31 22:34:18 -04:00
FernandoS27
5bb80ab009 Assert Control Codes Generation 2018-10-30 13:37:55 -04:00
Frederic L
7a5eda5914 global: Use std::optional instead of boost::optional (#1578)
* get rid of boost::optional

* Remove optional references

* Use std::reference_wrapper for optional references

* Fix clang format

* Fix clang format part 2

* Adressed feedback

* Fix clang format and MacOS build
2018-10-30 00:03:25 -04:00
FernandoS27
3aa8b644a9 Assert Control Flow Instructions using Control Codes 2018-10-28 19:16:41 -04:00
Rodolfo Bogado
0287b2be6d Implement sRGB Support, including workarounds for nvidia driver issues and QT sRGB support 2018-10-28 01:13:55 -03:00
bunnei
58444a0376 gl_rasterizer: Implement primitive restart. 2018-10-26 00:42:57 -04:00
bunnei
d278f25bda
Merge pull request #1533 from FernandoS27/lmem
Implemented Shader Local Memory
2018-10-26 00:16:25 -04:00
bunnei
949d9a7136 maxwell_3d: Add code for initializing register defaults. 2018-10-25 23:42:39 -04:00
FernandoS27
ca142f35c0 Implemented LD_L and ST_L 2018-10-24 17:51:53 -04:00
bunnei
69b35d7615
Merge pull request #1554 from FernandoS27/pointsize
Implement PointSize Output Attribute.
2018-10-24 17:38:38 -04:00
Lioncash
a97cdb5eb4
maxwell_3d: Remove unused variable within ProcessQueryGet() 2018-10-23 23:50:16 -04:00
FernandoS27
ed8ca608a0 Implement PointSize 2018-10-23 15:08:00 -04:00
bunnei
5716496239
Merge pull request #1519 from ReinUsesLisp/vsetp
gl_shader_decompiler: Implement VSETP
2018-10-23 10:22:37 -04:00
bunnei
0f3d8c2574
Merge pull request #1539 from lioncash/dma
maxwell_dma: Silence compilation warnings
2018-10-23 10:22:12 -04:00
bunnei
75d807788c
Merge pull request #1470 from FernandoS27/alpha_testing
Implemented Alpha Test using Shader Emulation
2018-10-23 10:21:30 -04:00
ReinUsesLisp
7d6dca0d0a gl_shader_decompiler: Implement VSETP 2018-10-23 01:07:20 -03:00
ReinUsesLisp
5dfb43531c gl_shader_decompiler: Abstract VMAD into a video subset 2018-10-23 01:07:20 -03:00
bunnei
848a49112a
Merge pull request #1512 from ReinUsesLisp/brk
gl_shader_decompiler: Implement PBK and BRK
2018-10-23 00:01:38 -04:00
FernandoS27
259da93567 Added Saturation to FMUL32I 2018-10-22 20:22:15 -04:00
FernandoS27
aa620c14af Implemented Alpha Testing 2018-10-22 15:07:30 -04:00
FernandoS27
5c5b4e8e7d Fixed FSETP and FSET 2018-10-22 11:31:17 -04:00
Lioncash
c1e5525fc6 engines/maxwell_*: Use nested namespace specifiers where applicable
These three source files are the only ones within the engines directory
that don't use nested namespaces. We may as well change these over to
keep things consistent.
2018-10-20 15:58:09 -04:00
Lioncash
d53c73adaa maxwell_dma: Make variables const where applicable within HandleCopy()
These are never modified, so we can make that assumption explicit.
2018-10-20 15:56:01 -04:00
Lioncash
dd1ee39426 maxwell_dma: Make FlushAndInvalidate's size parameter a u64
This prevents truncation warnings at the lambda's usage sites.
2018-10-20 15:54:45 -04:00
Lioncash
08e574eec4 maxwell_dma: Remove unused variables in HandleCopy()
These pointer variables are never used, so we can get rid of them.
2018-10-20 15:53:24 -04:00
bunnei
b1f8bff7db
Merge pull request #1501 from ReinUsesLisp/half-float
gl_shader_decompiler: Implement H* instructions
2018-10-19 23:47:19 -04:00
bunnei
7e665c2721 GPU: Improved implementation of maxwell DMA (Subv). 2018-10-18 22:41:53 -04:00
bunnei
a5d853a9f8 GPU: Invalidate destination address of kepler_memory writes. 2018-10-18 22:41:13 -04:00
bunnei
6b333d862b fermi_2d: Add support for more accurate surface copies. 2018-10-18 22:41:12 -04:00
ReinUsesLisp
41fb25349a gl_shader_decompiler: Implement PBK and BRK 2018-10-17 21:30:45 -03:00
FernandoS27
fd9e2d0073 Implement 3D Textures 2018-10-17 18:52:08 -04:00
ReinUsesLisp
936c36a514 shader_bytecode: Add Control Code enum 0xf
Control Code 0xf means to unconditionally execute the instruction. This
value is passed to most BRA, EXIT and SYNC instructions (among others)
but this may not always be the case.
2018-10-15 15:36:47 -03:00
ReinUsesLisp
6312eec5ef gl_shader_decompiler: Implement HSET2_R 2018-10-15 02:55:51 -03:00
ReinUsesLisp
4fc8ad67bf gl_shader_decompiler: Implement HSETP2_R 2018-10-15 02:55:51 -03:00
ReinUsesLisp
3d65aa4caf gl_shader_decompiler: Implement HFMA2 instructions 2018-10-15 02:55:51 -03:00
ReinUsesLisp
d93cdc2750 gl_shader_decompiler: Implement HADD2_IMM and HMUL2_IMM 2018-10-15 02:07:16 -03:00
ReinUsesLisp
d46e2a6e7a gl_shader_decompiler: Implement non-immediate HADD2 and HMUL2 instructions 2018-10-15 02:04:31 -03:00
ReinUsesLisp
08d751d882 gl_shader_decompiler: Setup base for half float unpacking and setting 2018-10-15 01:58:30 -03:00
FernandoS27
e0ca938b22 Propagate depth and depth_block on modules using decoders 2018-10-13 15:25:18 -04:00
ReinUsesLisp
17290a4416 gl_shader_decompiler: Implement VMAD 2018-10-11 04:15:10 -03:00
bunnei
6d82c4adf9
Merge pull request #1458 from FernandoS27/fix-render-target-block-settings
Fixed block height settings for RenderTargets and Depth Buffers
2018-10-10 21:24:07 -04:00
bunnei
03ec936ca0
Merge pull request #1460 from FernandoS27/scissor_test
Implemented Scissor Testing
2018-10-10 12:04:10 -04:00
FernandoS27
5f4ee6f0c8 Add memory Layout to Render Targets and Depth Buffers 2018-10-09 22:28:19 -04:00
FernandoS27
af653906d0 Fixed block height settings for RenderTargets and Depth Buffers, and added block width and block depth 2018-10-09 21:14:32 -04:00
FernandoS27
30ff42b8cc Assert Scissor tests 2018-10-08 20:49:36 -04:00
ReinUsesLisp
ee4d538850 gl_shader_decompiler: Implement geometry shaders 2018-10-07 17:36:00 -03:00
bunnei
9aec85d39c fermi_2d: Implement simple copies with AccelerateSurfaceCopy. 2018-10-06 03:20:04 -04:00
ReinUsesLisp
3e2380327a gl_rasterizer: Implement quads topology 2018-10-04 00:03:44 -03:00
bunnei
fe5962e073
Merge pull request #1411 from ReinUsesLisp/point-size
video_core: Implement point_size and add point state sync
2018-09-29 11:58:39 -04:00
ReinUsesLisp
e3e51d3ddb video_core: Implement point_size and add point state sync 2018-09-28 02:13:29 -03:00
ReinUsesLisp
b8f1506aa5 gl_state: Pack sampler bindings into a single ARB_multi_bind 2018-09-28 02:04:22 -03:00
ReinUsesLisp
ab65fde9f4 video_core: Add asserts for CS, TFB and alpha testing
Add asserts for compute shader dispatching, transform feedback being
enabled and alpha testing. These have in common that they'll probably break
rendering without logging.
2018-09-25 21:07:00 -03:00
Lioncash
a8f5fd787f shader_bytecode: Lay out the Ipa-related enums better
This is more consistent with the surrounding enums.
2018-09-21 16:17:31 -04:00
Lioncash
272517cf7e shader_bytecode: Make operator== and operator!= of IpaMode const qualified
These don't affect the state of the struct and can be const member
functions.
2018-09-21 16:17:27 -04:00
bunnei
0284cbe7ec
Merge pull request #1279 from FernandoS27/csetp
shader_decompiler: Implemented (Partialy) Control Codes and CSETP
2018-09-18 22:10:48 -04:00
bunnei
6415f81bb8
Merge pull request #1299 from FernandoS27/texture-sanatize
shader_decompiler: Asserts for Texture Instructions
2018-09-18 22:10:09 -04:00
bunnei
fafc80d72e
Merge pull request #1290 from FernandoS27/shader-header
Implemented (Partialy) Shader Header
2018-09-17 18:53:14 -04:00
FernandoS27
e4bb759c4b Implemented I2I.CC on the NEU control code, used by SMO 2018-09-17 17:42:46 -04:00
FernandoS27
e2ac8fb36d Implemented CSETP 2018-09-17 17:42:44 -04:00
FernandoS27
aac77bbd18 Implemented Control Codes 2018-09-17 17:42:43 -04:00
FernandoS27
55a4756766 Added texture misc modes to texture instructions 2018-09-17 12:51:05 -04:00
bunnei
076add4ccd
Merge pull request #1326 from FearlessTobi/port-4182
Port #4182 from Citra: "Prefix all size_t with std::"
2018-09-17 09:51:47 -04:00
bunnei
ba480ea2fb
Merge pull request #1273 from Subv/ld_sizes
Shaders: Implemented multiple-word loads and stores to and from attribute memory.
2018-09-15 15:27:12 -04:00
bunnei
daee15b058
Merge pull request #1271 from Subv/kepler_engine
GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
2018-09-15 13:27:07 -04:00
Subv
c878a819d7 Shaders: Implemented multiple-word loads and stores to and from attribute memory.
This seems to be an optimization performed by nouveau.
2018-09-15 11:21:21 -05:00
fearlessTobi
63c2e32e20 Port #4182 from Citra: "Prefix all size_t with std::" 2018-09-15 15:21:06 +02:00
bunnei
cc50857460
Merge pull request #1263 from FernandoS27/tex-mode
shader_decompiler:  Implemented (Partially) Texture Processing Modes
2018-09-12 16:03:34 -04:00
Subv
bb5eb4f20a GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
This engine writes data from a FIFO register into the configured address.
2018-09-12 13:57:08 -05:00
FernandoS27
a99d9db32f Implemented Texture Processing Modes 2018-09-12 12:28:22 -04:00
FernandoS27
3f0922715a Implemented encodings for LEA and PSET 2018-09-11 12:50:25 -04:00
FernandoS27
2b48cfd44b Replace old FragmentHeader for the new Header 2018-09-11 12:48:19 -04:00
FernandoS27
e926757c8f Implemented (Partialy) Shader Header 2018-09-11 12:34:27 -04:00
Markus Wick
c560043581 rasterizer: Drop unused handler.
This virtual function is called in a very hot spot, and it does nothing.

If this kind of feature is required, please be more specific and add callbacks
in the switch statement within Maxwell3D::WriteReg. There is no point in having
another switch statement within the rasterizer.
2018-09-10 22:03:10 +02:00
bunnei
49b15af054 gl_rasterizer: Implement multiple color attachments. 2018-09-09 22:48:28 -04:00
bunnei
e58855c7a4
Merge pull request #1268 from FernandoS27/tmml
shader_decompiler: Implemented TMML
2018-09-09 21:39:39 -04:00
FernandoS27
00131e752d Implemented TMML 2018-09-09 20:46:31 -04:00
bunnei
223ddb2008
Merge pull request #1272 from Subv/dma_2d
GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
2018-09-09 19:53:17 -04:00
FernandoS27
073a21ac0b Implemented TXQ dimension query type, used by SMO. 2018-09-09 11:59:01 -04:00
FernandoS27
82a313a14c Change name of TEXQ to TXQ, in order to match NVIDIA's naming 2018-09-08 18:08:57 -04:00
Subv
fdb199290b GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
When not set, this tells the GPU to only use the X size when performing a DMA copy.
This is only implemented for linear->linear and tiled->tiled copies. Conversion copies still retain the assert.

This bit is unset by some games for various purposes, and by nouveau when copying the vertex buffers.
2018-09-08 16:02:16 -05:00
bunnei
fdd5c97a14 maxwell_3d: Remove assert that no longer applies. 2018-09-08 02:53:39 -04:00
bunnei
77554ac773
Merge pull request #1243 from degasus/VAO_cache
gl_rasterizer: Implement a VAO cache.
2018-09-05 22:50:52 -04:00
FernandoS27
e63b229f4a Implemented IPA Properly 2018-09-05 20:15:47 -04:00
Markus Wick
d3ad9469a1 gl_rasterizer: Implement a VAO cache.
This patch caches VAO objects instead of re-emiting all pointers per draw call.
Configuring this pointers is known as a fast task, but it yields too many GL
calls. So for better performance, just bind the VAO instead of 16 pointers.
2018-09-05 18:46:35 +02:00
bunnei
325f3e0693
Merge pull request #1213 from DarkLordZach/octopath-fs
filesystem/maxwell_3d: Various changes to boot Project Octopath Traveller
2018-09-02 10:49:18 -04:00
bunnei
89be49d2f3
Merge pull request #1215 from ogniK5377/texs-nodep-assert
Added assert for TEXS nodep
2018-09-02 10:48:27 -04:00
bunnei
177c45e97d
Merge pull request #1214 from ogniK5377/ipa-assert
Added better asserts to IPA, Renamed IPA modes to match mesa
2018-09-02 10:44:43 -04:00
bunnei
9c206fe94d
Merge pull request #1216 from ogniK5377/ffma-assert
Added FFMA asserts and missing fields
2018-09-02 10:44:13 -04:00
David Marcec
60754b4728 Removed saturate assert
Unneeded as we already implement it
2018-09-01 19:33:32 +10:00
David Marcec
2edab4e840 Removed saturate assert
Saturate already implemented
2018-09-01 19:29:20 +10:00
David Marcec
6f8ed9508d Added FMUL asserts 2018-09-01 19:05:10 +10:00
David Marcec
b89fc407d7 Added FFMA asserts 2018-09-01 18:45:14 +10:00
David Marcec
948bc87a59 Added assert for TEXS nodep 2018-09-01 17:00:01 +10:00
David Marcec
ad3dca7e62 Added better asserts to IPA, Renamed IPA modes to match mesa
IpaMode is changed to IpaInterpMode
IpaMode is suppose to be 2 bits not 3
Added IpaSampleMode
Added Saturate

Renamed modes based on
d27c791891/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp (L2530)
2018-09-01 16:34:27 +10:00
Zach Hilman
f32e28c7b8 maxwell_3d: Use CoreTiming for query timestamp 2018-08-31 23:25:18 -04:00
Lioncash
4a587b81b2 core/core: Replace includes with forward declarations where applicable
The follow-up to e2457418da, which
replaces most of the includes in the core header with forward declarations.

This makes it so that if any of the headers the core header was
previously including change, then no one will need to rebuild the bulk
of the core, due to core.h being quite a prevalent inclusion.

This should make turnaround for changes much faster for developers.
2018-08-31 16:30:14 -04:00
Hexagon12
d626bc8c62 Added predicate comparison GreaterEqualWithNan 2018-08-31 10:40:18 +03:00
Laku
915ab81ec2 gl_shader_decompiler: Implement POPC (#1203)
* Implement POPC

* implement invert
2018-08-30 21:32:58 -04:00
bunnei
d6accf96ff
Merge pull request #1200 from bunnei/improve-ipa
gl_shader_decompiler: Improve IPA for Pass mode with Position attribute.
2018-08-30 10:31:26 -04:00
tech4me
a6dd577d02 Shaders: Implemented IADD3 2018-08-29 13:44:41 -04:00
bunnei
b1ccd88434 gl_shader_decompiler: Improve IPA for Pass mode with Position attribute. 2018-08-29 00:37:29 -04:00
bunnei
2f5ed3877c
Merge pull request #1169 from Lakumakkara/sel
shader_bytecode: fix SEL_IMM bitstring
2018-08-27 18:24:57 -04:00
bunnei
be2f1eabd7
Merge pull request #1173 from lioncash/batch
maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
2018-08-25 10:59:54 -04:00
Lioncash
20800f2df7 maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
The start and finish events should likely not be right after one another
like this, otherwise the batch will appear to complete immediately
2018-08-24 19:58:05 -04:00
Laku
36093a3e4d fix SEL_IMM bitstring 2018-08-24 07:18:12 +03:00
tech4me
ba2972bc64 Shaders: Added decodings for IADD3 instructions 2018-08-23 15:46:59 -04:00
bunnei
2a472ff54d maxwell_3d: Update to include additional stencil registers. 2018-08-23 11:08:47 -04:00
Laku
8e8326595f implement lop3 2018-08-22 10:09:44 +03:00
bunnei
125d7122ac
Merge pull request #1124 from Subv/logic_ops
GPU: Implemented logic ops.
2018-08-22 01:05:25 -04:00
Lioncash
a0e2bd85a5 shader_bytecode: Parenthesize conditional expression within GetTextureType()
Resolves a -Wlogical-op-parentheses warning.
2018-08-21 15:08:35 -04:00
bunnei
2ae88feea7 shader_bytecode: Replace some UNIMPLEMENTED logs. 2018-08-20 21:53:49 -04:00
Subv
6bcdf37d4f GPU: Added registers for the logicop functionality. 2018-08-20 18:42:36 -05:00
bunnei
028d90eb79
Merge pull request #1104 from Subv/instanced_arrays
GLRasterizer: Implemented instanced vertex arrays.
2018-08-20 14:32:50 -04:00
bunnei
b20ed93884
Merge pull request #1112 from Subv/sampler_types
Shaders: Use the correct shader type when sampling textures.
2018-08-20 14:30:45 -04:00
bunnei
51ddb130c5
Merge pull request #1089 from Subv/neg_bits
Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
2018-08-19 17:01:48 -04:00
Subv
f7edbcd7a3 Shaders/TEXS: Fixed the component mask in the TEXS instruction.
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19 14:00:12 -05:00
Subv
73b937b190 Shader: Added bitfields for the texture type of the various sampling instructions. 2018-08-19 12:57:51 -05:00
Subv
656758fd81 Shaders: Added decodings for TLD4 and TLD4S 2018-08-19 12:57:08 -05:00
bunnei
29d4f8c2dd
Merge pull request #1109 from Subv/ldg_decode
Shaders: Added decodings for  the LDG and STG instructions.
2018-08-19 13:31:19 -04:00
bunnei
9baf5de90c
Merge pull request #1108 from Subv/front_facing
Shaders: Implemented the gl_FrontFacing input attribute (attr 63).
2018-08-19 13:21:14 -04:00
Subv
1b92ae136f Shaders: Added decodings for the LDG and STG instructions. 2018-08-19 00:46:34 -05:00
Subv
731701a2d2 Shaders: Implemented the gl_FrontFacing input attribute (attr 63). 2018-08-19 00:14:34 -05:00
Subv
e0f66c1fbf GLRasterizer: Implemented instanced vertex arrays.
Before each draw call, for every enabled vertex array configured as instanced, we take the current instance id and divide it by its configured divisor, then we multiply that by the corresponding stride and increment the start address by the resulting amount. This way we can simulate the vertex array being incremented once per instance without actually using OpenGL's instancing functions.
2018-08-18 14:42:26 -05:00
Subv
8335b2f115 Shader: Implemented the predicate and mode arguments of LOP.
The mode can be used to set the predicate to true depending on the result of the logic operation. In some cases, this means discarding the result (writing it to register 0xFF (Zero)).

This is used by Super Mario Odyssey.
2018-08-18 14:36:37 -05:00
Subv
2e95ba2e9c Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
We should definitely audit our shader generator for more errors like this.
2018-08-18 10:22:42 -05:00
David Marcec
63dff47e22 Added predcondition GreaterThanWithNan 2018-08-18 17:49:59 +10:00
Subv
c5284efd4f Rasterizer: Implemented instanced rendering.
We keep track of the current instance and update an uniform in the shaders to let them know which instance they are.

Instanced vertex arrays are not yet implemented.
2018-08-14 22:25:07 -05:00
bunnei
534abf9d97 gl_shader_decompiler: Implement XMAD instruction. 2018-08-12 18:30:24 -04:00
bunnei
f2c7b5dcd6
Merge pull request #1024 from Subv/blend_gl
GPU/Maxwell3D: Implemented an alternative set of blend factors.
2018-08-11 22:39:02 -04:00
Subv
969326bd58 GPU/Maxwell3D: Implemented an alternative set of blend factors.
These are used by nouveau and some games like SMO.
2018-08-11 20:57:16 -05:00
Subv
2dad1204e8 RasterizerGL: Ignore invalid/unset vertex attributes.
This should make the es2gears example not crash anymore.
2018-08-11 20:36:40 -05:00
bunnei
403dfd68fc
Merge pull request #1010 from bunnei/unk-vert-attrib-shader
gl_shader_decompiler: Improve handling of unknown input/output attributes.
2018-08-11 19:56:28 -04:00
bunnei
0b668d5ff3 gl_shader_decompiler: Improve handling of unknown input/output attributes. 2018-08-11 19:26:45 -04:00
bunnei
670a2c1f80
Merge pull request #1018 from Subv/ssy_sync
GPU/Shader: Implemented SSY and SYNC as a set_target/jump pair.
2018-08-11 19:10:02 -04:00
Subv
c1ad973881 GPU/Shader: Don't predicate instructions that don't have a predicate field (SSY). 2018-08-11 16:00:14 -05:00
Lioncash
b8c43b6080 video_core: Use variable template variants of type_traits interfaces where applicable 2018-08-09 20:45:48 -04:00
bunnei
efe6b473c5 maxwell_3d: Ignore macros that have not been uploaded yet.
- Used by Super Mario Odyssey (in game).
2018-08-08 23:25:37 -04:00
bunnei
25ba4d1b68
Merge pull request #982 from bunnei/stub-unk-63
gl_shader_decompiler: Stub input attribute Unknown_63.
2018-08-08 22:28:18 -04:00
bunnei
cf917a5e93
Merge pull request #976 from bunnei/shader-imm
gl_shader_decompiler: Let OpenGL interpret floats.
2018-08-08 19:17:01 -04:00
bunnei
7f0d0a93f7 gl_shader_decompiler: Stub input attribute Unknown_63. 2018-08-08 02:35:59 -04:00
bunnei
57982df105 maxwell_3d: Use correct const buffer size and check bounds.
- Fixes mem corruption with Super Mario Odyssey and Pokkén Tournament DX.
2018-08-08 02:10:25 -04:00
bunnei
e542356d0c gl_shader_decompiler: Let OpenGL interpret floats.
- Accuracy is lost in translation to string, e.g. with NaN.
- Needed for Super Mario Odyssey.
2018-08-08 01:45:23 -04:00
bunnei
904d7eaa94 maxwell_3d: Remove outdated assert. 2018-08-05 23:57:19 -04:00
Lioncash
6030c5ce41 video_core: Eliminate the g_renderer global variable
We move the initialization of the renderer to the core class, while
keeping the creation of it and any other specifics in video_core. This
way we can ensure that the renderer is initialized and doesn't give
unfettered access to the renderer. This also makes dependencies on types
more explicit.

For example, the GPU class doesn't need to depend on the
existence of a renderer, it only needs to care about whether or not it
has a rasterizer, but since it was accessing the global variable, it was
also making the renderer a part of its dependency chain. By adjusting
the interface, we can get rid of this dependency.
2018-08-04 02:36:57 -04:00
Subv
8f2c4191ab GPU: Remove the assert that required the CODE_ADDRESS to be 0.
Games usually just leave it at 0 but nouveau sets it to something else. This already works fine, the assert is useless.
2018-07-24 13:54:12 -05:00
bunnei
148a5bef7e shader_bytecode: Implement other TEXS masks. 2018-07-22 03:23:15 -04:00
bunnei
c43eaa94f3 gl_shader_decompiler: Implement SEL instruction. 2018-07-22 00:37:12 -04:00
bunnei
5287991a36 maxwell_3d: Add depth buffer enable, width, and height registers. 2018-07-21 21:51:05 -04:00
Lioncash
bb960c8cb4 video_core: Use nested namespaces where applicable
Compresses a few namespace specifiers to be more compact.
2018-07-20 18:23:54 -04:00
Lioncash
8b08f82dc7 maxwell_3d: Remove unused variable within GetStageTextures() 2018-07-19 22:38:28 -04:00
Subv
3d3b10adc7 GPU: Added register definitions for the stencil parameters. 2018-07-17 15:00:21 -05:00
bunnei
8aeff9cf8e gl_rasterizer: Fix check for if a shader stage is enabled. 2018-07-12 22:57:57 -04:00
bunnei
64b5e5d5d9
Merge pull request #655 from bunnei/pred-lt-nan
gl_shader_decompiler: Implement PredCondition::LessThanWithNan.
2018-07-12 18:59:15 -07:00
bunnei
49c0c081c4 gl_shader_decompiler: Implement PredCondition::LessThanWithNan. 2018-07-12 20:04:35 -04:00
bunnei
4757ffdcce gl_shader_decompiler: Use FlowCondition field in EXIT instruction. 2018-07-12 20:00:37 -04:00
Sebastian Valle
274d1fb0fc
Merge pull request #652 from Subv/fadd32i
GPU: Implement the FADD32I shader instruction.
2018-07-12 17:36:51 -05:00
bunnei
3ff21345b4
Merge pull request #651 from Subv/ffma_decode
GPU: Corrected the decoding of FFMA for immediate operands.
2018-07-12 12:42:58 -07:00
Subv
c1ae841f47 GPU: Implement the FADD32I shader instruction. 2018-07-12 12:00:31 -05:00
Subv
0cad310e12 GPU: Corrected the decoding of FFMA for immediate operands. 2018-07-12 10:15:48 -05:00
bunnei
639346bcfb
Merge pull request #625 from Subv/imnmx
GPU: Implemented the IMNMX shader instruction.
2018-07-07 19:33:50 -07:00
bunnei
51bd76a5fd
Merge pull request #629 from Subv/depth_test
GPU: Allow using the old NV04 values for the depth test function.
2018-07-05 16:43:10 -04:00
Subv
9f6a5660e8 GPU: Allow using the old NV04 values for the depth test function.
These seem to be just a valid as the GL token values. Thanks @ReinUsesLisp

This restores graphical output to Disgaea 5
2018-07-05 13:01:31 -05:00
bunnei
762bf6a522
Merge pull request #626 from Subv/shader_sync
GPU: Stub the shader SYNC and DEPBAR instructions.
2018-07-05 12:54:19 -04:00
bunnei
8b815877a6
Merge pull request #622 from Subv/unused_tex
GPU: Ignore unused textures and corrected the TEX shader instruction decoding.
2018-07-05 11:29:17 -04:00
bunnei
1b0a74e23f
Merge pull request #621 from Subv/psetp_
GPU: Implemented the PSETP shader instruction.
2018-07-05 11:28:50 -04:00
Subv
b0c92b80b1 GPU: Implemented the IMNMX shader instruction.
It's similar to the FMNMX instruction but it works on integers.
2018-07-04 15:44:37 -05:00
Subv
77cfe4f027 GPU: Stub the shader SYNC and DEPBAR instructions.
It is unknown at this moment if we actually need to do something with these instructions or if the GLSL compiler takes care of that for us.
2018-07-04 15:29:51 -05:00
Subv
c42b818cf9 GPU: Corrected the decoding for the TEX shader instruction. 2018-07-04 15:19:20 -05:00
Subv
53a55bd751 GPU: Implemented the PSETP shader instruction.
It's similar to the isetp and fsetp instructions but it works on predicates instead.
2018-07-04 15:15:03 -05:00
Subv
c1bebdef5e GPU: Flip the triangle front face winding if the GPU is configured to not flip the triangles.
OpenGL's default behavior is already correct when the GPU is configured to flip the triangles.

This fixes 1-2 Switch's splash screen.
2018-07-04 10:26:46 -05:00
bunnei
c996787d84
Merge pull request #609 from Subv/clear_buffers
GPU: Implemented the CLEAR_BUFFERS register.
2018-07-03 19:34:34 -04:00
Subv
c1811ed3d1 GPU: Support clears that don't clear the color buffer. 2018-07-03 16:56:47 -05:00
Subv
be51120d23 GPU: Bind and clear the render target when the CLEAR_BUFFERS register is written to. 2018-07-03 16:56:44 -05:00
Subv
827bb08c91 GPU: Added registers for the CLEAR_BUFFERS and CLEAR_COLOR methods. 2018-07-03 16:56:31 -05:00
bunnei
15e68cdbaa
Merge pull request #607 from jroweboy/logging
Logging - Customizable backends
2018-07-03 00:26:45 -04:00
bunnei
ddb767f1b6
Merge pull request #611 from Subv/enabled_depth_test
GPU: Don't try to parse the depth test function if the depth test is disabled and use only the least significant 3 bits in the depth test func
2018-07-02 23:47:11 -04:00
bunnei
5410b4659d
Merge pull request #610 from Subv/mufu_8
GPU: Implemented MUFU suboperation 8, sqrt.
2018-07-02 22:26:42 -04:00
Subv
6e0eba9917 GPU: Use only the least significant 3 bits when reading the depth test func.
Some games set the full GL define value here (including nouveau), but others just seem to set those last 3 bits.
2018-07-02 21:06:36 -05:00
James Rowe
0d46f0df12 Update clang format 2018-07-02 21:45:47 -04:00
James Rowe
638956aa81 Rename logging macro back to LOG_* 2018-07-02 21:45:47 -04:00
bunnei
92c7135065
Merge pull request #608 from Subv/depth
GPU: Implemented the depth buffer and depth test + culling
2018-07-02 21:24:43 -04:00
Subv
6e4e0b2b41 GPU: Implemented MUFU suboperation 8, sqrt. 2018-07-02 19:48:15 -05:00
Sebastian Valle
055f1546d7
Merge pull request #606 from Subv/base_vertex
GPU: Fixed the index offset and implement BaseVertex when doing indexed rendering.
2018-07-02 14:07:38 -05:00
Sebastian Valle
9685dd5840
Merge pull request #605 from Subv/dma_copy
GPU: Directly copy the pixels when performing a same-layout DMA.
2018-07-02 14:06:56 -05:00
Subv
c1f55c32c8 GPU: Added registers for depth test and cull mode. 2018-07-02 13:31:20 -05:00
Subv
0f929762b3 GPU: Implemented the Z24S8 depth format and load the depth framebuffer. 2018-07-02 12:42:04 -05:00
Subv
cc73bad293 GPU: Added register definitions for the vertex buffer base element. 2018-07-02 11:21:23 -05:00
Subv
ca633a5a3c GPU: Directly copy the pixels when performing a same-layout DMA. 2018-07-02 09:46:33 -05:00
bunnei
066d6184d4
Merge pull request #602 from Subv/mufu_subop
GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation.
2018-07-01 11:06:04 -04:00
Subv
f33e406ff2 GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation. 2018-06-30 14:48:25 -05:00
bunnei
c96da97630 gl_shader_decompiler: Implement predicate NotEqualWithNan. 2018-06-30 03:01:25 -04:00
bunnei
6a28a66832 maxwell_3d: Add a struct for RenderTargetConfig. 2018-06-27 00:08:04 -04:00
Subv
a3d82ef5d9 Build: Fixed some MSVC warnings in various parts of the code. 2018-06-20 11:39:10 -05:00
Subv
eab7457c00 GPU: Don't mark uniform buffers and registers as used for instructions which don't have them.
Like the MOV32I and FMUL32I instructions.
This fixes a potential crash when using these instructions.
2018-06-18 19:50:35 -05:00
bunnei
afdd657d30 gl_shader_decompiler: Implement LOP instructions. 2018-06-17 15:27:48 -04:00
bunnei
5673ce39c7 gl_shader_decompiler: Refactor LOP32I instruction a bit in support of LOP. 2018-06-17 13:31:39 -04:00
bunnei
d383043e07 gl_shader_decompiler: Implement integer size conversions for I2I/I2F/F2I. 2018-06-15 22:42:02 -04:00
bunnei
019d7208c8
Merge pull request #556 from Subv/dma_engine
GPU: Partially implemented the Maxwell DMA engine.
2018-06-12 14:25:17 -04:00
bunnei
2015a1b180
Merge pull request #558 from Subv/iadd32i
GPU: Implemented the iadd32i shader instruction.
2018-06-12 14:19:25 -04:00
Subv
db0497b808 GPU: Implemented the iadd32i shader instruction. 2018-06-12 11:46:45 -05:00
Subv
987a170665 GPU: Partially implemented the Maxwell DMA engine.
Only tiled->linear and linear->tiled copies that aren't offsetted are supported for now. Queries are not supported. Swizzled copies are not supported.
2018-06-12 11:27:36 -05:00
bunnei
5f3d6c85db gl_shader_decompiler: Implement saturate for float instructions. 2018-06-11 21:46:34 -04:00
Subv
b366b885a1 GPU: Implement the iset family of shader instructions. 2018-06-09 16:19:13 -05:00
Subv
3cb753eeb1 GPU: Added decodings for the ISET family of instructions. 2018-06-09 15:56:50 -05:00
bunnei
d81aaa3ed3
Merge pull request #550 from Subv/ssy
GPU: Stub the SSY shader instruction.
2018-06-09 00:42:53 -04:00
bunnei
e2176dc7ce
Merge pull request #551 from bunnei/shr
gl_shader_decompiler: Implement SHR instruction.
2018-06-09 00:42:44 -04:00
bunnei
5440b9c634 gl_shader_decompiler: Implement SHR instruction. 2018-06-09 00:01:17 -04:00
Subv
abec5f82e2 GPU: Stub the SSY shader instruction.
This instruction tells the GPU where the flow reconverges in a non-uniform control flow scenario, we can ignore this when generating GLSL code.
2018-06-08 22:46:10 -05:00
bunnei
bbc4f369ed gl_shader_decompiler: Implement IADD instruction. 2018-06-08 23:25:22 -04:00
bunnei
79e9c2e237 gl_shader_decompiler: Add missing asserts for saturate_a instructions. 2018-06-08 23:24:10 -04:00
Subv
c712dafaee GPU: Added registers for normal and independent blending. 2018-06-08 17:04:41 -05:00
bunnei
92209f905f gl_shader_decompiler: Implement BFE_IMM instruction. 2018-06-07 00:58:12 -04:00
bunnei
128aeba0f3 gl_shader_decompiler: F2F: Implement rounding modes. 2018-06-06 22:21:29 -04:00
bunnei
4b114e1b8a shader_bytecode: Add instruction decodings for BFE, IMNMX, and XMAD. 2018-06-06 19:47:34 -04:00
bunnei
0ff2929644
Merge pull request #534 from Subv/multitexturing
GPU: Implement sampling multiple textures in the generated glsl shaders.
2018-06-06 19:12:52 -04:00
bunnei
4669f15f8b gl_shader_decompiler: Implement LD_C instruction. 2018-06-06 18:09:06 -04:00
bunnei
6e386a334b gl_shader_decompiler: Refactor uniform handling to allow different decodings. 2018-06-06 17:57:15 -04:00
Subv
dbfc39d214 GPU: Implement sampling multiple textures in the generated glsl shaders.
All tested games that use a single texture show no regression.

Only Texture2D textures are supported right now, each shader gets its own "tex_fs/vs/gs" sampler array to maintain independent textures between shader stages, the textures themselves are reused if possible.
2018-06-06 12:58:16 -05:00
bunnei
5fb99e6a16
Merge pull request #516 from Subv/f2i_r
GPU: Implemented the F2I_R shader instruction.
2018-06-05 22:01:29 -04:00
bunnei
38eb33f150
Merge pull request #521 from Subv/bra
GPU: Corrected the branch targets for the shader bra instruction.
2018-06-05 10:09:35 -04:00
Subv
e7dfcdde74 GPU: Corrected the branch targets for the shader bra instruction. 2018-06-04 22:56:28 -05:00
Subv
4b89348c00 GPU: Implemented the F2I_R shader instruction. 2018-06-04 22:06:50 -05:00
bunnei
c23c30c76f gl_shader_decompiler: Implement SHL instruction. 2018-06-04 22:36:49 -04:00
Subv
23b1e6eded GPU: Implement the ISCADD shader instructions. 2018-06-04 20:17:41 -05:00
Subv
438a9b70cc GPU: Added decodings for the ISCADD instructions. 2018-06-04 20:17:39 -05:00
bunnei
e8bfff7b4b
Merge pull request #514 from Subv/lop32i
GPU: Implemented the LOP32I instruction.
2018-06-04 20:48:15 -04:00
bunnei
f564822e78
Merge pull request #510 from Subv/isetp
GPU: Implemented the ISETP_R and ISETP_C instructions
2018-06-04 20:47:11 -04:00
bunnei
37fd4e6d9b
Merge pull request #512 from Subv/fset
GPU: Corrected the FSET and I2F instructions.
2018-06-04 19:04:20 -04:00
bunnei
cdd92dc692
Merge pull request #501 from Subv/shader_bra
GPU: Partially implemented the bra shader instruction
2018-06-04 18:31:07 -04:00
Subv
2933521a08 GPU: Use the bf bit in FSET to determine whether to write 0xFFFFFFFF or 1.0f. 2018-06-04 16:41:28 -05:00
Subv
5d55403f94 GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
This is how nouveau calculates the viewport width and height. For some reason some games set 0xFFFF in the VIEWPORT_HORIZ and VIEWPORT_VERT registers, maybe those are a misnomer and actually refer to something else?
2018-06-04 16:36:54 -05:00
Subv
0c688b421c GPU: Implemented the LOP32I instruction. 2018-06-04 13:56:31 -05:00
Subv
7c181fd4f4 GPU: Implemented the ISETP_R and ISETP_C shader instructions. 2018-06-04 11:12:03 -05:00
Subv
b481d8a00d GPU: Partially implemented the shader BRA instruction. 2018-06-03 22:26:36 -05:00
Subv
06c72b4fcf GPU: Added decoding for the BRA instruction. 2018-06-03 22:14:00 -05:00
bunnei
ba117854f9
Merge pull request #500 from Subv/long_queries
GPU: Partial implementation of long GPU queries.
2018-06-03 21:24:50 -04:00
Subv
d57333406d GPU: Partial implementation of long GPU queries.
Long queries write a 128-bit result value to memory, which consists of a 64 bit query value and a 64 bit timestamp.

In this implementation, only select=Zero of the Crop unit is implemented, this writes the query sequence as a 64 bit value, and a 0u64 value for the timestamp, since we emulate an infinitely fast GPU.

This specific type was hwtested, but more rigorous tests should be performed in the future for the other types.
2018-06-03 19:17:31 -05:00
bunnei
1efcba346a gl_shader_decompiler: Implement TEXS component mask. 2018-06-03 12:08:17 -04:00
bunnei
bb9d39b8fe
Merge pull request #494 from bunnei/shader-tex
gl_shader_decompiler: Implement TEX, fixes for TEXS.
2018-06-03 12:05:38 -04:00
bunnei
e54ea773fc gl_shader_decompiler: Implement RRO as a register move. 2018-06-03 11:14:31 -04:00
bunnei
888eb345c0 gl_shader_decompiler: Implement TEX instruction. 2018-05-31 23:36:45 -04:00
bunnei
4c727d0ba8 gl_shader_decompiler: Support multi-destination for TEXS. 2018-05-31 22:57:32 -04:00
bunnei
15086a22be
Merge pull request #489 from Subv/vertexid
Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader.
2018-05-30 14:10:48 -04:00
Subv
99f12b05fa Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader. 2018-05-30 10:58:03 -05:00
bunnei
68937a662d gl_shader_decompiler: Partially implement F2F_R instruction. 2018-05-29 23:10:44 -04:00
bunnei
ee53688ca7 shader_bytecode: Implement other variants of FMNMX. 2018-05-25 23:18:50 -04:00
bunnei
898f0fa029
Merge pull request #458 from Subv/fmnmx
Shaders: Implemented the FMNMX shader instruction.
2018-05-20 23:44:07 -04:00
Subv
8440cef223 Shaders: Implemented the FMNMX shader instruction. 2018-05-20 17:53:06 -05:00
Subv
a056d5ad8c ShadersDecompiler: Added decoding for the PSETP instruction. 2018-05-19 11:41:14 -05:00
bunnei
f41eb95e13 maxwell_3d: Reset vertex counts after drawing. 2018-04-29 16:23:31 -04:00
bunnei
c7ce472eeb shader_bytecode: Add decoding for FMNMX instruction. 2018-04-29 16:05:17 -04:00
bunnei
6c464a2a4a
Merge pull request #416 from bunnei/shader-ints-p3
gl_shader_decompiler: Implement MOV32I, partially implement I2I, I2F
2018-04-29 12:56:16 -04:00
bunnei
f87ea8fa8b fermi_2d: Fix surface copy block height. 2018-04-28 20:40:03 -04:00