Commit graph

3602 commits

Author SHA1 Message Date
Lioncash
a97cdb5eb4
maxwell_3d: Remove unused variable within ProcessQueryGet() 2018-10-23 23:50:16 -04:00
FernandoS27
ed8ca608a0 Implement PointSize 2018-10-23 15:08:00 -04:00
FernandoS27
e0ea2f5f6e Fixed Layered Textures Loading and Cubemaps 2018-10-23 14:27:36 -04:00
bunnei
5716496239
Merge pull request #1519 from ReinUsesLisp/vsetp
gl_shader_decompiler: Implement VSETP
2018-10-23 10:22:37 -04:00
bunnei
0f3d8c2574
Merge pull request #1539 from lioncash/dma
maxwell_dma: Silence compilation warnings
2018-10-23 10:22:12 -04:00
bunnei
75d807788c
Merge pull request #1470 from FernandoS27/alpha_testing
Implemented Alpha Test using Shader Emulation
2018-10-23 10:21:30 -04:00
ReinUsesLisp
7d6dca0d0a gl_shader_decompiler: Implement VSETP 2018-10-23 01:07:20 -03:00
ReinUsesLisp
5dfb43531c gl_shader_decompiler: Abstract VMAD into a video subset 2018-10-23 01:07:20 -03:00
bunnei
848a49112a
Merge pull request #1512 from ReinUsesLisp/brk
gl_shader_decompiler: Implement PBK and BRK
2018-10-23 00:01:38 -04:00
bunnei
496d155d7b
Merge pull request #1550 from FernandoS27/fmul32
Added Saturation to FMUL32I
2018-10-22 23:58:09 -04:00
bunnei
4cccfb4190
Merge pull request #1537 from lioncash/shader
gl_shader_decompiler: Minor changes
2018-10-22 22:49:49 -04:00
FernandoS27
259da93567 Added Saturation to FMUL32I 2018-10-22 20:22:15 -04:00
FernandoS27
8e1239fbc5 Assert that multiple render targets are not set while alpha testing 2018-10-22 15:35:45 -04:00
FernandoS27
59a004f915 Use standard UBO and fix/stylize the code 2018-10-22 15:07:33 -04:00
FernandoS27
17315cee16 Cache uniform locations and restructure the implementation 2018-10-22 15:07:32 -04:00
FernandoS27
bcb5b924fd Remove SyncAlphaTest and clang format 2018-10-22 15:07:31 -04:00
FernandoS27
7b39107e3a Added Alpha Func 2018-10-22 15:07:30 -04:00
FernandoS27
aa620c14af Implemented Alpha Testing 2018-10-22 15:07:30 -04:00
bunnei
1226a5706e
Merge pull request #1547 from FernandoS27/fix-fset
Fixed FSETP and FSET
2018-10-22 12:53:47 -04:00
FernandoS27
5c5b4e8e7d Fixed FSETP and FSET 2018-10-22 11:31:17 -04:00
FernandoS27
e2416bbd1f Fixed VAOs Float types only returning GL_FLOAT in cases that they had to return GL_HALF_FLOAT 2018-10-22 09:27:00 -04:00
Lioncash
c1e5525fc6 engines/maxwell_*: Use nested namespace specifiers where applicable
These three source files are the only ones within the engines directory
that don't use nested namespaces. We may as well change these over to
keep things consistent.
2018-10-20 15:58:09 -04:00
Lioncash
d53c73adaa maxwell_dma: Make variables const where applicable within HandleCopy()
These are never modified, so we can make that assumption explicit.
2018-10-20 15:56:01 -04:00
Lioncash
dd1ee39426 maxwell_dma: Make FlushAndInvalidate's size parameter a u64
This prevents truncation warnings at the lambda's usage sites.
2018-10-20 15:54:45 -04:00
Lioncash
08e574eec4 maxwell_dma: Remove unused variables in HandleCopy()
These pointer variables are never used, so we can get rid of them.
2018-10-20 15:53:24 -04:00
Lioncash
8a86c8d48b gl_shader_decompiler: Allow std::move to function in SetPredicate
If the variable being moved is const, then std::move will always perform
a copy (since it can't actually move the data).
2018-10-20 14:25:15 -04:00
Lioncash
381baf783d gl_shader_decompiler: Get rid of variable shadowing warnings
A variable with the same name was previously declared in an outer scope.
2018-10-20 14:22:37 -04:00
Lioncash
61ef8af1e2 gl_shader_decompiler: Fix a few comment typos 2018-10-20 14:19:28 -04:00
ReinUsesLisp
3ec795d95e gl_shader_decompiler: Move position varying declaration back to gl_shader_gen
The intention of declaring them in gl_shader_decompiler was to be able
to use blocks to implement geometry shaders. But that wasn't needed in
the end and it caused issues when both vertex stages were being used,
resulting in a redeclaration of "position".
2018-10-20 02:19:30 -03:00
bunnei
b1f8bff7db
Merge pull request #1501 from ReinUsesLisp/half-float
gl_shader_decompiler: Implement H* instructions
2018-10-19 23:47:19 -04:00
bunnei
7e665c2721 GPU: Improved implementation of maxwell DMA (Subv). 2018-10-18 22:41:53 -04:00
bunnei
bcde71d4d9 decoders: Introduce functions for un/swizzling subrects. 2018-10-18 22:41:43 -04:00
bunnei
a5d853a9f8 GPU: Invalidate destination address of kepler_memory writes. 2018-10-18 22:41:13 -04:00
bunnei
6b333d862b fermi_2d: Add support for more accurate surface copies. 2018-10-18 22:41:12 -04:00
bunnei
6acd8d166a
Merge pull request #1505 from FernandoS27/tex-3d
Implemented 3D Textures
2018-10-18 11:50:42 -04:00
ReinUsesLisp
41fb25349a gl_shader_decompiler: Implement PBK and BRK 2018-10-17 21:30:45 -03:00
bunnei
77e2d68df7
Merge pull request #1489 from FernandoS27/fix-tlds
shader_decompiler: Fix TLDS
2018-10-17 18:58:38 -04:00
FernandoS27
caaa9914fd Clang format and other fixes 2018-10-17 18:52:11 -04:00
FernandoS27
cb9fdc7a26 Implement Reinterpret Surface, to accurately blit 3D textures 2018-10-17 18:52:10 -04:00
FernandoS27
dbc34db6ce Implement GetInRange in the Rasterizer Cache 2018-10-17 18:52:10 -04:00
FernandoS27
fd9e2d0073 Implement 3D Textures 2018-10-17 18:52:08 -04:00
bunnei
f912a82a8e
Merge pull request #1497 from bunnei/flush-framebuffers
Implement flushing in the rasterizer cache
2018-10-17 18:40:34 -04:00
bunnei
86dcf2942b
Merge pull request #1496 from FernandoS27/tex-array
Implement Arrays on Tex Instruction
2018-10-17 18:30:44 -04:00
bunnei
648b55c6b9 gl_rasterizer_cache: Remove unnecessary block_depth=1 on Flush. 2018-10-17 18:20:15 -04:00
bunnei
2a035a1f6f gl_rasterizer_cache: Remove unnecessary temporary buffer with unswizzle. 2018-10-17 18:19:35 -04:00
bunnei
43b9494a0f gl_rasterizer_cache: Use AccurateCopySurface for use_accurate_gpu_emulation. 2018-10-16 17:20:49 -04:00
bunnei
ee7c2dbf5a config: Rename use_accurate_framebuffers -> use_accurate_gpu_emulation.
- This will be used as a catch-all for slow-but-accurate GPU emulation paths.
2018-10-16 17:02:29 -04:00
bunnei
91602de7f2 rasterizer_cache: Refactor to support in-order flushing. 2018-10-16 16:51:53 -04:00
bunnei
0e59291310 gl_rasterizer_cache: Refactor to only call GetRegionEnd on surface creation. 2018-10-16 11:31:02 -04:00
bunnei
949d7832fa gl_rasterizer_cache: Only flush when use_accurate_framebuffers is enabled. 2018-10-16 11:31:02 -04:00
bunnei
5f79ba04bd gl_rasterizer_cache: Separate guest and host surface size managment. 2018-10-16 11:31:01 -04:00
bunnei
58be4dff79 gl_rasterizer_cache: Rename GetGLBytesPerPixel to GetBytesPerPixel.
- This does not really have anything to do with OpenGL.
2018-10-16 11:31:01 -04:00
bunnei
cf7b46c101 gl_rasterizer_cache: Remove unused FlushSurface method. 2018-10-16 11:31:01 -04:00
bunnei
3afdfd7bfa gl_rasterizer: Implement flushing. 2018-10-16 11:31:01 -04:00
bunnei
b4e29ccb81 gl_rasterizer_cache: Remove usage of Memory::Read/Write functions.
- These cannot be used within the cache, as they change cache state.
2018-10-16 11:31:00 -04:00
bunnei
4e9683e9d5 gl_rasterizer_cache: Clamp cached surface size to mapped GPU region size. 2018-10-16 11:31:00 -04:00
bunnei
37575eae65 memory_manager: Add a method for querying the end of a mapped GPU region. 2018-10-16 11:31:00 -04:00
bunnei
0be7e82289 rasterizer_cache: Reintroduce method for flushing. 2018-10-16 11:31:00 -04:00
bunnei
9b929e934b gl_rasterizer_cache: Reintroduce code for handling swizzle and flush to guest RAM. 2018-10-16 11:30:59 -04:00
ReinUsesLisp
936c36a514 shader_bytecode: Add Control Code enum 0xf
Control Code 0xf means to unconditionally execute the instruction. This
value is passed to most BRA, EXIT and SYNC instructions (among others)
but this may not always be the case.
2018-10-15 15:36:47 -03:00
ReinUsesLisp
b461342a84 gl_shader_decompiler: Fixup style inconsistencies 2018-10-15 15:35:26 -03:00
ReinUsesLisp
27916764b1 gl_rasterizer: Silence implicit cast warning in glBindBufferRange 2018-10-15 15:26:50 -03:00
ReinUsesLisp
6312eec5ef gl_shader_decompiler: Implement HSET2_R 2018-10-15 02:55:51 -03:00
ReinUsesLisp
4fc8ad67bf gl_shader_decompiler: Implement HSETP2_R 2018-10-15 02:55:51 -03:00
ReinUsesLisp
3d65aa4caf gl_shader_decompiler: Implement HFMA2 instructions 2018-10-15 02:55:51 -03:00
ReinUsesLisp
d93cdc2750 gl_shader_decompiler: Implement HADD2_IMM and HMUL2_IMM 2018-10-15 02:07:16 -03:00
ReinUsesLisp
d46e2a6e7a gl_shader_decompiler: Implement non-immediate HADD2 and HMUL2 instructions 2018-10-15 02:04:31 -03:00
ReinUsesLisp
08d751d882 gl_shader_decompiler: Setup base for half float unpacking and setting 2018-10-15 01:58:30 -03:00
bunnei
14286f70f0
Merge pull request #1488 from Hexagon12/astc-types
video_core: Added ASTC 5x4; 8x5 types
2018-10-14 14:44:24 -04:00
FernandoS27
1d6559fbd3 Implement Arrays on Tex Instruction 2018-10-14 13:31:02 -04:00
FernandoS27
d880b77698 Fix TLDS 2018-10-13 22:14:25 -04:00
FernandoS27
331ce2942c Shorten the implementation of 3D swizzle to only 3 functions 2018-10-13 20:58:00 -04:00
FernandoS27
1ff20d8538 Fix a Crash on Zelda BotW and Splatoon 2, and simplified LoadGLBuffer 2018-10-13 16:11:11 -04:00
FernandoS27
e0ca938b22 Propagate depth and depth_block on modules using decoders 2018-10-13 15:25:18 -04:00
FernandoS27
d4ae43f9c1 Remove old Swizzle algorithms and use 3d Swizzle 2018-10-13 15:25:17 -04:00
FernandoS27
4d959c6bdc Implement Precise 3D Swizzle 2018-10-13 15:25:16 -04:00
FernandoS27
736db284d2 Implement Fast 3D Swizzle 2018-10-13 15:25:15 -04:00
Hexagon12
cbf723896f Added ASTC 5x4; 8x5 2018-10-13 17:10:26 +03:00
FernandoS27
97b6405a17 Implemented helper function to correctly calculate a texture's size 2018-10-12 14:21:53 -04:00
ReinUsesLisp
17290a4416 gl_shader_decompiler: Implement VMAD 2018-10-11 04:15:10 -03:00
bunnei
6d82c4adf9
Merge pull request #1458 from FernandoS27/fix-render-target-block-settings
Fixed block height settings for RenderTargets and Depth Buffers
2018-10-10 21:24:07 -04:00
bunnei
03ec936ca0
Merge pull request #1460 from FernandoS27/scissor_test
Implemented Scissor Testing
2018-10-10 12:04:10 -04:00
bunnei
ee1b204749
Merge pull request #1425 from ReinUsesLisp/geometry-shaders
gl_shader_decompiler: Implement geometry shaders
2018-10-10 11:51:29 -04:00
FernandoS27
5f4ee6f0c8 Add memory Layout to Render Targets and Depth Buffers 2018-10-09 22:28:19 -04:00
FernandoS27
af653906d0 Fixed block height settings for RenderTargets and Depth Buffers, and added block width and block depth 2018-10-09 21:14:32 -04:00
Lioncash
6e27c5d4d1 gl_shader_decompiler: Remove unused variables in TMML's implementation
Given "y" isn't always used, but "x" is, we can rearrange this to avoid
unused variable warnings by changing the names of op_a and op_b
2018-10-09 15:44:37 -04:00
FernandoS27
be97fc884d Implement Scissor Test 2018-10-08 21:36:23 -04:00
FernandoS27
30ff42b8cc Assert Scissor tests 2018-10-08 20:49:36 -04:00
ReinUsesLisp
7c2d6ef210 gl_shader_decompiler: Move position varying location from 15 to 0 and apply an offset 2018-10-07 17:36:00 -03:00
ReinUsesLisp
ee4d538850 gl_shader_decompiler: Implement geometry shaders 2018-10-07 17:36:00 -03:00
ReinUsesLisp
4d0c682468 video_core: Allow LabelGLObject to use extra info on any object 2018-10-07 17:27:49 -03:00
bunnei
2c0b0ad50d
Merge pull request #1446 from bunnei/fast_fermi_copy
gl_rasterizer: Implement accelerated Fermi2D copies.
2018-10-06 23:18:52 -04:00
bunnei
1cc5e6e9bc
Merge pull request #1437 from FernandoS27/tex-mode2
Implemented Depth Compare, Shadow Samplers and Texture Processing Modes for TEXS and TLDS
2018-10-06 23:14:27 -04:00
ReinUsesLisp
0ecd181cca gl_rasterizer: Fixup undefined behaviour in SetupDraw 2018-10-06 23:22:48 -03:00
FernandoS27
752faff2bc Implemented Depth Compare and Shadow Samplers 2018-10-06 11:27:54 -04:00
bunnei
9aec85d39c fermi_2d: Implement simple copies with AccelerateSurfaceCopy. 2018-10-06 03:20:04 -04:00
bunnei
011cf77796 gl_rasterizer: Add rasterizer cache code to handle accerated fermi copies. 2018-10-06 03:20:04 -04:00
bunnei
749aef3dd0 gl_rasterizer_cache: Implement a simpler surface copy using glCopyImageSubData. 2018-10-06 03:20:04 -04:00
ReinUsesLisp
3e2380327a gl_rasterizer: Implement quads topology 2018-10-04 00:03:44 -03:00
FernandoS27
f664437ae8 Implemented Texture Processing Modes in TEXS and TLDS 2018-10-03 08:41:12 -04:00
ReinUsesLisp
1835911425 gl_rasterizer: Fixup unassigned point sizes 2018-10-01 01:18:24 -03:00
bunnei
bc679c9b8c
Merge pull request #1330 from raven02/tlds
TLDS: Add 1D sampler
2018-09-30 21:53:38 -04:00
bunnei
df3799a008 gl_rasterizer_cache: Fixes to how we do render to cubemap.
- Fixes issues with Splatoon 2.
2018-09-30 15:10:14 -04:00
bunnei
29782273ec gl_rasterizer_cache: Add check for array rendering to cubemap texture. 2018-09-30 14:31:58 -04:00
bunnei
f543b43fd0 gl_rasterizer_cache: Implement render to cubemap. 2018-09-30 14:31:58 -04:00
bunnei
15cc729ebd gl_shader_decompiler: TEXS: Implement TextureType::TextureCube. 2018-09-30 14:31:58 -04:00
bunnei
ed2e0e85c9 gl_rasterizer_cache: Add support for SurfaceTarget::TextureCubemap. 2018-09-30 14:31:58 -04:00
bunnei
871580dcd8 gl_rasterizer_cache: Implement LoadGLBuffer for Texture2DArray. 2018-09-30 14:31:57 -04:00
bunnei
a9aa1db552 gl_rasterizer_cache: Update BlitTextures to support non-Texture2D ColorTexture surfaces. 2018-09-30 14:31:57 -04:00
bunnei
2e1cdde994 gl_rasterizer_cache: Track texture target and depth in the cache. 2018-09-30 14:31:57 -04:00
bunnei
fefb003b23 gl_rasterizer_cache: Workaround for Texture2D -> Texture2DArray scenario. 2018-09-30 14:31:56 -04:00
bunnei
ce452049d3 gl_rasterizer_cache: Keep track of surface 2D size separately from total size. 2018-09-30 14:31:56 -04:00
raven02
b92b4bbeaf Fix trailing whitespace 2018-09-30 23:51:10 +08:00
bunnei
fe5962e073
Merge pull request #1411 from ReinUsesLisp/point-size
video_core: Implement point_size and add point state sync
2018-09-29 11:58:39 -04:00
bunnei
8d8366b602
Merge pull request #1406 from ReinUsesLisp/multibind-samplers
gl_state: Pack sampler bindings into a single ARB_multi_bind
2018-09-29 10:55:45 -04:00
ReinUsesLisp
e3e51d3ddb video_core: Implement point_size and add point state sync 2018-09-28 02:13:29 -03:00
ReinUsesLisp
b8f1506aa5 gl_state: Pack sampler bindings into a single ARB_multi_bind 2018-09-28 02:04:22 -03:00
bunnei
62048edc15
Merge pull request #1377 from FernandoS27/faster-swizzle
Improved Fast Swizzle and Legacy Swizzle
2018-09-27 10:00:04 -04:00
ReinUsesLisp
ab65fde9f4 video_core: Add asserts for CS, TFB and alpha testing
Add asserts for compute shader dispatching, transform feedback being
enabled and alpha testing. These have in common that they'll probably break
rendering without logging.
2018-09-25 21:07:00 -03:00
David
9f3fc067bf Added glObjectLabels for renderdoc for textures and shader programs (#1384)
* Added glObjectLabels for renderdoc for textures and shader programs

* Changed hardcoded "Texture" name to reflect the texture type instead

* Removed string initialize
2018-09-23 17:55:41 -04:00
greggameplayer
b91e2d55f3 correct BC6H 2018-09-23 19:17:22 +02:00
bunnei
fb9f273e90
Merge pull request #1380 from lioncash/const
shader_bytecode: Make operator== and operator!= of IpaMode const qualified
2018-09-22 01:37:36 -04:00
bunnei
73eea61614
Merge pull request #1382 from lioncash/inc
gl_state: Remove unused type alias
2018-09-22 01:36:21 -04:00
Lioncash
90746c33c7 gl_state: Remove unused type alias
This isn't used anywhere within the header, so we can remove it, along
with the include that was previously necessary. This also uncovers an
indirect include in the cpp file for the assertion macros.
2018-09-21 18:22:43 -04:00
Lioncash
a8f5fd787f shader_bytecode: Lay out the Ipa-related enums better
This is more consistent with the surrounding enums.
2018-09-21 16:17:31 -04:00
Lioncash
272517cf7e shader_bytecode: Make operator== and operator!= of IpaMode const qualified
These don't affect the state of the struct and can be const member
functions.
2018-09-21 16:17:27 -04:00
bunnei
4f186de069
Merge pull request #1379 from lioncash/bitwise
gl_stream_buffer: Fix use of bitwise OR instead of logical OR in Map()
2018-09-21 14:02:00 -04:00
FernandoS27
57b44200a2 Reverse stride align restriction on FastSwizzle due to lost performance 2018-09-21 12:09:59 -04:00
FernandoS27
d2dd1289bd Join both Swizzle methods within one interface function 2018-09-21 11:42:34 -04:00
FernandoS27
41c6c4593a Standarized Legacy Swizzle to look alike FastSwizzle and use a Swizzling Table instead 2018-09-21 11:34:54 -04:00
FernandoS27
f020319a45 Remove same output bpp restriction on FastSwizzle 2018-09-21 11:10:44 -04:00
FernandoS27
68aaa83836 Improved Legacy Swizzler to be better documented and work better 2018-09-21 10:57:12 -04:00
Lioncash
ba02dd9ebc gl_stream_buffer: Fix use of bitwise OR instead of logical OR in Map()
This was very likely intended to be a logical OR based off the
conditioning and testing of inversion in one case.

Even if this was intentional, this is the kind of non-obvious thing one
should be clarifying with a comment.
2018-09-21 07:59:03 -04:00
Subv
9cd5c61fcf RasterizerGL: Use the correct framebuffer when clearing via the CLEAR_BUFFERS register.
Previously we were clearing the default backbuffer framebuffer.

Found thanks to a Piglit test :)
2018-09-20 22:31:53 -05:00
FernandoS27
bf2f2a715f Improved fast swizzle and removed restrictions to it 2018-09-20 23:06:53 -04:00
raven02
c8f9bbbf85
Merge branch 'master' into tlds 2018-09-19 19:53:11 +08:00
Markus Wick
f465e4aaf2 gl_rasterizer: Fix StartAddress handling with indexed draw calls.
We uploaded the wrong data before. So the offset on the host GPU pointer may work for the first vertices, the last ones run out bounds.
Let's just offset the upload instead.
2018-09-19 09:22:30 +02:00
bunnei
bd88d4108f
Merge pull request #1342 from lioncash/trunc
gl_shader_decompiler: Avoid truncation warnings within LD_A and ST_A code
2018-09-18 22:11:48 -04:00
bunnei
0284cbe7ec
Merge pull request #1279 from FernandoS27/csetp
shader_decompiler: Implemented (Partialy) Control Codes and CSETP
2018-09-18 22:10:48 -04:00
bunnei
6415f81bb8
Merge pull request #1299 from FernandoS27/texture-sanatize
shader_decompiler: Asserts for Texture Instructions
2018-09-18 22:10:09 -04:00
FernandoS27
567a5524b9 Implemented Internal Flags 2018-09-17 20:50:54 -04:00
Lioncash
9a8dbba1e5 gl_shader_decompiler: Avoid truncation warnings within LD_A and ST_A code
These are internally stored as u64 values, so using u32 here causes
truncation warnings. Instead, we can just use u64 and preserve the bit
width.
2018-09-17 19:25:55 -04:00
bunnei
fafc80d72e
Merge pull request #1290 from FernandoS27/shader-header
Implemented (Partialy) Shader Header
2018-09-17 18:53:14 -04:00
FernandoS27
e4bb759c4b Implemented I2I.CC on the NEU control code, used by SMO 2018-09-17 17:42:46 -04:00
FernandoS27
e2ac8fb36d Implemented CSETP 2018-09-17 17:42:44 -04:00
FernandoS27
aac77bbd18 Implemented Control Codes 2018-09-17 17:42:43 -04:00
FernandoS27
31e52113b3 Added asserts for texture misc modes to texture instructions 2018-09-17 12:56:36 -04:00
FernandoS27
55a4756766 Added texture misc modes to texture instructions 2018-09-17 12:51:05 -04:00
bunnei
a94b623dfb
Merge pull request #1311 from FernandoS27/fast-swizzle
Optimized Texture Swizzling
2018-09-17 12:39:34 -04:00
bunnei
27fe8159c5
Merge pull request #1316 from lioncash/shadow
gl_shader_decompiler: Get rid of variable shadowing within LEA instructions
2018-09-17 12:27:35 -04:00
raven02
b91f7d5d67 Add 1D sampler for TLDS - TexelFetch (Mario Rabbids) 2018-09-17 23:25:18 +08:00
bunnei
076add4ccd
Merge pull request #1326 from FearlessTobi/port-4182
Port #4182 from Citra: "Prefix all size_t with std::"
2018-09-17 09:51:47 -04:00
bunnei
3be048e50a
Merge pull request #1329 from raven02/bgr5a1u
Implement RenderTargetFormat::BGR5A1_UNORM
2018-09-17 09:49:00 -04:00
raven02
2845348608 Implement ASTC_2D_8X8 (Bayonetta 2) 2018-09-17 01:04:27 +08:00
bunnei
ba480ea2fb
Merge pull request #1273 from Subv/ld_sizes
Shaders: Implemented multiple-word loads and stores to and from attribute memory.
2018-09-15 15:27:12 -04:00
bunnei
daee15b058
Merge pull request #1271 from Subv/kepler_engine
GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
2018-09-15 13:27:07 -04:00
raven02
0019a36b41 Implement RenderTargetFormat::BGR5A1_UNORM (Pokken Tournament DX) 2018-09-16 00:21:42 +08:00
Subv
c878a819d7 Shaders: Implemented multiple-word loads and stores to and from attribute memory.
This seems to be an optimization performed by nouveau.
2018-09-15 11:21:21 -05:00
fearlessTobi
63c2e32e20 Port #4182 from Citra: "Prefix all size_t with std::" 2018-09-15 15:21:06 +02:00
FernandoS27
f8e994354f Optimized Texture Swizzling 2018-09-14 12:45:49 -04:00
Lioncash
ae128f0375 gl_shader_decompiler: Get rid of variable shadowing within LEA instructions
These variables are already defined within an outer scope.
2018-09-13 21:53:23 -04:00
ReinUsesLisp
a42376dfad Use ARB_multi_bind for uniform buffers (#1287)
* gl_rasterizer: use ARB_multi_bind for uniform buffers

* address feedback
2018-09-12 20:27:43 -04:00
bunnei
4a43fb7e1d gl_rasterizer_cache: B5G6R5U should use GL_RGB8 as an internal format.
- Fixes a regression with Sonic Mania with ARB_texture_storage.
2018-09-12 18:09:31 -04:00
bunnei
cc50857460
Merge pull request #1263 from FernandoS27/tex-mode
shader_decompiler:  Implemented (Partially) Texture Processing Modes
2018-09-12 16:03:34 -04:00
Subv
bb5eb4f20a GPU: Basic implementation of the Kepler Inline Memory engine (p2mf).
This engine writes data from a FIFO register into the configured address.
2018-09-12 13:57:08 -05:00
FernandoS27
a99d9db32f Implemented Texture Processing Modes 2018-09-12 12:28:22 -04:00
bunnei
89825766ee
Merge pull request #1278 from tech4me/bg-color-fix
Port Citra #4047 & #4052: add change background color support
2018-09-11 23:13:11 -04:00
bunnei
522a11a11f
Merge pull request #1295 from bunnei/accurate-copies
gl_rasterizer_cache: Improve accuracy of caching and copies.
2018-09-11 23:12:15 -04:00
bunnei
4a9acc87f9
Merge pull request #1294 from degasus/optimizations
gl_rasterizer: Use ARB_texture_storage.
2018-09-11 23:11:36 -04:00
bunnei
7bb226f22d gl_rasterizer_cache: Always blit on recreate, regardless of format.
- Fixes several rendering issues with Super Mario Odyssey.
2018-09-11 22:54:46 -04:00
bunnei
cdddd71d08 gl_shader_cache: Remove cache_width/cache_height.
- This was once an optimization, but we no longer need it with the cache reserve.
- This is also inaccurate.
2018-09-11 20:12:29 -04:00
Markus Wick
3e973bc4c6 gl_rasterizer: Use ARB_texture_storage.
It allows us to use texture views and it reduces the overhead within the GPU driver.

But it disallows us to reallocate the texture, but we don't do so anyways.

In the end, it is the new way to allocate textures, so there is no need to use the old way.
2018-09-11 22:18:46 +02:00
FernandoS27
5c676dc884 Implemented LEA and PSET 2018-09-11 12:50:52 -04:00
FernandoS27
3f0922715a Implemented encodings for LEA and PSET 2018-09-11 12:50:25 -04:00
FernandoS27
2b48cfd44b Replace old FragmentHeader for the new Header 2018-09-11 12:48:19 -04:00
FernandoS27
e926757c8f Implemented (Partialy) Shader Header 2018-09-11 12:34:27 -04:00
David Marcec
4c3bd33be2 Fixed renderdoc input/output textures not working due to render targets 2018-09-11 13:31:20 +10:00
bunnei
d6e8e16a66
Merge pull request #1286 from bunnei/multi-clear
gl_rasterizer: Implement clear for non-zero render targets.
2018-09-10 20:30:14 -04:00
bunnei
12445b476d
Merge pull request #1285 from bunnei/depth-fix
gl_rasterizer_cache: Only use depth for applicable texture formats.
2018-09-10 20:28:40 -04:00
bunnei
d884e805c5
Merge pull request #1284 from bunnei/bgra8_srgb
gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.
2018-09-10 20:28:00 -04:00
Markus Wick
c1b8cd9058 video_core: Refactor command_processor.
Inline the WriteReg helper as it is called ~20k times per frame.
2018-09-10 22:06:16 +02:00
Markus Wick
0cfb0bacb2 video_core: Move command buffer loop.
This moves the hot loop into video_core. This refactoring shall reduce the CPU overhead of calling ProcessCommandList.
2018-09-10 22:06:13 +02:00
Markus Wick
c560043581 rasterizer: Drop unused handler.
This virtual function is called in a very hot spot, and it does nothing.

If this kind of feature is required, please be more specific and add callbacks
in the switch statement within Maxwell3D::WriteReg. There is no point in having
another switch statement within the rasterizer.
2018-09-10 22:03:10 +02:00
bunnei
4c0b1cc1ae gl_rasterizer_cache: Only use depth for applicable texture formats.
- Fixes an issue with Octopath Traveler leaving stale data here.
2018-09-10 00:50:38 -04:00
bunnei
035e6bd407 gl_rasterizer: Implement clear for non-zero render targets.
- Several misc. changes to ConfigureFramebuffers in support of this.
2018-09-10 00:41:20 -04:00
bunnei
1c34498368 gl_rasterizer_cache: Implement RenderTargetFormat::BGRA8_SRGB.
- Used by Octopath Traveler (with multiple render targets).
2018-09-10 00:37:52 -04:00
bunnei
49b15af054 gl_rasterizer: Implement multiple color attachments. 2018-09-09 22:48:28 -04:00
bunnei
e58855c7a4
Merge pull request #1268 from FernandoS27/tmml
shader_decompiler: Implemented TMML
2018-09-09 21:39:39 -04:00
FernandoS27
00131e752d Implemented TMML 2018-09-09 20:46:31 -04:00
bunnei
223ddb2008
Merge pull request #1272 from Subv/dma_2d
GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
2018-09-09 19:53:17 -04:00
bunnei
fcf81147e7
Merge pull request #1280 from zero334/improvements
video_core: fixed arithmetic overflow warnings & improved code style
2018-09-09 19:51:46 -04:00
FernandoS27
073a21ac0b Implemented TXQ dimension query type, used by SMO. 2018-09-09 11:59:01 -04:00
Patrick Elsässer
64e45b04e0 video_core: fixed arithmetic overflow warnings & improved code style
- Fixed all warnings, for renderer_opengl items, which were indicating a
possible incorrect behavior from integral promotion rules and types
larger than those in which arithmetic is typically performed.
- Added const for variables where possible and meaningful.
- Added constexpr where possible.
2018-09-09 17:51:43 +02:00
tech4me
3dcedb36b4 Port Citra #4047 & #4052: add change background color support 2018-09-08 17:00:21 -07:00
FernandoS27
82a313a14c Change name of TEXQ to TXQ, in order to match NVIDIA's naming 2018-09-08 18:08:57 -04:00
Subv
fdb199290b GPU/DMA: Partially implemented the 'enable_2d' bit in the DMA engine.
When not set, this tells the GPU to only use the X size when performing a DMA copy.
This is only implemented for linear->linear and tiled->tiled copies. Conversion copies still retain the assert.

This bit is unset by some games for various purposes, and by nouveau when copying the vertex buffers.
2018-09-08 16:02:16 -05:00
bunnei
af074ee422
Merge pull request #1256 from bunnei/tex-target-support
Initial support for non-2D textures
2018-09-08 16:14:46 -04:00
bunnei
8b08cb925b gl_rasterizer: Use baseInstance instead of moving the buffer points.
This hopefully helps our cache not to redundant upload the vertex buffer.

# Conflicts:
#	src/video_core/renderer_opengl/gl_rasterizer.cpp
2018-09-08 04:05:56 -04:00
Patrick Elsässer
a8974f0556 video_core: Arithmetic overflow warning fix for gl_rasterizer (#1262)
* video_core: Arithmetic overflow fix for gl_rasterizer

- Fixed warnings, which were indicating incorrect behavior from integral
promotion rules and types larger than those in which arithmetic is
typically performed.

- Added const for variables where possible and meaningful.

* Changed the casts from C to C++ style

Changed the C-style casts to C++ casts as proposed.
Took also care about signed / unsigned behaviour.
2018-09-08 02:59:59 -04:00
bunnei
23ae7cf9db gl_rasterizer_cache: Improve accuracy of RecreateSurface for non-2D textures. 2018-09-08 02:53:39 -04:00
bunnei
fdd5c97a14 maxwell_3d: Remove assert that no longer applies. 2018-09-08 02:53:39 -04:00
bunnei
f165a85398 gl_rasterizer_cache: Partially implement several non-2D texture types. 2018-09-08 02:53:38 -04:00
bunnei
0731383124 gl_shader_decompiler: Partially implement several non-2D texture types (Subv). 2018-09-08 02:53:38 -04:00
bunnei
05f6f59ffb gl_rasterizer: Implement texture wrap mode p. 2018-09-08 02:53:38 -04:00
bunnei
ce8291f6c5 gl_rasterizer_cache: Track texture depth. 2018-09-08 02:53:38 -04:00
bunnei
9dccf7e1fa gl_rasterizer_cache: Remove impl. of FlushGLBuffer.
- Will not work for non-2d textures, and was not used anyways.
2018-09-08 02:53:37 -04:00
bunnei
030676b95d gl_rasterizer_cache: Keep track of texture type per surface. 2018-09-08 02:53:37 -04:00
bunnei
a439f7b6e1 gl_rasterizer_cache: Remove unused DownloadGLTexture. 2018-09-08 02:53:37 -04:00
bunnei
b56e5edafc gl_state: Keep track of texture target. 2018-09-08 02:53:37 -04:00
bunnei
9947c6ad59
Merge pull request #1252 from lioncash/header
video_core/CMakeLists: Add missing gl_buffer_cache.h
2018-09-06 19:19:43 -04:00
bunnei
9b50dca2bb
Merge pull request #1253 from lioncash/explicit
video_core/gl_buffer_cache: Minor tidying changes
2018-09-06 19:19:35 -04:00
bunnei
009a2cc9cc
Merge pull request #1255 from bunnei/minor-opt
gl_rasterizer: Call state.Apply only once on SetupShaders.
2018-09-06 19:19:16 -04:00
bunnei
820f646458 gl_rasterizer: Call state.Apply only once on SetupShaders. 2018-09-06 17:41:53 -04:00
bunnei
948f6c0738 gl_shader_decompiler: Implement saturate mode for IPA. 2018-09-06 17:40:03 -04:00
Lioncash
ddcdbce067 gl_buffer_cache: Default initialize member variables
Ensures that the cache always has a deterministic initial state.
2018-09-06 15:07:15 -04:00
Lioncash
8d685a29bc gl_buffer_cache: Make GetHandle() a const member function
GetHandle() internally calls GetHandle() on the stream_buffer instance,
which is a const member function, so this can be made const as well.
2018-09-06 15:07:15 -04:00
Lioncash
14230fe2af gl_buffer_cache: Remove unnecessary includes 2018-09-06 15:05:52 -04:00
Lioncash
68296d9474 gl_buffer_cache: Make constructor explicit
Implicit conversions during construction isn't desirable here.
2018-09-06 14:54:49 -04:00
Lioncash
8f4e09ba07 video_core/CMakeLists: Add missing gl_buffer_cache.h
Without this, the header file won't show up by default within IDEs such
as Visual Studio.
2018-09-06 14:49:51 -04:00
Markus Wick
a781042700 gl_shader_gen: Initialize position.
IMO the old code is fine, but nvidia raises shader compiler warnings.
Trivial fix through...
2018-09-06 13:37:50 +02:00
bunnei
77554ac773
Merge pull request #1243 from degasus/VAO_cache
gl_rasterizer: Implement a VAO cache.
2018-09-05 22:50:52 -04:00
bunnei
6f09c5b128
Merge pull request #1244 from FernandoS27/ipa
shader_decompiler: Implemented IPA Properly (Stage 1)
2018-09-05 21:20:40 -04:00
FernandoS27
e63b229f4a Implemented IPA Properly 2018-09-05 20:15:47 -04:00
Markus Wick
7f15306f78 gl_rasterizer: Skip TODO log.
This is called ~3k times per frame in SMO ingame.
My laptop spends ~3ms per frame on allocating and freeing this string.

Let's just stop printing this kind of redundant information.
2018-09-05 20:20:20 +02:00
Markus Wick
d3ad9469a1 gl_rasterizer: Implement a VAO cache.
This patch caches VAO objects instead of re-emiting all pointers per draw call.
Configuring this pointers is known as a fast task, but it yields too many GL
calls. So for better performance, just bind the VAO instead of 16 pointers.
2018-09-05 18:46:35 +02:00
Markus Wick
50a806ea67 renderer_opengl: Implement a buffer cache.
The idea of this cache is to avoid redundant uploads. So we are going
to cache the uploaded buffers within the stream_buffer and just reuse
the old pointers.
The next step is to implement a VBO cache on GPU memory, but for now,
I want to check the overhead of the cache management. Fetching the
buffer over PCI-E should be quite fast.
2018-09-05 08:03:50 +02:00
Markus Wick
99a71580c4 gl_shader_cache: Use an u32 for the binding point cache.
The std::string generation with its malloc and free requirement
was a noticeable overhead. Also switch to an ordered_map to
avoid the std::hash call. As those maps usually have a size of
two elements, the lookup time shall not matter.
2018-09-04 21:04:41 +02:00
bunnei
9a07e9f805
Merge pull request #1237 from degasus/optimizations
Optimizations
2018-09-04 12:16:06 -04:00
bunnei
26e96d16d0
Merge pull request #1232 from lioncash/copy
gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()
2018-09-04 11:52:25 -04:00
Markus Wick
2081ed7db2 command_processor: Use std::array for bound_engines.
subchannel is a 3 bit field. So there must not be more than 8 bound engines.
And using a hashmap for up to 8 values is a bit overpowered.
2018-09-04 14:10:05 +02:00
Markus Wick
10bc725944 Update microprofile scopes.
Blame the subsystems which deserve the blame :)

The updated list is not complete, just the ones I've spotted on random sampling the stack trace.
2018-09-04 11:04:26 +02:00
Lioncash
18a89931a9 gl_shader_decompiler: Use used_shaders member variable directly within GenerateDeclarations()
Using the getter function intended for external code here makes an
unnecessary copy of the already-accessible used_shaders vector.
2018-09-02 13:10:11 -04:00
bunnei
325f3e0693
Merge pull request #1213 from DarkLordZach/octopath-fs
filesystem/maxwell_3d: Various changes to boot Project Octopath Traveller
2018-09-02 10:49:18 -04:00
bunnei
89be49d2f3
Merge pull request #1215 from ogniK5377/texs-nodep-assert
Added assert for TEXS nodep
2018-09-02 10:48:27 -04:00
bunnei
177c45e97d
Merge pull request #1214 from ogniK5377/ipa-assert
Added better asserts to IPA, Renamed IPA modes to match mesa
2018-09-02 10:44:43 -04:00
bunnei
9c206fe94d
Merge pull request #1216 from ogniK5377/ffma-assert
Added FFMA asserts and missing fields
2018-09-02 10:44:13 -04:00
David Marcec
60754b4728 Removed saturate assert
Unneeded as we already implement it
2018-09-01 19:33:32 +10:00
David Marcec
2edab4e840 Removed saturate assert
Saturate already implemented
2018-09-01 19:29:20 +10:00
David Marcec
2bc6abb9a1 Changed tab5980_0 default from 0 -> 1 2018-09-01 19:15:03 +10:00
David Marcec
6f8ed9508d Added FMUL asserts 2018-09-01 19:05:10 +10:00
David Marcec
b89fc407d7 Added FFMA asserts 2018-09-01 18:45:14 +10:00
David Marcec
948bc87a59 Added assert for TEXS nodep 2018-09-01 17:00:01 +10:00
David Marcec
ad3dca7e62 Added better asserts to IPA, Renamed IPA modes to match mesa
IpaMode is changed to IpaInterpMode
IpaMode is suppose to be 2 bits not 3
Added IpaSampleMode
Added Saturate

Renamed modes based on
d27c791891/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gm107.cpp (L2530)
2018-09-01 16:34:27 +10:00
Zach Hilman
f32e28c7b8 maxwell_3d: Use CoreTiming for query timestamp 2018-08-31 23:25:18 -04:00
Lioncash
4a587b81b2 core/core: Replace includes with forward declarations where applicable
The follow-up to e2457418da, which
replaces most of the includes in the core header with forward declarations.

This makes it so that if any of the headers the core header was
previously including change, then no one will need to rebuild the bulk
of the core, due to core.h being quite a prevalent inclusion.

This should make turnaround for changes much faster for developers.
2018-08-31 16:30:14 -04:00
bunnei
7f7eb29323 gl_rasterizer_cache: Use accurate framebuffer setting for accurate copies. 2018-08-31 13:07:28 -04:00
bunnei
123c065086 gl_rasterizer_cache: Also use reserve cache for RecreateSurface. 2018-08-31 13:07:28 -04:00
bunnei
9bc71fcc5f rasterizer_cache: Use boost::interval_map for a more accurate cache. 2018-08-31 13:07:28 -04:00
bunnei
d647d9550c gl_renderer: Cache textures, framebuffers, and shaders based on CPU address. 2018-08-31 13:07:27 -04:00
bunnei
16d65182f9 gl_rasterizer: Fix issues with the rasterizer cache.
- Use a single cached page map.
- Fix calculation of ending page.
2018-08-31 13:07:27 -04:00
greggameplayer
06578e89b2 Implement BC6H_UF16 & BC6H_SF16 (#1092)
* Implement BC6H_UF16 & BC6H_SF16
Require by ARMS

* correct coding style

* correct coding style part 2
2018-08-31 12:11:19 -04:00
bunnei
f08d24e9c0
Merge pull request #1204 from lioncash/pimpl
core: Make the main System class use the PImpl idiom
2018-08-31 11:31:20 -04:00
bunnei
6683bf50b5
Merge pull request #1207 from degasus/hotfix
Report correct shader size.
2018-08-31 11:21:15 -04:00
Lioncash
e2457418da core: Make the main System class use the PImpl idiom
core.h is kind of a massive header in terms what it includes within
itself. It includes VFS utilities, kernel headers, file_sys header,
ARM-related headers, etc. This means that changing anything in the
headers included by core.h essentially requires you to rebuild almost
all of core.

Instead, we can modify the System class to use the PImpl idiom, which
allows us to move all of those headers to the cpp file and forward
declare the bulk of the types that would otherwise be included, reducing
compile times. This change specifically only performs the PImpl portion.
2018-08-31 07:16:57 -04:00
Markus Wick
5be8b7a362 Report correct shader size.
Seems like this was an oversee in regards to 1fd979f50a
It changed GLShader::ProgramCode to a std::vector, so sizeof is wrong.
2018-08-31 09:56:37 +02:00
Hexagon12
d626bc8c62 Added predicate comparison GreaterEqualWithNan 2018-08-31 10:40:18 +03:00
Laku
915ab81ec2 gl_shader_decompiler: Implement POPC (#1203)
* Implement POPC

* implement invert
2018-08-30 21:32:58 -04:00
bunnei
d6accf96ff
Merge pull request #1200 from bunnei/improve-ipa
gl_shader_decompiler: Improve IPA for Pass mode with Position attribute.
2018-08-30 10:31:26 -04:00
tech4me
a6dd577d02 Shaders: Implemented IADD3 2018-08-29 13:44:41 -04:00
bunnei
b1ccd88434 gl_shader_decompiler: Improve IPA for Pass mode with Position attribute. 2018-08-29 00:37:29 -04:00
bunnei
4d7e1662c8
Merge pull request #1193 from lioncash/priv
gpu: Make memory_manager private
2018-08-28 12:28:57 -04:00
bunnei
eb4f2d5596
Merge pull request #1192 from lioncash/unused
gl_rasterizer: Remove unused variables
2018-08-28 12:28:13 -04:00
Lioncash
2e7dc4cac9 gl_shader_cache: Remove unused program_code vector in GetShaderAddress()
Given std::vector is a type with a non-trivial destructor, this
variable cannot be optimized away by the compiler, even if unused.
Because of that, something that was intended to be fairly lightweight,
was actually allocating 32KB and deallocating it at the end of the
function.
2018-08-28 11:20:41 -04:00
Lioncash
45fb74d262 gpu: Make memory_manager private
Makes the class interface consistent and provides accessors for
obtaining a reference to the memory manager instance.

Given we also return references, this makes our more flimsy uses of
const apparent, given const doesn't propagate through pointers in the
way one would typically expect. This makes our mutable state more
apparent in some places.
2018-08-28 11:11:50 -04:00
Lioncash
6771a18c6c gl_rasterizer: Remove unused variables 2018-08-28 10:46:29 -04:00
bunnei
b55d8111e6 renderer_opengl: Implement a new shader cache. 2018-08-27 18:26:46 -04:00
bunnei
a0e1566dc5 gl_rasterizer_cache: Update to use RasterizerCache base class. 2018-08-27 18:26:46 -04:00
bunnei
382852418b video_core: Add RasterizerCache class for common cache management code. 2018-08-27 18:26:45 -04:00
bunnei
2f5ed3877c
Merge pull request #1169 from Lakumakkara/sel
shader_bytecode: fix SEL_IMM bitstring
2018-08-27 18:24:57 -04:00
bunnei
f96ded9815
Merge pull request #1174 from lioncash/debug
debug_utils: Minor individual interface changes
2018-08-27 15:44:29 -04:00
bunnei
be2f1eabd7
Merge pull request #1173 from lioncash/batch
maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
2018-08-25 10:59:54 -04:00
bunnei
23b86fd3ea
Merge pull request #1167 from lioncash/assert
gl_rasterizer: Correct assertion condition in SyncLogicOpState()
2018-08-25 10:50:59 -04:00
Lioncash
c65713832c debug_utils: Remove unused includes
Quite a bit of these aren't necessary directly within the debug_utils
header and can be removed or included where actually necessary.
2018-08-24 20:49:14 -04:00
Lioncash
1e6a209649 debug_utils: Make BreakpointObserver class' constructor explicit
Avoids implicit conversions.
2018-08-24 20:49:14 -04:00
Lioncash
b6425c0511 debug_utils: Initialize active_breakpoint member of DebugContext
Ensures that all class members are initialized.
2018-08-24 20:15:50 -04:00
Lioncash
20800f2df7 maxwell3d: Move FinishedPrimitiveBatch event after AcceleratedDrawBatch()
The start and finish events should likely not be right after one another
like this, otherwise the batch will appear to complete immediately
2018-08-24 19:58:05 -04:00
Laku
36093a3e4d fix SEL_IMM bitstring 2018-08-24 07:18:12 +03:00
Lioncash
8fd9eb71b4 gl_rasterizer: Correct assertion condition in SyncLogicOpState()
Previously the assert would always be hit, since it was the equivalent
of: array == nullptr, which is never true.
2018-08-23 23:00:54 -04:00
tech4me
ba2972bc64 Shaders: Added decodings for IADD3 instructions 2018-08-23 15:46:59 -04:00
bunnei
0dce6d7008
Merge pull request #1160 from bunnei/surface-reserve
gl_rasterizer_cache: Several improvements
2018-08-23 12:04:37 -04:00
bunnei
d65f079cc1 gl_rasterizer_cache: Blit when possible on RecreateSurface. 2018-08-23 11:27:01 -04:00
bunnei
fee8bdd90c gl_rasterizer_cache: Reserve surfaces that have already been created for later use. 2018-08-23 11:27:01 -04:00
bunnei
fde2017a3f gl_rasterizer_cache: Remove assert for RecreateSurface type. 2018-08-23 11:27:00 -04:00
bunnei
ebf5768340 gl_rasterizer_cache: Implement compressed texture copies. 2018-08-23 11:27:00 -04:00
bunnei
a4ac3bed6c gl_rasterizer: Implement stencil test.
- Used by Splatoon 2.
2018-08-23 11:08:49 -04:00
bunnei
da3da6be90 gl_rasterizer: Implement partial color clear and stencil clear. 2018-08-23 11:08:48 -04:00
bunnei
2a472ff54d maxwell_3d: Update to include additional stencil registers. 2018-08-23 11:08:47 -04:00
bunnei
c4ed0b16b1 gl_state: Update to handle stencil front/back face separately. 2018-08-23 11:08:46 -04:00
bunnei
c7f2fb2151
Merge pull request #1157 from lioncash/vec
gl_shader_gen: Use a std::vector to represent program code instead of std::array
2018-08-23 02:19:00 -04:00
bunnei
232b0d9d2a
Merge pull request #1156 from Lakumakkara/lop3
gl_shader_decompiler: Implement LOP3
2018-08-23 02:16:49 -04:00
Lioncash
12ba80a86c gl_shader_gen: Make ShaderSetup's constructor explicit
Prevents implicit conversions.
2018-08-22 17:04:44 -04:00
Lioncash
1fd979f50a gl_shader_gen: Use a std::vector to represent program code instead of std::array
While convenient as a std::array, it's also quite a large set of data as
well (32KB). It being an array also means data cannot be std::moved. Any
situation where the code is being set or relocated means that a full
copy of that 32KB data must be done.

If we use a std::vector we do need to allocate on the heap, however, it
does allow us to std::move the data we have within the std::vector into
another std::vector instance, eliminating the need to always copy the
program data (as std::move in this case would just transfer the pointers
and bare necessities over to the new vector instance).
2018-08-22 17:04:44 -04:00
Laku
b2ca8089ce more fixes 2018-08-23 00:01:40 +03:00
Laku
e70a3c5a5d fixes 2018-08-22 21:33:32 +03:00
Lioncash
dd35b4b18a renderer_opengl: Namespace OpenGL code
Namespaces all OpenGL code under the OpenGL namespace.

Prevents polluting the global namespace and allows clear distinction
between other renderers' code in the future.
2018-08-22 06:14:47 -04:00
Laku
4877e6c2f6 remove debug logging 2018-08-22 11:45:28 +03:00
Laku
8e8326595f implement lop3 2018-08-22 10:09:44 +03:00
bunnei
cea627b0fc
Merge pull request #840 from FearlessTobi/port-3353
Port #3353 from Citra: "citra-qt: Add customizable speed limit target "
2018-08-22 01:19:50 -04:00
bunnei
5abf71fe65
Merge pull request #1154 from OatmealDome/topology-lines
maxwell_to_gl: Implement PrimitiveTopology::Lines
2018-08-22 01:08:34 -04:00
bunnei
125d7122ac
Merge pull request #1124 from Subv/logic_ops
GPU: Implemented logic ops.
2018-08-22 01:05:25 -04:00
OatmealDome
ad1220e1b3
maxwell_to_gl: Implement PrimitiveTopology::Lines
Used by Splatoon 2's debug menu.
2018-08-22 01:01:06 -04:00
bunnei
d63b1d21f1 Revert "Shader: Use the right sampler type in the TEX, TEXS and TLDS instructions."
- This reverts commit 3ef4b3d4b4.
- This commit had broken a lot of games. We really should do a full implementation of this in one change.
2018-08-21 20:07:40 -04:00
Lioncash
a0e2bd85a5 shader_bytecode: Parenthesize conditional expression within GetTextureType()
Resolves a -Wlogical-op-parentheses warning.
2018-08-21 15:08:35 -04:00
bunnei
bf89a99839
Merge pull request #1123 from lioncash/screen
rasterizer_interface: Remove renderer-specific ScreenInfo type from AccelerateDraw() in RasterizerInterface
2018-08-21 01:18:34 -04:00
bunnei
b0f7713fce
Merge pull request #1132 from Subv/gl_FragDepth
Shaders: Implement depth writing in fragment shaders.
2018-08-21 01:17:53 -04:00
bunnei
8c9abe1d41
Merge pull request #1134 from lioncash/log
renderer_opengl: Use LOG_DEBUG for GL_DEBUG_SEVERITY_NOTIFICATION and GL_DEBUG_SEVERITY_LOW logs
2018-08-21 01:17:31 -04:00
bunnei
ca58929eb0
Merge pull request #1121 from Subv/tex_reinterpret
Rasterizer: Use PBOs to reinterpret texture formats when games re-use the same memory.
2018-08-21 01:06:40 -04:00
Lioncash
523e4be02c renderer_opengl: Use LOG_DEBUG for GL_DEBUG_SEVERITY_NOTIFICATION and GL_DEBUG_SEVERITY_LOW logs
LOG_TRACE is only enabled on debug builds which can be quite slow when
trying to debug graphics issues. Instead we can log the messages to the
debug log, which is available on both release and debug builds.
2018-08-21 00:23:09 -04:00
bunnei
fde3b1b6f2
Merge pull request #1133 from lioncash/guard
gl_stream_buffer: Add missing header guard
2018-08-20 23:37:55 -04:00
Lioncash
93a4097e9d gl_stream_buffer: Add missing header guard
Prevents potential compilation errors from occuring due to multiple
inclusions
2018-08-20 23:25:08 -04:00
Subv
e3bddf8137 Shaders: Implement depth writing in fragment shaders.
We'll write <last color output reg + 2> to gl_FragDepth.
2018-08-20 21:57:56 -05:00
bunnei
e33452f7e8
Merge pull request #1131 from bunnei/impl-tex3d-texcube
gl_shader_decompiler: Implement TextureCube/Texture3D for TEX/TEXS.
2018-08-20 22:15:18 -04:00
bunnei
5aaee2ff8d
Merge pull request #1106 from Subv/multiple_rendertargets
Shaders: Write all the enabled color outputs when a fragment shader exits.
2018-08-20 21:56:06 -04:00
bunnei
2ae88feea7 shader_bytecode: Replace some UNIMPLEMENTED logs. 2018-08-20 21:53:49 -04:00
bunnei
16db8b9d9f gl_shader_decompiler: Implement Texture3D for TEXS. 2018-08-20 21:53:18 -04:00
bunnei
948002635f gl_shader_decompiler: Implement TextureCube for TEX. 2018-08-20 21:53:00 -04:00
Subv
eac3cf301c Shaders: Fixed the coords in TEX with Texture2D.
The X and Y coordinates should be in gpr8 and gpr8+1, respectively.

This fixes the cutscene rendering in Sonic Mania.
2018-08-20 20:45:46 -05:00
Subv
fc5b489b0f Shaders: Log and crash when using an unimplemented texture type in a texture sampling instruction. 2018-08-20 20:44:56 -05:00
Subv
2b9eee4d1e GPU: Implemented the logic op functionality of the GPU.
This will ASSERT if blending is enabled at the same time as logic ops.
2018-08-20 18:44:47 -05:00
Subv
f24ab6d9e6 GLState: Allow enabling/disabling GL_COLOR_LOGIC_OP independently from blending. 2018-08-20 18:43:11 -05:00
Lioncash
46ef072cf9 rasterizer_interface: Remove ScreenInfo from AccelerateDraw()'s signature
This is an OpenGL renderer-specific data type. Given that, this type
shouldn't be used within the base interface for the rasterizer. Instead,
we can pass this information to the rasterizer via reference.
2018-08-20 19:43:05 -04:00
Subv
6bcdf37d4f GPU: Added registers for the logicop functionality. 2018-08-20 18:42:36 -05:00
Lioncash
bc16f7f3cc renderer_base: Make creation of the rasterizer, the responsibility of the renderers themselves
Given we use a base-class type within the renderer for the rasterizer
(RasterizerInterface), we want to allow renderers to perform more
complex initialization if they need to do such a thing. This makes it
important to reserve type information.

Given the OpenGL renderer is quite simple settings-wise, this is just a
simple shuffling of the initialization code. For something like Vulkan
however this might involve doing something like:

// Initialize and call rasterizer-specific function that requires
// the full type of the instance created.
auto raster = std::make_unique<VulkanRasterizer>(some, params);
raster->CallSomeVulkanRasterizerSpecificFunction();

// Assign to base class variable
rasterizer = std::move(raster)
2018-08-20 19:28:00 -04:00
fearlessTobi
ba8ff096fd Port #3353 from Citra 2018-08-21 01:14:06 +02:00
Subv
7784ce1854 Shaders: Write all the enabled color outputs when a fragment shader exits.
We were only writing to the first render target before.
Note that this is only the GLSL side of the implementation, supporting multiple render targets requires more changes in the OpenGL renderer.

Dual Source blending is not implemented and stuff that uses it might not work at all.
2018-08-20 17:31:25 -05:00
Subv
d7c68fbb12 Rasterizer: Reinterpret the raw texture bytes instead of blitting (and thus doing format conversion) to a new texture when a game requests an old texture address with a different format. 2018-08-20 15:20:35 -05:00
Subv
3fe77be392 Rasterizer: Don't attempt to copy over the old texture's data when doing a format reinterpretation if we're only going to clear the framebuffer. 2018-08-20 15:20:35 -05:00
bunnei
028d90eb79
Merge pull request #1104 from Subv/instanced_arrays
GLRasterizer: Implemented instanced vertex arrays.
2018-08-20 14:32:50 -04:00
bunnei
296e57fa0e
Merge pull request #1115 from Subv/texs_mask
Shaders/TEXS: Write to the correct output register when swizzling.
2018-08-20 14:31:33 -04:00
bunnei
b20ed93884
Merge pull request #1112 from Subv/sampler_types
Shaders: Use the correct shader type when sampling textures.
2018-08-20 14:30:45 -04:00
David Marcec
23d45715dc Implemented RGBA8_UINT
Needed by kirby
2018-08-20 22:26:54 +10:00
Subv
6cf719a4ab Shaders/TEXS: Fixed the component mask in the TEXS instruction.
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19 17:09:40 -05:00
bunnei
51ddb130c5
Merge pull request #1089 from Subv/neg_bits
Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
2018-08-19 17:01:48 -04:00
bunnei
9b17486be6
Merge pull request #1105 from Subv/convert_neg
Shader: Remove an unneeded assert, the negate bit is implemented for conversion instructions.
2018-08-19 17:01:20 -04:00
bunnei
0a1d4fbc5c
Merge pull request #1113 from Subv/texs_mask
Shaders/TEXS: Fixed the component mask in the TEXS instruction.
2018-08-19 17:00:59 -04:00
Subv
f7edbcd7a3 Shaders/TEXS: Fixed the component mask in the TEXS instruction.
Previously we could end up with a TEXS that didn't write any outputs, this was wrong.
2018-08-19 14:00:12 -05:00
bunnei
b0eb580931
Merge pull request #1102 from ogniK5377/mirror-clamp-edge
Added WrapMode MirrorOnceClampToEdge
2018-08-19 13:59:41 -04:00
bunnei
85da529f15
Merge pull request #1101 from Subv/ssy_stack
Shaders: Implemented a stack for the SSY/SYNC instructions.
2018-08-19 13:58:45 -04:00
Subv
7fb406c3fc Shader: Implemented the TLD4 and TLD4S opcodes using GLSL's textureGather.
It is unknown how TLD4S determines the sampler type, more research is needed.
2018-08-19 12:57:58 -05:00
Subv
3ef4b3d4b4 Shader: Use the right sampler type in the TEX, TEXS and TLDS instructions.
Different sampler types have their parameters in different registers.
2018-08-19 12:57:54 -05:00
Subv
73b937b190 Shader: Added bitfields for the texture type of the various sampling instructions. 2018-08-19 12:57:51 -05:00
Subv
656758fd81 Shaders: Added decodings for TLD4 and TLD4S 2018-08-19 12:57:08 -05:00
bunnei
29d4f8c2dd
Merge pull request #1109 from Subv/ldg_decode
Shaders: Added decodings for  the LDG and STG instructions.
2018-08-19 13:31:19 -04:00
bunnei
9baf5de90c
Merge pull request #1108 from Subv/front_facing
Shaders: Implemented the gl_FrontFacing input attribute (attr 63).
2018-08-19 13:21:14 -04:00
Subv
1b92ae136f Shaders: Added decodings for the LDG and STG instructions. 2018-08-19 00:46:34 -05:00
Subv
731701a2d2 Shaders: Implemented the gl_FrontFacing input attribute (attr 63). 2018-08-19 00:14:34 -05:00
Subv
9b1c49a9cf Shader: Remove an unneeded assert, the negate bit is implemented for conversion instructions. 2018-08-18 14:48:05 -05:00
Subv
e0f66c1fbf GLRasterizer: Implemented instanced vertex arrays.
Before each draw call, for every enabled vertex array configured as instanced, we take the current instance id and divide it by its configured divisor, then we multiply that by the corresponding stride and increment the start address by the resulting amount. This way we can simulate the vertex array being incremented once per instance without actually using OpenGL's instancing functions.
2018-08-18 14:42:26 -05:00
Subv
8335b2f115 Shader: Implemented the predicate and mode arguments of LOP.
The mode can be used to set the predicate to true depending on the result of the logic operation. In some cases, this means discarding the result (writing it to register 0xFF (Zero)).

This is used by Super Mario Odyssey.
2018-08-18 14:36:37 -05:00
David Marcec
71cc482bbd Added WrapMode MirrorOnceClampToEdge
Used by splatoon 2
2018-08-19 02:26:50 +10:00
Subv
ff358d97e8 Shaders: Implemented a stack for the SSY/SYNC instructions.
The SSY instruction pushes an address into the stack, and the SYNC instruction pops it. The current stack depth is 20, we should figure out if this is enough or not.
2018-08-18 10:48:12 -05:00
Subv
2e95ba2e9c Shaders: Corrected the 'abs' and 'neg' bit usage in the float arithmetic instructions.
We should definitely audit our shader generator for more errors like this.
2018-08-18 10:22:42 -05:00
David Marcec
63dff47e22 Added predcondition GreaterThanWithNan 2018-08-18 17:49:59 +10:00
bunnei
504cff2b7a
Merge pull request #1096 from bunnei/supported-blits
gl_rasterizer_cache: Remove asserts for supported blits.
2018-08-17 22:41:53 -04:00
bunnei
e341d868ee gl_rasterizer_cache: Remove asserts for supported blits. 2018-08-17 00:10:08 -04:00
bunnei
da7226442f renderer_opengl: Treat OpenGL errors as critical. 2018-08-17 00:09:27 -04:00
bunnei
727136a9c9
Merge pull request #1019 from Subv/vertex_divisor
Rasterizer: Manually implemented instanced rendering.
2018-08-17 00:07:06 -04:00
bunnei
89c3d6a2a3 gl_rasterizer_cache: Treat Depth formats differently from DepthStencil. 2018-08-15 21:24:04 -04:00
Subv
91140f6c0a Shader/Conversion: Implemented the negate bit in F2F and I2I instructions. 2018-08-15 09:27:43 -05:00
Subv
38592a3b5e Shader/I2F: Implemented the negate I2F_C instruction variant. 2018-08-15 09:25:02 -05:00
Subv
40ecdda19e Shader/F2I: Implemented the negate bit in the I2F instruction 2018-08-15 09:18:55 -05:00
Subv
5ef447cc0e Shader/F2I: Implemented the F2I_C instruction variant. 2018-08-15 09:16:35 -05:00
Subv
11c221cc62 Shader/F2I: Implemented the negate bit in the F2I instruction. 2018-08-15 09:15:55 -05:00
bunnei
40f83fee6a
Merge pull request #1077 from bunnei/rgba16u
gl_rasterizer_cache: Add RGBA16U to PixelFormatFromTextureFormat.
2018-08-15 09:25:15 -04:00
bunnei
b1148d269d gl_rasterizer_cache: Cleanup some PixelFormat names and logging. 2018-08-14 23:31:45 -04:00
Subv
c5284efd4f Rasterizer: Implemented instanced rendering.
We keep track of the current instance and update an uniform in the shaders to let them know which instance they are.

Instanced vertex arrays are not yet implemented.
2018-08-14 22:25:07 -05:00
bunnei
8599e1e4fc gl_rasterizer_cache: Add RGBA16U to PixelFormatFromTextureFormat.
- Used by Breath of the Wild.
2018-08-14 23:18:34 -04:00
bunnei
3aad82b1a3
Merge pull request #1069 from bunnei/vtx-sz
maxwell_to_gl: Properly handle UnsignedInt/SignedInt sizes.
2018-08-14 23:14:44 -04:00
bunnei
2a42dea568
Merge pull request #1070 from bunnei/cbuf-sz
gl_rasterizer: Fix upload size for constant buffers.
2018-08-14 23:14:24 -04:00
bunnei
c8cd1785e6
Merge pull request #1071 from bunnei/fix-ldc
gl_shader_decompiler: Several fixes for indirect constant buffer loads.
2018-08-14 23:14:09 -04:00
bunnei
991eb4824c
Merge pull request #1068 from bunnei/g8r8s
gl_rasterizer_cache: Implement G8R8S format.
2018-08-14 23:13:43 -04:00
greggameplayer
6eda9ebbdb Implement Z16_UNORM in PixelFormatFromTextureFormat function
Require by Zelda Breath Of The Wild
2018-08-15 04:14:15 +02:00
bunnei
5e66a24423 gl_shader_decompiler: Several fixes for indirect constant buffer loads. 2018-08-14 20:47:50 -04:00
bunnei
290439a6a5 gl_rasterizer: Fix upload size for constant buffers. 2018-08-14 20:44:19 -04:00
bunnei
dc876fd63a maxwell_to_gl: Properly handle UnsignedInt/SignedInt sizes. 2018-08-14 20:43:02 -04:00
bunnei
d8fd3ef4fe gl_rasterizer_cache: Implement G8R8S format.
- Used by Super Mario Odyssey.
2018-08-14 20:41:49 -04:00
bunnei
4dacb8a4b1
Merge pull request #1058 from greggameplayer/BC7U_Fix
Fix BC7U
2018-08-14 08:03:07 -04:00
greggameplayer
6bfcf13187 Fix BC7U 2018-08-14 02:36:00 +02:00
bunnei
6e52f37d5b renderer_opengl: Implement RenderTargetFormat::RGBA16_UNORM.
- Used by Breath of the Wild.
2018-08-13 18:20:07 -04:00
bunnei
46fbf6dd92
Merge pull request #1052 from ogniK5377/xeno
Implement RG32UI and R32UI
2018-08-13 12:31:39 -04:00
David Marcec
45cc022ea9 Implement RG32UI and R32UI
Needed for xenoblade
2018-08-13 22:55:16 +10:00
bunnei
41b77c4e0a maxwell_to_gl: Implement VertexAttribute::Size::Size_8.
- Used by Breath of the Wild.
2018-08-13 01:34:21 -04:00
bunnei
bdf17fe0cc renderer_opengl: Implement RenderTargetFormat::RGBA16_UINT.
- Used by Breath of the Wild.
2018-08-13 00:06:22 -04:00
bunnei
54ef9302a2
Merge pull request #1045 from bunnei/rg8-unorm
renderer_opengl: Implement RenderTargetFormat::RG8_UNORM.
2018-08-13 00:05:25 -04:00
bunnei
8fe118bcaa maxwell_to_gl: Implement PrimitiveTopology::LineStrip.
- Used by Breath of the Wild.
2018-08-12 23:09:32 -04:00
bunnei
c56a0e3c34 renderer_opengl: Implement RenderTargetFormat::RG8_UNORM.
- Used by Breath of the Wild.
2018-08-12 23:08:50 -04:00
bunnei
534abf9d97 gl_shader_decompiler: Implement XMAD instruction. 2018-08-12 18:30:24 -04:00
Markus Wick
0eb39922f6 gl_rasterizer: Use a shared helper to upload from CPU memory. 2018-08-12 16:10:26 +02:00
Markus Wick
0af7e93763 gl_state: Don't track constant buffer mappings. 2018-08-12 16:10:26 +02:00
Markus Wick
6ff7906ddc gl_rasterizer: Use the stream buffer for constant buffers. 2018-08-12 16:10:26 +02:00
Markus Wick
ce722e317b gl_rasterizer: Use the streaming buffer itself for the constant buffer.
Don't emut copies, especially not for data, which is used once. They just end in a huge GPU overhead.
2018-08-12 15:48:59 +02:00
Markus Wick
6f6bba3ff1 gl_rasterizer: Use a helper for aligning the buffer. 2018-08-12 15:47:35 +02:00
Markus Wick
d7298ec262 Update the stream_buffer helper from Citra.
Please see https://github.com/citra-emu/citra/pull/3666 for more details.
2018-08-12 15:47:35 +02:00
bunnei
639ebb39f6 gl_shader_decompiler: Fix SetOutputAttributeToRegister empty check. 2018-08-12 02:22:42 -04:00
bunnei
c68aa65226 gl_shader_decompiler: Fix GLSL compiler error with KIL instruction. 2018-08-12 00:06:48 -04:00
bunnei
ee07041b3a
Merge pull request #1020 from lioncash/namespace
core: Namespace EmuWindow
2018-08-11 22:40:08 -04:00
bunnei
9c977d2215
Merge pull request #1021 from lioncash/warn
gl_rasterizer: Silence implicit truncation warning in SetupShaders()
2018-08-11 22:39:46 -04:00
bunnei
f2c7b5dcd6
Merge pull request #1024 from Subv/blend_gl
GPU/Maxwell3D: Implemented an alternative set of blend factors.
2018-08-11 22:39:02 -04:00
bunnei
d37da52cb3
Merge pull request #1023 from Subv/invalid_attribs
RasterizerGL: Ignore invalid/unset vertex attributes.
2018-08-11 22:18:40 -04:00
Subv
969326bd58 GPU/Maxwell3D: Implemented an alternative set of blend factors.
These are used by nouveau and some games like SMO.
2018-08-11 20:57:16 -05:00
greggameplayer
224071a652 Implement R8_UINT RenderTargetFormat & PixelFormat (#1014)
- Used by Go Vacation
2018-08-11 21:44:42 -04:00
Subv
2dad1204e8 RasterizerGL: Ignore invalid/unset vertex attributes.
This should make the es2gears example not crash anymore.
2018-08-11 20:36:40 -05:00
Lioncash
28e90fa0e0 gl_rasterizer: Silence implicit truncation warning in SetupShaders()
Previously this would warn of truncating a std::size_t to a u32. This is
safe because we'll obviously never have more than UINT32_MAX amount of
uniform buffers.
2018-08-11 20:32:03 -04:00
Lioncash
0a93b45b6a core: Namespace EmuWindow
Gets the class out of the global namespace.
2018-08-11 20:20:21 -04:00
bunnei
403dfd68fc
Merge pull request #1010 from bunnei/unk-vert-attrib-shader
gl_shader_decompiler: Improve handling of unknown input/output attributes.
2018-08-11 19:56:28 -04:00
bunnei
c519354506
Merge pull request #1009 from bunnei/rg8-rgba8-snorm
Implement render target formats RGBA8_SNORM and RG8_SNORM.
2018-08-11 19:55:41 -04:00
bunnei
0b668d5ff3 gl_shader_decompiler: Improve handling of unknown input/output attributes. 2018-08-11 19:26:45 -04:00
bunnei
670a2c1f80
Merge pull request #1018 from Subv/ssy_sync
GPU/Shader: Implemented SSY and SYNC as a set_target/jump pair.
2018-08-11 19:10:02 -04:00
bunnei
88ffa422d4 gl_rasterizer: Implement render target format RG8_SNORM.
- Used by Super Mario Odyssey.
2018-08-11 19:06:42 -04:00
bunnei
0471976b48 gl_rasterizer: Implement render target format RGBA8_SNORM.
- Used by Super Mario Odyssey.
2018-08-11 18:59:14 -04:00
Subv
c1ad973881 GPU/Shader: Don't predicate instructions that don't have a predicate field (SSY). 2018-08-11 16:00:14 -05:00
Subv
305a05f820 GPU/Shaders: Implemented SSY and SYNC as a way to modify control flow during shader execution.
SSY sets the target label to jump to when the SYNC instruction is executed.
2018-08-11 15:55:11 -05:00
bunnei
d64303d185
Merge pull request #1016 from lioncash/video
video_core: Get rid of global variable g_toggle_framelimit_enabled
2018-08-11 14:10:55 -04:00
bunnei
b8b9f41b6b
Merge pull request #1003 from lioncash/var
video_core: Use variable template variants of type_traits interfaces where applicable
2018-08-11 14:08:12 -04:00
greggameplayer
dfcde52f39 Implement R16S & R16UI & R16I RenderTargetFormats & PixelFormats and more (R16_UNORM needed by Fate Extella) (#848)
* Implement R16S & R16UI & R16I RenderTargetFormats & PixelFormats


Do a separate function in order to get Bytes Per Pixel of DepthFormat


Apply the new function in gpu.h


delete unneeded white space

* correct merging error
2018-08-11 14:01:50 -04:00
Lioncash
20c2928c2b video_core; Get rid of global g_toggle_framelimit_enabled variable
Instead, we make a struct for renderer settings and allow the renderer
to update all of these settings, getting rid of the need for
global-scoped variables.

This also uncovered a few indirect inclusions for certain headers, which
this commit also fixes.
2018-08-10 19:00:09 -04:00
Lioncash
f380496728 renderer_base: Remove unused kFramebuffer enumeration
This is entirely unused and can be removed.
2018-08-10 18:31:13 -04:00
Lioncash
2e80e7480d video_core: Remove unused Renderer enumeration
Currently we only have an OpenGL renderer, so this is unused in code
(and occupies the Renderer identifier in the VideoCore namespace).
2018-08-10 18:27:40 -04:00
bunnei
6b0bc48a42 maxwell_to_gl: Implement VertexAttribute::Size::Size_8_8.
- Used by Super Mario Odyssey.
2018-08-10 12:47:00 -04:00
bunnei
a5b65df9cf maxwell_to_gl: Implement VertexAttribute::Size::Size_32_32_32.
- Used by Super Mario Odyssey.
2018-08-10 12:46:49 -04:00
bunnei
57626fda7b
Merge pull request #1004 from lioncash/unused
gl_rasterizer_cache: Remove unused viewport parameter of GetFramebufferSurfaces()
2018-08-10 12:13:32 -04:00
bunnei
6313d54cef
Merge pull request #1008 from yuzu-emu/revert-697-disable-depth-cull
Revert "gl_state: Temporarily disable culling and depth test."
2018-08-10 12:13:09 -04:00
bunnei
2156cb3cbe
Revert "gl_state: Temporarily disable culling and depth test." 2018-08-10 10:39:46 -04:00
Lioncash
0e1510ac29 gl_rasterizer_cache: Remove unused viewport parameter of GetFramebufferSurfaces() 2018-08-09 20:55:41 -04:00
Lioncash
b8c43b6080 video_core: Use variable template variants of type_traits interfaces where applicable 2018-08-09 20:45:48 -04:00
bunnei
3a67876252 textures: Refactor out for Texture/Depth FormatFromPixelFormat. 2018-08-09 20:36:03 -04:00
bunnei
6828c25498
Merge pull request #995 from bunnei/gl-buff-bounds
gl_rasterizer_cache: Add bounds checking for gl_buffer copies.
2018-08-09 20:23:30 -04:00
bunnei
05c33d89a1
Merge pull request #1001 from lioncash/reserve
gl_shader_decompiler: Reserve element memory beforehand in BuildRegisterList()
2018-08-09 19:27:35 -04:00
bunnei
e8c52d4c89 gl_rasterizer_cache: Add bounds checking for gl_buffer copies. 2018-08-09 19:20:17 -04:00
bunnei
37e1ed3744
Merge pull request #991 from bunnei/ignore-mac
maxwell_3d: Ignore macros that have not been uploaded yet.
2018-08-09 19:16:28 -04:00
Khangaroo
75e12a33ae Implement SNORM for BC5/DXN2 (#998)
* Implement BC5/DXN2 (#996)

- Used by Kirby Star Allies.

* Implement BC5/DXN2 SNORM

UNORM for Kirby Star Allies
SNORM for Super Mario Odyssey
2018-08-09 19:15:32 -04:00
Lioncash
6ef027b958 gl_shader_decompiler: Reserve element memory beforehand in BuildRegisterList()
Avoids potentially perfoming multiple reallocations when we know the
total amount of memory we need beforehand.
2018-08-09 17:29:11 -04:00
Lioncash
59ea37daa7 gl_rasterizer_cache: Avoid iterator invalidation issues within InvalidateRegion()
A range-based for loop can't be used when the container being iterated
is also being erased from.
2018-08-09 15:30:20 -04:00
bunnei
0bfe974281
Merge pull request #992 from bunnei/declr-pred
gl_shader_decompiler: Declare predicates on use.
2018-08-09 14:36:52 -04:00
bunnei
88b18b9ba4
Merge pull request #994 from lioncash/const
gl_rasterizer_cache: Use std::vector::assign vs resize() then copy for the non-tiled case
2018-08-09 14:36:06 -04:00
bunnei
b125137493
Merge pull request #993 from bunnei/smo-vtx-pts
Implement VertexAttribute::Size::Size_16_16_16_16 and PrimitiveTopology::Points.
2018-08-09 13:28:14 -04:00
bunnei
f765a6b902
Merge pull request #984 from bunnei/rt-none
gl_rasterizer: Do not render when no render target is configured.
2018-08-09 13:12:28 -04:00
Khangaroo
5cb6eceecf Implement BC5/DXN2 (#996)
- Used by Kirby Star Allies.
2018-08-09 12:57:13 -04:00
bunnei
c333bfc193
Merge pull request #977 from bunnei/bgr565
gl_rasterizer_cached: Implement RenderTargetFormat::B5G6R5_UNORM.
2018-08-08 23:43:04 -04:00
Lioncash
e831b80d69 gl_rasterizer_cache: Invert conditional in LoadGLBuffer()
It's generally easier to follow code using conditionals that operate in
terms of the true case followed by the false case (no chance of
overlooking the exclamation mark).
2018-08-08 23:34:57 -04:00
Lioncash
434f352eb3 gl_rasterizer_cache: Use std::vector::assign in LoadGLBuffer() for the non-tiled case
resize() causes the vector to expand and zero out the added members to
the vector, however we can avoid this zeroing by using assign().

Given we have the pointer to the data we want to copy, we can calculate
the end pointer and directly copy the range of data without the
need to perform the resize() beforehand.
2018-08-08 23:34:58 -04:00
bunnei
dfc3eed0cb maxwell_to_gl: Implement VertexAttribute::Size::Size_16_16_16_16.
- Used by Super Mario Odyssey (in game).
2018-08-08 23:28:17 -04:00
bunnei
06d0b96ca9 maxwell_to_gl: Implement PrimitiveTopology::Points.
- Used by Super Mario Odyssey (in game).
2018-08-08 23:28:00 -04:00
bunnei
4283019aa0 gl_shader_decompiler: Declare predicates on use.
- Used by Super Mario Odyssey (when going in game).
2018-08-08 23:26:31 -04:00
bunnei
efe6b473c5 maxwell_3d: Ignore macros that have not been uploaded yet.
- Used by Super Mario Odyssey (in game).
2018-08-08 23:25:37 -04:00
Lioncash
557c466994 gl_rasterizer_cache: Make pointer const in LoadGLBuffer()
This is only ever read from, so we can make the data it's pointing to
const.
2018-08-08 23:14:57 -04:00
bunnei
25ba4d1b68
Merge pull request #982 from bunnei/stub-unk-63
gl_shader_decompiler: Stub input attribute Unknown_63.
2018-08-08 22:28:18 -04:00
bunnei
ddec200290 gl_rasterizer: Do not render when no render target is configured.
- Used by Super Mario Odyssey.
2018-08-08 19:29:45 -04:00
bunnei
cf917a5e93
Merge pull request #976 from bunnei/shader-imm
gl_shader_decompiler: Let OpenGL interpret floats.
2018-08-08 19:17:01 -04:00
bunnei
9ceceb212f
Merge pull request #981 from bunnei/cbuf-corrupt
maxwell_3d: Use correct const buffer size and check bounds.
2018-08-08 19:16:34 -04:00
bunnei
cc2526dd51
Merge pull request #985 from bunnei/rt-r11g11b10
gpu: Add R11G11B10_FLOAT to RenderTargetBytesPerPixel.
2018-08-08 18:21:34 -04:00
bunnei
096b04f1a4
Merge pull request #979 from bunnei/vtx88
maxwell_to_gl: Implement VertexAttribute::Size::Size_8_8.
2018-08-08 18:18:50 -04:00
bunnei
7bf422d58c gpu: Add R11G11B10_FLOAT to RenderTargetBytesPerPixel.
- Used by Super Mario Odyssey.
2018-08-08 02:42:14 -04:00
bunnei
7f0d0a93f7 gl_shader_decompiler: Stub input attribute Unknown_63. 2018-08-08 02:35:59 -04:00
bunnei
57982df105 maxwell_3d: Use correct const buffer size and check bounds.
- Fixes mem corruption with Super Mario Odyssey and Pokkén Tournament DX.
2018-08-08 02:10:25 -04:00
bunnei
8c6338b6f9 renderer_opengl: Use trace log in a few places. 2018-08-08 01:53:23 -04:00
bunnei
c120ed7d18 maxwell_to_gl: Implement VertexAttribute::Size::Size_8_8. 2018-08-08 01:50:53 -04:00
bunnei
aaf8d9ac2f gl_rasterizer_cached: Implement RenderTargetFormat::B5G6R5_UNORM.
- Used by Super Mario Odyssey.
2018-08-08 01:48:27 -04:00
bunnei
e542356d0c gl_shader_decompiler: Let OpenGL interpret floats.
- Accuracy is lost in translation to string, e.g. with NaN.
- Needed for Super Mario Odyssey.
2018-08-08 01:45:23 -04:00
bunnei
4fa3511a63
Merge pull request #964 from Hexagon12/lower-logs
Lowered down the logging for command processor methods
2018-08-07 19:00:19 -04:00
Hexagon12
7139f05fc5 Fixed the sRGB pixel format (#963)
* Changed the sRGB pixel format return

* Add a message about SRGBA -> RGBA conversion
2018-08-07 18:59:50 -04:00
Hexagon12
bc6d91a103 Lowered down the logging for methods 2018-08-07 19:51:40 +03:00
bunnei
904d7eaa94 maxwell_3d: Remove outdated assert. 2018-08-05 23:57:19 -04:00
bunnei
57eb936200 gl_rasterizer_cache: Avoid superfluous surface copies. 2018-08-05 23:40:03 -04:00
bunnei
c8e5c74092
Merge pull request #927 from bunnei/fix-texs
gl_shader_decompiler: Fix TEXS mask and dest.
2018-08-05 16:42:21 -04:00
bunnei
c0af42d6eb
Merge pull request #912 from lioncash/global-var
video_core: Eliminate the g_renderer global variable
2018-08-05 16:37:39 -04:00
bunnei
fd715e54a1 gl_shader_decompiler: Fix TEXS mask and dest. 2018-08-05 01:47:09 -04:00
David Marcec
b96010bfa9 added braces for conditions 2018-08-05 11:36:55 +10:00
David Marcec
6d1e30e041 fix the attrib format for ints 2018-08-05 11:29:21 +10:00
bunnei
13d6593753
Merge pull request #919 from lioncash/sign
gl_shader_manager: Amend sign differences in an assertion comparison in SetShaderUniformBlockBinding()
2018-08-04 14:29:59 -04:00
Lioncash
3b678b9e8e gl_shader_manager: Invert conditional in SetShaderUniformBlockBinding()
This lets us indent the majority of the code and places the error case
first.
2018-08-04 02:57:11 -04:00
Lioncash
dde5dce736 gl_shader_manager: Amend sign differences in an assertion comparison in SetShaderUniformBlockBinding()
Ensures both operands have the same sign in the comparison.

While we're at it, we can get rid of the redundant casting of ub_size to
an int. This type will always be trivial and alias a built-in type (not
doing so would break backwards compatibility at a standard level).
2018-08-04 02:55:03 -04:00
Lioncash
2665457f4a renderer_base: Make Rasterizer() return the rasterizer by reference
All calling code assumes that the rasterizer will be in a valid state,
which is a totally fine assumption. The only way the rasterizer wouldn't
be is if initialization is done incorrectly or fails, which is checked
against in System::Init().
2018-08-04 02:36:58 -04:00
Lioncash
6030c5ce41 video_core: Eliminate the g_renderer global variable
We move the initialization of the renderer to the core class, while
keeping the creation of it and any other specifics in video_core. This
way we can ensure that the renderer is initialized and doesn't give
unfettered access to the renderer. This also makes dependencies on types
more explicit.

For example, the GPU class doesn't need to depend on the
existence of a renderer, it only needs to care about whether or not it
has a rasterizer, but since it was accessing the global variable, it was
also making the renderer a part of its dependency chain. By adjusting
the interface, we can get rid of this dependency.
2018-08-04 02:36:57 -04:00
bunnei
762fcaf5de
Merge pull request #911 from lioncash/prototype
video_core: Remove unimplemented Start() function prototype
2018-08-04 02:18:38 -04:00
bunnei
29f31356d8
Merge pull request #910 from lioncash/unused
gl_shader_decompiler: Remove unused variable in GenerateDeclarations()
2018-08-03 15:54:11 -04:00
Lioncash
b4e050e6c4 video_core: Remove unimplemented Start() function prototype
Given this has no definition, we can just remove it entirely.
2018-08-03 12:48:14 -04:00
Lioncash
b45e5c2399 gl_shader_decompiler: Remove unused variable in GenerateDeclarations()
This variable was being incremented, but we were never actually using
it.
2018-08-03 12:18:31 -04:00
Lioncash
555d76d065 gl_shader_manager: Make ProgramManager's GetCurrentProgramStage() a const member function
This function doesn't modify class state, so it can be made const.
2018-08-03 12:08:17 -04:00
bunnei
00ba704a7f
Merge pull request #892 from lioncash/global
video_core: Make global EmuWindow instance part of the base renderer …
2018-08-03 00:31:32 -04:00
bunnei
52da0ce399
Merge pull request #901 from lioncash/ref
gl_shader_manager: Take ShaderSetup instances by const reference in UseProgrammableVertexShader() and UseProgrammableFragmentShader()
2018-08-02 23:00:56 -04:00
bunnei
bae1822aed
Merge pull request #902 from lioncash/array
gl_state: Make texture_units a std::array
2018-08-02 14:57:42 -04:00
greggameplayer
fe64e1d38e Implement RGB32F PixelFormat (#886) (used by Go Vacation) 2018-08-02 14:56:38 -04:00
Lioncash
6b32e24161 gl_state: Make texture_units a std::array
Gets rid of the use of a raw C array.
2018-08-02 11:19:58 -04:00
Lioncash
d92e8ab062 gl_shader_manager: Take ShaderSetup instances by const reference in UseProgrammableVertexShader() and UseProgrammableFragmentShader()
Avoids performing unnecessary copies of 65560 byte sized ShaderSetup
instances, considering it's only used as part of lookup and not
modified.

Given the parameters were already const, it's likely taking these
parameters by reference was intended but the ampersand was forgotten.
2018-08-02 11:09:46 -04:00
Lioncash
0f2ac928f2 video_core: Make global EmuWindow instance part of the base renderer class
Makes the global a member of the RendererBase class. We also change this
to be a reference. Passing any form of null pointer to these functions
is incorrect entirely, especially given the code itself assumes that the
pointer would always be in a valid state.

This also makes it easier to follow the lifecycle of instances being
used, as we explicitly interact the renderer with the rasterizer, rather
than it just operating on a global pointer.
2018-08-01 21:40:30 -04:00
Unknown
0d8fcab136 Implement R32_FLOAT RenderTargetFormat 2018-08-01 15:31:42 +02:00
bunnei
3575c076cb
Merge pull request #869 from Subv/ubsan
Corrected a few error cases detected by asan/ubsan
2018-07-31 09:24:13 -07:00
Subv
8191273a3d MacroInterpreter: Avoid left shifting negative values.
The branch target is signed, so multiply by 4 instead of left shifting by 2
2018-07-30 20:38:24 -05:00
bunnei
e013fdc2b2
Merge pull request #808 from lioncash/mem-dedup
video_core/memory_manager: Avoid repeated unnecessary page slot lookups
2018-07-26 11:50:27 -07:00
Subv
f85cff0f48 GPU: Allow using R16F as a render target format. 2018-07-26 08:52:21 -05:00
Unknown
4672a01cbf Implement R16_G16
correct trailing white spaces


Delete tabs


correct placement
Add RG16F & RG16UI & RG16I & RG16S PixelFormats
Return correct data according to changes done previously
correct PixelFormat declaration
correct coding style error
correct coding style error part 2
correct RG16S Declaration error
correct alignment
2018-07-26 02:01:29 +02:00
bunnei
c88382517c
Merge pull request #819 from Subv/srgb
GPU: Use the right texture format for sRGBA framebuffers.
2018-07-25 14:47:26 -07:00
Subv
c5b838aeef GPU: Use the right texture format for sRGBA framebuffers. 2018-07-25 09:52:39 -05:00
Subv
ee8123bf13 GPU: Allow the use of Z24S8 as a texture format. 2018-07-25 09:41:24 -05:00
bunnei
0686183c3e
Merge pull request #816 from Subv/z32_s8
GPU: Implemented the Z32_S8_X24 depth buffer format.
2018-07-25 07:37:00 -07:00
bunnei
af787744ab
Merge pull request #815 from Subv/z32f_tex
GPU: Allow using Z32 as a texture format.
2018-07-25 07:33:09 -07:00
bunnei
704824d50a
Merge pull request #814 from Subv/rt_r8
GPU: Allow the usage of R8 as a render target format.
2018-07-25 07:32:18 -07:00
bunnei
a6ea6febc9
Merge pull request #809 from lioncash/rasterizer
gl_rasterizer: Minor cleanup
2018-07-24 19:31:34 -07:00
bunnei
e0106a7d68
Merge pull request #811 from Subv/code_address_assert
GPU: Remove the assert that required the CODE_ADDRESS to be 0.
2018-07-24 19:31:09 -07:00
Subv
daf2504d31 GPU: Implemented the Z32_S8_X24 depth buffer format. 2018-07-24 20:41:40 -05:00
Subv
f747a7e35d GPU: Allow using Z32 as a texture format. 2018-07-24 19:54:23 -05:00
Subv
4f574201ea GPU: Allow the usage of R8 as a render target format. 2018-07-24 19:49:36 -05:00
Subv
8f2c4191ab GPU: Remove the assert that required the CODE_ADDRESS to be 0.
Games usually just leave it at 0 but nouveau sets it to something else. This already works fine, the assert is useless.
2018-07-24 13:54:12 -05:00
Subv
4cc1e180ec GPU: Implemented the R16 and R16F texture formats. 2018-07-24 13:39:16 -05:00
Lioncash
0162f8b3a7 gl_rasterizer: Replace magic number with GL_INVALID_INDEX in SetupConstBuffers()
This is just the named constant that OpenGL provides, so we can use that
instead of using a literal -1
2018-07-24 12:24:49 -04:00
Lioncash
16139ed53b gl_rasterizer: Use std::string_view instead of std::string when checking for extensions
We can avoid heap allocations here by just using a std::string_view
instead of performing unnecessary copying of the string data.
2018-07-24 12:10:37 -04:00
Lioncash
b5eb3905cd gl_rasterizer: Use in-class member initializers where applicable
We can just assign to the members directly in these cases.
2018-07-24 12:08:12 -04:00
Lioncash
bf608f125e video_core/memory_manager: Replace a loop with std::array's fill() function in PageSlot()
We already have a function that does what this code was doing, so let's
use that instead.
2018-07-24 11:56:30 -04:00
Lioncash
d71e19fd75 video_core/memory_manager: Avoid repeated unnecessary page slot lookups
We don't need to keep calling the same function over and over again in a
loop, especially when the behavior is slightly non-trivial. We can just
keep a reference to the looked up location and do all the checking and
assignments based off it instead.
2018-07-24 11:19:54 -04:00
bunnei
0f830d08f1
Merge pull request #799 from Subv/tex_r32f
GPU: Implement texture format R32F.
2018-07-24 04:46:07 -07:00
bunnei
b70f757913
Merge pull request #796 from bunnei/gl-uint
maxwell_to_gl: Implement VertexAttribute::Type::UnsignedInt.
2018-07-24 04:44:56 -07:00
bunnei
69c45ce71c gl_rasterizer: Implement texture border color. 2018-07-23 23:34:42 -04:00
bunnei
6b3e54621f maxwell_to_gl: Implement Texture::WrapMode::Border. 2018-07-23 23:34:41 -04:00
Subv
ccc42702b5 GPU: Implement texture format R32F. 2018-07-23 22:21:31 -05:00
bunnei
1ff3bea6c7
Merge pull request #791 from bunnei/rg32f-rgba32f-bgra8
gl_rasterizer_cache: Implement formats BGRA8_UNORM/RGBA32_FLOAT/RG32_FLOAT
2018-07-23 20:13:19 -07:00
bunnei
2ff86f5765 maxwell_to_gl: Implement VertexAttribute::Type::UnsignedInt. 2018-07-23 23:12:14 -04:00
bunnei
92304181d5
Merge pull request #792 from lioncash/retval
gl_shader_decompiler: Correct return value of WriteTexsInstruction()
2018-07-23 20:06:48 -07:00
bunnei
47ac369180
Merge pull request #790 from bunnei/shader-print-instr
gl_shader_decompiler: Print instruction value in shader comments.
2018-07-23 19:48:47 -07:00
bunnei
c2b4ff5d48
Merge pull request #788 from bunnei/shader-check-zero
gl_shader_decompiler: Check if SetRegister result is ZeroIndex.
2018-07-23 19:44:05 -07:00
Lioncash
33e2033af5 gl_shader_decompiler: Correct return value of WriteTexsInstruction()
This should be returning void, not a std::string
2018-07-23 22:31:58 -04:00
bunnei
9505283989 gl_shader_decompiler: Implement shader instruction TLDS. 2018-07-23 22:02:51 -04:00
bunnei
a27c0099ed gl_rasterizer_cache: Implement RenderTargetFormat RG32_FLOAT. 2018-07-23 21:22:54 -04:00
bunnei
3a19c1098d gl_rasterizer_cache: Implement RenderTargetFormat RGBA32_FLOAT. 2018-07-23 21:22:53 -04:00
bunnei
bcc184acfa gl_rasterizer_cache: Implement RenderTargetFormat BGRA8_UNORM. 2018-07-23 21:22:44 -04:00
bunnei
89db8c2171 gl_rasterizer_cache: Add missing log statements. 2018-07-23 21:20:09 -04:00
bunnei
c4322ce87e gl_shader_decompiler: Print instruction value in shader comments. 2018-07-23 21:11:05 -04:00
bunnei
81aa02424b gl_shader_decompiler: Check if SetRegister result is ZeroIndex. 2018-07-23 21:08:40 -04:00
Lioncash
3b88ce3dcb gl_shader_decompiler: Simplify GetCommonDeclarations() 2018-07-23 17:11:18 -04:00
bunnei
e85a528bb9
Merge pull request #769 from bunnei/shader-mask-fixes
shader_bytecode: Implement other TEXS masks.
2018-07-22 18:03:31 -07:00
Lioncash
0797657bc0 gl_shader_decompiler: Remove redundant Subroutine construction in AddSubroutine()
We don't need to toss away the Subroutine instance after the find() call
and reconstruct another instance with the same data right after it.
Particularly give Subroutine contains a std::set.
2018-07-22 03:30:35 -04:00
bunnei
148a5bef7e shader_bytecode: Implement other TEXS masks. 2018-07-22 03:23:15 -04:00
bunnei
af4bde8cd1
Merge pull request #767 from bunnei/shader-cleanup
gl_shader_decompiler: Remove unused state tracking and minor cleanup.
2018-07-22 00:03:17 -07:00
bunnei
f5a2944ab6 gl_shader_decompiler: Remove unused state tracking and minor cleanup. 2018-07-22 01:00:44 -04:00
bunnei
c43eaa94f3 gl_shader_decompiler: Implement SEL instruction. 2018-07-22 00:37:12 -04:00
bunnei
63fbf9a7d3 gl_rasterizer_cache: Blit surfaces on recreation instead of flush and load. 2018-07-21 21:51:06 -04:00
bunnei
4301f0b539 gl_rasterizer_cache: Use GPUVAddr as cache key, not parameter set. 2018-07-21 21:51:06 -04:00
bunnei
cd47391c2d gl_rasterizer_cache: Use zeta_width and zeta_height registers for depth buffer. 2018-07-21 21:51:06 -04:00
bunnei
d8c60029d6 gl_rasterizer: Use zeta_enable register to enable depth buffer. 2018-07-21 21:51:06 -04:00
bunnei
5287991a36 maxwell_3d: Add depth buffer enable, width, and height registers. 2018-07-21 21:51:05 -04:00
bunnei
3ac736c003
Merge pull request #748 from lioncash/namespace
video_core: Use nested namespaces where applicable
2018-07-21 18:50:14 -07:00
bunnei
ff8754f921
Merge pull request #747 from lioncash/unimplemented
gl_shader_manager: Remove unimplemented function prototype
2018-07-21 10:54:58 -07:00
Lioncash
d5bc9aef4e gl_shader_manager: Replace unimplemented function prototype
This was just a linker error waiting to happen.
2018-07-20 18:39:54 -04:00
Lioncash
863579736c gpu: Rename Get3DEngine() to Maxwell3D()
This makes it match its const qualified equivalent.
2018-07-20 18:34:49 -04:00
Lioncash
bb960c8cb4 video_core: Use nested namespaces where applicable
Compresses a few namespace specifiers to be more compact.
2018-07-20 18:23:54 -04:00
bunnei
29f49bd3c1
Merge pull request #738 from lioncash/sign
gl_state: Get rid of mismatched sign conversions in Apply()
2018-07-20 09:21:57 -07:00
bunnei
fbc2bcd4a9
Merge pull request #735 from lioncash/video-unused
maxwell_3d: Remove unused variable within GetStageTextures()
2018-07-20 09:16:15 -07:00
bunnei
204d707ce7
Merge pull request #731 from lioncash/shadow
gl_shader_decompiler: Eliminate variable and declaration shadowing
2018-07-20 09:13:36 -07:00
Lioncash
0faa13baeb gl_state: Make references const where applicable in Apply() 2018-07-20 01:12:29 -04:00
Lioncash
e6b3d3a9ea gl_state: Get rid of mismatched sign conversions
While we're at it, amend the loop variable type to be the same width as
that returned by the .size() call.
2018-07-20 01:11:20 -04:00
Lioncash
8b08f82dc7 maxwell_3d: Remove unused variable within GetStageTextures() 2018-07-19 22:38:28 -04:00
Lioncash
f26866ff6a gl_shader_decompiler: Eliminate variable and declaration shadowing
Ensures that no identifiers are being hidden, which also reduces
compiler warnings.
2018-07-19 20:32:49 -04:00
Lioncash
c2121cb059 gl_shader_decompiler: Remove unnecessary const from return values
This adds nothing from a behavioral point of view, and can inhibit the
move constructor/RVO
2018-07-19 20:11:04 -04:00
bunnei
cf30c4be22 gl_state: Temporarily disable culling and depth test. 2018-07-18 23:21:43 -04:00
bunnei
49b0966003
Merge pull request #687 from lioncash/instance
core: Don't construct instance of Core::System, just to access its live instance
2018-07-18 18:55:58 -07:00
bunnei
b496a9eefe decoders: Fix calc of swizzle image_width_in_gobs. 2018-07-18 21:42:52 -04:00
Lioncash
3a4841e403 core: Don't construct instance of Core::System, just to access its live instance
This would result in a lot of allocations and related object
construction, just to toss it all away immediately after the call.

These are definitely not intentional, and it was intended that all of
these should have been accessing the static function GetInstance()
through the name itself, not constructed instances.
2018-07-18 18:18:27 -04:00
bunnei
b87a71b3fe
Merge pull request #678 from lioncash/astc
astc: Minor changes
2018-07-17 22:06:20 -07:00
Lioncash
6a03badcbc astc: Initialize vector size directly in Decompress
There's no need to perform a separate resize.
2018-07-17 23:58:14 -04:00
Lioncash
0f148548f3 astc: Mark functions as internally linked where applicable 2018-07-17 23:58:14 -04:00
Lioncash
c5803e30d3 astc: const-correctness changes where applicable
A few member functions didn't actually modify class state, so these can
be amended as necessary.
2018-07-17 23:58:14 -04:00
Lioncash
e3fadb9616 astc: Delete Bits' copy contstructor and assignment operator
This also potentially avoids warnings, considering the copy assignment
operator is supposed to have a return value.
2018-07-17 23:58:14 -04:00
Lioncash
4cd52a34b9 astc: In-class initialize member variables where appropriate 2018-07-17 23:58:10 -04:00
bunnei
c3dd456d51 vi: Partially implement buffer crop parameters. 2018-07-17 20:13:17 -04:00
Subv
3d3b10adc7 GPU: Added register definitions for the stencil parameters. 2018-07-17 15:00:21 -05:00
bunnei
3a96670f2d gl_rasterizer_cache: Implement texture format G8R8. 2018-07-15 01:33:42 -04:00
bunnei
aaec0b7e70
Merge pull request #665 from bunnei/fix-z24-s8
gl_rasterizer_cache: Fix incorrect offset in ConvertS8Z24ToZ24S8.
2018-07-14 22:18:55 -07:00
bunnei
3145114190 gl_rasterizer_cache: Fix incorrect offset in ConvertS8Z24ToZ24S8. 2018-07-15 00:02:05 -04:00
bunnei
e21190f47f gl_rasterizer_cache: Implement depth format Z16_UNORM. 2018-07-14 23:43:28 -04:00
bunnei
2cb3fdca86
Merge pull request #598 from bunnei/makedonecurrent
OpenGL: Use MakeCurrent/DoneCurrent for multithreaded rendering.
2018-07-14 20:18:11 -07:00
bunnei
05cb10530f OpenGL: Use MakeCurrent/DoneCurrent for multithreaded rendering. 2018-07-14 02:50:35 -04:00
Subv
b37354cca8 GPU: Always enable the depth write when clearing the depth buffer.
The GPU ignores that register when clearing, but OpenGL obeys the glDepthMask parameter, so we set the depth mask to GL_TRUE when clearing the depth buffer. It will be restored to the correct value automatically on the next draw call.
2018-07-14 00:52:23 -05:00
bunnei
8aeff9cf8e gl_rasterizer: Fix check for if a shader stage is enabled. 2018-07-12 22:57:57 -04:00
bunnei
c4015cd93a gl_shader_gen: Implement dual vertex shader mode.
- When VertexA shader stage is enabled, we combine with VertexB program to make a single Vertex Shader stage.
2018-07-12 22:25:36 -04:00
bunnei
64b5e5d5d9
Merge pull request #655 from bunnei/pred-lt-nan
gl_shader_decompiler: Implement PredCondition::LessThanWithNan.
2018-07-12 18:59:15 -07:00
bunnei
49c0c081c4 gl_shader_decompiler: Implement PredCondition::LessThanWithNan. 2018-07-12 20:04:35 -04:00
bunnei
4757ffdcce gl_shader_decompiler: Use FlowCondition field in EXIT instruction. 2018-07-12 20:00:37 -04:00
Sebastian Valle
274d1fb0fc
Merge pull request #652 from Subv/fadd32i
GPU: Implement the FADD32I shader instruction.
2018-07-12 17:36:51 -05:00
bunnei
3ff21345b4
Merge pull request #651 from Subv/ffma_decode
GPU: Corrected the decoding of FFMA for immediate operands.
2018-07-12 12:42:58 -07:00
Subv
c1ae841f47 GPU: Implement the FADD32I shader instruction. 2018-07-12 12:00:31 -05:00
Subv
0cad310e12 GPU: Corrected the decoding of FFMA for immediate operands. 2018-07-12 10:15:48 -05:00
bunnei
854f474f52 gl_rasterizer: Flip triangles when regs.viewport_transform[0].scale_y is negative.
- Fixes a regression with Binding of Isaac.
2018-07-08 16:16:24 -04:00
bunnei
639346bcfb
Merge pull request #625 from Subv/imnmx
GPU: Implemented the IMNMX shader instruction.
2018-07-07 19:33:50 -07:00
Subv
4633dd9505 GPU: Implemented the BC7U texture format.
Note: Our version of glad exports GL_COMPRESSED_RGBA_BPTC_UNORM as GL_COMPRESSED_RGBA_BPTC_UNORM_ARB, maybe it's time we update it.
2018-07-07 09:17:48 -05:00
bunnei
51bd76a5fd
Merge pull request #629 from Subv/depth_test
GPU: Allow using the old NV04 values for the depth test function.
2018-07-05 16:43:10 -04:00
Subv
9f6a5660e8 GPU: Allow using the old NV04 values for the depth test function.
These seem to be just a valid as the GL token values. Thanks @ReinUsesLisp

This restores graphical output to Disgaea 5
2018-07-05 13:01:31 -05:00
bunnei
762bf6a522
Merge pull request #626 from Subv/shader_sync
GPU: Stub the shader SYNC and DEPBAR instructions.
2018-07-05 12:54:19 -04:00
bunnei
637f9d780a
Merge pull request #624 from Subv/f2f_round
GPU: Implemented the F2F 'round' rounding mode.
2018-07-05 11:30:29 -04:00
bunnei
956b5db52e
Merge pull request #623 from Subv/vertex_types
GPU: Implement the Size_16_16 and Size_10_10_10_2 vertex attribute types
2018-07-05 11:30:01 -04:00
bunnei
8b815877a6
Merge pull request #622 from Subv/unused_tex
GPU: Ignore unused textures and corrected the TEX shader instruction decoding.
2018-07-05 11:29:17 -04:00
bunnei
1b0a74e23f
Merge pull request #621 from Subv/psetp_
GPU: Implemented the PSETP shader instruction.
2018-07-05 11:28:50 -04:00
bunnei
9a3c0b161e
Merge pull request #620 from Subv/depth_z32f
GPU: Implemented the 32 bit float depth buffer format.
2018-07-05 11:09:15 -04:00
Subv
b0c92b80b1 GPU: Implemented the IMNMX shader instruction.
It's similar to the FMNMX instruction but it works on integers.
2018-07-04 15:44:37 -05:00
Subv
d800a02b4b GPU: Implemented the F2F 'round' rounding mode.
It's implemented via the GLSL 'roundEven()' function.
2018-07-04 15:43:21 -05:00
Subv
77cfe4f027 GPU: Stub the shader SYNC and DEPBAR instructions.
It is unknown at this moment if we actually need to do something with these instructions or if the GLSL compiler takes care of that for us.
2018-07-04 15:29:51 -05:00
Subv
ce39ae3e57 GPU: Implement the Size_16_16 and Size_10_10_10_2 vertex attribute types.
Both signed and unsigned variants.
2018-07-04 15:22:34 -05:00
Subv
4bda9693be GPU: Ignore textures that the GLSL compiler deemed unused when binding textures to the shaders. 2018-07-04 15:20:12 -05:00
Subv
c42b818cf9 GPU: Corrected the decoding for the TEX shader instruction. 2018-07-04 15:19:20 -05:00
Subv
53a55bd751 GPU: Implemented the PSETP shader instruction.
It's similar to the isetp and fsetp instructions but it works on predicates instead.
2018-07-04 15:15:03 -05:00
Subv
016e357c75 GPU: Implemented the 32 bit float depth buffer format. 2018-07-04 10:42:33 -05:00
Subv
c1bebdef5e GPU: Flip the triangle front face winding if the GPU is configured to not flip the triangles.
OpenGL's default behavior is already correct when the GPU is configured to flip the triangles.

This fixes 1-2 Switch's splash screen.
2018-07-04 10:26:46 -05:00
Subv
5a9df3c675 GPU: Only configure the used framebuffers during clear.
Don't try to configure the color buffer if it is not being cleared, it may not be completely valid at this point.
2018-07-03 22:32:59 -05:00
bunnei
c996787d84
Merge pull request #609 from Subv/clear_buffers
GPU: Implemented the CLEAR_BUFFERS register.
2018-07-03 19:34:34 -04:00
Subv
78443a7f29 GPU: Factor out the framebuffer configuration code for both Clear and Draw commands. 2018-07-03 16:56:47 -05:00
Subv
c1811ed3d1 GPU: Support clears that don't clear the color buffer. 2018-07-03 16:56:47 -05:00
Subv
be51120d23 GPU: Bind and clear the render target when the CLEAR_BUFFERS register is written to. 2018-07-03 16:56:44 -05:00
Subv
827bb08c91 GPU: Added registers for the CLEAR_BUFFERS and CLEAR_COLOR methods. 2018-07-03 16:56:31 -05:00
bunnei
9da1552417 gl_rasterizer_cache: Implement PixelFormat S8Z24. 2018-07-03 14:58:13 -04:00
bunnei
15e68cdbaa
Merge pull request #607 from jroweboy/logging
Logging - Customizable backends
2018-07-03 00:26:45 -04:00
bunnei
e3ca561ea0
Merge pull request #612 from bunnei/fix-cull
gl_rasterizer: Only set cull mode and front face if enabled.
2018-07-02 23:48:52 -04:00
bunnei
ddb767f1b6
Merge pull request #611 from Subv/enabled_depth_test
GPU: Don't try to parse the depth test function if the depth test is disabled and use only the least significant 3 bits in the depth test func
2018-07-02 23:47:11 -04:00
bunnei
5410b4659d
Merge pull request #610 from Subv/mufu_8
GPU: Implemented MUFU suboperation 8, sqrt.
2018-07-02 22:26:42 -04:00
bunnei
a9cacd03f6 gl_rasterizer: Only set cull mode and front face if enabled. 2018-07-02 22:22:25 -04:00
Subv
6e0eba9917 GPU: Use only the least significant 3 bits when reading the depth test func.
Some games set the full GL define value here (including nouveau), but others just seem to set those last 3 bits.
2018-07-02 21:06:36 -05:00
Subv
65c664560c GPU: Don't try to parse the depth test function if the depth test is disabled. 2018-07-02 21:02:46 -05:00
James Rowe
0d46f0df12 Update clang format 2018-07-02 21:45:47 -04:00
James Rowe
638956aa81 Rename logging macro back to LOG_* 2018-07-02 21:45:47 -04:00
bunnei
92c7135065
Merge pull request #608 from Subv/depth
GPU: Implemented the depth buffer and depth test + culling
2018-07-02 21:24:43 -04:00
Subv
a6d4903aaf GPU: Set up the culling configuration on each draw. 2018-07-02 19:51:29 -05:00
Subv
6e4e0b2b41 GPU: Implemented MUFU suboperation 8, sqrt. 2018-07-02 19:48:15 -05:00
Sebastian Valle
055f1546d7
Merge pull request #606 from Subv/base_vertex
GPU: Fixed the index offset and implement BaseVertex when doing indexed rendering.
2018-07-02 14:07:38 -05:00
Sebastian Valle
9685dd5840
Merge pull request #605 from Subv/dma_copy
GPU: Directly copy the pixels when performing a same-layout DMA.
2018-07-02 14:06:56 -05:00
Subv
18c8ae7750 GPU: Set up the depth test state on every draw. 2018-07-02 13:33:06 -05:00
Subv
d480b63e0d MaxwellToGL: Added conversion functions for depth test and cull mode. 2018-07-02 13:31:49 -05:00
Subv
c1f55c32c8 GPU: Added registers for depth test and cull mode. 2018-07-02 13:31:20 -05:00
Subv
0f929762b3 GPU: Implemented the Z24S8 depth format and load the depth framebuffer. 2018-07-02 12:42:04 -05:00
Subv
4c59105adf GPU: Implement offsetted rendering when using non-indexed drawing. 2018-07-02 11:23:36 -05:00
Subv
fca3d1cc65 GPU: Fixed the index offset rendering, and implemented the base vertex functionality.
This fixes Stardew Valley.
2018-07-02 11:22:17 -05:00
Subv
cc73bad293 GPU: Added register definitions for the vertex buffer base element. 2018-07-02 11:21:23 -05:00
bunnei
3d41fdfbba
Merge pull request #604 from Subv/invalid_textures
GPU: Ignore invalid and disabled textures when drawing.
2018-07-02 11:48:18 -04:00
Subv
ca633a5a3c GPU: Directly copy the pixels when performing a same-layout DMA. 2018-07-02 09:46:33 -05:00
Subv
80c5e8ae99 GPU: Ignore disabled textures and textures with an invalid address. 2018-07-02 09:43:38 -05:00
Subv
e9d147349b GPU: Allow GpuToCpuAddress to return boost::none for unmapped addresses. 2018-07-02 09:42:48 -05:00
bunnei
066d6184d4
Merge pull request #602 from Subv/mufu_subop
GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation.
2018-07-01 11:06:04 -04:00
bunnei
b611d852db
Merge pull request #601 from Subv/rgba32_ui
GPU: Implement the RGBA32_UINT rendertarget format.
2018-07-01 03:22:38 -04:00
Subv
f33e406ff2 GPU: Corrected the size of the MUFU subop field, and removed incorrect "min" operation. 2018-06-30 14:48:25 -05:00
Subv
c0e2d52758 GPU: Implemented the RGBA32_UINT rendertarget format. 2018-06-30 14:23:13 -05:00
Subv
b11072d54a GLCache: Specify the component type along the texture type in the format tuple. 2018-06-30 14:08:51 -05:00
bunnei
c96da97630 gl_shader_decompiler: Implement predicate NotEqualWithNan. 2018-06-30 03:01:25 -04:00
bunnei
50ef2beb58
Merge pull request #595 from bunnei/raster-cache
Rewrite the OpenGL rasterizer cache
2018-06-29 14:07:28 -04:00
bunnei
c18425ef98 gl_rasterizer_cache: Only dereference color_surface/depth_surface if valid. 2018-06-29 13:08:08 -04:00
bunnei
7fa9177830
gl_shader_decompiler: Add a return path for unknown instructions. 2018-06-27 01:14:34 -04:00
bunnei
1dd754590f gl_rasterizer_cache: Implement caching for texture and framebuffer surfaces.
gl_rasterizer_cache: Improved cache management based on Citra's implementation.

gl_surface_cache: Add some docstrings.
2018-06-27 00:15:44 -04:00
bunnei
8af1ae46aa gl_rasterizer_cache: Various fixes for ASTC handling. 2018-06-27 00:08:04 -04:00
bunnei
c7c379bd19 gl_rasterizer_cache: Use SurfaceParams as a key for surface caching. 2018-06-27 00:08:04 -04:00
bunnei
6a28a66832 maxwell_3d: Add a struct for RenderTargetConfig. 2018-06-27 00:08:04 -04:00
bunnei
3f9f047375 gl_rasterizer: Implement AccelerateDisplay to forward textures to framebuffers. 2018-06-27 00:08:03 -04:00
bunnei
ff6785f3e8 gl_rasterizer_cache: Cache size_in_bytes as a const per surface. 2018-06-27 00:08:03 -04:00
bunnei
9f2f819bb6 gl_rasterizer_cache: Refactor to make SurfaceParams members const. 2018-06-27 00:08:03 -04:00
bunnei
5f57ab1b2a gl_rasterizer_cache: Remove Citra's rasterizer cache, always load/flush surfaces. 2018-06-27 00:08:03 -04:00
bunnei
10422f3c18 gl_rasterizer: Workaround for when exceeding max UBO size. 2018-06-26 23:07:34 -04:00
bunnei
dfac394e60
Merge pull request #593 from bunnei/fix-swizzle
gl_state: Fix state management for texture swizzle.
2018-06-26 22:05:49 -04:00
bunnei
73de9bab1a
Merge pull request #592 from bunnei/cleanup-gl-state
gl_state: Remove unused state management from 3DS.
2018-06-26 22:05:03 -04:00
bunnei
8447d20a11 gl_state: Fix state management for texture swizzle. 2018-06-26 17:15:58 -04:00
bunnei
20b58bab9c gl_state: Remove unused state management from 3DS. 2018-06-26 17:09:25 -04:00
bunnei
41b3725d28 gl_rasterizer_cache: Fix inverted B5G6R5 format. 2018-06-26 17:07:36 -04:00
bunnei
36dedae842
Merge pull request #554 from Subv/constbuffer_ubo
Rasterizer: Use UBOs instead of SSBOs for uploading const buffers.
2018-06-26 10:25:56 -04:00
mailwl
ad39bab271 Fix crash at exit 2018-06-25 18:01:08 +03:00
Subv
a3d82ef5d9 Build: Fixed some MSVC warnings in various parts of the code. 2018-06-20 11:39:10 -05:00
bunnei
7a0bb406d5
Merge pull request #574 from Subv/shader_abs_neg
GPU: Perform negation after absolute value in the float shader instructions.
2018-06-18 22:24:57 -04:00
Subv
38989bef43 GPU: Perform negation after absolute value in the float shader instructions. 2018-06-18 19:56:29 -05:00
Subv
eab7457c00 GPU: Don't mark uniform buffers and registers as used for instructions which don't have them.
Like the MOV32I and FMUL32I instructions.
This fixes a potential crash when using these instructions.
2018-06-18 19:50:35 -05:00
bunnei
0e13d9cb7b
Merge pull request #570 from bunnei/astc
gl_rasterizer: Implement texture format ASTC_2D_4X4.
2018-06-18 19:08:49 -04:00
bunnei
ea080501fb
Merge pull request #571 from Armada651/loose-blend
gl_rasterizer: Get loose on independent blending.
2018-06-18 11:36:50 -04:00
Jules Blok
7c7f4a9be2 gl_rasterizer: Get loose on independent blending. 2018-06-18 09:27:06 +02:00
bunnei
61779fa072 gl_rasterizer: Implement texture format ASTC_2D_4X4. 2018-06-18 01:56:59 -04:00
bunnei
fe906fff36 gl_rasterizer_cache: Loosen things up a bit. 2018-06-18 00:55:59 -04:00
bunnei
afdd657d30 gl_shader_decompiler: Implement LOP instructions. 2018-06-17 15:27:48 -04:00
bunnei
5673ce39c7 gl_shader_decompiler: Refactor LOP32I instruction a bit in support of LOP. 2018-06-17 13:31:39 -04:00
bunnei
d383043e07 gl_shader_decompiler: Implement integer size conversions for I2I/I2F/F2I. 2018-06-15 22:42:02 -04:00
bunnei
fb5bd0920d
Merge pull request #564 from bunnei/lop32i_passb
gl_shader_decompiler: Implement LOP32I LogicOperation PassB.
2018-06-15 22:04:03 -04:00
bunnei
55c49d5bf4 gl_shader_gen: Set position.w to 1. 2018-06-15 20:47:04 -04:00
bunnei
61f9d9c4ab gl_shader_decompiler: Implement LOP32I LogicOperation PassB. 2018-06-15 20:43:33 -04:00
bunnei
019d7208c8
Merge pull request #556 from Subv/dma_engine
GPU: Partially implemented the Maxwell DMA engine.
2018-06-12 14:25:17 -04:00
bunnei
2015a1b180
Merge pull request #558 from Subv/iadd32i
GPU: Implemented the iadd32i shader instruction.
2018-06-12 14:19:25 -04:00
Subv
db0497b808 GPU: Implemented the iadd32i shader instruction. 2018-06-12 11:46:45 -05:00
Subv
987a170665 GPU: Partially implemented the Maxwell DMA engine.
Only tiled->linear and linear->tiled copies that aren't offsetted are supported for now. Queries are not supported. Swizzled copies are not supported.
2018-06-12 11:27:36 -05:00
bunnei
5f3d6c85db gl_shader_decompiler: Implement saturate for float instructions. 2018-06-11 21:46:34 -04:00
Subv
004b1b3830 GPU: Convert the gl_InstanceId and gl_VertexID variables to floats when reading from them.
This corrects the invalid position values in some games when doing attribute-less rendering.
2018-06-10 13:50:19 -05:00
Subv
2a7653142d Rasterizer: Use UBOs instead of SSBOs for uploading const buffers.
This should help a bit with GPU performance once we're GPU-bound.
2018-06-09 18:02:05 -05:00
Subv
b366b885a1 GPU: Implement the iset family of shader instructions. 2018-06-09 16:19:13 -05:00
Subv
3cb753eeb1 GPU: Added decodings for the ISET family of instructions. 2018-06-09 15:56:50 -05:00
bunnei
d81aaa3ed3
Merge pull request #550 from Subv/ssy
GPU: Stub the SSY shader instruction.
2018-06-09 00:42:53 -04:00
bunnei
e2176dc7ce
Merge pull request #551 from bunnei/shr
gl_shader_decompiler: Implement SHR instruction.
2018-06-09 00:42:44 -04:00
bunnei
5440b9c634 gl_shader_decompiler: Implement SHR instruction. 2018-06-09 00:01:17 -04:00
Subv
abec5f82e2 GPU: Stub the SSY shader instruction.
This instruction tells the GPU where the flow reconverges in a non-uniform control flow scenario, we can ignore this when generating GLSL code.
2018-06-08 22:46:10 -05:00
bunnei
bbc4f369ed gl_shader_decompiler: Implement IADD instruction. 2018-06-08 23:25:22 -04:00
bunnei
79e9c2e237 gl_shader_decompiler: Add missing asserts for saturate_a instructions. 2018-06-08 23:24:10 -04:00
Subv
c011b6f67e GPU: Synchronize the blend state on every draw call.
Only independent blending on render target 0 is implemented for now.

This fixes the elongated squids in Splatoon 2's boot screen.
2018-06-08 17:05:52 -05:00
Subv
c712dafaee GPU: Added registers for normal and independent blending. 2018-06-08 17:04:41 -05:00
bunnei
a931cf9e8b
Merge pull request #547 from Subv/compressed_alignment
GLCache: Align compressed texture sizes to their compression ratio, and then align that compressed size to the block height for tiled textures.
2018-06-08 16:40:49 -04:00
Subv
8d9534d830 GLCache: Align compressed texture sizes to their compression ratio, and then align that compressed size to the block height for tiled textures.
This fixes issues with retrieving non-block-aligned tiled compressed textures from the cache.
2018-06-08 12:27:19 -05:00
Subv
47dc5e0dab Rasterizer: Flush the written region when writing shader uniform data before copying it to the uniform buffers.
This fixes the flip_viewport uniform having invalid values when drawing.
2018-06-08 12:22:39 -05:00
bunnei
ee318d4015
Merge pull request #543 from Subv/uniforms
GLRenderer: Write the shader stage configuration UBO data *before* copying it to the GPU.
2018-06-07 11:21:36 -04:00
Subv
86146ef819 GLRenderer: Write the shader stage configuration UBO data *before* copying it to the GPU.
This should fix the bug with the vs_config UBO being uninitialized during shader execution.
2018-06-07 08:33:23 -05:00
bunnei
0639e03055
Merge pull request #542 from bunnei/bfe_imm
gl_shader_decompiler: Implement BFE_IMM instruction.
2018-06-07 01:49:45 -04:00
bunnei
930487c7fb
Merge pull request #541 from Subv/blittextures
GLCache: Fixed copying compressed textures in the rasterizer cache.
2018-06-07 01:35:01 -04:00
bunnei
92209f905f gl_shader_decompiler: Implement BFE_IMM instruction. 2018-06-07 00:58:12 -04:00
Subv
f22e090b86 GLCache: Use the full uncompressed size when blitting from one texture to another.
This avoids the problem of only copying a tiny piece of the textures when they are compressed.
2018-06-06 23:26:36 -05:00
Subv
218a08df93 GLCache: Simplify the logic to copy from one texture to another in BlitTextures.
We now use glCopyImageSubData, this should avoid errors with trying to attach a compressed texture as a framebuffer's color attachment and then blitting to it.

Maybe in the future we can change this to glCopyTextureSubImage which only requires GL_ARB_direct_state_access.
2018-06-06 23:25:24 -05:00
bunnei
128aeba0f3 gl_shader_decompiler: F2F: Implement rounding modes. 2018-06-06 22:21:29 -04:00
bunnei
03f877919d
Merge pull request #537 from bunnei/misc-shader
gl_shader_decompiler: Additional decodings, remove unused stuff from TEX
2018-06-06 21:44:37 -04:00
bunnei
37f50c8773
Merge pull request #535 from Subv/gpu_swizzle
GPU: Support changing the texture swizzles for Maxwell textures.
2018-06-06 21:39:47 -04:00
bunnei
00c830405b gl_shader_decompiler: Remove some attribute stuff that has nothing to do with TEX/TEXS. 2018-06-06 19:47:41 -04:00
bunnei
4b114e1b8a shader_bytecode: Add instruction decodings for BFE, IMNMX, and XMAD. 2018-06-06 19:47:34 -04:00
bunnei
0a49c46353 gl_shader_decompiler: Implement ISETP_IMM instruction. 2018-06-06 19:45:58 -04:00
Subv
47629c89a8 GPU: Support changing the texture swizzles for Maxwell textures. 2018-06-06 18:36:15 -05:00
Subv
89e81a9be2 GLState: Support changing the GL_TEXTURE_SWIZZLE parameter of each texture unit. 2018-06-06 18:36:13 -05:00
bunnei
0ff2929644
Merge pull request #534 from Subv/multitexturing
GPU: Implement sampling multiple textures in the generated glsl shaders.
2018-06-06 19:12:52 -04:00
bunnei
4669f15f8b gl_shader_decompiler: Implement LD_C instruction. 2018-06-06 18:09:06 -04:00
bunnei
4112aa68a6 gl_shader_gen: Add uniform handling for indirect const buffer access. 2018-06-06 18:09:05 -04:00
bunnei
6e386a334b gl_shader_decompiler: Refactor uniform handling to allow different decodings. 2018-06-06 17:57:15 -04:00
Subv
dbfc39d214 GPU: Implement sampling multiple textures in the generated glsl shaders.
All tested games that use a single texture show no regression.

Only Texture2D textures are supported right now, each shader gets its own "tex_fs/vs/gs" sampler array to maintain independent textures between shader stages, the textures themselves are reused if possible.
2018-06-06 12:58:16 -05:00
Sebastian Valle
ce026332a5
Merge pull request #531 from bunnei/fix-shl
gl_shader_decompiler: Fix un/signed mismatch with SHL.
2018-06-06 08:28:42 -05:00
Sebastian Valle
fa220dd709
Merge pull request #530 from bunnei/wrap-mirror
maxwell_to_gl: Implement WrapMode Mirror.
2018-06-06 08:28:27 -05:00
bunnei
9a85277d83
Merge pull request #527 from Subv/rgba32f_texcopy
GPU: Allow the usage of RGBA32_FLOAT and RGBA16_FLOAT in the texture copy engine.
2018-06-06 00:24:13 -04:00
bunnei
05dc93399b
Merge pull request #528 from Subv/rg11b10f
GPU: Implemented the R11FG11FB10F texture and rendertarget formats.
2018-06-06 00:22:54 -04:00
bunnei
566f97b580 gl_shader_decompiler: Fix un/signed mismatch with SHL. 2018-06-05 23:58:06 -04:00
bunnei
bf0543af23 maxwell_to_gl: Implement WrapMode Mirror. 2018-06-05 23:56:45 -04:00
Subv
adf47cd59a GPU: Allow the usage of RGBA16_FLOAT in the texture copy engine. 2018-06-05 22:01:20 -05:00
Subv
c531a92eda GPU: Implemented the R11FG11FB10F texture and rendertarget formats. 2018-06-05 21:57:16 -05:00
Subv
14afc704d4 GPU: Fixed the compression factor for RGBA16F textures.
They're not compressed.
2018-06-05 21:55:17 -05:00
Subv
8d70d1ea45 GPU: Allow the usage of RGBA32_FLOAT in the texture copy engine. 2018-06-05 21:07:40 -05:00
bunnei
5fb99e6a16
Merge pull request #516 from Subv/f2i_r
GPU: Implemented the F2I_R shader instruction.
2018-06-05 22:01:29 -04:00
bunnei
38eb33f150
Merge pull request #521 from Subv/bra
GPU: Corrected the branch targets for the shader bra instruction.
2018-06-05 10:09:35 -04:00
bunnei
b54a72afc0
Merge pull request #520 from bunnei/shader-shl
gl_shader_decompiler: Implement SHL instruction.
2018-06-05 10:08:42 -04:00
Subv
e7dfcdde74 GPU: Corrected the branch targets for the shader bra instruction. 2018-06-04 22:56:28 -05:00
Subv
4b89348c00 GPU: Implemented the F2I_R shader instruction. 2018-06-04 22:06:50 -05:00
bunnei
8c99dd055c
Merge pull request #518 from Subv/incomplete_shaders
GPU: Implemented predicated exit instructions in the shader programs.
2018-06-04 22:43:46 -04:00
bunnei
799e632ccb gl_shader_decompiler: Fix typo with ISCADD instruction. 2018-06-04 22:41:10 -04:00
bunnei
c23c30c76f gl_shader_decompiler: Implement SHL instruction. 2018-06-04 22:36:49 -04:00
bunnei
6ea1576513 gl_shader_decompiler: Implement PredCondition::NotEqual. 2018-06-04 22:00:47 -04:00
Subv
23b1e6eded GPU: Implement the ISCADD shader instructions. 2018-06-04 20:17:41 -05:00
Subv
438a9b70cc GPU: Added decodings for the ISCADD instructions. 2018-06-04 20:17:39 -05:00
bunnei
e8bfff7b4b
Merge pull request #514 from Subv/lop32i
GPU: Implemented the LOP32I instruction.
2018-06-04 20:48:15 -04:00
bunnei
f564822e78
Merge pull request #510 from Subv/isetp
GPU: Implemented the ISETP_R and ISETP_C instructions
2018-06-04 20:47:11 -04:00
Subv
6cf6fa2842 GPU: Implement predicated exit instructions in the shader programs. 2018-06-04 19:18:11 -05:00
Subv
d27279092f GPU: Take into account predicated exits when performing shader control flow analysis. 2018-06-04 19:14:23 -05:00
bunnei
37fd4e6d9b
Merge pull request #512 from Subv/fset
GPU: Corrected the FSET and I2F instructions.
2018-06-04 19:04:20 -04:00
bunnei
cdd92dc692
Merge pull request #501 from Subv/shader_bra
GPU: Partially implemented the bra shader instruction
2018-06-04 18:31:07 -04:00
bunnei
38d25a4cb2
Merge pull request #515 from Subv/viewport_fix
GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
2018-06-04 18:11:36 -04:00
Subv
2933521a08 GPU: Use the bf bit in FSET to determine whether to write 0xFFFFFFFF or 1.0f. 2018-06-04 16:41:28 -05:00
Subv
f6679ce422 GPU: Corrected the I2F_R implementation. 2018-06-04 16:41:27 -05:00
Subv
5d55403f94 GPU: Calculate the correct viewport dimensions based on the scale and translate registers.
This is how nouveau calculates the viewport width and height. For some reason some games set 0xFFFF in the VIEWPORT_HORIZ and VIEWPORT_VERT registers, maybe those are a misnomer and actually refer to something else?
2018-06-04 16:36:54 -05:00
Subv
0c688b421c GPU: Implemented the LOP32I instruction. 2018-06-04 13:56:31 -05:00
Subv
cb47abecc6 GLCache: Corrected a mismatch between storing compressed sizes and verifying the uncompressed alignment in GetSurface. 2018-06-04 13:01:53 -05:00
Subv
90cddf1996 GPU: Use explicit types when retrieving the uniform values for fsetp/fset and isetp instead of the type of an invalid output register. 2018-06-04 11:22:26 -05:00
Subv
7c181fd4f4 GPU: Implemented the ISETP_R and ISETP_C shader instructions. 2018-06-04 11:12:03 -05:00
Subv
b481d8a00d GPU: Partially implemented the shader BRA instruction. 2018-06-03 22:26:36 -05:00
Subv
06c72b4fcf GPU: Added decoding for the BRA instruction. 2018-06-03 22:14:00 -05:00
bunnei
ba117854f9
Merge pull request #500 from Subv/long_queries
GPU: Partial implementation of long GPU queries.
2018-06-03 21:24:50 -04:00
Subv
d57333406d GPU: Partial implementation of long GPU queries.
Long queries write a 128-bit result value to memory, which consists of a 64 bit query value and a 64 bit timestamp.

In this implementation, only select=Zero of the Crop unit is implemented, this writes the query sequence as a 64 bit value, and a 0u64 value for the timestamp, since we emulate an infinitely fast GPU.

This specific type was hwtested, but more rigorous tests should be performed in the future for the other types.
2018-06-03 19:17:31 -05:00
bunnei
1efcba346a gl_shader_decompiler: Implement TEXS component mask. 2018-06-03 12:08:17 -04:00
bunnei
bb9d39b8fe
Merge pull request #494 from bunnei/shader-tex
gl_shader_decompiler: Implement TEX, fixes for TEXS.
2018-06-03 12:05:38 -04:00
bunnei
27c0f9e02d
Merge pull request #495 from bunnei/improve-rro
gl_shader_decompiler: Implement RRO as a register move.
2018-06-03 12:05:26 -04:00
bunnei
e54ea773fc gl_shader_decompiler: Implement RRO as a register move. 2018-06-03 11:14:31 -04:00
Subv
99f9d47d16 GPU: Implemented the DXN1 (BC4) texture format. 2018-06-02 13:17:09 -05:00
bunnei
888eb345c0 gl_shader_decompiler: Implement TEX instruction. 2018-05-31 23:36:45 -04:00
bunnei
4c727d0ba8 gl_shader_decompiler: Support multi-destination for TEXS. 2018-05-31 22:57:32 -04:00
bunnei
49309b5848 gl_rasterizer_cache: Assert that component type is UNorm or format is RGBA16F. 2018-05-30 22:50:41 -04:00
bunnei
ca5a4a704b gl_rasterizer_cache: Implement PixelFormat RGBA16F. 2018-05-30 22:24:07 -04:00
bunnei
15086a22be
Merge pull request #489 from Subv/vertexid
Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader.
2018-05-30 14:10:48 -04:00
Subv
99f12b05fa Shaders: Implemented reading the gl_InstanceID and gl_VertexID variables in the vertex shader. 2018-05-30 10:58:03 -05:00
Sebastian Valle
8df011a57f
Merge pull request #483 from bunnei/sonic
Several GPU fixes to boot Sonic Mania
2018-05-30 07:31:46 -05:00
bunnei
6fcc7e9c36 gl_shader_decompiler: F2F_R instruction: Implement abs. 2018-05-29 23:52:54 -04:00
bunnei
68937a662d gl_shader_decompiler: Partially implement F2F_R instruction. 2018-05-29 23:10:44 -04:00
Subv
734106dcb9 GPU: Implemented the R8 texture format (0x1D) 2018-05-29 21:49:37 -05:00
bunnei
0d843eaba6 gl_rasterize_cache: Invert order of tex format RGB565. 2018-05-29 22:16:18 -04:00
greggameplayer
220d4672df add all the known TextureFormat (#474) 2018-05-28 19:26:17 -04:00
bunnei
d809f65827
Merge pull request #472 from bunnei/greater-equal
gl_shader_decompiler: Implement GetPredicateComparison GreaterEqual.
2018-05-27 12:14:30 -04:00
bunnei
7f155ba713
Merge pull request #476 from Subv/a1bgr5
GPU: Implemented the A1B5G5R5 texture format (0x14)
2018-05-27 12:14:08 -04:00
Subv
7ddc872b52 GPU: Implemented the A1B5G5R5 texture format (0x14) 2018-05-27 09:02:05 -05:00
bunnei
c23ce3365d gl_shader_decompiler: Implement GetPredicateComparison GreaterEqual. 2018-05-25 23:21:29 -04:00
bunnei
ee53688ca7 shader_bytecode: Implement other variants of FMNMX. 2018-05-25 23:18:50 -04:00
bunnei
aee356bd10
Merge pull request #468 from Subv/compound_preds
Shader: Implemented compound predicates in the fset and fsetp instructions
2018-05-25 22:28:47 -04:00
Subv
e2cdf54177 Shader: Implemented compound predicates in fset.
You can specify a predicate in the fset instruction:

Result = ((Value1 Comp Value2) OP P0) ? 1.0 : 0.0;
2018-05-24 17:39:59 -05:00
Subv
e2db7a83f6 GPU: Allow command lists to rebind a channel to another engine in the middle of the command list. 2018-05-24 17:32:46 -05:00
Subv
126270d963 Shader: Implemented compound predicates in fsetp.
You can specify three predicates in an fsetp instruction:

P1 = (Value1 Comp Value2) OP P0;
P2 = !(Value1 Comp Value2) OP P0;
2018-05-24 17:22:36 -05:00
bunnei
58857b9f46
Merge pull request #456 from Subv/unmap_buffer
Implemented nvhost-as-gpu's UnmapBuffer and nvmap's Free ioctls.
2018-05-20 23:54:50 -04:00
bunnei
898f0fa029
Merge pull request #458 from Subv/fmnmx
Shaders: Implemented the FMNMX shader instruction.
2018-05-20 23:44:07 -04:00
Sebastian Valle
6486544e09
Merge pull request #452 from Subv/psetp
ShadersDecompiler: Added decoding for the PSETP instruction.
2018-05-20 20:00:55 -05:00
Sebastian Valle
2dbfcd32d7
Merge pull request #451 from Subv/gl_array_size
GLRenderer: Remove unused vertex buffer and increase the size of the stream buffer to 128 MB.
2018-05-20 20:00:40 -05:00
Subv
8440cef223 Shaders: Implemented the FMNMX shader instruction. 2018-05-20 17:53:06 -05:00
Subv
72b5c448cf GPU: Implemented nvhost-as-gpu's UnmapBuffer ioctl.
It removes a mapping previously created with the MapBufferEx ioctl.
2018-05-20 14:25:56 -05:00
Subv
a056d5ad8c ShadersDecompiler: Added decoding for the PSETP instruction. 2018-05-19 11:41:14 -05:00
Subv
98b143c2d6 GLRenderer: Remove unused hw_vao_enabled_attributes variable. 2018-05-19 11:36:38 -05:00
Subv
370ab5df9b GLRenderer: Remove unused vertex buffer and increase the size of the stream buffer to 128 MB.
The stream buffer is where all the vertex data is copied, some games require this to be much bigger than the 4 MB we used to have.
2018-05-19 11:36:09 -05:00
Subv
21959ddfef GLRenderer: Log the shader source code when program linking fails. 2018-05-19 11:19:34 -05:00
Lioncash
7c9644646f
general: Make formatting of logged hex values more straightforward
This makes the formatting expectations more obvious (e.g. any zero padding specified
is padding that's entirely dedicated to the value being printed, not any pretty-printing
that also gets tacked on).
2018-05-02 09:49:36 -04:00
bunnei
225ff1130f
Merge pull request #422 from bunnei/shader-mov
Shader instructions MOV_C, MOV_R, and several minor GPU things
2018-04-29 21:47:42 -04:00
bunnei
f41eb95e13 maxwell_3d: Reset vertex counts after drawing. 2018-04-29 16:23:31 -04:00
bunnei
08b8fcbe6d gl_shader_decompiler: Implement MOV_R. 2018-04-29 16:05:18 -04:00
bunnei
316327f487 maxwell_to_gl: Implement type SignedNorm, Size_8_8_8_8. 2018-04-29 16:05:17 -04:00
bunnei
c7ce472eeb shader_bytecode: Add decoding for FMNMX instruction. 2018-04-29 16:05:17 -04:00
Subv
da32c648bf Shaders: Implemented predicate condition 3 (LessEqual) in the fset and fsetp instructions. 2018-04-29 12:49:41 -05:00
bunnei
a71346cd7c gl_shader_decompiler: Implement MOV_C. 2018-04-29 13:13:13 -04:00
bunnei
6c464a2a4a
Merge pull request #416 from bunnei/shader-ints-p3
gl_shader_decompiler: Implement MOV32I, partially implement I2I, I2F
2018-04-29 12:56:16 -04:00
bunnei
f87ea8fa8b fermi_2d: Fix surface copy block height. 2018-04-28 20:40:03 -04:00
bunnei
0c01c34eff gl_shader_decompiler: Partially implement I2I_R, and I2F_R. 2018-04-28 20:03:19 -04:00
bunnei
e73927cfc2 gl_shader_decompiler: More cleanups, etc. with how we handle register types. 2018-04-28 20:03:19 -04:00
bunnei
c691fa4074 GLSLRegister: Simplify register declarations, etc. 2018-04-28 20:03:19 -04:00
bunnei
f2dcb39049 shader_bytecode: Add decodings for i2i instructions. 2018-04-28 20:03:18 -04:00
bunnei
a7b5ab4d9a gl_shader_decompiler: Implement MOV32_IMM instruction. 2018-04-28 20:03:18 -04:00
bunnei
6b365f7703
Merge pull request #408 from bunnei/shader-ints-p2
gl_shader_decompiler: Add GLSLRegisterManager class to track register state.
2018-04-27 16:06:09 -04:00
Lioncash
16198f979e
renderer_opengl: Replace usages of LOG_GENERIC with fmt-capable equivalents 2018-04-27 12:09:35 -04:00
bunnei
e6242ab5e6 gl_shader_decompiler: Add GLSLRegisterManager class to track register state. 2018-04-27 11:49:26 -04:00
Lioncash
8475496630
general: Convert assertion macros over to be fmt-compatible 2018-04-27 10:04:02 -04:00
bunnei
c9d7abe9c9 gl_shader_decompiler: Boilerplate for handling integer instructions. 2018-04-26 14:38:42 -04:00
bunnei
37fa9a15cd gl_shader_decompiler: Move color output to EXIT instruction. 2018-04-26 14:38:41 -04:00
bunnei
f81b915fd8
Merge pull request #396 from Subv/shader_ops
Shaders: Implemented the FSET instruction.
2018-04-25 22:42:54 -04:00
Subv
20d86d8a36 GPU: Partially implemented the Fermi2D surface copy operation.
The hardware allows for some rather complicated operations to be performed on the data during the copy, this is not implemented.
Only same-format same-size raw copies are implemented for now.
2018-04-25 12:54:26 -05:00
Subv
e9ad8e9185 Shaders: Added bit decodings for the I2I instruction. 2018-04-25 12:52:55 -05:00
Subv
1740aa5444 Shaders: Implemented the FSET instruction.
This instruction is similar to the FSETP instruction, but it doesn't set a predicate, it sets the destination register to 1.0 if the condition holds, and 0 otherwise.
2018-04-25 12:52:32 -05:00
Subv
1dd4861d38 GPU: Make the Textures::CopySwizzledData function accessible from the outside of the file. 2018-04-25 11:55:30 -05:00
Subv
a6da2b93c1 GPU: Added a function to retrieve the bytes per pixel of the render target formats. 2018-04-25 11:55:29 -05:00
Subv
378c881427 GPU: Added surface copy registers to Fermi2D 2018-04-25 11:55:29 -05:00
Subv
b1109931b9 GPU: Added boilerplate code for the Fermi2D engine 2018-04-25 11:55:29 -05:00
Subv
c16cfbbc6c GPU: Reduce the number of registers of Maxwell3D to 0xE00.
The rest are just macro shim registers.
2018-04-25 11:55:28 -05:00
Subv
a994446b6e GPU: Move the Maxwell3D macro uploading code to the inside of the Maxwell3D processor.
It doesn't belong in the PFIFO handler.
2018-04-25 11:55:27 -05:00
Subv
e2f2a49d2d GPU: Corrected the upper bound of the PFIFO method ids in the command processor. 2018-04-25 11:53:54 -05:00
Lioncash
b7551e457b
video-core: Move logging macros over to new fmt-capable ones 2018-04-25 09:13:57 -04:00
Subv
0369ee7248 Shaders: Added decodings for the FSET instructions. 2018-04-24 22:42:54 -05:00
bunnei
c30cd898fc renderer_opengl: Use correct byte order for framebuffer pixel format ABGR8. 2018-04-24 22:31:46 -04:00
bunnei
f1a4a004fb gl_rasterizer_cache: Use CHAR_BIT for bpp conversions instead of 8. 2018-04-24 22:31:46 -04:00
bunnei
0a023cfb4f gl_rasterizer_cache: Use GPU PAGE_BITS/SIZE, not CPU. 2018-04-24 22:31:46 -04:00
bunnei
9022d926eb gl_rasterizer_cache: Use new logger. 2018-04-24 22:31:46 -04:00
bunnei
fbb3cd110c gl_rasterizer_cache: Add a function for finding framebuffer GPU address. 2018-04-24 22:31:46 -04:00
bunnei
bc0f1896fc gl_rasterizer_cache: Handle compressed texture sizes. 2018-04-24 22:31:46 -04:00
bunnei
4415e00181 gl_rasterizer_cache: Update to be based on GPU addresses, not CPU addresses. 2018-04-24 22:31:45 -04:00
bunnei
10c6d89119 memory_manager: Add implement CpuToGpuAddress. 2018-04-24 17:49:20 -04:00
bunnei
239ac8abe2 memory_manager: Make GpuToCpuAddress return an optional. 2018-04-24 17:49:19 -04:00
bunnei
9e11a76e92 memory_manager: Use GPUVAdddr, not PAddr, for GPU addresses. 2018-04-24 17:40:43 -04:00
bunnei
e8c2bb24b2
Merge pull request #386 from Subv/gpu_query
GPU: Added asserts to our code for handling the QUERY_GET GPU command.
2018-04-24 16:13:51 -04:00
Lioncash
d1b23b2b51
renderer_opengl: Silence a -Wdangling-else warning in DrawScreenTriangles() 2018-04-24 11:13:08 -04:00
bunnei
07dc0bbf3e
Merge pull request #379 from Subv/multi_buffers
GPU: Support multiple enabled vertex arrays.
2018-04-24 01:09:02 -04:00
Subv
f208953585 GPU: Added asserts to our code for handling the QUERY_GET GPU command.
This is based on research from nouveau. Many things are currently unknown and will require hwtests in the future.
This commit also stubs QueryMode::Write2 to do the same as Write. Nouveau code treats them interchangeably, it is currently unknown what the difference is.
2018-04-23 17:06:57 -05:00
bunnei
3967f9c6ef
Merge pull request #383 from Subv/gpu_mmu
GPU: Make the GPU virtual memory manager use 16 page bits and 10 pagetable bits.
2018-04-23 14:00:52 -04:00
Subv
9531a29283 GPU: Support multiple enabled vertex arrays.
The vertex arrays will be copied to the stream buffer one after the other, and the attributes will be set using the ARB_vertex_attrib_binding extension.

yuzu now thus requires OpenGL 4.3 or the ARB_vertex_attrib_binding extension.
2018-04-23 11:34:50 -05:00
Subv
f823c1d599 GPU: Make the GPU virtual memory manager use 16 page bits and 10 page table bits.
Also removed some dead code and added memory map consistency asserts.
2018-04-23 10:57:12 -05:00
Subv
010227e149 GPU: Implement the RGB10_A2 RenderTarget format, it will use the same format as the A2BGR10 texture format. 2018-04-23 10:50:28 -05:00
Subv
c079cf4eec GPU: Implement the A2BGR10 texture format. 2018-04-21 17:32:25 -05:00
bunnei
f8764bb5d3
Merge pull request #376 from bunnei/shader-decoder
Shader opcode decoding
2018-04-21 00:04:51 -04:00
bunnei
f8a037ead4
Merge pull request #375 from lioncash/header
opengl: Remove unnecessary header inclusions
2018-04-20 23:08:47 -04:00
bunnei
d08fd7e86d gl_shader_decompiler: Skip RRO instruction. 2018-04-20 22:30:56 -04:00
bunnei
8b28dc55e6 gl_shader_decompiler: Cleanup error logging. 2018-04-20 22:30:56 -04:00
bunnei
e1630c4d43 shader_bytecode: Add several more instruction decodings. 2018-04-20 22:30:56 -04:00
bunnei
9f6d305eab shader_bytecode: Decode instructions based on bit strings. 2018-04-20 22:30:56 -04:00
bunnei
8ac3a3f45e
Merge pull request #369 from Subv/shader_instr2
ShaderGen: Implemented fsetp/kil and predicated instruction execution.
2018-04-20 22:29:39 -04:00
bunnei
634d9ee18b
Merge pull request #374 from lioncash/noexcept
gl_resource_manager: Add missing noexcept specifiers to move constructors and assignment operators
2018-04-20 22:28:47 -04:00
Subv
17a0ef1e1e ShaderGen: Implemented the KIL instruction, which is equivalent to 'discard'. 2018-04-20 21:09:34 -05:00
Subv
c3a8ea76f1 ShaderGen: Implemented predicated instruction execution.
Each predicated instruction will be wrapped in an `if (predicate) { instruction_body; }` in the GLSL, where `predicate` is one of the predicate boolean variables previously set by fsetp.
2018-04-20 21:09:33 -05:00
Subv
0a5e01b710 ShaderGen: Implemented the fsetp instruction.
Predicate variables are now added to the generated shader code in the form of 'pX' where X is the predicate id.
These predicate variables are initialized to false on shader startup and are set via the fsetp instructions.

TODO:

* Not all the comparison types are implemented.
* Only the single-predicate version is implemented.
2018-04-20 21:09:33 -05:00
Lioncash
eafdcc1b8a opengl: Remove unnecessary header inclusions 2018-04-20 20:19:37 -04:00
Lioncash
ab71997b2c gl_resource_manager: Add missing noexcept specifiers to move constructors and assignment operators
Standard library containers may use std::move_if_noexcept to perform
move operations. If a move cannot be performed under these
circumstances, then a copy is attempted. Given we only intend for these
types to be move-only this can be somewhat problematic. By defining
these to be noexcept we prevent cases where copies may be attempted.
2018-04-20 20:04:00 -04:00
Lioncash
7db0b8d74f gl_rasterizer_cache: Make MatchFlags an enum class
Prevents implicit conversions and scope pollution.
2018-04-20 19:50:05 -04:00
Subv
d03fc77475 ShaderGen: Register id 255 is special and is hardcoded to return 0 (SR_ZERO). 2018-04-20 14:57:40 -05:00
Subv
2e0a9f66a0 ShaderGen: Ignore the 'sched' instruction when generating shaders.
The 'sched' instruction has a very convoluted encoding, but fortunately it seems to only appear on a fixed interval (once every 4 instructions).
2018-04-20 14:57:40 -05:00
bunnei
326b044c19
Merge pull request #367 from lioncash/clamp
math_util: Remove the Clamp() function
2018-04-20 14:18:03 -04:00
Lioncash
fae2dd0344
math_util: Remove the Clamp() function
C++17 adds clamp() to the standard library, so we can remove ours in
favor of it.
2018-04-20 10:14:13 -04:00
bunnei
701dd649e6
Merge pull request #363 from lioncash/array-size
common_funcs: Remove ARRAY_SIZE macro
2018-04-20 09:43:02 -04:00
Lioncash
d9e316e353 common_funcs: Remove ARRAY_SIZE macro
C++17 has non-member size() which we can just call where necessary.
2018-04-19 22:36:52 -04:00
Lioncash
3841ec4200 renderer_opengl: Add missing header guards 2018-04-19 21:13:59 -04:00
bunnei
17ad56c1dc
Merge pull request #356 from lioncash/shader
glsl_shader_decompiler: Minor API changes to ShaderWriter
2018-04-19 21:09:25 -04:00
Lioncash
e3b6f6c016 glsl_shader_decompiler: Use std::string_view instead of std::string for AddLine()
This function doesn't need to take ownership of the string data being
given to it, considering all we do is append the characters to the
internal string instance.

Instead, use a string view to simply reference the string data without
any potential heap allocation.

Now anything that is a raw const char* won't need to be converted to a
std::string before appending.
2018-04-19 20:12:58 -04:00
Lioncash
412b31ad72 glsl_shader_decompiler: Add AddNewLine() function to ShaderWriter
Avoids constructing a std::string just to append a newline character
2018-04-19 20:09:27 -04:00
Lioncash
aa26baa3db glsl_shader_decompiler: Add char overload for ShaderWriter's AddLine()
Avoids constructing a std::string just to append a character.
2018-04-19 20:04:09 -04:00
Lioncash
4ef392906b glsl_shader_decompiler: Append indentation without constructing a separate std::string
The interface of std::string already lets us append N copies of a
character to an existing string.
2018-04-19 19:59:25 -04:00
Subv
fe84842137 ShaderGen: Implemented the fmul32i shader instruction. 2018-04-19 13:46:32 -05:00
Subv
5367935d35 ShaderGen: Fixed a case where the TEXS instruction would use the same registers for the input and the output.
It will now save the coords before writing the outputs in a subscope.
2018-04-19 13:33:17 -05:00
Subv
057170928c GPU: Add support for the DXT23 and DXT45 compressed texture formats. 2018-04-18 20:48:53 -05:00
bunnei
60e6e8953e
Merge pull request #351 from Subv/tex_formats
GPU: Implemented the B5G6R5 format.
2018-04-18 20:20:51 -04:00
Subv
2985056340 GPU: Implemented the B5G6R5 format. 2018-04-18 18:16:45 -05:00
bunnei
ce4f159b1c
gl_shader_gen: Support vertical/horizontal viewport flipping. (#347)
* gl_shader_gen: Support vertical/horizontal viewport flipping.

* fixup! gl_shader_gen: Support vertical/horizontal viewport flipping.
2018-04-18 16:42:40 -04:00
Subv
43d98ca8fe GLCache: Added boilerplate code to make supporting configurable texture component types.
For now only the UNORM type is supported.
2018-04-18 14:17:28 -05:00
Subv
5b3fab6766 GLCache: Unify texture and framebuffer formats when converting to OpenGL. 2018-04-18 14:17:28 -05:00
Subv
b2c1672e10 GPU: Texture format 8 and framebuffer format 0xD5 are actually ABGR8. 2018-04-18 14:17:27 -05:00
Subv
48d4efbd69 GPU: Pitch textures are now supported, don't assert when encountering them. 2018-04-18 12:52:53 -05:00
Subv
a3e82e8e1f GLCache: Take into account the texture's block height when caching and unswizzling. 2018-04-18 12:52:53 -05:00
Subv
ac09b5a2e9 GLCache: Added a function to convert cached PixelFormats back to texture formats.
TODO: The way we handle cached formats must change, framebuffer and texture formats are too different to keep them in the same place.
2018-04-18 12:52:52 -05:00
Subv
6b63aaa5b4 GPU: Allow using a configurable block height when unswizzling textures. 2018-04-18 12:52:51 -05:00
Subv
db5f2bfa7e GPU/TIC: Added the pitch and block height fields to the TIC structure. 2018-04-18 11:38:39 -05:00
bunnei
c93ea96366
Merge pull request #346 from bunnei/misc-gpu-improvements
Misc gpu improvements
2018-04-17 22:17:07 -04:00
bunnei
71b4a3b9f6
Merge pull request #344 from bunnei/shader-decompiler-p2
Shader decompiler changes part 2
2018-04-17 22:10:53 -04:00
bunnei
7222d9a4c3 gl_rasterizer_cache: Add missing LOG statements. 2018-04-17 21:44:36 -04:00
bunnei
9df8e924fb texture: Add missing formats. 2018-04-17 21:41:36 -04:00
bunnei
3ed8a1cac7 gpu: Add several framebuffer formats to RenderTargetFormat. 2018-04-17 21:40:38 -04:00
bunnei
4a8eb6745e maxwell3d: Allow Texture2DNoMipmap as Texture2D. 2018-04-17 21:39:15 -04:00
bunnei
531c25386e shader_bytecode: Make ctor's constexpr and explicit. 2018-04-17 21:27:07 -04:00
bunnei
174cba5c58 renderer_opengl: Implement BlendEquation and BlendFunc. 2018-04-17 18:11:48 -04:00
bunnei
1f6fe062ca gl_shader_decompiler: Fix warnings with MarkAsUsed. 2018-04-17 16:36:44 -04:00
bunnei
ed542a7309 gl_shader_decompiler: Cleanup logging, updating to NGLOG_*. 2018-04-17 16:36:44 -04:00
bunnei
ef2d5ab0c1 gl_shader_decompiler: Implement several MUFU subops and abs_d. 2018-04-17 16:36:43 -04:00
bunnei
59f4ff4659 gl_shader_decompiler: Fix swizzle in GetRegister. 2018-04-17 16:36:42 -04:00
bunnei
5a28dce9eb gl_shader_decompiler: Implement FMUL/FADD/FFMA immediate instructions. 2018-04-17 16:36:42 -04:00
bunnei
8d4899d6ea gl_shader_decompiler: Allow vertex position to be used in fragment shader. 2018-04-17 16:36:40 -04:00
bunnei
95144cc39c gl_shader_decompiler: Implement IPA instruction. 2018-04-17 16:36:39 -04:00
bunnei
8b4443c966 gl_shader_decompiler: Add support for TEXS instruction. 2018-04-17 16:36:38 -04:00
bunnei
5ba71369ac gl_shader_decompiler: Use fragment output color for GPR 0-3. 2018-04-17 15:25:54 -04:00
bunnei
5d529698c9 gl_shader_decompiler: Partially implement MUFU. 2018-04-17 15:25:54 -04:00
bunnei
2b082e2710
Merge pull request #343 from Subv/tex_wrap_4
GPU: Implement some wrap modes
2018-04-17 12:25:24 -04:00
Subv
636ad34707 MaxwellToGL: Implemented tex wrap mode 1 (Wrap, GL_REPEAT). 2018-04-17 10:17:18 -05:00
Subv
7fc516cc1a MaxwellToGL: Added a TODO and partial implementation of maxwell wrap mode 4 (Clamp, GL_CLAMP).
This clamp mode was removed from OpenGL as of 3.1, we can emulate it by using GL_CLAMP_TO_BORDER to get the border color of the texture, and then manually sampling the edge to mix them in the fragment shader.
2018-04-17 10:16:50 -05:00
bunnei
77bdc49343 gl_rendering: Use NGLOG* for changed code. 2018-04-16 21:23:28 -04:00
bunnei
1a1af3fda3 gl_rasterizer: Implement indexed vertex mode. 2018-04-16 21:10:15 -04:00
Subv
477aab5960 GPU: Use the same buffer names in the generated GLSL and the buffer uploading code. 2018-04-15 15:02:50 -05:00
Subv
14ac40436e GPU: Don't use explicit binding points when uploading the constbuffers to opengl.
The bindpoints will now be dynamically calculated based on the number of buffers used by the previous shader stage.
2018-04-15 14:14:57 -05:00
Subv
e128e90350 GPU: Don't use GetPointer when uploading the constbuffer data to the GPU. 2018-04-15 11:18:09 -05:00
Subv
7da47da66e GPU: Use the buffer hints from the shader decompiler to upload only the necessary const buffers for each shader stage. 2018-04-15 11:15:54 -05:00
bunnei
73d9c494ea shaders: Expose hints about used const buffers. 2018-04-15 11:50:10 -04:00
Subv
c9b511da08 GPU: Upload the entirety of each constbuffer for each shader stage as SSBOs.
We're going to need the shader generator to give us a mapping of the actual used const buffers to properly bind them to the shader.
2018-04-14 23:02:05 -05:00
Subv
1957640ea2 GPU: Allow configuring ssbos in the opengl state manager. 2018-04-14 22:54:23 -05:00
Subv
ae58e46036 GPU: Added a function to determine whether a shader stage is enabled or not. 2018-04-14 22:54:23 -05:00
bunnei
1b41b875dc shaders: Add NumTextureSamplers const, remove unused #pragma. 2018-04-14 18:50:06 -04:00
bunnei
e6224fec27 shaders: Address PR review feedback. 2018-04-14 16:01:41 -04:00
bunnei
eabeedf6af gl_shader_decompiler: Cleanup log statements. 2018-04-14 16:01:41 -04:00
bunnei
0d408b965b shaders: Fix GCC and clang build issues. 2018-04-14 16:01:40 -04:00
bunnei
86135864da gl_shader_decompiler: Implement negate, abs, etc. and lots of cleanup. 2018-04-14 16:01:40 -04:00
bunnei
7639667562 shader_bytecode: Add FSETP and KIL to GetInfo. 2018-04-14 16:01:40 -04:00
bunnei
5a47832221 shader_bytecode: Add SubOp decoding. 2018-04-14 16:01:40 -04:00
bunnei
50023bdae7 gl_shader_decompiler: Add shader stage hint. 2018-04-14 16:01:39 -04:00
bunnei
a992aac5eb renderer_opengl: Fix Morton copy byteswap, etc. 2018-04-14 16:01:39 -04:00
bunnei
0ca8fce9d0 gl_shader_manager: Implement SetShaderSamplerBindings. 2018-04-13 23:48:30 -04:00
bunnei
beddc8afd2 gl_rasterizer: Generate shaders and upload uniforms. 2018-04-13 23:48:29 -04:00
bunnei
85d77a3d24 gl_shader_decompiler: Basic impl. for very simple vertex shaders.
- Tested with Puyo Puyo Tetris and Cave Story+
2018-04-13 23:48:28 -04:00
bunnei
51f37f5061 gl_shader_manager: Cleanup and consolidate uniform handling. 2018-04-13 23:48:28 -04:00
bunnei
35aca0bf1f maxwell_3d: Make memory_manager public. 2018-04-13 23:48:27 -04:00
bunnei
33bb53571b maxwell_3d: Fix shader_config decodings. 2018-04-13 23:48:26 -04:00
bunnei
5617831d5f gl_rasterizer: Use shader program manager, remove test shader. 2018-04-13 23:48:26 -04:00
bunnei
459826a705 renderer_opengl: Add gl_shader_manager class. 2018-04-13 23:48:25 -04:00
bunnei
8aa21a03b3 maxwell_to_gl: Add a few types, etc. 2018-04-13 23:48:24 -04:00
bunnei
10953495c1 gl_shader_gen: Add hashable setup/config structs. 2018-04-13 23:48:23 -04:00
bunnei
2fcbb35ad2 gl_shader_util: Add missing includes. 2018-04-13 23:48:23 -04:00
bunnei
da1114ca59 renderer_opengl: Use OGLProgram instead of OGLShader. 2018-04-13 23:48:21 -04:00
bunnei
4f2b2d0bc5 gl_shader_util: Grab latest upstream. 2018-04-13 23:48:21 -04:00
bunnei
dbfd106ba0 gl_resource_manager: Grab latest upstream. 2018-04-13 23:48:20 -04:00
bunnei
ed7e597b44 gl_shader_decompiler: Add skeleton code from Citra for shader analysis. 2018-04-13 23:48:20 -04:00
bunnei
4e7e0f8112 shader_bytecode: Add initial module for shader decoding. 2018-04-13 23:48:19 -04:00
James Rowe
0b855f1c21 Fix clang format issues 2018-04-06 22:00:48 -06:00
Subv
dcc27d6dc1 GPU: Assert when finding a texture with a format type other than UNORM. 2018-04-06 20:44:46 -06:00
Subv
b0ca330e14 GL: Set up the textures used for each draw call.
Each Maxwell shader stage can have an arbitrary number of textures, but we're limited to a certain number in OpenGL. We try to only use the minimum amount of host textures by not keeping a 1:1 relation between guest texture ids and host texture ids, ie, guest texture id 8 can be host texture id 0 if it's the only texture used in the guest shader program.
This mapping will have to be passed to the shader decompiler so it can rewrite the texture accesses.
2018-04-06 20:44:46 -06:00
Subv
cb3183212d GL: Bind the textures to the shaders used for drawing. 2018-04-06 20:44:46 -06:00
Subv
65faeb9b2a GLCache: Specialize the MortonCopy function for the DXT1 texture format.
It will now use the UnswizzleTexture function instead of the MortonCopyPixels128, which doesn't seem to work for textures.
2018-04-06 20:44:46 -06:00
Subv
b258403f0d GLCache: Implemented GetTextureSurface. 2018-04-06 20:44:45 -06:00
Subv
65ea52394b GLCache: Support uploading compressed textures to the GPU.
Compressed texture formats like DXT1, DXT2, DXT3, etc will use this to ease the load on the CPU.
2018-04-06 20:44:45 -06:00
Subv
73eaef9c05 GL: Remove remaining references to 3DS-specific pixel formats 2018-04-06 20:44:42 -06:00
Subv
b305646c44 RasterizerCache: Remove 3DS-specific pixel formats.
We're only left with RGB8 and DXT1 for now. More will be added as they are needed.
2018-04-06 20:40:24 -06:00
Subv
c28ed85875 GL: Create the sampler objects when starting up the GL rasterizer. 2018-04-06 20:40:24 -06:00
Subv
ca96b04a0c GL: Ported the SamplerInfo struct from citra. 2018-04-06 20:40:24 -06:00
Subv
0171ec606b GL: Rename PicaTexture to MaxwellTexture. 2018-04-06 20:40:24 -06:00
Subv
f73a280eeb GL: Added functions to convert Maxwell tex filters and wrap modes to OpenGL. 2018-04-06 20:40:23 -06:00
Subv
ad1810e895 Textures: Added a helper function to know if a texture is blocklinear or pitch. 2018-04-06 20:40:23 -06:00
N00byKing
d1d7582a5b
rasterizer_interface.h: Update from citra to yuzu 2018-04-04 23:07:58 +02:00
N00byKing
27dbbd8227
gl_rasterizer_cache.cpp: Update from citra to yuzu 2018-04-04 23:05:10 +02:00
N00byKing
cfc28e0c1a
gl_rasterizer_cache.h: Update from citra to yuzu 2018-04-04 23:04:24 +02:00
N00byKing
ca17f581f5
renderer_opengl.h: Update from citra to yuzu 2018-04-04 23:03:02 +02:00
Subv
11b4ab9685 GPU: Use the MacroInterpreter class to execute the GPU macros instead of HLEing them. 2018-04-01 12:07:26 -05:00
Subv
1ec8d2123d GPU: Implemented a gpu macro interpreter.
The Ryujinx macro interpreter and envydis were used as reference.

Macros are programs that are uploaded by the games during boot and can later be called by writing to their method id in a GPU command buffer.
2018-04-01 12:07:26 -05:00
bunnei
5e343edc9e renderer_opengl: Use better naming for DrawScreens and DrawSingleScreen. 2018-03-26 21:17:07 -04:00
bunnei
c33abac275 gl_rasterizer: Move code to bind framebuffer surfaces before draw to its own function. 2018-03-26 21:17:05 -04:00
bunnei
d30110348b gl_rasterizer: Add a SyncViewport method. 2018-03-26 21:17:04 -04:00
bunnei
67bc2f5ecd gl_rasterizer: Move PrimitiveTopology check to MaxwellToGL. 2018-03-26 21:17:03 -04:00
bunnei
666d53299c graphics_surface: Fix merge conflicts. 2018-03-26 21:17:03 -04:00
bunnei
ac19e3d061 gl_rasterizer: Use ReadBlock instead of GetPointer for SetupVertexArray. 2018-03-26 21:17:02 -04:00
bunnei
a6cab532f8 gl_rasterizer: Normalize vertex array data as appropriate. 2018-03-26 21:17:02 -04:00
bunnei
527ce12ce4 maxwel_to_gl: Fix string formatting in log statements. 2018-03-26 21:17:01 -04:00
bunnei
d89bfec5f5 rasterizer: Rename DrawTriangles to DrawArrays. 2018-03-26 21:17:00 -04:00
bunnei
1bfc0dc2db gl_rasterizer: Use passthrough shader for SetupVertexShader. 2018-03-26 21:17:00 -04:00
bunnei
0a5832798a renderer_opengl: Logging, etc. cleanup. 2018-03-26 21:16:59 -04:00
bunnei
7504df52fc renderer_opengl: Remove framebuffer RasterizerFlushVirtualRegion hack. 2018-03-26 21:16:58 -04:00
bunnei
c1ccbf332f gl_rasterizer_cache: Implement UpdatePagesCachedCount. 2018-03-26 21:16:58 -04:00
bunnei
c2dbdefedf gl_rasterizer: Implement SetupVertexArray. 2018-03-26 21:16:56 -04:00
bunnei
cd8bb6ea9b gl_rasterizer_cache: Fix an ASSERT_MSG. 2018-03-26 21:16:56 -04:00
bunnei
4369af6b7e maxwell_to_gl: Add module and function for decoding VertexType. 2018-03-26 21:16:55 -04:00
bunnei
3754e0fdfd maxwell_3d: Use names that match envytools for VertexType. 2018-03-26 21:16:55 -04:00
bunnei
15925b8293 maxwell_3d: Add VertexAttribute struct and cleanup. 2018-03-26 21:16:54 -04:00
bunnei
0ee38e1363 gl_rasterizer: Use 32 texture units instead of 3. 2018-03-26 21:16:53 -04:00
bunnei
0162a2d5cb gl_rasterizer: Implement DrawTriangles. 2018-03-26 21:16:53 -04:00
bunnei
33c0bf9dc5 Maxwell3D: Call AccelerateDrawBatch on DrawArrays. 2018-03-26 21:16:52 -04:00
bunnei
ed2134784e gl_rasterizer: Implement AnalyzeVertexArray. 2018-03-26 21:16:52 -04:00
bunnei
8041d72a1f gl_rasterizer_cache: MortonCopy Switch-style. 2018-03-26 21:16:51 -04:00
bunnei
170ac3f9ee gl_rasterizer_cache: Implement GetFramebufferSurfaces. 2018-03-26 21:16:51 -04:00
bunnei
94c70693f9 maxwell: Add RenderTargetFormat enum. 2018-03-26 21:16:49 -04:00
bunnei
1a9df83535 renderer_opengl: Only draw the screen if a framebuffer is specified. 2018-03-26 21:16:49 -04:00
Subv
4697025b73 GPU: Load the sampler info (TSC) when retrieving active textures. 2018-03-26 15:46:49 -05:00
Subv
56e2013c1f GPU: Added the TSC structure. It contains information about the sampler. 2018-03-26 15:45:05 -05:00
Subv
6afe9e0105 GPU: Added more fields to the TIC structure. 2018-03-26 15:44:20 -05:00
Subv
0ce52b1da2 GPU: Make the debug_context variable a member of the frontend instead of a global. 2018-03-24 23:35:06 -05:00
Subv
2c785bd06c GPU: Added a function to retrieve the active textures for a shader stage.
TODO: A shader may not use all of these textures at the same time, shader analysis should be performed to determine which textures are actually sampled.
2018-03-24 11:31:53 -05:00
Subv
39e60cfeb1 Frontend: Updated the surface view debug widget to work with Maxwell surfaces. 2018-03-24 11:31:53 -05:00
Subv
1c31e2b3d2 GPU: Implement the Incoming/FinishedPrimitiveBatch debug breakpoints. 2018-03-24 11:31:50 -05:00
Subv
1ad97c75a0 GPU: Implement the MaxwellCommandLoaded/Processed debug breakpoints. 2018-03-24 11:31:50 -05:00
Subv
77fd0d47e7 Frontend: Ported the GPU breakpoints and surface viewer widgets from citra. 2018-03-24 11:31:49 -05:00
Subv
1b8d798835 GPU: Added a method to unswizzle a texture without decoding it.
Allow unswizzling of DXT1 textures.
2018-03-24 11:30:56 -05:00
Subv
71ebc3e90d GPU: Preliminary work for texture decoding. 2018-03-24 11:30:56 -05:00
Subv
9b9de30086 GPU: Added viewport registers to Maxwell3D's reg structure. 2018-03-24 01:22:19 -05:00
bunnei
d561e4acc8 gl_rasterizer: Fake render in green, because it's cooler. 2018-03-23 22:27:53 -04:00
bunnei
4ed54738fc gl_rasterizer: Log warning instead of sync'ing unimplemented funcs. 2018-03-23 22:24:16 -04:00
bunnei
b7da9d5a54 gl_rasterizer_cache: Add missing include for vm_manager. 2018-03-23 16:54:20 -04:00
bunnei
0f8401906b renderer_opengl: Only invalidate the framebuffer region, not flush. 2018-03-23 15:52:14 -04:00
bunnei
054393917e renderer_opengl: Fixes for properly flushing & rendering the framebuffer. 2018-03-23 15:49:04 -04:00
bunnei
b36b627d4d RasterizerCacheOpenGL: FlushAll should flush full memory region. 2018-03-23 15:25:16 -04:00
bunnei
11047d7fd5 rasterizer: Flush and invalidate regions should be 64-bit. 2018-03-23 15:01:45 -04:00
bunnei
cdf541fb5b renderer_opengl: Add framebuffer_transform_flags member variable. 2018-03-23 14:59:14 -04:00
bunnei
ec4e1a3685 renderer_opengl: Better handling of framebuffer transform flags. 2018-03-23 14:58:27 -04:00
bunnei
c2c55e0811 renderer_opengl: Use accelerated framebuffer load with LoadFBToScreenInfo. 2018-03-22 23:28:37 -04:00
bunnei
a0b1235f82 gl_rasterizer: Implement AccelerateDisplay method from Citra. 2018-03-22 23:06:54 -04:00
bunnei
f61b9f7338 LoadGLBuffer: Use bytes_per_pixel, not bits. 2018-03-22 23:01:57 -04:00
bunnei
6ced80bb47 gl_rasterizer_cache: LoadGLBuffer should do a morton copy. 2018-03-22 22:54:04 -04:00
bunnei
740310113b video_core: Move MortonCopyPixels128 to utils header. 2018-03-22 22:52:40 -04:00
bunnei
8a250de987 video_core: Remove usage of PAddr and replace with VAddr. 2018-03-22 21:13:46 -04:00
bunnei
bfe45774f1 video_core: Move FramebufferInfo to FramebufferConfig in GPU. 2018-03-22 21:04:30 -04:00
bunnei
c6362543d4 gl_rasterizer: Replace a bunch of UNIMPLEMENTED with ASSERT. 2018-03-22 20:19:34 -04:00
bunnei
f707c2dac4 gl_rasterizer: Add a simple passthrough shader in lieu of shader generation. 2018-03-22 20:00:41 -04:00
bunnei
7c3a263839 gpu: Expose Maxwell3D engine. 2018-03-22 19:48:20 -04:00
bunnei
3a6604e8fa maxwell_3d: Add some format decodings and string helper functions. 2018-03-22 19:47:28 -04:00
bunnei
656de23d93 renderer: Create rasterizer and cleanup. 2018-03-22 19:46:37 -04:00
Subv
c450d264eb GPU: Added vertex attribute format registers. 2018-03-21 09:26:47 -05:00
Subv
ae28a52277 GPU: Added registers for the number of vertices to render. 2018-03-20 23:28:06 -05:00
bunnei
0b3ab30762
Merge pull request #254 from bunnei/port-citra-renderer
Port Citra OpenGL rasterizer code
2018-03-20 21:37:43 -04:00
bunnei
6e3222363c renderer_gl: Port boilerplate rasterizer code over from Citra. 2018-03-20 00:07:32 -04:00
bunnei
9c468e0c55 gl_shader_util: Sync latest version with Citra. 2018-03-20 00:07:31 -04:00
bunnei
d7b1ebe4a8 renderer_gl: Port over gl_shader_gen module from Citra. 2018-03-20 00:07:30 -04:00
Mat M
f4700ccabf
Merge pull request #253 from Subv/rt_depth
GPU: Added registers for color and Z buffers.
2018-03-19 23:37:47 -04:00
bunnei
4bdb46e4c2 renderer_gl: Port over gl_shader_decompiler module from Citra. 2018-03-19 23:14:03 -04:00
bunnei
a3e10b1a72 renderer_gl: Port over gl_rasterizer_cache module from Citra. 2018-03-19 23:14:03 -04:00
bunnei
db0cfb8e8b gl_resource_manager: Sync latest version with Citra. 2018-03-19 23:14:02 -04:00
bunnei
0e4b9cdde4 renderer_gl: Port over gl_stream_buffer module from Citra. 2018-03-19 23:14:02 -04:00
bunnei
6a0902e56d gl_state: Sync latest version with Citra. 2018-03-19 23:13:49 -04:00
Subv
7a27a11770 GPU: Added Z buffer registers to Maxwell3D's reg structure. 2018-03-19 16:55:33 -05:00
Subv
21d9519032 GPU: Added the render target (RT) registers to Maxwell3D's reg structure. 2018-03-19 16:46:29 -05:00
N00byKing
1d8b6ad13b Clang Fixes 2018-03-19 17:53:35 +01:00
N00byKing
ef875d6a35 Clean Warnings (?) 2018-03-19 17:07:08 +01:00
Subv
dcae0c9a4f GPU: Added the TSC registers to the Maxwell3D register structure. 2018-03-19 00:36:25 -05:00
Subv
cff7b29bba GPU: Added the TIC registers to the Maxwell3D register structure. 2018-03-19 00:32:57 -05:00
Subv
03156d0c9a GPU: Implement macro 0xE1A BindTextureInfoBuffer in HLE.
This macro simply sets the current CB_ADDRESS to the texture buffer address for the input shader stage.
2018-03-18 19:03:40 -05:00
Subv
7b6868e908 GPU: Implement the BindStorageBuffer macro method in HLE.
This macro binds the SSBO Info Buffer as the current ConstBuffer.
This buffer is usually bound to c0 during shader execution.
Games seem to use this macro instead of directly writing the address for some reason.
2018-03-18 16:50:42 -05:00
Subv
85d820b1b4 GPU: Handle writes to the CB_DATA method.
Writing to this method will cause the written value to be stored in the currently-set ConstBuffer plus CB_POS.

This method is usually used to upload uniforms or other shader-visible data.
2018-03-18 15:23:24 -05:00
Subv
a64b936cbe GPU: Move the GPU's class constructor and destructors to a cpp file.
This should reduce recompile times when editing the Maxwell3D register structure.
2018-03-18 15:23:24 -05:00
Subv
aa586fa268 GPU: Store uploaded GPU macros and keep track of the number of method parameters. 2018-03-18 11:51:46 -05:00
Subv
7ac8657432 GPU: Macros are specific to the Maxwell3D engine, so handle them internally. 2018-03-18 11:51:45 -05:00
Subv
ccb8da1512 GPU: Renamed ShaderType to ShaderStage as that is less confusing. 2018-03-17 18:32:57 -05:00
Subv
88698c156f GPU: Store shader constbuffer bindings in the GPU state. 2018-03-17 18:32:57 -05:00
Subv
66dae22790 GPU: Corrected some register offsets and removed superfluous macro registers. 2018-03-17 18:32:56 -05:00
Subv
1d9d9c16e8 GPU: Make the SetShader macro call do the same as the real macro's code.
It'll now set the CB_SIZE, CB_ADDRESS and CB_BIND registers when it's called.

Presumably this SetShader function is binding the constant shader uniforms to buffer 1 (c1[]).
2018-03-17 18:32:55 -05:00
Subv
579000e747 GPU: Corrected the parameter documentation for the SetShader macro call.
Register 0xE24 is actually a macro that sets some shader parameters in the register structure.

Macros are uploaded to the GPU at startup and have their own ISA, we'll probably write an interpreter for this in the future.
2018-03-17 13:55:42 -05:00
bunnei
516ef4f19f
Merge pull request #242 from Subv/set_shader
GPU: Handle the SetShader method call (0xE24) and store the shader config.
2018-03-17 00:34:17 -04:00
Subv
f93d769a1c GPU: Handle the SetShader method call (0xE24) and store the shader config. 2018-03-16 22:51:06 -05:00
Subv
d2888f7e90 GPU: Added the vertex array registers. 2018-03-16 22:47:45 -05:00
bunnei
cd4e8a989c
Merge pull request #241 from Subv/gpu_method_call
GPU: Process command mode 5 (IncreaseOnce) differently from other commands
2018-03-16 22:28:22 -04:00
Subv
29feece4b8 GPU: Process command mode 5 (IncreaseOnce) differently from other commands.
Accumulate all arguments before calling the desired method.

Note: Maybe we should do the same for the NonIncreasing mode?
2018-03-16 20:32:44 -05:00
Subv
bf310a41b8 GPU: Assert that we get a 0 CODE_ADDRESS register in the 3D engine.
Shader address calculation depends on this value to some extent, we do not currently know what it being 0 entails.
2018-03-16 19:24:41 -05:00
Subv
cbec739e7b GPU: Added Maxwell registers for Shader Program control. 2018-03-16 19:23:11 -05:00
Subv
5fb4c718cc GPU: Intercept writes to the VERTEX_END_GL register.
This is the register that gets written after a game calls DrawArrays().

We should collect all GPU state and draw using our graphics API here.
2018-03-04 19:14:04 -05:00
Lioncash
490d0e36a0
maxwell_3d: Make constructor explicit 2018-02-13 23:47:51 -05:00
bunnei
af8ae770ef
Merge pull request #187 from Subv/maxwell3d_query
GPU: Partially implemented the QUERY_* registers in the Maxwell3D engine.
2018-02-13 23:25:07 -05:00
bunnei
be5ba4d952
Merge pull request #178 from Subv/command_buffers
GPU: Added a command processor to decode the GPU pushbuffers and forward the commands to their respective engines
2018-02-12 13:51:52 -05:00
Subv
ac61a7d1e6 GPU: Partially implemented the QUERY_* registers in the Maxwell3D engine.
Only QueryMode::Write is supported at the moment.
2018-02-12 12:34:41 -05:00
Subv
6cddf9d88e Make a GPU class in VideoCore to contain the GPU state.
Also moved the GPU MemoryManager class to video_core since it makes more sense for it to be there.
2018-02-11 23:44:12 -05:00
Subv
e01a8f2187 GPU: Added a command processor to decode the GPU pushbuffers and forward the commands to their respective engines. 2018-02-11 22:42:48 -05:00
bunnei
deadcb39c2 renderer_opengl: Support framebuffer flip vertical. 2018-02-11 21:03:55 -05:00
MerryMage
738f91a57d memory: Replace all memory hooking with Special regions 2018-01-27 15:16:39 +00:00
James Rowe
096be16636 Format: Run the new clang format on everything 2018-01-20 16:45:11 -07:00
Lioncash
e710a1b989 CMakeLists: Derive the source directory grouping from targets themselves
Removes the need to store to separate SRC and HEADER variables, and then
construct the target in most cases.
2018-01-17 21:51:43 -05:00
MerryMage
e35644c005 clang-format 2018-01-16 18:05:21 +00:00
bunnei
92801b1c34 renderer_gl: Clear screen to black before rendering framebuffer. 2018-01-15 00:20:19 -05:00
bunnei
ebd613c2cc renderer: Render previous frame when no new one is available. 2018-01-14 23:54:56 -05:00
MerryMage
e86bdb1601 Fix build on macOS and linux 2018-01-13 22:38:52 +00:00
James Rowe
389979018c Remove gpu debugger and get yuzu qt to compile 2018-01-12 19:11:04 -07:00
James Rowe
1d28b2e142 Remove references to PICA and rasterizers in video_core 2018-01-12 19:11:03 -07:00
bunnei
11adef4843 renderer_opengl: Fix LOG_TRACE in LoadFBToScreenInfo. 2018-01-11 22:32:44 -05:00
bunnei
ee4691297f renderer_opengl: Support rendering Switch framebuffer. 2018-01-10 23:28:59 -05:00
bunnei
236d463c52 render_base: Add a struct describing framebuffer metadata. 2018-01-10 23:28:56 -05:00
bunnei
866e66dc31 renderer_opengl: Add MortonCopyPixels function for Switch framebuffer. 2018-01-10 23:28:53 -05:00
bunnei
9e2ad45c98 renderer_opengl: Update DrawScreens for Switch. 2018-01-10 23:28:49 -05:00
bunnei
93480b10ef core/video_core: Fix a bunch of u64 -> u32 warnings. 2018-01-01 15:40:35 -05:00
bunnei
960a1416de hle: Initial implementation of NX service framework and IPC. 2017-10-14 22:18:42 -04:00
Huw Pascoe
b3b34a1e76 Extracted the attribute setup and draw commands into their own functions 2017-10-04 01:08:29 +01:00
Huw Pascoe
a13ab958cb Fixed type conversion ambiguity 2017-09-30 09:34:35 +01:00
Subv
a321bce378 Disable unary operator- on Math::Vec2/Vec3/Vec4 for unsigned types.
It is unlikely we will ever use this without first doing a Cast to a signed type.
Fixes 9 "unary minus operator applied to unsigned type, result still unsigned" warnings on MSVC2017.3
2017-09-27 09:06:41 -05:00
B3n30
dc6a365337 Merge pull request #2951 from huwpascoe/perf-4
Optimized Morton
2017-09-25 08:28:55 +02:00
Huw Pascoe
903906da3b Optimized Float<M,E> multiplication
Before:

ucomiss xmm1, xmm1
jp      .L9
pxor    xmm2, xmm2
mov     edx, 1
ucomiss xmm0, xmm2
setp    al
cmovne  eax, edx
test    al, al
jne     .L9
.L3:
movaps  xmm0, xmm2
ret
.L9:
ucomiss xmm0, xmm0
jp      .L10
pxor    xmm2, xmm2
mov     edx, 1
ucomiss xmm1, xmm2
setp    al
cmovne  eax, edx
test    al, al
je      .L3

After:

movaps  xmm2, xmm1
mulss   xmm2, xmm0
ucomiss xmm2, xmm2
jnp     .L3
ucomiss xmm1, xmm0
jnp     .L11
.L3:
movaps  xmm0, xmm2
ret
.L11:
pxor    xmm2, xmm2
jmp     .L3
2017-09-25 00:54:02 +01:00
Huw Pascoe
876aa82c29 Optimized Morton 2017-09-24 22:27:14 +01:00
James Rowe
93930a966f Merge pull request #2921 from jroweboy/batch-fix-2
GPU: Add draw for immediate and batch modes
2017-09-24 07:57:16 -06:00
James Rowe
19d41dcc6e Remove pipeline.gpu_mode and fix minor issues 2017-09-23 09:28:20 -06:00
Yuri Kunde Schlesner
a7758b0b36 Merge pull request #2928 from huwpascoe/master
Fixed framebuffer warning
2017-09-22 04:06:38 +02:00
Huw Pascoe
a234e4c200 Improved performance of FromAttributeBuffer
Ternary operator is optimized by the compiler
whereas std::min() is meant to return a value.

I've noticed a 5%-10% emulation speed increase.
2017-09-17 15:56:36 +01:00
Huw Pascoe
6a110ac5f5 Fixed framebuffer warning 2017-09-17 11:57:06 +01:00
Yuri Kunde Schlesner
699c920991 Merge pull request #2900 from wwylele/clip-2
PICA: implement custom clip plane
2017-09-16 10:23:00 +02:00
James Rowe
ad0b57f407 GPU: Add draw for immediate and batch modes
PR #1461 introduced a regression where some games would change configuration
even while in the poorly named "drawing" mode, which broke the heuristic
citra was using to determine when to draw the batch. This change adds
back in a draw call for batching, and also adds in a draw call in
immediate mode each time it adds a triangle.
2017-09-11 09:21:43 -06:00
bunnei
11baa40d75 Merge pull request #2865 from wwylele/gs++
PICA: implemented geometry shader
2017-09-07 23:02:59 -04:00
bunnei
ff4941fb3a Merge pull request #2914 from wwylele/fresnel-fix
pica/lighting: only apply Fresnel factor for the last light
2017-09-05 10:00:49 -04:00
wwylele
12fbc8c8df pica/lighting: only apply Fresnel factor for the last light 2017-09-03 08:22:03 +03:00
wwylele
e2c41a5891 video_core: report telemetry for gas mode 2017-08-31 12:54:17 +03:00
bunnei
f0e461bf6f Merge pull request #2891 from wwylele/sw-bump
SwRasterizer/Lighting: implement bump mapping
2017-08-30 21:07:30 -04:00
Weiyi Wang
647f017c6d Merge pull request #2892 from Subv/warnings2
Warnings: Fixed a few missing-return warnings in video_core.
2017-08-28 03:21:51 -05:00
Subv
da88f3b8f0 Warnings: Fixed a few missing-return warnings in video_core. 2017-08-26 11:58:22 -05:00
wwylele
417cb45e3f SwRasterizer/Clipper: flip the sign convention to match PICA and OpenGL 2017-08-25 07:26:45 +03:00
wwylele
addbcd5784 gl_rasterizer: implement custom clip plane 2017-08-25 07:26:45 +03:00
wwylele
ea51a3af26 SwRasterizer: implement custom clip plane 2017-08-24 15:34:27 +03:00
wwylele
17c6104d2a gl_rasterizer/lighting: more accurate CP formula 2017-08-22 09:34:44 +03:00
wwylele
b5aa570354 SwRasterizer/Lighting: implement LUT input CP 2017-08-22 09:34:44 +03:00
wwylele
3e478ca131 SwRasterizer/Lighting: implement bump mapping 2017-08-22 09:34:44 +03:00
wwylele
63b6e802cd swrasterizer: remove invalid TODO
This function is called in clipping, before the pespective divide, and is not used in later rasterization. Thus it doesn't need perspective correction.
2017-08-21 08:03:07 +03:00
wwylele
72b26ac32f swrasterizer/clipper: remove tested TODO
hwtested. Current implementation is the correct behavior
2017-08-21 08:03:07 +03:00
wwylele
5a4af616c6 gl_shader_gen: simplify and clarify the depth transformation between vertex shader and fragment shader 2017-08-21 08:03:07 +03:00
wwylele
1eca380886 gl_rasterizer: add clipping plane z<=0 defined in PICA 2017-08-21 08:03:07 +03:00
Yuri Kunde Schlesner
46d1ca768d Merge pull request #2872 from wwylele/sw-geo-factor
SwRasterizer/Lighting: implement geometric factor
2017-08-20 17:49:42 -07:00
James Rowe
8afa81ac1b Merge pull request #2871 from wwylele/sw-spotlight
SwRasterizer/Lighting: implement spot light
2017-08-19 20:10:24 -06:00
wwylele
0f35755572 pica/command_processor: build geometry pipeline and run geometry shader
The geometry pipeline manages data transfer between VS, GS and primitive assembler. It has known four modes:
 - no GS mode: sends VS output directly to the primitive assembler (what citra currently does)
 - GS mode 0: sends VS output to GS input registers, and sends GS output to primitive assembler
 - GS mode 1: sends VS output to GS uniform registers, and sends GS output to primitive assembler. It also takes an index from the index buffer at the beginning of each primitive for determine the primitive size.
 - GS mode 2: similar to mode 1, but doesn't take the index and uses a fixed primitive size.
hwtest shows that immediate mode also supports GS (at least for mode 0), so the geometry pipeline gets refactored into its own class for supporting both drawing mode.
In the immediate mode, some games don't set the pipeline registers to a valid value until the first attribute input, so a geometry pipeline reset flag is set in `pipeline.vs_default_attributes_setup.index` trigger, and the actual pipeline reconfigure is triggered in the first attribute input.
In the normal drawing mode with index buffer, the vertex cache is a little bit modified to support the geometry pipeline. Instead of OutputVertex, it now holds AttributeBuffer, which is the input to the geometry pipeline. The AttributeBuffer->OutputVertex conversion is done inside the pipeline vertex handler. The actual hardware vertex cache is believed to be implemented in a similar way (because this is the only way that makes sense).
Both geometry pipeline and GS unit rely on states preservation across drawing call, so they are put into the global state. In the future, the other three vertex shader units should be also placed in the global state, and a scheduler should be implemented on top of the four units. Note that the current gs_unit already allows running VS on it in the future.
2017-08-19 10:13:20 +03:00
wwylele
8285ca4ad8 pica/shader/jit: implement SETEMIT and EMIT 2017-08-19 10:13:20 +03:00
wwylele
36981a5aa6 pica/primitive_assembly: Handle winding for GS primitive
hwtest shows that, although GS always emit a group of three vertices as one primitive, it still respects to the topology type, as if the three vertices are input into the primitive assembler independently and sequentially. It is also shown that the winding flag in SETEMIT only takes effect for Shader topology type, which is believed to be the actual difference between List and Shader (hence removed the TODO). However, only Shader topology type is observed in official games when GS is in use, so the other mode seems to be just unintended usage.
2017-08-19 10:13:20 +03:00
wwylele
bb63ae3052 correct constness 2017-08-19 10:13:20 +03:00
wwylele
28128348f2 pica/shader/interpreter: implement SETEMIT and EMIT 2017-08-19 10:13:20 +03:00
wwylele
46c6973d2b pica/shader: extend UnitState for GS
Among four shader units in pica, a special unit can be configured to run both VS and GS program. GSUnitState represents this unit, which extends UnitState (which represents the other three normal units) with extra state for primitive emitting. It uses lots of raw pointers to represent internal structure in order to keep it standard layout type for JIT to access.
This unit doesn't handle triangle winding (inverting) itself; instead, it calls a WindingSetter handler. This will be explained in the following commits
2017-08-19 10:13:20 +03:00
wwylele
686fb3e78c gl_shader_gen: don't call SampleTexture when bump map is not used 2017-08-11 18:35:00 +03:00
wwylele
945f9a1b04 SwRasterizer/Lighting: implement spot light 2017-08-11 01:19:10 +03:00
wwylele
14ee32c46a SwRasterizer/Lighting: implement geometric factor 2017-08-11 01:18:43 +03:00
wwylele
5d9d42f0d0 SwRasterizer/Lighting: use make_tuple instead of constructor
implicit tuple constructor is a c++17 thing, which is not supported by some not-so-old libraries. Play safe for now
2017-08-10 12:19:58 +03:00
wwylele
db309b2423 pica/regs: layout geometry shader configuration regs
All the register meanings are derived from ctrulib (3dbrew is outdated for most of them)
2017-08-10 01:53:08 +03:00
Weiyi Wang
792dee47a7 Merge pull request #2822 from wwylele/sw_lighting-2
Implement fragment lighting in the sw renderer (take 2)
2017-08-09 18:54:29 +03:00
wwylele
baa24f4ea9 pica: upload shared shader code to both unit 2017-08-07 10:30:05 +03:00
wwylele
2252a63f80 SwRasterizer/Lighting: shorten file name 2017-08-03 13:51:22 +03:00
wwylele
eda28266fb SwRasterizer/Lighting: move to its own file 2017-08-02 22:20:40 +03:00
wwylele
48b4105871 SwRasterizer/Lighting: reduce confusion 2017-08-02 22:07:15 +03:00
wwylele
c59ed47608 SwRasterizer/Lighting: move quaternion normalization to the caller 2017-08-02 22:05:53 +03:00
wwylele
c89f804a01 pica/shader_interpreter: fix off-by-one in LOOP 2017-07-27 13:48:27 +03:00
Sebastian Valle
c6a2e519ef Merge pull request #2816 from wwylele/proctex-lutlutlut
gl_rasterizer: use texture buffer for proctex LUT
2017-07-22 23:03:48 -05:00
Sebastian Valle
e646bd902d Merge pull request #2834 from wwylele/depth-enable-fix
gl_rasterizer_cache: fix using_depth_fb
2017-07-22 23:02:59 -05:00
bunnei
df8b9863f9 telemetry: Log performance, configuration, and system data. 2017-07-17 21:32:28 -04:00
wwylele
4feff63ffa SwRasterizer/Lighting: dist atten lut input need to be clamp 2017-07-11 22:19:00 +03:00
wwylele
56e5425e59 SwRasterizer/Lighting: unify float suffix 2017-07-11 22:15:35 +03:00
wwylele
e415558a4f SwRasterizer/Lighting: get rid of nested return 2017-07-11 22:15:35 +03:00
wwylele
c6d1472513 SwRasterizer/Lighting: refactor GetLutValue into a function.
merging similar pattern. Also makes the code more similar to the gl one
2017-07-11 22:15:35 +03:00
wwylele
f13cf506e0 SwRasterizer: only interpolate quat and view when lighting is enabled 2017-07-11 21:35:57 +03:00
wwylele
efc655aec0 SwRasterizer/Lighting: pass lighting state as parameter 2017-07-11 20:06:26 +03:00
Subv
9906feefbd SwRasterizer/Lighting: Move the clamp highlight calculation to the end of the per-light loop body. 2017-07-11 19:39:15 +03:00
Subv
7526af5e52 SwRasterizer/Lighting: Move the lighting enable check outside the ComputeFragmentsColors function. 2017-07-11 19:39:15 +03:00
Subv
b8229a7684 SwRasterizer/Lighting: Do not use global registers state in ComputeFragmentsColors. 2017-07-11 19:39:15 +03:00
Subv
7bc467e872 SwRasterizer/Lighting: Do not use global state in LookupLightingLut. 2017-07-11 19:39:15 +03:00
Subv
37ac2b6657 SwRasterizer/Lighting: Fixed a bug where the distance attenuation bias was being set to the dist atten scale. 2017-07-11 19:39:15 +03:00
Subv
6250f52e93 SwRasterizer: Fixed a few conversion warnings and moved per-light values into the per-light loop. 2017-07-11 19:39:15 +03:00
Subv
2d69a9b8bf SwRasterizer: Run clang-format 2017-07-11 19:39:15 +03:00
Subv
73566ff7a9 SwRasterizer: Flip the vertex quaternions before clipping (if necessary). 2017-07-11 19:39:15 +03:00
Subv
2a75837bc3 SwRasterizer: Corrected the light LUT lookups. 2017-07-11 19:39:15 +03:00
Subv
f2d4d5c219 SwRasterizer: Corrected the light LUT lookups. 2017-07-11 19:39:15 +03:00
Subv
80b6fc592e SwRasterizer: Fixed the lighting lut lookup function. 2017-07-11 19:39:15 +03:00
Subv
10b0bea060 SwRasterizer: Calculate fresnel for fragment lighting. 2017-07-11 19:39:15 +03:00
Subv
46b8c8e1da SwRasterizer: Calculate specular_1 for fragment lighting. 2017-07-11 19:39:15 +03:00
Subv
be25e78b07 SwRasterizer: Calculate specular_0 for fragment lighting. 2017-07-11 19:39:15 +03:00
Subv
b2f472a2b1 SwRasterizer: Implement primary fragment color. 2017-07-11 19:39:15 +03:00
wwylele
8482933db8 gl_rasterizer: use texture buffer for proctex LUT 2017-07-01 11:02:48 +03:00
wwylele
8978ecb09c gl_rasterizer: use texture buffer for fog LUT 2017-06-22 20:41:00 +03:00
wwylele
f1e377f57e gl_rasterizer: create the texture before applying the state
this is a rebasing error from #2792. It doesn't affect much though, because the later more Apply() call fixes/hides it
2017-06-22 17:47:46 +03:00
wwylele
457659fe01 gl_state: reset 1d textures 2017-06-21 23:13:06 +03:00
wwylele
42f7ca7412 gl_rasterizer: fix glGetUniformLocation type 2017-06-21 23:13:06 +03:00
wwylele
be9e952bdc gl_rasterizer: manage texture ids in one place 2017-06-21 23:13:06 +03:00
wwylele
ab60414122 gl_rasterizer/lighting: fix LUT interpolation 2017-06-21 23:13:06 +03:00
Yuri Kunde Schlesner
d0888f8548 Merge pull request #2776 from wwylele/geo-factor
Fragment lighting: implement geometric factor
2017-06-18 14:18:48 -07:00
wwylele
5a454173a8 gl_rasterizer/lighting: use the formula from the paper for germetic factor 2017-06-18 10:29:02 +03:00
Yuri Kunde Schlesner
f6715f98f5 Stop using reserved operator names (and/or/xor) with Xbyak
Also has the Dynarmic upgrade with the same change
2017-06-17 12:20:22 -07:00
wwylele
7052d43a67 gl_rasterizer/lighting: implement geometric factor 2017-06-15 14:59:01 +03:00
Yuri Kunde Schlesner
da1bec121a Merge pull request #2762 from wwylele/light-cp-tangent
Fragment lighting: implement lut input 5 (CP) and tangent mapping
2017-06-14 20:08:26 -07:00
Yuri Kunde Schlesner
5fe5ccac42 Merge pull request #2743 from wwylele/wrap-fix
pica/rasterizer: implement/stub texture wrap mode 4-7
2017-06-13 21:28:12 -07:00
Yuri Kunde Schlesner
791cd14c8d Merge pull request #2767 from yuriks/quaternion-flip-comment
OpenGL: Update comment on AreQuaternionsOpposite with new information
2017-06-12 16:31:55 -07:00
wwylele
972548e3ee gl_rasterizer/lighting: Implement tangent mapping 2017-06-11 21:30:53 +03:00
wwylele
40b7d0bf3f gl_rasterizer/lighting: implement lut input 5 (CP) 2017-06-11 21:30:53 +03:00
Sebastian Valle
39c7c1f580 Merge pull request #2727 from wwylele/spot-light
Fragment lighting: implement spot light
2017-06-11 18:23:47 +00:00
wwylele
b3b9468573 gl_rasterizer_cache: depth write is disabled if allow_depth_stencil_write is false 2017-06-10 15:10:34 +03:00
Yuri Kunde Schlesner
ba01a8302a OpenGL: Update comment on AreQuaternionsOpposite with new information
While debugging the software renderer implementation, it was noticed
that this is actually exactly what the hardware does, upgrading the
status of this "hack" to being a proper implementation. And there was
much rejoicing.
2017-06-10 01:55:17 -07:00
wwylele
28d1e73d2f pica/rasterizer: implement/stub texture wrap mode 4-7 2017-06-04 09:47:25 +03:00
bunnei
54ea95cca7 Merge pull request #2721 from wwylele/texture-cube
swrasterizer: implemented TextureCube
2017-05-30 10:21:05 -04:00
wwylele
10906dceec gl_rasterizer: implement spot light 2017-05-30 10:54:58 +03:00
wwylele
686cbf3ac6 gl_rasterizer: sync spot light status 2017-05-30 10:54:58 +03:00
wwylele
b5addf8fb8 pica: prepare registers for spotlight 2017-05-30 10:54:58 +03:00
Yuri Kunde Schlesner
a4f88c7d7c Merge pull request #2734 from yuriks/cmake-imported-libs
CMake: Use CMake target properties for all libraries
2017-05-29 15:12:21 -07:00
wwylele
0b9bb082c3 swrasterizer: implement TextureCube 2017-05-29 22:28:48 +03:00
wwylele
077cc683e5 pica: add registers for texture cube 2017-05-29 22:03:08 +03:00
Yuri Kunde Schlesner
3df85a103a Merge pull request #2729 from yuriks/quaternion-fix
OpenGL: Improve accuracy of quaternion interpolation
2017-05-28 01:24:06 -07:00
Yuri Kunde Schlesner
d736cca848 CMake: Create INTERFACE targets for microprofile and nihstro 2017-05-27 22:34:52 -07:00
Yuri Kunde Schlesner
4660bc1c78 CMake: Use IMPORTED target for libpng 2017-05-27 20:44:51 -07:00
Yuri Kunde Schlesner
7b81903756 CMake: Correct inter-module dependencies and library visibility
Modules didn't correctly define their dependencies before, which relied
on the frontends implicitly including every module for linking to
succeed.

Also changed every target_link_libraries call to specify visibility of
dependencies to avoid leaking definitions to dependents when not
necessary.
2017-05-27 18:41:24 -07:00
Yuri Kunde Schlesner
eb10f25025 Move screen size constants from video_core to core
video_core didn't even properly use them, and they were the source of
many otherwise-unnecessary dependencies from core to video_core.
2017-05-27 18:41:24 -07:00
Yuri Kunde Schlesner
6665557ff7 OpenGL: Remove unused RendererOpenGL fields 2017-05-27 18:02:46 -07:00
Yuri Kunde Schlesner
669ef82aee OpenGL: Improve accuracy of quaternion interpolation
Current order of operations (rotate then normalize) seems to produce a
lot more distortion than normalizing and then rotating. This makes Citra
results match pretty closesly with hardware, and indicates that hardware
may also be using lerp instead of slerp to interpolate the quaternions.
2017-05-27 00:13:41 -07:00
wwylele
90c8d09098 gl_shader: refactor texture sampler into its own function 2017-05-27 01:56:22 +03:00
Yuri Kunde Schlesner
bae3799bd5 Merge pull request #2697 from wwylele/proctex
Implemented Procedural Texture (Texture Unit 3)
2017-05-24 21:37:42 -07:00
wwylele
36526c63ef swrasterizer: add missing tc0_w and fragment lighting attribute processing 2017-05-21 09:09:15 +03:00
wwylele
4d62e75fb2 gl_rasterizer: implement procedural texture 2017-05-20 13:50:50 +03:00
wwylele
ade45b5b99 pica/swrasterizer: implement procedural texture 2017-05-20 13:50:50 +03:00
wwylele
393fee10a2 pica: use correct register value for shader bool_uniforms
variable value is not masked. the masked and combined register value should be used instead
2017-05-17 22:14:09 +03:00
Yuri Kunde Schlesner
8d558777a6 Merge pull request #2703 from wwylele/pica-reg-revise
pica: correct bit field length for some registers
2017-05-16 10:00:37 -07:00
wwylele
86ee1f6101 pica: correct bit field length for some registers 2017-05-16 19:24:06 +03:00
Jannik Vogel
ba722be2ac Pica: Write GS registers
This adds the handlers for the geometry shader register writes which will call the functions from the previous commit to update registers for the GS.
2017-05-12 16:22:37 +02:00
Jannik Vogel
3fd3775d35 Pica: Write shader registers in functions
The commit after this one adds GS register writes, so this moves the VS handlers into functions so they can be re-used and extended more easily.
2017-05-12 16:22:37 +02:00
Jannik Vogel
925724c990 Pica: Set program code / swizzle data limit to 4096
One of the later commits will enable writing to GS regs.
It turns out that on startup, most games will write 4096 GS program words.

The current limit of 1024 would hence result in 3072 (4096 - 1024) error messages:
```
HW.GPU <Error> video_core/shader/shader.cpp:WriteProgramCode:229: Invalid GS program offset 1024
```

New constants have been introduced to represent these limits.
The swizzle data size has also been raised. This matches the given field sizes of [GPUREG_SH_OPDESCS_INDEX](https://3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_OPDESCS_INDEX) and [GPUREG_SH_CODETRANSFER_INDEX](https://www.3dbrew.org/wiki/GPU/Internal_Registers#GPUREG_SH_CODETRANSFER_INDEX) (12 bit = [0; 4095]).
2017-05-11 15:01:27 +02:00
wwylele
039b293092 pica: shader_dirty if texture2 coord changed 2017-05-05 15:35:17 +03:00
wwylele
0f664ef89d pica: use correct coordinates for texture 2 2017-05-03 22:12:46 +03:00
bunnei
ea53d6085a Merge pull request #2671 from wwylele/dot3-rgba
rasterizer: implement combiner operation 7 (Dot3_RGBA)
2017-04-21 17:03:22 -04:00
wwylele
2c2e872b31 gl_shader_gen: remove TODO about Lerp behaviour verification. The implementation is verified against hardware 2017-04-20 22:56:07 +03:00
wwylele
b624a95205 rasterizer: implement combiner operation 7 (Dot3_RGBA) 2017-04-19 23:48:10 +03:00
Yuri Kunde Schlesner
52a4489d65 OpenGL: Pass Pica regs via parameter 2017-04-17 10:34:45 -07:00
Yuri Kunde Schlesner
a6fd4533f6 OpenGL: Move PicaShaderConfig to gl_shader_gen.h
Also move the implementation of CurrentConfig to the cpp file.
2017-04-16 21:49:32 -07:00
Yuri Kunde Schlesner
40e28f6217 OpenGL: Move Attributes enum to a more appropriate file 2017-04-16 20:47:04 -07:00
Jannik Vogel
1b397c77fa Pica/Regs: Correct bit width for blend-equations 2017-04-08 18:33:17 +02:00
wwylele
e02c4b7195 Input: remove unused stuff & clean up
1. removed zl, zr and c-stick from HID::PadState. They are handled by IR, not HID
2. removed button handling in EmuWindow
3. removed key_map
4. cleanup #include
2017-03-01 23:30:57 +02:00
Mat M
0cb52ee74a Doxygen: Amend minor issues (#2593)
Corrects a few issues with regards to Doxygen documentation, for example:

- Incorrect parameter referencing.
- Missing @param tags.
- Typos in @param tags.

and a few minor other issues.
2017-02-26 17:58:51 -08:00
Yuri Kunde Schlesner
fb1979d7e2 Core: Re-write frame limiter
Now based on std::chrono, and also works in terms of emulated time
instead of frames, so we can in the future frame-limit even when the
display is disabled, etc.

The frame limiter can also be enabled along with v-sync now, which
should be useful for those with displays running at more than 60 Hz.
2017-02-26 17:22:04 -08:00
Yuri Kunde Schlesner
b285c2a4ed Core: Make PerfStats internally locked
More ergonomic to use and will be required for upcoming changes.
2017-02-26 17:22:03 -08:00
Yuri Kunde Schlesner
3b4e400333 Remove built-in (non-Microprofile) profiler 2017-02-26 17:22:03 -08:00
Yuri Kunde Schlesner
c75ae6c585 Add performance statistics to status bar 2017-02-26 17:22:03 -08:00
Jannik Vogel
e594e63bb5 OpenGL: Check if uniform block exists before updating it (#2581) 2017-02-18 11:46:26 -08:00
Weiyi Wang
e085e6a768 video_core: remove #pragma once in cpp file (#2570) 2017-02-15 00:16:50 -08:00
Yuri Kunde Schlesner
426fda1d52 SWRasterizer: Move more framebuffer functions to file 2017-02-12 18:13:04 -08:00
Yuri Kunde Schlesner
1683cb0ec9 SWRasterizer: Move texturing functions to their own file 2017-02-12 18:12:37 -08:00
Yuri Kunde Schlesner
f9026e8a7a SWRasterizer: Convert large no-capture lambdas to standalone functions 2017-02-12 18:11:05 -08:00
Yuri Kunde Schlesner
e1ad7d69b9 SWRasterizer: Move framebuffer operation functions to their own file 2017-02-12 18:11:03 -08:00
Yuri Kunde Schlesner
e24717bca0 VideoCore: Move software rasterizer files to sub-directory 2017-02-12 18:08:11 -08:00
Yuri Kunde Schlesner
e10b11a5d0 video_core/shader: Document sanitized MUL operation 2017-02-12 13:29:14 -08:00
Yuri Kunde Schlesner
443bb3d522 Merge pull request #2550 from yuriks/pica-refactor2
Small VideoCore cleanups
2017-02-12 12:33:26 -08:00
Yuri Kunde Schlesner
e2fa1ca5e1 video_core: Fix benign out-of-bounds indexing of array (#2553)
The resulting pointer wasn't written to unless the index was verified as
valid, but that's still UB and triggered debug checks in MSVC.

Reported by garrettboast on IRC
2017-02-10 20:51:09 -08:00
Yuri Kunde Schlesner
553e672777 VideoCore: Split u64 Pica reg unions into 2 separate u32 unions
This eliminates UB when aliasing it with the array of u32 regs, and
is compatible with non-LE architectures.
2017-02-09 00:04:25 -08:00
Yuri Kunde Schlesner
bfb1531352 VideoCore: Force enum sizes to u32 in LightingRegs
All enums that are used with BitField must have their type forced to u32
to ensure correctness.
2017-02-09 00:04:24 -08:00
Yuri Kunde Schlesner
af65e1c0a0 OpenGL: Remove unused duplicate of IsPassThroughTevStage
This copy was left behind when the shader generation code was moved to a
separate file.
2017-02-09 00:04:24 -08:00
Yuri Kunde Schlesner
60fc0b086f VideoCore: Split regs.h inclusions 2017-02-09 00:04:24 -08:00
Yuri Kunde Schlesner
f241bb72f5 Pica/Regs: Use binary search to look up reg names
This gets rid of the static unordered_map. Also changes the return type
const char*, avoiding unnecessary allocations (the result was only used
by calling .c_str() on it.)
2017-02-09 00:04:24 -08:00
Yuri Kunde Schlesner
602f57da38 VideoCore: Use union to index into Regs struct
Also remove some unused members.
2017-02-08 22:13:25 -08:00
Yuri Kunde Schlesner
2889372e47 Merge pull request #2482 from yuriks/pica-refactor
Split up monolithic Regs struct
2017-02-08 22:07:34 -08:00
Lectem
f146a6d45a Use std::array<u8,2> instead of u8[2] to fix MSVC build 2017-02-05 14:55:51 +01:00
Yuri Kunde Schlesner
5759d94b5c VideoCore: Move Regs to its own file 2017-02-04 13:59:12 -08:00
Yuri Kunde Schlesner
f7c7f422c6 VideoCore: Split shader regs from Regs struct 2017-02-04 13:59:11 -08:00
Yuri Kunde Schlesner
8fca90b5d5 VideoCore: Split geometry pipeline regs from Regs struct 2017-02-04 13:59:11 -08:00
Yuri Kunde Schlesner
f443c7e5b0 VideoCore: Split lighting regs from Regs struct 2017-02-04 13:59:11 -08:00
Yuri Kunde Schlesner
23713d5dee VideoCore: Split framebuffer regs from Regs struct 2017-02-04 13:59:11 -08:00
Yuri Kunde Schlesner
9017093f58 VideoCore: Split texturing regs from Regs struct 2017-02-04 13:59:09 -08:00
Yuri Kunde Schlesner
000e78144c VideoCore: Split rasterizer regs from Regs struct 2017-02-04 13:08:47 -08:00
Yuri Kunde Schlesner
97e06b0a0d Merge pull request #2476 from yuriks/shader-refactor3
Oh No! More shader changes!
2017-02-04 13:02:48 -08:00
Yuri Kunde Schlesner
c74787a11c Pica/Texture: Move part of ETC1 decoding to new file and cleanups 2017-02-04 12:33:28 -08:00
Yuri Kunde Schlesner
09a750e866 Pica/Texture: Simplify/cleanup texture tile addressing 2017-02-04 12:33:25 -08:00
Yuri Kunde Schlesner
a1c9ac7845 VideoCore: Move LookupTexture out of debug_utils.h 2017-02-04 12:31:40 -08:00
wwylele
6dc1d6e568 ShaderJIT: add 16 dummy bytes at the bottom of the stack 2017-02-03 14:53:38 +02:00
Weiyi Wang
0b9c59ff22 Common/x64: remove legacy emitter and abi (#2504)
These are not used any more since we moved shader JIT to xbyak.
2017-01-31 01:06:42 -08:00
Merry
f7e96dc068 shader_jit_x64_compiler: esi and edi should be persistent (#2500) 2017-01-31 00:38:31 -08:00
Yuri Kunde Schlesner
37a4ea046d VideoCore: Make PrimitiveAssembler const-correct 2017-01-29 21:31:38 -08:00
Yuri Kunde Schlesner
dcdffabfe6 VideoCore: Extract swrast-specific data from OutputVertex 2017-01-29 21:31:38 -08:00
Yuri Kunde Schlesner
8ed9f9d49f VideoCore/Shader: Clean up OutputVertex::FromAttributeBuffer
This also fixes a long-standing but neverthless harmless memory
corruption bug, whech the padding of the OutputVertex struct would get
corrupted by unused attributes.
2017-01-29 21:31:38 -08:00
Yuri Kunde Schlesner
92bf5c88e6 VideoCore: Split shader output writing from semantic loading 2017-01-29 21:31:37 -08:00
Yuri Kunde Schlesner
335df895b9 VideoCore: Consistently use shader configuration to load attributes 2017-01-29 21:31:37 -08:00
Yuri Kunde Schlesner
fccb28d2e9 VideoCore: Use correct register for immediate mode attribute count 2017-01-29 21:31:36 -08:00
Yuri Kunde Schlesner
ab6954e942 VideoCore: Rename some types to more accurate names 2017-01-29 21:31:36 -08:00
Yuri Kunde Schlesner
bbc7844021 VideoCore: Change misleading register names
A few registers had names such as "count" or "number" when they actually
contained the maximum (that is, count - 1). This can easily lead to hard
to notice off by one errors.
2017-01-29 21:31:36 -08:00
Kloen
eee37b857b video_core: gl_rasterizer_cache.cpp removed unused type alias 2017-01-30 05:18:28 +01:00
Kloen
6a3a3964b0 video_core: gl_rasterizer.cpp removed unused type alias 2017-01-30 05:16:48 +01:00
Kloen
4652d70572 video_core: silence unused-local-typedef boost related warning on GCC 2017-01-29 21:24:24 +01:00
Yuri Kunde Schlesner
0e9081b973 VideoCore/Shader: Move entry_point to SetupBatch 2017-01-25 18:53:25 -08:00
Yuri Kunde Schlesner
0f64274145 VideoCore/Shader: Move per-batch ShaderEngine state into ShaderSetup 2017-01-25 18:53:25 -08:00
Yuri Kunde Schlesner
6fa3687afc Shader: Remove OutputRegisters struct 2017-01-25 18:53:25 -08:00
Yuri Kunde Schlesner
9ea5eacf91 Shader: Initialize conditional_code in interpreter
This doesn't belong in LoadInputVertex because it also happens for
non-VS invocations. Since it's not used by the JIT it seems adequate to
initialize it in the interpreter which is the only thing that cares
about them.
2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
1a2acc3baa Shader: Don't read ShaderSetup from global state 2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
fa4ac279a7 shader_jit_x64: Don't read program from global state 2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
ade7ed7c5f VideoCore/Shader: Move ProduceDebugInfo to InterpreterEngine 2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
114d6b2f97 VideoCore/Shader: Split interpreter and JIT into separate ShaderEngines 2017-01-25 18:53:24 -08:00
Yuri Kunde Schlesner
8eefc62833 VideoCore/Shader: Rename shader_jit_x64{ => _compiler}.{cpp,h} 2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
dd4a1672a7 VideoCore/Shader: Split shader uniform state and shader engine
Currently there's only a single dummy implementation, which will be
split in a following commit.
2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
bd82cffd0b VideoCore/Shader: Add constness to methods 2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
1e1f939817 VideoCore/Shader: Use only entry_point as ShaderSetup param
This removes all implicit dependency of ShaderState on global PICA
state.
2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
e3caf669b0 VideoCore/Shader: Use self instead of g_state.vs in ShaderSetup 2017-01-25 18:53:23 -08:00
Yuri Kunde Schlesner
34d581f2dc VideoCore/Shader: Extract input vertex loading code into function 2017-01-25 18:53:20 -08:00
Kloen
5cc94c17f6 video_core: fix shader.cpp signed / unsigned warning 2017-01-23 16:53:31 +01:00
Kloen
753fea5d65 video_core: gl_rasterizer float to int warning 2017-01-23 16:53:30 +01:00
Kloen
b6063d9a93 video_core: fix gl_rasterizer warning on MSVC 2017-01-23 16:53:30 +01:00
bunnei
22ad9094e6 config: Add option for specifying screen resolution scale factor. 2017-01-07 03:23:22 -05:00
Jonathan Hao
c18cb1b192 Fix some warnings (#2399) 2017-01-04 13:48:29 -03:00
bunnei
2f746e9946 Merge pull request #2367 from JayFoxRox/lighting-lut-quickfix
Lighting LUT Quickfix
2016-12-29 13:41:51 -05:00
Jannik Vogel
6ed4206f87 Minor cleanup in GLSL code 2016-12-25 21:38:10 +01:00
Jannik Vogel
88f409aec9 Offset lighting LUT samples correctly 2016-12-25 21:37:26 +01:00
MerryMage
64f98f4d0f core: Move emu_window and key_map into core
* Removes circular dependences (common should not depend on core)
2016-12-23 13:42:39 +00:00
bunnei
29564d73bd Merge pull request #2319 from yuriks/profile-scopes
VideoCore: Make profiling scope more representative
2016-12-21 13:33:49 -05:00
Albin Bernhardsson
ddec9cb369 Use GL_TRUE when setting color_mask 2016-12-19 19:06:35 +01:00
bunnei
3a1eaf2efc Merge pull request #2318 from yuriks/trace-opt
VideoCore: Inline IsPicaTracing
2016-12-18 21:15:24 -05:00
Yuri Kunde Schlesner
c135317de1 VideoCore/Shader: Extract DebugData out from UnitState 2016-12-16 00:16:25 -08:00
Yuri Kunde Schlesner
6e7e767645 Remove unnecessary cast 2016-12-16 00:15:55 -08:00
Yuri Kunde Schlesner
b5e3599704 VideoCore/Shader: Extract evaluate_condition lambda to function scope 2016-12-16 00:15:51 -08:00
Yuri Kunde Schlesner
960578f4e1 VideoCore/Shader: Extract call lambda up a scope and remove unused param 2016-12-15 23:08:05 -08:00
Yuri Kunde Schlesner
e4e962bc7c VideoCore/Shader: Remove dynamic control flow in (Get)UniformOffset 2016-12-15 23:08:05 -08:00
Yuri Kunde Schlesner
d27cb1dedc VideoCore/Shader: Move DebugData to a separate file 2016-12-15 23:08:05 -08:00
Yuri Kunde Schlesner
fb9e856b91 shader_jit_x64: Use LOOPCOUNT_REG as a 64-bit reg when indexing 2016-12-15 10:02:42 -08:00
Yuri Kunde Schlesner
ac9f937477 VideoCore: Make profiling scope more representative 2016-12-14 22:52:09 -08:00
Yuri Kunde Schlesner
945f554b84 VideoCore: Inline IsPicaTracing
Speeds up ALBW main menu slightly (~3%)
2016-12-14 22:06:40 -08:00
Yuri Kunde Schlesner
f00ada3363 VideoCore: Eliminate an unnecessary copy in the drawcall loop 2016-12-14 21:00:29 -08:00
Yuri Kunde Schlesner
5ff3206207 shader_jit_x64: Use Reg32 for LOOP* registers, eliminating casts 2016-12-14 20:06:09 -08:00
Yuri Kunde Schlesner
f4e98ecf3f VideoCore: Convert x64 shader JIT to use Xbyak for assembly 2016-12-14 20:06:08 -08:00
Lioncash
963aedd8cc Add all services to the Service namespace
Previously there was a split where some of the services were in the
Service namespace and others were not.
2016-12-11 00:07:27 +00:00
Markus Wick
d0d49bb951 OpenGL: Drop framebuffer completeness check.
This OpenGL call synchronize the worker thread of the nvidia blob.
It can be verified on linux with the __GL_THREADED_OPTIMIZATIONS=1 environment variable.
Those errors should not happen on tested drivers.
It was used as a workaround for https://bugs.freedesktop.org/show_bug.cgi?id=94148
2016-12-07 22:09:13 +01:00
emmauss
c4e4fa53d9 Implement Frame rate limiter (#2223)
* implement frame limiter

* fixes
2016-12-06 14:33:19 -05:00
Jannik Vogel
fc4591fa49 ASSERT that shader was linked successfully 2016-12-05 21:11:24 +01:00
Jannik Vogel
4088afe23c Report shader uniform block size in case of mismatch 2016-12-05 21:11:24 +01:00
Jannik Vogel
0edc986861 Print broken shader code to log 2016-12-05 21:11:24 +01:00
Yuri Kunde Schlesner
8a1f96011d OpenGL: Non-zero stride only makes sense for linear buffers 2016-12-04 06:14:27 -08:00
Yuri Kunde Schlesner
2600633b89 OpenGL: Ensure framebuffer binding is restored if completion check fails 2016-12-04 06:14:27 -08:00
Yuri Kunde Schlesner
ba7f213655 OpenGL: Fix DisplayTransfer accel when input width != output width
Fixes #2246, #2261
2016-12-04 05:21:57 -08:00
Yuri Kunde Schlesner
4f0f88bc6a Merge pull request #2259 from JayFoxRox/fix-fallback
shader_jit: Fix non-SSE4.1 path where FLR would not truncate
2016-12-03 22:11:39 -08:00
Jannik Vogel
2d8097eecc shader_jit: Fix non-SSE4.1 path where FLR would not truncate 2016-12-04 04:26:33 +01:00
Yuri Kunde Schlesner
4d5e42240c clang-format: Fix coding style 2016-12-03 01:32:46 -08:00
Jannik Vogel
e2cb7d7833 shader_jit: Load LOOPCOUNT_REG and LOOPINC 4 bit left-shifted 2016-12-02 04:33:15 +01:00
Subv
e3e4f27447 ClangFormat: Fixed the clang-format errors 2016-11-30 09:37:37 -05:00
Subv
aea9a91100 Build: Fixed a few warnings. 2016-11-29 16:51:53 -05:00
Yuri Kunde Schlesner
e279a6955e Merge pull request #2222 from linkmauve/die-frameskip-die
Remove the broken frame_skip option
2016-11-27 16:01:45 -08:00
Emmanuel Gil Peyrot
0820c99462 GPU: Remove the broken frame_skip option.
Fixes #1960.
2016-11-27 21:19:56 +00:00
Subv
4623415026 RasterizerGL: Use GL_TRUE and 0xFF in the stencil and depth masks instead of simply true and -1 2016-11-27 13:09:11 -05:00
Subv
743b0e71d9 Rasterizer/Memfill: Set the correct stencil write mask when clearing the stencil buffer. 2016-11-27 12:16:10 -05:00
jphalimi
82210ab480 Cache Vertices instead of Output registers (#2165)
This patch brings +3% performance improvement on average. It removes
ToVertex() as an important hotspot of the emulator.
2016-11-23 23:10:34 -05:00
wwylele
75affa13f7 Fix format error from #2195 2016-11-22 20:17:28 +02:00
bunnei
2de470c9b2 Merge pull request #2195 from Subv/factor_check
GPU/CiTrace: Avoid calling GetTextures() when not necessary.
2016-11-19 22:21:11 -05:00
Subv
050e9be15b GPU/CiTrace: Avoid calling GetTextures() when not necessary. 2016-11-19 19:27:00 -05:00
James Rowe
f68f37b520 Merge pull request #2194 from jroweboy/extremely-minor-clangformat-change
Minor formatting change
2016-11-19 13:51:14 -07:00
James Rowe
19acec351c Minor formatting change 2016-11-19 13:35:07 -07:00
James Rowe
d9305b0a07 Add default hotkey to swap primary screens.
Also minor style changes
2016-11-05 03:46:43 -06:00
James Rowe
2b1654ad9b Support additional screen layouts.
Allows users to choose a single screen layout or a large screen layout.
Adds a configuration option to change the prominent screen.
2016-11-05 02:55:41 -06:00
Ricardo de Almeida Gonzaga
13d46f6820 Fix typos 2016-10-20 12:26:59 -02:00
bunnei
d989102c9c Merge pull request #2082 from yuriks/shader-interp-crash
Fix/mask crash in shader debugger in Mii Maker
2016-10-06 19:35:37 -04:00
bunnei
49b10339bf Merge pull request #2103 from wwylele/gpu-reg-cleanup
GPU: DisplayTransfer & MemoryFill cleanup and param check
2016-10-03 20:21:55 -04:00
Yuri Kunde Schlesner
d9a904f9cb VideoCore: Shader interpreter cleanups 2016-09-29 21:15:49 -07:00
Yuri Kunde Schlesner
26b68313b9 VideoCore: Fix out-of-bounds read in ShaderSetup::ProduceDebugInfo
As far as I can tell, memset was replaced by a fill without correcting
the parameter type, causing an out-of-bounds array read in the Vec4
constructor.
2016-09-29 21:11:36 -07:00
Yuri Kunde Schlesner
01667d9a35 OpenGL: Take cached viewport sub-rect into account for scissor
Fixes #1938
2016-09-29 20:55:24 -07:00
wwylele
d2419570b9 rasterizer: separate TextureCopy from DisplayTransfer 2016-09-29 10:01:34 +08:00
Yuri Kunde Schlesner
f120e78b56 Remove special rules for Windows.h and library includes 2016-09-21 00:16:33 -07:00
Yuri Kunde Schlesner
84fbbe2629 Use negative priorities to avoid special-casing the self-include 2016-09-21 00:15:56 -07:00
Emmanuel Gil Peyrot
ebdae19fd2 Remove empty newlines in #include blocks.
This makes clang-format useful on those.

Also add a bunch of forgotten transitive includes, which otherwise
prevented compilation.
2016-09-21 11:15:47 +09:00
Yuri Kunde Schlesner
396a8d91a4 Manually tweak source formatting and then re-run clang-format 2016-09-18 21:14:25 -07:00
Emmanuel Gil Peyrot
dc8479928c Sources: Run clang-format on everything. 2016-09-18 09:38:01 +09:00
Yuri Kunde Schlesner
a3afeb4687 VideoCore: Fix dangling lambda context in shader interpreter
The static meant that after the first execution, these lambda context
would be pointing to a random location on the stack. Fixes a random
crash when using the interpreter.
2016-09-15 22:15:11 -07:00
bunnei
09063dc5bb Merge pull request #2032 from bunnei/qt-graphics
Qt graphics configure & V-Sync option
2016-08-31 22:20:54 -04:00
Jannik Vogel
7a79fa7a90 OpenGL: Avoid error on unsupported lighting LUT 2016-08-30 19:30:26 +02:00
bunnei
08ad9b36d4 config: Add a setting for graphics V-Sync. 2016-08-29 21:42:30 -04:00
Yuri Kunde Schlesner
ecf6ecf325 OpenGL: Add scaled resolution support to scissor 2016-06-27 22:16:04 -07:00
Yuri Kunde Schlesner
f0b9bc14b6 PICA: Scissor fixes and cleanups 2016-06-27 21:14:39 -07:00
Subv
f9be06b15f PICA: Implement scissor test 2016-06-27 21:14:13 -07:00
scurest
0f9274fe24 Remove superfluous std::move in return std::move(local_var) 2016-06-25 13:26:21 -05:00
Jannik Vogel
a12571c709 OpenGL: Implement fog 2016-06-07 00:06:28 +02:00
Jannik Vogel
ebee2513a9 Rasterizer: Implement fog 2016-06-07 00:06:28 +02:00
Jannik Vogel
57855a1701 Pica: Add fog state 2016-06-07 00:06:28 +02:00
Jannik Vogel
c900c092e3 OpenGL: Avoid undefined behaviour for UNIFORM_BLOCK_DATA_SIZE 2016-06-07 00:06:28 +02:00
mailwl
07cc781163 gsp::gpu: Reset g_thread_id in UnregisterInterruptRelayQueue 2016-06-01 09:40:15 +03:00
bunnei
552018c50a Merge pull request #1812 from JayFoxRox/refactor-shader
Retrieve shader result from new OutputRegisters-type
2016-05-31 18:12:56 -04:00
bunnei
201a7af92a Merge pull request #1846 from JayFoxRox/missing-dirty-lighting
OpenGL: Set shader_dirty on lighting changes
2016-05-26 17:35:12 -04:00
bunnei
a316fbb15a Merge pull request #1733 from lioncash/vert_loader
VertexLoader: Minor changes
2016-05-23 21:13:34 -04:00
Jannik Vogel
6a28f46844 OpenGL: Set shader_dirty on lighting changes 2016-05-23 23:28:13 +02:00
Jannik Vogel
30a01584f2 Pica: Name LightSrc.config register 2016-05-23 23:28:13 +02:00
Jannik Vogel
8e905b3af6 Pica: Name lighting.config0 and .config1 registers 2016-05-23 23:28:13 +02:00
Jannik Vogel
068bd6f728 OpenGL: Use uniforms for dist_atten_bias and dist_atten_scale 2016-05-23 23:28:13 +02:00
Jannik Vogel
d77279a415 Refactor Tev stage dumper 2016-05-21 03:11:27 +02:00
Jannik Vogel
324c21c922 Extend Tev stage dumper 2016-05-21 03:08:59 +02:00
bunnei
e5599ed300 Merge pull request #1786 from JayFoxRox/blend-equation
OpenGL: Support blend equation
2016-05-16 20:00:21 -04:00
Jannik Vogel
ff0fa86b17 Retrieve shader result from new OutputRegisters-type 2016-05-16 18:55:51 +02:00
linkmauve
f40fabd688 Merge pull request #1787 from JayFoxRox/refactor-jit
Refactor JIT
2016-05-16 17:54:45 +01:00
Jannik Vogel
5389dedfa1 OpenGL: Only update depth uniforms if the depth changed 2016-05-14 10:31:18 +02:00
Jannik Vogel
f8a11a664f OpenGL: value-initialize variables which cause uninitialised access otherwise 2016-05-14 10:16:11 +02:00
Jannik Vogel
1308afe2c2 Use new shader-jit signature for interpreter 2016-05-13 09:41:55 +02:00
Jannik Vogel
4e01e9ffc5 Refactor access to state in shader-jit 2016-05-13 09:20:14 +02:00
Jannik Vogel
5864cb7e00 OpenGL: Support blend equation 2016-05-12 22:57:40 +02:00
Jannik Vogel
7e756faaba Move program_counter and call_stack from UnitState to interpreter 2016-05-12 19:05:42 +02:00
Jannik Vogel
6c6d99ca51 Move default_attributes into Pica state 2016-05-12 19:05:41 +02:00
bunnei
f6eb62d062 Merge pull request #1690 from JayFoxRox/tex-type-3
Pica: Implement texture type 3 (Projection2D)
2016-05-11 21:47:08 -04:00
Jannik Vogel
ae7a82fa1c Turn ShaderSetup into struct 2016-05-11 23:48:24 +02:00
Jannik Vogel
5a7306d6df OpenGL: Implement texture type 3 2016-05-11 08:07:37 +02:00
Jannik Vogel
4311297eb1 Rasterizer: Implement texture type 3 2016-05-11 08:07:36 +02:00
Jannik Vogel
2f8e8e1455 Pica: Add tc0.w to OutputVertex 2016-05-11 08:07:36 +02:00
Jannik Vogel
9cfebb9334 Pica: Add texture type to state 2016-05-11 08:07:36 +02:00
bunnei
86ecbdfa4d Merge pull request #1621 from JayFoxRox/w-buffer
Implement W-buffer and fix depth-mapping
2016-05-10 23:00:40 -04:00
Lioncash
75e5d0a6a0 gl_rasterizer: Fix compilation for debug builds 2016-05-10 09:22:02 -04:00
Jannik Vogel
fc9cc21024 OpenGL: Implement W-Buffers and fix depth-mapping 2016-05-10 08:58:52 +02:00
Jannik Vogel
4c98113b57 Pica: Implement W-Buffer in SW rasterizer 2016-05-10 08:58:52 +02:00
linkmauve
006fe5fc0f Merge pull request #1704 from JayFoxRox/pod-config
Pica: PicaShaderConfig is TC and cleared before use
2016-05-10 01:16:53 +01:00
Lioncash
6d5f2a3cff vertex_loader: Correct forward declaration of InputVertex
It's actually a struct, not a class.
2016-05-08 23:08:18 -04:00
Lioncash
5587383eb7 vertex_loader: Provide an assertion for ensuring the loader has been setup
Also adds an assert to ensure that Setup is not called more than once
during a VertexLoader's lifetime.
2016-05-08 23:08:12 -04:00
Lioncash
1357724cd9 vertex_loader: Add constructors to facilitate immediate and two-step initialization 2016-05-08 23:03:32 -04:00
Lioncash
769f4a7018 vertex_loader: initialize_num_total_attributes.
Keeps the public API sane.
2016-05-08 23:03:32 -04:00
Lioncash
8ea5e7dfb5 vertex_loader: Use std::array instead of raw C arrays 2016-05-08 23:03:32 -04:00
Lioncash
a286b61f75 vertex_loader: Correct header ordering 2016-05-08 23:01:26 -04:00
Alexander Laties
0a31e373f1 fixup simple type conversions where possible 2016-05-07 11:41:55 -04:00
Emmanuel Gil Peyrot
aa4d4ff23c Frontends, VideoCore: Move glad initialisation to the frontend
On SDL2 this allows it to use SDL_GL_GetProcAddress() instead of the
default function loader, and fixes a crash when using apitrace with an
EGL context.

On Qt we will need to migrate from QGLWidget to QOpenGLWidget and
QOpenGLContext before we can use gladLoadGLLoader() instead of
gladLoadGL(), since the former doesn’t expose a function loader.
2016-05-06 03:10:14 +01:00
Jannik Vogel
7a77b8356c Pica: Rename VertexLoaded breakpoint to VertexShaderInvocation 2016-05-04 10:21:51 +02:00
Jannik Vogel
f74652d2fe Pica: Use a union for PicaShaderConfig 2016-05-03 15:06:49 +02:00
Jannik Vogel
5fc8eb227a Pica: Add TevStageConfigRaw to PicaShaderConfig (MSVC workaround) 2016-05-03 15:06:46 +02:00
Jannik Vogel
f3f7018c9e Pica: Make PicaShaderConfig trivially_copyable and clear it before use 2016-05-03 14:10:11 +02:00
Jannik Vogel
5ec1140f8b OpenGL: Don't copy const_color (Reverts #1745) 2016-05-03 12:34:52 +02:00
Jannik Vogel
696cb197a5 Pica: Replace logic in shader.cpp with loop 2016-05-03 01:40:47 +02:00
bunnei
15d0e98267 Merge pull request #1741 from linkmauve/iwyu-video_core
Fix video_core includes (and dependencies) using include-what-you-use
2016-05-01 17:44:57 -04:00
Jannik Vogel
7e0d6903ff OpenGL: Copy TevStageConfig using a loop. Fixes bug: const_color not copied 2016-05-01 16:35:54 +02:00
Jannik Vogel
aab41604f7 OpenGL: border_color was never set. Fixed. (#1740) 2016-04-30 12:20:23 -07:00
Emmanuel Gil Peyrot
691a42fe98 VideoCore: Run include-what-you-use and fix most includes. 2016-04-30 17:02:41 +01:00
Jannik Vogel
49bfe9bf91 Remove TGA dumper 2016-04-30 09:43:59 +02:00
bunnei
90243c56fb Merge pull request #1730 from hrydgard/vertex-loader
* Remove late accesses to attribute_config

* Refactor: Extract VertexLoader from command_processor.cpp.

Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached.

* Move "&" to their proper place, add missing includes and make some properly relative.

* Don't keep base_address in the loader, it doesn't belong there (with it, the loader can't be cached).

* Optimize the vertex loader, nearly doubling its speed.

* Debugger fix

* Move and rename the MemoryAccesses class to MemoryAccessTracker.
2016-04-29 09:42:47 -04:00
Yuri Kunde Schlesner
e3a8292495 Common: Remove section measurement from profiler (#1731)
This has been entirely superseded by MicroProfile. The rest of the code
can go when a simpler frametime/FPS meter is added to the GUI.
2016-04-29 00:07:10 -07:00
Henrik Rydgard
a86d7cacc1 Move and rename the MemoryAccesses class to MemoryAccessTracker. 2016-04-29 08:50:21 +02:00
Henrik Rydgard
a442ee07f4 Debugger fix 2016-04-28 22:30:01 +02:00
Henrik Rydgard
251f29dd7f Optimize the vertex loader, nearly doubling its speed. 2016-04-28 22:21:39 +02:00
Henrik Rydgard
2403e86cbb Don't keep base_address in the loader, it doesn't belong there (with it, the loader can't be cached). 2016-04-28 20:17:35 +02:00
Henrik Rydgard
d00e2340c6 Move "&" to their proper place, add missing includes and make some properly relative. 2016-04-28 19:40:11 +02:00
Henrik Rydgard
47ff008817 Refactor: Extract VertexLoader from command_processor.cpp.
Preparation for a similar concept to Dolphin or PPSSPP. These can be JIT-ed and cached.
2016-04-28 19:05:55 +02:00
Henrik Rydgard
0cf15f64ef Remove late accesses to attribute_config 2016-04-28 18:07:34 +02:00
bunnei
15c907317c Merge pull request #1710 from hrydgard/optimize-event-breakpoints
Replace std::map with std::array for graphics event breakpoints
2016-04-25 21:37:43 -04:00
Sam Spilsbury
656a442433 shader: Shader size is long uint, not uint. 2016-04-25 00:40:03 +08:00
Sam Spilsbury
c6709d97bc shader: Handle non-CALL opcodes with a break 2016-04-25 00:39:54 +08:00
Sam Spilsbury
bbffa6ad69 shader: Format string must be provided inline and not as a variable 2016-04-24 23:40:52 +08:00
Henrik Rydgard
01a1555b5d Replace std::map with std::array for graphics event breakpoints, and allow the compiler to inline. Saves 1%+ in vertex heavy situations. 2016-04-24 14:19:49 +02:00
Sam Spilsbury
39d4994c15 pica: Handle default lighting case 2016-04-23 11:54:02 +08:00
tfarley
562f36a144 HWRasterizer: reorder declarations to match defs 2016-04-22 10:52:02 -04:00
tfarley
3268cab26b HWRasterizer: sync specular uniform for new shaders 2016-04-22 10:48:00 -04:00
bunnei
bab30bcd6e Merge pull request #1436 from tfarley/hw-tex-forwarding
Hardware Renderer Texture Forwarding
2016-04-22 08:15:51 -04:00
tfarley
22f3a7e94c HWRasterizer: Texture forwarding 2016-04-21 17:27:56 -04:00
tfarley
e46d086189 Config: Add scaled resolution option 2016-04-21 17:27:48 -04:00
bunnei
14cc1ed911 Merge pull request #1655 from JayFoxRox/hw-dot3
OpenGL: Implement color combiner Operation::Dot3_RGB
2016-04-21 16:39:36 -04:00
bunnei
142a5dc3f5 Merge pull request #1625 from JayFoxRox/sw-blend-func
Rasterizer: Allow all blend factors for alpha blend-func
2016-04-17 20:20:15 -04:00
Jannik Vogel
e2b63a2dd7 Rasterizer: Allow all blend factors for alpha blend-func 2016-04-17 22:44:24 +02:00
Lioncash
4501a9eb50 debug_utils: use std::make_unique for initializing PicaTrace 2016-04-14 22:05:28 -04:00
bunnei
aff35d3e58 Merge pull request #1665 from lioncash/file
IOFile: Minor API changes
2016-04-14 16:28:15 -04:00
bunnei
d7fe2784cc shader_jit_x64: Rename RuntimeAssert to Compile_Assert. 2016-04-13 23:04:53 -04:00
bunnei
3f623b2561 shader_jit_x64.cpp: Rename JitCompiler to JitShader. 2016-04-13 23:04:53 -04:00
bunnei
847fb951e2 shader_jit_x64: Free memory that's no longer needed after compilation. 2016-04-13 23:04:52 -04:00
bunnei
60aa72e117 shader_jit_x64: Use a sorted vector instead of a set for keeping track of return addresses. 2016-04-13 23:04:52 -04:00
bunnei
60749f2cda shader_jit_x64: Use CALL/RET instead of JMP for subroutines. 2016-04-13 23:04:52 -04:00
bunnei
1d45b57939 shader_jit_x64: Separate initialization and code generation for readability. 2016-04-13 23:04:50 -04:00
bunnei
6e0319eec9 shader_jit_x64: Get rid of unnecessary last_program_counter variable. 2016-04-13 23:04:49 -04:00
bunnei
f3afe24594 shader_jit_x64: Execute certain asserts at runtime.
- This is because we compile the full shader code space, and therefore its common to compile malformed instructions.
2016-04-13 23:04:49 -04:00
bunnei
ffcf7ecee9 shader: Remove unused 'state' argument from 'Setup' function. 2016-04-13 23:04:48 -04:00
bunnei
a5a74eb121 shader_jit_x64: Specify shader main offset at runtime. 2016-04-13 23:04:47 -04:00
bunnei
c9d10de644 shader_jit_x64: Allocate each program independently and persist for emu session. 2016-04-13 23:04:47 -04:00
bunnei
4632791a40 shader_jit_x64: Rewrite flow control to support arbitrary CALL and JMP instructions. 2016-04-13 23:04:44 -04:00
bunnei
135aec7bea shader_jit_x64: Fix strict memory aliasing issues. 2016-04-13 23:04:43 -04:00
Lioncash
a4120ca66c file_util: Don't expose IOFile internals through the API 2016-04-13 20:17:17 -04:00
Jannik Vogel
ff7c798d86 Pica: Remove geometry dumper (PICA_DUMP_GEOMETRY) 2016-04-10 22:07:06 +02:00
Jannik Vogel
0ad050f85d OpenGL: Implement color combiner Operation::Dot3_RGB 2016-04-10 15:31:24 +02:00
Jannik Vogel
35a92b4097 OpenGL: Respect buffer-write allow registers 2016-04-08 22:57:11 +02:00
Jannik Vogel
c6bbc41984 OpenGL: Split buffer-write mask sync into seperate functions 2016-04-08 22:42:44 +02:00
Jannik Vogel
fa24df7340 Rasterizer: Respect buffer-write allow registers 2016-04-08 22:35:22 +02:00
Jannik Vogel
d47605b2ed OpenGL: Keep stencil-test and framebuffer.depth_format in sync 2016-04-08 22:35:17 +02:00
bunnei
6e750ae12d Merge pull request #1639 from linkmauve/fix-double-framebuffer-check
OpenGL: Fix a double framebuffer completeness checks.
2016-04-07 19:52:02 -04:00
Mathew Maidment
aa6380e5bc Merge pull request #1643 from MerryMage/make_unique
Common: Remove Common::make_unique, use std::make_unique
2016-04-05 20:10:11 -04:00
MerryMage
a06dcfeb61 Common: Remove Common::make_unique, use std::make_unique 2016-04-05 13:31:17 +01:00
Emmanuel Gil Peyrot
3219be8ee0 OpenGL: Fix a double framebuffer completeness checks. 2016-04-03 17:00:44 +01:00
Jannik Vogel
693cbc1f8f OpenGL: Check for framebuffer completeness 2016-04-03 17:06:05 +02:00
Jannik Vogel
c26b141407 Avoid warnings by casting to size_t for ARRAY_SIZE() comparisons 2016-04-01 02:14:43 +02:00
Yuri Kunde Schlesner
81004211dd Pica: Improve accuracy of immediate-mode support
This partially fixes Etrian Odyssey IV.
2016-03-23 20:18:40 -07:00
Yuri Kunde Schlesner
0c447e0a06 OpenGL: Don't attempt to draw empty triangle batches
Our code did not handle this well, causing random crashes in some
situations.
2016-03-23 20:02:05 -07:00
bunnei
ebbba0d381 Merge pull request #1508 from JayFoxRox/vs-output-map
Respect vs output map
2016-03-22 11:59:12 -04:00
bunnei
784c5539ea Merge pull request #1538 from lioncash/dot
shader_interpreter: use std::inner_product for the dot product
2016-03-20 00:35:06 -04:00
bunnei
58852bedbf Merge pull request #1535 from JayFoxRox/fix-align
PICA: Alignment happens locally in vertex
2016-03-17 20:00:00 -04:00
Lioncash
63e956cc7a video_core: Don't cast away const 2016-03-17 02:01:38 -04:00
Lioncash
4d89df8df2 shader_interpreter: use std::inner_product for the dot product
Same thing, less code.
2016-03-17 01:00:30 -04:00
Lioncash
c928b04eee core/video_core: Make NumIds functions constexpr 2016-03-17 00:29:47 -04:00
Lioncash
39baad9926 core/video_core: Don't cast away const in subscript operators
Not to say these subscript operators aren't totally ugly as is.
2016-03-17 00:27:15 -04:00
Jannik Vogel
7eef9ebc3b PICA: Alignment happens locally in vertex 2016-03-17 02:24:20 +01:00
bunnei
55f24e1cf4 Merge pull request #1519 from JayFoxRox/vp-offset-fix
PICA: Fix viewport offset
2016-03-16 14:19:53 -04:00
bunnei
96cafbe4cc Merge pull request #1503 from bunnei/clear-jit-cache
Clear JIT cache
2016-03-16 13:18:51 -04:00
Jannik Vogel
9aad2f29bb PICA: Fix MAD/MADI encoding 2016-03-15 20:01:25 +01:00
Jannik Vogel
964cfaea47 PICA: Fix viewport offset 2016-03-14 18:37:33 +01:00
Jannik Vogel
f746a00964 Respect vs output map 2016-03-14 13:03:34 +01:00
Jannik Vogel
a66c186e81 PICA: Align vertex attributes 2016-03-13 04:54:23 +01:00
bunnei
6efb710b28 shader_jit_x64: Clear cache after code space fills up. 2016-03-12 12:15:49 -05:00
bunnei
c103759cdc shader_jit_x64: Make assert outputs more useful & cleanup formatting. 2016-03-12 12:06:28 -05:00
bunnei
46f78b7f19 shader: Update log message to use proper log class. 2016-03-12 12:03:32 -05:00
Yuri Kunde Schlesner
305e63d9ea Merge pull request #1475 from lioncash/align
Common: Get rid of alignment macros
2016-03-09 20:08:38 -08:00
bunnei
4a2d1571bc Merge pull request #1474 from lioncash/renderer
renderer_base: Minor changes
2016-03-09 10:57:38 -05:00
Lioncash
88d604383e Common: Get rid of alignment macros
The gl rasterizer already uses alignas,
so we may as well move everything over.
2016-03-09 01:31:14 -05:00
bunnei
8530a2d7df Merge pull request #1344 from LittleWhite-tb/error-output
Output errors in GUI
2016-03-08 23:12:04 -05:00
Lioncash
4b5b32e721 renderer_base: In-class initialize variables 2016-03-08 21:46:47 -05:00
Lioncash
be913040a8 render_base: Clarify/normalize getter functions 2016-03-08 21:45:24 -05:00
Lioncash
bf76afc68d renderer_base: Don't directly expose the rasterizer unique_ptr
There's no reason to allow direct access to the unique_ptr instance. Only
its contained pointer.
2016-03-08 21:31:44 -05:00
LittleWhite
4be68dddfb Improve error report from Init() functions
Add error popup when citra initialization failed
2016-03-08 22:05:25 +01:00
Yuri Kunde Schlesner
c58bc25d5b Pica: Write depth value even when depth test is disabled
This has been confirmed on hardware. Fixes Etrian Odyssey IV.
2016-03-05 20:16:20 -08:00
Dwayne Slater
6b775034dd Add immediate mode vertex submission 2016-03-02 22:16:38 -05:00
bunnei
2b00bdec1f Merge pull request #1424 from MerryMage/lut_init
renderer_opengl: Initalise fragment shader LUT textures
2016-02-25 19:36:27 -05:00
MerryMage
0801363840 renderer_opengl: Initalise fragment shader LUT textures 2016-02-26 00:12:38 +00:00
bunnei
e04e6aabbc Merge pull request #1395 from ds84182/padding-attributes
Add support for padding vertex attributes
2016-02-24 18:15:16 -08:00
Dwayne Slater
ed8072b48b Fix out of bounds array access when loading a component >= 12 2016-02-20 19:03:14 -05:00
Dwayne Slater
82fc075ff6 Add support for padding vertex attributes 2016-02-20 19:00:31 -05:00
MerryMage
6c71858c5c BitField: Make trivially copyable and remove assignment operator 2016-02-12 19:51:16 +00:00
bunnei
19557aaab3 pica: Cleanup lighting register definitions and documentation. 2016-02-05 17:20:25 -05:00
bunnei
c4d318f691 gl_rasterizer: Use alignas(16) instead of explicit padding. 2016-02-05 17:20:24 -05:00
bunnei
aaa7beeda8 renderer_opengl: Use GLvec3/GLvec4 aliases for commonly used types. 2016-02-05 17:20:23 -05:00
bunnei
8e9318f20a gl_rasterizer: Fix issue with interpolation of opposite quaternions. 2016-02-05 17:20:23 -05:00
bunnei
b694423d09 pica_types: Fix typo in docstring. 2016-02-05 17:20:22 -05:00
bunnei
a949fd5f25 pica_types: Replace float24/20/16 with a template class. 2016-02-05 17:20:22 -05:00
bunnei
d171822dce command_processor: Add an assertion to ensure LUTs are not written past their boundaries. 2016-02-05 17:20:20 -05:00
bunnei
310a1c30ca gl_rasterizer: Remove unnecessary casts. 2016-02-05 17:20:19 -05:00
bunnei
c229503f4a gl_rasterizer: Fix PicaShaderConfig on GCC. 2016-02-05 17:20:19 -05:00
bunnei
9dfb223d26 gl_rasterizer: Initial implementation of bump mapping. 2016-02-05 17:20:19 -05:00
bunnei
449902b558 gl_shader_gen: Fix bug in LUT range (should within range [0, 255] not [0, 256]). 2016-02-05 17:20:17 -05:00
bunnei
348c9c9ff3 gl_shader_gen: Implement lighting red, green, and blue reflection. 2016-02-05 17:20:16 -05:00
bunnei
01b407638c gl_shader_gen: View should be normalized. 2016-02-05 17:20:15 -05:00
bunnei
c37de30cfc gl_shader_gen: Implement fragment lighting fresnel effect. 2016-02-05 17:20:13 -05:00
bunnei
0e67c21c9e gl_shader_gen: Implement fragment lighting specular 1 component. 2016-02-05 17:19:16 -05:00
bunnei
781b046579 gl_shader_gen: Add support for D0 LUT scaling. 2016-02-05 17:18:36 -05:00
bunnei
3d89dacd56 gl_shader_gen: Refactor lighting config to match Pica register naming.
- Also implement D0 LUT enable.
2016-02-05 17:17:35 -05:00
bunnei
6307999116 pica: Cleanup and add some comments to lighting registers. 2016-02-05 17:17:34 -05:00
bunnei
6878ba7608 gl_rasterizer: Minor naming refactor on Pica register naming. 2016-02-05 17:17:33 -05:00
bunnei
76f303538b gl_shader_gen: Reorganize and cleanup lighting code.
- No functional difference.
2016-02-05 17:17:33 -05:00
bunnei
5f3bad8fb1 gl_shader_gen: Fix directional lights. 2016-02-05 17:17:32 -05:00
bunnei
bdc72d0904 gl_shader_gen: Fix bug with lighting where clamp highlights was only applied to last light. 2016-02-05 17:17:32 -05:00
bunnei
603b619cbe gl_shader_gen: View vector needs to be normalized when computing half angle vector. 2016-02-05 17:17:31 -05:00
bunnei
021cb0bced renderer_opengl: Use textures for fragment shader LUTs instead of UBOs.
- Gets us LUT interpolation for free.
- Some older Intel GPU drivers did not support the big UBOs needed to store the LUTs.
2016-02-05 17:17:31 -05:00
bunnei
bf89870437 renderer_opengl: Initial implementation of basic specular lighting. 2016-02-05 17:17:30 -05:00
bunnei
e34fa6365f renderer_opengl: Implement HW fragment lighting distance attenuation. 2016-02-05 17:17:30 -05:00
bunnei
e9af70eaf3 renderer_opengl: Implement HW fragment lighting LUTs within our default UBO. 2016-02-05 17:17:29 -05:00
bunnei
afbef52516 renderer_opengl: Implement diffuse component of HW fragment lighting. 2016-02-05 17:17:29 -05:00
bunnei
b003075570 pica: Implement decoding of basic fragment lighting components.
- Diffuse
- Distance attenuation
- float16/float20 types
- Vertex Shader 'view' output
2016-02-05 17:17:28 -05:00
bunnei
281bc90ad2 pica: Implement fragment lighting LUTs. 2016-02-05 17:17:27 -05:00
bunnei
4369767c72 pica: Add decodings for distance attenuation and LUT registers. 2016-02-05 17:17:26 -05:00
bunnei
38c7b20475 pica: Add pica_types module and move float24 definition. 2016-02-05 17:17:26 -05:00
tfarley
a15f4d1590 hwrasterizer: Use proper cached fb addr/size 2016-02-03 15:52:34 -05:00
Yuri Kunde Schlesner
05356543d9 OpenGL: Downgrade GL_DEBUG_SEVERITY_NOTIFICATION to Debug logging level
The nVidia driver is *extremely* spammy on this category, sending a
message on every buffer or texture upload, slowing down the emulator and
making the log useless.
2016-02-02 22:44:13 -08:00
bunnei
a43f8d2fb7 Merge pull request #1367 from yuriks/jit-jmp
Shader JIT: Fix off-by-one error when compiling JMPs
2016-01-27 09:19:28 -05:00
bunnei
c407b6ce2f Merge pull request #1369 from yuriks/jmpu-inverted
Shader: Implement "invert condition" feature of IFU instruction
2016-01-26 09:58:16 -05:00
Yuri Kunde Schlesner
d01d1f7e01 Debugger: Use 3dbrew names for GPU registers
This list was imported from the 3dbrew wiki page and is pretty much
complete.
2016-01-24 20:29:44 -08:00
Yuri Kunde Schlesner
083d2d89a5 Shader: Implement "invert condition" feature of IFU instruction
If the bit 0 of the JMPU instruction is set, then the jump condition
will be inverted. That is, a jump will happen when the boolean is false
instead of when it is true.
2016-01-24 20:29:06 -08:00
Yuri Kunde Schlesner
c1071c1ff7 Shader JIT: Fix off-by-one error when compiling JMPs
There was a mistake in the JMP code which meant that one instruction at
the destination would be skipped when the jump was taken. This commit
also changes the meaning of the culprit parameter to make it less
confusing and avoid similar mistakes in the future.
2016-01-24 02:15:56 -08:00
bunnei
0b6cc0592d Merge pull request #1334 from tfarley/hw-depth-modifiers
hwrasterizer: Use depth offset
2016-01-20 22:27:33 -05:00
tfarley
f53dbafdae hwrasterizer: Use depth offset 2016-01-20 21:57:59 -05:00
Lioncash
4966568076 command_processor: Get rid of variable shadowing 2016-01-17 02:22:51 -05:00
bunnei
6a261e825c Merge pull request #1196 from linkmauve/khr_debug
Add optional GL_KHR_debug support
2016-01-12 22:54:52 -05:00
Lioncash
5e17a586da video_core: Make the renderer global a unique_ptr 2015-12-30 08:52:01 -05:00
Lioncash
97dc9634a2 swrasterizer: Add missing override specifier 2015-12-29 18:35:38 -05:00
Yuri Kunde Schlesner
015d7b9779 VideoCore: Sync state after changing rasterizers
This fixes various bugs that appear in the HW rasterizer after switching
between it and the SW one during emulation.
2015-12-20 17:37:15 -08:00
Yuri Kunde Schlesner
402692c08d Merge pull request #1267 from yuriks/flipped-framebuffer
OpenGL: Flip framebuffers during transfer rather than when rendering
2015-12-09 20:35:15 -08:00
bunnei
3013f26d70 Merge pull request #1269 from Subv/triangle_fan
GPU/PrimitiveAssembler: Fixed drawing triangle fans.
2015-12-08 10:27:40 -05:00
Yuri Kunde Schlesner
195fedccf0 VideoCore: Unify interface to OpenGL and SW rasterizers
This removes explicit checks sprinkled all over the codebase to instead
just have the SW rasterizer expose an implementation with no-ops for
most operations.
2015-12-07 20:20:38 -08:00
Yuri Kunde Schlesner
03835d04f4 VideoCore: Rename HWRasterizer methods to be less confusing 2015-12-06 19:08:37 -08:00
Yuri Kunde Schlesner
da80ece8b9 OpenGL: Rename cache functions to better match what they actually do 2015-12-06 17:02:52 -08:00
Subv
7b33e163b9 GPU/PrimitiveAssembler: Fixed drawing triangle fans.
It was skipping the second vertex assignment and using uninitialized garbage when assembling the corresponding triangle.
2015-12-06 10:48:05 -05:00
Yuri Kunde Schlesner
cf81e08389 OpenGL: Flip framebuffers during transfer rather than when rendering 2015-12-04 22:23:39 -08:00
Yuri Kunde Schlesner
95dbc6eb0e OpenGL: Add support for glFrontFace in the state tracker 2015-12-04 21:58:26 -08:00
Yuri Kunde Schlesner
e9c209ccc8 PICA: Properly emulate 1-stage delay in the combiner buffer
This was discovered and verified by @fincs. The tev combiner buffer
actually lags behind by one stage, meaning stage 1 reads the initial
color, stage 2 reads stage 0's output, and so on.

Fixes character portraits in Fire Emblem: Awakening and world textures
in Zelda: ALBW. Closes #1140.
2015-11-30 22:45:18 -08:00
bunnei
f008dfbaca renderer_opengl: Fix uniform issues introduced with kemenaran/avoid-explicit-uniform-location. 2015-11-25 22:33:24 -05:00
Pierre de La Morinerie
0735630744 Use regular uniform location
The support for GL_ARB_explicit_uniform_location is not that good
(53% according to http://feedback.wildfiregames.com/report/opengl/feature/GL_ARB_explicit_uniform_location).

This fix the shader compilation on Intel HD 4000 (#1222).
2015-11-25 11:56:11 +01:00
Subv
823ce62f2f FragShader: Use an UBO instead of several individual uniforms 2015-11-18 21:03:56 -05:00
Subv
7a37dba75b GPU/Loaders: Log an error when a loader tries to load from a component beyond the available ones (12).
Related to #1170
2015-11-09 21:16:11 -05:00
Emmanuel Gil Peyrot
53df67376d OpenGL: Log GL_KHR_debug messages we receive
This allows the driver to communicate errors, warnings and improvement
suggestions about our usage of the API.
2015-10-24 02:30:51 +01:00
bunnei
74186a5f01 gl_shader_gen: Use explicit locations for vertex shader attributes. 2015-10-21 22:29:56 -04:00
bunnei
e663f5c914 gl_shader_gen: Optimize code for AppendAlphaTestCondition.
- Also add a comment to AppendColorCombiner.
2015-10-21 22:29:56 -04:00
bunnei
e7b1f2ae0a gl_rasterizer: Define enum types for each vertex texcoord attribute. 2015-10-21 21:59:47 -04:00
bunnei
0ebcff710e gl_shader_gen: Various cleanups to shader generation. 2015-10-21 21:59:44 -04:00
bunnei
240a3b80d9 gl_rasterizer: Use MMH3 hash for shader cache hey.
- Includes a check to confirm no hash collisions.
2015-10-21 21:58:59 -04:00
bunnei
71edb55114 gl_shader_gen: Require explicit uniform locations.
- Fixes uniform issue on AMD.
2015-10-21 21:54:56 -04:00
bunnei
5ef2df056d gl_shader_gen: Rename 'o' to 'attr' in vertex/fragment shaders. 2015-10-21 21:53:19 -04:00
bunnei
c2c4faef4c gl_shader_gen: AppendAlphaModifier default should be 0.0, not vec4(0.0). 2015-10-21 21:53:18 -04:00
bunnei
bd833b8dd8 gl_shader_gen: Fix bug where TEV stage outputs should be clamped. 2015-10-21 21:53:18 -04:00
bunnei
f2e7f7e101 gl_rasterizer: Add documentation to ShaderCacheKey. 2015-10-21 21:53:17 -04:00
bunnei
4b5141954e gl_shader_gen: Add additional function documentation. 2015-10-21 21:53:17 -04:00
bunnei
2a0a86f629 gl_shader_util: Cleanup header file + add docstring. 2015-10-21 21:53:16 -04:00
bunnei
a74774257e gl_shader_gen: Various cleanups + moved TEV stage generation to its own function. 2015-10-21 21:53:16 -04:00
bunnei
c86b9d4242 renderer_opengl: Refactor shader generation/caching to be more organized + various cleanups. 2015-10-21 21:53:14 -04:00
bunnei
3c057bd3d8 gl_rasterizer: Move logic for creating ShaderCacheKey to a static function. 2015-10-21 21:53:05 -04:00
bunnei
b02a533d94 gl_shader_util: Use vec3 constants for AppendColorCombiner. 2015-10-21 21:51:24 -04:00
bunnei
37b0aa5af7 gl_rasterizer: Fix typo in uploading TEV const color uniforms. 2015-10-21 21:51:24 -04:00
bunnei
82f3e6dc69 gl_shader_util: Fix precision bug with alpha testing.
- Alpha testing is not done with float32 precision, this makes the HW renderer match the SW renderer.
2015-10-21 21:51:23 -04:00
Subv
e3f4233cef Initial implementation of fragment shader generation with caching. 2015-10-21 21:51:23 -04:00
Emmanuel Gil Peyrot
14af5919ba CitraQt, SkyEye, Loader, VideoCore: Remove newlines in LOG_* calls.
The LOG_* function itself already appends one.
2015-10-09 22:14:56 +01:00
Rohit Nirmal
32391cffdd Silence -Wsign-compare warnings. 2015-10-06 22:16:15 -05:00
Martin Lindhe
bafb7afba2 fix some xcode 7.0 warnings 2015-09-29 23:11:09 +02:00
Lioncash
751fbfdcc3 general: Silence some warnings when using clang 2015-09-16 08:51:53 -04:00
Lioncash
aec28ed91e video_core: Reorganize headers 2015-09-11 07:31:15 -04:00
Lioncash
1fa772393b video_core: Remove unnecessary includes from headers 2015-09-11 00:10:03 -04:00
bunnei
a008b28659 Merge pull request #1133 from lioncash/emplace-back
gl_rasterizer: Replace push_back calls with emplace_back in AddTriangle
2015-09-10 15:07:06 -04:00
bunnei
0d5604fdcb Merge pull request #1136 from lioncash/proto
renderer_opengl: Remove unimplemented function declaration
2015-09-10 11:29:33 -04:00
Lioncash
8a3428f16c renderer_opengl: Remove unimplemented function declaration 2015-09-10 10:45:44 -04:00
Lioncash
526eb33d1e video_core: Remove unused variables 2015-09-10 10:26:21 -04:00
Lioncash
7b72b71605 gl_rasterizer: Replace push_back calls with emplace_back in AddTriangle 2015-09-10 00:20:30 -04:00
aroulin
1484a23530 Shader JIT: Use SCALE constant from emitter 2015-09-07 16:50:28 +02:00
aroulin
87e3b9ffc0 Shader: Fix size_t to int casts of register offsets 2015-09-07 16:50:28 +02:00
Yuri Kunde Schlesner
b044c047c4 OpenGL: Use Sampler Objects to decouple sampler config from textures
Fixes #978
2015-09-03 15:09:51 -03:00
Yuri Kunde Schlesner
466e608c19 OpenGL: Remove ugly and endian-unsafe color pointer casts 2015-09-03 15:09:51 -03:00
Yuri Kunde Schlesner
ec28f037e6 OpenGL: Add support for Sampler Objects to state tracker 2015-09-03 15:09:50 -03:00
Yuri Kunde Schlesner
cc19a76656 Merge pull request #1087 from yuriks/opengl-glad
Replace the previous OpenGL loader with a glad-generated 3.3 one
2015-09-03 15:07:01 -03:00
bunnei
918ca40c68 Merge pull request #1088 from aroulin/x64-emitter-abi-call
x64: Proper stack alignment in shader JIT function calls
2015-09-02 08:46:58 -04:00
aroulin
ba998b85a1 video_core: Fix format specifiers warnings 2015-09-02 08:20:00 +02:00
aroulin
179ad35c2e x64: Proper stack alignment in shader JIT function calls
Import Dolphin stack handling and register saving routines
Also removes the x86 parts from abi files
2015-09-01 23:39:52 +02:00
Tony Wasserka
071510b367 Merge pull request #1092 from Subv/vertex_offset
Pica: Add the vertex_offset register to the Pica registers map.
2015-08-31 18:17:59 +02:00
Subv
58a04c0776 Pica: Added the primitive_restart register (0x25f) to the registers map. 2015-08-31 09:14:18 -05:00
Subv
149ea561a6 Pica: Add the vertex_offset register to the Pica registers map. 2015-08-31 07:02:30 -05:00
aroulin
84959be150 Shader JIT: Fix SGE/SGEI NaN behavior
SGE was incorrectly emulated w.r.t. NaN behavior as the CMPSS SSE
instruction was used with NLT
2015-08-31 08:16:15 +02:00
bunnei
e77dc4e9d2 Merge pull request #1059 from Subv/vertex_offset
GPU: Implemented register 0x22A PICA_REG_DRAW_VERTEX_OFFSET
2015-08-30 17:12:33 -04:00
Subv
12a11472f1 GPU: Implemented register 0x22A.
This is the equivalent of the "first" parameter in glDrawArrays, it tells the GPU the vertex index at which to start rendering.

Register 0x22A doesn't affect indexed rendering.
2015-08-30 15:46:22 -05:00
Yuri Kunde Schlesner
a1a5570e97 Replace the previous OpenGL loader with a glad-generated 3.3 one
The main advantage of switching to glad from glLoadGen is that, apart
from being actively maintained, it supports a customizable entrypoint
loader function, which makes it possible to also support OpenGL ES.
2015-08-30 08:45:56 -03:00
bunnei
58e9f78844 Merge pull request #1049 from Subv/stencil
Rasterizer: Corrected the stencil implementation.
2015-08-29 20:06:25 -04:00
Yuri Kunde Schlesner
c5a4025b65 Merge pull request #1065 from yuriks/shader-fp
Shader FP compliance fixes
2015-08-27 16:34:13 -07:00
bunnei
f3cef178e3 gl_rasterizer_cache: Detect and ignore unnecessary texture flushes. 2015-08-27 19:07:53 -04:00
aroulin
f52d8c1a9b Shader JIT: Fix float to integer rounding in MOVA
MOVA converts new address register values from floats to integers using truncation
2015-08-27 15:26:41 +02:00
archshift
dd0e1061ef Shader JIT: ifdef out reference to ifdef'd out shader_map
shader_map was only defined on x86 architectures, but was cleared on shutdown
with no ifdef protection. Ifdef this out so non-x86 architectures can be built.
2015-08-26 22:28:19 +00:00
Yuri Kunde Schlesner
0fcabd2b11 Integrate the MicroProfile profiling library
This brings goodies such as a configurable user interface and
multi-threaded timeline view.
2015-08-24 22:16:28 -03:00
bunnei
afd45d1d7f Merge pull request #1063 from Subv/hw_renderer_debug_fb
HWRenderer: Only reload the framebuffer from gpu memory if the hw renderer is in use during a breakpoint
2015-08-24 13:02:44 -04:00
Subv
583d777b1a HWRenderer: Added a workaround for the Intel Windows driver bug that causes glTexSubImage2D to not change the stencil buffer.
Reported here https://communities.intel.com/message/324464
2015-08-24 11:28:28 -05:00
Yuri Kunde Schlesner
eff10959de fixup! Shaders: Fix multiplications between 0.0 and inf 2015-08-24 02:10:11 -03:00
Yuri Kunde Schlesner
d8ef20c856 Shader JIT: Tiny micro-optimization in DPH 2015-08-24 01:48:37 -03:00
Yuri Kunde Schlesner
630a850d4d Shaders: Fix multiplications between 0.0 and inf
The PICA200 semantics for multiplication are so that when multiplying
inf by exactly 0.0, the result is 0.0, instead of NaN, as defined by
IEEE. This is relied upon by games.

Fixes #1024 (missing OoT interface items)
2015-08-24 01:48:15 -03:00
Yuri Kunde Schlesner
082b74fa24 Shaders: Explicitly conform to PICA semantics in MAX/MIN 2015-08-24 01:46:58 -03:00
Yuri Kunde Schlesner
76247170df Shader JIT: Add name to second scratch register (XMM4) 2015-08-24 01:46:10 -03:00
Lioncash
fa5076eb9b shader_jit: Replace two MDisp usages with MatR 2015-08-24 00:39:50 -04:00
Yuri Kunde Schlesner
455147ee95 Shader JIT: Fix CMP NaN behavior to match hardware 2015-08-24 01:29:40 -03:00
bunnei
83c214f6d8 Merge pull request #1062 from aroulin/shader-rcp-rsq
Shader: RCP and RSQ computes only the 1st component
2015-08-23 17:56:35 -04:00
Subv
d1b9383d86 HWRenderer: Only reload the framebuffer from gpu memory if the hw renderer is in use during a breakpoint. 2015-08-23 15:26:17 -05:00
aroulin
03c5cfead4 Shader: Use std::sqrt for float instead of sqrt 2015-08-23 22:03:07 +02:00
aroulin
fa552f11ef Shader: RCP and RSQ computes only the 1st component 2015-08-23 22:01:17 +02:00
aroulin
2f1514b904 Shader: implement DPH/DPHI in JIT 2015-08-22 11:09:53 +02:00
aroulin
2e7cf2f6cf Shader: implement DPH/DPHI in interpreter
Tests revealed that the component with w=1 is
SRC1 and not SRC2, it is now fixed on 3dbrew.
2015-08-22 11:09:53 +02:00
Subv
0c7da9b815 HWRasterizer: Implemented stencil ops 6 and 7. 2015-08-21 11:05:56 -05:00
Subv
7c1f84a92b SWRasterizer: Implemented stencil ops 6 and 7.
IncrementWrap and DecrementWrap, verified with hwtests.
2015-08-21 11:01:42 -05:00
Subv
e43eb130d4 HWRasterizer: Implemented stencil op 1 (GL_ZERO) 2015-08-21 10:59:49 -05:00
Subv
fef1462371 SWRasterizer: Implemented stencil action 1 (GL_ZERO).
Verified with hwtests.
2015-08-21 10:35:25 -05:00
Subv
b3e530d005 SWRasterizer: Removed a todo. Verified with hwtests. 2015-08-21 10:09:15 -05:00
Subv
8e6336d96b SWRenderer: The stencil depth_pass action is executed even if depth testing is disabled.
The HW renderer already did this.
2015-08-21 09:48:43 -05:00
Subv
e74825e3d0 Rasterizer: Abstract duplicated stencil code into a lambda. 2015-08-21 09:45:36 -05:00
Subv
46f660a789 GLRasterizer: Implemented stencil testing in the hw renderer. 2015-08-20 10:11:09 -05:00
Subv
186873420f GPU/Rasterizer: Corrected the stencil implementation.
Verified the behavior with hardware tests.
2015-08-20 10:10:35 -05:00
aroulin
f3e8f42718 Shader: implement SGE, SGEI and SLT in JIT 2015-08-19 14:29:39 +02:00
aroulin
863730f6a7 Shader: implement SGE, SGEI in interpreter 2015-08-19 14:29:39 +02:00
bunnei
3c5ff418ca Merge pull request #1047 from aroulin/shader-ex2-lg2
Shader: Save caller-saved registers in JIT before a CALL
2015-08-18 22:02:25 -04:00
aroulin
2f9eb98f03 Shader: Save caller-saved registers in JIT before a CALL 2015-08-19 03:40:07 +02:00
bunnei
026379ed55 Merge pull request #1037 from aroulin/shader-ex2-lg2
Shader: Implement EX2 and LG2 in interpreter/JIT
2015-08-18 19:42:32 -04:00
bunnei
1f18c9f8dd Merge pull request #1034 from yuriks/rg8-textures
videocore: Added RG8 texture support
2015-08-16 22:17:12 -04:00
aroulin
7d3a6016d6 Shader: implement EX2 and LG2 in JIT 2015-08-17 01:12:34 +02:00
LittleWhite
9d6748fa94 Fix Linux GCC 4.9 build (complaining about undeclared memset) 2015-08-16 17:21:08 +02:00
aroulin
638e47c04d Shader: implement EX2 and LG2 in interpreter 2015-08-16 15:54:30 +02:00
Tony Wasserka
96820ae42a Build fix for Debug configurations. 2015-08-16 15:14:54 +02:00
Tony Wasserka
f5144e6c10 Merge pull request #997 from Lectem/cmdlist_full_debug
citra-qt: Improve pica command list widget (add mask, fix some issues)
2015-08-16 13:34:45 +02:00
Tony Wasserka
33ba604fd9 Introduce a shader tracer to allow inspection of input/output values for each processed instruction. 2015-08-16 14:12:11 +02:00
Tony Wasserka
2e3601f415 Pica/DebugUtils: Include uniform information into shader dumps. 2015-08-16 13:22:01 +02:00
Tony Wasserka
4cb302c8ae citra-qt: Improve shader debugger.
Now supports dumping the current shader and recognizes a larger number of output semantics.
2015-08-16 13:22:00 +02:00
Patrick Martin
5b65d95310 videocore: Added RG8 texture support 2015-08-16 02:21:50 -03:00
bunnei
db97090cad Shader: Use a POD struct for registers. 2015-08-15 18:03:27 -04:00
bunnei
b39c053785 Rename ARCHITECTURE_X64 definition to ARCHITECTURE_x86_64. 2015-08-15 18:03:27 -04:00
bunnei
0ee00861f6 Common: Cleanup CPU capability detection code. 2015-08-15 18:03:26 -04:00
bunnei
a1942238f5 Common: Move cpu_detect to x64 directory. 2015-08-15 18:03:26 -04:00
bunnei
bd7e691f78 x64: Refactor to remove fake interfaces and general cleanups. 2015-08-15 18:03:25 -04:00
bunnei
cfb354f11f JIT: Support negative address offsets. 2015-08-15 18:01:22 -04:00
bunnei
094ae6fadb Shader: Initial implementation of x86_x64 JIT compiler for Pica vertex shaders.
- Config: Add an option for selecting to use shader JIT or interpreter.
- Qt: Add a menu option for enabling/disabling the shader JIT.
2015-08-15 18:01:07 -04:00
bunnei
d67e2f78b7 Common: Added MurmurHash3 hash function for general-purpose use. 2015-08-15 17:33:46 -04:00
bunnei
3f69c2039d Shader: Define a common interface for running vertex shader programs. 2015-08-15 17:33:44 -04:00
bunnei
18527b9e21 Shader: Move shader code to its own subdirectory, "shader". 2015-08-15 17:33:42 -04:00
bunnei
642b9b5030 GPU: Refactor "VertexShader" namespace to "Shader".
- Also renames "vertex_shader.*" to "shader_interpreter.*"
2015-08-15 17:33:41 -04:00
bunnei
35f3360663 Merge pull request #893 from linkmauve/remove-uint._t-int._t
Replace standard uint*_t and int*_t with CommonTypes’ u* and s* types
2015-08-11 17:55:24 -04:00
Emmanuel Gil Peyrot
5115d0177e ARM Core, Video Core, CitraQt, Citrace: Use CommonTypes types instead of the standard u?int*_t types. 2015-08-11 22:38:44 +01:00
Yuri Kunde Schlesner
254582aa35 OpenGL: Fix state tracking in situations with reused object handles
If an OpenGL object is created, bound to a binding using the state
tracker, and then destroyed, a newly created object can be assigned the
same numeric handle by OpenGL. However, even though it is a new object,
and thus needs to be bound to the binding again, the state tracker
compared the current and previous handles and concluded that no change
needed to be made, leading to failure to bind objects in certain cases.

This manifested as broken text in VVVVVV, which this commit fixes along
with similar texturing problems in other games.
2015-08-06 00:59:37 -03:00
Yuri Kunde Schlesner
ff68db61bc OpenGL: Remove redundant texture.enable_2d field from OpenGLState
All uses of this field where it's false can just set the texture id to 0
instead.
2015-08-05 22:55:22 -03:00
Yuri Kunde Schlesner
a96502edd3 Videocore: Implement simple vertex caching
This gives a ~2/3 reduction in the amount of vertices that need to be
processed through the vertex loaders and the vertex shader, yielding a
good speedup.
2015-08-04 23:41:47 -03:00
bunnei
bb7eb5c574 Merge pull request #1006 from yuriks/fb-commit-profile
OpenGL: Add a profiler category measuring framebuffer readback
2015-07-30 10:39:38 -04:00
bunnei
31c1bb901b Merge pull request #963 from yuriks/gpu-fixes
Misc. GPU vertex loading fixes
2015-07-29 16:45:17 -04:00
Yuri Kunde Schlesner
428154da45 OpenGL: Add a profiler category measuring framebuffer readback 2015-07-28 17:37:46 -03:00
bunnei
e1a3fed6ff Merge pull request #991 from yuriks/globjects
OpenGL: Make OpenGL object resource wrappers fully inline
2015-07-26 16:37:33 -04:00
bunnei
cb76453ec4 Merge pull request #992 from yuriks/hot-path-debug
VideoCore: #ifdef out some debugging routines
2015-07-26 11:45:51 -04:00