Commit graph

61 commits

Author SHA1 Message Date
gdkchan 43ebd7a9bb
New shader cache implementation (#3194)
* New shader cache implementation

* Remove some debug code

* Take transform feedback varying count into account

* Create shader cache directory if it does not exist + fragment output map related fixes

* Remove debug code

* Only check texture descriptors if the constant buffer is bound

* Also check CPU VA on GetSpanMapped

* Remove more unused code and move cache related code

* XML docs + remove more unused methods

* Better codegen for TransformFeedbackDescriptor.AsSpan

* Support migration from old cache format, remove more unused code

Shader cache rebuild now also rewrites the shared toc and data files

* Fix migration error with BRX shaders

* Add a limit to the async translation queue

 Avoid async translation threads not being able to keep up and the queue growing very large

* Re-create specialization state on recompile

This might be required if a new version of the shader translator requires more or less state, or if there is a bug related to the GPU state access

* Make shader cache more error resilient

* Add some missing XML docs and move GpuAccessor docs to the interface/use inheritdoc

* Address early PR feedback

* Fix rebase

* Remove IRenderer.CompileShader and IShader interface, replace with new ShaderSource struct passed to CreateProgram directly

* Handle some missing exceptions

* Make shader cache purge delete both old and new shader caches

* Register textures on new specialization state

* Translate and compile shaders in forward order (eliminates diffs due to different binding numbers)

* Limit in-flight shader compilation to the maximum number of compilation threads

* Replace ParallelDiskCacheLoader state changed event with a callback function

* Better handling for invalid constant buffer 1 data length

* Do not create the old cache directory structure if the old cache does not exist

* Constant buffer use should be per-stage. This change will invalidate existing new caches (file format version was incremented)

* Replace rectangle texture with just coordinate normalization

* Skip incompatible shaders that are missing texture information, instead of crashing

This is required if we, for example, support new texture instruction to the shader translator, and then they allow access to textures that were not accessed before. In this scenario, the old cache entry is no longer usable

* Fix coordinates normalization on cubemap textures

* Check if title ID is null before combining shader cache path

* More robust constant buffer address validation on spec state

* More robust constant buffer address validation on spec state (2)

* Regenerate shader cache with one stream, rather than one per shader.

* Only create shader cache directory during initialization

* Logging improvements

* Proper shader program disposal

* PR feedback, and add a comment on serialized structs

* XML docs for RegisterTexture

Co-authored-by: riperiperi <rhy3756547@hotmail.com>
2022-04-10 10:49:44 -03:00
gdkchan e44a43c7e1
Implement VMAD shader instruction and improve InvocationInfo and ISBERD handling (#3251)
* Implement VMAD shader instruction and improve InvocationInfo and ISBERD handling

* Shader cache version bump

* Fix typo
2022-04-08 12:42:39 +02:00
gdkchan 0bcbe32367
Only initialize shader outputs that are actually used on the next stage (#3054)
* Only initialize shader outputs that are actually used on the next stage

* Shader cache version bump
2022-03-06 20:42:13 +01:00
gdkchan 7f6b3d234a
Implement IMUL, PCNT and CONT shader instructions, fix FFMA32I and HFMA32I (#2972)
* Implement IMUL shader instruction

* Implement PCNT/CONT instruction and fix FFMA32I

* Add HFMA232I to the table

* Shader cache version bump

* No Rc on Ffma32i
2022-01-10 12:08:00 -03:00
gdkchan 911ea38e93
Support shader gl_Color, gl_SecondaryColor and gl_TexCoord built-ins (#2817)
* Support shader gl_Color, gl_SecondaryColor and gl_TexCoord built-ins

* Shader cache version bump

* Fix back color value on fragment shader

* Disable IPA multiplication for fixed function attributes and back color selection
2021-11-08 13:18:46 -03:00
gdkchan 99445dd0a6
Add support for fragment shader interlock (#2768)
* Support coherent images

* Add support for fragment shader interlock

* Change to tree based match approach

* Refactor + check for branch targets and external registers

* Make detection more robust

* Use Intel fragment shader ordering if interlock is not available, use nothing if both are not available

* Remove unused field
2021-10-28 19:53:12 -03:00
gdkchan d512ce122c
Initial tessellation shader support (#2534)
* Initial tessellation shader support

* Nits

* Re-arrange built-in table

* This is not needed anymore

* PR feedback
2021-10-18 18:38:04 -03:00
gdkchan 7603dbe3c8
Add missing U8/S8 types from shader I2I instruction (#2740)
* Add missing U8/S8 types from shader I2I instruction

* Better names

* Fix dstIsSignedInt
2021-10-17 17:48:36 -03:00
gdkchan a7109c767b
Rewrite shader decoding stage (#2698)
* Rewrite shader decoding stage

* Fix P2R constant buffer encoding

* Fix PSET/PSETP

* PR feedback

* Log unimplemented shader instructions

* Implement NOP

* Remove using

* PR feedback
2021-10-12 22:35:31 +02:00
riperiperi 142cededd4
Implement Shader Instructions SUATOM and SURED (#2090)
* Initial Implementation

* Further improvements (no support for float/64-bit types)

* Merge atomic and reduce instructions, add missing format switch

* Fix rebase issues.

* Not used.

* Whoops. Fixed.

* Partial implementation of inc/dec, cleanup and TODOs

* Remove testing path

* Address Feedback
2021-08-31 02:51:57 -03:00
gdkchan ee1038e542
Initial support for shader attribute indexing (#2546)
* Initial support for shader attribute indexing

* Support output indexing too, other improvements

* Fix order

* Address feedback
2021-08-27 01:44:47 +02:00
gdkchan ed754af8d5
Make sure attributes used on subsequent shader stages are initialized (#2538) 2021-08-11 22:27:00 +02:00
gdkchan d9d18439f6
Use a new approach for shader BRX targets (#2532)
* Use a new approach for shader BRX targets

* Make shader cache actually work

* Improve the shader pattern matching a bit

* Extend LDC search to predecessor blocks, catches more cases

* Nit

* Only save the amount of constant buffer data actually used. Avoids crashes on partially mapped buffers

* Ignore Rd on predicate instructions, as they do not have a Rd register (catches more cases)
2021-08-11 20:59:42 +02:00
riperiperi 7ff1f9aa12
End shader decoding when reaching a block that starts with an infinite loop (after BRX) (#2367)
* End shader decoding when reaching an infinite loop

The NV shader compiler puts these at the end of shaders.

* Update shader cache version
2021-06-15 02:09:59 +02:00
gdkchan 3b90adcd1d
Fix shaders with mixed PBK and SSY addresses on the stack (#2329)
* Fix shaders with mixed PBK and SSY addresses on the stack

* Address PR feedback and nits
2021-06-03 01:41:53 +02:00
gdkchan 524fe3bea4
Implement shader HelperThreadNV (#2163)
* Implement shader HelperThreadNV

* Bump shader cache version

* Use gl_HelperInvocation since its supported across all vendors

* Nit
2021-04-02 21:50:35 +11:00
mageven a5d5ca0635
Shader Cache: Move bindless checking from translation to decode (#2145) 2021-03-27 00:50:26 +01:00
gdkchan 5be6ec6364
Fix shader LOP3 predicate write condition (#1910)
* Fix LOP3 predicate write condition

* Bump shader cache version
2021-01-14 01:07:50 +01:00
gdkchan 8e0a421264
Fix remap when handle is 0 (#1882)
* Nvservices cleanup and attempt to fix remap

* Unmap if remap handle is 0

* Remove mapped pool add from Remap
2021-01-10 10:11:31 +11:00
gdkchan b9200dd734
Support conditional on BRK and SYNC shader instructions (#1878)
* Support conditional on BRK and SYNC shader instructions

* Add TODO comment and bump cache version
2021-01-08 22:55:55 -03:00
Mary 48f6570557
Salieri: shader cache (#1701)
Here come Salieri, my implementation of a disk shader cache!

"I'm sure you know why I named it that."
"It doesn't really mean anything."

This implementation collects shaders at runtime and cache them to be later compiled when starting a game.
2020-11-13 00:15:34 +01:00
gdkchan c3d62bd078
Implement ATOM shader instruction (#1687)
* Implement ATOM shader instruction

* Fix reduction type decoding
2020-11-10 01:06:46 +01:00
gdkchan 49f970d5bd
Implement CAL and RET shader instructions (#1618)
* Add support for CAL and RET shader instructions

* Remove unused stuff

* Fix a bug that could cause the wrong values to be passed to a function

* Avoid repopulating function id dictionary every time

* PR feedback

* Fix vertex shader A/B merge
2020-10-25 17:00:44 -03:00
gdkchan 2f16491712
Get rid of Reflection.Emit dependency on CPU and Shader projects (#1626)
* Get rid of Reflection.Emit dependency on CPU and Shader projects

* Remove useless private sets

* Missed those due to the alignment
2020-10-21 09:13:44 -03:00
gdkchan f02791b20c
Fix LOP3 (cbuf) shader instruction encoding (#1616) 2020-10-13 19:33:04 -03:00
gdkchan e4777717cd
Implement LEA.HI shader instruction (#1609) 2020-10-12 21:46:04 -03:00
gdkchan b066cfc1a3
Add support for shader constant buffer slot indexing (#1608)
* Add support for shader constant buffer slot indexing

* Fix typo
2020-10-12 21:40:50 -03:00
gdkchan 0954e76a26
Improve BRX target detection heuristics (#1591) 2020-10-03 15:43:33 +10:00
gdkchan e13154c83d
Implement shader LEA instruction and improve bindless image load/store (#1355) 2020-07-04 01:48:44 +02:00
gdkchan 0b6d206daa
Omit image format if possible, and fix BA bit (#1280)
* Omit image format if possible, and fix BA bit

* Match extension name
2020-05-27 11:00:21 +02:00
Thog ff7a933ec0
Implement TMML and TMML.B (#1270)
* Implement TMML and TMML.B

This implement TMML and TMML.B instructions

* Fix TmmlB declaration alignment

* Address gdkchan's comments

* Fix inverted encoding definitions
2020-05-23 12:04:35 +02:00
gdkchan b8eb6abecc
Refactor shader GPU state and memory access (#1203)
* Refactor shader GPU state and memory access

* Fix NVDEC project build

* Address PR feedback and add missing XML comments
2020-05-06 11:02:28 +10:00
gdkchan 3cb1fa0e85
Implement texture buffers (#1152)
* Implement texture buffers

* Throw NotSupportedException where appropriate
2020-04-25 23:02:18 +10:00
gdkchan 03711dd7b5
Implement SULD shader instruction (#1117)
* Implement SULD shader instruction

* Some nits
2020-04-22 09:35:28 +10:00
gdkchan d599fba711
Implement FCMP shader instruction (#1067) 2020-03-30 12:04:00 +02:00
Chenj168 7ad8b3ef75
Move the OpActivator to OpCodeTable class for improve performance (#1001)
* Move the OpActivator to OpCodeTable class, for reduce the use of ConcurrentDictionary

* Modify code style.
2020-03-29 19:52:56 +11:00
Elise 06bf25521f
Implement NOP and stub DEPBAR shader instructions (#1041)
* Implement NOP and stub DEPBAR shader instruction

* Fix a few issues and formatting stuff

* Remove OpCodeNop/Depbar and use OpCode instead

* Fix NOP shader instruction opcode

* Fix formatting
2020-03-26 19:30:16 -03:00
gdkchan 1586450a38
Implement VMNMX shader instruction (#1032)
* Implement VMNMX shader instruction

* No need for the gap on the enum

* Fix typo
2020-03-25 15:49:10 +01:00
gdkchan 6edc929894
Implement ICMP shader instruction (#1010) 2020-03-23 17:32:30 +01:00
jduncanator 54501962f6
Fix branch with CC and predicate, and a case of SYNC propagation (#967) 2020-03-06 11:09:49 +11:00
gdkchan dc97457bf0
Initial support for double precision shader instructions. (#963)
* Implement DADD, DFMA and DMUL shader instructions

* Rename FP to FP32

* Correct double immediate

* Classic mistake
2020-03-03 15:02:08 +01:00
gdkchan 5a9dba0756
Sign-extend shader memory instruction offsets (#934) 2020-02-14 01:48:07 +01:00
gdkchan b8e3909d80 Add a GetSpan method to the memory manager and use it on GPU (#877) 2020-01-13 10:27:50 +11:00
gdkchan 29a825b43b Address PR feedback
Removes a useless null check

Aligns some values to improve readability
2020-01-09 02:13:00 +01:00
gdkchan 18814d44b2 Address PR feedback
Add TODO comment for GL_EXT_polygon_offset_clamp
2020-01-09 02:13:00 +01:00
gdkchan 2eccc7023a Partial support for shader memory barriers 2020-01-09 02:13:00 +01:00
gdkchan 6b13c5b439 Support bindless texture gather shader instruction 2020-01-09 02:13:00 +01:00
gdkchan cb171f6ebf Support shared color mask, implement more shader instructions
Support shared color masks (used by Nouveau and maybe the NVIDIA
driver).
Support draw buffers (also required by OpenGL).
Support viewport transform disable (disabled for now as it breaks some
games).
Fix instanced rendering draw being ignored for multi draw.
Fix IADD and IADD3 immediate shader encodings, that was not matching
some ops.
Implement FFMA32I shader instruction.
Implement IMAD shader instruction.
2020-01-09 02:13:00 +01:00
gdk 6a98c643ca Add a pass to turn global memory access into storage access, and do all storage related transformations on IR 2020-01-09 02:13:00 +01:00
gdk 442485adb3 Partial support for branch with CC, and fix a edge case of branch out of loop on shaders 2020-01-09 02:13:00 +01:00