Ryujinx

Author	SHA1	Message	Date
riperiperi	d2f3adbf69	Texture: Fix layout conversion when gobs in z is used with depth = 1 (#5220) * Texture: Fix layout conversion when gobs in z is used with depth = 1 The size calculator methods deliberately reduce the gob size of textures if they are deemed too small for it. This is required to get correct sizes when iterating mip levels of a texture. Rendering to a slice of a 3D texture can produce a 3D texture with depth 1, but a gob size matching a much larger texture. We _can't_ "correct" this gob size, as it is intended as a slice of a larger 3D texture. Ignoring it causes layout conversion to break on read and flush. This caused an issue in Tears of the Kingdom where the compressed 3D texture used for the gloom would always break on OpenGL, and seemingly randomly break on Vulkan. In the first case, the data is forcibly flushed to decompress the BC4 texture on the CPU to upload it as 3D, which was broken due to the incorrect layout. In the second, the data may be randomly flushed if it falls out of the cache, but it will appear correct if it's able to form copy dependencies. This change only allows gob sizes to be reduced once per mip level. For the purpose of aligned size, it can still be reduced infinitely as our texture cache isn't properly able to handle a view being _misaligned_. The SizeCalculator has also been changed to reduce the size of rendered depth slices to only include the exact range a single depth slice will cover. (before, the size was way too small with gobs in z reduced to 1, and too large when using the correct value) Gobs in Y logic remains untouched, we don't support Y slices of textures so it's fine as is. This is probably worth testing in a few games as it also affects texture size and view logic. * Improve wording * Maybe a bit better	2023-06-04 20:25:57 +00:00
gdkchan	21c9ac6240	Implement shader storage buffer operations using new Load/Store instructions (#4993) * Implement storage buffer operations using new Load/Store instruction * Extend GenerateMultiTargetStorageOp to also match access with constant offset, and log and comments * Remove now unused code * Catch more complex cases of global memory usage * Shader cache version bump * Extend global access elimination to work with more shared memory cases * Change alignment requirement from 16 bytes to 8 bytes, handle cases where we need more than 16 storage buffers * Tweak preferencing to catch more cases * Enable CB0 elimination even when host storage buffer alignment is > 16 (for Intel) * Fix storage buffer bindings * Simplify some code * Shader cache version bump * Fix typo * Extend global memory elimination to handle shared memory with multiple possible offsets and local memory	2023-06-03 20:12:18 -03:00
riperiperi	4741a05df9	Vulkan: Include DepthMode in ProgramPipelineState (#5185)	2023-06-01 09:05:39 +02:00
riperiperi	c6676007bf	GPU: Dispose Renderer after running deferred actions (#5144) * GAL: Dispose Renderer after running deferred actions Deferred actions from disposing physical memory instances always dispose the resources in their caches. The renderer can't be disposed before these resources get disposed, otherwise the dispose actions will not actually run, and the ThreadedRenderer may get stuck trying to enqueue too many commands when there is nothing consuming them. This should fix most instances of the emulator freezing on close. * Wait for main render commands to finish, but keep RenderThread alive til dispose * Address some feedback. * No parameterize needed * Set thread name as part of constructor * Port to Ava and SDL2	2023-05-31 21:43:20 +00:00
gdkchan	c27e453fd3	Share ResourceManager vertex A and B shaders (#5181)	2023-05-31 17:17:50 -03:00
cstamford	dc0dbc50ab	Add support for VK_EXT_depth_clip_control. (#5027) * Add support for VK_EXT_depth_clip_control. * Code review feedback Minor formatting Co-authored-by: gdkchan <gab.dark.100@gmail.com> * Check .DepthClipControl to make sure the host actually supports the feature. * Review feedback: remove Vulkan platform switch, relying on QueryHostSupportsDepthClipControl to drive the behaviour - OpenGL returns true, and any future platforms that don't support the [-1, 1] depth mode can return false for the transformation. --------- Co-authored-by: gdkchan <gab.dark.100@gmail.com>	2023-05-28 23:31:56 +02:00
gdkchan	3b375525fb	Force reciprocal operation with value biased by constant to be precise on macOS (#5110) * Force operations to be precise in some cases on SPIR-V * Make it a bit more strict, add comments * Shader cache version bump	2023-05-26 15:19:37 -03:00
gdkchan	e6658c133c	Fix resolution scaling of image operation coordinates (#5102) * Fix resolution scaling of image operation coordinates * Shader cache version bump	2023-05-25 23:42:49 -03:00
gdkchan	8f0c89ffd6	Generate scaling helper functions on IR (#4714) * Generate scaling helper functions on IR * Delete unused code * Split RewriteTextureSample and move gather bias add to an earlier pass * Remove using * Shader cache version bump	2023-05-25 17:46:58 -03:00
makigumo	6cb6b15612	Implement p2rc, p2ri, p2rr and r2p.cc shaders (#5031) * implement P2rC, P2rI, P2rR shaders * implement R2p.CC shader * bump CodeGenVersion * address feedback	2023-05-22 17:32:15 -03:00
gdkchan	5626f2ca1c	Replace ShaderBindings with new ResourceLayout structure for Vulkan (#5025) * Introduce ResourceLayout * Part 1: Use new ResourceSegments array on UpdateAndBind * Part 2: Use ResourceLayout to build PipelineLayout * Delete old code * XML docs * Fix shader cache load NRE * Fix typo	2023-05-21 14:04:21 -03:00
gdkchan	402f05b8ef	Replace constant buffer access on shader with new Load instruction (#4646)	2023-05-20 16:19:26 -03:00
gdkchan	fb27042e01	Limit compute storage buffer size (#5028)	2023-05-20 16:15:07 +00:00
riperiperi	69a9de33d3	SPIR-V: Only allow implicit LOD sampling on fragment (#5026)	2023-05-20 15:52:26 +02:00
gdkchan	fc26189fe1	Eliminate redundant multiplications by gl_FragCoord.w on the shader (#4578) * Eliminate redundant multiplications by gl_FragCoord.w on the shader * Shader cache version bump	2023-05-19 11:52:31 -03:00
riperiperi	ecbf303266	GPU: Avoid using garbage size for non-cb0 storage buffers (#4999) * GPU: Avoid using garbage size for non-cb0 storage buffers In the depths area, Tears of the Kingdom uses a global memory access with address on constant buffer slot 6. This isn't standard and thus doesn't actually have a size 8 bytes after it, so we were reading back a garbage size that ended up very large (at least in version 1.1.0), and would synchronize a lot of data per frame. This PR makes storage buffers created from addresses outside constant buffer slot 0 get their size as the number of bytes remaining in the GPU mapping starting at the given virtual address. This should bound the buffer to a reasonable size, and ideally stop it crossing into other memory. * Limit max size * Add TODO * Feedback	2023-05-18 08:56:34 +02:00
OpaqueReptile	cb4b58052f	Start GPU performance counter at 0 instead of host GPU value (#4992) * Start performance counter at 0 instead of host perf counter value * whitespace * init _firstTimestamp in constructor per feedback * change comment * punctuation * address feedback * revise comment	2023-05-17 15:38:59 -03:00
Mary	7271f1b18e	Bump shader cache codegen version That was missing from #4892	2023-05-12 18:53:14 +02:00
riperiperi	95c06de4c1	GPU: Remove swizzle undefined matching and rework depth aliasing (#4896) * GPU: Remove swizzle undefined matching and rework depth aliasing @gdkchan pointed out that UI textures in TOTK seemed to be setting their texture swizzle incorrectly (texture was RGB but was sampling A, swizzle for A was wrong), so I determined that SwizzleComponentMatches was the problem and set on eliminating it. This PR combines existing work to select the most recently modified texture (now used when selecting which aliased texture to use) with some additional changes to remove the swizzle check and support aliased view creation. The original observation (#1538) was that we wanted to match depth textures for the purposes of aliasing with color textures, but they often had different swizzle from what was sampled (as it's generally the identity swizzle once rendered). At the time, I decided to allow swizzles to match if only the defined components matched, which fixed the issue in all known cases but could easily be broken by a game _expecting_ a given swizzle, such as a 1/0 value on a component. This error case could also occur in textures that don't even depth alias, such as R11G11B10, as the rule was created to generally apply to all cases. The solution is now to fail this exact match test, and allow the search for an R32 texture to create a swizzled view of a D32 texture (and other such cases). This allows the creation of a view that mismatches the requested format, which wasn't present before and was the reason for the swizzle matching approach. The exact match and view creation rules now follow the same rules over what textures to select when there are multiple options (such as a "perfect" match and an "aliased" match at the same time). It now selects the most recently modified texture, which is done with a new sequence number in the GpuContext (because we don't have enough of these). Reportedly fixes UI having weird coloured backgrounds in TOTK. This also fixes an issue in MK8D where returning from a race resulted in the character selection cubemaps being broken. May work around issues introduced by the "short texture cache" PR due to modification ordering, though they won't be truly fixed. Should allow (#4365) to avoid copies in more cases. Need to test that. I tested a bunch of games #1538 originally affected and they seem to be fine. This change affects all games so it would be good to get some wide testing on it. * Address feedback 1, fix an issue * Workaround: Do not allow copies for format alias. These should be removed when D32<->R32 copy dependencies become legal	2023-05-11 21:30:47 -03:00
riperiperi	95bad6995c	GPU: Fix shader cache assuming past shader data was mapped (#4885) This fixes a potential issue where a shader lookup could match the address of a previous _different_ shader, but that shader is now partially unmapped. This would just crash with an invalid region exception. To compare a shader in the address cache with one in memory, we get the memory at the location with the previous shader's size. However, it's possible it has been unmapped and then remapped with a smaller size. In this case, we should just get back the mapped portion of the shader, which will then fail the comparison immediately and get to compile/lookup for the new one. This might fix a random crash in TOTK that was reported by Piplup. I don't know if it does, because I don't have the game yet.	2023-05-11 18:41:34 +02:00
Nico	4c3d2d5d75	UI: Add progress bar for re-packaging shaders (#4805) * feat: introduce new shader loading state for progress tracking when writing shaders to disk * fix: move translation to bottom of locale file * fix: change back to foreach and add requested spacing between lines * style: fix formatting Co-authored-by: gdkchan <gab.dark.100@gmail.com> --------- Co-authored-by: gdkchan <gab.dark.100@gmail.com>	2023-05-06 15:35:46 +02:00
riperiperi	332891b5ff	Use correct offset for storage constant buffer elimination (#4821)	2023-05-05 23:59:36 +02:00
riperiperi	7df4fcada7	GPU: Remove CPU region handle containers (#4817) * GPU: Remove CPU region handle containers. Another one for the "I don't know why I didn't do this earlier" pile. This removes the "Cpu" prefixed region handle classes, which each mirror a region handle type from Ryujinx.Memory. Originally, not all projects had a reference to Ryujinx.Memory, so these classes were introduced to bridge the gap. Someone else crossed that bridge since, so these classes don't have much of a purpose anymore. This PR replaces all uses of CpuRegionHandle etc with their direct Ryujinx.Memory versions. RegionHandle methods (specifically QueryModified) are about the hottest path there is in the entire emulator, so there is a nice boost from doing this. * Add docs	2023-05-05 23:40:46 +02:00
Ikko Eltociear Ashimine	f8ec878796	Fix typo in TextureBindingsManager.cs (#4798) accomodate -> accommodate	2023-05-05 22:17:36 +02:00
gdkchan	aa021085cf	Allow any shader SSBO constant buffer slot and offset (#2237) * Allow any shader SSBO constant buffer slot and offset * Fix slot value passed to SetUsedStorageBuffer on fallback case * Shader cache version * Ensure that the storage buffer source constant buffer offset is word aligned * Fix FirstBinding on GetUniformBufferDescriptors	2023-05-05 14:20:20 +00:00
riperiperi	1f5d881860	GPU: Allow granular buffer updates from the constant buffer updater (#4749) * GPU: Allow granular buffer updates from the constant buffer updater Sometimes, constant buffer updates can't be avoided, either due to a cb0 access that cannot be eliminated, or the game updating a buffer between draws to the detriment of everyone. To avoid uploading the full 4096 bytes each time, this PR remembers the offset and size containing all constant buffer updates since the last sync. It will then upload that range after sync. * Allow clearing the dirty range * Always use precise Might want to not do this if distance between the existing range and new one is too high. * Use old force dirty mechanism when distance between regions is too great * Update src/Ryujinx.Graphics.Gpu/Memory/Buffer.cs Co-authored-by: gdkchan <gab.dark.100@gmail.com> * Fix inheritance of _dirtyStart and _dirtyEnd --------- Co-authored-by: gdkchan <gab.dark.100@gmail.com>	2023-05-05 13:47:15 +00:00
gdkchan	4d1579acbf	Fix some invalid blits involving depth textures (#4723)	2023-05-03 21:20:12 -03:00
riperiperi	2c94ac455e	GPU: Keep rendered textures without any pool references alive (#4662) * GPU: Keep sampled textures without any pool references alive Occasionally games are very wasteful and clear/write to a texture without ever sampling it. As rendered textures in NVN games seem to all have overlapping memory ranges, the texture will eventually get overwritten. Normally, this would trigger a removal from the auto delete cache, but a pool entry would keep the texture alive. However, with these textures that are never used, they will get deleted immediately and recreated on the next frame. This change makes it so the ShortTextureCache can keep textures that have never had a pool reference alive for a few frames, so they're not constantly being created and deleted. This improves performance in Zelda BOTW a little. * Cleanup	2023-05-01 16:27:51 -03:00
riperiperi	e18d258fa0	GPU: Pre-emptively flush textures that are flushed often (to imported memory when available) (#4711) * WIP texture pre-flush Improve performance of TextureView GetData to buffer Fix copy/sync ordering Fix minor bug Make this actually work WIP host mapping stuff * Fix usage flags * message * Cleanup 1 * Fix rebase * Fix * Improve pre-flush rules * Fix pre-flush * A lot of cleanup * Use the host memory bits * Select the correct memory type * Cleanup TextureGroupHandle * Missing comment * Remove debugging logs * Revert BufferHandle _value access modifier * One interrupt action at a time. * Support D32S8 to D24S8 conversion, safeguards * Interrupt cannot happen in sync handle's lock Waitable needs to be checked twice now, but this should stop it from deadlocking. * Remove unused using * Address some feedback * Address feedback * Address more feedback * Address more feedback * Improve sync rules Should allow for faster sync in some cases.	2023-05-01 16:05:12 -03:00
riperiperi	36f10df775	GPU: Fix errors handling texture remapping (#4745) * GPU: Fix errors handling texture remapping - Fixes an error where a pool entry and memory mapping changing at the same time could cause a texture to rebind its data from the wrong GPU VA (data swaps) - Fixes an error where the texture pool could act on a mapping change before the mapping has actually been changed ("Unmapped" event happens before change, we need to signal it changed _after_ it completes) TODO: remove textures from partially mapped list... if they aren't. * Add Remap actions for handling post-mapping behaviours * Remove unused code. * Address feedback * Nit	2023-05-01 15:32:32 -03:00
al81-ru	680e548022	Uneven frame pacing with vsync (#4744) fixes issue #3906	2023-04-29 21:54:41 +01:00
TSR Berry	cee7121058	Move solution and projects to src	2023-04-27 23:51:14 +02:00


82 commits