Ryujinx

Author	SHA1	Message	Date
riperiperi	a1f77a5b6a	Implement lazy flush-on-read for Buffers (SSBO/Copy) (#1790 ) * Initial implementation of buffer flush (VERY WIP) * Host shaders need to be rebuilt for the SSBO write flag. * New approach with reserved regions and gl sync * Fix a ton of buffer issues. * Remove unused buffer unmapped behaviour * Revert "Remove unused buffer unmapped behaviour" This reverts commit f1700e52fb8760180ac5e0987a07d409d1e70ece. * Delete modified ranges on unmap Fixes potential crashes in Super Smash Bros, where a previously modified range could lie on either side of an unmap. * Cache some more delegates. * Dispose Sync on Close * Also create host sync for GPFifo syncpoint increment. * Copy buffer optimization, add docs * Fix race condition with OpenGL Sync * Enable read tracking on CommandBuffer, insert syncpoint on WaitForIdle * Performance: Only flush individual pages of SSBO at a time This avoids flushing large amounts of data when only a small amount is actually used. * Signal Modified rather than flushing after clear * Fix some docs and code style. * Introduce a new test for tracking memory protection. Successfully demonstrates that the bug causing write protection to be cleared by a read action has been fixed. (these tests fail on master) * Address Comments * Add host sync for SetReference This ensures that any indirect draws will correctly flush any related buffer data written before them. Fixes some flashing and misplaced world geometry in MH rise. * Make PageAlign static * Re-enable read tracking, for reads.	2021-01-17 17:08:06 -03:00
gdkchan	df820a72de	Implement clear buffer (fast path) (#1902 ) * Implement clear buffer (fast path) * Remove blank line	2021-01-13 08:50:54 +11:00
gdkchan	6ed19c1488	Fix compute reserved constant buffer updates (#1892 )	2021-01-10 21:02:58 +01:00
riperiperi	10aa11ce13	Interrupt GPU command processing when a frame's fence is reached. (#1741 ) * Interrupt GPU command processing when a frame's fence is reached. * Accumulate times rather than %s * Accurate timer for vsync Spin wait for the last .667ms of a frame. Avoids issues caused by signalling 16ms vsync. (periodic stutters in smo) * Use event wait for better timing. * Fix lazy wait Windows doesn't seem to want to do 1ms consistently, so force a spin if we're less than 2ms. * A bit more efficiency on frame waits. Should now wait the remainder 0.6667 instead of 1.6667 sometimes (odd waits above 1ms are reliable, unlike 1ms waits) * Better swap interval 0 solution 737 fps without breaking a sweat. Downside: Vsync can no longer be disabled on games that use the event heavily (link's awakening - which is ok since it breaks anyways) * Fix comment. * Address Comments.	2020-12-17 19:39:52 +01:00
riperiperi	9493cdfe55	Allow copy destination to have a different scale from source (#1711 ) * Allow copy destination to have a different scale from source Will result in more scaled copy destinations, but allows scaling in some games that copy textures to the output framebuffer. * Support copying multiple levels/layers Uses glFramebufferTextureLayer to copy multiple layers, copies levels individually (and scales the regions). Remove CopyArrayScaled, since the backend copy handles it now.	2020-11-20 17:14:45 -03:00
gdkchan	5189a807c4	Fix buffer to texture copy with remap enabled (#1721 )	2020-11-17 19:06:02 -03:00
gdkchan	787e20937f	Propagate zeta format properly (#1716 )	2020-11-16 09:37:16 +01:00
riperiperi	c652494219	Use "Screen Scissor" as size hint for render targets (#1703 ) "Screen scissor" is the minimum size of all render targets, and is set when any render target is bound on NVN or OpenGL. Since it works on all active texture's real sizes, it is therefore more reliable than viewport 0's width, and is actually set before clear. This fixes a regression with Hyrule Warriors: Age Of Calamity's cubemaps, which did not set viewport dimensions before clear. This resulted in attempting to create a cubemap with rectangular sides, which is logically and physically impossible. (also it just fails)	2020-11-13 10:40:26 +11:00
Mary	48f6570557	Salieri: shader cache (#1701 ) Here come Salieri, my implementation of a disk shader cache! "I'm sure you know why I named it that." "It doesn't really mean anything." This implementation collects shaders at runtime and cache them to be later compiled when starting a game.	2020-11-13 00:15:34 +01:00
riperiperi	02872833b6	Size hints for copy regions and viewport dimensions to avoid data loss (#1686 ) * Size hints for copy regions and viewport dimensions to avoid data loss * Reword comment. * Use info for the rule rather than calculating aligned size. * Reorder min/max, remove spaces	2020-11-09 21:41:13 -03:00
gdkchan	934a78005e	Simplify logic for bindless texture handling (#1667 ) * Simplify logic for bindless texture handling * Nits	2020-11-09 19:35:04 -03:00
gdkchan	8d168574eb	Use explicit buffer and texture bindings on shaders (#1666 ) * Use explicit buffer and texture bindings on shaders * More XML docs and other nits	2020-11-08 12:10:00 +01:00
riperiperi	5561a3b95e	Synchronize Rasterizer State before Clear (#1680 )	2020-11-07 16:21:10 -03:00
riperiperi	500b48251c	Only report that GPU commands are available when the queue is not empty. (#1656 ) * Only report that commands are available when the queue is not empty. * Address Feedback Co-authored-by: FICTURE7 <FICTURE7@gmail.com> Co-authored-by: FICTURE7 <FICTURE7@gmail.com>	2020-11-06 23:04:26 -03:00
gdkchan	24dbfc0fe6	Correct BPP of buffer to texture copies (#1670 )	2020-11-06 18:37:05 +01:00
gdkchan	a89b81a812	Separate zeta from color formats (#1647 )	2020-11-05 23:50:34 +01:00
gdkchan	2dcc6333f8	Fix image binding format (#1625 ) * Fix image binding format * XML doc	2020-10-20 19:03:20 -03:00
riperiperi	b4d8d893a4	Memory Read/Write Tracking using Region Handles (#1272 ) * WIP Range Tracking - Texture invalidation seems to have large problems - Buffer/Pool invalidation may have problems - Mirror memory tracking puts an additional `add` in compiled code, we likely just want to make HLE access slower if this is the final solution. - Native project is in the messiest possible location. - [HACK] JIT memory access always uses native "fast" path - [HACK] Trying some things with texture invalidation and views. It works :) Still a few hacks, messy things, slow things More work in progress stuff (also move to memory project) Quite a bit faster now. - Unmapping GPU VA and CPU VA will now correctly update write tracking regions, and invalidate textures for the former. - The Virtual range list is now non-overlapping like the physical one. - Fixed some bugs where regions could leak. - Introduced a weird bug that I still need to track down (consistent invalid buffer in MK8 ribbon road) Move some stuff. I think we'll eventually just put the dll and so for this in a nuget package. Fix rebase. [WIP] MultiRegionHandle variable size ranges - Avoid reprotecting regions that change often (needs some tweaking) - There's still a bug in buffers, somehow. - Might want different api for minimum granularity Fix rebase issue Commit everything needed for software only tracking. Remove native components. Remove more native stuff. Cleanup Use a separate window for the background context, update opentk. (fixes linux) Some experimental changes Should get things working up to scratch - still need to try some things with flush/modification and res scale. Include address with the region action. Initial work to make range tracking work Still a ton of bugs Fix some issues with the new stuff. * Fix texture flush instability There's still some weird behaviour, but it's much improved without this. (textures with cpu modified data were flushing over it) * Find the destination texture for Buffer->Texture full copy Greatly improves performance for nvdec videos (with range tracking) * Further improve texture tracking * Disable Memory Tracking for view parents This is a temporary approach to better match behaviour on master (where invalidations would be soaked up by views, rather than trigger twice) The assumption is that when views are created to a texture, they will cover all of its data anyways. Of course, this can easily be improved in future. * Introduce some tracking tests. WIP * Complete base tests. * Add more tests for multiregion, fix existing test. * Cleanup Part 1 * Remove unnecessary code from memory tracking * Fix some inconsistencies with 3D texture rule. * Add dispose tests. * Use a background thread for the background context. Rather than setting and unsetting a context as current, doing the work on a dedicated thread with signals seems to be a bit faster. Also nerf the multithreading test a bit. * Copy to texture with matching alignment This extends the copy to work for some videos with unusual size, such as tutorial videos in SMO. It will only occur if the destination texture already exists at XCount size. * Track reads for buffer copies. Synchronize new buffers before copying overlaps. * Remove old texture flushing mechanisms. Range tracking all the way, baby. * Wake the background thread when disposing. Avoids a deadlock when games are closed. * Address Feedback 1 * Separate TextureCopy instance for background thread Also `BackgroundContextWorker.InBackground` for a more sensible identifier for if we're in a background thread. * Add missing XML docs. * Address Feedback * Maybe I should start drinking coffee. * Some more feedback. * Remove flush warning, Refocus window after making background context	2020-10-16 17:18:35 -03:00
gdkchan	bd28ce90e6	Implement small indexed draws and other fixes to make guest Vulkan work (#1558 )	2020-09-24 09:48:34 +10:00
gdkchan	1eea35554c	Better viewport flipping and depth mode detection method (#1556 ) * Use a better viewport flipping approach * New approach to detect depth mode * nit: Sort method on the OpenGL backend * Adjust spacing on comment * Unswap near and far parameters based on ScaleZ	2020-09-19 19:46:49 -03:00
riperiperi	5d69d9103e	Texture/Buffer Memory Management Improvements (#1408 ) * Initial implementation. Still pending better valid-overlap handling, disposed pool, compressed format flush fix. * Very messy backend resource cache. * Oops * Dispose -> Release * Improve Release/Dispose. * More rule refinement. * View compatibility levels as an enum - you can always know if a view is only copy compatible. * General cleanup. Use locking on the resource cache, as it is likely to be used by other threads in future. * Rename resource cache to resource pool. * Address some of the smaller nits. * Fix regression with MK8 lens flare Texture flushes done the old way should trigger memory tracking. * Use TextureCreateInfo as a key. It now implements IEquatable and generates a hashcode based on width/height. * Fix size change for compressed+non-compressed view combos. Before, this could set either the compressed or non compressed texture with a size with the wrong size, depending on which texture had its size changed. This caused exceptions when flushing the texture. Now it correctly takes the block size into account, assuming that these textures are only related because a pixel in the non-compressed texture represents a block in the compressed one. * Implement JD's suggestion for HashCode Combine Co-authored-by: jduncanator <1518948+jduncanator@users.noreply.github.com> * Address feedback * Address feedback. Co-authored-by: jduncanator <1518948+jduncanator@users.noreply.github.com>	2020-09-10 16:44:04 -03:00
sharmander	bc19114bb5	Fix: Issue #1475 Texture Compatibility Check methods need to be centralized (#1482 ) * Texture Compatibility Check methods need to be centralized #1475 * Fix spacing * Fix spacing * Undo removal of .ToString() * Move isPerfectMatch back to Texture.cs Rename parameters in TextureCompatibility.cs for consistency * Add switch from 1474 to TextureCompatibility as requested by mageven. * Actually add TextureCompatibility changes to the PR (Add DeriveDepthFormat method) * Alignment corrections + Derive method signature adjustment. * Removed empty line as requested * Remove empty lines * Remove blank lines, fix alignment * Fix alignment * Remove empty line	2020-08-31 21:06:27 -03:00
mageven	2a314f3c28	Add missing depth-color conversions in CopyTexture (#1474 ) * Add missing depth-color conversions in CopyTexture * Whitespace * switch expression	2020-08-14 20:03:19 +10:00
LDj3SNuD	8624dd8de6	Fix MacroJit SubtractWithBorrow Alu Reg Operation. (#1473 )	2020-08-13 12:08:48 -03:00
gdkchan	157ad3f54f	Silence several build warnings (#1428 ) * Silence several build warnings * Remove fixed buffers from NVDEC struct * Remove unused field and usings * Fix wrong name * Silence more warning on H264 PictureInfo	2020-08-06 23:40:41 +02:00
mageven	a33dc2f491	Improved Logger (#1292 ) * Logger class changes only Now compile-time checking is possible with the help of Nullable Value types. * Misc formatting * Manual optimizations PrintGuestLog PrintGuestStackTrace Surfaceflinger DequeueBuffer * Reduce SendVibrationXX log level to Debug * Add Notice log level This level is always enabled and used to print system info, etc... Also, rewrite LogColor to switch expression as colors are static * Unify unhandled exception event handlers * Print enabled LogLevels during init * Re-add App Exit disposes in proper order nit: switch case spacing * Revert PrintGuestStackTrace to Info logs due to #1407 PrintGuestStackTrace is now called in some critical error handlers so revert to old behavior as KThread isn't part of Guest. * Batch replace Logger statements	2020-08-04 01:32:53 +02:00
gdkchan	60db4c3530	Implement a Macro JIT (#1445 ) * Implement a Macro JIT * Nit: space	2020-08-03 03:36:57 +02:00
gdkchan	43c13057da	Implement alpha test using legacy functions (#1426 )	2020-07-28 18:30:08 -03:00
gdkchan	51fbc1fde4	Use polygon offset clamp if supported (#1429 )	2020-07-26 18:11:28 -03:00
gdkchan	111534a74e	Remove GPU MemoryAccessor (#1423 ) * Remove GPU MemoryAccessor * Update outdated XML doc * Update more outdated stuff	2020-07-25 16:39:45 +10:00
gdkchan	5a7df48975	New GPFifo and fast guest constant buffer updates (#1400 ) * Add new structures from official docs, start migrating GPFifo * Finish migration to new GPFifo processor * Implement fast constant buffer data upload * Migrate to new GPFifo class * XML docs	2020-07-23 23:53:25 -03:00
mageven	723ae240dc	GL: Implement more Point parameters (#1399 ) * Fix GL_INVALID_VALUE on glPointSize calls * Implement more of Point primitive state * Use existing Origin enum	2020-07-20 21:59:13 -03:00
gdkchan	788ca6a411	Initial transform feedback support (#1370 ) * Initial transform feedback support * Some nits and fixes * Update ReportCounterType and Write method * Can't change shader or TFB bindings while TFB is active * Fix geometry shader input names with new naming	2020-07-15 13:01:10 +10:00
gdkchan	4d02a2d2c0	New NVDEC and VIC implementation (#1384 ) * Initial NVDEC and VIC implementation * Update FFmpeg.AutoGen to 4.3.0 * Add nvdec dependencies for Windows * Unify some VP9 structures * Rename VP9 structure fields * Improvements to Video API * XML docs for Common.Memory * Remove now unused or redundant overloads from MemoryAccessor * NVDEC UV surface read/write scalar paths * Add FIXME comments about hacky things/stuff that will need to be fixed in the future * Cleaned up VP9 memory allocation * Remove some debug logs * Rename some VP9 structs * Remove unused struct * No need to compile Ryujinx.Graphics.Host1x with unsafe anymore * Name AsyncWorkQueue threads to make debugging easier * Make Vp9PictureInfo a ref struct * LayoutConverter no longer needs the depth argument (broken by rebase) * Pooling of VP9 buffers, plus fix a memory leak on VP9 * Really wish VS could rename projects properly... * Address feedback * Remove using * Catch OperationCanceledException * Add licensing informations * Add THIRDPARTY.md to release too Co-authored-by: Thog <me@thog.eu>	2020-07-12 05:07:01 +02:00
riperiperi	f224769c49	Implement Logical Operation registers and functionality (#1380 ) * Implement Logical Operation registers and functionality. * Address Feedback 1	2020-07-10 14:23:15 -03:00
riperiperi	484eb645ae	Implement Zero-Configuration Resolution Scaling (#1365 ) * Initial implementation of Render Target Scaling Works with most games I have. No GUI option right now, it is hardcoded. Missing handling for texelFetch operation. * Realtime Configuration, refactoring. * texelFetch scaling on fragment shader (WIP) * Improve Shader-Side changes. * Fix potential crash when no color/depth bound * Workaround random uses of textures in compute. This was blacklisting textures in a few games despite causing no bugs. Will eventually add full support so this doesn't break anything. * Fix scales oscillating when changing between non-native scales. * Scaled textures on compute, cleanup, lazier uniform update. * Cleanup. * Fix stupidity * Address Thog Feedback. * Cover most of GDK's feedback (two comments remain) * Fix bad rename * Move IsDepthStencil to FormatExtensions, add docs. * Fix default config, square texture detection. * Three final fixes: - Nearest copy when texture is integer format. - Texture2D -> Texture3D copy correctly blacklists the texture before trying an unscaled copy (caused driver error) - Discount small textures. * Remove scale threshold. Not needed right now - we'll see if we run into problems. * All CPU modification blacklists scale. * Fix comment.	2020-07-07 04:41:07 +02:00
gdkchan	76e5af967a	Fix buffer to 3D texture copy (#1354 )	2020-07-04 01:37:36 +02:00
gdkchan	dbeb50684d	Support inline index buffer data (#1351 ) * Support inline index buffer data * Sort usings	2020-07-04 00:41:27 +02:00
gdkchan	b0d9ec8a82	Fix compute restore of previous shader state (#1352 )	2020-07-04 00:30:41 +02:00
gdkchan	96951b7d04	Fix regression caused by wrong SB descriptor offset (#1316 )	2020-06-22 13:48:32 +02:00
riperiperi	bea1fc2e8d	Optimize texture format conversion, and MethodCopyBuffer (#1274 ) * Improve performance when converting texture formats. Still more work to do. * Speed up buffer -> texture copies. No longer copies byte by byte. Fast path when formats are identical. * Fix a few things, 64 byte block fast copy. * Spacing cleanup, unrelated change. * Fix base offset calculation for region copies. * Fix Linear -> BlockLinear * Fix some nits. (part 1 of review feedback) * Use a generic version of the Convert* functions rather than lambdas. This is some real monkey's paw shit. * Remove unnecessary span constructor. * Revert "Use a generic version of the Convert* functions rather than lambdas." This reverts commit `aa43dcfbe8`. * Fix bug with rectangle destination writing, better rectangle calculation for linear textures.	2020-06-13 19:31:06 -03:00
gdkchan	44d7fcff39	Implement FIFO semaphore (#1286 ) * Implement FIFO semaphore * New enum for FIFO semaphore operation	2020-05-29 10:51:10 +02:00
gdkchan	a15b951721	Fix wrong face culling once and for all (#1277 ) * Viewport swizzle support on NV and clip origin * Initialize default viewport swizzle state, emulate viewport swizzle on shaders when not supported * Address PR feedback	2020-05-28 09:03:07 +10:00
gdkchan	5795bb1528	Support separate textures and samplers (#1216 ) * Support separate textures and samplers * Add missing bindless flag, fix SNORM format on buffer textures * Add missing separation * Add comments about the new handles	2020-05-27 16:07:10 +02:00
gdkchan	5011640b30	Spanify Graphics Abstraction Layer (#1226 ) * Spanify Graphics Abstraction Layer * Be explicit about BufferHandle size	2020-05-23 11:46:09 +02:00
gdkchan	b8eb6abecc	Refactor shader GPU state and memory access (#1203 ) * Refactor shader GPU state and memory access * Fix NVDEC project build * Address PR feedback and add missing XML comments	2020-05-06 11:02:28 +10:00
riperiperi	cd48576f58	Implement Counter Queue and Partial Host Conditional Rendering (#1167 ) * Implementation of query queue and host conditional rendering * Resolve some comments. * Use overloads instead of passing object. * Wake the consumer threads when incrementing syncpoints. Also, do a busy loop when awaiting the counter for a blocking flush, rather than potentially sleeping the thread. * Ensure there's a command between begin and end query.	2020-05-04 12:24:59 +10:00
mageven	53369e79bd	Implement user-defined clipping on GL state pipeline (#1118 )	2020-05-04 12:04:49 +10:00
riperiperi	c2ac45adc5	Fix depth clamp enable bit, unit scale for polygon offset. (#1178 ) Verified with deko3d and opengl driver code.	2020-04-30 11:47:24 +10:00
gdkchan	3cb1fa0e85	Implement texture buffers (#1152 ) * Implement texture buffers * Throw NotSupportedException where appropriate	2020-04-25 23:02:18 +10:00

96 commits