Commit graph

14 commits

Author SHA1 Message Date
riperiperi
bf77d1cab9
GPU: Pass SpanOrArray for Texture SetData to avoid copy (#3745)
* GPU: Pass SpanOrArray for Texture SetData to avoid copy

Texture data is often converted before upload, meaning that an array was allocated to perform the conversion into. However, the backend SetData methods were being passed a Span of that data, and the Multithreaded layer does `ToArray()` on it so that it can be stored for later! This method can't extract the original array, so it creates a copy.

This PR changes the type passed for textures to a new ref struct called SpanOrArray, which is backed by either a ReadOnlySpan or an array. The benefit here is that we can have a ToArray method that doesn't copy if it is originally backed by an array.

This will also avoid a copy when running the ASTC decoder.

On NieR this was taking 38% of texture upload time, which it does a _lot_ of when you move between areas, so there should be a 1.6x performance boost when strictly uploading textures. No doubt this will also improve texture streaming performance in UE4 games, and maybe a small reduction with video playback.

From the numbers, it's probably possible to improve the upload rate by a further 1.6x by performing layout conversion on GPU. I'm not sure if we could improve it further than that - multithreading conversion on CPU would probably result in memory bottleneck.

This doesn't extend to buffers, since we don't convert their data on the GPU emulator side.

* Remove implicit cast to array.
2022-10-08 12:04:47 -03:00
gdkchan
923089a298
Fast path for Inline-to-Memory texture data transfers (#3610)
* Fast path for Inline-to-Memory texture data transfers

* Only do it for block linear textures to be on the safe side
2022-08-26 02:16:41 +00:00
riperiperi
cda659955c
Texture Sync, incompatible overlap handling, data flush improvements. (#2971)
* Initial test for texture sync

* WIP new texture flushing setup

* Improve rules for incompatible overlaps

Fixes a lot of issues with Unreal Engine games. Still a few minor issues (some caused by dma fast path?) Needs docs and cleanup.

* Cleanup, improvements

Improve rules for fast DMA

* Small tweak to group together flushes of overlapping handles.

* Fixes, flush overlapping texture data for ASTC and BC4/5 compressed textures.

Fixes the new Life is Strange game.

* Flush overlaps before init data, fix 3d texture size/overlap stuff

* Fix 3D Textures, faster single layer flush

Note: nosy people can no longer merge this with Vulkan. (unless they are nosy enough to implement the new backend methods)

* Remove unused method

* Minor cleanup

* More cleanup

* Use the More Fun and Hopefully No Driver Bugs method for getting compressed tex too

This one's for metro

* Address feedback, ASTC+ETC to FormatClass

* Change offset to use Span slice rather than IntPtr Add

* Fix this too
2022-01-09 13:28:48 -03:00
riperiperi
4b60371e64
Return mapped buffer pointer directly for flush, WriteableRegion for textures (#2494)
* Return mapped buffer pointer directly for flush, WriteableRegion for textures

A few changes here to generally improve performance, even for platforms not using the persistent buffer flush.

- Texture and buffer flush now return a ReadOnlySpan<byte>. It's guaranteed that this span is pinned in memory, but it will be overwritten on the next flush from that thread, so it is expected that the data is used before calling again.
- As a result, persistent mappings no longer copy to a new array - rather the persistent map is returned directly as a Span<>. A similar host array is used for the glGet flushes instead of allocating new arrays each time.
- Texture flushes now do their layout conversion into a WriteableRegion when the texture is not MultiRange, which allows the flush to happen directly into guest memory rather than into a temporary span, then copied over. This avoids another copy when doing layout conversion.

Overall, this saves 1 data copy for buffer flush, 1 copy for linear textures with matching source/target stride, and 2 copies for block textures or linear textures with mismatching strides.

* Fix tests

* Fix array pointer for Mesa/Intel path

* Address some feedback

* Update method for getting array pointer.
2021-07-19 19:10:54 -03:00
riperiperi
b530f0e110
Texture Cache: "Texture Groups" and "Texture Dependencies" (#2001)
* Initial implementation (3d tex mips broken)

This works rather well for most games, just need to fix 3d texture mips.

* Cleanup

* Address feedback

* Copy Dependencies and various other fixes

* Fix layer/level offset for copy from view<->view.

* Remove dirty flag from dependency

The dirty flag behaviour is not needed - DeferredCopy is all we need.

* Fix tracking mip slices.

* Propagate granularity (fix astral chain)

* Address Feedback pt 1

* Save slice sizes as part of SizeInfo

* Fix nits

* Fix disposing multiple dependencies causing a crash

This list is obviously modified when removing dependencies, so create a copy of it.
2021-03-02 19:30:54 -03:00
riperiperi
5d69d9103e
Texture/Buffer Memory Management Improvements (#1408)
* Initial implementation. Still pending better valid-overlap handling,
disposed pool, compressed format flush fix.

* Very messy backend resource cache.

* Oops

* Dispose -> Release

* Improve Release/Dispose.

* More rule refinement.

* View compatibility levels as an enum - you can always know if a view is only copy compatible.

* General cleanup.

Use locking on the resource cache, as it is likely to be used by other threads in future.

* Rename resource cache to resource pool.

* Address some of the smaller nits.

* Fix regression with MK8 lens flare

Texture flushes done the old way should trigger memory tracking.

* Use TextureCreateInfo as a key.

It now implements IEquatable and generates a hashcode based on width/height.

* Fix size change for compressed+non-compressed view combos.

Before, this could set either the compressed or non compressed texture with a size with the wrong size, depending on which texture had its size changed. This caused exceptions when flushing the texture.

Now it correctly takes the block size into account, assuming that these textures are only related because a pixel in the non-compressed texture represents a block in the compressed one.

* Implement JD's suggestion for HashCode Combine

Co-authored-by: jduncanator <1518948+jduncanator@users.noreply.github.com>

* Address feedback

* Address feedback.

Co-authored-by: jduncanator <1518948+jduncanator@users.noreply.github.com>
2020-09-10 16:44:04 -03:00
riperiperi
484eb645ae
Implement Zero-Configuration Resolution Scaling (#1365)
* Initial implementation of Render Target Scaling

Works with most games I have. No GUI option right now, it is hardcoded.

Missing handling for texelFetch operation.

* Realtime Configuration, refactoring.

* texelFetch scaling on fragment shader (WIP)

* Improve Shader-Side changes.

* Fix potential crash when no color/depth bound

* Workaround random uses of textures in compute.

This was blacklisting textures in a few games despite causing no bugs. Will eventually add full support so this doesn't break anything.

* Fix scales oscillating when changing between non-native scales.

* Scaled textures on compute, cleanup, lazier uniform update.

* Cleanup.

* Fix stupidity

* Address Thog Feedback.

* Cover most of GDK's feedback (two comments remain)

* Fix bad rename

* Move IsDepthStencil to FormatExtensions, add docs.

* Fix default config, square texture detection.

* Three final fixes:

- Nearest copy when texture is integer format.
- Texture2D -> Texture3D copy correctly blacklists the texture before trying an unscaled copy (caused driver error)
- Discount small textures.

* Remove scale threshold.

Not needed right now - we'll see if we run into problems.

* All CPU modification blacklists scale.

* Fix comment.
2020-07-07 04:41:07 +02:00
gdkchan
3cb1fa0e85
Implement texture buffers (#1152)
* Implement texture buffers

* Throw NotSupportedException where appropriate
2020-04-25 23:02:18 +10:00
gdkchan
b8e3909d80 Add a GetSpan method to the memory manager and use it on GPU (#877) 2020-01-13 10:27:50 +11:00
gdkchan
9bfb373bdf Remove more unused code 2020-01-09 02:13:00 +01:00
gdkchan
654e617fe7 Some code cleanup 2020-01-09 02:13:00 +01:00
gdkchan
e25b7c9848 Initial support for the guest OpenGL driver (NVIDIA and Nouveau) 2020-01-09 02:13:00 +01:00
gdk
d786d8d2b9 Support copy of slices to 3D textures, remove old 3D render target layered render support, do not delete textures with existing views created from them 2020-01-09 02:13:00 +01:00
gdk
1876b346fe Initial work 2020-01-09 02:13:00 +01:00