Fast path for Inline2Memory buffer write that skips write tracking (#2624)

* Fast path for Inline2Memory buffer write

This PR adds a method to PhysicalMemory that attempts to write data destined for cached resources directly, so that memory tracking can be avoided. The goal is both to avoid flushing buffer data and to avoid raising the sequence number when data is written, since raising it causes buffer and texture handles to be re-checked.

This currently only targets buffers, with a side check on textures that falls back to a tracked write if any exist within the target range. Textures are not expected to be written from here - the check is just a mechanism to protect us if someone does decide to do that. It's possible to add a fast path for this in future (and for ShaderCache, once that starts using tracking).

The forced read before Inline2Memory begins has been removed, as the data is fully written when the transfer is completed anyway. This allows us to flush on read in emergency situations, but still write the new data over the flushed data.

Improves performance on Xenoblade 2 and DE, which were flushing buffer data on the GPU thread when trying to write compute data. May improve performance in other games that write SSBOs from compute and often update data in the same or nearby pages.

Super Smash Bros Ultimate should probably be tested to make sure the vertex explosions haven't returned, as I think that's what this AdvanceSequence was for.

* ForceDirty before write, to make sure data does not flush over the new write
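
As an illustration of the write path this commit message describes, here is a minimal sketch in C#. It is a toy model under stated assumptions, not the Ryujinx implementation: `ToyMemory`, its counters, and the list-based overlap check are hypothetical stand-ins for PhysicalMemory, the tracking machinery, and the texture cache, and the buffer ForceDirty step is reduced to a comment.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy model of the fast path; none of these types exist in Ryujinx.
public sealed class ToyMemory
{
    private readonly byte[] _backing;
    private readonly List<(ulong Address, ulong Size)> _textures = new List<(ulong Address, ulong Size)>();

    // Stand-in for the sequence number that forces handles to be re-checked.
    public int SequenceNumber { get; private set; }

    // Counts writes that went through the (simulated) tracked path.
    public int TrackedWrites { get; private set; }

    public ToyMemory(int size)
    {
        _backing = new byte[size];
    }

    public void AddTexture(ulong address, ulong size)
    {
        _textures.Add((address, size));
    }

    // True if any registered texture overlaps [address, address + size).
    public bool IsTextureInRange(ulong address, ulong size)
    {
        foreach (var (texAddress, texSize) in _textures)
        {
            if (address < texAddress + texSize && texAddress < address + size)
            {
                return true;
            }
        }

        return false;
    }

    // Simulated tracked write: a real one would fire write tracking handlers.
    public void Write(ulong address, ReadOnlySpan<byte> data)
    {
        TrackedWrites++;
        data.CopyTo(_backing.AsSpan((int)address));
    }

    // Simulated untracked write: copies the bytes and nothing else.
    public void WriteUntracked(ulong address, ReadOnlySpan<byte> data)
    {
        data.CopyTo(_backing.AsSpan((int)address));
    }

    public void CacheResourceWrite(ulong address, ReadOnlySpan<byte> data)
    {
        if (IsTextureInRange(address, (ulong)data.Length))
        {
            // Slow path: tracked write, then bump the sequence number.
            Write(address, data);
            SequenceNumber++;
        }
        else
        {
            // Fast path: the real code first marks overlapping buffers dirty
            // (BufferCache.ForceDirty); that step is omitted in this sketch.
            WriteUntracked(address, data);
        }
    }

    public byte ReadByte(ulong address)
    {
        return _backing[(int)address];
    }
}

public static class Program
{
    public static void Main()
    {
        var memory = new ToyMemory(64);
        memory.AddTexture(32, 16); // texture covers bytes 32..47

        memory.CacheResourceWrite(0, new byte[] { 1, 2, 3, 4 }); // no texture overlap
        Console.WriteLine($"buffer write: sequence={memory.SequenceNumber}, tracked={memory.TrackedWrites}");

        memory.CacheResourceWrite(40, new byte[] { 5, 6 }); // overlaps the texture
        Console.WriteLine($"texture write: sequence={memory.SequenceNumber}, tracked={memory.TrackedWrites}");
    }
}
```

In this sketch the first write leaves both counters at 0 and the second raises each to 1, which mirrors the intent above: buffer-only writes skip tracking and the sequence bump, while any texture in range forces the tracked fallback.
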
riperiperi 2021-09-19 14:09:53 +01:00 committed by GitHub
parent db97b1d7d2
commit 7c5ead1c19
3 changed files with 37 additions and 6 deletions


@@ -110,9 +110,6 @@ namespace Ryujinx.Graphics.Gpu.Engine.InlineToMemory
             ulong dstGpuVa = ((ulong)state.OffsetOutUpperValue << 32) | state.OffsetOut;
-            // Trigger read tracking, to flush any managed resources in the destination region.
-            _channel.MemoryManager.GetSpan(dstGpuVa, _size, true);
             _dstGpuVa = dstGpuVa;
             _dstX = state.SetDstOriginBytesXV;
             _dstY = state.SetDstOriginSamplesYV;
@@ -174,7 +171,7 @@ namespace Ryujinx.Graphics.Gpu.Engine.InlineToMemory
             if (_isLinear && _lineCount == 1)
             {
-                memoryManager.Write(_dstGpuVa, data);
+                memoryManager.Physical.CacheResourceWrite(memoryManager, _dstGpuVa, data);
             }
             else
             {
@@ -227,11 +224,11 @@ namespace Ryujinx.Graphics.Gpu.Engine.InlineToMemory
                         memoryManager.Write(dstAddress, data[srcOffset]);
                     }
                 }
+                _context.AdvanceSequence();
             }
             _finished = true;
-            _context.AdvanceSequence();
         }
     }
 }


@@ -99,6 +99,18 @@ namespace Ryujinx.Graphics.Gpu.Image
             return TextureScaleMode.Blacklisted;
         }
+        /// <summary>
+        /// Determines if any texture exists within the target memory range.
+        /// </summary>
+        /// <param name="memoryManager">The GPU memory manager</param>
+        /// <param name="gpuVa">GPU virtual address to search for textures</param>
+        /// <param name="size">The size of the range</param>
+        /// <returns>True if any texture exists in the range, false otherwise</returns>
+        public bool IsTextureInRange(MemoryManager memoryManager, ulong gpuVa, ulong size)
+        {
+            return _textures.FindOverlaps(memoryManager.GetPhysicalRegions(gpuVa, size), ref _textureOverlaps) != 0;
+        }
         /// <summary>
         /// Determines if a given texture is "safe" for upscaling from its info.
         /// Note that this is different from being compatible - this eliminates targets that would have detrimental effects when scaled.


@@ -80,6 +80,28 @@ namespace Ryujinx.Graphics.Gpu.Memory
             }
         }
+        /// <summary>
+        /// Write data to memory that is destined for a resource in a cache.
+        /// This avoids triggering write tracking when possible, which can avoid flushes and incrementing sequence number.
+        /// </summary>
+        /// <param name="memoryManager">The GPU memory manager</param>
+        /// <param name="gpuVa">GPU virtual address to write the data into</param>
+        /// <param name="data">The data to be written</param>
+        public void CacheResourceWrite(MemoryManager memoryManager, ulong gpuVa, ReadOnlySpan<byte> data)
+        {
+            if (TextureCache.IsTextureInRange(memoryManager, gpuVa, (ulong)data.Length))
+            {
+                // No fast path yet - copy the data back and trigger write tracking.
+                memoryManager.Write(gpuVa, data);
+                _context.AdvanceSequence();
+            }
+            else
+            {
+                BufferCache.ForceDirty(memoryManager, gpuVa, (ulong)data.Length);
+                memoryManager.WriteUntracked(gpuVa, data);
+            }
+        }
         /// <summary>
         /// Gets a span of data from the application process.
         /// </summary>