Memory Read/Write Tracking using Region Handles (#1272)
* WIP Range Tracking
- Texture invalidation seems to have large problems
- Buffer/Pool invalidation may have problems
- Mirror memory tracking puts an additional `add` in compiled code, we likely just want to make HLE access slower if this is the final solution.
- Native project is in the messiest possible location.
- [HACK] JIT memory access always uses native "fast" path
- [HACK] Trying some things with texture invalidation and views.
It works :)
Still a few hacks, messy things, slow things
More work in progress stuff (also move to memory project)
Quite a bit faster now.
- Unmapping GPU VA and CPU VA will now correctly update write tracking regions, and invalidate textures for the former.
- The Virtual range list is now non-overlapping like the physical one.
- Fixed some bugs where regions could leak.
- Introduced a weird bug that I still need to track down (consistent invalid buffer in MK8 ribbon road)
Move some stuff.
I think we'll eventually just put the dll and so for this in a nuget package.
Fix rebase.
[WIP] MultiRegionHandle variable size ranges
- Avoid reprotecting regions that change often (needs some tweaking)
- There's still a bug in buffers, somehow.
- Might want different api for minimum granularity
Fix rebase issue
Commit everything needed for software only tracking.
Remove native components.
Remove more native stuff.
Cleanup
Use a separate window for the background context, update opentk. (fixes linux)
Some experimental changes
Should get things working up to scratch - still need to try some things with flush/modification and res scale.
Include address with the region action.
Initial work to make range tracking work
Still a ton of bugs
Fix some issues with the new stuff.
* Fix texture flush instability
There's still some weird behaviour, but it's much improved without this. (textures with cpu modified data were flushing over it)
* Find the destination texture for Buffer->Texture full copy
Greatly improves performance for nvdec videos (with range tracking)
* Further improve texture tracking
* Disable Memory Tracking for view parents
This is a temporary approach to better match behaviour on master (where invalidations would be soaked up by views, rather than trigger twice)
The assumption is that when views are created to a texture, they will cover all of its data anyways. Of course, this can easily be improved in future.
* Introduce some tracking tests.
WIP
* Complete base tests.
* Add more tests for multiregion, fix existing test.
* Cleanup Part 1
* Remove unnecessary code from memory tracking
* Fix some inconsistencies with 3D texture rule.
* Add dispose tests.
* Use a background thread for the background context.
Rather than setting and unsetting a context as current, doing the work on a dedicated thread with signals seems to be a bit faster.
Also nerf the multithreading test a bit.
* Copy to texture with matching alignment
This extends the copy to work for some videos with unusual size, such as tutorial videos in SMO. It will only occur if the destination texture already exists at XCount size.
* Track reads for buffer copies. Synchronize new buffers before copying overlaps.
* Remove old texture flushing mechanisms.
Range tracking all the way, baby.
* Wake the background thread when disposing.
Avoids a deadlock when games are closed.
* Address Feedback 1
* Separate TextureCopy instance for background thread
Also add `BackgroundContextWorker.InBackground` as a more sensible identifier for whether we're in a background thread.
* Add missing XML docs.
* Address Feedback
* Maybe I should start drinking coffee.
* Some more feedback.
* Remove flush warning, Refocus window after making background context
2020-10-16 22:18:35 +02:00
using Ryujinx.Cpu.Tracking;
using Ryujinx.Graphics.GAL;
using Ryujinx.Memory.Range;
using Ryujinx.Memory.Tracking;
using System;

namespace Ryujinx.Graphics.Gpu.Memory
{
    /// <summary>
    /// Buffer, used to store vertex and index data, uniform and storage buffers, and others.
    /// </summary>
    class Buffer : IRange, IDisposable
    {
        private static ulong GranularBufferThreshold = 4096;

        private readonly GpuContext _context;

        /// <summary>
        /// Host buffer handle.
        /// </summary>
        public BufferHandle Handle { get; }

        /// <summary>
        /// Start address of the buffer in guest memory.
        /// </summary>
        public ulong Address { get; }

        /// <summary>
        /// Size of the buffer in bytes.
        /// </summary>
        public ulong Size { get; }

        /// <summary>
        /// End address of the buffer in guest memory.
        /// </summary>
        public ulong EndAddress => Address + Size;

        /// <summary>
        /// Ranges of the buffer that have been modified on the GPU.
        /// Ranges defined here cannot be updated from CPU until a CPU waiting sync point is reached.
        /// Then, write tracking will signal, wait for GPU sync (generated at the syncpoint) and flush these regions.
        /// </summary>
        /// <remarks>
        /// This is null until at least one modification occurs.
        /// </remarks>
        private BufferModifiedRangeList _modifiedRanges = null;

        private CpuMultiRegionHandle _memoryTrackingGranular;

        private CpuRegionHandle _memoryTracking;

        private readonly RegionSignal _externalFlushDelegate;
        private readonly Action<ulong, ulong> _loadDelegate;
        private readonly Action<ulong, ulong> _modifiedDelegate;

        private int _sequenceNumber;

        private bool _useGranular;
        private bool _syncActionRegistered;

        /// <summary>
        /// Creates a new instance of the buffer.
        /// </summary>
        /// <param name="context">GPU context that the buffer belongs to</param>
        /// <param name="address">Start address of the buffer</param>
        /// <param name="size">Size of the buffer in bytes</param>
        public Buffer(GpuContext context, ulong address, ulong size)
        {
            _context = context;
            Address = address;
            Size = size;

            Handle = context.Renderer.CreateBuffer((int)size);

            _useGranular = size > GranularBufferThreshold;

2020-10-16 22:18:35 +02:00
|
|
|
            if (_useGranular)
            {
                _memoryTrackingGranular = context.PhysicalMemory.BeginGranularTracking(address, size);
            }
            else
            {
                _memoryTracking = context.PhysicalMemory.BeginTracking(address, size);
            }

            _externalFlushDelegate = new RegionSignal(ExternalFlush);
            _loadDelegate = new Action<ulong, ulong>(LoadRegion);
            _modifiedDelegate = new Action<ulong, ulong>(RegionModified);
        }

        /// <summary>
        /// Gets a sub-range from the buffer, from a start address till the end of the buffer.
        /// </summary>
        /// <remarks>
        /// This can be used to bind and use sub-ranges of the buffer on the host API.
        /// </remarks>
        /// <param name="address">Start address of the sub-range, must be greater than or equal to the buffer address</param>
        /// <returns>The buffer sub-range</returns>
        public BufferRange GetRange(ulong address)
        {
            ulong offset = address - Address;

            return new BufferRange(Handle, (int)offset, (int)(Size - offset));
        }

        /// <summary>
        /// Gets a sub-range from the buffer.
        /// </summary>
        /// <remarks>
        /// This can be used to bind and use sub-ranges of the buffer on the host API.
        /// </remarks>
        /// <param name="address">Start address of the sub-range, must be greater than or equal to the buffer address</param>
        /// <param name="size">Size in bytes of the sub-range, must be less than or equal to the buffer size</param>
        /// <returns>The buffer sub-range</returns>
        public BufferRange GetRange(ulong address, ulong size)
        {
            int offset = (int)(address - Address);

            return new BufferRange(Handle, offset, (int)size);
        }

        /// <summary>
        /// Checks if a given range overlaps with the buffer.
        /// </summary>
        /// <param name="address">Start address of the range</param>
        /// <param name="size">Size in bytes of the range</param>
        /// <returns>True if the range overlaps, false otherwise</returns>
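        /// <example>
        /// Illustrative values only: assuming EndAddress is Address + Size, a buffer spanning
        /// [0x1000, 0x2000) overlaps the range (address 0x1FFF, size 1) but not
        /// (address 0x2000, size 0x10), since both ranges are treated as half-open.
        /// </example>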
        public bool OverlapsWith(ulong address, ulong size)
        {
            return Address < address + size && address < EndAddress;
        }

        /// <summary>
        /// Performs guest to host memory synchronization of the buffer data.
        /// </summary>
        /// <remarks>
        /// This causes the buffer data to be overwritten if a write was detected from the CPU,
        /// since the last call to this method.
        /// </remarks>
        /// <param name="address">Start address of the range to synchronize</param>
        /// <param name="size">Size in bytes of the range to synchronize</param>
        public void SynchronizeMemory(ulong address, ulong size)
        {
            if (_useGranular)
            {
                _memoryTrackingGranular.QueryModified(address, size, _modifiedDelegate, _context.SequenceNumber);
            }
            else
            {
                if (_memoryTracking.Dirty && _context.SequenceNumber != _sequenceNumber)
                {
                    _memoryTracking.Reprotect();

                    if (_modifiedRanges != null)
                    {
                        _modifiedRanges.ExcludeModifiedRegions(Address, Size, _loadDelegate);
                    }
                    else
                    {
                        _context.Renderer.SetBufferData(Handle, 0, _context.PhysicalMemory.GetSpan(Address, (int)Size));
                    }

                    _sequenceNumber = _context.SequenceNumber;
                }
            }
        }

        /// <summary>
        /// Ensure that the modified range list exists.
        /// </summary>
        private void EnsureRangeList()
        {
            if (_modifiedRanges == null)
            {
                _modifiedRanges = new BufferModifiedRangeList(_context);
            }
        }

        /// <summary>
        /// Signal that the given region of the buffer has been modified.
        /// </summary>
        /// <param name="address">The start address of the modified region</param>
        /// <param name="size">The size of the modified region</param>
        public void SignalModified(ulong address, ulong size)
        {
            EnsureRangeList();

            _modifiedRanges.SignalModified(address, size);

            if (!_syncActionRegistered)
            {
                _context.RegisterSyncAction(SyncAction);
                _syncActionRegistered = true;
            }
        }

        /// <summary>
        /// Indicate that modifications in a given region of this buffer have been overwritten.
        /// </summary>
        /// <param name="address">The start address of the region</param>
        /// <param name="size">The size of the region</param>
        public void ClearModified(ulong address, ulong size)
        {
            if (_modifiedRanges != null)
            {
                _modifiedRanges.Clear(address, size);
            }
        }

        /// <summary>
        /// Action to be performed when a syncpoint is reached after modification.
        /// This will register read/write tracking to flush the buffer from GPU when its memory is used.
        /// </summary>
        private void SyncAction()
        {
            _syncActionRegistered = false;

            if (_useGranular)
            {
                _modifiedRanges.GetRanges(Address, Size, (address, size) =>
                {
                    _memoryTrackingGranular.RegisterAction(address, size, _externalFlushDelegate);
                    SynchronizeMemory(address, size);
                });
            }
            else
            {
                _memoryTracking.RegisterAction(_externalFlushDelegate);
                SynchronizeMemory(Address, Size);
            }
        }

        /// <summary>
        /// Inherit modified ranges from another buffer.
        /// </summary>
        /// <param name="from">The buffer to inherit from</param>
        public void InheritModifiedRanges(Buffer from)
        {
            if (from._modifiedRanges != null)
            {
                if (from._syncActionRegistered && !_syncActionRegistered)
                {
                    _context.RegisterSyncAction(SyncAction);
                    _syncActionRegistered = true;
                }

                EnsureRangeList();
                _modifiedRanges.InheritRanges(from._modifiedRanges, (ulong address, ulong size) =>
                {
                    if (_useGranular)
                    {
                        _memoryTrackingGranular.RegisterAction(address, size, _externalFlushDelegate);
                    }
                    else
                    {
                        _memoryTracking.RegisterAction(_externalFlushDelegate);
                    }
                });
            }
        }

        /// <summary>
        /// Determine if a given region of the buffer has been modified, and must be flushed.
        /// </summary>
        /// <param name="address">The start address of the region</param>
        /// <param name="size">The size of the region</param>
        /// <returns>True if the region has been modified, false otherwise</returns>
        public bool IsModified(ulong address, ulong size)
        {
            if (_modifiedRanges != null)
            {
                return _modifiedRanges.HasRange(address, size);
            }

            return false;
        }

        /// <summary>
        /// Indicate that a region of the buffer was modified, and must be loaded from memory.
        /// </summary>
        /// <param name="mAddress">Start address of the modified region</param>
        /// <param name="mSize">Size of the modified region</param>
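        /// <example>
        /// Illustrative values only: for a buffer with Address 0x1000 and Size 0x1000, a call with
        /// mAddress 0x800 and mSize 0x2000 is clamped to mAddress 0x1000 and mSize 0x1000
        /// before any data is loaded.
        /// </example>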
        private void RegionModified(ulong mAddress, ulong mSize)
        {
            if (mAddress < Address)
            {
                mAddress = Address;
            }

            ulong maxSize = Address + Size - mAddress;

            if (mSize > maxSize)
            {
                mSize = maxSize;
            }

            if (_modifiedRanges != null)
            {
                _modifiedRanges.ExcludeModifiedRegions(mAddress, mSize, _loadDelegate);
            }
            else
            {
                LoadRegion(mAddress, mSize);
            }
        }

        /// <summary>
        /// Load a region of the buffer from memory.
        /// </summary>
        /// <param name="mAddress">Start address of the modified region</param>
        /// <param name="mSize">Size of the modified region</param>
        private void LoadRegion(ulong mAddress, ulong mSize)
        {
            int offset = (int)(mAddress - Address);

            _context.Renderer.SetBufferData(Handle, offset, _context.PhysicalMemory.GetSpan(mAddress, (int)mSize));
        }

        /// <summary>
        /// Performs copy of all the buffer data from one buffer to another.
        /// </summary>
        /// <param name="destination">The destination buffer to copy the data into</param>
        /// <param name="dstOffset">The offset of the destination buffer to copy into</param>
        public void CopyTo(Buffer destination, int dstOffset)
        {
            _context.Renderer.Pipeline.CopyBuffer(Handle, destination.Handle, 0, dstOffset, (int)Size);
        }

        /// <summary>
        /// Flushes a range of the buffer.
        /// This writes the range data back into guest memory.
        /// </summary>
        /// <param name="address">Start address of the range</param>
        /// <param name="size">Size in bytes of the range</param>
        public void Flush(ulong address, ulong size)
        {
            int offset = (int)(address - Address);

            byte[] data = _context.Renderer.GetBufferData(Handle, offset, (int)size);

            // TODO: When write tracking shaders, they will need to be aware of changes in overlapping buffers.
            _context.PhysicalMemory.WriteUntracked(address, data);
        }

        /// <summary>
        /// Align a given address and size region to page boundaries.
        /// </summary>
        /// <param name="address">The start address of the region</param>
        /// <param name="size">The size of the region</param>
        /// <returns>The page aligned address and size</returns>
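        /// <example>
        /// Illustrative values only, assuming 4 KiB pages (MemoryManager.PageMask equal to 0xFFF,
        /// which is not shown in this file): PageAlign(0x1234, 0x100) would give (0x1000, 0x1000).
        /// </example>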
        private static (ulong address, ulong size) PageAlign(ulong address, ulong size)
        {
            ulong pageMask = MemoryManager.PageMask;
            ulong rA = address & ~pageMask;
            ulong rS = ((address + size + pageMask) & ~pageMask) - rA;
            return (rA, rS);
        }

        /// <summary>
        /// Flush modified ranges of the buffer from another thread.
        /// This will flush all modifications made before the active SyncNumber was set, and may block to wait for GPU sync.
        /// </summary>
        /// <param name="address">Address of the memory action</param>
        /// <param name="size">Size in bytes</param>
        public void ExternalFlush(ulong address, ulong size)
        {
            _context.Renderer.BackgroundContextAction(() =>
            {
                var ranges = _modifiedRanges;

                if (ranges != null)
                {
                    (address, size) = PageAlign(address, size);
                    ranges.WaitForAndGetRanges(address, size, Flush);
                }
            });
        }
|
|
|
|
|
|
|
|
/// <summary>
|
|
|
|
/// Called when part of the memory for this buffer has been unmapped.
|
|
|
|
/// This is called from non-GPU threads.
|
|
|
|
/// </summary>
|
|
|
|
/// <param name="address">Start address of the unmapped region</param>
|
|
|
|
/// <param name="size">Size of the unmapped region</param>
|
|
|
|
public void Unmapped(ulong address, ulong size)
|
|
|
|
{
|
|
|
|
_modifiedRanges?.Clear(address, size);
|
|
|
|
}
|
|
|
|
|
2019-12-31 04:22:58 +01:00
|
|
|
/// <summary>
|
|
|
|
/// Disposes the host buffer.
|
|
|
|
/// </summary>
|
2019-10-13 08:02:07 +02:00
|
|
|
public void Dispose()
|
|
|
|
{
|
2021-01-17 21:08:06 +01:00
|
|
|
_modifiedRanges?.Clear();
|
2020-10-16 22:18:35 +02:00
|
|
|
|
|
|
|
_memoryTrackingGranular?.Dispose();
|
|
|
|
_memoryTracking?.Dispose();
|
2021-01-17 21:08:06 +01:00
|
|
|
|
|
|
|
_context.Renderer.DeleteBuffer(Handle);
|
2019-10-13 08:02:07 +02:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|