Cogwheels of doom - constant DX12 crashes (client-caused) - UNFIXED as of 0.4

Same Issue here, Freeze / COG WHEEL

Ryzen 7 9800X3D
RTX 5080
32 GB RAM

Running DX12 with NVIDIA DLSS.
Last edited by Scarface335#4801 on Jan 27, 2026, 10:28:46
I tried again to look into the disassembly to see what happens.

And I found this code:

"

.text:0000000140FF304E mov rcx, [rbx+28h]
.text:0000000140FF3052 mov edx, edi
.text:0000000140FF3054 mov rax, [rcx]
.text:0000000140FF3057 call qword ptr [rax+40h] ; vtable slot 8 -> call to IDXGISwapChain::Present -> frame loop
.text:0000000140FF305A mov rcx, [rbx+28h]
.text:0000000140FF305E mov esi, eax ; save the HRESULT returned by Present
.text:0000000140FF3060 mov [rsp+48h+arg_0], 0
.text:0000000140FF3068 mov rdx, [rcx]
.text:0000000140FF306B mov r8, [rdx+88h] ; vtable slot 17 -> on IDXGISwapChain this is probably GetLastPresentCount (GetDeviceRemovedReason lives on the device, not the swap chain)
.text:0000000140FF3072 lea rdx, [rsp+48h+arg_0]
.text:0000000140FF3077 call r8
.text:0000000140FF307A test eax, eax ; HRESULT returned by that call
.text:0000000140FF307C js loc_140FF3155 ; jump if sign bit set -> HRESULT < 0, i.e. FAILED (ERROR CASE)
.text:0000000140FF3082 cmp qword ptr [rbx+80h], 0
.text:0000000140FF308A lea r9, [rbx+30h]
.text:0000000140FF308E mov ecx, [rsp+48h+arg_0]
.text:0000000140FF3092 loc_140FF3092: ; DATA XREF: .rdata:0000000143268E0C↓o
.text:0000000140FF3092 ; .rdata:0000000143268E1C↓o ...
.text:0000000140FF3092 mov [rsp+48h+var_18], r14
.text:0000000140FF3097 mov r14, 0AAAAAAAAAAAAAAABh
.text:0000000140FF30A1 mov dword ptr [rsp+48h+var_28], ecx
.text:0000000140FF30A5 jz short loc_140FF30D7
.text:0000000140FF30A7 mov r8, [r9+48h]
.text:0000000140FF30AB mov rax, r14
.text:0000000140FF30AE dec r8
.text:0000000140FF30B1 add r8, [r9+50h]
.text:0000000140FF30B5 mul r8
.text:0000000140FF30B8 shr rdx, 2
.text:0000000140FF30BC lea rax, [rdx+rdx*2]
.text:0000000140FF30C0 add rax, rax
.text:0000000140FF30C3 sub r8, rax
---- snip ----

.text:0000000140FF3155 loc_140FF3155: ; CODE XREF: sub_140FF2FF0+8C↑j
.text:0000000140FF3155 ; DATA XREF: .pdata:00000001441D53E4↓o ...
.text:0000000140FF3155 mov rcx, [rbx+28h]
.text:0000000140FF3159 mov rax, [rcx]
.text:0000000140FF315C call qword ptr [rax+120h]
.text:0000000140FF3162 mov rdi, [rsp+48h+var_10]
.text:0000000140FF3167 mov ecx, 80000000h
.text:0000000140FF316C mov [rbx+0A4h], eax
.text:0000000140FF3172 lea eax, [rsi+rcx]
.text:0000000140FF3175 test ecx, eax
.text:0000000140FF3177 jnz short loc_140FF318E
.text:0000000140FF3179
.text:0000000140FF3179 loc_140FF3179: ; DATA XREF: .pdata:00000001441D53F0↓o
.text:0000000140FF3179 ; .pdata:00000001441D53FC↓o
.text:0000000140FF3179 cmp esi, 887A0005h
.text:0000000140FF317F jz short loc_140FF318E
.text:0000000140FF3181 mov rsi, [rsp+48h+arg_10]
.text:0000000140FF3186 xor al, al
.text:0000000140FF3188 add rsp, 40h
.text:0000000140FF318C pop rbx
.text:0000000140FF318D retn
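
A quick way to put names on the indirect calls in this listing is to divide each vtable offset by the pointer size and look the slot up in the interface's declaration order. The sketch below assumes the object at [rbx+28h] is an IDXGISwapChain3 (or a wrapper exposing the same vtable layout); that is an assumption, the listing itself does not prove it. The helper name vtable_method is made up for illustration.

```python
# x64 vtable layout of IDXGISwapChain3 in declaration order:
# IUnknown, IDXGIObject, IDXGIDeviceSubObject, IDXGISwapChain .. IDXGISwapChain3.
# ASSUMPTION: the object at [rbx+28h] uses this layout.
DXGI_SWAPCHAIN3_VTABLE = [
    "QueryInterface", "AddRef", "Release",
    "SetPrivateData", "SetPrivateDataInterface", "GetPrivateData", "GetParent",
    "GetDevice",
    "Present", "GetBuffer", "SetFullscreenState", "GetFullscreenState",
    "GetDesc", "ResizeBuffers", "ResizeTarget", "GetContainingOutput",
    "GetFrameStatistics", "GetLastPresentCount",
    "GetDesc1", "GetFullscreenDesc", "GetHwnd", "GetCoreWindow", "Present1",
    "IsTemporaryMonoSupported", "GetRestrictToOutput", "SetBackgroundColor",
    "GetBackgroundColor", "SetRotation", "GetRotation",
    "SetSourceSize", "GetSourceSize", "SetMaximumFrameLatency",
    "GetMaximumFrameLatency", "GetFrameLatencyWaitableObject",
    "SetMatrixTransform", "GetMatrixTransform",
    "GetCurrentBackBufferIndex", "CheckColorSpaceSupport", "SetColorSpace1",
    "ResizeBuffers1",
]

POINTER_SIZE = 8  # bytes per vtable entry on x64


def vtable_method(offset):
    """Map a byte offset such as 0x40 to the method name in that slot."""
    slot, remainder = divmod(offset, POINTER_SIZE)
    if remainder != 0 or slot >= len(DXGI_SWAPCHAIN3_VTABLE):
        raise ValueError(f"offset {offset:#x} is not a known vtable slot")
    return DXGI_SWAPCHAIN3_VTABLE[slot]


for offset in (0x40, 0x88, 0x120):
    print(f"[rax+{offset:X}h] -> {vtable_method(offset)}")
```

Under that assumption, 40h resolves to Present, 88h to GetLastPresentCount (which takes a pointer to a UINT, matching the lea rdx before the call) and 120h to GetCurrentBackBufferIndex, whose return value would be what gets stored in [rbx+0A4h].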


The interesting part is:
"

.text:0000000140FF3092 mov [rsp+48h+var_18], r14
.text:0000000140FF3097 mov r14, 0AAAAAAAAAAAAAAABh
.text:0000000140FF30A1 mov dword ptr [rsp+48h+var_28], ecx
.text:0000000140FF30A5 jz short loc_140FF30D7
.text:0000000140FF30A7 mov r8, [r9+48h]
.text:0000000140FF30AB mov rax, r14
.text:0000000140FF30AE dec r8
.text:0000000140FF30B1 add r8, [r9+50h]
.text:0000000140FF30B5 mul r8
.text:0000000140FF30B8 shr rdx, 2
.text:0000000140FF30BC lea rax, [rdx+rdx*2]
.text:0000000140FF30C0 add rax, rax
.text:0000000140FF30C3 sub r8, rax


0AAAAAAAAAAAAAAABh is a magic constant compilers use to avoid div instructions; it equals (2^65 + 1) / 3. Multiplying by it and keeping the high 64 bits of the product (RDX) gives roughly n * 2/3. A shift right by 1 would then give n / 3, but the shr rdx, 2 used here gives n / 6.

So that entire code is just a "divide by 6" operation followed by getting the remainder (the modulo): lea rax, [rdx+rdx*2] and add rax, rax compute 6 * (n / 6), and the sub leaves n mod 6, a value from 0 to 5, in r8. Probably the last 6 frames?
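
Emulating the instruction sequence register by register is a cheap way to check what it really computes. The sketch below mirrors the mul / shr / lea / add / sub chain with 64-bit wraparound; the function name emulate_sequence is made up for illustration. With shr rdx, 2 the quotient comes out as n / 6 and the remainder as n mod 6 (a shift by 1 would be the divide-by-3 variant of the same trick).

```python
MASK64 = (1 << 64) - 1
MAGIC = 0xAAAAAAAAAAAAAAAB  # (2**65 + 1) // 3


def emulate_sequence(n):
    """Mirror the listing: returns (rdx, r8) after the final sub."""
    r8 = n & MASK64
    product = MAGIC * r8              # mul r8 -> 128-bit result in rdx:rax
    rdx = product >> 64               # high 64 bits land in rdx
    rdx >>= 2                         # shr rdx, 2
    rax = (rdx + rdx * 2) & MASK64    # lea rax, [rdx+rdx*2]  -> 3 * rdx
    rax = (rax + rax) & MASK64        # add rax, rax          -> 6 * rdx
    r8 = (r8 - rax) & MASK64          # sub r8, rax           -> remainder
    return rdx, r8


# Compare against plain integer division for small and very large inputs.
samples = list(range(0, 1000)) + [2**32, 2**63 + 5, MASK64 - 1, MASK64]
for n in samples:
    assert emulate_sequence(n) == (n // 6, n % 6)
print("sequence matches n // 6 and n % 6 for all samples")
```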

So my intuition tells me that this is a circular buffer with 6 slots (the remainder ranges from 0 to 5, and the next snippet compares against 6) and the computed modulo is the index to write to. Then the code should hit:
"

.text:0000000140FF3111 cmp rcx, 6
.text:0000000140FF3115 jbe short loc_140FF3137
.text:0000000140FF3117 lea rcx, [r11+1]
.text:0000000140FF311B mov [r9+50h], r10
.text:0000000140FF311F mov rax, r14
.text:0000000140FF3122 mul rcx
.text:0000000140FF3125 shr rdx, 2
.text:0000000140FF3129 lea rax, [rdx+rdx*2]
.text:0000000140FF312D add rax, rax
.text:0000000140FF3130 sub rcx, rax


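Basically, the code compares the RCX register with 6. If RCX <= 6 (jbe, unsigned) it skips ahead to loc_140FF3137; otherwise it stores R10 into [r9+50h] and reduces R11+1 with the same multiply-and-shift sequence, which presumably wraps an index once the circular buffer is full.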


But it seems like both paths reach the
"
.text:0000000140FF315C call qword ptr [rax+120h]

but with different values in [r9+0x48].

"
.text:0000000140FF315C call qword ptr [rax+120h]
.text:0000000140FF3162 mov rdi, [rsp+48h+var_10]
.text:0000000140FF3167 mov ecx, 80000000h
.text:0000000140FF316C mov [rbx+0A4h], eax
.text:0000000140FF3172 lea eax, [rsi+rcx]
.text:0000000140FF3175 test ecx, eax
.text:0000000140FF3177 jnz short loc_140FF318E
.text:0000000140FF3179
.text:0000000140FF3179 loc_140FF3179: ; DATA XREF: .pdata:00000001441D53F0↓o
.text:0000000140FF3179 ; .pdata:00000001441D53FC↓o
.text:0000000140FF3179 cmp esi, 887A0005h
.text:0000000140FF317F jz short loc_140FF318E
.text:0000000140FF3181 mov rsi, [rsp+48h+arg_10]
.text:0000000140FF3186 xor al, al
.text:0000000140FF3188 add rsp, 40h
.text:0000000140FF318C pop rbx
.text:0000000140FF318D retn
.text:0000000140FF318E ; ---------------------------------------------------------------------------
.text:0000000140FF318E
.text:0000000140FF318E loc_140FF318E: ; CODE XREF: sub_140FF2FF0+187↑j
.text:0000000140FF318E ; sub_140FF2FF0+18F↑j
.text:0000000140FF318E ; DATA XREF: ...
.text:0000000140FF318E mov rsi, [rsp+48h+arg_10]
.text:0000000140FF3193 mov al, 1
.text:0000000140FF3195 add rsp, 40h
.text:0000000140FF3199 pop rbx
.text:0000000140FF319A retn


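So basically, going by the listing, the return value depends on ESI, which still holds the HRESULT returned by Present. With ECX = 80000000h, lea eax, [rsi+rcx] flips the sign bit, so the test ecx, eax / jnz pair is taken when the HRESULT is non-negative (success). That gives al = 1 if Present succeeded or returned 0x887A0005 (DXGI_ERROR_DEVICE_REMOVED).
Otherwise (any other failure code) -> al = 0. The earlier RCX <= 6 comparison does not feed into this, since ECX is overwritten with 80000000h before the test.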
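
To double-check the tail of the function, here is a minimal sketch that replays the lea / test / cmp sequence on a 32-bit HRESULT taken from ESI. The function name returned_al is hypothetical; the HRESULT constants are the standard DXGI values. It only models the instructions shown in the listing, nothing before them.

```python
DXGI_ERROR_DEVICE_REMOVED = 0x887A0005
DXGI_ERROR_DEVICE_HUNG = 0x887A0006
DXGI_STATUS_OCCLUDED = 0x087A0001  # a success code (sign bit clear)
S_OK = 0x00000000
E_FAIL = 0x80004005


def returned_al(present_hr):
    """Replay the function tail for a given Present HRESULT (value in esi)."""
    esi = present_hr & 0xFFFFFFFF
    ecx = 0x80000000                      # mov ecx, 80000000h
    eax = (esi + ecx) & 0xFFFFFFFF        # lea eax, [rsi+rcx] (flips bit 31)
    if ecx & eax:                         # test ecx, eax / jnz loc_140FF318E
        return 1                          # bit 31 of esi was clear: SUCCEEDED
    if esi == DXGI_ERROR_DEVICE_REMOVED:  # cmp esi, 887A0005h / jz loc_140FF318E
        return 1
    return 0                              # xor al, al


for name, hr in [("S_OK", S_OK),
                 ("DXGI_STATUS_OCCLUDED", DXGI_STATUS_OCCLUDED),
                 ("DXGI_ERROR_DEVICE_REMOVED", DXGI_ERROR_DEVICE_REMOVED),
                 ("DXGI_ERROR_DEVICE_HUNG", DXGI_ERROR_DEVICE_HUNG),
                 ("E_FAIL", E_FAIL)]:
    print(f"{name:28s} {hr:#010x} -> al = {returned_al(hr)}")
```

In this model, success codes and DEVICE_REMOVED give al = 1, while other failures such as DEVICE_HUNG give al = 0, which is the case that sends the caller down its second check.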

So far so good. Now we know which al value is returned under which condition. But who calls this function?

Well, in WinDbg I got this stack trace:
"
0:000> k
# Child-SP RetAddr Call Site
00 00000059`b63ff040 00007ff6`f3783396 PathOfExileSteam+0xff318e
01 00000059`b63ff090 00007ff6`f35f37f8 PathOfExileSteam+0x1013396
02 00000059`b63ff100 00007ff6`f35490bc PathOfExileSteam+0xe837f8
03 00000059`b63ff1a0 00007ff6`f35490e3 PathOfExileSteam+0xdd90bc
04 00000059`b63ff1d0 00007ff6`f3549f28 PathOfExileSteam+0xdd90e3
05 00000059`b63ff200 00007ff6`f3539e04 PathOfExileSteam+0xdd9f28
06 00000059`b63ff230 00007ff6`f3548fe2 PathOfExileSteam+0xdc9e04
07 00000059`b63ff2b0 00007ff6`f35f3279 PathOfExileSteam+0xdd8fe2
08 00000059`b63ff2e0 00007ff6`f35f4086 PathOfExileSteam+0xe83279
09 00000059`b63ff540 00007ff6`f286f8a3 PathOfExileSteam+0xe84086
0a 00000059`b63ff670 00007ff6`f2870176 PathOfExileSteam+0xff8a3
0b 00000059`b63ff6a0 00007ff6`f28703b7 PathOfExileSteam+0x100176
0c 00000059`b63ff890 00007ff6`f2870457 PathOfExileSteam+0x1003b7
0d 00000059`b63ff8c0 00007ff6`f4ee1e4a PathOfExileSteam+0x100457
0e 00000059`b63ff920 00007ffb`1ff9e8d7 PathOfExileSteam+0x2771e4a
0f 00000059`b63ff960 00007ffb`2144c53c KERNEL32!BaseThreadInitThunk+0x17
10 00000059`b63ff990 00000000`00000000 ntdll!RtlUserThreadStart+0x2c
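
To line the WinDbg frames up with the IDA listings, the module-relative offsets can be rebased onto the image base IDA uses. The sketch below assumes that base is 0x140000000, which is what the .text addresses in the listings imply; the helper names are made up for illustration.

```python
IDA_IMAGE_BASE = 0x140000000  # ASSUMPTION: inferred from the .text addresses in the listings


def to_ida_address(module_offset):
    """PathOfExileSteam+offset -> address as shown in IDA."""
    return IDA_IMAGE_BASE + module_offset


def runtime_base(runtime_address, module_offset):
    """Recover where the module was loaded in the debugged process."""
    return runtime_address - module_offset


# Frames 00 and 01 from the stack trace.
frame0 = to_ida_address(0xFF318E)
frame1 = to_ida_address(0x1013396)
print(f"frame 00 -> {frame0:#x}")
print(f"frame 01 -> {frame1:#x}")

# Frame 00's RetAddr column (00007ff6`f3783396) is the return address into frame 01.
base = runtime_base(0x00007FF6F3783396, 0x1013396)
print(f"module loaded at {base:#x}")
```

With that base, frame 00 maps to loc_140FF318E and frame 01 to 0x141013396, the instruction right after the call qword ptr [rax+18h] in the parent function.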


The parent (frame 1) is:

"

.text:0000000141013380 sub_141013380 proc near ; DATA XREF: .rdata:0000000142C53398↓o
.text:0000000141013380 ; .pdata:00000001441D6680↓o
.text:0000000141013380
.text:0000000141013380 pExceptionObject= byte ptr -48h
.text:0000000141013380 var_30 = byte ptr -30h
.text:0000000141013380
.text:0000000141013380 ; __unwind { // __CxxFrameHandler4
.text:0000000141013380 push rbx
.text:0000000141013382 sub rsp, 60h
.text:0000000141013386 mov rbx, rcx
.text:0000000141013389 mov rcx, [rcx+238h]
.text:0000000141013390 mov rax, [rcx]
.text:0000000141013393 call qword ptr [rax+18h]
.text:0000000141013396 test al, al
.text:0000000141013398 jnz short loc_1410133AA
.text:000000014101339A mov rcx, rbx
.text:000000014101339D call sub_14100E2D0
.text:00000001410133A2 test al, al
.text:00000001410133A4 jz loc_141013430
.text:00000001410133AA
.text:00000001410133AA loc_1410133AA: ; CODE XREF: sub_141013380+18↑j
.text:00000001410133AA mov eax, [rbx+2B8h]
.text:00000001410133B0 inc eax
.text:00000001410133B2 xor edx, edx
.text:00000001410133B4 div dword ptr [rbx+140h]
.text:00000001410133BA mov [rbx+2B8h], edx
.text:00000001410133C0 inc dword ptr [rbx+630h]
.text:00000001410133C6 inc dword ptr [rbx+634h]
.text:00000001410133CC inc dword ptr [rbx+638h]
.text:00000001410133D2 inc dword ptr [rbx+63Ch]
.text:00000001410133D8 inc dword ptr [rbx+640h]
...................................................................


Breaking it down, it looks like this:

"

.text:0000000141013393 call qword ptr [rax+18h] ; Calls sub_140FF2FF0 (per the stack trace)
.text:0000000141013396 test al, al ; Did it return al=1?
.text:0000000141013398 jnz short loc_1410133AA ; IF YES: jump to the normal path


So if al = 0 it calls another function:

"
.text:000000014101339D call sub_14100E2D0
.text:00000001410133A2 test al, al
.text:00000001410133A4 jz loc_141013430



And loc_141013430 is:

"
.text:0000000141013430 loc_141013430: ; CODE XREF: sub_141013380+24↑j
.text:0000000141013430 mov rcx, rbx
.text:0000000141013433 call sub_14100E400
.text:0000000141013438 lea rdx, aSwapChainPrese ; "swap_chain->Present() failed"
.text:000000014101343F lea rcx, [rsp+68h+var_30]
.text:0000000141013444 call sub_1401219D0
.text:0000000141013449 nop
.text:000000014101344A lea rdx, [rsp+68h+var_30]
.text:000000014101344F lea rcx, [rsp+68h+pExceptionObject]
.text:0000000141013454 call sub_140E52E40
.text:0000000141013459 lea rdx, stru_14342B410 ; pThrowInfo
.text:0000000141013460 lea rcx, [rsp+68h+pExceptionObject] ; pExceptionObject
.text:0000000141013465 call _CxxThrowException


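So it seems the game deliberately throws a C++ exception ("swap_chain->Present() failed"), which will crash or close it unless something catches it, if the second check (sub_14100E2D0) also returns al == 0.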
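
Put together, the control flow of sub_141013380 can be modelled roughly as follows. This is only a sketch of the branches visible in the listing: the names end_frame, present_step, second_check and PresentFailed are hypothetical, and what sub_14100E2D0 actually checks is unknown.

```python
class PresentFailed(RuntimeError):
    """Stands in for the C++ exception thrown via _CxxThrowException."""


def end_frame(present_step, second_check, frame_index, frame_count):
    """Model of the branches in sub_141013380.

    present_step  -> the call through [rax+18h] (returns al)
    second_check  -> sub_14100E2D0, only called when present_step returns 0
    frame_index   -> value at [rbx+2B8h]
    frame_count   -> divisor at [rbx+140h]
    """
    if not present_step() and not second_check():
        # loc_141013430: build the message and throw
        raise PresentFailed("swap_chain->Present() failed")
    # loc_1410133AA: inc eax / xor edx, edx / div -> keep the remainder
    return (frame_index + 1) % frame_count


# Normal frame: Present path returned al = 1, index advances and wraps.
print(end_frame(lambda: True, lambda: False, 2, 3))

# Failure path: both checks return 0, so the exception is thrown.
try:
    end_frame(lambda: False, lambda: False, 0, 3)
except PresentFailed as exc:
    print("thrown:", exc)
```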

Given that the function carries the "__unwind { // __CxxFrameHandler4" annotation, it has C++ exception-handling unwind data, which means that when _CxxThrowException is called the runtime starts looking for a handler and unwinding the stack.

To me, this might lead to a deadlock: if NVIDIA clean-up code gets called during the unwind, it would need a callback / window message to be answered, but the main thread is busy unwinding.

So basically my intuition is:

The hang happens because Path of Exile 2 throws a C++ exception on the main thread when Present() fails in sub_141013380 (for example when the GPU stutters). The call to _CxxThrowException triggers a synchronous stack unwind, and during that cleanup the NVIDIA Streamline interposer deadlocks in its window-message handshake, waiting for a callback that the blocked main thread can't provide.

I might be completely wrong, but it's all I've got. I hope this gives GGG some paths to investigate, and maybe other ideas / scenarios that could lead to the real root cause (hoping they read this).










