x86: fix YMM FMA p-code temporaries truncated to 128 bits #9197

Open
0xDI wants to merge 1 commit into NationalSecurityAgency:master from 0xDI:fix/x86-fma-ymm-tmp-size

0xDI commented May 18, 2026

Fixes #9184

All 36 YMM-form FMA instructions in fma.sinc declared local tmp:16 (128-bit) for their p-code temporary. The pcodeop return value was therefore truncated to 128 bits before being zero-extended into the ZmmReg destination, silently zeroing the upper 128 bits of the 256-bit YMM result. In a vectorized multiply-accumulate loop, this discards the upper two 64-bit (or four 32-bit) accumulator lanes on every iteration, which breaks emulation in the p-code emulator and concolic engine.
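
The effect can be modelled outside of SLEIGH. The following is a minimal Python sketch, not Ghidra code: it treats a register value as an unsigned integer, and the helper names (`truncate`, `zext`, `write_dest`, `unpack`) are hypothetical. It assumes that routing a wider value through `tmp:16` keeps only the low 16 bytes, as described above.

```python
# Model of the bug: a 256-bit pcodeop result passed through a 16-byte temporary.
# All names here are illustrative; none of them come from Ghidra's API.

def truncate(value: int, size_bytes: int) -> int:
    """Keep only the low size_bytes bytes of value."""
    return value & ((1 << (8 * size_bytes)) - 1)

def zext(value: int, size_bytes: int) -> int:
    """Zero-extend to size_bytes; for a non-negative int this is just a range check."""
    assert 0 <= value < (1 << (8 * size_bytes))
    return value

def write_dest(result_256: int, tmp_bytes: int) -> int:
    """Route a 256-bit result through a temporary, then zero-extend to 64 bytes (ZMM)."""
    tmp = truncate(result_256, tmp_bytes)
    return zext(tmp, 64)

def unpack(value: int) -> list:
    """Split the low 256 bits into four 64-bit lanes, lane 0 first."""
    return [(value >> (64 * i)) & 0xFFFFFFFFFFFFFFFF for i in range(4)]

# Four 64-bit lanes with distinct marker values, lane 0 in the low bits.
lanes = [0x1111, 0x2222, 0x3333, 0x4444]
result = sum(lane << (64 * i) for i, lane in enumerate(lanes))

buggy = unpack(write_dest(result, 16))  # models tmp:16
fixed = unpack(write_dest(result, 32))  # models tmp:32

print([hex(x) for x in buggy])
print([hex(x) for x in fixed])
```

With the 16-byte temporary, lanes 2 and 3 come back as zero; with the 32-byte temporary, all four lanes survive.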

XMM forms correctly use tmp:16 (128-bit). YMM forms require tmp:32 (256-bit). Changed all 36 affected definitions accordingly.
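
One way to audit a change like this is to scan the SLEIGH source for constructors that take YMM operands but still declare a 16-byte temporary. The sketch below is only an assumption about how such a check could look: the `SAMPLE` text is a made-up fragment in the general style of a `.sinc` file, not a copy of fma.sinc, and `find_narrow_ymm_temps` is a hypothetical helper.

```python
import re

# Made-up fragment for illustration only; real constructors carry more pattern detail
# where this sample writes "is ...".
SAMPLE = """\
:VFMADD132PD XmmReg1, vexVVVV_XmmReg, XmmReg2_m128 is ... {
    local tmp:16 = vfmadd132pd_fma3( XmmReg1, vexVVVV_XmmReg, XmmReg2_m128 );
    ZmmReg1 = zext(tmp);
}
:VFMADD132PD YmmReg1, vexVVVV_YmmReg, YmmReg2_m256 is ... {
    local tmp:16 = vfmadd132pd_fma3( YmmReg1, vexVVVV_YmmReg, YmmReg2_m256 );
    ZmmReg1 = zext(tmp);
}
"""

# Assumed layout: a constructor starts with ':' at the beginning of a line
# and its body runs from '{' to the first closing brace.
CONSTRUCTOR = re.compile(r"^:(\w+)([^{]*)\{(.*?)\}", re.MULTILINE | re.DOTALL)

def find_narrow_ymm_temps(text):
    """Return mnemonics of YMM-form constructors that declare `local tmp:16`."""
    hits = []
    for mnemonic, header, body in CONSTRUCTOR.findall(text):
        if "YmmReg" in header and re.search(r"\blocal\s+tmp:16\b", body):
            hits.append(mnemonic)
    return hits

print(find_narrow_ymm_temps(SAMPLE))
```

Only the YMM-form constructor is reported; the XMM form, where `tmp:16` is the correct size, is skipped.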

All 36 YMM-form FMA instructions declared local tmp:16 (128-bit), causing
the pcodeop return value to be truncated before zero-extension into the
ZmmReg destination. The upper 128 bits of accumulated 256-bit YMM results
were silently zeroed each iteration, breaking emulation of vectorized
multiply-accumulate loops.

XMM forms correctly use tmp:16 (128-bit). YMM forms require tmp:32 (256-bit).

Resolves NationalSecurityAgency#9184
ryanmkurtz added the "Feature: Processor/x86" and "Status: Triage" (Information is being gathered) labels on May 19, 2026
GhidorahRex added the "Reason: Internal effort" (This will be solved internally) label and removed the "Status: Triage" (Information is being gathered) label on May 19, 2026
CryptoJones added a commit to CryptoJones/GayHydra that referenced this pull request May 21, 2026
… temporaries truncated to 128 bits (#136)

Cherry-picked from NationalSecurityAgency#9197 (closes upstream issue NationalSecurityAgency#9184).

Original commit: NSA/ghidra@47ff5cd357a60c9649a91b6ec8e331c1a0db7b3f
Original author: 0xDI <0xDI@users.noreply.github.com>

Co-authored-by: 0xDI <0xDI@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Successfully merging this pull request may close these issues.

FMA YMM definitions use 128-bit temp instead of 256-bit
