[JIT] Add `ccmp` and enable conditional compares for X64 #110826
- Update comments.
- Merge the REX2 changes into the original legacy emit path
- bug fix: Set REX2.W with correct mask code.
- register encoding and prefix emitting logics.
- Add REX2 prefix emit logic
- bug fixes
- Add Stress mode for REX2 encoding and some bug fixes
- resolve comments: 1. add assertion check for UD opcodes. 2. add checks for EGPRs.
- Add REX2 to emitOutputAM, and let LEA to be REX2 compatible.
- Add REX2.X encoding for SIB byte
- Bug fixes: add REX2 prefix on the path in RI where MOV is specially handled.
- Enable REX2 encoding for `movups`
- fixed bugs in REX2 prefix emitting logic when working with map 1 instructions, and enabled REX2 for POPCNT
- legacy map index-er
- bug fixes
- some clean-up
- Adding initial APX unit testing path.
- Adding a coredistools dll that has LLVM APX disasm capability. It must be copied into a CORE_ROOT manually.
- clean up work for REX2
- narrow the REX2 scope to `sub` only
- some clean up based on the comments.
- bug fix
- resolve comment
- SV path is mostly for debugging purposes Added encoding unit tests for instructions with immediates
Code refactoring: AddX86PrefixIfNeeded.
… missing in JIT, may indicate these instructions are not being used in JIT, drop them for now.
…lled before adding any prefix.
Refactor REX2 encoding stress logics.
(this will have side effect that the estimated code will go up and mismatch with actual code size.)
…om LOCK prefixd instructions.
…otion from LOCK prefixd instructions." This reverts commit 1be4b12.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
1. Emitter unit tests
The left is output from …

2. Intel SDE testing
Run the suite in SDE with …
Run the suite in SDE with …

3. SuperPMI results
Base is with …
Diffs are based on 2,623,457 contexts (1,043,127 MinOpts, 1,580,330 FullOpts).
MISSED contexts: 2,983 (0.11%)
Base JIT options: JitBypassApxCheck=1
Diff JIT options: JitBypassApxCheck=1;JitEnableApxIfConv=1
Overall (-165,684 bytes)
FullOpts (-165,684 bytes)
984c510 to df08843
Happy to see this. I have one request. Can you please split this PR into two? A first PR that adds the necessary support to codegen/emit to generate the new instructions. The second PR to enable the existing support for emitting conditional compares in the middle-end/lowering? All the work should be in the former PR, which should be free of diffs, while the latter PR should mostly be unifying and enabling the existing support that exists for arm64.

Also, the title is inaccurate. If-conversion is enabled for x64 already. The use of conditional compares is an orthogonal optimization to if-conversion.
Title changed: [JIT] Add `ccmp` and enable if-conversion for X86 → [JIT] Add `ccmp` and enable conditional compares for X64
Will do. I plan to get this PR in a clean build/run shape and then will split.
df08843 to c5fae2e
Major changes include:
1. Adding `ccmp` (and promoted cmp) logic into emitter backend.
2. Hooking `TryLowerAndOrToCCMP` from ARM pathway into Intel pathway.
3. Hooking `optimizeCompareChainCondBlock` from ARM pathway into Intel pathway.
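Items 2 and 3 both deal with compare chains, that is, conditions built from several compares joined by AND/OR. As a rough source-level illustration only (this is not JIT code, and the function names `InRangeBranchy` and `InRangeChained` are hypothetical), the two functions below compute the same result. The first guards the second compare with a branch. The second evaluates the compares as one chain, which is the shape a conditional compare is meant to serve.

```cpp
// Branchy form: the second compare only runs if the first one succeeds.
bool InRangeBranchy(int x, int lo, int hi)
{
    if (x >= lo)
    {
        if (x <= hi)
        {
            return true;
        }
    }
    return false;
}

// Chained form: both compares feed a single AND, with no branch between them.
// A backend with a conditional compare can express this chain as one plain
// compare followed by a compare that is conditional on the first result.
bool InRangeChained(int x, int lo, int hi)
{
    bool first  = x >= lo;
    bool second = x <= hi;
    return first & second;
}
```

Both functions return the same value for every input; only the control-flow shape differs.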
c5fae2e to 519e290
Overview
This PR is built on top of #108796.
This PR adds the new APX `ccmp` instruction to the X86 backend, and enables some of the existing if-conversion functionality for X86.

Design
Currently, the if-conversion optimization is hidden behind a flag, `DOTNET_JitEnableApxIfConv`, which defaults to 0.

For reference, there is a unique extended EVEX encoding for `ccmp`, where `SC0`-`SC3` encode the condition for `ccmp` to conditionally execute on (please see SDM Vol 1, Appendix B). If the status flags fail to satisfy the condition encoded by `SC0`-`SC3`, no compare will be performed, and the `OF`, `SF`, `ZF`, and `CF` flags will be set to the default flag value (DFV) fields `of`, `sf`, `zf` and `cf`.
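
To make the fallback behavior concrete, here is a minimal C++ model of the semantics described above. It is a sketch and not JIT or emitter code: the names `Flags`, `Cond`, `Holds`, `Cmp`, and `Ccmp` are hypothetical, only a handful of conditions are modeled, the numeric `SC0`-`SC3` encodings are not represented, and flags other than `OF`, `SF`, `ZF`, and `CF` are ignored.

```cpp
#include <cstdint>

// The four status flags that the DFV field can supply (hypothetical model type).
struct Flags
{
    bool of;
    bool sf;
    bool zf;
    bool cf;
};

// A few source conditions; the real instruction encodes these in SC0-SC3.
enum class Cond
{
    Equal,
    NotEqual,
    Below,
    AboveOrEqual,
    Less,
    GreaterOrEqual
};

// Evaluate a condition against the current flags.
bool Holds(Cond c, const Flags& f)
{
    switch (c)
    {
        case Cond::Equal:          return f.zf;
        case Cond::NotEqual:       return !f.zf;
        case Cond::Below:          return f.cf;
        case Cond::AboveOrEqual:   return !f.cf;
        case Cond::Less:           return f.sf != f.of;
        case Cond::GreaterOrEqual: return f.sf == f.of;
    }
    return false;
}

// Flags produced by an ordinary 32-bit compare (computes a - b, discards it).
Flags Cmp(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    Flags f;
    f.cf = a < b;                                 // unsigned borrow
    f.zf = r == 0;                                // operands equal
    f.sf = (r >> 31) != 0;                        // sign bit of the result
    f.of = (((a ^ b) & (a ^ r)) >> 31) != 0;      // signed overflow
    return f;
}

// Conditional compare: compare only if `scc` holds on the incoming flags,
// otherwise return the default flag values unchanged.
Flags Ccmp(const Flags& in, Cond scc, const Flags& dfv, uint32_t a, uint32_t b)
{
    return Holds(scc, in) ? Cmp(a, b) : dfv;
}
```

With this model, `a == 1 && b == 2` can be written as `Ccmp(Cmp(a, 1), Cond::Equal, Flags{false, false, false, false}, b, 2).zf`. Choosing a DFV with `zf` clear makes the final "equal" test fail whenever the first compare fails, so a single flag check at the end covers the whole chain.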

Testing
Note: The testing plan for APX work has been discussed in #106557; please refer to that PR for details. Only results and comments are posted in this PR. Results are posted below.