-
Notifications
You must be signed in to change notification settings - Fork 821
[SM6.10] Reserve LinAlg HLSL intrinsics #8026
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
You can test this locally with the following command:git-clang-format --diff 8e2e89998ac12cc346cc00cd119fe5b709139af8 25fd2c81cf8fa5dfd7cececf0ec12ed6178f7ccc -- include/dxc/HlslIntrinsicOp.h lib/HLSL/HLOperationLower.cppView the diff from clang-format here.diff --git a/lib/HLSL/HLOperationLower.cpp b/lib/HLSL/HLOperationLower.cpp
index 774a6bf6..956ac2ca 100644
--- a/lib/HLSL/HLOperationLower.cpp
+++ b/lib/HLSL/HLOperationLower.cpp
@@ -7563,8 +7563,8 @@ constexpr IntrinsicLower gLowerTable[] = {
DXIL::OpCode::MatrixAccumulate},
{IntrinsicOp::IOP___builtin_LinAlg_MatrixMatrixMultiply, EmptyLower,
DXIL::OpCode::MatrixMulOp},
- {IntrinsicOp::IOP___builtin_LinAlg_MatrixMatrixMultiplyAccumulate, EmptyLower,
- DXIL::OpCode::MatrixMulOp},
+ {IntrinsicOp::IOP___builtin_LinAlg_MatrixMatrixMultiplyAccumulate,
+ EmptyLower, DXIL::OpCode::MatrixMulOp},
{IntrinsicOp::IOP___builtin_LinAlg_MatrixQueryAccumulatorLayout, EmptyLower,
DXIL::OpCode::MatrixQueryAccumulatorLayout},
{IntrinsicOp::IOP___builtin_LinAlg_MatrixAccumulateToDescriptor, EmptyLower,
|
bob80905
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, only thing I can think of is whether or not issues should be made for your todo's.
|
I'm leaning towards no on issues for the TODOs because they essentially already exist! There is an issue for each intrinsic already filed (#7895 as an example) and in order for all those intrinsic issues to be closed all the todos in this PR would need to be resolved. Do you think that is good enough or would you still like separate issues? (also a peak behind the veil, I have already fixed a lot of the TODOs in my local changes) |
inbelic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two little nits, otherwise LGTM
| IOP_WorldToObject = 99, | ||
| IOP_WorldToObject3x4 = 100, | ||
| IOP_WorldToObject4x3 = 101, | ||
| IOP___builtin_LinAlg_CopyConvertMatrix = 405, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is there a required order for these? In the other files they seemed to be sorted by opcode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, this is an interesting topic. There is some info below but there are a lot of moving parts. Happy to chat more if below is unclear
- This is generated code, and this is where it was placed by the generator
- They are actually ordered, but they are ordered alphabetically not by opcode value. The extra
__for__builtinsomewhat hides that fact since it's not visually obvious - The values here are internal so they can be renumbered without issue but it was decided they shouldn't be
This may raise the question "how is it that 99% of the opcodes have the number that matches their alphabetic order" to which the answer is: until very recently the generator would renumber every opcode any time a new intrinsic was added. This created a lot of ordering churn, large diffs, and merge conflicts and was generally not a good thing to be doing so it was turned off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, sounds good
Sounds good enough to me |
Similar to #8015 this PR reserved HL opcodes for the HLSL intrinsics required for the LinAlg feature.
The Codegen for each function is set to
EmptyLowerwhich raises a not implemented error if the intrinsic is seen.lib/HLSL/HLOperationLower.cppandutils/hct/gen_intrin_main.txtare the real changes, the rest is generated code.