Skip to content
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions src/coreclr/jit/codegen.h
Original file line number Diff line number Diff line change
Expand Up @@ -1336,6 +1336,7 @@ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
#ifdef TARGET_ARM64
void genCodeForJumpCompare(GenTreeOp* tree);
void genCodeForMadd(GenTreeOp* tree);
void genCodeForMsub(GenTreeOp* tree);
void genCodeForBfiz(GenTreeOp* tree);
void genCodeForAddEx(GenTreeOp* tree);
#endif // TARGET_ARM64
Expand Down
27 changes: 26 additions & 1 deletion src/coreclr/jit/codegenarm64.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10023,7 +10023,7 @@ void CodeGen::instGen_MemoryBarrier(BarrierKind barrierKind)
}

//-----------------------------------------------------------------------------------
// genCodeForMadd: Emit a madd/msub (Multiply-Add) instruction
// genCodeForMadd: Emit a madd (Multiply-Add) instruction
//
// Arguments:
// tree - GT_MADD tree where op1 or op2 is GT_ADD
Expand Down Expand Up @@ -10067,6 +10067,31 @@ void CodeGen::genCodeForMadd(GenTreeOp* tree)
genProduceReg(tree);
}

//-----------------------------------------------------------------------------------
// genCodeForMsub: Emit a msub (Multiply-Subtract) instruction
//
// Arguments:
// tree - GT_MSUB tree where op2 is GT_MUL
//
void CodeGen::genCodeForMsub(GenTreeOp* tree)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered to just handle it as part of GT_MADD? I bet it'd be much less lines of code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did consider it - I guess it's a design choice whether or not to use GT_MADD or introduce GT_MSUB.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually conflicted on it. On the one hand, just using GT_MADD is sufficient, but on the other hand, the codegen for GT_MADD is a bit complicated since we are making GT_NEG nodes as contained.

The goal of introducing GT_MSUB and its code-gen was to make it easy to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name GT_MADD as a lowering-only op that is specific for ARM64 which reflects the actual instruction 'madd' - at least for me, I wouldn't it to expect to emit 'msub', but I totally understand why it does though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we consider actually just making GT_MADD only emit 'madd' ? Then you could do the decision to use GT_MADD or GT_MSUB in lowering rather than in code-gen. It would make containment less complicated and code-gen simpler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we consider actually just making GT_MADD only emit 'madd'

I like this and it makes it have "parity" with GT_ADD and GT_SUB.

{
assert(tree->OperIs(GT_MSUB) && varTypeIsIntegral(tree) && !(tree->gtFlags & GTF_SET_FLAGS));
genConsumeOperands(tree);

assert(tree->gtGetOp2()->OperIs(GT_MUL));
assert(tree->gtGetOp2()->isContained());

GenTree* a = tree->gtGetOp1();
GenTree* b = tree->gtGetOp2()->gtGetOp1();
GenTree* c = tree->gtGetOp2()->gtGetOp2();

// d = a - b * c
// MSUB d, b, c, a
GetEmitter()->emitIns_R_R_R_R(INS_msub, emitActualTypeSize(tree), tree->GetRegNum(), b->GetRegNum(), c->GetRegNum(),
a->GetRegNum());
genProduceReg(tree);
}

//------------------------------------------------------------------------
// genCodeForBfiz: Generates the code sequence for a GenTree node that
// represents a bitfield insert in zero with sign/zero extension.
Expand Down
4 changes: 4 additions & 0 deletions src/coreclr/jit/codegenarmarch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -306,6 +306,10 @@ void CodeGen::genCodeForTreeNode(GenTree* treeNode)
genCodeForMadd(treeNode->AsOp());
break;

case GT_MSUB:
genCodeForMsub(treeNode->AsOp());
break;

case GT_INC_SATURATE:
genCodeForIncSaturate(treeNode);
break;
Expand Down
4 changes: 3 additions & 1 deletion src/coreclr/jit/gtlist.h
Original file line number Diff line number Diff line change
Expand Up @@ -220,8 +220,10 @@ GTNODE(MUL_LONG , GenTreeOp ,1,GTK_BINOP|DBK_NOTHIR)
GTNODE(AND_NOT , GenTreeOp ,0,GTK_BINOP|DBK_NOTHIR)

#ifdef TARGET_ARM64
GTNODE(MADD , GenTreeOp ,0,GTK_BINOP|DBK_NOTHIR) // Generates the Multiply-Add instruction (madd/msub) In the future, we might consider
GTNODE(MADD , GenTreeOp ,0,GTK_BINOP|DBK_NOTHIR) // Generates the Multiply-Add instruction. In the future, we might consider
// enabling it for both armarch and xarch for floating-point MADD "unsafe" math.
GTNODE(MSUB , GenTreeOp ,0,GTK_BINOP|DBK_NOTHIR) // Generates the Multiply-Subtract instruction. In the future, we might consider
// enabling it for both armarch and xarch for floating-point MSUB "unsafe" math.
GTNODE(ADDEX, GenTreeOp ,0,GTK_BINOP|DBK_NOTHIR) // Add with sign/zero extension.
GTNODE(BFIZ , GenTreeOp ,0,GTK_BINOP|DBK_NOTHIR) // Bitfield Insert in Zero.
#endif
Expand Down
16 changes: 16 additions & 0 deletions src/coreclr/jit/lowerarmarch.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1673,6 +1673,22 @@ void Lowering::ContainCheckBinary(GenTreeOp* node)
}
}

// Find "a - b * c" in order to emit MSUB
if (comp->opts.OptimizationEnabled() && varTypeIsIntegral(node) && !node->isContained() && node->OperIs(GT_SUB) &&
!node->gtOverflow() && op2->OperIs(GT_MUL) && !op2->isContained() && !op2->gtOverflow() &&
varTypeIsIntegral(op2))
{
GenTree* a = op1;
GenTree* b = op2->gtGetOp1();
GenTree* c = op2->gtGetOp2();

if (!a->isContained() && !b->isContained() && !c->isContained())
{
node->ChangeOper(GT_MSUB);
MakeSrcContained(node, op2);
}
}

// Change ADD TO ADDEX for ADD(X, CAST(Y)) or ADD(CAST(X), Y) where CAST is int->long
// or for ADD(LSH(X, CNS), X) or ADD(X, LSH(X, CNS)) where CNS is in the (0..typeWidth) range
if (node->OperIs(GT_ADD) && !op1->isContained() && !op2->isContained() && varTypeIsIntegral(node) &&
Expand Down