1//===- HashRecognize.cpp ----------------------------------------*- C++ -*-===//
3// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4// See https://llvm.org/LICENSE.txt for license information.
5// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
7//===----------------------------------------------------------------------===//
9// The HashRecognize analysis recognizes unoptimized polynomial hash functions
10// with operations over a Galois field of characteristic 2, also called binary
11// fields, or GF(2^n). 2^n is termed the order of the Galois field. This class
12// of hash functions can be optimized using a lookup-table-driven
13// implementation, or with target-specific instructions.
17// 1. Cyclic redundancy check (CRC), which is a polynomial division in GF(2).
18// 2. Rabin fingerprint, a component of the Rabin-Karp algorithm, which is a
19// rolling hash polynomial division in GF(2).
20// 3. Rijndael MixColumns, a step in AES computation, which is a polynomial
21// multiplication in GF(2^3).
22// 4. GHASH, the authentication mechanism in AES Galois/Counter Mode (GCM),
23// which is a polynomial evaluation in GF(2^128).
25// All of them use an irreducible generating polynomial of degree m,
27// c_m * x^m + c_(m-1) * x^(m-1) + ... + c_0 * x^0
29// where each coefficient c is can take values 0 or 1. The polynomial is simply
30// represented by m+1 bits, corresponding to the coefficients. The different
31// variants of CRC are named by degree of generating polynomial used: so CRC-32
32// would use a polynomial of degree 32.
34// The reason algorithms on GF(2^n) can be optimized with a lookup-table is the
35// following: in such fields, polynomial addition and subtraction are identical
36// and equivalent to XOR, polynomial multiplication is an AND, and polynomial
37// division is identity: the XOR and AND operations in unoptimized
38// implementations are performed bit-wise, and can be optimized to be performed
39// chunk-wise, by interleaving copies of the generating polynomial, and storing
40// the pre-computed values in a table.
42// A generating polynomial of m bits always has the MSB set, so we usually
43// omit it. An example of a 16-bit polynomial is the CRC-16-CCITT polynomial:
45// (x^16) + x^12 + x^5 + 1 = (1) 0001 0000 0010 0001 =わ 0x1021
47// Transmissions are either in big-endian or little-endian form, and hash
48// algorithms are written according to this. For example, IEEE 802 and RS-232
49// specify little-endian transmission.
51//===----------------------------------------------------------------------===//
53// At the moment, we only recognize the CRC algorithm.
54// Documentation on CRC32 from the kernel:
55// https://www.kernel.org/doc/Documentation/crc32.txt
58//===----------------------------------------------------------------------===//
74 #define DEBUG_TYPE "hash-recognize"
76/// Checks if there's a stray instruction in the loop \p L outside of the
77/// use-def chains from \p Roots, or if we escape the loop during the use-def
85 while (!Worklist.
empty()) {
92 for (
const Use &U :
I->operands()) {
100 return Latch->
size() != Visited.
size();
103/// A structure that can hold either a Simple Recurrence or a Conditional
104/// Recurrence. Note that in the case of a Simple Recurrence, Step is an operand
105/// of the BO, while in a Conditional Recurrence, it is a SelectInst.
118 OS.
indent(Indent) <<
"Phi: ";
121 OS.
indent(Indent) <<
"BinaryOperator: ";
124 OS.
indent(Indent) <<
"Start: ";
127 OS.
indent(Indent) <<
"Step: ";
131 OS.
indent(Indent) <<
"ExtraConst: ";
137#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
152/// Check the well-formedness of the (most|least) significant bit check given \p
153/// ConditionalRecurrence, \p SimpleRecurrence, depending on \p
154/// ByteOrderSwapped. We check that ConditionalRecurrence.Step is a
155/// Select(Cmp()) where the compare is `>= 0` in the big-endian case, and `== 0`
156/// in the little-endian case (or the inverse, in which case the branches of the
157/// compare are swapped). We check that the LHS is (ConditionalRecurrence.Phi
158/// [xor SimpleRecurrence.Phi]) in the big-endian case, and additionally check
159/// for an AND with one in the little-endian case. We then check AllowedByR
160/// against CheckAllowedByR, which is [0, smin) in the big-endian case, and is
161/// [0, 1) in the little-endian case. CheckAllowedByR checks for
162/// significant-bit-clear, and we match the corresponding arms of the select
163/// against bit-shift and bit-shift-and-xor-gen-poly.
167 bool ByteOrderSwapped) {
177 // Match predicate with or without a SimpleRecurrence (the corresponding data
183 bool LWellFormed = ByteOrderSwapped ?
match(L, MatchPred)
197 if (AllowedByR == CheckAllowedByR)
198 return TV == BitShift &&
201 if (AllowedByR.inverse() == CheckAllowedByR)
202 return FV == BitShift &&
208/// Wraps llvm::matchSimpleRecurrence. Match a simple first order recurrence
209/// cycle of the form:
212/// %rec = phi [%start, %entry], [%BO, %loop]
214/// %BO = binop %rec, %step
219/// %rec = phi [%start, %entry], [%BO, %loop]
221/// %BO = binop %step, %rec
231/// Digs for a recurrence starting with \p V hitting the PHI node in a use-def
232/// chain. Used by matchConditionalRecurrence.
238 while (!Worklist.
empty()) {
241 // Don't add a PHI's operands to the Worklist.
245 // Find a recurrence over a BinOp, by matching either of its operands
246 // with with the PHINode.
250 // Bind to ExtraConst, if we match exactly one.
251 if (
I->getOpcode() == BOWithConstOpToMatch) {
259 // Continue along the use-def chain.
260 for (Use &U :
I->operands())
268/// A Conditional Recurrence is a recurrence of the form:
271/// %rec = phi [%start, %entry], [%step, %loop]
273/// %step = select _, %tv, %fv
275/// where %tv and %fv ultimately end up using %rec via the same %BO instruction,
276/// after digging through the use-def chain.
278/// ExtraConst is relevant if \p BOWithConstOpToMatch is supplied: when digging
279/// the use-def chain, a BinOp with opcode \p BOWithConstOpToMatch is matched,
280/// and ExtraConst is a constant operand of that BinOp. This peculiarity exists,
281/// because in a CRC algorithm, the \p BOWithConstOpToMatch is an XOR, and the
282/// ExtraConst ends up being the generating polynomial.
286 if (
Phi->getNumIncomingValues() != 2)
289 for (
unsigned Idx = 0; Idx != 2; ++Idx) {
290 Value *FoundStep =
Phi->getIncomingValue(Idx);
291 Value *FoundStart =
Phi->getIncomingValue(!Idx);
294 if (!
match(FoundStep,
298 // For a conditional recurrence, both the true and false values of the
299 // select must ultimately end up in the same recurrent BinOp.
300 BinaryOperator *FoundBO = digRecurrence(TV, BOWithConstOpToMatch);
302 if (!FoundBO || FoundBO != AltBO)
305 if (BOWithConstOpToMatch != Instruction::BinaryOpsEnd && !
ExtraConst) {
306 LLVM_DEBUG(
dbgs() <<
"HashRecognize: Unable to match single BinaryOp "
307 "with constant in conditional recurrence\n");
319/// Iterates over all the phis in \p LoopLatch, and attempts to extract a
320/// Conditional Recurrence and an optional Simple Recurrence.
321static std::optional<std::pair<RecurrenceInfo, RecurrenceInfo>>
323 auto Phis = LoopLatch->
phis();
324 unsigned NumPhis = std::distance(Phis.begin(), Phis.end());
325 if (NumPhis != 2 && NumPhis != 3)
333 if (!SimpleRecurrence)
335 if (!ConditionalRecurrence)
337 &
P, Instruction::BinaryOps::Xor);
339 if (NumPhis == 3 && (!SimpleRecurrence || !ConditionalRecurrence))
341 return std::make_pair(SimpleRecurrence, ConditionalRecurrence);
350/// Generate a lookup table of 256 entries by interleaving the generating
351/// polynomial. The optimization technique of table-lookup for CRC is also
352/// called the Sarwate algorithm.
354 bool ByteOrderSwapped) {
359 if (ByteOrderSwapped) {
361 for (
unsigned I = 1;
I < 256;
I <<= 1) {
362 CRCInit = CRCInit.
shl(1) ^
364 for (
unsigned J = 0; J <
I; ++J)
365 Table[
I + J] = CRCInit ^ Table[J];
370 APInt CRCInit(BW, 1);
371 for (
unsigned I = 128;
I;
I >>= 1) {
373 for (
unsigned J = 0; J < 256; J += (
I << 1))
374 Table[
I + J] = CRCInit ^ Table[J];
379/// Checks that \p P1 and \p P2 are used together in an XOR in the use-def chain
380/// of \p SI's condition, ignoring any casts. The purpose of this function is to
381/// ensure that LHSAux from the SimpleRecurrence is used correctly in the CRC
384/// In other words, it checks for the following pattern:
387/// %P1 = phi [_, %entry], [%P1.next, %loop]
388/// %P2 = phi [_, %entry], [%P2.next, %loop]
390/// %xor = xor (CastOrSelf %P1), (CastOrSelf %P2)
392/// where %xor is in the use-def chain of \p SI's condition.
397 // matchConditionalRecurrence has already ensured that the SelectInst's
398 // condition is an Instruction.
401 while (!Worklist.
empty()) {
404 // Don't add a PHI's operands to the Worklist.
408 // If we match an XOR of the two PHIs ignoring casts, we're done.
413 // Continue along the use-def chain.
414 for (
const Use &U :
I->operands())
422// Recognizes a multiplication or division by the constant two, using SCEV. By
423// doing this, we're immune to whether the IR expression is mul/udiv or
424// equivalently shl/lshr. Return false when it is a UDiv, true when it is a Mul,
425// and std::nullopt otherwise.
427 if (!V->getType()->isIntegerTy())
438/// The main entry point for analyzing a loop and recognizing the CRC algorithm.
439/// Returns a PolynomialInfo on success, and a StringRef on failure.
441 if (!L.isInnermost())
442 return "Loop is not innermost";
445 const PHINode *IndVar = L.getCanonicalInductionVariable();
446 if (!Latch || !Exit || !IndVar || L.getNumBlocks() != 1)
447 return "Loop not in canonical form";
448 unsigned TC = SE.getSmallConstantTripCount(&L);
450 return "Unable to find a small constant byte-multiple trip count";
454 return "Found stray PHI";
455 auto [SimpleRecurrence, ConditionalRecurrence] = *R;
456 if (!ConditionalRecurrence)
457 return "Unable to find conditional recurrence";
459 // Make sure that all recurrences are either all SCEVMul with two or SCEVDiv
460 // with two, or in other words, that they're single bit-shifts.
461 std::optional<bool> ByteOrderSwapped =
463 if (!ByteOrderSwapped)
464 return "Loop with non-unit bitshifts";
465 if (SimpleRecurrence) {
467 return "Loop with non-unit bitshifts";
469 // Ensure that the PHIs have exactly two uses:
470 // the bit-shift, and the XOR (or a cast feeding into the XOR).
471 // Also ensure that the SimpleRecurrence's evolution doesn't have stray
473 if (!ConditionalRecurrence.Phi->hasNUses(2) ||
474 !SimpleRecurrence.Phi->hasNUses(2) ||
475 SimpleRecurrence.BO->getUniqueUndroppableUser() != SimpleRecurrence.Phi)
476 return "Recurrences have stray uses";
478 // Check that the SelectInst ConditionalRecurrence.Step is conditional on
479 // the XOR of SimpleRecurrence.Phi and ConditionalRecurrence.Phi.
481 SimpleRecurrence.Phi,
482 ConditionalRecurrence.Phi, L))
483 return "Recurrences not intertwined with XOR";
486 // Make sure that the TC doesn't exceed the bitwidth of LHSAux, or LHS.
487 Value *LHS = ConditionalRecurrence.Start;
488 Value *LHSAux = SimpleRecurrence ? SimpleRecurrence.Start :
nullptr;
490 : LHS->getType()->getIntegerBitWidth()))
491 return "Loop iterations exceed bitwidth of data";
493 // Make sure that the computed value is used in the exit block: this should be
494 // true even if it is only really used in an outer loop's exit block, since
495 // the loop is in LCSSA form.
497 if (
none_of(ComputedValue->users(), [Exit](
User *U) {
498 auto *UI = dyn_cast<Instruction>(U);
499 return UI && UI->getParent() == Exit;
501 return "Unable to find use of computed value in loop exit block";
503 assert(ConditionalRecurrence.ExtraConst &&
504 "Expected ExtraConst in conditional recurrence");
505 const APInt &GenPoly = *ConditionalRecurrence.ExtraConst;
509 return "Malformed significant-bit check";
515 if (SimpleRecurrence)
518 return "Found stray unvisited instructions";
520 return PolynomialInfo(TC, LHS, GenPoly, ComputedValue, *ByteOrderSwapped,
525 for (
unsigned I = 0;
I < 256;
I++) {
526 (*this)[
I].print(OS,
false);
527 OS << (
I % 16 == 15 ?
'\n' :
' ');
531#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
536 if (!L.isInnermost())
538 OS <<
"HashRecognize: Checking a loop in '"
539 << L.getHeader()->getParent()->getName() <<
"' from " << L.getLocStr()
542 if (!std::holds_alternative<PolynomialInfo>(Ret)) {
543 OS <<
"Did not find a hash algorithm\n";
544 if (std::holds_alternative<StringRef>(Ret))
545 OS <<
"Reason: " << std::get<StringRef>(Ret) <<
"\n";
549 auto Info = std::get<PolynomialInfo>(Ret);
550 OS <<
"Found" << (Info.ByteOrderSwapped ?
" big-endian " :
" little-endian ")
551 <<
"CRC-" << Info.RHS.getBitWidth() <<
" loop with trip count "
552 << Info.TripCount <<
"\n";
553 OS.
indent(2) <<
"Initial CRC: ";
556 OS.
indent(2) <<
"Generating polynomial: ";
557 Info.RHS.print(OS,
false);
559 OS.
indent(2) <<
"Computed CRC: ";
560 Info.ComputedValue->print(OS);
563 OS.
indent(2) <<
"Auxiliary data: ";
564 Info.LHSAux->print(OS);
567 OS.
indent(2) <<
"Computed CRC lookup table:\n";
571#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
577 if (std::holds_alternative<PolynomialInfo>(Res))
578 return std::get<PolynomialInfo>(Res);
assert(UImm &&(UImm !=~static_cast< T >(0)) &&"Invalid immediate!")
This file implements a class to represent arbitrary precision integral constant values and operations...
static GCRegistry::Add< CoreCLRGC > E("coreclr", "CoreCLR-compatible GC")
#define LLVM_DUMP_METHOD
Mark debug helper function definitions like dump() that should not be stripped from debug builds.
static bool containsUnreachable(const Loop &L, ArrayRef< const Instruction * > Roots)
Checks if there's a stray instruction in the loop L outside of the use-def chains from Roots,...
static bool isSignificantBitCheckWellFormed(const RecurrenceInfo &ConditionalRecurrence, const RecurrenceInfo &SimpleRecurrence, bool ByteOrderSwapped)
Check the well-formedness of the (most|least) significant bit check given ConditionalRecurrence,...
static bool isConditionalOnXorOfPHIs(const SelectInst *SI, const PHINode *P1, const PHINode *P2, const Loop &L)
Checks that P1 and P2 are used together in an XOR in the use-def chain of SI's condition,...
static std::optional< std::pair< RecurrenceInfo, RecurrenceInfo > > getRecurrences(BasicBlock *LoopLatch, const PHINode *IndVar, const Loop &L)
Iterates over all the phis in LoopLatch, and attempts to extract a Conditional Recurrence and an opti...
static std::optional< bool > isBigEndianBitShift(Value *V, ScalarEvolution &SE)
This header provides classes for managing per-loop analyses.
Class for arbitrary precision integers.
unsigned getBitWidth() const
Return the number of bits in the APInt.
static APInt getSignedMinValue(unsigned numBits)
Gets minimum signed value of APInt for a specific bit width.
APInt shl(unsigned shiftAmt) const
Left-shift function.
bool isSignBitSet() const
Determine if sign bit of this APInt is set.
static APInt getZero(unsigned numBits)
Get the '0' value for the specified bit-width.
APInt lshr(unsigned shiftAmt) const
Logical right-shift function.
ArrayRef - Represent a constant reference to an array (0 or more elements consecutively in memory),...
LLVM Basic Block Representation.
iterator_range< const_phi_iterator > phis() const
Returns a range that iterates over the phis in the basic block.
const Instruction * getTerminator() const LLVM_READONLY
Returns the terminator instruction if the block is well formed or null if the block is not well forme...
An abstraction over a floating-point predicate, and a pack of an integer predicate with samesign info...
This class represents a range of values.
static LLVM_ABI ConstantRange fromKnownBits(const KnownBits &Known, bool IsSigned)
Initialize a range based on a known bits constraint.
static LLVM_ABI ConstantRange makeAllowedICmpRegion(CmpInst::Predicate Pred, const ConstantRange &Other)
Produce the smallest range such that all values that may satisfy the given predicate with any value c...
PreservedAnalyses run(Loop &L, LoopAnalysisManager &AM, LoopStandardAnalysisResults &AR, LPMUpdater &)
static CRCTable genSarwateTable(const APInt &GenPoly, bool ByteOrderSwapped)
Generate a lookup table of 256 entries by interleaving the generating polynomial.
std::optional< PolynomialInfo > getResult() const
LLVM_DUMP_METHOD void dump() const
HashRecognize(const Loop &L, ScalarEvolution &SE)
void print(raw_ostream &OS) const
std::variant< PolynomialInfo, StringRef > recognizeCRC() const
The main entry point for analyzing a loop and recognizing the CRC algorithm.
This class provides an interface for updating the loop pass manager based on mutations to the loop ne...
Represents a single loop in the control flow graph.
Value * getIncomingValueForBlock(const BasicBlock *BB) const
A set of analyses that are preserved following a run of a transformation pass.
static PreservedAnalyses all()
Construct a special preserved set that preserves all passes.
This class represents an analyzed expression in the program.
The main scalar evolution driver.
LLVM_ABI const SCEV * getSCEV(Value *V)
Return a SCEV expression for the full generality of the specified expression.
This class represents the LLVM 'select' instruction.
std::pair< iterator, bool > insert(PtrType Ptr)
Inserts Ptr if and only if there is no element in the container equal to Ptr.
SmallPtrSet - This class implements a set which is optimized for holding SmallSize or less elements.
void push_back(const T &Elt)
This is a 'vector' (really, a variable-sized array), optimized for the case when the array is small.
LLVM_ABI unsigned getIntegerBitWidth() const
A Use represents the edge between a Value definition and its users.
LLVM Value Representation.
Type * getType() const
All values are typed, get the type of this value.
This class implements an extremely fast bulk output stream that can only output to a stream.
raw_ostream & indent(unsigned NumSpaces)
indent - Insert 'NumSpaces' spaces.
@ C
The default llvm calling convention, compatible with C.
ap_match< APInt > m_APInt(const APInt *&Res)
Match a ConstantInt or splatted ConstantVector, binding the specified pointer to the contained APInt.
BinaryOp_match< LHS, RHS, Instruction::And, true > m_c_And(const LHS &L, const RHS &R)
Matches an And with LHS and RHS in either order.
specific_intval< false > m_SpecificInt(const APInt &V)
Match a specific integer value or vector with all elements equal to the value.
bool match(Val *V, const Pattern &P)
bind_ty< Instruction > m_Instruction(Instruction *&I)
Match an instruction, capturing it if we match.
specificval_ty m_Specific(const Value *V)
Match if we have a specific specified value.
cst_pred_ty< is_one > m_One()
Match an integer 1 or a vector with all elements equal to 1.
ThreeOps_match< Cond, LHS, RHS, Instruction::Select > m_Select(const Cond &C, const LHS &L, const RHS &R)
Matches SelectInst.
BinaryOp_match< LHS, RHS, Instruction::Xor, true > m_c_Xor(const LHS &L, const RHS &R)
Matches an Xor with LHS and RHS in either order.
class_match< CmpInst > m_Cmp()
Matches any compare instruction and ignore it.
class_match< Value > m_Value()
Match an arbitrary value and ignore it.
AnyBinaryOp_match< LHS, RHS, true > m_c_BinOp(const LHS &L, const RHS &R)
Matches a BinaryOperator with LHS and RHS in either order.
CmpClass_match< LHS, RHS, ICmpInst > m_ICmp(CmpPredicate &Pred, const LHS &L, const RHS &R)
match_combine_or< match_combine_or< CastInst_match< OpTy, ZExtInst >, CastInst_match< OpTy, TruncInst > >, OpTy > m_ZExtOrTruncOrSelf(const OpTy &Op)
match_combine_or< LTy, RTy > m_CombineOr(const LTy &L, const RTy &R)
Combine two pattern matchers matching L || R.
bind_ty< const SCEVMulExpr > m_scev_Mul(const SCEVMulExpr *&V)
bool match(const SCEV *S, const Pattern &P)
SCEVBinaryExpr_match< SCEVUDivExpr, Op0_t, Op1_t > m_scev_UDiv(const Op0_t &Op0, const Op1_t &Op1)
cst_pred_ty< is_specific_cst > m_scev_SpecificInt(uint64_t V)
Match an SCEV constant with a plain unsigned integer.
class_match< const SCEV > m_SCEV()
This is an optimization pass for GlobalISel generic memory operations.
decltype(auto) dyn_cast(const From &Val)
dyn_cast<X> - Return the argument parameter cast to the specified type.
AnalysisManager< Loop, LoopStandardAnalysisResults & > LoopAnalysisManager
The loop analysis manager.
LLVM_ABI bool matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO, Value *&Start, Value *&Step)
Attempt to match a simple first order recurrence cycle of the form: iv = phi Ty [Start,...
LLVM_ABI raw_ostream & dbgs()
dbgs() - This returns a reference to a raw_ostream for debugging messages.
bool none_of(R &&Range, UnaryPredicate P)
Provide wrappers to std::none_of which take ranges instead of having to pass begin/end explicitly.
bool isa(const From &Val)
isa<X> - Return true if the parameter to the template is an instance of one of the template type argu...
decltype(auto) cast(const From &Val)
cast<X> - Return the argument parameter cast to the specified type.
A structure that can hold either a Simple Recurrence or a Conditional Recurrence.
LLVM_DUMP_METHOD void dump() const
bool matchConditionalRecurrence(const PHINode *P, Instruction::BinaryOps BOWithConstOpToMatch=Instruction::BinaryOpsEnd)
A Conditional Recurrence is a recurrence of the form:
void print(raw_ostream &OS, unsigned Indent=0) const
std::optional< APInt > ExtraConst
bool matchSimpleRecurrence(const PHINode *P)
Wraps llvm::matchSimpleRecurrence.
RecurrenceInfo(const Loop &L)
A custom std::array with 256 entries, that also has a print function.
LLVM_DUMP_METHOD void dump() const
void print(raw_ostream &OS) const
static KnownBits makeConstant(const APInt &C)
Create known bits from a known constant.
unsigned getBitWidth() const
Get the bit width of this value.
The adaptor from a function pass to a loop pass computes these analyses and makes them available to t...
The structure that is returned when a polynomial algorithm was recognized by the analysis.
PolynomialInfo(unsigned TripCount, Value *LHS, const APInt &RHS, Value *ComputedValue, bool ByteOrderSwapped, Value *LHSAux=nullptr)