Get CPU cycles of LLVM IR using CostModel

2023-12-20

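Since LLVM 3.0 there has been a CostModel.cpp under the Analysis directory. Its documentation says:

This file defines the cost model analysis. It provides a very basic cost estimation for LLVM-IR. This analysis uses the services of the codegen to approximate the cost of any IR instruction when lowered to machine instructions. The cost results are unit-less and the cost number represents the throughput of the machine assuming that all loads hit the cache, all branches are predicted, etc. The cost numbers can be added in order to compare two or more transformation alternatives.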
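To make the last sentence of that description concrete: because the numbers are unit-less, they are mostly useful for ranking alternatives rather than as literal cycle counts. Here is a minimal sketch, assuming made-up per-opcode costs. The names `CostTable`, `totalCost` and `isCheaper` are hypothetical and not part of LLVM; real values would come from TargetTransformInfo for a concrete target.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical per-opcode costs; real numbers depend on the target.
using CostTable = std::map<std::string, unsigned>;

// Sum the cost of a straight-line sequence of opcodes.
// Opcodes missing from the table contribute nothing (an assumption of this sketch).
unsigned totalCost(const CostTable &table,
                   const std::vector<std::string> &opcodes) {
    unsigned sum = 0;
    for (const std::string &op : opcodes) {
        auto it = table.find(op);
        if (it != table.end())
            sum += it->second;
    }
    return sum;
}

// True if sequence `a` is estimated to be strictly cheaper than sequence `b`.
bool isCheaper(const CostTable &table,
               const std::vector<std::string> &a,
               const std::vector<std::string> &b) {
    return totalCost(table, a) < totalCost(table, b);
}
```

With a table such as `{mul: 3, shl: 1, add: 1}`, a `shl` followed by an `add` would rank below a single `mul`. That ordering is the kind of decision the summed costs are meant to support.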

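I would like to know how to build and use this pass on an IR file. A concrete example with the appropriate commands would be perfect.

Here is an example that worked for me:

Main driver file used for testing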

#include <iostream>
#include <string>
#include <llvm/Support/MemoryBuffer.h>
#include <llvm/Support/ErrorOr.h>
#include <llvm/IR/Module.h>
#include <llvm/IR/LLVMContext.h>
#include <llvm/Bitcode/BitcodeReader.h>
#include <llvm/Support/raw_ostream.h>
#include <llvm/Analysis/Passes.h>
#include <llvm/Analysis/TargetTransformInfo.h>
#include <llvm/Analysis/CostModelDummy.h>
#include "llvm/IRReader/IRReader.h"
#include "llvm/Support/SourceMgr.h"

using namespace llvm;

int main(int argc, char *argv[]) {
    StringRef filename = "FILE_NAME";
    LLVMContext context;

    ErrorOr<std::unique_ptr<MemoryBuffer>> fileOrErr =
        MemoryBuffer::getFileOrSTDIN(filename);
    if (std::error_code ec = fileOrErr.getError()) {
        std::cerr << " Error opening input file: " + ec.message() << std::endl;
        return 2;
    }
    Expected<std::unique_ptr<Module>> moduleOrErr =
        parseBitcodeFile(fileOrErr.get()->getMemBufferRef(), context);
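    // parseBitcodeFile reports failure through Expected<>, so test moduleOrErr itself.
    // This step only accepts bitcode; parseIRFile below also accepts textual IR.
    if (!moduleOrErr) {
        std::cerr << "Error reading Module: " << toString(moduleOrErr.takeError()) << std::endl;
        return 3;
    }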

    llvm::SMDiagnostic Err;
    llvm::LLVMContext Context;
    std::unique_ptr<llvm::Module> m(parseIRFile(filename, Err, Context));
    if (!m)
      return 4;

    std::cout << "Successfully read Module:" << std::endl;
    std::cout << " Name: " << m->getName().str() << std::endl;
    std::cout << " Target triple: " << m->getTargetTriple() << std::endl;

    for (auto iter1 = m->getFunctionList().begin(); iter1 != m->getFunctionList().end(); iter1++) {
        Function &f = *iter1;

        CostModelAnalysisDummy obj;
        std::cout << " STEP: 1 Function: " << f.getName().str() << std::endl;
        obj.runOnFunction(f);

        std::cout << " STEP: 2 Function: " << f.getName().str() << std::endl;
        for (auto iter2 = f.getBasicBlockList().begin(); iter2 != f.getBasicBlockList().end(); iter2++) {
            BasicBlock &bb = *iter2;
            std::cout << "  BasicBlock: " << bb.getName().str() << std::endl;
            for (auto iter3 = bb.begin(); iter3 != bb.end(); iter3++) {
                Instruction &inst = *iter3;

                // The value is a unit-less throughput estimate; (unsigned)-1 means "unknown".
                std::cout << std::endl << " Number of Cycles: " << obj.getInstructionCost(&inst) << std::endl;

                std::cout << "   Instruction " << &inst << " : " << inst.getOpcodeName();

                unsigned int  i = 0;
                unsigned int opnt_cnt = inst.getNumOperands();
                for (; i < opnt_cnt; ++i)
                {
                    Value *opnd = inst.getOperand(i);
                    std::string o;
                    //          raw_string_ostream os(o);
                    //         opnd->print(os);
                    //opnd->printAsOperand(os, true, m);
                    if (opnd->hasName()) {
                        // Explicit conversion: newer LLVM releases no longer convert StringRef implicitly.
                        o = opnd->getName().str();
                        std::cout << " " << o << ",";
                    }
                    else {
                        std::cout << " ptr" << opnd << ",";
                    }
                }
                std::cout << std::endl;
            }
        }
    }
    return 0;
}
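The driver above prints one line per instruction. If you would rather see which opcodes dominate, the (opcode name, cost) pairs can be grouped afterwards. Below is a sketch of that post-processing step in plain C++, with no LLVM dependency. `InstCost` and `costByOpcode` are names made up for this example, and the sketch assumes that entries with an unknown cost have already been dropped.

```cpp
#include <map>
#include <string>
#include <vector>

// One record per instruction: the opcode name (as printed by getOpcodeName())
// and the cost reported for it. Hypothetical type, not part of LLVM.
struct InstCost {
    std::string opcode;
    unsigned cost;
};

// Group the per-instruction costs by opcode name and sum each group.
std::map<std::string, unsigned>
costByOpcode(const std::vector<InstCost> &insts) {
    std::map<std::string, unsigned> totals;
    for (const InstCost &ic : insts)
        totals[ic.opcode] += ic.cost;  // operator[] value-initializes to 0
    return totals;
}
```

In the driver this would mean pushing one `InstCost` per instruction inside the innermost loop and calling `costByOpcode` once per function.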

Source file

//===- CostModel.cpp ------ Cost Model Analysis ---------------------------===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file defines the cost model analysis. It provides a very basic cost
// estimation for LLVM-IR. This analysis uses the services of the codegen
// to approximate the cost of any IR instruction when lowered to machine
// instructions. The cost results are unit-less and the cost number represents
// the throughput of the machine assuming that all loads hit the cache, all
// branches are predicted, etc. The cost numbers can be added in order to
// compare two or more transformation alternatives.
//
//===----------------------------------------------------------------------===//

#include "llvm/Analysis/CostModelDummy.h"

using namespace llvm;

// Register this pass.
char CostModelAnalysisDummy::ID = 0;
static const char cm_name[] = "Cost Model Analysis";
INITIALIZE_PASS_BEGIN(CostModelAnalysisDummy, CM_NAME, cm_name, false, true)
INITIALIZE_PASS_END(CostModelAnalysisDummy, CM_NAME, cm_name, false, true)


static cl::opt<bool> EnableReduxCost("costmodel-reduxcost-Dummy", cl::init(false),
    cl::Hidden,
    cl::desc("Recognize reduction patterns."));

FunctionPass *createCostModelAnalysisDummyPass() {
  return new CostModelAnalysisDummy();
}

void
CostModelAnalysisDummy::getAnalysisUsage(AnalysisUsage &AU) const {
  AU.setPreservesAll();
}

bool
CostModelAnalysisDummy::runOnFunction(Function &F) {
 this->F = &F;
 PMDataManager *DM = getAsPMDataManager();
 AnalysisResolver *AR = new AnalysisResolver(*DM);
 setResolver(AR);
 setTopLevelManager(new CostModelAnalysisDummy());


 recordAvailableAnalysis(new TargetTransformInfoWrapperPass());
 auto *TTIWP = getAnalysisIfAvailable<TargetTransformInfoWrapperPass>();
 TTI = TTIWP ? &TTIWP->getTTI(F) : nullptr;

 return false;
}

static bool isReverseVectorMask(ArrayRef<int> Mask) {
  for (unsigned i = 0, MaskSize = Mask.size(); i < MaskSize; ++i)
    if (Mask[i] >= 0 && Mask[i] != (int)(MaskSize - 1 - i))
      return false;
  return true;
}

static bool isSingleSourceVectorMask(ArrayRef<int> Mask) {
  bool Vec0 = false;
  bool Vec1 = false;
  for (unsigned i = 0, NumVecElts = Mask.size(); i < NumVecElts; ++i) {
    if (Mask[i] >= 0) {
      if ((unsigned)Mask[i] >= NumVecElts)
        Vec1 = true;
      else
        Vec0 = true;
    }
  }
  return !(Vec0 && Vec1);
}

static bool isZeroEltBroadcastVectorMask(ArrayRef<int> Mask) {
  for (unsigned i = 0; i < Mask.size(); ++i)
    if (Mask[i] > 0)
      return false;
  return true;
}

static bool isAlternateVectorMask(ArrayRef<int> Mask) {
  bool isAlternate = true;
  unsigned MaskSize = Mask.size();

  // Example: shufflevector A, B, <0,5,2,7>
  for (unsigned i = 0; i < MaskSize && isAlternate; ++i) {
    if (Mask[i] < 0)
      continue;
    isAlternate = Mask[i] == (int)((i & 1) ? MaskSize + i : i);
  }

  if (isAlternate)
    return true;

  isAlternate = true;
  // Example: shufflevector A, B, <4,1,6,3>
  for (unsigned i = 0; i < MaskSize && isAlternate; ++i) {
    if (Mask[i] < 0)
      continue;
    isAlternate = Mask[i] == (int)((i & 1) ? i : MaskSize + i);
  }

  return isAlternate;
}

static TargetTransformInfo::OperandValueKind getOperandInfo(Value *V) {
  TargetTransformInfo::OperandValueKind OpInfo =
      TargetTransformInfo::OK_AnyValue;

  // Check for a splat of a constant or for a non uniform vector of constants.
  if (isa<ConstantVector>(V) || isa<ConstantDataVector>(V)) {
    OpInfo = TargetTransformInfo::OK_NonUniformConstantValue;
    if (cast<Constant>(V)->getSplatValue() != nullptr)
      OpInfo = TargetTransformInfo::OK_UniformConstantValue;
  }

  // Check for a splat of a uniform value. This is not loop aware, so return
  // true only for the obviously uniform cases (argument, globalvalue)
  const Value *Splat = getSplatValue(V);
  if (Splat && (isa<Argument>(Splat) || isa<GlobalValue>(Splat)))
    OpInfo = TargetTransformInfo::OK_UniformValue;

  return OpInfo;
}

static bool matchPairwiseShuffleMask(ShuffleVectorInst *SI, bool IsLeft,
                                     unsigned Level) {
  // We don't need a shuffle if we just want to have element 0 in position 0 of
  // the vector.
  if (!SI && Level == 0 && IsLeft)
    return true;
  else if (!SI)
    return false;

  SmallVector<int, 32> Mask(SI->getType()->getVectorNumElements(), -1);

  // Build a mask of 0, 2, ... (left) or 1, 3, ... (right) depending on whether
  // we look at the left or right side.
  for (unsigned i = 0, e = (1 << Level), val = !IsLeft; i != e; ++i, val += 2)
    Mask[i] = val;

  SmallVector<int, 16> ActualMask = SI->getShuffleMask();
  return Mask == ActualMask;
}

static bool matchPairwiseReductionAtLevel(const BinaryOperator *BinOp,
                                          unsigned Level, unsigned NumLevels) {
  // Match one level of pairwise operations.
  // %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef,
  //       <4 x i32> <i32 0, i32 2 , i32 undef, i32 undef>
  // %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef,
  //       <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
  // %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
  if (BinOp == nullptr)
    return false;

  assert(BinOp->getType()->isVectorTy() && "Expecting a vector type");

  unsigned Opcode = BinOp->getOpcode();
  Value *L = BinOp->getOperand(0);
  Value *R = BinOp->getOperand(1);

  ShuffleVectorInst *LS = dyn_cast<ShuffleVectorInst>(L);
  if (!LS && Level)
    return false;
  ShuffleVectorInst *RS = dyn_cast<ShuffleVectorInst>(R);
  if (!RS && Level)
    return false;

  // On level 0 we can omit one shufflevector instruction.
  if (!Level && !RS && !LS)
    return false;

  // Shuffle inputs must match.
  Value *NextLevelOpL = LS ? LS->getOperand(0) : nullptr;
  Value *NextLevelOpR = RS ? RS->getOperand(0) : nullptr;
  Value *NextLevelOp = nullptr;
  if (NextLevelOpR && NextLevelOpL) {
    // If we have two shuffles their operands must match.
    if (NextLevelOpL != NextLevelOpR)
      return false;

    NextLevelOp = NextLevelOpL;
  } else if (Level == 0 && (NextLevelOpR || NextLevelOpL)) {
    // On the first level we can omit the shufflevector <0, undef,...>. So the
    // input to the other shufflevector <1, undef> must match with one of the
    // inputs to the current binary operation.
    // Example:
    //  %NextLevelOpL = shufflevector %R, <1, undef ...>
    //  %BinOp        = fadd          %NextLevelOpL, %R
    if (NextLevelOpL && NextLevelOpL != R)
      return false;
    else if (NextLevelOpR && NextLevelOpR != L)
      return false;

    NextLevelOp = NextLevelOpL ? R : L;
  } else
    return false;

  // Check that the next levels binary operation exists and matches with the
  // current one.
  BinaryOperator *NextLevelBinOp = nullptr;
  if (Level + 1 != NumLevels) {
    if (!(NextLevelBinOp = dyn_cast<BinaryOperator>(NextLevelOp)))
      return false;
    else if (NextLevelBinOp->getOpcode() != Opcode)
      return false;
  }

  // Shuffle mask for pairwise operation must match.
  if (matchPairwiseShuffleMask(LS, true, Level)) {
    if (!matchPairwiseShuffleMask(RS, false, Level))
      return false;
  } else if (matchPairwiseShuffleMask(RS, true, Level)) {
    if (!matchPairwiseShuffleMask(LS, false, Level))
      return false;
  } else
    return false;

  if (++Level == NumLevels)
    return true;

  // Match next level.
  return matchPairwiseReductionAtLevel(NextLevelBinOp, Level, NumLevels);
}

static bool matchPairwiseReduction(const ExtractElementInst *ReduxRoot,
                                   unsigned &Opcode, Type *&Ty) {
  if (!EnableReduxCost)
    return false;

  // Need to extract the first element.
  ConstantInt *CI = dyn_cast<ConstantInt>(ReduxRoot->getOperand(1));
  unsigned Idx = ~0u;
  if (CI)
    Idx = CI->getZExtValue();
  if (Idx != 0)
    return false;

  BinaryOperator *RdxStart = dyn_cast<BinaryOperator>(ReduxRoot->getOperand(0));
  if (!RdxStart)
    return false;

  Type *VecTy = ReduxRoot->getOperand(0)->getType();
  unsigned NumVecElems = VecTy->getVectorNumElements();
  if (!isPowerOf2_32(NumVecElems))
    return false;

  // We look for a sequence of shuffle,shuffle,add triples like the following
  // that builds a pairwise reduction tree.
  //
  //  (X0, X1, X2, X3)
  //   (X0 + X1, X2 + X3, undef, undef)
  //    ((X0 + X1) + (X2 + X3), undef, undef, undef)
  //
  // %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef,
  //       <4 x i32> <i32 0, i32 2 , i32 undef, i32 undef>
  // %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef,
  //       <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
  // %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
  // %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
  //       <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
  // %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
  //       <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  // %bin.rdx8 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
  // %r = extractelement <4 x float> %bin.rdx8, i32 0
  if (!matchPairwiseReductionAtLevel(RdxStart, 0,  Log2_32(NumVecElems)))
    return false;

  Opcode = RdxStart->getOpcode();
  Ty = VecTy;

  return true;
}

static std::pair<Value *, ShuffleVectorInst *>
getShuffleAndOtherOprd(BinaryOperator *B) {

  Value *L = B->getOperand(0);
  Value *R = B->getOperand(1);
  ShuffleVectorInst *S = nullptr;

  if ((S = dyn_cast<ShuffleVectorInst>(L)))
    return std::make_pair(R, S);

  S = dyn_cast<ShuffleVectorInst>(R);
  return std::make_pair(L, S);
}

static bool matchVectorSplittingReduction(const ExtractElementInst *ReduxRoot,
                                          unsigned &Opcode, Type *&Ty) {
  if (!EnableReduxCost)
    return false;

  // Need to extract the first element.
  ConstantInt *CI = dyn_cast<ConstantInt>(ReduxRoot->getOperand(1));
  unsigned Idx = ~0u;
  if (CI)
    Idx = CI->getZExtValue();
  if (Idx != 0)
    return false;

  BinaryOperator *RdxStart = dyn_cast<BinaryOperator>(ReduxRoot->getOperand(0));
  if (!RdxStart)
    return false;
  unsigned RdxOpcode = RdxStart->getOpcode();

  Type *VecTy = ReduxRoot->getOperand(0)->getType();
  unsigned NumVecElems = VecTy->getVectorNumElements();
  if (!isPowerOf2_32(NumVecElems))
    return false;

  // We look for a sequence of shuffles and adds like the following matching one
  // fadd, shuffle vector pair at a time.
  //
  // %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef,
  //                           <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
  // %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
  // %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef,
  //                          <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  // %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
  // %r = extractelement <4 x float> %bin.rdx8, i32 0

  unsigned MaskStart = 1;
  Value *RdxOp = RdxStart;
  SmallVector<int, 32> ShuffleMask(NumVecElems, 0);
  unsigned NumVecElemsRemain = NumVecElems;
  while (NumVecElemsRemain - 1) {
    // Check for the right reduction operation.
    BinaryOperator *BinOp;
    if (!(BinOp = dyn_cast<BinaryOperator>(RdxOp)))
      return false;
    if (BinOp->getOpcode() != RdxOpcode)
      return false;

    Value *NextRdxOp;
    ShuffleVectorInst *Shuffle;
    std::tie(NextRdxOp, Shuffle) = getShuffleAndOtherOprd(BinOp);

    // Check the current reduction operation and the shuffle use the same value.
    if (Shuffle == nullptr)
      return false;
    if (Shuffle->getOperand(0) != NextRdxOp)
      return false;

    // Check that shuffle masks matches.
    for (unsigned j = 0; j != MaskStart; ++j)
      ShuffleMask[j] = MaskStart + j;
    // Fill the rest of the mask with -1 for undef.
    std::fill(&ShuffleMask[MaskStart], ShuffleMask.end(), -1);

    SmallVector<int, 16> Mask = Shuffle->getShuffleMask();
    if (ShuffleMask != Mask)
      return false;

    RdxOp = NextRdxOp;
    NumVecElemsRemain /= 2;
    MaskStart *= 2;
  }

  Opcode = RdxOpcode;
  Ty = VecTy;
  return true;
}

unsigned CostModelAnalysisDummy::getInstructionCost(const Instruction *I) const {

  if (!TTI)
    return -1;

  switch (I->getOpcode()) {
  case Instruction::GetElementPtr:
    return TTI->getUserCost(I);

  case Instruction::Ret:
  case Instruction::PHI:
  case Instruction::Br: {
    return TTI->getCFInstrCost(I->getOpcode());
  }
  case Instruction::Add:
  case Instruction::FAdd:
  case Instruction::Sub:
  case Instruction::FSub:
  case Instruction::Mul:
  case Instruction::FMul:
  case Instruction::UDiv:
  case Instruction::SDiv:
  case Instruction::FDiv:
  case Instruction::URem:
  case Instruction::SRem:
  case Instruction::FRem:
  case Instruction::Shl:
  case Instruction::LShr:
  case Instruction::AShr:
  case Instruction::And:
  case Instruction::Or:
  case Instruction::Xor: {
    TargetTransformInfo::OperandValueKind Op1VK =
      getOperandInfo(I->getOperand(0));
    TargetTransformInfo::OperandValueKind Op2VK =
      getOperandInfo(I->getOperand(1));
    SmallVector<const Value*, 2> Operands(I->operand_values()); 
    return TTI->getArithmeticInstrCost(I->getOpcode(), I->getType(), Op1VK,
                                       Op2VK, TargetTransformInfo::OP_None, 
                                       TargetTransformInfo::OP_None, 
                                       Operands);
  }
  case Instruction::Select: {
    const SelectInst *SI = cast<SelectInst>(I);
    Type *CondTy = SI->getCondition()->getType();
    return TTI->getCmpSelInstrCost(I->getOpcode(), I->getType(), CondTy);
  }
  case Instruction::ICmp:
  case Instruction::FCmp: {
    Type *ValTy = I->getOperand(0)->getType();
    return TTI->getCmpSelInstrCost(I->getOpcode(), ValTy);
  }
  case Instruction::Store: {
    const StoreInst *SI = cast<StoreInst>(I);
    Type *ValTy = SI->getValueOperand()->getType();
    return TTI->getMemoryOpCost(I->getOpcode(), ValTy,
                                 SI->getAlignment(),
                                 SI->getPointerAddressSpace());
  }
  case Instruction::Load: {
    const LoadInst *LI = cast<LoadInst>(I);
    return TTI->getMemoryOpCost(I->getOpcode(), I->getType(),
                                 LI->getAlignment(),
                                 LI->getPointerAddressSpace());
  }
  case Instruction::ZExt:
  case Instruction::SExt:
  case Instruction::FPToUI:
  case Instruction::FPToSI:
  case Instruction::FPExt:
  case Instruction::PtrToInt:
  case Instruction::IntToPtr:
  case Instruction::SIToFP:
  case Instruction::UIToFP:
  case Instruction::Trunc:
  case Instruction::FPTrunc:
  case Instruction::BitCast:
  case Instruction::AddrSpaceCast: {
    Type *SrcTy = I->getOperand(0)->getType();
    return TTI->getCastInstrCost(I->getOpcode(), I->getType(), SrcTy);
  }
  case Instruction::ExtractElement: {
    const ExtractElementInst * EEI = cast<ExtractElementInst>(I);
    ConstantInt *CI = dyn_cast<ConstantInt>(I->getOperand(1));
    unsigned Idx = -1;
    if (CI)
      Idx = CI->getZExtValue();

    // Try to match a reduction sequence (series of shufflevector and vector
    // adds followed by a extractelement).
    unsigned ReduxOpCode;
    Type *ReduxType;

    if (matchVectorSplittingReduction(EEI, ReduxOpCode, ReduxType))
      return TTI->getArithmeticReductionCost(ReduxOpCode, ReduxType, false);
    else if (matchPairwiseReduction(EEI, ReduxOpCode, ReduxType))
      return TTI->getArithmeticReductionCost(ReduxOpCode, ReduxType, true);

    return TTI->getVectorInstrCost(I->getOpcode(),
                                   EEI->getOperand(0)->getType(), Idx);
  }
  case Instruction::InsertElement: {
    const InsertElementInst * IE = cast<InsertElementInst>(I);
    ConstantInt *CI = dyn_cast<ConstantInt>(IE->getOperand(2));
    unsigned Idx = -1;
    if (CI)
      Idx = CI->getZExtValue();
    return TTI->getVectorInstrCost(I->getOpcode(),
                                   IE->getType(), Idx);
  }
  case Instruction::ShuffleVector: {
    const ShuffleVectorInst *Shuffle = cast<ShuffleVectorInst>(I);
    Type *VecTypOp0 = Shuffle->getOperand(0)->getType();
    unsigned NumVecElems = VecTypOp0->getVectorNumElements();
    SmallVector<int, 16> Mask = Shuffle->getShuffleMask();

    if (NumVecElems == Mask.size()) {
      if (isReverseVectorMask(Mask))
        return TTI->getShuffleCost(TargetTransformInfo::SK_Reverse, VecTypOp0,
                                   0, nullptr);
      if (isAlternateVectorMask(Mask))
        return TTI->getShuffleCost(TargetTransformInfo::SK_SELECT,
                                   VecTypOp0, 0, nullptr);

      if (isZeroEltBroadcastVectorMask(Mask))
        return TTI->getShuffleCost(TargetTransformInfo::SK_Broadcast,
                                   VecTypOp0, 0, nullptr);

      if (isSingleSourceVectorMask(Mask))
        return TTI->getShuffleCost(TargetTransformInfo::SK_PermuteSingleSrc,
                                   VecTypOp0, 0, nullptr);

      return TTI->getShuffleCost(TargetTransformInfo::SK_PermuteTwoSrc,
                                 VecTypOp0, 0, nullptr);
    }

    return -1;
  }
  case Instruction::Call:
    if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
      SmallVector<Value *, 4> Args;
      for (unsigned J = 0, JE = II->getNumArgOperands(); J != JE; ++J)
        Args.push_back(II->getArgOperand(J));

      FastMathFlags FMF;
      if (auto *FPMO = dyn_cast<FPMathOperator>(II))
        FMF = FPMO->getFastMathFlags();

      return TTI->getIntrinsicInstrCost(II->getIntrinsicID(), II->getType(),
                                        Args, FMF);
    }
    return -1;
  default:
    // We don't have any information on this instruction.
    return -1;
  }
}

void CostModelAnalysisDummy::print(raw_ostream &OS, const Module*) const {
  if (!F)
    return;

  for (BasicBlock &B : *F) {
    for (Instruction &Inst : B) {
      unsigned Cost = getInstructionCost(&Inst);
      if (Cost != (unsigned)-1)
        OS << "Cost Model: Found an estimated cost of " << Cost;
      else
        OS << "Cost Model: Unknown cost";

      OS << " for instruction: " << Inst << "\n";
    }
  }
}
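The ShuffleVector case in getInstructionCost tries the mask helpers in a fixed order: reverse, alternate (select), zero-element broadcast, single source, and finally two sources. To see how a given mask is classified without building LLVM, here is a standalone sketch of the same checks on `std::vector<int>`. The function names and the returned strings are my own, negative entries stand for undef lanes as in the source file, and the sketch assumes the mask has as many entries as each input vector has elements.

```cpp
#include <string>
#include <vector>

// <n-1, ..., 1, 0>, ignoring undef lanes.
bool isReverseMask(const std::vector<int> &mask) {
    const int n = static_cast<int>(mask.size());
    for (int i = 0; i < n; ++i)
        if (mask[i] >= 0 && mask[i] != n - 1 - i)
            return false;
    return true;
}

// Lanes taken alternately from the two inputs, e.g. <0,5,2,7> or <4,1,6,3>.
bool isAlternateMask(const std::vector<int> &mask) {
    const int n = static_cast<int>(mask.size());
    bool evenFromFirst = true, evenFromSecond = true;
    for (int i = 0; i < n; ++i) {
        if (mask[i] < 0)
            continue;
        const bool odd = (i & 1) != 0;
        if (mask[i] != (odd ? n + i : i))
            evenFromFirst = false;
        if (mask[i] != (odd ? i : n + i))
            evenFromSecond = false;
    }
    return evenFromFirst || evenFromSecond;
}

// Every defined lane reads element 0 of the first input.
bool isZeroEltBroadcastMask(const std::vector<int> &mask) {
    for (int m : mask)
        if (m > 0)
            return false;
    return true;
}

// All defined lanes come from one input only.
bool isSingleSourceMask(const std::vector<int> &mask) {
    const int n = static_cast<int>(mask.size());
    bool first = false, second = false;
    for (int m : mask) {
        if (m < 0)
            continue;
        if (m >= n)
            second = true;
        else
            first = true;
    }
    return !(first && second);
}

// Apply the checks in the same order as the ShuffleVector case.
std::string classifyShuffle(const std::vector<int> &mask) {
    if (isReverseMask(mask))          return "reverse";
    if (isAlternateMask(mask))        return "select";
    if (isZeroEltBroadcastMask(mask)) return "broadcast";
    if (isSingleSourceMask(mask))     return "permute-single-src";
    return "permute-two-src";
}
```

The order matters: a mask made only of undef lanes satisfies every check, so it ends up classified as a reverse shuffle simply because that test runs first.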

Header file

#include "llvm/ADT/STLExtras.h"
#include "llvm/Analysis/Passes.h"
#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/Analysis/VectorUtils.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Value.h"
#include "llvm/Pass.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"
#include <iostream>
#include "llvm/IR/LegacyPassManagers.h"

using namespace llvm;


#define CM_NAME "cost-model-sanji"
#define DEBUG_TYPE CM_NAME

    class CostModelAnalysisDummy : public PMDataManager, public FunctionPass, public PMTopLevelManager {

    public:
        static char ID; // Class identification, replacement for typeinfo
        CostModelAnalysisDummy() : FunctionPass(ID), PMDataManager(), PMTopLevelManager(new FPPassManager()), F(nullptr), TTI(nullptr) {
            llvm::initializeCostModelAnalysisDummyPass(
                *PassRegistry::getPassRegistry());
        }

        /// Returns the expected cost of the instruction.
        /// Returns -1 if the cost is unknown.
        /// Note, this method does not cache the cost calculation and it
        /// can be expensive in some cases.
        unsigned getInstructionCost(const Instruction *I) const;
        bool runOnFunction(Function &F) override;

        PMDataManager *getAsPMDataManager() override { return this; }
        Pass *getAsPass() override { return this; }

        PassManagerType getTopLevelPassManagerType() override {
            return PMT_BasicBlockPassManager;
        }

        FPPassManager *getContainedManager(unsigned N) {
            assert(N < PassManagers.size() && "Pass number out of range!");
            FPPassManager *FP = static_cast<FPPassManager *>(PassManagers[N]);
            return FP;
        }
    private:
        void getAnalysisUsage(AnalysisUsage &AU) const override;

        void print(raw_ostream &OS, const Module*) const override;

        /// The function that we analyze.
        Function *F;
        /// Target information.
        const TargetTransformInfo *TTI;
    };

    FunctionPass *createCostModelAnalysisDummyPass();
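One thing to keep in mind about the header: getInstructionCost is documented as returning -1 for an unknown cost, but its return type is unsigned, so the caller actually sees the largest unsigned value (4294967295 where unsigned is 32 bits). A small sketch of how a caller might decode that follows. `kUnknownCost`, `isKnownCost` and `describeCost` are hypothetical helpers of mine, and the strings only imitate the wording used by print() in the source file.

```cpp
#include <climits>
#include <string>

// "-1" converted to unsigned is UINT_MAX; this is the "unknown cost" marker.
const unsigned kUnknownCost = static_cast<unsigned>(-1);

// True if the value is a real estimate rather than the marker.
bool isKnownCost(unsigned cost) {
    return cost != kUnknownCost;
}

// Human-readable form of a cost value.
std::string describeCost(unsigned cost) {
    if (!isKnownCost(cost))
        return "Unknown cost";
    return "Found an estimated cost of " + std::to_string(cost);
}
```

Checking with `isKnownCost` before summing or printing avoids treating the marker as a very expensive instruction.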
