[analyzer] Introduce per-entry-point statistics #131175

Merged
merged 15 commits into from
Mar 17, 2025
1 change: 1 addition & 0 deletions clang/docs/analyzer/developer-docs.rst
@@ -12,3 +12,4 @@ Contents:
developer-docs/nullability
developer-docs/RegionStore
developer-docs/PerformanceInvestigation
+developer-docs/Statistics
33 changes: 33 additions & 0 deletions clang/docs/analyzer/developer-docs/Statistics.rst
@@ -0,0 +1,33 @@
===================
Analysis Statistics
===================

The Clang Static Analyzer has two facilities for collecting statistics: per translation unit and per entry point.
We use `llvm/ADT/Statistic.h`_ for numbers describing the entire translation unit.
We use `clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h`_ to collect data for each symbolic-execution entry point.

.. _llvm/ADT/Statistic.h: https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ADT/Statistic.h#L171
.. _clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h: https://github.com/llvm/llvm-project/blob/main/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h

In many cases, it makes sense to collect statistics at both the translation-unit level and the entry-point level. You can use the two macros defined in EntryPointStats.h for that:

- ``STAT_COUNTER`` for additive statistics, for example, "the number of steps executed", "the number of functions inlined".
- ``STAT_MAX`` for maximizing statistics, for example, "the maximum environment size", or "the longest execution path".
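
The difference between additive and maximizing statistics tracked at both levels can be sketched with a small stand-alone model. The types ``DualCounter`` and ``DualMax`` and the method ``startNewEntryPoint`` are hypothetical names invented for this illustration, not part of EntryPointStats.h, and the sketch assumes the per-entry-point part starts from 0 for every entry point.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model (not the LLVM API): an additive statistic kept
// both per entry point and per translation unit.
struct DualCounter {
  unsigned PerEntryPoint = 0; // assumed to restart at 0 for each entry point
  std::uint64_t PerTU = 0;    // accumulates over the whole translation unit

  DualCounter &operator+=(unsigned Inc) {
    PerEntryPoint += Inc;
    PerTU += Inc;
    return *this;
  }
  void startNewEntryPoint() { PerEntryPoint = 0; }
};

// Hypothetical model (not the LLVM API): a maximizing statistic kept
// both per entry point and per translation unit.
struct DualMax {
  unsigned PerEntryPoint = 0;
  std::uint64_t PerTU = 0;

  void updateMax(std::uint64_t V) {
    PerEntryPoint = std::max(PerEntryPoint, static_cast<unsigned>(V));
    PerTU = std::max(PerTU, V);
  }
  void startNewEntryPoint() { PerEntryPoint = 0; }
};
```

In this model a counter sums every increment at both levels, while a maximum keeps only the largest value seen; only the per-entry-point half starts over when the next entry point begins.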

If you want to define a statistic that makes sense only for the entire translation unit, for example, "the number of entry points", Statistic.h defines two macros: ``STATISTIC`` and ``ALWAYS_ENABLED_STATISTIC``.
You should prefer ``ALWAYS_ENABLED_STATISTIC`` unless you have a good reason not to.
``STATISTIC`` is controlled by ``LLVM_ENABLE_STATS`` / ``LLVM_FORCE_ENABLE_STATS``.
Note, however, that disabling ``LLVM_ENABLE_STATS`` only disables storage of the values; the computations producing those values still run unless you explicitly make them conditional too.
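
The following stand-alone sketch illustrates the point about computations that outlive a disabled statistic. The macro ``SKETCH_ENABLE_STATS`` is a hypothetical stand-in for ``LLVM_ENABLE_STATS``, and all function and variable names are invented for this example.

```cpp
#include <vector>

// Hypothetical switch standing in for LLVM_ENABLE_STATS (off by default).
#ifndef SKETCH_ENABLE_STATS
#define SKETCH_ENABLE_STATS 0
#endif

static unsigned NumExpensiveCalls = 0; // how often the costly work ran
static unsigned LongestPath = 0;       // storage of the "statistic"

// Stand-in for a computation needed only to feed the statistic.
unsigned computePathLength(const std::vector<int> &Path) {
  ++NumExpensiveCalls;
  return static_cast<unsigned>(Path.size());
}

// Unguarded: the computation runs even though its result is discarded
// when statistics are disabled.
void recordUnguarded(const std::vector<int> &Path) {
  unsigned Len = computePathLength(Path);
#if SKETCH_ENABLE_STATS
  if (Len > LongestPath)
    LongestPath = Len;
#else
  (void)Len;
#endif
}

// Guarded: the computation itself is skipped when statistics are disabled.
void recordGuarded(const std::vector<int> &Path) {
#if SKETCH_ENABLE_STATS
  unsigned Len = computePathLength(Path);
  if (Len > LongestPath)
    LongestPath = Len;
#else
  (void)Path;
#endif
}
```

With the switch off, ``recordUnguarded`` still pays for ``computePathLength`` while ``recordGuarded`` does not; neither one stores a value.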

If you want to define a statistic only for entry points, EntryPointStats.h has four classes at your disposal:

- ``BoolEPStat`` - a boolean value assigned at most once per entry point. For example: "has the inline limit been reached".
- ``UnsignedEPStat`` - an unsigned value assigned at most once per entry point. For example: "the number of source characters in an entry-point body".
- ``CounterEPStat`` - an additive statistic. It starts with 0 and you can add to it as many times as needed. For example: "the number of bugs discovered".
- ``UnsignedMaxEPStat`` - a maximizing statistic. It starts with 0 and when you join it with a value, it picks the maximum of the previous value and the new one. For example: "the longest execution path of a bug".

To produce a CSV file with all the statistics collected per entry point, use the ``dump-entry-point-stats-to-csv=<file>.csv`` parameter.
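
As a rough mental model of per-entry-point collection, the sketch below records one row of values per entry point and resets the running values afterwards. It is a stand-alone illustration: ``StatsTable``, ``Snapshot``, and the CSV column layout are assumptions made for this example and do not describe the analyzer's actual implementation or output format.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: one row of statistic values per entry point.
struct Snapshot {
  std::string EntryPoint;
  std::vector<unsigned> Values;
};

class StatsTable {
  std::vector<std::string> Names; // column headers, one per statistic
  std::vector<unsigned> Current;  // running values for the current entry point
  std::vector<Snapshot> Rows;     // finished entry points

public:
  explicit StatsTable(std::vector<std::string> StatNames)
      : Names(std::move(StatNames)), Current(Names.size(), 0) {}

  // Add Inc to the additive statistic in column Idx.
  void add(std::size_t Idx, unsigned Inc) { Current[Idx] += Inc; }

  // Record the current values for EntryPoint, then reset them to zero.
  void takeSnapshot(const std::string &EntryPoint) {
    Rows.push_back({EntryPoint, Current});
    Current.assign(Names.size(), 0);
  }

  // Render all rows as CSV (the layout is invented for this sketch).
  std::string toCSV() const {
    std::ostringstream OS;
    OS << "EntryPoint";
    for (const std::string &N : Names)
      OS << ',' << N;
    OS << '\n';
    for (const Snapshot &R : Rows) {
      OS << R.EntryPoint;
      for (unsigned V : R.Values)
        OS << ',' << V;
      OS << '\n';
    }
    return OS.str();
  }
};
```

Each call to ``takeSnapshot`` in this model closes one entry point, so values never leak from one row into the next.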

Note that EntryPointStats.h is not meant to be complete: if you feel it is lacking a certain kind of statistic, odds are that it is.
Feel free to extend it!
6 changes: 6 additions & 0 deletions clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
@@ -353,6 +353,12 @@ ANALYZER_OPTION(bool, DisplayCTUProgress, "display-ctu-progress",
"the analyzer's progress related to ctu.",
false)

+ANALYZER_OPTION(
+StringRef, DumpEntryPointStatsToCSV, "dump-entry-point-stats-to-csv",
+"If provided, the analyzer will dump statistics per entry point "
+"into the specified CSV file.",
+"")
+
ANALYZER_OPTION(bool, ShouldTrackConditions, "track-conditions",
"Whether to track conditions that are a control dependency of "
"an already tracked variable.",
@@ -0,0 +1,162 @@
// EntryPointStats.h - Tracking statistics per entry point ------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef CLANG_INCLUDE_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_ENTRYPOINTSTATS_H
#define CLANG_INCLUDE_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_ENTRYPOINTSTATS_H

#include "llvm/ADT/Statistic.h"
#include "llvm/ADT/StringRef.h"

namespace llvm {
class raw_ostream;
} // namespace llvm

namespace clang {
class Decl;

namespace ento {

class EntryPointStat {
public:
llvm::StringLiteral name() const { return Name; }

static void lockRegistry();

static void takeSnapshot(const Decl *EntryPoint);
static void dumpStatsAsCSV(llvm::raw_ostream &OS);
static void dumpStatsAsCSV(llvm::StringRef FileName);

protected:
explicit EntryPointStat(llvm::StringLiteral Name) : Name{Name} {}
EntryPointStat(const EntryPointStat &) = delete;
EntryPointStat(EntryPointStat &&) = delete;
EntryPointStat &operator=(EntryPointStat &) = delete;
EntryPointStat &operator=(EntryPointStat &&) = delete;

private:
llvm::StringLiteral Name;
};

class BoolEPStat : public EntryPointStat {
std::optional<bool> Value = {};

public:
explicit BoolEPStat(llvm::StringLiteral Name);
unsigned value() const { return Value && *Value; }
void set(bool V) {
assert(!Value.has_value());
Value = V;
}
void reset() { Value = {}; }
};

// used by CounterEntryPointTranslationUnitStat
class CounterEPStat : public EntryPointStat {
using EntryPointStat::EntryPointStat;
unsigned Value = {};

public:
explicit CounterEPStat(llvm::StringLiteral Name);
unsigned value() const { return Value; }
void reset() { Value = {}; }
CounterEPStat &operator++() {
++Value;
return *this;
}

CounterEPStat &operator++(int) {
// No difference as you can't extract the value
return ++(*this);
}

CounterEPStat &operator+=(unsigned Inc) {
Value += Inc;
return *this;
}
};

// used by UnsignedMaxEntryPointTranslationUnitStatistic
class UnsignedMaxEPStat : public EntryPointStat {
using EntryPointStat::EntryPointStat;
unsigned Value = {};

public:
explicit UnsignedMaxEPStat(llvm::StringLiteral Name);
unsigned value() const { return Value; }
void reset() { Value = {}; }
void updateMax(unsigned X) { Value = std::max(Value, X); }
};

class UnsignedEPStat : public EntryPointStat {
using EntryPointStat::EntryPointStat;
std::optional<unsigned> Value = {};

public:
explicit UnsignedEPStat(llvm::StringLiteral Name);
unsigned value() const { return Value.value_or(0); }
void reset() { Value.reset(); }
void set(unsigned V) {
assert(!Value.has_value());
Value = V;
}
};

class CounterEntryPointTranslationUnitStat {
CounterEPStat M;
llvm::TrackingStatistic S;

public:
CounterEntryPointTranslationUnitStat(const char *DebugType,
llvm::StringLiteral Name,
llvm::StringLiteral Desc)
: M(Name), S(DebugType, Name.data(), Desc.data()) {}
CounterEntryPointTranslationUnitStat &operator++() {
++M;
++S;
return *this;
}

CounterEntryPointTranslationUnitStat &operator++(int) {
// No difference with prefix as the value is not observable.
return ++(*this);
}

CounterEntryPointTranslationUnitStat &operator+=(unsigned Inc) {
M += Inc;
S += Inc;
return *this;
}
};

class UnsignedMaxEntryPointTranslationUnitStatistic {
UnsignedMaxEPStat M;
llvm::TrackingStatistic S;

public:
UnsignedMaxEntryPointTranslationUnitStatistic(const char *DebugType,
llvm::StringLiteral Name,
llvm::StringLiteral Desc)
: M(Name), S(DebugType, Name.data(), Desc.data()) {}
void updateMax(uint64_t Value) {
M.updateMax(static_cast<unsigned>(Value));
S.updateMax(Value);
}
};

#define STAT_COUNTER(VARNAME, DESC) \
static clang::ento::CounterEntryPointTranslationUnitStat VARNAME = { \
DEBUG_TYPE, #VARNAME, DESC}

#define STAT_MAX(VARNAME, DESC) \
static clang::ento::UnsignedMaxEntryPointTranslationUnitStatistic VARNAME = \
{DEBUG_TYPE, #VARNAME, DESC}

} // namespace ento
} // namespace clang

#endif // CLANG_INCLUDE_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_ENTRYPOINTSTATS_H
9 changes: 4 additions & 5 deletions clang/lib/StaticAnalyzer/Checkers/AnalyzerStatsChecker.cpp
@@ -13,12 +13,12 @@
#include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallString.h"
-#include "llvm/ADT/Statistic.h"
#include "llvm/Support/raw_ostream.h"
#include <optional>

@@ -27,10 +27,9 @@ using namespace ento;

#define DEBUG_TYPE "StatsChecker"

-STATISTIC(NumBlocks,
-"The # of blocks in top level functions");
-STATISTIC(NumBlocksUnreachable,
-"The # of unreachable blocks in analyzing top level functions");
+STAT_COUNTER(NumBlocks, "The # of blocks in top level functions");
+STAT_COUNTER(NumBlocksUnreachable,
+"The # of unreachable blocks in analyzing top level functions");

namespace {
class AnalyzerStatsChecker : public Checker<check::EndAnalysis> {
28 changes: 14 additions & 14 deletions clang/lib/StaticAnalyzer/Core/BugReporter.cpp
@@ -39,6 +39,7 @@
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/CheckerRegistryData.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h"
@@ -54,7 +55,6 @@
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/ADT/SmallVector.h"
-#include "llvm/ADT/Statistic.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/iterator_range.h"
@@ -82,19 +82,19 @@ using namespace llvm;

#define DEBUG_TYPE "BugReporter"

-STATISTIC(MaxBugClassSize,
-"The maximum number of bug reports in the same equivalence class");
-STATISTIC(MaxValidBugClassSize,
-"The maximum number of bug reports in the same equivalence class "
-"where at least one report is valid (not suppressed)");
-
-STATISTIC(NumTimesReportPassesZ3, "Number of reports passed Z3");
-STATISTIC(NumTimesReportRefuted, "Number of reports refuted by Z3");
-STATISTIC(NumTimesReportEQClassAborted,
-"Number of times a report equivalence class was aborted by the Z3 "
-"oracle heuristic");
-STATISTIC(NumTimesReportEQClassWasExhausted,
-"Number of times all reports of an equivalence class was refuted");
+STAT_MAX(MaxBugClassSize,
+"The maximum number of bug reports in the same equivalence class");
+STAT_MAX(MaxValidBugClassSize,
+"The maximum number of bug reports in the same equivalence class "
+"where at least one report is valid (not suppressed)");
+
+STAT_COUNTER(NumTimesReportPassesZ3, "Number of reports passed Z3");
+STAT_COUNTER(NumTimesReportRefuted, "Number of reports refuted by Z3");
+STAT_COUNTER(NumTimesReportEQClassAborted,
+"Number of times a report equivalence class was aborted by the Z3 "
+"oracle heuristic");
+STAT_COUNTER(NumTimesReportEQClassWasExhausted,
+"Number of times all reports of an equivalence class was refuted");

BugReporterVisitor::~BugReporterVisitor() = default;

1 change: 1 addition & 0 deletions clang/lib/StaticAnalyzer/Core/CMakeLists.txt
@@ -24,6 +24,7 @@ add_clang_library(clangStaticAnalyzerCore
CoreEngine.cpp
DynamicExtent.cpp
DynamicType.cpp
+EntryPointStats.cpp
Environment.cpp
ExplodedGraph.cpp
ExprEngine.cpp
16 changes: 7 additions & 9 deletions clang/lib/StaticAnalyzer/Core/CoreEngine.cpp
@@ -22,12 +22,12 @@
#include "clang/Basic/LLVM.h"
#include "clang/StaticAnalyzer/Core/AnalyzerOptions.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/BlockCounter.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/WorkList.h"
#include "llvm/ADT/STLExtras.h"
-#include "llvm/ADT/Statistic.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/FormatVariadic.h"
@@ -43,14 +43,12 @@ using namespace ento;

#define DEBUG_TYPE "CoreEngine"

-STATISTIC(NumSteps,
-"The # of steps executed.");
-STATISTIC(NumSTUSteps, "The # of STU steps executed.");
-STATISTIC(NumCTUSteps, "The # of CTU steps executed.");
-STATISTIC(NumReachedMaxSteps,
-"The # of times we reached the max number of steps.");
-STATISTIC(NumPathsExplored,
-"The # of paths explored by the analyzer.");
+STAT_COUNTER(NumSteps, "The # of steps executed.");
+STAT_COUNTER(NumSTUSteps, "The # of STU steps executed.");
+STAT_COUNTER(NumCTUSteps, "The # of CTU steps executed.");
+ALWAYS_ENABLED_STATISTIC(NumReachedMaxSteps,
+"The # of times we reached the max number of steps.");
+STAT_COUNTER(NumPathsExplored, "The # of paths explored by the analyzer.");

//===----------------------------------------------------------------------===//
// Core analysis engine.