Add V3VariableOrder pass
A separate V3VariableOrder pass is now used to order module variables
before Emit. All variables are now ordered together, regardless of
whether they are ports, signals from the design, or additional internal
variables added by Verilator (these used to be ordered and emitted as
separate groups in Emit). For single-threaded models, this is
performance neutral. For multi-threaded models, the MTask-affinity-based
sorting was slightly modified: variables with no MTask affinity are
emitted last, and the remaining MTask affinity sets are sorted using the
TSP sorter as before, but again without differentiating ports, signals,
and internal variables. This yields a 2%+ speedup for the multi-threaded
model on OpenTitan.
2021-06-29 18:57:07 +02:00
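The ordering described in the commit message can be sketched roughly as follows. This is a minimal illustration, not Verilator's implementation: `orderByAffinity` and `AffinitySet` are hypothetical names, variables are plain strings, and the groups that have some affinity are emitted in `std::map` order here rather than in the TSP-sorted order the pass uses.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an MTask affinity set: bit i is set when the
// variable is referenced by MTask i.
using AffinitySet = std::vector<bool>;

// Hypothetical helper: group variables by affinity set, emit every group that
// has at least one bit set, then emit the variables with no affinity last.
// Assumption: groups are visited in std::map order, not TSP order.
std::vector<std::string>
orderByAffinity(const std::vector<std::pair<std::string, AffinitySet>>& vars) {
    std::map<AffinitySet, std::vector<std::string>> groups;
    for (const auto& var : vars) groups[var.second].push_back(var.first);

    std::vector<std::string> ordered;
    std::vector<std::string> noAffinity;
    for (const auto& group : groups) {
        bool any = false;
        for (const bool bit : group.first) any = any || bit;
        std::vector<std::string>& dest = any ? ordered : noAffinity;
        dest.insert(dest.end(), group.second.begin(), group.second.end());
    }
    ordered.insert(ordered.end(), noAffinity.begin(), noAffinity.end());
    return ordered;
}
```

Variables sharing an affinity set end up adjacent, which is the property the pass relies on; ports, signals, and internal variables are not treated differently in this sketch either.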
|
|
|
// -*- mode: C++; c-file-style: "cc-mode" -*-
|
|
|
|
|
//*************************************************************************
|
|
|
|
|
// DESCRIPTION: Verilator: Variable ordering
|
|
|
|
|
//
|
|
|
|
|
// Code available from: https://verilator.org
|
|
|
|
|
//
|
|
|
|
|
//*************************************************************************
|
|
|
|
|
//
|
2025-01-01 14:30:25 +01:00
|
|
|
// Copyright 2003-2025 by Wilson Snyder. This program is free software; you
|
|
|
|
// can redistribute it and/or modify it under the terms of either the GNU
|
|
|
|
|
// Lesser General Public License Version 3 or the Perl Artistic License
|
|
|
|
|
// Version 2.0.
|
|
|
|
|
// SPDX-License-Identifier: LGPL-3.0-only OR Artistic-2.0
|
|
|
|
|
//
|
|
|
|
|
//*************************************************************************
|
|
|
|
|
// V3VariableOrder's Transformations:
|
|
|
|
|
//
|
|
|
|
|
// Each module:
|
|
|
|
|
// Order module variables
|
|
|
|
|
//
|
|
|
|
|
//*************************************************************************
|
|
|
|
|
|
2024-09-06 14:04:26 +02:00
|
|
|
#include "V3PchAstMT.h"
|
2023-10-18 12:37:46 +02:00
|
|
|
|
|
|
|
#include "V3VariableOrder.h"
|
2022-08-05 11:56:57 +02:00
|
|
|
|
|
|
|
#include "V3AstUserAllocator.h"
|
|
|
|
|
#include "V3EmitCBase.h"
|
2024-03-16 17:32:12 +01:00
|
|
|
#include "V3ExecGraph.h"
|
|
|
|
#include "V3TSP.h"
|
2024-09-06 14:04:26 +02:00
|
|
|
#include "V3ThreadPool.h"
|
|
|
|
|
|
|
|
|
#include <vector>
|
|
|
|
|
|
2022-09-18 21:53:42 +02:00
|
|
|
VL_DEFINE_DEBUG_FUNCTIONS;
|
|
|
|
|
|
2024-03-16 17:32:12 +01:00
|
|
|
using MTaskIdVec = std::vector<bool>; // Used as a bit-set indexed by MTask ID
|
|
|
|
|
using MTaskAffinityMap = std::unordered_map<const AstVar*, MTaskIdVec>;
|
|
|
|
|
|
|
|
|
|
// Trace through code reachable from an MTask and annotate referenced variables
|
|
|
|
|
class GatherMTaskAffinity final : VNVisitorConst {
|
|
|
|
|
// NODE STATE
|
|
|
|
|
// AstCFunc::user1() // bool: Already traced this function
|
|
|
|
|
// AstVar::user1() // bool: Already traced this variable
|
|
|
|
|
const VNUser1InUse m_user1InUse;
|
|
|
|
|
|
|
|
|
|
// STATE
|
|
|
|
|
MTaskAffinityMap& m_results; // The result map being built
|
|
|
|
|
const uint32_t m_id; // Id of mtask being analysed
|
|
|
|
|
const size_t m_usedIds = ExecMTask::numUsedIds(); // Value of max id + 1
|
|
|
|
|
|
|
|
|
|
// CONSTRUCTOR
|
|
|
|
|
GatherMTaskAffinity(const ExecMTask* mTaskp, MTaskAffinityMap& results)
|
|
|
|
|
: m_results{results}
|
|
|
|
|
, m_id{mTaskp->id()} {
|
|
|
|
|
iterateChildrenConst(mTaskp->bodyp());
|
|
|
|
|
}
|
|
|
|
|
~GatherMTaskAffinity() = default;
|
|
|
|
|
VL_UNMOVABLE(GatherMTaskAffinity);
|
|
|
|
|
|
|
|
|
|
// VISIT
|
2024-06-14 03:29:03 +02:00
|
|
|
void visit(AstNodeVarRef* nodep) override {
|
2024-03-16 17:32:12 +01:00
|
|
|
// Cheaper than relying on emplace().second
|
|
|
|
|
if (nodep->user1SetOnce()) return;
|
|
|
|
|
AstVar* const varp = nodep->varp();
|
|
|
|
|
// Ignore TriggerVec. They are big and read-only in the MTask bodies
|
|
|
|
|
AstBasicDType* const basicp = varp->dtypep()->basicp();
|
|
|
|
|
if (basicp && basicp->isTriggerVec()) return;
|
|
|
|
|
// Set affinity bit
|
|
|
|
|
MTaskIdVec& affinity = m_results
|
|
|
|
|
.emplace(std::piecewise_construct, //
|
|
|
|
|
std::forward_as_tuple(varp), //
|
|
|
|
|
std::forward_as_tuple(m_usedIds))
|
|
|
|
|
.first->second;
|
|
|
|
|
affinity[m_id] = true;
|
|
|
|
|
}
|
|
|
|
|
|
2024-06-14 03:29:03 +02:00
|
|
|
void visit(AstCFunc* nodep) override {
|
2024-03-16 17:32:12 +01:00
|
|
|
if (nodep->user1SetOnce()) return; // Prevent repeat traversals/recursion
|
|
|
|
|
iterateChildrenConst(nodep);
|
|
|
|
|
}
|
|
|
|
|
|
2024-06-14 03:29:03 +02:00
|
|
|
void visit(AstNodeCCall* nodep) override {
|
2024-03-16 17:32:12 +01:00
|
|
|
iterateChildrenConst(nodep); // Arguments
|
|
|
|
|
iterateConst(nodep->funcp()); // Callee
|
|
|
|
|
}
|
|
|
|
|
|
2024-06-14 03:29:03 +02:00
|
|
|
void visit(AstNode* nodep) override { iterateChildrenConst(nodep); }
|
2024-03-16 17:32:12 +01:00
|
|
|
|
|
|
|
|
public:
|
|
|
|
|
static void apply(const ExecMTask* mTaskp, MTaskAffinityMap& results) {
|
|
|
|
|
GatherMTaskAffinity{mTaskp, results};
|
|
|
|
|
}
|
|
|
|
|
};
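The "find or create, then set a bit" step used by the visitor above can be tried in isolation. The sketch below is a simplified stand-in under stated assumptions: `markAffinity`, `IdBits`, and `AffinityMap` are hypothetical names, and string keys replace `const AstVar*`. It shows how `emplace` with `std::piecewise_construct` builds the zero-filled bit-set only when the key is new and otherwise returns the existing entry.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <unordered_map>
#include <utility>
#include <vector>

using IdBits = std::vector<bool>;  // Bit-set indexed by task ID
using AffinityMap = std::unordered_map<std::string, IdBits>;

// Hypothetical helper: record that variable 'name' is referenced by task 'id'.
// 'numIds' is the bit-set size (max ID + 1).
void markAffinity(AffinityMap& results, const std::string& name, std::size_t id,
                  std::size_t numIds) {
    assert(id < numIds);
    // If 'name' is absent, construct an all-false vector of 'numIds' bits in
    // place; if present, emplace leaves the existing entry untouched.
    IdBits& bits = results
                       .emplace(std::piecewise_construct,  //
                                std::forward_as_tuple(name),  //
                                std::forward_as_tuple(numIds))
                       .first->second;
    bits[id] = true;
}
```

Calling the helper repeatedly for the same variable from different tasks accumulates bits in one entry, so each variable ends up with the set of tasks that touch it.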
|
|
|
|
|
|
|
|
|
//######################################################################
|
|
|
|
|
// Establish mtask variable sort order in mtasks mode
|
|
|
|
|
|
|
|
|
|
class VarTspSorter final : public V3TSP::TspStateBase {
|
|
|
|
|
// MEMBERS
|
2024-03-16 17:32:12 +01:00
|
|
|
const MTaskIdVec& m_mTaskIds; // Mtask we're ordering
|
|
|
|
|
static uint32_t s_serialNext; // Unique ID to establish serial order
|
|
|
|
|
const uint32_t m_serial = ++s_serialNext; // Serial ordering
|
|
|
|
public:
|
|
|
|
|
// CONSTRUCTORS
|
2024-03-16 17:32:12 +01:00
|
|
|
explicit VarTspSorter(const MTaskIdVec& mTaskIds)
|
|
|
|
|
: m_mTaskIds{mTaskIds} {
|
|
|
|
|
UASSERT(mTaskIds.size() == ExecMTask::numUsedIds(), "Wrong size for MTask ID vector");
|
|
|
|
}
|
2022-07-30 16:01:25 +02:00
|
|
|
~VarTspSorter() override = default;
|
|
|
|
// METHODS
|
2022-09-16 12:22:11 +02:00
|
|
|
bool operator<(const TspStateBase& other) const override {
|
2022-09-02 12:29:02 +02:00
|
|
|
return operator<(static_cast<const VarTspSorter&>(other));
|
|
|
|
}
|
|
|
|
|
bool operator<(const VarTspSorter& other) const { return m_serial < other.m_serial; }
|
2024-03-16 17:32:12 +01:00
|
|
|
const MTaskIdVec& mTaskIds() const { return m_mTaskIds; }
|
2024-09-06 14:04:26 +02:00
|
|
|
int cost(const TspStateBase* otherp) const override VL_MT_SAFE {
|
2022-09-02 12:29:02 +02:00
|
|
|
return cost(static_cast<const VarTspSorter*>(otherp));
|
|
|
|
}
|
2024-09-06 14:04:26 +02:00
|
|
|
int cost(const VarTspSorter* otherp) const VL_MT_SAFE {
|
2024-03-16 17:32:12 +01:00
|
|
|
// Compute the number of MTasks not shared (Hamming distance)
|
|
|
|
|
int cost = 0;
|
|
|
|
|
const size_t size = ExecMTask::numUsedIds();
|
2024-03-27 22:57:49 +01:00
|
|
|
for (size_t i = 0; i < size; ++i) cost += m_mTaskIds.at(i) ^ otherp->m_mTaskIds.at(i);
|
|
|
|
return cost;
|
|
|
|
|
}
|
|
|
|
|
};
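The cost used by the sorter above is the Hamming distance between two affinity bit-sets: the number of MTasks that reference one variable group but not the other. A minimal standalone sketch follows; `hammingDistance` is a hypothetical free function written for illustration, not part of V3TSP.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: count the positions at which two equally sized
// bit-sets differ.
int hammingDistance(const std::vector<bool>& a, const std::vector<bool>& b) {
    assert(a.size() == b.size());
    int distance = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) ++distance;
    }
    return distance;
}
```

Two groups with identical affinity have distance 0, so a tour that minimizes the summed distance tends to place variables used by the same MTasks next to each other.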
|
|
|
|
|
|
2024-03-16 17:32:12 +01:00
|
|
|
uint32_t VarTspSorter::s_serialNext = 0;
|
|
|
|
|
2024-09-06 14:04:26 +02:00
|
|
|
struct VarAttributes final {
|
|
|
|
|
uint8_t stratum; // Roughly equivalent to alignment requirement, to avoid padding
|
|
|
|
|
bool anonOk; // Can be emitted as part of anonymous structure
|
|
|
|
|
};
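Attributes like these can drive a multi-key stable sort. The sketch below is an illustration under stated assumptions: `DemoVar` and `simpleSort` are hypothetical, and the static flag is folded into the same struct for brevity. The three keys follow the comparator in `simpleSortVars`: non-statics first, then variables that can go into an anonymous structure, then ascending stratum.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flattened record combining the sort keys for one variable.
struct DemoVar {
    std::string name;
    bool isStatic;
    bool anonOk;
    std::uint8_t stratum;
};

// Hypothetical helper: stable multi-key sort, so variables that compare equal
// on all keys keep their original relative order.
void simpleSort(std::vector<DemoVar>& vars) {
    std::stable_sort(vars.begin(), vars.end(), [](const DemoVar& a, const DemoVar& b) {
        if (a.isStatic != b.isStatic) return b.isStatic;  // Non-statics before statics
        if (a.anonOk != b.anonOk) return a.anonOk;  // Anons before non-anons
        return a.stratum < b.stratum;  // Finally by stratum
    });
}
```

Grouping by stratum keeps similarly aligned members together, which is what the `stratum` comment means by avoiding padding.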
|
|
|
|
class VariableOrder final {
|
2024-09-06 14:04:26 +02:00
|
|
|
std::unordered_map<const AstVar*, VarAttributes> m_attributes;
|
|
|
|
|
2024-03-16 17:32:12 +01:00
|
|
|
const MTaskAffinityMap& m_mTaskAffinity;
|
2024-09-06 14:04:26 +02:00
|
|
|
std::vector<AstVar*>& m_varps;
|
2024-03-16 17:32:12 +01:00
|
|
|
|
2024-09-06 14:04:26 +02:00
|
|
|
VariableOrder(AstNodeModule* modp, const MTaskAffinityMap& mTaskAffinity,
|
|
|
|
|
std::vector<AstVar*>& varps)
|
|
|
|
|
: m_mTaskAffinity{mTaskAffinity}
|
|
|
|
|
, m_varps{varps} {
|
2024-03-16 17:32:12 +01:00
|
|
|
orderModuleVars(modp);
|
|
|
|
|
}
|
|
|
|
|
~VariableOrder() = default;
|
|
|
|
|
VL_UNCOPYABLE(VariableOrder);
|
|
|
|
|
|
|
|
|
//######################################################################
|
|
|
|
|
|
|
|
|
|
// Simple sort
|
|
|
|
|
void simpleSortVars(std::vector<AstVar*>& varps) {
|
|
|
|
|
stable_sort(varps.begin(), varps.end(),
|
|
|
|
|
[this](const AstVar* ap, const AstVar* bp) -> bool {
|
|
|
|
|
if (ap->isStatic() != bp->isStatic()) { // Non-statics before statics
|
|
|
|
|
return bp->isStatic();
|
|
|
|
|
}
|
2024-09-06 14:04:26 +02:00
|
|
|
UASSERT(m_attributes.find(ap) != m_attributes.end()
|
|
|
|
|
&& m_attributes.find(bp) != m_attributes.end(),
|
|
|
|
|
"m_attributes should be populated for each AstVar");
|
|
|
|
|
const auto& attrA = m_attributes.at(ap);
|
|
|
|
|
const auto& attrB = m_attributes.at(bp);
|
|
|
|
if (attrA.anonOk != attrB.anonOk) { // Anons before non-anons
|
|
|
|
|
return attrA.anonOk;
|
|
|
|
|
}
|
|
|
|
|
return attrA.stratum < attrB.stratum; // Finally sort by stratum
|
|
|
|
|
});
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Sort by MTask-affinity first, then the same as simpleSortVars
|
|
|
|
|
void tspSortVars(std::vector<AstVar*>& varps) {
|
|
|
|
|
// Map from "MTask affinity" -> "variable list"
|
2024-03-16 17:32:12 +01:00
|
|
|
std::map<const MTaskIdVec, std::vector<AstVar*>> m2v;
|
|
|
|
|
const MTaskIdVec emptyVec(ExecMTask::numUsedIds(), false);
|
|
|
|
|
for (AstVar* const varp : varps) {
|
|
|
|
|
const auto it = m_mTaskAffinity.find(varp);
|
|
|
|
|
const MTaskIdVec& key = it == m_mTaskAffinity.end() ? emptyVec : it->second;
|
|
|
|
|
m2v[key].push_back(varp);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
// Create a TSP sort state for each unique MTask affinity set, except for the empty set
|
|
|
|
|
V3TSP::StateVec states;
|
|
|
|
|
for (const auto& pair : m2v) {
|
2024-03-16 17:32:12 +01:00
|
|
|
const MTaskIdVec& vec = pair.first;
|
|
|
|
|
const bool empty = std::find(vec.begin(), vec.end(), true) == vec.end();
|
|
|
|
|
if (!empty) states.push_back(new VarTspSorter{vec});
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Do the TSP sort
|
|
|
|
|
V3TSP::StateVec sortedStates;
|
|
|
|
|
V3TSP::tspSort(states, &sortedStates);
|
|
|
|
|
|
|
|
|
|
varps.clear();
|
|
|
|
|
|
|
|
|
|
// Helper function to sort given vector, then append to 'varps'
|
|
|
|
|
const auto sortAndAppend = [this, &varps](std::vector<AstVar*>& subVarps) {
|
|
|
|
|
simpleSortVars(subVarps);
|
2024-02-25 23:12:13 +01:00
|
|
|
for (AstVar* const varp : subVarps) varps.push_back(varp);
|
|
|
|
};
|
|
|
|
|
|
|
|
|
|
// Enumerate by sorted MTask affinity set, sort within each set separately
|
|
|
|
|
for (const V3TSP::TspStateBase* const stateBasep : sortedStates) {
|
|
|
|
|
const VarTspSorter* const statep = dynamic_cast<const VarTspSorter*>(stateBasep);
|
2024-03-16 17:32:12 +01:00
|
|
|
sortAndAppend(m2v[statep->mTaskIds()]);
|
|
|
|
VL_DO_DANGLING(delete statep, statep);
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Finally add the variables with no known MTask affinity
|
2024-03-16 17:32:12 +01:00
|
|
|
sortAndAppend(m2v[emptyVec]);
|
|
|
|
}
|
|
|
|
|
|
2025-08-21 10:43:37 +02:00
|
|
|
// cppcheck-suppress constParameterPointer
|
|
|
|
void orderModuleVars(AstNodeModule* modp) {
|
|
|
|
|
// Unlink all module variables from the module, compute attributes
|
|
|
|
|
for (AstNode *nodep = modp->stmtsp(), *nextp; nodep; nodep = nextp) {
|
|
|
|
|
nextp = nodep->nextp();
|
|
|
|
|
if (AstVar* const varp = VN_CAST(nodep, Var)) {
|
2024-09-06 14:04:26 +02:00
|
|
|
m_varps.push_back(varp);
|
|
|
|
|
|
|
|
|
// Compute attributes up front
|
|
|
|
|
// Stratum
|
|
|
|
|
const int sigbytes = varp->dtypeSkipRefp()->widthAlignBytes();
|
Deprecate clocker attribute and --clk option (#6463)
The only use for the clocker attribute and the AstVar::isUsedClock that
is actually necessary today for correctness is to mark top level inputs
of --lib-create blocks as being (or driving) a clock signal. Correctness
of --lib-create (and hence hierarchical blocks) actually used to depend
on having the right optimizations eliminate intermediate clocks (e.g.:
V3Gate), when the top level port was not used directly in a sensitivity
list, or marking top level signals manually via --clk or the clocker
attribute. However V3Sched::partition already needs to trace through the
logic to figure out what signals might drive a sensitivity list, so it
can very easily mark all top level inputs as such.
In this patch we remove the AstVar::attrClocker and AstVar::isUsedClock
attributes, and replace them with AstVar::isPrimaryClock, automatically
set by V3Sched::partition. This eliminates all need for manual
annotation so we are deprecating the --clk/--no-clk options and the
clocker/no_clocker attributes.
This also eliminates the opportunity for any further mis-optimization
similar to #6453.
Regarding the other uses of the removed AstVar attributes:
- As of 5.000, initial edges are triggered via a separate mechanism
applied in V3Sched, so the use in V3EmitCFunc.cpp is redundant
- Also as of 5.000, we can handle arbitrary sensitivity expressions, so
the restriction on eliminating clock signals in V3Gate is unnecessary
- Since the recent change when Dfg is applied after V3Scope, it does
perform the equivalent of GateClkDecomp, so we can delete that pass.
2025-09-20 16:50:22 +02:00
                const uint8_t stratum = (v3Global.opt.hierChild() && varp->isPrimaryIO()) ? 0
                                        : (varp->isPrimaryClock() && varp->widthMin() == 1) ? 1
                                        : VN_IS(varp->dtypeSkipRefp(), UnpackArrayDType) ? 9
                                        : (varp->basicp() && varp->basicp()->isOpaque()) ? 8
                                        : (varp->isScBv() || varp->isScBigUint()) ? 7
                                        : (sigbytes == 8) ? 6
                                        : (sigbytes == 4) ? 5
                                        : (sigbytes == 2) ? 3
                                        : (sigbytes == 1) ? 2
                                        : 10;
                m_attributes.emplace(varp, VarAttributes{stratum, EmitCUtil::isAnonOk(varp)});
            }
        }

        if (!m_varps.empty()) {
            if (!v3Global.opt.mtasks()) {
                simpleSortVars(m_varps);
            } else {
                tspSortVars(m_varps);
            }
        }
    }

public:
    static void processModule(AstNodeModule* modp, const MTaskAffinityMap& mTaskAffinity,
                              std::vector<AstVar*>& varps) VL_MT_STABLE {
        VariableOrder{modp, mTaskAffinity, varps};
    }
};
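
The stratum cascade in the class above can be read as a sort key: the first matching condition decides the group a variable lands in. The following is a minimal, standalone sketch of that idea, not the Verilator implementation. `VarInfo`, `stratumOf`, and `sortByStratum` are hypothetical names, the boolean fields stand in for the `AstVar` queries used above, and sorting by stratum alone is an assumption (the comparators behind `simpleSortVars` and `tspSortVars` are not shown here and may weigh other attributes).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the AstVar properties consulted by the cascade.
struct VarInfo {
    std::string name;
    int alignBytes = 0;          // stands in for widthAlignBytes()
    bool hierChildIo = false;    // hierChild() && isPrimaryIO()
    bool oneBitClock = false;    // isPrimaryClock() && widthMin() == 1
    bool unpackedArray = false;  // dtype is an unpacked array
    bool opaque = false;         // basic type is opaque
    bool scWide = false;         // isScBv() || isScBigUint()
};

// Mirrors the cascade: conditions are tested top to bottom, first match wins.
uint8_t stratumOf(const VarInfo& v) {
    assert(v.alignBytes >= 0);
    if (v.hierChildIo) return 0;
    if (v.oneBitClock) return 1;
    if (v.unpackedArray) return 9;
    if (v.opaque) return 8;
    if (v.scWide) return 7;
    if (v.alignBytes == 8) return 6;
    if (v.alignBytes == 4) return 5;
    if (v.alignBytes == 2) return 3;
    if (v.alignBytes == 1) return 2;
    return 10;
}

// Assumption: order by stratum only; stable_sort keeps the original
// relative order of variables that share a stratum.
void sortByStratum(std::vector<VarInfo>& vars) {
    std::stable_sort(vars.begin(), vars.end(), [](const VarInfo& a, const VarInfo& b) {
        return stratumOf(a) < stratumOf(b);
    });
}
```

Under these assumptions, a one-bit clock sorts ahead of plain byte-sized signals, and unpacked arrays sort behind the scalar strata regardless of their element alignment, because the array test comes before the size tests.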

//######################################################################
// V3VariableOrder static functions

void V3VariableOrder::orderAll(AstNetlist* netlistp) {
    UINFO(2, __FUNCTION__ << ":");

    MTaskAffinityMap mTaskAffinity;

    // Gather MTask affinities
    if (v3Global.opt.mtasks()) {
        netlistp->topModulep()->foreach([&](AstExecGraph* execGraphp) {
            for (const V3GraphVertex& vtx : execGraphp->depGraphp()->vertices()) {
                GatherMTaskAffinity::apply(vtx.as<const ExecMTask>(), mTaskAffinity);
            }
        });
    }
    if (v3Global.opt.stats()) V3Stats::statsStage("variableorder-gather");

    // Sort variables for each module
    std::unordered_map<AstNodeModule*, std::vector<AstVar*>> sortedVars;
    {
        V3ThreadScope threadScope;

        for (AstNodeModule* modp = v3Global.rootp()->modulesp(); modp;
             modp = VN_AS(modp->nextp(), NodeModule)) {
            std::vector<AstVar*>& varps = sortedVars[modp];
            threadScope.enqueue([modp, mTaskAffinity, &varps]() {
                VariableOrder::processModule(modp, mTaskAffinity, varps);
            });
        }
    }
    if (v3Global.opt.stats()) V3Stats::statsStage("variableorder-sort");

    // Insert them back under the module, in the new order, but at
    // the front of the list so they come out first in dumps/XML.
    for (AstNodeModule* modp = v3Global.rootp()->modulesp(); modp;
         modp = VN_AS(modp->nextp(), NodeModule)) {
        const std::vector<AstVar*>& varps = sortedVars[modp];

        if (!varps.empty()) {
            auto it = varps.cbegin();
            AstVar* const firstp = *it++;
            firstp->unlinkFrBack();
            for (; it != varps.cend(); ++it) {
                AstVar* const varp = *it;
                varp->unlinkFrBack();
                firstp->addNext(varp);
            }
            if (AstNode* const stmtsp = modp->stmtsp()) {
                stmtsp->unlinkFrBackWithNext();
                AstNode::addNext<AstNode, AstNode>(firstp, stmtsp);
            }
            modp->addStmtsp(firstp);
        }
    }

    // Done
    V3Global::dumpCheckGlobalTree("variableorder", 0, dumpTreeEitherLevel() >= 3);
}
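
The re-insertion loop in `orderAll` chains the sorted variables together, appends whatever statements remain in the module, and links the result back in, so the variables end up first. Below is a small standalone model of that effect using plain vectors of names instead of AST nodes. `relinkVarsFirst` is a hypothetical helper, and it assumes that statement names are unique and that `sortedVars` is a reordering of the variables already present in `stmts`.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical model of a module's statement list as a vector of names.
// Returns the sorted variables first, followed by every remaining
// statement in its original relative order.
std::vector<std::string> relinkVarsFirst(const std::vector<std::string>& stmts,
                                         const std::vector<std::string>& sortedVars) {
    const std::unordered_set<std::string> isVar(sortedVars.begin(), sortedVars.end());
    std::vector<std::string> result(sortedVars);  // variables lead, in sorted order
    for (const std::string& s : stmts) {
        if (!isVar.count(s)) result.push_back(s);  // non-variables follow, order kept
    }
    // Holds under the stated assumption that sortedVars only reorders
    // variables that were already in stmts.
    assert(result.size() == stmts.size());
    return result;
}
```

The list-splicing calls in the original do this without copying nodes. The vector version trades that efficiency for a form that is easy to check by hand.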