This patch gets rid of over 80% of temporary dynamic memory allocations
(when a malloced node is immediately freed with no other malloc in
between). It also gets rid of over 20% of all calls to malloc.
It's worth ~3% average verilation speed up with tcmalloc, and more
without tcmalloc.
Internals: Refactor generate construct Ast handling (#6280)
We introduce AstNodeGen, the common base class of AstGenBlock,
AstGenCase, AstGenFor, and AstGenIf, which together represent all SV
generate constructs. Subsequently remove AstNodeFor, AstNodeCase
(AstCase is now directly derived from AstNodeStmt) and adjust internals
to work on the new representation.
Output is identical modulo hashes do to changed AstNode type ids, no
functional change intended.
Step towards #6280.
Rename AstAssignAlias to AstAlias and make it derive from AstNode
instead of AstNodeStmt.
Replace AstAlias with AstAssignW in V3LinkDot::linkDotScope, which is
the last place we need to be aware of the alias construct. Using
AstAssignW dowstream enables further optimization while preserving the
same functionality.
The only use for the clocker attribute and the AstVar::isUsedClock that
is actually necessary today for correctness is to mark top level inputs
of --lib-create blocks as being (or driving) a clock signal. Correctness
of --lib-create (and hence hierarchical blocks) actually used to depend
on having the right optimizations eliminate intermediate clocks (e.g.:
V3Gate), when the top level port was not used directly in a sensitivity
list, or marking top level signals manually via --clk or the clocker
attribute. However V3Sched::partition already needs to trace through the
logic to figure out what signals might drive a sensitivity list, so it
can very easily mark all top level inputs as such.
In this patch we remove the AstVar::attrClocker and AstVar::isUsedClock
attributes, and replace them with AstVar::isPrimaryClock, automatically
set by V3Sched::partition. This eliminates all need for manual
annotation so we are deprecating the --clk/--no-clk options and the
clocker/no_clocker attributes.
This also eliminates the opportunity for any further mis-optimization
similar to #6453.
Regarding the other uses of the removed AstVar attributes:
- As of 5.000, initial edges are triggered via a separate mechanism
applied in V3Sched, so the use in V3EmitCFunc.cpp is redundant
- Also as of 5.000, we can handle arbitrary sensitivity expressions, so
the restriction on eliminating clock signals in V3Gate is unnecessary
- Since the recent change when Dfg is applied after V3Scope, it does
perform the equivalent of GateClkDecomp, so we can delete that pass.
Large scale refactoring to simplify some of the more obtuse internals of
DFG. Remove multiple redundant internal APIs, simplify representation of
variables, fix potential unsoundness in circular decomposition. No
functional change intended.
Added cppcheck-suppressions.txt in the repo root. You can add new
patterns in there instead of having to parse the XML output.
Also configure to add the -D__GNUC__ preprocessor macro, which makes it
understand UASSERT (it understands the 'noreturn' function attribute).
Added some case by case specific suppressions and fixed up other code,
especially in V3Ast*h and V3Dfg*.h, including code generated by astgen
that had some no-ops that irks cppcheck.
One thing it does not seem to like is `const` class members with default
initializers in the class. It will assume that's always the value, even
if overridden in the constructor. We had few so removed them.
With that a lot of files in `src/` are now clean or only have a handful
of issues. Therefore, I have also deleted cppcheck_filtered, and made it
produce human readable output straight to the terminal.
Regarding cleaning up the reported nits, I kind of got bored after
V3[A-E] so pausing here. Apologies for the merge conflicts.
Tested with cppcheck 2.13.0
This patch adds DfgLogic, which is a vertex that represents a whole,
arbitrarily complex combinational AstAlways or AstAssignW in the
DfgGraph.
Implementing this requires computing the variables live at entry to the
AstAlways (variables read by the block), so there is a new
ControlFlowGraph data structure and a classical data-flow analysis based
live variable analysis to do that at the variable level (as opposed to
bit/element level).
The actual CFG construction and live variable analysis is best effort,
and might fail for currently unhandled constructs or data types. This
can be extended later.
V3DfgAstToDfg is changed to convert the Ast into an initial DfgGraph
containing only DfgLogic, DfgVertexSplice and DfgVertexVar vertices.
The DfgLogic are then subsequently synthesized into primitive operations
by the new V3DfgSynthesize pass, which is a combination of the old
V3DfgAstToDfg conversion and new code to handle AstAlways blocks with
complex flow control.
V3DfgSynthesize by default will synthesize roughly the same constructs
as V3DfgAstToDfg used to handle before, plus any logic that is part of a
combinational cycle within the DfgGraph. This enables breaking up these
cycles, for which there are extensions to V3DfgBreakCycles in this patch
as well. V3DfgSynthesize will then delete all non synthesized or non
synthesizable DfgLogic vertices and the rest of the Dfg pipeline is
identical, with minor changes to adjust for the changed representation.
Because with this change we can now eliminate many more UNOPTFLAT, DFG
has been disabled in all the tests that specifically target testing the
scheduling and reporting of circular combinational logic.
Rewrote V3LifePost to not depend on having AstAssignPre and
AstAssignPost types, but work with generic AstNodeAssign. There is an
extra flag in AstVarScope to denote it's part of an NBA and should be
considered.
Step towards #6280.
Remove AstJumpLabel
AstJumpGo now references one if its enclosing AstJumpBlocks, and
branches straight after the referenced block.
That is:
```
JumpBlock a {
...
JumpGo(a);
...
}
// <--- the JumpGo(a) goes here
```
This is sufficient for all use cases and makes control flow much easier to
reason about. As a result, V3Const can optimize a bit more aggressively.
Second half of, and fixes#6216