All code is built as C++ via CXX, but we still have some references to
CC. Trying to make sure we don't add plain C later by hiding the C
compiler. (So it's always enough to override CXX=... in configure)
This patch gets rid of over 80% of temporary dynamic memory allocations
(when a malloced node is immediately freed with no other malloc in
between). It also gets rid of over 20% of all calls to malloc.
It's worth ~3% average verilation speed up with tcmalloc, and more
without tcmalloc.
This patch implements #6480. All loop statements are represented using
AstLoop and AstLoopTest.
This necessitates rework of the loop unroller to handle loops of
arbitrary form. To enable this, I have split the old unroller used for
'generate for' statements and moved it into V3Param, and subsequently
rewrote V3Unroll to handle the new representation. V3Unroll can now
unroll more complex loops, including with loop conditions containing
multiple variable references or inlined functions.
Handling the more generic code also requires some restrictions. If a
loop contains any of the following, it cannot be unrolled:
- A timing control that might suspend the loop
- A non-inlined call to a non-pure function
These constructs can change the values of variables in the loop, so are
generally not safe to unroll if they are present. (We could still unroll
if all the variables needed for unrolling are automatic, however we
don't do that right now.)
These restrictions seem ok in the benchmark suite, where the new
unroller can generally unroll many more loops than before.
Internals: Refactor generate construct Ast handling (#6280)
We introduce AstNodeGen, the common base class of AstGenBlock,
AstGenCase, AstGenFor, and AstGenIf, which together represent all SV
generate constructs. Subsequently remove AstNodeFor, AstNodeCase
(AstCase is now directly derived from AstNodeStmt) and adjust internals
to work on the new representation.
Output is identical modulo hashes do to changed AstNode type ids, no
functional change intended.
Step towards #6280.
Rename AstAssignAlias to AstAlias and make it derive from AstNode
instead of AstNodeStmt.
Replace AstAlias with AstAssignW in V3LinkDot::linkDotScope, which is
the last place we need to be aware of the alias construct. Using
AstAssignW dowstream enables further optimization while preserving the
same functionality.
The only use for the clocker attribute and the AstVar::isUsedClock that
is actually necessary today for correctness is to mark top level inputs
of --lib-create blocks as being (or driving) a clock signal. Correctness
of --lib-create (and hence hierarchical blocks) actually used to depend
on having the right optimizations eliminate intermediate clocks (e.g.:
V3Gate), when the top level port was not used directly in a sensitivity
list, or marking top level signals manually via --clk or the clocker
attribute. However V3Sched::partition already needs to trace through the
logic to figure out what signals might drive a sensitivity list, so it
can very easily mark all top level inputs as such.
In this patch we remove the AstVar::attrClocker and AstVar::isUsedClock
attributes, and replace them with AstVar::isPrimaryClock, automatically
set by V3Sched::partition. This eliminates all need for manual
annotation so we are deprecating the --clk/--no-clk options and the
clocker/no_clocker attributes.
This also eliminates the opportunity for any further mis-optimization
similar to #6453.
Regarding the other uses of the removed AstVar attributes:
- As of 5.000, initial edges are triggered via a separate mechanism
applied in V3Sched, so the use in V3EmitCFunc.cpp is redundant
- Also as of 5.000, we can handle arbitrary sensitivity expressions, so
the restriction on eliminating clock signals in V3Gate is unnecessary
- Since the recent change when Dfg is applied after V3Scope, it does
perform the equivalent of GateClkDecomp, so we can delete that pass.
These are no longer required for correct scheduling. They are still
accepted for backward compatibility, but have no effect on simulation
and are dropped in the front-end. Also removed the then redundant
AstAlwaysPublic class.
Fixes#6442
Added a lock around the table used to detect memory leaks. Note the only
part that is multi-threaded at the moment is V3Emit where we create
AstCFiles and the like, so the lock should be almost always uncontested.
This code is not thread safe. Specifically AstNode constructors are not
thread safe, as they may create entries in the shared Dtype table via
which can be racy.
Replace std::exit with v3Global.exit, and make V3Error::vlAbort call
v3Global.shutdown. This gives us an opportunity to release resources to
facilitate leak checking even when exiting early on an error.
Note we still don't release most resources by default without
VL_LEAK_CHECKS, so there is no behaviour change there.
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
Signed-off-by: Artur Bieniek <abieniek@internships.antmicro.com>
Co-authored-by: Krzysztof Bieganski <kbieganski@antmicro.com>
Combined Dfg variable elimination into the regularization pass that runs
before converting back to Ast. This avoids introducing some unnecessary
temporaries.
Added replacing of variables with constants in the Ast if after the
Dfg passes they are known to be constants. This is only done in final
scoped Dfg application.
Avoid introducing temporaries for common sub-expressions that are
cheaper to re-compute than store in a temporary variable.
Enable removal of redundant unpacked array variables.
Also fixes#6394 as this patch involved changes to that code.
Do not apply V3Const in V3Expand after a function if nothing was
expanded in it.
Also fix statistics counter while at it.
Inspired by #6379, follow up from #6111
1. Move class V3ParseGrammar into V3ParseGrammar.h so editors understand
it as c++ code
2. Fix #line directives in the bison output file
This together enables us to gdb through V3ParseGrammar, verilog.y, and
the bison generated C code step by step, with all source annotations in
the debug info pointing to the right place (e.g.: you will step to the
right place in verilog.y, then step back to the bison generated switch
statement/loop, and then step into calls in V3ParseGrammar as kind of
expected.
These are all genuine bugs, brief descriptions.
1. V3OrderCFuncEmitter.h used to delete a node early that was still
reference in a graph dump later. Not a big deal, it can be deleted
later at the end of V3Order.
2. V3Param.cpp: this one is tricky. The variable referenced by
AstVarXRef was deleted at the end of `visit(AstGenCase*)`, but then
`visit(AstVarXRef*)` checks `nodep->varp()` (already deleted) to see
if it's in an interface.
3. V3String::wildMatch is sometimes called with an empty 's' (the string
we are matching against tha pattern 'p'), in which case it used to go
off into the woods. Added check on call. An arbitrary number of `*`
will still match the empty string.
4. V3Task.cpp: There was an error reported for an unsupported construct,
then a subsequent SEGV. Just signal the error upward so we bail on an
error in a more graceful way.
5. verylog.y: Some unsupported constructs failed to set the parsed node,
so some memory thrash made it into some code downstream. Just parse
these into nullptr.
Also increased the timeout on one test, which sometimes tripped with
asan on GCC during heavy host load.
Added a mini type system for Dfg using DfgDataType to replace Dfg's use
of AstNodeDType. This is much more restricted and represents only the
types Dfg can handle in a canonical form. This will be needed when
adding more support for unpacked arrays and maybe unpacked structs one
day.
Also added an internal type checker for DfgGraphs which encodes all the
assumptions the code makes about type relationships in the graph. Run
this in a few places with --debug-check. Fix resulting fallout.
Reduce set of synthesized logic to be more in-line with what Dfg used to
handle before + drivers of circular variables. This was always the
intention but the previous algorithm was both a bit too eager, and also
missed some circular variables. We can add back more heuristics based on
performance measurements for non-circular logic later.
Parts of this algorithm were distributed over many files some
masquerading as re-usable APIs, they were not. Move everything into one
file and avoid unnecessary virtual functions.
- Remove _ENUM_END, so -Wswitch does not demand it's covered. Use the
new NUM_TYPES constexpr member instead.
- Remove 'at' prefix. This seems historical and is not particularly useful.
- Fix some cppcheck warts while at it
Add DfgUserMap as a handle around the one pointer worth of algorithm
specific 'user' storage in each DfgVertex. This reduces verbosity,
improves type safety and correctness. Also enables us to remove one
pointer from DfgVertex to reduce memory use. No functional change.
Large scale refactoring to simplify some of the more obtuse internals of
DFG. Remove multiple redundant internal APIs, simplify representation of
variables, fix potential unsoundness in circular decomposition. No
functional change intended.
* Fix broken support of unassigned virtual interfaces
Unassigned virtual interface support added by #6245 is broken - PR marks
dead module as alive - we can't do that as once a module is dead it
needs to remain dead because earlier steps (e.g. port resolution) have
already been skipped.
This commit handles unassigned virtual interfaces at the beginning of
first pass of LinkDot (so it is never marked as dead, and no linking
steps are getting skipped).
Fixes#6253.
* Apply suggestions from code review
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Apply 'make format'
* Revert add of redundant iterateChildren() call
* Add original test case
---------
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
Co-authored-by: github action <action@example.com>
The previous algorithm was designed to handle the general case where a
full control flow path predicate is required to select which value to
use when synthesizing control flow join point in an always block.
Here we add a better algorithm that tries to use the predicate of
the closest dominating branch if the branch paths dominate the joining
paths. This is almost universally true in synthesizable logic (RTLMeter
has no exceptions), however there are cases where this is not
applicable, for which we fall back on the previous generic algorithm.
Overall this significantly simplifies the synthesized Dfg graphs and
enables further optimization.
There were a couple corner case bugs in V3Inline, and one in Dfg when
dealing with inlining of modules/variables.
V3Inline:
- Invalid code generated when inlining an input that also had an
assignment to it (Throws an ASSIGNIN, but this is sometimes reasonable
to do, e.g. hiererchical reference to an unonnected input port)
- Inlining (aliasing) publicly writeable input port.
- Inlining forcable port connected to constant.
Dfg:
- Inining publicly writeable variables
The tests that cover these are the same and fixing one will trigger the
other bug, so fixing them all in one go. Also cleanup V3Inline to be less
out of order and rely less on unique APIs only used by V3Inine (will
remove those in follow up patch).
Small step towards #6280.
Rewrite with much less running around in the templates. Use private
methods only + friend functions that do the actual type check. This
avoids cppcheck warnings.
Added cppcheck-suppressions.txt in the repo root. You can add new
patterns in there instead of having to parse the XML output.
Also configure to add the -D__GNUC__ preprocessor macro, which makes it
understand UASSERT (it understands the 'noreturn' function attribute).
Added some case by case specific suppressions and fixed up other code,
especially in V3Ast*h and V3Dfg*.h, including code generated by astgen
that had some no-ops that irks cppcheck.
One thing it does not seem to like is `const` class members with default
initializers in the class. It will assume that's always the value, even
if overridden in the constructor. We had few so removed them.
With that a lot of files in `src/` are now clean or only have a handful
of issues. Therefore, I have also deleted cppcheck_filtered, and made it
produce human readable output straight to the terminal.
Regarding cleaning up the reported nits, I kind of got bored after
V3[A-E] so pausing here. Apologies for the merge conflicts.
Tested with cppcheck 2.13.0
This patch adds DfgLogic, which is a vertex that represents a whole,
arbitrarily complex combinational AstAlways or AstAssignW in the
DfgGraph.
Implementing this requires computing the variables live at entry to the
AstAlways (variables read by the block), so there is a new
ControlFlowGraph data structure and a classical data-flow analysis based
live variable analysis to do that at the variable level (as opposed to
bit/element level).
The actual CFG construction and live variable analysis is best effort,
and might fail for currently unhandled constructs or data types. This
can be extended later.
V3DfgAstToDfg is changed to convert the Ast into an initial DfgGraph
containing only DfgLogic, DfgVertexSplice and DfgVertexVar vertices.
The DfgLogic are then subsequently synthesized into primitive operations
by the new V3DfgSynthesize pass, which is a combination of the old
V3DfgAstToDfg conversion and new code to handle AstAlways blocks with
complex flow control.
V3DfgSynthesize by default will synthesize roughly the same constructs
as V3DfgAstToDfg used to handle before, plus any logic that is part of a
combinational cycle within the DfgGraph. This enables breaking up these
cycles, for which there are extensions to V3DfgBreakCycles in this patch
as well. V3DfgSynthesize will then delete all non synthesized or non
synthesizable DfgLogic vertices and the rest of the Dfg pipeline is
identical, with minor changes to adjust for the changed representation.
Because with this change we can now eliminate many more UNOPTFLAT, DFG
has been disabled in all the tests that specifically target testing the
scheduling and reporting of circular combinational logic.
Rewrote V3LifePost to not depend on having AstAssignPre and
AstAssignPost types, but work with generic AstNodeAssign. There is an
extra flag in AstVarScope to denote it's part of an NBA and should be
considered.
Step towards #6280.
NBAs targeting a variable in a different scope are now allocated
temporary variables for captured values in the scope of the NBA, not the
scope of the target variable.
Fixes#6286
Both V3DfgBreakCycles.cpp and V3DfgDecomposition.cpp used to contain an
implementation of the same algorithm to color strongly connected
components. Now there is only one, and it lives in V3DfgColorSCCs.cpp.
After making a cyclic DFG component acyclic, merge that component back
into the acyclic sub-graph of the input graph. This enables optimizing
through the components that were made acyclic in their larger context.
Store all flags in a DfgVertexVar relating to the underlying
AstVar/AstVarScope stored via AstNode::user1(). user2/user3/user4 are
then usable by DFG algorithms as needed.
When we had a `{a, b} = ...`, and the DFG conversion of `a = ...`
succeeded, but `b = ...` failed, we still used to include `a = ...` in
the DFG, which then caused a spurious multi-driver error for `a` on
a subsequent DFG pass, as the original `{a, b} = ...` was still present
in the Ast, but we also had the extra `a = ...` from converting out of
DFG on the previous pass.
In this patch we only convert assignments with a concatenation on the
LHS, if all target LValues can be converted into DFG.
This is the proper fix for #4231
Introduce V3OutStream as a V3OutFormatter that writes to a stream
instead of a file. This can be used to emit formatted code fragments
e.g. in debug prints and graph dumps.
Limits columns to 16M columns, or may report wrong column on such long lines.
Limits single token to 64K lines, or may report wrong line for that token.