Fixes#3872.
Testing this is a bit tricky, as the front-end fixes up the operand
widths in shifts to match, and we need V3Const to introduce a mismatched
one by reducing `4'd2 ** x` (with x being 2 2-bit wide signal) to `4'd1
<< x`, but t_dfg_peephole runs with V3Const disabled exactly because it
makes it hard to write tests. Rather than fixing this one case in
V3Const (which we should do systematically at some point), I fixed DFG
to accept these just in case V3Const generates more of them. The
assertions were there only because of paranoia (as I thought these were
not possible inputs), the code otherwise works.
In order to avoid unexpected breakage on multi-driven variables, we
resolve in DFG construction by using only the first driver encountered.
Also issues the MULTIDRIVEN error for these signals.
Replace the 'run to fixed point' algorithm with a work list driven
approach. Instead of marking the graph as changed, we explicitly add
vertices to the work list, to be visited, when a vertex is changed. This
improves both memory locality (as the work list is processed in last in
first out order), and removed unnecessary visitations when only a few
nodes changes.
Folding an AstLogAnd with a non-zero constant operand used to coerce the
type of the other operand, yielding an ill-typed node that DFG was then
unhappy about. Add a RedOr instead if the width of the replacement
operand is greater than zero.
Fixes#3726
Apart from the representational changes below, this patch renames
AstNodeMath to AstNodeExpr, and AstCMath to AstCExpr.
Now every expression (i.e.: those AstNodes that represent a [possibly
void] value, with value being interpreted in a very general sense) has
AstNodeExpr as a super class. This necessitates the introduction of an
AstStmtExpr, which represents an expression in statement position, e.g :
'foo();' would be represented as AstStmtExpr(AstCCall(foo)). In exchange
we can get rid of isStatement() in AstNodeStmt, which now really always
represent a statement
Peak memory consumption and verilation speed are not measurably changed.
Partial step towards #3420
In V3Active, we try hard to turn `always @(a or b or c)` into an
`always_comb` if the only variables read in the block are also in the
sensitivity list. In addition, also allow this optimization when reading
variables that are not in the sensitivity list, but are known to be
constant/never changing after initialization. In particular lookup
tables introduced by V3Table are covered by this. This can have a
significant impact on designs that use the `always @(a or b or c)` style
for combinational logic.
The cost of an AstCMethodHard right now is generally unknown. However,
VlTriggerVec::at is used a lot in conditions, so we make an effort
to estimate this correctly via 2 changes:
- In general when an AstVarRef appears as the target of an
AstCMethodHard, we cost it as a simple address computation (an add)
- Check for VlTriggerVec::at explicitly when costing AstCMethodHard,
which is essentially a load.
This can have a significant effect when there are a lot of unique
triggers in the design.
In non-static contexts like class objects or stack frames, the use of
global trigger evaluation is not feasible. The concept of dynamic
triggers allows for trigger evaluation in such cases. These triggers are
simply local variables, and coroutines are themselves responsible for
evaluating them. They await the global dynamic trigger scheduler object,
which is responsible for resuming them during the trigger evaluation
step in the 'act' eval region. Once the trigger is set, they await the
dynamic trigger scheduler once again, and then get resumed during the
resumption step in the 'act' eval region.
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
This changeset brings support for accesses like:
class Cls#(type TYPE1);
TYPE1::some_method();
endclass
It is done by delaying dot resolution on type parameters until they get
resolved by V3Param, and doing a more thorough reference skip.
In DFG DfgVertex::width() is only defined for vertices representing
packed values, which DfgVertex::hash() used to violate. The only
non-packed values at the moment are DfgVarArray, which is a
DfgVertexVar, which are handled specially anyway, so this is easy to
fix.
Fixes#3682
In order to not leak signal names with --protect-ids, we simply make the
trigger dump function empty (this is a debug only construct).
Partial fix for #3689
The emitted SystemC types (e.g. sc_bv) are not interchangeable with
Verilator internal C++ types (e.g.: VlWide), so the variables themselves
are not interchangeable (but can be assigned to/from each other). We can
preserve correctness simply be not inlining any SystemC variables (i.e.:
don't simplify any 'sc = nonSc' or 'nonSc = sc' assignments). SystemC
types only appear at top level ports so this should have no significant
impact.
Fixes#3688
Prevents the possibility of assigning an integer to a class reference,
both at the SystemVerilog and the emitted C++ levels.
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
Vertices representing variables (DfgVertexVar) and constants (DfgConst)
are very common (40-50% of all vertices created in some large designs),
and we also need to, or can treat them specially in algorithms. Keep
these as separate lists in DfgGraph for direct access to them. This
improve verilation speed.
Cyclic components are now extracted separately, so there is no
functional reason to have to do a topological sort (previously we used it
to detect cyclic graphs). Removing it to gain some speed.
AstSel is a ternary node, but the 'widthp' is always constant and is
hence redundant, and 'lsbp' is very often constant. As AstSel is fairly
common, we special case as a DfgSel for the constant 'lsbp', and as
'DfgMux` for the non-constant 'lsbp'.
Added a DfgVertex::user() mechanism for storing data in vertices.
Similar in spirit to AstNode user data, but the generation counter is
stored in the DfgGraph the vertex is held under. Use this to cache
DfgVertex::hash results, and also speed up DfgVertex hashing in general.
Use these and additional improvements to speed up CSE.
`V3SchedTiming` currently assumes that if a fork still exists, it must
have statements within it (otherwise it would have been deleted by
`V3Timing`). However, in a case like this:
```
module t;
reg a;
initial fork a = 1; join
endmodule
```
the assignment in the fork is optimized out by `V3Dead` after
`V3Timing`. This leads to `V3SchedTiming` accessing fork's `stmtsp`
pointer, which at this point is null. This patch addresses that issue.
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
Allow constant folding through adjacent nodes of all associative
operations, for example '((a & 2) & 3)' or '(3 & (2 & a))' can now be
folded into '(a & 2)' and '(2 & a)' respectively. Also improve speed of
making associative expression trees right leaning by using rotation of
the existing vertices whenever instead of allocation of new nodes.
Only apply when there is guaranteed to be a subsequent constant folding
and elimination of some of the expression, otherwise this sometimes
interferes with the simplification of concatenations and harms overall
performance.
Before this change, a design verilated with `--timing` that does not
actually use timing features would be emitted with `eventsPending` and
`nextTimeSlot` declared in the top class. However, their definitions
would be missing, leading to linker errors during design compilation.
This patch makes Verilator always emit the definitions, which prevents
linker errors. Trying to use `nextTimeSlot` without delays in the design
will result in an error at runtime.
Also added a testing only -fno-const-before-dfg option, as otherwise
V3Const eats up a lot of the simple inputs. A lot of the things V3Const
swallows in the simple cases can make it to DFG in complex cases, or DFG
itself can create them during optimization. In any case to save
complexity of testing DFG constant folding, we use this option to turn
off V3Const prior to the DFG passes in the relevant test.
Some optimizations are only a net win if they help us remove a graph
node (or at least ensure they don't grow the graph), or yields otherwise
special logic, so try to apply them only in these cases.
Use the same style, and reuse the bulk of astgen to generate DfgVertex
related code. In particular allow for easier definition of custom
DfgVertex sub-types that do not directly correspond to an AstNode
sub-type. Also introduces specific names for the fixed arity vertices.
No functional change intended.
A lot of optimizations in DFG assume a DAG, but the more things are
representable, the more likely it is that a small cyclic sub-graph is
present in an otherwise very large graph that is mostly acyclic. In
order to avoid loosing optimization opportunities, we explicitly extract
the cyclic sub-graphs (which are the strongly connected components +
anything feeing them, up to variable boundaries) and treat them
separately. This enables optimization of the remaining input.