There was exactly one place in V3Task, in the handling of DPI arguments,
where we relied on cleanOut of AstCExpr being false for masking. That code
now does the relevant masking via a few new run-time functions, which also
eliminates some special cases in the relevant V3Task functions.
This patch implements #6480. All loop statements are represented using
AstLoop and AstLoopTest.
This necessitates rework of the loop unroller to handle loops of
arbitrary form. To enable this, I have split the old unroller used for
'generate for' statements and moved it into V3Param, and subsequently
rewrote V3Unroll to handle the new representation. V3Unroll can now
unroll more complex loops, including those whose loop conditions contain
multiple variable references or inlined functions.
Handling the more generic code also requires some restrictions. If a
loop contains any of the following, it cannot be unrolled:
- A timing control that might suspend the loop
- A non-inlined call to a non-pure function
These constructs can change the values of variables in the loop, so are
generally not safe to unroll if they are present. (We could still unroll
if all the variables needed for unrolling are automatic, however we
don't do that right now.)
These restrictions seem ok in the benchmark suite, where the new
unroller can generally unroll many more loops than before.
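
As an illustration of why a non-inlined call to a non-pure function blocks
unrolling, here is a minimal C++ analogy (not Verilator code; g_limit,
impureStep and runLoop are hypothetical names). The call modifies a variable
that the loop condition reads, so the trip count cannot be derived from the
loop header alone:

```cpp
// Hypothetical analogy, not taken from Verilator sources.
static int g_limit = 0;  // Static (non-automatic) variable read by the loop condition

// Non-pure function: has a side effect on g_limit.
void impureStep() { --g_limit; }

// Runs the loop and returns how many iterations actually executed.
int runLoop(int limit) {
    g_limit = limit;
    int iterations = 0;
    for (int i = 0; i < g_limit; ++i) {
        impureStep();  // Changes the bound while the loop is running
        ++iterations;
    }
    return iterations;
}
```

An unroller that assumed the bound stays at its initial value would emit too
many copies of the body here: runLoop(4) executes only 2 iterations.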
Internals: Refactor generate construct Ast handling (#6280)
We introduce AstNodeGen, the common base class of AstGenBlock,
AstGenCase, AstGenFor, and AstGenIf, which together represent all SV
generate constructs. Subsequently remove AstNodeFor, AstNodeCase
(AstCase is now directly derived from AstNodeStmt) and adjust internals
to work on the new representation.
Output is identical modulo hashes due to changed AstNode type ids, no
functional change intended.
Step towards #6280.
There were a couple of corner-case bugs in V3Inline, and one in Dfg, when
dealing with inlining of modules/variables.
V3Inline:
- Invalid code generated when inlining an input that also had an
assignment to it (Throws an ASSIGNIN, but this is sometimes reasonable
to do, e.g. hierarchical reference to an unconnected input port)
- Inlining (aliasing) publicly writeable input port.
- Inlining forceable port connected to constant.
Dfg:
- Inlining publicly writeable variables
The tests that cover these are the same and fixing one will trigger the
other bug, so this fixes them all in one go. Also clean up V3Inline to be
less out of order and rely less on unique APIs only used by V3Inline (will
remove those in a follow-up patch).
Small step towards #6280.
Having many triggers still hits a bottleneck in LLVM leading to long
compile times.
Instead of setting triggers bit-wise, set them as a whole 64-bit word
when possible. This improves C++ compile times by ~4x on some large
designs and has a minor run-time performance benefit.
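
The difference between bit-wise and word-wise trigger updates can be sketched
as follows. This is a simplified model, not the generated code itself:
TriggerVec, setBit, setWord and packTriggers are hypothetical names, and the
layout of 64 triggers per word is an assumption made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical trigger vector: 64 trigger flags packed into each word.
template <std::size_t N_WORDS>
struct TriggerVec {
    std::array<std::uint64_t, N_WORDS> words{};

    // Bit-wise update: one read-modify-write per trigger.
    void setBit(std::size_t index, bool value) {
        const std::uint64_t mask = std::uint64_t{1} << (index % 64);
        std::uint64_t& word = words[index / 64];
        word = value ? (word | mask) : (word & ~mask);
    }

    // Word-wise update: all 64 triggers of one word in a single store.
    void setWord(std::size_t wordIndex, std::uint64_t value) { words[wordIndex] = value; }

    // Read back a single trigger.
    bool test(std::size_t index) const {
        return ((words[index / 64] >> (index % 64)) & 1U) != 0;
    }
};

// Pack up to 64 already-evaluated trigger conditions into one word.
inline std::uint64_t packTriggers(const bool* conds, std::size_t count) {
    std::uint64_t word = 0;
    for (std::size_t i = 0; i < count && i < 64; ++i) {
        word |= static_cast<std::uint64_t>(conds[i]) << i;
    }
    return word;
}
```

Both paths produce the same trigger state. The word-wise form gives the C++
compiler one expression and one store per 64 triggers rather than 64 separate
read-modify-write sequences.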
Continuing the idea of decoupling the implementations of the various algorithms.
The main points:
- Move the former "processDomain" stuff, dealing with assigning combinational logic into the relevant sensitivity domains, into V3OrderProcessDomains.cpp
- Move the parallel code construction into V3OrderParallel.cpp (Could combine this with some parts of V3Partition - those not called from V3Partition::finalize - but that's not for this patch).
- Move the serial code construction into V3OrderSerial.cpp
- Factor the very small common code between the parallel and serial code construction (processMoveOneLogic) into V3OrderCFuncEmitter.cpp
Again --prof-exec has bit-rotted a little with all the recent changes
to the structure of the generated code. This patch contains a few
improvements:
- Replace the eval/eval_loop begin/end events with generic
  section_push/section_pop events, that can be arbitrarily sprinkled
  into the generated code (so long as they are matched correctly) to
measure various sections. The report then contains a nested profile
of the sections, and the VCD trace shows the section names.
- Better handling of exec graphs
- Clearer overall statistics
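
How matched section_push/section_pop events can yield a nested profile may be
sketched like this. It is a simplified model under stated assumptions, not the
actual report code: SectionEvent, SectionTime and nestedProfile are
hypothetical names, and the event record layout is invented for illustration.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event record; field names are illustrative only.
struct SectionEvent {
    bool push;           // true: section_push, false: section_pop
    std::string name;    // Section name (unused for pops)
    std::uint64_t tick;  // Time stamp of the event
};

// One entry of the nested profile.
struct SectionTime {
    std::string path;     // Nested path such as "eval/loop"
    std::uint64_t ticks;  // Time spent between the push and its matching pop
};

// Builds a nested profile from push/pop events using a stack.
// Returns an empty vector if the events are not correctly matched.
inline std::vector<SectionTime> nestedProfile(const std::vector<SectionEvent>& events) {
    std::vector<std::pair<std::string, std::uint64_t>> stack;  // (path, start tick)
    std::vector<SectionTime> result;
    for (const SectionEvent& ev : events) {
        if (ev.push) {
            const std::string path
                = stack.empty() ? ev.name : stack.back().first + "/" + ev.name;
            stack.emplace_back(path, ev.tick);
        } else {
            if (stack.empty()) return {};  // Pop without a matching push
            result.push_back({stack.back().first, ev.tick - stack.back().second});
            stack.pop_back();
        }
    }
    if (!stack.empty()) return {};  // Push without a matching pop
    return result;
}
```

Sections are reported in the order they close, so an inner section appears
before the section that encloses it. The stack is also what enforces the
"matched correctly" requirement: an unbalanced event stream is rejected.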