Instead of carrying around MTask affinity from scheduling, compute it in
V3VariableOrder (where it is used), by tracing through the code. This
simplifies some code and has the benefit of handling variables
introduced after scheduling. It's worth a few % speed at run-time, and
the new implementation of V3VariableOrder is slightly more efficient,
though the speed/space is still dominated by the TSP sort.
Continuing the idea of decoupling the implementations of the various algorithms.
The main points:
-Move the former "processDomain" stuff, dealing with assigning combinational logic into the relevant sensitivity domains into V3OrderProcessDomains.cpp
-Move the parallel code construction in V3OrderParallel.cpp (Could combine this with some parts of V3Partition - those not called from V3Partition::finalize - but that's not for this patch).
-Move the serial code construction into V3OrderSerial.cpp
-Factored the very small common code between the parallel and serial code construction (processMoveOneLogic) into V3OrderCFuncEmitter.cpp
This saves about 5% memory. V3AstUserAllocator is appropriate for most use
cases, performance is marginally up as we are mostly D-cache bound on
large designs.
This patch adds some abstract enums to pass to the trace decl* APIs, so
the VCD/FST specific code can be kept in verilated_{vcd,fst}_*.cc, and
removed from V3Emit*. It also reworks the generation of the trace init
functions (those that call 'decl*' for the signals) such that the scope
hierarchy is traversed precisely once during initialization, which
simplifies the FST writer. This later change also has the side effect of
fixing tracing of nested interfaces when traced via an interface
reference - see the change in the expected t_interface_ref_trace - which
previously were missed.
Some values emitted to the trace files are constant (e.g.: actual
parameter values), and never change. Previously we used to trace these
in the 'full' dumps, which also included all other truly variable
signals. This patch introduces a new generated trace function 'const',
to complement the 'full' and 'chg' flavour, and 'const' now only
contains the constant signals, while 'full' and 'chg' contain only the
truly variable signals. The generate 'full' and 'chg' trace functions
now have exactly the same shape. Note that 'const' signals are still
traced using the 'full*' dump methods of the trace buffers, so there is
no need for a third flavour of those.
`getCommonClassTypep` and its helper code has been moved to AstNode
class. This is a lot better place for this functionality. Moreover, it
allowed to get rid of the dependency on V3Width from generic AST-related
code.