Instead of using the number of processors in the host, use the number of
processors available to the process, respecting cpu affinity
assignments. Without pthreads, fall back and use the number of
processors in the host as before.
This is now applied everywhere, so running `numactl -C 0-3 verilator` or
`numactl -C 0-3 Vsim` should behave as if the host has 4 cores (e.g.
as in CI jobs).
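
As a rough illustration of the affinity-aware count described above, here is a minimal sketch. The function name `availableProcessors` is hypothetical and not taken from the patch, and the sketch uses the Linux `sched_getaffinity` call rather than the pthread-based query the message refers to. On other platforms, or if the query fails, it falls back to the host processor count.

```cpp
#include <thread>
#if defined(__linux__)
#include <sched.h>
#endif

// Hypothetical helper (name is an assumption, not the patch's API):
// returns the number of processors the calling thread may run on.
unsigned availableProcessors() {
#if defined(__linux__)
    cpu_set_t mask;
    CPU_ZERO(&mask);
    // pid 0 selects the calling thread; the mask reflects taskset/numactl limits
    if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
        const int n = CPU_COUNT(&mask);
        if (n > 0) return static_cast<unsigned>(n);
    }
#endif
    // Fallback: number of processors in the host (may report 0 if unknown)
    const unsigned n = std::thread::hardware_concurrency();
    return n ? n : 1;
}
```

Under `numactl -C 0-3`, a function like this would be expected to return 4 on Linux, regardless of how many cores the host has.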
Accessing the ports of hier_block instances directly under the current
hier_block (or top level) works just fine (the hier stub .sv has them),
and this can simplify hooking up dotted references into hier blocks:
push part of the reference under the hier block into the hier block, and
wire it to a port, then resolve the rest of the reference to the port of
the instance.
The only use for the clocker attribute and the AstVar::isUsedClock that
is actually necessary today for correctness is to mark top level inputs
of --lib-create blocks as being (or driving) a clock signal. Correctness
of --lib-create (and hence hierarchical blocks) actually used to depend
on having the right optimizations eliminate intermediate clocks (e.g.:
V3Gate), when the top level port was not used directly in a sensitivity
list, or marking top level signals manually via --clk or the clocker
attribute. However V3Sched::partition already needs to trace through the
logic to figure out what signals might drive a sensitivity list, so it
can very easily mark all top level inputs as such.
In this patch we remove the AstVar::attrClocker and AstVar::isUsedClock
attributes, and replace them with AstVar::isPrimaryClock, automatically
set by V3Sched::partition. This eliminates all need for manual
annotation so we are deprecating the --clk/--no-clk options and the
clocker/no_clocker attributes.
This also eliminates the opportunity for any further mis-optimization
similar to #6453.
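
The idea of tracing back from sensitivity lists to top level inputs can be sketched as a simple backward reachability walk. This is only an illustration under assumed data structures: the `Signal` struct and `markPrimaryClocks` function are hypothetical and do not reflect how V3Sched::partition is actually implemented.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified signal model (not Verilator's AstVar)
struct Signal {
    bool isTopInput = false;           // top level input port
    bool inSensitivityList = false;    // used directly in a sensitivity list
    std::vector<std::size_t> drivers;  // indices of signals this one is computed from
    bool isPrimaryClock = false;       // result: top input that may drive a clock
};

// Walk backward from every signal used in a sensitivity list and mark
// each top level input reached along the way.
void markPrimaryClocks(std::vector<Signal>& sigs) {
    std::vector<bool> visited(sigs.size(), false);
    std::vector<std::size_t> work;
    for (std::size_t i = 0; i < sigs.size(); ++i) {
        if (sigs[i].inSensitivityList) {
            visited[i] = true;
            work.push_back(i);
        }
    }
    while (!work.empty()) {
        const std::size_t i = work.back();
        work.pop_back();
        if (sigs[i].isTopInput) sigs[i].isPrimaryClock = true;
        for (const std::size_t d : sigs[i].drivers) {
            if (!visited[d]) {
                visited[d] = true;
                work.push_back(d);
            }
        }
    }
}
```

With a gated clock computed from inputs `clk` and `en`, both inputs would be marked in this sketch, while an input that only feeds data logic would not.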
Regarding the other uses of the removed AstVar attributes:
- As of 5.000, initial edges are triggered via a separate mechanism
applied in V3Sched, so the use in V3EmitCFunc.cpp is redundant
- Also as of 5.000, we can handle arbitrary sensitivity expressions, so
the restriction on eliminating clock signals in V3Gate is unnecessary
- Since the recent change that applies Dfg after V3Scope, Dfg performs
the equivalent of GateClkDecomp, so we can delete that pass.
The deprecated options and attributes are no longer required for correct
scheduling. They are still accepted for backward compatibility, but have
no effect on simulation and are dropped in the front-end. Also removed
the now redundant AstAlwaysPublic class.
Fixes #6442
This patch adds DfgLogic, which is a vertex that represents a whole,
arbitrarily complex combinational AstAlways or AstAssignW in the
DfgGraph.
Implementing this requires computing the variables live at entry to the
AstAlways (variables read by the block), so there is a new
ControlFlowGraph data structure and a classical data-flow analysis based
live variable analysis to do that at the variable level (as opposed to
bit/element level).
The actual CFG construction and live variable analysis is best effort,
and might fail for currently unhandled constructs or data types. This
can be extended later.
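
For readers unfamiliar with the technique, the following is a minimal sketch of classical backward live variable analysis at the variable level. The `BasicBlock` type and the `liveIn` and `exampleCfg` functions are hypothetical illustrations, not the ControlFlowGraph API added by this patch. The example models `if (sel) y = a; z = y & b;`, where `y` is live at entry because one path reads it without assigning it first.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical basic block summary (not the patch's data structure)
struct BasicBlock {
    std::set<std::string> use;       // variables read before any write in the block
    std::set<std::string> def;       // variables written in the block
    std::vector<std::size_t> succs;  // indices of successor blocks
};

// Iterate in[b] = use[b] | (out[b] - def[b]), where out[b] is the union
// of in[s] over all successors s, until a fixed point is reached.
std::vector<std::set<std::string>> liveIn(const std::vector<BasicBlock>& cfg) {
    std::vector<std::set<std::string>> in(cfg.size());
    bool changed = true;
    while (changed) {
        changed = false;
        // Visiting blocks in reverse order tends to converge faster
        for (std::size_t i = cfg.size(); i-- > 0;) {
            const BasicBlock& bb = cfg[i];
            std::set<std::string> out;
            for (const std::size_t s : bb.succs) {
                assert(s < cfg.size());
                out.insert(in[s].begin(), in[s].end());
            }
            std::set<std::string> newIn = bb.use;
            for (const std::string& v : out) {
                if (!bb.def.count(v)) newIn.insert(v);
            }
            if (newIn != in[i]) {
                in[i] = std::move(newIn);
                changed = true;
            }
        }
    }
    return in;
}

// Example CFG for: if (sel) y = a; z = y & b;
std::vector<BasicBlock> exampleCfg() {
    return {
        {{"sel"}, {}, {1, 2}},    // block 0: branch on sel
        {{"a"}, {"y"}, {2}},      // block 1: y = a
        {{"y", "b"}, {"z"}, {}},  // block 2: z = y & b
    };
}
```

Here the set computed for block 0 is the set of variables live at entry to the whole block, which is what the DfgLogic vertex needs as its inputs.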
V3DfgAstToDfg is changed to convert the Ast into an initial DfgGraph
containing only DfgLogic, DfgVertexSplice and DfgVertexVar vertices.
The DfgLogic are then subsequently synthesized into primitive operations
by the new V3DfgSynthesize pass, which is a combination of the old
V3DfgAstToDfg conversion and new code to handle AstAlways blocks with
complex flow control.
V3DfgSynthesize by default will synthesize roughly the same constructs
as V3DfgAstToDfg used to handle before, plus any logic that is part of a
combinational cycle within the DfgGraph. This enables breaking up these
cycles, for which there are extensions to V3DfgBreakCycles in this patch
as well. V3DfgSynthesize will then delete all non-synthesized or
non-synthesizable DfgLogic vertices, and the rest of the Dfg pipeline is
identical, with minor changes to adjust for the changed representation.
Because this change lets us eliminate many more UNOPTFLAT warnings, Dfg
has been disabled in all the tests that specifically target the
scheduling and reporting of circular combinational logic.