This patch adds IEEE-1800 compliant scheduling support for the Inactive
scheduling region used for #0 delays.
Implementing this requires that **all** IEEE-1800 active region events
are placed in the internal 'act' section. This has simulation
performance implications. It prevents some optimizations (e.g.
V3LifePost), which reduces single threaded performance. It also reduces
the available work and parallelism in the internal 'nba' section, which
reduced the effectiveness of multi-threading severely.
Performance impact on RTLMeter when using scheduling adjusted to support
proper #0 delays is ~10-20% slowdown in single-threaded mode, and ~100%
(2x slower) with --threads 4.
To avoid paying this performance penalty unconditionally, the scheduling
is only adjusted if either:
1. The input contains a statically known #0 delay
2. The input contains a variable #x delay unknown at compile time
If no #0 is present, but #x variable delays are, a ZERODLY warning is
issued advising the use of '--no-runtime-zero-delay' which is a promise
by the user that none of the variable delays will evaluate to a zero
delay at run-time. This warning is turned off if '--runtime-zero-delay'
is explicitly given. This is similar to the '--timing' option.
If '--no-runtime-zero-delay' was used at compile time, then executing
a zero delay will fail at runtime.
A ZERODLY warning is also issued if a static #0 if found, but the user
specified '--no-runtime-zero-delay'. In this case the scheduling is not
adjusted to support #0, so executing it will fail at runtime. Presumably
the user knows it won't be executed.
The intended behaviour with all this is the following:
No #0, no #var in the design (#constant is OK)
-> Same as current behaviour, scheduling not adjusted,
same code generated as before
Has static #0 and '--no-runtime-zero-delay' is NOT given:
-> No warnings, scheduling adjusted so it just works, runs slow
Has static #0 and '--no-runtime-zero-delay' is given:
-> ZERODLY on the #0, scheduling not adjusted, fails at runtime if hit
No static #0, but has #var and no option is given:
-> ZERODLY on the #var advising use of '--no-runtime-zero-delay' or
'--runtime-zero-delay' (similar to '--timing'), scheduling adjusted
assuming it can be a zero delay and it just works
No static #0, but has #var and '--no-runtime-zero-delay' is given:
-> No warning, scheduling not adjusted, fails at runtime if zero delay
No static #0, but has #var and '--runtime-zero-delay' is given:
-> No warning, scheduling adjusted so it just works
This is an attempt to generate an identical trace file scope hierarchy
both with and without -fno-inline. Primarily because it's needed for
testing in upcoming patch, but also improves consitency prior to #7001
This is primarily cleanup, but there are 2 functional changes included:
- It used to accidentally reorder bodies of AstNodeIf that were outside
an AstAlways. Now it will not touch anything outside an AstAlways.
- Removed one redundant edge from the graph which perturbs the result of
V3Graph::acyclic. This should make no difference for the actual
intended result of reordering NBAs to eliminate shadow variables.
Add a new Dfg pass 'pushDownSel'. This will try to move selects through
a tree of concatenations in order to eliminate temporary nodes holding
intermediate concatenation results. This can get rid of a lot of
variables when packed arrays are assigned in parts (e.g. bit-wise).
Use uint32_t max value instead of zero as sentinel value for a trace
code being unassigned. Prep for follow on patch.
Note the actual trace file will still start codes from one, the codes
in the model are just an offset from the base code.