After conversion of Ast to Dfg, but before synthesizing AstAlways into
primitives, run a pass to remove variables that are not observable, and
all logic that only computes such variables. This can get rid of a lot
of content early so we don't build redundant Dfgs, and also enables
synthesizing always blocks that use temporaries only in some branches,
which will come in a follow up.
This patch adds IEEE-1800 compliant scheduling support for the Inactive
scheduling region used for #0 delays.
Implementing this requires that **all** IEEE-1800 active region events
are placed in the internal 'act' section. This has simulation
performance implications. It prevents some optimizations (e.g.
V3LifePost), which reduces single threaded performance. It also reduces
the available work and parallelism in the internal 'nba' section, which
reduced the effectiveness of multi-threading severely.
Performance impact on RTLMeter when using scheduling adjusted to support
proper #0 delays is ~10-20% slowdown in single-threaded mode, and ~100%
(2x slower) with --threads 4.
To avoid paying this performance penalty unconditionally, the scheduling
is only adjusted if either:
1. The input contains a statically known #0 delay
2. The input contains a variable #x delay unknown at compile time
If no #0 is present, but #x variable delays are, a ZERODLY warning is
issued advising the use of '--no-sched-zero-delay' which is a promise
by the user that none of the variable delays will evaluate to a zero
delay at run-time. This warning is turned off if '--sched-zero-delay'
is explicitly given. This is similar to the '--timing' option.
If '--no-sched-zero-delay' was used at compile time, then executing
a zero delay will fail at runtime.
A ZERODLY warning is also issued if a static #0 if found, but the user
specified '--no-sched-zero-delay'. In this case the scheduling is not
adjusted to support #0, so executing it will fail at runtime. Presumably
the user knows it won't be executed.
The intended behaviour with all this is the following:
No #0, no #var in the design (#constant is OK)
-> Same as current behaviour, scheduling not adjusted,
same code generated as before
Has static #0 and '--no-sched-zero-delay' is NOT given:
-> No warnings, scheduling adjusted so it just works, runs slow
Has static #0 and '--no-sched-zero-delay' is given:
-> ZERODLY on the #0, scheduling not adjusted, fails at runtime if hit
No static #0, but has #var and no option is given:
-> ZERODLY on the #var advising use of '--no-sched-zero-delay' or
'--sched-zero-delay' (similar to '--timing'), scheduling adjusted
assuming it can be a zero delay and it just works
No static #0, but has #var and '--no-sched-zero-delay' is given:
-> No warning, scheduling not adjusted, fails at runtime if zero delay
No static #0, but has #var and '--sched-zero-delay' is given:
-> No warning, scheduling adjusted so it just works
AstCAwait is only ever uses in statement position, so model it as a
statement. We should never ever have a coroutine that returns a value.
There is no need for it in SV, nor should we rely on it for internals.
Also reworks the fix for V3Life incorrectly constant propagating the
beforeTrig functions (#7072). The property that upsets V3Life is that
a function:
1. Is called from multiple static call sites (multiple AstCCall)
2. Reads model state directly (AstVarRef to non-locals/arguments)
Such function can only be created internally after scheduling (V3Task
throws an unsupported error on a non-inlined function that reads model
state), so added a flag to AstCFunc to mark the dangerous ones for
V3Life.
Many rules in the Dfg Peephole pass check if a node has more than one
sinks. Redundant variables that will ultimately be removed can prevent
these from matching. Remove such variables during the Peeophole pass
itself to enable more matches.