- Allow reordering pure statements with DPI import calls iff no public
variables (including those read via a DPI export) are involved. This
ensures the DPI import can't observe the reordering
- Allow reordering of pure statements with AstDisplay and AstStop. This
requires an assumption that AstDisplay and AstStop will not read or
write model state other than via a VarRef explicitly present int the
Ast.
Overall this allows eliminating a lot of conditionals around assertions,
which were previously not possible.
Introduce new pass that converts impure expressions, or those with
function and method calls into simple assignment statements. Please see
the blurb at the top of the file why this is useful and how it works.
In particular currently it enables more Dfg optimization as functions
will be inlined without AstExprStmt.
Ideally we should enforce this lowering is applied to every procedural
statement (there are still a handful of exceptions). With that, long
term with this pass + #6820, there should be no need to ever use an
AstExprStmt past this new lowering pass, which should enable more easier
optimization down the line.
Also ideally this should be run earlier. Currently it's after V3Tristate
as that calls pinReconnectSimple so we don't have to touch Cell ports.
Currently disabled when code coverage is enabled due to #7119.
The recent V3InlineCFuncs only checks AstCFunc::varsp for locals, but
V3Reloop used to insert them into AstCFunc::stmtsp resulting in multiple
locals with the same name being inlined into the caller if the stars
align. Fix Reloop. Such things will also go away with #6280.
Rename AstSelLoopVars to AstForeachHeader, and make it a non-NodeExpr.
Tweak parser to always create an AstForeachHeader, so no need to fix it
up later.
After conversion of Ast to Dfg, but before synthesizing AstAlways into
primitives, run a pass to remove variables that are not observable, and
all logic that only computes such variables. This can get rid of a lot
of content early so we don't build redundant Dfgs, and also enables
synthesizing always blocks that use temporaries only in some branches,
which will come in a follow up.
This patch adds IEEE-1800 compliant scheduling support for the Inactive
scheduling region used for #0 delays.
Implementing this requires that **all** IEEE-1800 active region events
are placed in the internal 'act' section. This has simulation
performance implications. It prevents some optimizations (e.g.
V3LifePost), which reduces single threaded performance. It also reduces
the available work and parallelism in the internal 'nba' section, which
reduced the effectiveness of multi-threading severely.
Performance impact on RTLMeter when using scheduling adjusted to support
proper #0 delays is ~10-20% slowdown in single-threaded mode, and ~100%
(2x slower) with --threads 4.
To avoid paying this performance penalty unconditionally, the scheduling
is only adjusted if either:
1. The input contains a statically known #0 delay
2. The input contains a variable #x delay unknown at compile time
If no #0 is present, but #x variable delays are, a ZERODLY warning is
issued advising the use of '--no-sched-zero-delay' which is a promise
by the user that none of the variable delays will evaluate to a zero
delay at run-time. This warning is turned off if '--sched-zero-delay'
is explicitly given. This is similar to the '--timing' option.
If '--no-sched-zero-delay' was used at compile time, then executing
a zero delay will fail at runtime.
A ZERODLY warning is also issued if a static #0 if found, but the user
specified '--no-sched-zero-delay'. In this case the scheduling is not
adjusted to support #0, so executing it will fail at runtime. Presumably
the user knows it won't be executed.
The intended behaviour with all this is the following:
No #0, no #var in the design (#constant is OK)
-> Same as current behaviour, scheduling not adjusted,
same code generated as before
Has static #0 and '--no-sched-zero-delay' is NOT given:
-> No warnings, scheduling adjusted so it just works, runs slow
Has static #0 and '--no-sched-zero-delay' is given:
-> ZERODLY on the #0, scheduling not adjusted, fails at runtime if hit
No static #0, but has #var and no option is given:
-> ZERODLY on the #var advising use of '--no-sched-zero-delay' or
'--sched-zero-delay' (similar to '--timing'), scheduling adjusted
assuming it can be a zero delay and it just works
No static #0, but has #var and '--no-sched-zero-delay' is given:
-> No warning, scheduling not adjusted, fails at runtime if zero delay
No static #0, but has #var and '--sched-zero-delay' is given:
-> No warning, scheduling adjusted so it just works
AstCAwait is only ever uses in statement position, so model it as a
statement. We should never ever have a coroutine that returns a value.
There is no need for it in SV, nor should we rely on it for internals.
Also reworks the fix for V3Life incorrectly constant propagating the
beforeTrig functions (#7072). The property that upsets V3Life is that
a function:
1. Is called from multiple static call sites (multiple AstCCall)
2. Reads model state directly (AstVarRef to non-locals/arguments)
Such function can only be created internally after scheduling (V3Task
throws an unsupported error on a non-inlined function that reads model
state), so added a flag to AstCFunc to mark the dangerous ones for
V3Life.
Many rules in the Dfg Peephole pass check if a node has more than one
sinks. Redundant variables that will ultimately be removed can prevent
these from matching. Remove such variables during the Peeophole pass
itself to enable more matches.