Extend the decoder-pattern case optimization to selectors that are too
wide for a full 2^width lookup table. A decoder-pattern case (where
every case item assigns constants to a fixed set of LHSs) is lowered to
a new AstMachMasked expression. AstMachMasked is emitted as a run-time
VL_MATCHMASKEd_* function call. It contains a packed constant pool table,
'matchp', which is a list of '(mask, bits)' pairs. At runtime, the index of the
first matching entry is returned, and is used to index a value table. This single
(albeit complicated) expression can replace large if-else trees whole, resulting
in much more compact code with fewer static hard to predict branches. It
is worth about 10% speed and 30% code size in some designs.
Example:
```systemverilog
logic [39:0] sel;
always_comb
casez (sel)
40'b???????????????????????????????????????1: out = 8'h01;
40'b??????????????????????????????????????1?: out = 8'h02;
40'b?????????????????????????????????????1??: out = 8'h03;
default: out = 8'hff;
endcase
```
is compiled to:
```c++
out = TABLE_value[VL_MATCHMASKED_Q(sel, CONST_match)];
```
Where 'CONST_match' contains 4 entries, of a 40-bit mask and 40-bit bit
pattern each, and 'TABLE_value' contains 4 entries of the corresponding
8-bit results. (Entries are aligned to word boundaries to avoid runtime
bit swizzling)
Turn `x = $sformatf(...)` into `$sformat(x, ...)`. The former requires
checking and running a destructor for `x` at the call site, the later
does it in the callee VL_SFORMAT. This reduces the size of the call
site, which can be significant e.g. in the presence of many assertions.
Also added a rewrite of `$sformat(x, "const-string")` back into `x =
"const-string"` for the cases where the `$sformatf` would have been
folded into a constant string.
Change WDataInP/WDataOutP to be opaque handles types instead of aliases
to raw pointers. This subsequently eliminates needing an implicit cast
operator in VlWide, which is replaced with implicit constructors of
WDataInP/WDataOutP that can create a handle from a VlWide. This
eliminates some unsafe conversions that the previous implicit cast
operator unintentionally enabled (e.g. #7618). It also eliminates
having to insert ".data()" in various places int he generated code, which
simplifies internals (the only place ".data()" should be needed is in
calls to variadic functions where the expected type of the argument is
not WDataInP/WDataOutP).
The handles otherwise behave like pointers, implementing the minimal
amount of operators required to code the runtime. The handle is still
only a single pointer, and will be passed in registers as before, so
this patch should be performance neutral.
As part of this removed WData, which used to be an alias for EData.
All uses are now either EData*, WDataInP, WDataOutP, or VlWide directly.
Explicitly annotate those AstNodeExpr subclasses that should have a
corresponding DfgVertex subclass generated by astgen. This avoids having
to tweak things in Dfg when adding new AstNode subclasses, which can
then be handled separately.
- Allow reordering pure statements with DPI import calls iff no public
variables (including those read via a DPI export) are involved. This
ensures the DPI import can't observe the reordering
- Allow reordering of pure statements with AstDisplay and AstStop. This
requires an assumption that AstDisplay and AstStop will not read or
write model state other than via a VarRef explicitly present int the
Ast.
Overall this allows eliminating a lot of conditionals around assertions,
which were previously not possible.
Introduce new pass that converts impure expressions, or those with
function and method calls into simple assignment statements. Please see
the blurb at the top of the file why this is useful and how it works.
In particular currently it enables more Dfg optimization as functions
will be inlined without AstExprStmt.
Ideally we should enforce this lowering is applied to every procedural
statement (there are still a handful of exceptions). With that, long
term with this pass + #6820, there should be no need to ever use an
AstExprStmt past this new lowering pass, which should enable more easier
optimization down the line.
Also ideally this should be run earlier. Currently it's after V3Tristate
as that calls pinReconnectSimple so we don't have to touch Cell ports.
Currently disabled when code coverage is enabled due to #7119.
Rename AstSelLoopVars to AstForeachHeader, and make it a non-NodeExpr.
Tweak parser to always create an AstForeachHeader, so no need to fix it
up later.