verilator

Commit Graph

Author	SHA1	Message	Date
Geza Lore	eafe9636cf	Internals: Dump Ast expression pattern statistics like Dfg (#7818) Remove the expression combination counts from the default stats file, and add a new `--dump-ast-patterns` option, which will dump new `_ast_patterns_.txt` files. These contain the expression combinations in a similar S-expression format as Dfg already produces with `--dump-dfg-stats`. These dumps are not produced by just `--stats` as they are fairly expensive to compute. Currently the new option will dump at two points: just before we change to C types via widthMin usage, and just before emit.	2026-06-21 22:17:36 +01:00
Geza Lore	bcaa110f60	Optimize generated function inlining (#7811) Previously V3InlineCFuncs inlined call sites but never deleted the now-dead callees. Also missed a lot of opportunities due to evaluation order. Rewrite using a graph-based algorithm, using only a single traversal of the netlist. This is clearer, more accurate, and faster at compile time. Also add a clean -fno-inline-cfuncs disable. Setting the limits to 0 still disables inlining, except for empty functions, which can be inlined with 0 limits (they are no-ops). It will also prune unused functions without -fno-inline-cfuncs. Pass now also respects `--output-split`.	2026-06-21 18:31:56 +01:00
Geza Lore	a37e2ee94b	Optimize wide decoder case statements into decoder expressions (#7804) Extend the decoder-pattern case optimization to selectors that are too wide for a full 2^width lookup table. A decoder-pattern case (where every case item assigns constants to a fixed set of LHSs) is lowered to a new AstMachMasked expression. AstMachMasked is emitted as a run-time VL_MATCHMASKED_* function call. It contains a packed constant pool table, 'matchp', which is a list of '(mask, bits)' pairs. At runtime, the index of the first matching entry is returned, and is used to index a value table. This single (albeit complicated) expression can replace large if-else trees whole, resulting in much more compact code with fewer static, hard-to-predict branches. It is worth about 10% speed and 30% code size in some designs. Example:
```systemverilog
logic [39:0] sel;
always_comb casez (sel)
  40'b???????????????????????????????????????1: out = 8'h01;
  40'b??????????????????????????????????????1?: out = 8'h02;
  40'b?????????????????????????????????????1??: out = 8'h03;
  default: out = 8'hff;
endcase
```
is compiled to:
```c++
out = TABLE_value[VL_MATCHMASKED_Q(sel, CONST_match)];
```
Where 'CONST_match' contains 4 entries, of a 40-bit mask and 40-bit bit pattern each, and 'TABLE_value' contains 4 entries of the corresponding 8-bit results. (Entries are aligned to word boundaries to avoid runtime bit swizzling.)	2026-06-19 19:46:13 +01:00
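The first-match lookup described in the commit message above can be illustrated with a minimal C++ sketch. It is not Verilator's runtime code: `MatchEntry`, `matchMasked`, `kMatch`, `kValue` and `decode` are hypothetical names, and the real table is a packed, word-aligned constant pool entry rather than a `std::vector`. The sketch assumes the `default` item is encoded as a final catch-all entry with an all-zero mask, which is consistent with the 4 entries mentioned for the 3-item example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one '(mask, bits)' pair of the match table.
struct MatchEntry {
    uint64_t mask;  // 1 bits select the selector bits that are compared
    uint64_t bits;  // required values at the selected positions
};

// Return the index of the first entry for which (sel & mask) == bits.
// Assumes a non-empty table whose last entry is a catch-all (mask == 0).
std::size_t matchMasked(uint64_t sel, const std::vector<MatchEntry>& table) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        if ((sel & table[i].mask) == table[i].bits) return i;
    }
    return table.size() - 1;  // not reached when the catch-all is present
}

// Tables for the casez example: three one-bit patterns plus the default.
const std::vector<MatchEntry> kMatch = {
    {0x1, 0x1}, {0x2, 0x2}, {0x4, 0x4}, {0x0, 0x0}};
const std::vector<uint8_t> kValue = {0x01, 0x02, 0x03, 0xff};

// Equivalent of 'out = TABLE_value[VL_MATCHMASKED_Q(sel, CONST_match)]'.
uint8_t decode(uint64_t sel) { return kValue[matchMasked(sel, kMatch)]; }
```

Because the first matching entry wins, `decode(0x3)` yields `0x01`, mirroring the priority of the first `casez` item.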
Wilson Snyder	749b93e405	Commentary: Use standard multiline rst comments, other cleanups	2026-06-18 21:58:01 -04:00
Geza Lore	5712f9b614	Optimize decoder case statements into lookup tables (#7795) Recognize "decoder" case statements (where every case item only assigns constants to a fixed set of left-hand sides) and replace them with a single packed constant lookup table indexed by the case expression. Small tables are materialized inline in the generated code, and are always optimized. Larger ones are placed in the constant pool and only optimized if deemed beneficial over branches. While this slightly conflicts with V3Table, and is not worth that much on its own, there will be a follow-up patch that converts more cases of this form which will be much more valuable. This patch does the necessary analysis and the simple table conversion when possible. Split -fcase into -fcase-table (this new conversion) and -fcase-tree (the existing bitwise branch-tree conversion); -fno-case is now an alias for both. Default branches, assignments preceding the case (used as default values), casez wildcards, multiple and partial left-hand sides, and both blocking and non-blocking assignments are handled. Cases that cannot be safely tabled (e.g. non-exhaustive with no default, overlapping writes to one variable, or mixed blocking/non-blocking assignments) fall back to the existing if/else lowering. Consequently disabled re-inlining of constant pool variables in V3Const, and rebuild the constant pool hash in V3Dead (previously we didn't create constant pool entries early enough for this to matter)	2026-06-18 09:30:50 +01:00
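As a rough illustration of the table conversion described above, the following C++ sketch contrasts a branching decoder with an equivalent constant table indexed by the selector. The names `decodeBranch`, `decodeTable` and `kDecodeTable`, the 2-bit selector, and the values are all made up for the example; the code Verilator actually generates will look different.

```cpp
#include <array>
#include <cstdint>

// Branching form: roughly what an if/else lowering of a 2-bit decoder
// case computes (every item assigns a constant to the same output).
uint8_t decodeBranch(uint8_t sel) {
    switch (sel & 0x3) {
    case 0: return 0x10;
    case 1: return 0x20;
    case 2: return 0x40;
    default: return 0x80;
    }
}

// Table form: one constant table with 2^2 entries, indexed by the selector.
constexpr std::array<uint8_t, 4> kDecodeTable = {0x10, 0x20, 0x40, 0x80};

uint8_t decodeTable(uint8_t sel) { return kDecodeTable[sel & 0x3]; }
```

Both functions return the same value for every selector; the table form trades the branches for a single indexed load.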
Wilson Snyder	c86816476c	Commentary: Changes update	2026-06-15 17:37:49 -04:00
Geza Lore	5ab2bf1ec4	Optimize input combinational logic by change detection (#7784) When a lot of combinational logic is driven from top level inputs, work can be wasted evaluating that logic if the top level inputs don't change. This change adds an optimization that performs a change detect on the top level inputs, and evaluates 'ico' logic only if a top level input actually changed. This especially helps with --hierarchical/--lib-create which runs the 'ico' of each sub-model in the eval settle loop. This was observed to yield 40%+ run-time speedup on some partitioned designs. The added change detection is cheap, so it is emitted even if the 'ico' region is small, and is on by default. The optimization is only sound if the model itself does not write to the top level inputs (otherwise the 'previous value' variables, which are not updated by internal writes, would be out of sync). If we can detect a top level input is written within the design, then for that input, we fall back on always running the relevant logic. With --vpi we cannot prove safety statically, so --vpi will disable this optimisation unless explicitly enabled. (In which case it's the user's responsibility to not write to top level inputs via the VPI.)	2026-06-15 05:42:00 +01:00
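The idea of guarding input-driven logic with a change detect can be sketched in C++ as below. This is only an illustration: the `Model` struct, its members, and the `first` flag that forces the initial evaluation are assumptions for the example, not the code Verilator emits.

```cpp
#include <cstdint>

// Hypothetical model with one top-level input and one combinational output.
struct Model {
    uint32_t in = 0;      // top-level input, written only from outside
    uint32_t prevIn = 0;  // 'previous value' shadow copy of the input
    bool first = true;    // assumption: force evaluation on the first call
    uint32_t out = 0;     // combinational output driven from 'in'
    int icoRuns = 0;      // counts evaluations of the input-driven logic

    // Stand-in for the input combinational ('ico') logic.
    void evalIco() {
        out = in * 3 + 1;
        ++icoRuns;
    }

    // Run the 'ico' logic only when the input differs from the shadow copy.
    void eval() {
        if (first || in != prevIn) {
            evalIco();
            prevIn = in;
            first = false;
        }
    }
};
```

The sketch also shows why internal writes are a problem: if anything inside the model modified `in` without updating `prevIn`, the comparison would no longer reflect what the logic last saw.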
Wilson Snyder	816ab67826	Commentary: Changes update	2026-06-05 18:36:55 -04:00
Yogish Sekhar	cf8713aebc	Add `--coverage-per-instance`	2026-05-24 18:08:55 -04:00
Yilou Wang	00c9e58006	Fix internal error on consecutive repetition with N > 256 (#7552) (#7603)	2026-05-17 21:54:10 -04:00
Igor Zaworski	25d4827bd5	Internals: Four state pre-pull (types) (#7520)	2026-04-30 16:56:15 -04:00
Yogish Sekhar	a680919edc	Support native FSM state and arc coverage (#7412)	2026-04-22 15:18:59 -04:00
Geza Lore	2b9d006097	Change Dfg pattern dumps to use --dump-dfg-patterns (#7455) Dumping Dfg patterns can take a non-trivial amount of time, so do it only with --dump-dfg-patterns, instead of with --stats. Also further improve dumping format.	2026-04-21 12:07:19 +01:00
Geza Lore	97454a1bc5	Remove multi-threaded FST tracing (#7443) Remove parallel (using the FST library writer thread) and offloaded (separate Verilator internal thread) tracing (only used by FST). These are not compatible with #6992, and #5806 should yield better performance in all cases. Consequently mark '--trace-threads' and '--trace-fst-thread' options as deprecated	2026-04-19 16:02:12 +01:00
Geza Lore	9f9532ff78	Optimize Dfg only once, after V3Scope (#7362)	2026-04-09 08:31:12 -04:00
Wilson Snyder	947cbaf330	Deprecate `--structs-packed` (#7222).	2026-03-21 10:59:27 -04:00
Wilson Snyder	3097df46fa	Change `--converge-limit` default to 10000 (#7209). Fixes #7209.	2026-03-07 09:05:37 -05:00
Rahul Behl	9a5c1d27c8	Support array reduction methods with 'with' clause in constraints (#6455) (#6999)	2026-03-04 12:01:35 -05:00
jalcim	7cf539cf05	Add --func-recursion-depth CLI option (#7175) (#7179)	2026-03-04 06:46:07 -05:00
Geza Lore	098fe96643	Add V3LiftExpr pass to lower impure expressions and calls (#7141) Introduce new pass that converts impure expressions, or those with function and method calls into simple assignment statements. Please see the blurb at the top of the file for why this is useful and how it works. In particular currently it enables more Dfg optimization as functions will be inlined without AstExprStmt. Ideally we should enforce this lowering is applied to every procedural statement (there are still a handful of exceptions). With that, long term with this pass + #6820, there should be no need to ever use an AstExprStmt past this new lowering pass, which should enable easier optimization down the line. Also ideally this should be run earlier. Currently it's after V3Tristate as that calls pinReconnectSimple so we don't have to touch Cell ports. Currently disabled when code coverage is enabled due to #7119.	2026-02-28 22:20:09 +00:00
Wilson Snyder	7dde11b4c6	Docs: Split control.rst from exe_verilator.rst.	2026-02-24 21:11:39 -05:00
Todd Strader	6a5d3b0b72	Add --max-replication option (#7139)	2026-02-23 16:51:37 -05:00
Wilson Snyder	28d04c809f	Commentary: Changes update	2026-02-16 05:38:03 -05:00
Geza Lore	505d33b35a	Support #0 delays with IEEE-1800 compliant semantics (#7079) This patch adds IEEE-1800 compliant scheduling support for the Inactive scheduling region used for #0 delays. Implementing this requires that all IEEE-1800 active region events are placed in the internal 'act' section. This has simulation performance implications. It prevents some optimizations (e.g. V3LifePost), which reduces single threaded performance. It also reduces the available work and parallelism in the internal 'nba' section, which reduces the effectiveness of multi-threading severely. Performance impact on RTLMeter when using scheduling adjusted to support proper #0 delays is ~10-20% slowdown in single-threaded mode, and ~100% (2x slower) with --threads 4. To avoid paying this performance penalty unconditionally, the scheduling is only adjusted if either:
1. The input contains a statically known #0 delay
2. The input contains a variable #x delay unknown at compile time
If no #0 is present, but #x variable delays are, a ZERODLY warning is issued advising the use of '--no-sched-zero-delay' which is a promise by the user that none of the variable delays will evaluate to a zero delay at run-time. This warning is turned off if '--sched-zero-delay' is explicitly given. This is similar to the '--timing' option. If '--no-sched-zero-delay' was used at compile time, then executing a zero delay will fail at runtime. A ZERODLY warning is also issued if a static #0 is found, but the user specified '--no-sched-zero-delay'. In this case the scheduling is not adjusted to support #0, so executing it will fail at runtime. Presumably the user knows it won't be executed.
The intended behaviour with all this is the following:
- No #0, no #var in the design (#constant is OK) -> Same as current behaviour, scheduling not adjusted, same code generated as before
- Has static #0 and '--no-sched-zero-delay' is NOT given -> No warnings, scheduling adjusted so it just works, runs slow
- Has static #0 and '--no-sched-zero-delay' is given -> ZERODLY on the #0, scheduling not adjusted, fails at runtime if hit
- No static #0, but has #var and no option is given -> ZERODLY on the #var advising use of '--no-sched-zero-delay' or '--sched-zero-delay' (similar to '--timing'), scheduling adjusted assuming it can be a zero delay and it just works
- No static #0, but has #var and '--no-sched-zero-delay' is given -> No warning, scheduling not adjusted, fails at runtime if zero delay
- No static #0, but has #var and '--sched-zero-delay' is given -> No warning, scheduling adjusted so it just works	2026-02-16 03:55:55 +00:00
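The intended behaviour listed in the commit message above can be written down as a small decision function. The C++ below is only a sketch of that list: `ZeroDelayOpt`, `Decision` and `decide` are hypothetical names, not Verilator internals. The message does not say what happens when '--sched-zero-delay' is given but the design has neither a #0 nor a variable delay, so the sketch assumes the scheduling is left unadjusted in that case.

```cpp
// Which of the two options, if any, was given on the command line.
enum class ZeroDelayOpt {
    Unset,   // neither option given
    Sched,   // '--sched-zero-delay'
    NoSched  // '--no-sched-zero-delay'
};

struct Decision {
    bool adjustScheduling;  // schedule so that #0 delays are supported
    bool warnZeroDly;       // issue a ZERODLY warning
};

// hasStaticZero: the input contains a statically known #0 delay.
// hasVarDelay:   the input contains a variable delay unknown at compile time.
Decision decide(bool hasStaticZero, bool hasVarDelay, ZeroDelayOpt opt) {
    if (hasStaticZero) {
        // ZERODLY on the #0, scheduling not adjusted, fails at runtime if hit.
        if (opt == ZeroDelayOpt::NoSched) return {false, true};
        // No warnings, scheduling adjusted.
        return {true, false};
    }
    if (hasVarDelay) {
        // ZERODLY advising the user to pick one of the two options.
        if (opt == ZeroDelayOpt::Unset) return {true, true};
        // Explicit request: adjust without warning.
        if (opt == ZeroDelayOpt::Sched) return {true, false};
        // User promises no zero delay occurs at run-time.
        return {false, false};
    }
    // No #0 and no variable delay: same as previous behaviour (assumed for
    // every option value, see the note above).
    return {false, false};
}
```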
Geza Lore	3dd2b762e7	Fix scope tree in traces in hierarchical mode (#7042)	2026-02-12 20:54:03 -05:00
Geza Lore	bb0e1c8c61	Optimize temporary insertion for concatenations in Dfg (#7013) Add a new Dfg pass 'pushDownSel'. This will try to move selects through a tree of concatenations in order to eliminate temporary nodes holding intermediate concatenation results. This can get rid of a lot of variables when packed arrays are assigned in parts (e.g. bit-wise).	2026-02-07 18:06:12 +00:00
Wilson Snyder	7c6c6a684b	Add SPDX copyright identifiers, and get 'reuse' clean. No functional change.	2026-01-26 20:24:34 -05:00
Luca Colagrande	f9f7a7146d	Commentary: Fix `--trace` flag description in docs (#6884)	2026-01-06 07:16:35 -05:00
Wilson Snyder	40cf3c4b16	Remove deprecated `--make cmake`.	2026-01-01 09:27:20 -05:00
Wilson Snyder	a7b80966ec	Remove `--xml-only`.	2026-01-01 09:23:05 -05:00
Wilson Snyder	13327fa9c0	Copyright year update.	2026-01-01 07:22:09 -05:00
Wilson Snyder	4080284e53	Fix warning lint directive ordering and consistency (#4185) (#5368) (#5610) (#6876).	2025-12-30 20:31:34 -05:00
Wilson Snyder	e6114b6bbb	Commentary	2025-12-30 08:24:41 -05:00
Iztok Jeras	6a07595a44	Commentary: Text formatting fix (#6863 )	2025-12-25 19:01:38 -05:00
Wilson Snyder	1b93033690	Add `--quiet-build` to suppress make/compiler informationals.	2025-12-23 19:21:42 -05:00
Wilson Snyder	921ad64d22	Commentary: Changes update	2025-12-23 19:20:42 -05:00
Wilson Snyder	5dc05e1fa8	Internals: Update some JSON references. No functional change.	2025-12-23 10:13:23 -05:00
Jose Drowne	c0a0f0dab9	Optimize inlining small C functions and add `-inline-cfuncs` (#6815)	2025-12-21 13:14:50 -05:00
Wilson Snyder	605915f307	Commentary: Changes update	2025-12-20 22:04:29 -05:00
Geza Lore	f990dd747e	Change metacomments to not enable warnings disabled in control file (#6836) (#6842) Track the location based message/feature enable bits separately for code and control file directives. A message/feature is disabled if disabled either in the control file, or in code directives/metacomments. That is, enabled only if both agree should be enabled.	2025-12-20 06:33:46 -05:00
Wilson Snyder	b90865a08a	Change `--lint-only` and `--json-only` to imply `--timing` (#6790).	2025-12-17 19:24:43 -05:00
Wilson Snyder	f1ee434dca	Commentary: Changes update	2025-12-16 20:43:08 -05:00
Dan Ruelas-Petrisko	394d9cf168	Support `-libmap` (#5891 partial) (#6764)	2025-12-16 11:21:46 -05:00
Wilson Snyder	91a59bbcc5	Documentation: Adapt format suggested by docstrfmt	2025-11-22 10:59:38 -05:00
Wilson Snyder	4cc4ff3e07	Commentary: Fix some .rst style issues	2025-11-21 22:25:03 -05:00
Wilson Snyder	7e3cab8e5d	Commentary: Changes update	2025-11-21 19:39:51 -05:00
Jakub Wasilewski	0b8c369740	Add `sc_biguint` pragma (#6712)	2025-11-20 17:08:59 -05:00
Geza Lore	a1056c6ae9	Add `-param`/`-port` options to `public_flat*` control directives (#6685)	2025-11-13 06:59:02 -05:00
Geza Lore	0dc9f779f8	Add `-fno-inline-funcs-eager` option to disable excessive inlining (#6682)	2025-11-11 21:46:19 +00:00
Wilson Snyder	c87a3e92fc	Commentary: Changes update	2025-11-09 14:50:31 -05:00

227 Commits