Commit Graph

20 Commits

Author SHA1 Message Date
Geza Lore e9c48cd1ce
Internals: Optimize temporary memory allocations (#6517)
This patch gets rid of over 80% of temporary dynamic memory allocations
(when a malloced node is immediately freed with no other malloc in
between). It also gets rid of over 20% of all calls to malloc.

It's worth ~3% average verilation speed up with tcmalloc, and more
without tcmalloc.
2025-10-01 15:01:30 +01:00
Lan Zongwei 2aa260a03b
Fix V3Hash MacOS ambiguity again (#6350) 2025-08-31 09:54:13 -04:00
Geza Lore 67da797816
Internals: Cleanup cppcheck warnings (#6317)
`make cppcheck-4` is now clean.
2025-08-20 18:21:24 +01:00
Wilson Snyder 8fbb725f34 Copyright year update. 2025-01-01 08:30:25 -05:00
Wilson Snyder 990ccd6763 Internals: Standardize on `template<typename`. No functional change. 2024-11-29 18:01:50 -05:00
Wilson Snyder e76f29e5ba Copyright year update 2024-01-01 03:19:59 -05:00
Kamil Rakoczy bbf53bd5af
Add VL_MT_SAFE attribute to several functions. (#3729) 2023-03-16 19:48:56 -04:00
Wilson Snyder b24d7c83d3 Copyright year update 2023-01-01 10:18:39 -05:00
Wilson Snyder 73d6de4471 Internals: Fix constructor style. 2022-11-21 20:41:32 -05:00
Wilson Snyder 352d0b4582 Internals: Fix constructor style. 2022-11-20 13:11:01 -05:00
Kamil Rakoczy b6c116d4bf
Internals: Add VL_MT_SAFE annotations to const functions (#3681) 2022-10-18 17:07:09 -04:00
Geza Lore 47bce4157d
Introduce DFG based combinational logic optimizer (#3527)
Added a new data-flow graph (DFG) based combinational logic optimizer.
The capabilities of this covers a combination of V3Const and V3Gate, but
is also more capable of transforming combinational logic into simplified
forms and more.

This entail adding a new internal representation, `DfgGraph`, and
appropriate `astToDfg` and `dfgToAst` conversion functions. The graph
represents some of the combinational equations (~continuous assignments)
in a module, and for the duration of the DFG passes, it takes over the
role of AstModule. A bulk of the Dfg vertices represent expressions.
These vertex classes, and the corresponding conversions to/from AST are
mostly auto-generated by astgen, together with a DfgVVisitor that can be
used for dynamic dispatch based on vertex (operation) types.

The resulting combinational logic graph (a `DfgGraph`) is then optimized
in various ways. Currently we perform common sub-expression elimination,
variable inlining, and some specific peephole optimizations, but there
is scope for more optimizations in the future using the same
representation. The optimizer is run directly before and after inlining.
The pre inline pass can operate on smaller graphs and hence converges
faster, but still has a chance of substantially reducing the size of the
logic on some designs, making inlining both faster and less memory
intensive. The post inline pass can then optimize across the inlined
module boundaries. No optimization is performed across a module
boundary.

For debugging purposes, each peephole optimization can be disabled
individually via the -fno-dfg-peepnole-<OPT> option, where <OPT> is one
of the optimizations listed in V3DfgPeephole.h, for example
-fno-dfg-peephole-remove-not-not.

The peephole patterns currently implemented were mostly picked based on
the design that inspired this work, and on that design the optimizations
yields ~30% single threaded speedup, and ~50% speedup on 4 threads. As
you can imagine not having to haul around redundant combinational
networks in the rest of the compilation pipeline also helps with memory
consumption, and up to 30% peak memory usage of Verilator was observed
on the same design.

Gains on other arbitrary designs are smaller (and can be improved by
analyzing those designs). For example OpenTitan gains between 1-15%
speedup depending on build type.
2022-09-23 16:46:22 +01:00
Geza Lore 38a8d7fb2e Remove redundant 'inline' keywords from definitions
Also add checks to t/t_dist_cppstyle
2022-09-16 15:52:25 +01:00
Geza Lore 96a4b3e5a5 Update clang-format config and apply
- Regroup and sort #include directives (like we used to, but automatic)
- Set AlwaysBreakTemplateDeclarations to true
2022-08-05 12:00:24 +01:00
Wilson Snyder ca42be982c Copyright year update. 2022-01-01 08:26:40 -05:00
Geza Lore 5adc856950 Tests: ignore all hashes in files_identical
Also add 'h' prefix to all printed hashes, to reduce ambiguity. No
functional change.
2021-08-11 16:55:11 +01:00
Geza Lore 0c4d88bacc Fix V3Hash when building -m32 2021-06-13 23:19:25 +01:00
Geza Lore c207e98306
Implement a distinct constant pool (#3013)
What previously used to be per module static constants created in
V3Table and V3Prelim are now merged globally within the whole model and
emitted as part of a separate constant pool. Members of the constant
pool are global variables which are declared lazily when used (similar to
loose methods).
2021-06-13 15:05:55 +01:00
Geza Lore 24b3816f5e Improve distribution of V3Hash/V3Hasher
- Better combining, without multiplication, which means a 0 hash value
is now allowed.
- Do not OR in bottom bits (this was used to avoid a 0 hash but had the
side effect of hashing 0 and 1 to the same value, which are actually
common inputs.
- Hash whole content of V3Number. This does not seem to be noticeable in
runtime, but quite often the bottom word can be a special value like
zero while the rest of the content varies.
2021-06-09 21:17:02 +01:00
Geza Lore 8814111724
Rework Ast hashing to be stable (#2974)
Rework Ast hashing to be stable

Eliminated reliance on pointer values in AstNode hashes in order to make
them stable. This requires moving the sameHash functions into a visitor,
as nodes pointed to via members (and not child nodes) need to be hashed
themselves.

The hashes are now stable (as far as I could test them), and the impact
on verilation time is small enough not to be reliably measurable.
2021-05-21 14:34:27 +01:00