Commit Graph

3638 Commits

Author SHA1 Message Date
Arkadiusz Kozdra 0e4da3b0bf
Support virtual interfaces (#3654) 2022-10-20 06:31:00 -04:00
Krzysztof Bieganski e6add5e0b8
Fix trace activity with --timing (#3576) (#3678) (#3696) 2022-10-20 06:28:55 -04:00
Krzysztof Bieganski 22243d1e49
Support class type params without defaults (#3693) 2022-10-19 21:59:26 -04:00
Krzysztof Bieganski bec0b7d4d0
Disallow delays with `--lib-create` (#3691) 2022-10-19 20:52:29 -04:00
Wilson Snyder f6f13c7fda Internals: Comment out debug that may flag ASAN problem (#3574) 2022-10-18 21:17:52 -04:00
Wilson Snyder e7068369fe Fix $display of fixed-width numbers (#3565). 2022-10-18 21:10:35 -04:00
Wilson Snyder b930d0731a Fix foreach and pre/post increment in functions (#3613). 2022-10-18 20:04:09 -04:00
Wilson Snyder 2723223884 Fix LSB error on --hierarchical submodules (#3539). 2022-10-18 17:29:51 -04:00
Kamil Rakoczy b6c116d4bf
Internals: Add VL_MT_SAFE annotations to const functions (#3681) 2022-10-18 17:07:09 -04:00
github action c057847760 Apply 'make format' 2022-10-17 23:52:01 +00:00
Topa Topino 46c5764383
Split UNUSED warning into genvar, param, and signal warnings (#3607) 2022-10-17 19:51:13 -04:00
Wilson Snyder 22ce36012e Add VERILATOR_TIMING define (#3684) 2022-10-17 18:18:56 -04:00
Geza Lore 5c65e0cfa1 Dfg: Fix incorrect folding of associative expressions with shared terms
Fixes #3679
2022-10-17 15:03:30 +01:00
Geza Lore 840e26b69a Fix incorrect return in DFG decomposition
Fixes #3676
2022-10-17 14:41:20 +01:00
Wilson Snyder 76ccd332a6 Internals: Remove DETECTARRAY, dead code. 2022-10-16 09:41:51 -04:00
Wilson Snyder 3cd2c8532d Internals: Cleanup spacing of Vi for loops. 2022-10-15 18:47:10 -04:00
Wilson Snyder c0739e908c Fix internal traceActivity to be zero reset not randomized. 2022-10-15 18:37:44 -04:00
Wilson Snyder 916a3d9066 Fix --main --trace missing initial timestep (#3678). 2022-10-15 13:24:38 -04:00
Wilson Snyder 14f58ed6c7 Add error on real edge event control. 2022-10-15 06:21:34 -04:00
Arkadiusz Kozdra 038d57070b
Support standalone 'this' in classes (#3675) (#2594) (#3248) 2022-10-14 08:55:55 -04:00
Krzysztof Bieganski 8a347248f5
Use `AstDelay` nodes for intra-assignment delays (#3672)
Also fix messy implementation of net delays.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-10-14 09:35:26 +02:00
Krzysztof Bieganski caed086516
Move Postponed logic after the eval loop (#3673)
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-10-13 21:04:43 +02:00
Krzysztof Bieganski 68927d4fd3
Make class ref typing stricter (#3671)
Prevents the possibility of assigning an integer to a class reference,
both at the SystemVerilog and the emitted C++ levels.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-10-13 14:33:15 +02:00
github action 8dacbdec3a Apply 'make format' 2022-10-11 09:04:38 +00:00
Geza Lore 2a110c91cf Speed up DfgGraph decomposition algorithms 2022-10-11 09:55:08 +01:00
Krzysztof Bieganski ba052beccd
Make reference to increment temporary an rvalue (#3659)
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-10-10 13:58:05 +02:00
Wilson Snyder 18c26b90af Fix --trace with --main/--binary (#3664) 2022-10-09 14:16:44 -04:00
Geza Lore ff49f797e5 Speed up DfgGraph::addGraph
Append whole lists in one go, rather than going item by item.
2022-10-08 12:46:02 +01:00
Geza Lore c033a0d7c8 Optimize DfgGraph vertex storage
Vertices representing variables (DfgVertexVar) and constants (DfgConst)
are very common (40-50% of all vertices created in some large designs),
and we also need to, or can treat them specially in algorithms. Keep
these as separate lists in DfgGraph for direct access to them. This
improve verilation speed.
2022-10-08 12:46:02 +01:00
Geza Lore 461f3c1004 DFG: Remove topological sort
Cyclic components are now extracted separately, so there is no
functional reason to have to do a topological sort (previously we used it
to detect cyclic graphs). Removing it to gain some speed.
2022-10-08 12:46:02 +01:00
Geza Lore 90447d54d1 Make DfgConst hold V3Number directly
Remove intermediary AstConst. No functional change intended.
2022-10-08 12:46:02 +01:00
Geza Lore 439d30a953 Minor cleanup in V3Number 2022-10-08 12:46:02 +01:00
Geza Lore 29a080dd9b DFG: Special case representation of AstSel
AstSel is a ternary node, but the 'widthp' is always constant and is
hence redundant, and 'lsbp' is very often constant. As AstSel is fairly
common, we special case as a DfgSel for the constant 'lsbp', and as
'DfgMux` for the non-constant 'lsbp'.
2022-10-06 19:59:01 +01:00
Geza Lore 0570cb8d9f DFG: Correctly set dtype when converting DfgCountOnes to Ast 2022-10-06 19:59:01 +01:00
Geza Lore 6fa14bf029 Speed up DfgPeephole in various ways 2022-10-06 19:59:01 +01:00
Geza Lore 4f0158b5e0 Speed up Dfg common sub-expression elimination
Added a DfgVertex::user() mechanism for storing data in vertices.
Similar in spirit to AstNode user data, but the generation counter is
stored in the DfgGraph the vertex is held under. Use this to cache
DfgVertex::hash results, and also speed up DfgVertex hashing in general.

Use these and additional improvements to speed up CSE.
2022-10-06 19:59:01 +01:00
Krzysztof Bieganski 97add4d57a
Fix null access on optimized-out fork statements (#3658)
`V3SchedTiming` currently assumes that if a fork still exists, it must
have statements within it (otherwise it would have been deleted by
`V3Timing`). However, in a case like this:
```
module t;
    reg a;
    initial fork a = 1; join
endmodule
```
the assignment in the fork is optimized out by `V3Dead` after
`V3Timing`. This leads to `V3SchedTiming` accessing fork's `stmtsp`
pointer, which at this point is null. This patch addresses that issue.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-10-06 15:38:59 +02:00
Geza Lore 5b742571d3 DFG: run removeVars after CSE
This enables removing some more redundant variables.
2022-10-06 09:31:56 +01:00
Geza Lore a83043d735 DfgPeephole: Rework folding of associative operations
Allow constant folding through adjacent nodes of all associative
operations, for example '((a & 2) & 3)' or '(3 & (2 & a))' can now be
folded into '(a & 2)' and '(2 & a)' respectively. Also improve speed of
making associative expression trees right leaning by using rotation of
the existing vertices whenever instead of allocation of new nodes.
2022-10-06 09:10:26 +01:00
Geza Lore 22fcd616aa DfgPeephole: Further restrict PUSH_REDUCTION_THROUGH_CONCAT
Only apply when there is guaranteed to be a subsequent constant folding
and elimination of some of the expression, otherwise this sometimes
interferes with the simplification of concatenations and harms overall
performance.
2022-10-06 09:10:26 +01:00
Krzysztof Bieganski 1a8188e1b4
Fix linker errors in user-facing timing functions (#3657)
Before this change, a design verilated with `--timing` that does not
actually use timing features would be emitted with `eventsPending` and
`nextTimeSlot` declared in the top class. However, their definitions
would be missing, leading to linker errors during design compilation.
This patch makes Verilator always emit the definitions, which prevents
linker errors. Trying to use `nextTimeSlot` without delays in the design
will result in an error at runtime.
2022-10-05 18:16:05 -04:00
Geza Lore f87fe4c3b4 DfgPeephole: add constant folding for all integer types
Also added a testing only -fno-const-before-dfg option, as otherwise
V3Const eats up a lot of the simple inputs. A lot of the things V3Const
swallows in the simple cases can make it to DFG in complex cases, or DFG
itself can create them during optimization. In any case to save
complexity of testing DFG constant folding, we use this option to turn
off V3Const prior to the DFG passes in the relevant test.
2022-10-05 12:05:40 +01:00
Geza Lore f23f3ca907 Try to ensure DFG peephole patterns don't grow the graph
Some optimizations are only a net win if they help us remove a graph
node (or at least ensure they don't grow the graph), or yields otherwise
special logic, so try to apply them only in these cases.
2022-10-04 18:54:46 +01:00
Geza Lore 965d99f1bc DFG: Make implementation more similar to AST
Use the same style, and reuse the bulk of astgen to generate DfgVertex
related code. In particular allow for easier definition of custom
DfgVertex sub-types that do not directly correspond to an AstNode
sub-type. Also introduces specific names for the fixed arity vertices.
No functional change intended.
2022-10-04 15:49:30 +01:00
Wilson Snyder ced82cbac4 Internals: Add some internal coverage exclusions etc. No functional change. 2022-10-03 10:57:37 -04:00
Wilson Snyder 10fc1f757c Internals: cppcheck cleanups. No functional change intended. 2022-10-02 23:04:55 -04:00
Wilson Snyder 4367e03e46 Internals: Make VL_UNREACHABLE similar to std::unreachable() 2022-10-02 16:35:45 -04:00
Geza Lore 2a12b052f2 DFG: handle simple always blocks 2022-10-01 16:46:58 +01:00
Geza Lore 84b9502af4 DFG: Add more peephole patterns 2022-10-01 16:46:58 +01:00
Geza Lore 694bdbc130 DFG: Improve .dot dumps slightly 2022-10-01 16:46:58 +01:00
Wilson Snyder 880cac2fdd Merge branch 'master' into develop-v5 2022-10-01 11:24:55 -04:00
github action a204b24fcf Apply 'make format' 2022-10-01 15:06:12 +00:00
Marcel Chang 526e6b9fc7
Add --dump-tree-dot to enable dumping Ast Tree .dot files (#3636) 2022-10-01 11:05:33 -04:00
github action f1ba6cb517 Apply 'make format' 2022-10-01 14:53:40 +00:00
Kanad Kanhere 159cf0429c
Support linting for top module interfaces (#3635) 2022-10-01 10:48:37 -04:00
Ryszard Rozak 46b8dca360
Add handling of tristate select/extend (#3604) 2022-10-01 10:34:30 -04:00
Geza Lore cc51966ad1 DFG: Remove unconneced variables early 2022-09-30 11:53:03 +01:00
Geza Lore c9d6344f2f DFG: Extract cyclic components separately
A lot of optimizations in DFG assume a DAG, but the more things are
representable, the more likely it is that a small cyclic sub-graph is
present in an otherwise very large graph that is mostly acyclic. In
order to avoid loosing optimization opportunities, we explicitly extract
the cyclic sub-graphs (which are the strongly connected components +
anything feeing them, up to variable boundaries) and treat them
separately. This enables optimization of the remaining input.
2022-09-30 09:51:10 +01:00
Geza Lore acebafcbc2 DFG: Partial support for unpacked arrays
Representation and Ast / Dfg conversions available, for element-wise
access only. Not much optimization yet (only CSE).
2022-09-29 19:00:45 +01:00
Geza Lore 4a1a2def95 DFG: make variable inlining part of the peephole optimizer
This saves some traversals and prepares us to better handle cyclic DFGs.
2022-09-29 18:40:10 +01:00
Geza Lore 09e352ef66 DFG: support hashing of graphs circular through variables
No functional change
2022-09-29 18:40:10 +01:00
Geza Lore 17976d7401 DFG: fix REPLACE_EQ_OF_CONST_AND_CONST peephole pattern 2022-09-29 18:40:10 +01:00
Wilson Snyder cd2a5771b8 Add --timing to --binary (#3625). 2022-09-28 19:02:23 -04:00
Krzysztof Bieganski 9c2ead90d5
Add custom memory management for verilated classes (#3595)
This change introduces a custom reference-counting pointer class that
allows creating such pointers from 'this'. This lets us keep the
receiver object around even if all references to it outside of a class
method no longer exist. Useful for coroutine methods, which may outlive
all external references to the object.

The deletion of objects is deferred until the next time slot. This is to
make clearing the triggered flag on named events in classes safe
(otherwise freed memory could be accessed).
2022-09-28 18:54:18 -04:00
Geza Lore a999c73ce0 Commentary 2022-09-28 14:43:40 +01:00
Wilson Snyder b92173bf3d Add --binary option as alias of --main --exe --build (#3625). 2022-09-28 09:04:33 -04:00
Wilson Snyder c6bce636ee Merge branch 'master' into develop-v5 2022-09-27 22:19:04 -04:00
Wilson Snyder 75a70bee6d Update to clang-format-14 on Ubuntu22.04 2022-09-27 21:47:45 -04:00
Ryszard Rozak 4931e48016
Support resolving assignments with equal strengths (#3637) 2022-09-26 21:21:37 -04:00
Geza Lore 1b17acdb01 DFG: Support AstSel and AstConcat on LHS of assignments
Added DfgVertexVariadic to represent DFG vetices with a varying number
of source operands. Converted DfgVar to be a variadic vertex, with each
driver corresponding to a fixed range of bits in the packed variable.
This allows us to handle AstSel on the LHS of assignments. Also added
support for AstConcat on the LHS by selecting into the RHS as
appropriate.

This improves OpenTitan ST speed by ~13%
2022-09-26 19:54:52 +01:00
Geza Lore 9c1cc5465d DFG: Support packed structure and union types 2022-09-26 18:31:50 +01:00
Geza Lore d8b5359fcb Merge branch 'master' into develop-v5 2022-09-26 14:45:08 +01:00
Geza Lore 9da012568c Ensure DFG stats are consistent 2022-09-26 14:38:26 +01:00
Geza Lore 9a20a258f5 Omit AstNode::m_editCount in release build
This is only a debugging aid at this point, so compile out of the
release build. This reduces peak memory consumption by 4-5%. We still
keep the global counters to detect the tree have changed, to avoid
unnecessary dumps.
2022-09-25 08:57:33 +01:00
Geza Lore 10796457d2 V3Life: don't depend on AstNode::editCountGbl()
No functional change intended.
2022-09-24 20:45:30 +01:00
Geza Lore 78e659a142 Reduce size of FileLine
Multiple tricks to reduce the size of class FileLine from 72 to 40
bytes:

- Reduce file name index from 32 to 16 bits. This still allows 64K
  unique input files, which is hopefully enough.
- Intern message/warning enable bitset and use a 16-bit index, again
  allowing 64K unique sets which is hopefully enough.
- Put the m_waive flag into the sign bit of one of the line numbers.
- Use explicit reference counting to avoid overhead of shared_ptr.

Added assertions to ensure interned data fits within it's index space.

This saves ~5-10% peak memory consumption at no measurable run-time cost
on various designs.
2022-09-24 20:16:21 +01:00
Geza Lore 47bce4157d
Introduce DFG based combinational logic optimizer (#3527)
Added a new data-flow graph (DFG) based combinational logic optimizer.
The capabilities of this covers a combination of V3Const and V3Gate, but
is also more capable of transforming combinational logic into simplified
forms and more.

This entail adding a new internal representation, `DfgGraph`, and
appropriate `astToDfg` and `dfgToAst` conversion functions. The graph
represents some of the combinational equations (~continuous assignments)
in a module, and for the duration of the DFG passes, it takes over the
role of AstModule. A bulk of the Dfg vertices represent expressions.
These vertex classes, and the corresponding conversions to/from AST are
mostly auto-generated by astgen, together with a DfgVVisitor that can be
used for dynamic dispatch based on vertex (operation) types.

The resulting combinational logic graph (a `DfgGraph`) is then optimized
in various ways. Currently we perform common sub-expression elimination,
variable inlining, and some specific peephole optimizations, but there
is scope for more optimizations in the future using the same
representation. The optimizer is run directly before and after inlining.
The pre inline pass can operate on smaller graphs and hence converges
faster, but still has a chance of substantially reducing the size of the
logic on some designs, making inlining both faster and less memory
intensive. The post inline pass can then optimize across the inlined
module boundaries. No optimization is performed across a module
boundary.

For debugging purposes, each peephole optimization can be disabled
individually via the -fno-dfg-peepnole-<OPT> option, where <OPT> is one
of the optimizations listed in V3DfgPeephole.h, for example
-fno-dfg-peephole-remove-not-not.

The peephole patterns currently implemented were mostly picked based on
the design that inspired this work, and on that design the optimizations
yields ~30% single threaded speedup, and ~50% speedup on 4 threads. As
you can imagine not having to haul around redundant combinational
networks in the rest of the compilation pipeline also helps with memory
consumption, and up to 30% peak memory usage of Verilator was observed
on the same design.

Gains on other arbitrary designs are smaller (and can be improved by
analyzing those designs). For example OpenTitan gains between 1-15%
speedup depending on build type.
2022-09-23 16:46:22 +01:00
Geza Lore 3a8a314566 Merge branch 'master' into develop-v5 2022-09-23 11:21:12 +01:00
Geza Lore 050060b139 Make enum constructors and operators constexpr 2022-09-23 11:10:28 +01:00
Geza Lore ddb678cc5b Merge branch 'master' into develop-v5 2022-09-22 17:33:36 +01:00
Geza Lore 63c694f65f Streamline dump control options
- Rename `--dump-treei` option to `--dumpi-tree`, which itself is now a
  special case of `--dumpi-<tag>` where tag can be a magic word, or a
  filename
- Control dumping via static `dump*()` functions, analogous to `debug()`
- Make dumping independent of the value of `debug()` (so dumping always
  works even without the debug flag)
- Add separate `--dumpi-graph` for dumping V3Graphs, which is again a
  special case of `--dumpi-<tag>`
- Alias `--dump-<tag>` to `--dumpi-<tag> 3` as before
2022-09-22 17:24:41 +01:00
github action 12093e6939 Apply 'make format' 2022-09-21 19:22:15 +00:00
Geza Lore 9949a6cd17 Generate AstGen::checkTreeiter to enforce Ast op*p use
Use astgen to generate a more thorough version of AstNode::checkTree,
which checks that operands are or consistent structure and type, as
described in the @astgen op directives. Also change checkTree to always
run when --debug-check is given.

Fix discovered fallout.
2022-09-21 18:12:11 +01:00
Geza Lore 4600932d8c Remove unused files 2022-09-21 14:16:20 +01:00
Geza Lore 95145038b4 Generate AstNode accessors via astgen
Introduce the @astgen directives parsed by astgen, currently used for
the generation child node (operand) accessors. Please see the updated
internal documentation for details.
2022-09-21 14:05:27 +01:00
Geza Lore ce03293128 Generate AstNode accessors via astgen
Introduce the @astgen directives parsed by astgen, currently used for
the generation child node (operand) accessors. Please see the updated
internal documentation for details.
2022-09-21 13:56:03 +01:00
Geza Lore 72e7271a14 Merge branch 'master' into develop-v5 2022-09-21 12:19:00 +01:00
Kamil Rakoczy 0b07679ff2
v3errorEnd: look for instance only when warning is not ignored (#3632)
This approach reduced total time of V3Undriven stage from 34,2s to 2,5s
in design containing almost 400 000 unused variables.

Signed-off-by: Kamil Rakoczy <krakoczy@antmicro.com>
2022-09-21 10:54:23 +01:00
Wilson Snyder d162619bd3 Merge branch 'master' into develop-v5 2022-09-20 20:06:21 -04:00
Wilson Snyder 5df14627fd Fix 32-bit build of previous commit 2022-09-20 18:23:44 -04:00
Mariusz Glebocki fc3ce29845
Improve Verilation memory by reducing V3Number size (#3521) 2022-09-20 16:46:47 -04:00
Yu-Sheng Lin bba800f2d6
Fix calling trace() after open() segfault (#3610) (#3627) 2022-09-20 16:45:09 -04:00
Ryszard Rozak fe2a1e1749
Remove assignments with strengths weaker than strongest non-tristate RHS (#3629) 2022-09-19 04:54:20 -04:00
Wilson Snyder fc4ffd454e Rename --bin to --build-dep-bin. 2022-09-18 10:32:43 -04:00
Geza Lore 7bc7b5372e Merge branch 'master' into develop-v5 2022-09-17 16:12:28 +01:00
Geza Lore 7d88e63bab astgen: generate type specific addNext, remove astNextNull
Generate type specific static overloads of Ast<Node>::addNext, which
return the correct sub-type of the 'this' they were invoked on.

Also remove AstNode::addNextNull, which is now only used in the parser,
implement in verilog.y directly as a template function.
2022-09-17 15:05:22 +01:00
Wilson Snyder a214fd1f78 Internals: Fix constructor syntax in new develop-v5 code 2022-09-17 08:56:41 -04:00
Wilson Snyder 79be097e34 Sort -V env variable output 2022-09-17 08:17:55 -04:00
Wilson Snyder 11b0d36ba2 Merge cleanups from 'develop-v5'. No functional change 2022-09-17 08:17:22 -04:00
Geza Lore af305bf280 Merge branch 'master' into develop-v5 2022-09-16 16:24:36 +01:00