Commit Graph

140 Commits

Author SHA1 Message Date
Wilson Snyder 3a5248a919 Internals: Mark structs final/VL_NOT_FINAL. No functional change intended. 2024-01-20 15:06:46 -05:00
Wilson Snyder a36a07c297 Internals: Favor UASSERT over v3fatalSrc. No functional change intended. 2024-01-05 18:00:06 -05:00
Wilson Snyder e76f29e5ba Copyright year update 2024-01-01 03:19:59 -05:00
Wilson Snyder 9fd5634778 Internals: Remove unneeded private's. No functional change 2023-11-13 21:37:45 -05:00
Geza Lore 3c144ada53
Delete AstNode user5 (#4638)
This saves about 5% memory. V3AstUserAllocator is appropriate for most use
cases, performance is marginally up as we are mostly D-cache bound on
large designs.
2023-10-29 01:12:27 +01:00
Wilson Snyder 7ba6647c4f Internals: Cleanup some V3Graph constructors/funcs and docs. No functional change. 2023-10-28 20:11:28 -04:00
Geza Lore 4c0edd2efb Improve --prof-exec infrastructure and report
Again --prof-exec have bit-rotted a little with all the recent changes
to the structure of the generated code. This patch contains a few
improvements:
- Repalce the eval/evl_loop begin/end events with generic
  section_push/section_pop events, that can be arbitrarily sprinkled
  into the generate code (so long as they are matched correctly) to
  measure various sections. The report then contains a nested profile
  of the sections, and the VCD trace shows the section names.
- Better handling of exec graphs
- Clearer overall statistics
2023-10-21 21:09:03 +01:00
Geza Lore 10d33238b9 Do not merge entry/exit MTasks during coarsening 2023-10-21 19:31:52 +01:00
Wilson Snyder b5828a7ce9 Fix header order botched by clang-format in recent commit. 2023-10-18 06:37:46 -04:00
github action 770cd24f27 Apply 'make format' 2023-10-18 02:50:27 +00:00
Wilson Snyder 431bb1ed16
Support compiling Verilator with gcc/clang precompiled headers (#4579) 2023-10-17 22:49:28 -04:00
Mariusz Glebocki 28bd7e5b19
Rework multithreading handling to separate by code units that use/never use it. (#4228) 2023-09-24 22:12:23 -04:00
Wilson Snyder c52ba28dd0 Tests: Fix commentary to unify issue references. 2023-09-15 18:12:11 -04:00
Ryszard Rozak 91227d26bb
Internals: Rename pure to dpiPure. No functional change. (#4461) 2023-09-08 08:51:19 +02:00
Krzysztof Bieganski ffbbd438ae
Internals: Use runtime type info instead of `dynamic_cast` for faster graph type checks (#4397) 2023-08-31 18:00:53 -04:00
Wilson Snyder d45deccc0a Internals: Favot std::array. No functional change intended. 2023-06-16 19:44:40 -04:00
Wilson Snyder 4ae80f9a9f Commentary 2023-05-05 22:50:28 -04:00
Wilson Snyder add68130b8 Internals: Rename to dumpLevel(), to avoid confusion with make-a-dump() 2023-05-03 18:04:10 -04:00
Kamil Rakoczy 65a484e00b
Internal: Update clang_check_annotations conditions (#4134) 2023-04-20 07:02:31 -04:00
Wilson Snyder 82e653a739 Internals: Avoid emit inheritance in V3EmitCBase. No functional change intended. 2023-03-18 12:23:17 -04:00
Kamil Rakoczy 798d7346cf
Internals: Add VL_MT_SAFE attribute to functions that requires locking. (#3805) 2023-03-17 20:24:15 -04:00
Wilson Snyder b24d7c83d3 Copyright year update 2023-01-01 10:18:39 -05:00
Wilson Snyder c0499da28b Spelling fixes 2022-12-23 11:32:38 -05:00
Wilson Snyder 51de2c9194 Remove reader task code which was non-functional (#3360) 2022-12-17 16:48:08 -05:00
Wilson Snyder 66d85b3381 Internals: Fix cppcheck warnings. No functional change intended. 2022-11-21 21:40:49 -05:00
Wilson Snyder f44cd9cd48 Internals: Fix constructor style. 2022-11-20 17:40:38 -05:00
Geza Lore 65e08f4dbf Make all expressions derive from AstNodeExpr (#3721).
Apart from the representational changes below, this patch renames
AstNodeMath to AstNodeExpr, and AstCMath to AstCExpr.

Now every expression (i.e.: those AstNodes that represent a [possibly
void] value, with value being interpreted in a very general sense) has
AstNodeExpr as a super class. This necessitates the introduction of an
AstStmtExpr, which represents an expression in statement position, e.g :
'foo();' would be represented as AstStmtExpr(AstCCall(foo)). In exchange
we can get rid of isStatement() in AstNodeStmt, which now really always
represent a statement

Peak memory consumption and verilation speed are not measurably changed.

Partial step towards #3420
2022-11-03 16:02:16 +00:00
HungMingWu 196f3292d5 Improve V3Ast function usage ergonomics (#3650)
Signed-off-by: HungMingWu <u9089000@gmail.com>
2022-10-21 14:12:12 +01:00
Wilson Snyder 10fc1f757c Internals: cppcheck cleanups. No functional change intended. 2022-10-02 23:04:55 -04:00
Geza Lore ddb678cc5b Merge branch 'master' into develop-v5 2022-09-22 17:33:36 +01:00
Geza Lore 63c694f65f Streamline dump control options
- Rename `--dump-treei` option to `--dumpi-tree`, which itself is now a
  special case of `--dumpi-<tag>` where tag can be a magic word, or a
  filename
- Control dumping via static `dump*()` functions, analogous to `debug()`
- Make dumping independent of the value of `debug()` (so dumping always
  works even without the debug flag)
- Add separate `--dumpi-graph` for dumping V3Graphs, which is again a
  special case of `--dumpi-<tag>`
- Alias `--dump-<tag>` to `--dumpi-<tag> 3` as before
2022-09-22 17:24:41 +01:00
Geza Lore 95145038b4 Generate AstNode accessors via astgen
Introduce the @astgen directives parsed by astgen, currently used for
the generation child node (operand) accessors. Please see the updated
internal documentation for details.
2022-09-21 14:05:27 +01:00
Geza Lore ce03293128 Generate AstNode accessors via astgen
Introduce the @astgen directives parsed by astgen, currently used for
the generation child node (operand) accessors. Please see the updated
internal documentation for details.
2022-09-21 13:56:03 +01:00
Wilson Snyder a214fd1f78 Internals: Fix constructor syntax in new develop-v5 code 2022-09-17 08:56:41 -04:00
Geza Lore af305bf280 Merge branch 'master' into develop-v5 2022-09-16 16:24:36 +01:00
Geza Lore 0c70a0dcbf Remove redundant 'virtual' keywords from overridden methods
'virtual' is redundant when 'override' is present, so keep only
'override'.

Add t/t_dist_cppstyle.pl to check for this.
2022-09-16 15:19:38 +01:00
Geza Lore 90ab746a42 Make it possible to parallelize ico and act scheduling sections
Small fixup patch so the 'ico' and 'act' scheduling sections could be
ordered as multi-threaded. However, we still only order these single
threaded at the moment (but switching them to multi-threaded now works).
2022-09-06 16:01:13 +01:00
Geza Lore 298f71f2b1 Merge branch 'master' into develop-v5 2022-09-02 12:19:35 +01:00
Geza Lore 5c828b7e60 V3Partition: use V3Lists to keep track of SiblingMCs
Replace std::set<SiblingMC> with V3Lists to keep track of SiblingMCs
associated with MTasks, use a std::set<LogicMTask*> for ensuring
uniqueness. This yields a bit more speed in PartContraction.
2022-09-01 19:40:44 +01:00
Geza Lore 4640bea31a V3Partition: More improvements for PartFixDataHazards
- Remove redundant loop through the MTask graph
- Gather variables directly from the OrderGraph, which is simpler and
  faster.
2022-09-01 16:30:04 +01:00
Geza Lore 875361d7ce
V3Partition: Reduce working set size of PartContraction (#3587)
This yields an additional 25% speedup of MT scheduling.
2022-09-01 16:29:40 +01:00
Geza Lore c0f9b0d8f6 V3Partition: Refactor initialization of MTask dependencies
No functional change
2022-08-31 16:54:04 +01:00
Geza Lore 505bba14eb Improve PartFixDataHazards for clarity and speed.
- Use modern C++
- Implement OrderLogicVertex->LogicMTask map with
  OrderLogicVertex::userp(), insteas of std::unordered_map
- Simplify data structures
- Simplify code and assert properties

No functional change.
2022-08-31 16:52:05 +01:00
Geza Lore ebbe24966c Remove unnecessary virtual methods 2022-08-31 16:52:05 +01:00
Geza Lore 881c3f6e40 Minor optimization of PartContraction
Remove rarely used debug code from initialization loop.
2022-08-31 16:52:05 +01:00
Geza Lore 5c356a4680 Merge branch 'master' into develop-v5 2022-08-22 14:32:06 +01:00
Geza Lore 9ac64d0b92 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-20 21:18:50 +01:00
Wilson Snyder ebb37b0156 Merge branch 'master' into develop-v5 2022-08-20 14:02:09 -04:00
Geza Lore 4d81eb021d Revert "Improve performance of MTask coarsening"
This reverts commit 83475008d9.
2022-08-19 18:03:45 +01:00
Geza Lore 83475008d9 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-19 16:59:20 +01:00