Commit Graph

42 Commits

Author SHA1 Message Date
Bartłomiej Chmiel ffe76717c6
Thread pool rewrite (#5161)
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
Signed-off-by: Bartłomiej Chmiel <bchmiel@antmicro.com>
Signed-off-by: Arkadiusz Kozdra <akozdra@antmicro.com>
Co-authored-by: Krzysztof Bieganski <kbieganski@antmicro.com>
Co-authored-by: Arkadiusz Kozdra <akozdra@antmicro.com>
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
2024-08-23 08:36:49 -04:00
Wilson Snyder e76f29e5ba Copyright year update 2024-01-01 03:19:59 -05:00
Mariusz Glebocki 28bd7e5b19
Rework multithreading handling to separate by code units that use/never use it. (#4228) 2023-09-24 22:12:23 -04:00
Wilson Snyder b24d7c83d3 Copyright year update 2023-01-01 10:18:39 -05:00
Geza Lore 9ac64d0b92 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-20 21:18:50 +01:00
Geza Lore 4d81eb021d Revert "Improve performance of MTask coarsening"
This reverts commit 83475008d9.
2022-08-19 18:03:45 +01:00
Geza Lore 83475008d9 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-19 16:59:20 +01:00
Geza Lore 96a4b3e5a5 Update clang-format config and apply
- Regroup and sort #include directives (like we used to, but automatic)
- Set AlwaysBreakTemplateDeclarations to true
2022-08-05 12:00:24 +01:00
Geza Lore fac8e76923 Rework SortByValueMap for better performance
Keep a single std::set of key/value pairs, and a single unordered_map
from key to iterators into the set. Also improve some of the accessing
mechanisms using modern C++. This speeds up multi-threaded ordering by
about 10%.
2022-08-03 21:17:02 +01:00
Wilson Snyder 15b32dc140 Internals: cpplint cleanups. No functional change. 2022-01-08 12:01:39 -05:00
Wilson Snyder 24a0d2a0c9 Internals: Favor member assignment initialization. No functional change intended. 2022-01-01 11:46:49 -05:00
Wilson Snyder ca42be982c Copyright year update. 2022-01-01 08:26:40 -05:00
Wilson Snyder cd737065f2 Internals: More const. No functional change intended. 2021-11-26 17:55:36 -05:00
Wilson Snyder 8b00939f0c Improve performance of V3Scoreboard. Only performance change intended. 2021-11-03 22:16:18 -04:00
Wilson Snyder c26ce25cea Internals: Add more const. No functional change. 2021-11-03 17:49:19 -04:00
Wilson Snyder 512fe0a2d1 Internals: Add const. No functional change. 2021-06-20 18:33:13 -04:00
Wilson Snyder 3a55600913 Internals: Restyle with C++11 using replacing typedef 2021-03-12 18:10:45 -05:00
Wilson Snyder 404b323f8c Internals: Remove some unnecessary typedefs. No functional change. 2021-03-12 17:26:53 -05:00
Wilson Snyder be31fdcfe4 Use Google-style-guide header guard naming, to avoid __ prefix. 2021-03-03 21:57:07 -05:00
Wilson Snyder bd602d0e2d Copyright year update 2021-01-01 10:29:54 -05:00
Wilson Snyder b6ded59c2b Internals: Use and enforce class final for ~5% performance boost. 2020-11-18 21:32:16 -05:00
Wilson Snyder 1b0a48ea02 Internals: Use C++11 = default where obvious. No functional change intended. 2020-11-16 19:56:16 -05:00
Wilson Snyder ee9d6dd63f C++11: Favor auto, range for. No functional change intended. 2020-08-16 11:44:06 -04:00
Wilson Snyder 72d2cff0a1 C++11: Use member declaration initalizations. No functional change intended. 2020-08-16 11:44:06 -04:00
Wilson Snyder c0127599df C++11: Use nullptr. No functional change. 2020-08-16 11:44:05 -04:00
Wilson Snyder 7c54a451a9 C++11: Remove pre-c11 VL_OVERRIDE etc. No functional change. 2020-08-16 11:44:05 -04:00
Wilson Snyder 9927e8b3ee clang-format uses C++11 style. No functional change. 2020-08-15 09:48:08 -04:00
Wilson Snyder 17e7da77f0 Misc internal coverage improvements. 2020-05-16 18:02:54 -04:00
Wilson Snyder 57a937df03 Misc internal coverage cleanups 2020-05-16 07:43:22 -04:00
Wilson Snyder 5c966ec510 clang-format many files. No functional change.
Use nodist/clang_formatter to reformat files that are now clean.
2020-04-13 22:52:23 -04:00
Wilson Snyder 38a31ae168 Cleanup misc clang-tidy warnings. No functional change intended 2020-04-03 22:31:54 -04:00
Wilson Snyder 1ce360ed5b Add SPDX license identifiers. No functional change. 2020-03-21 11:24:24 -04:00
Wilson Snyder 0aabe6ce00 Internals: Fix cppcheck warning including missing init. 2020-02-03 22:10:29 -05:00
Wilson Snyder f23fe8fd84 Update copyright year. 2020-01-06 18:05:53 -05:00
Wilson Snyder 5811ec07e6 Update URLs to https://verilator.org 2019-11-07 22:33:59 -05:00
Wilson Snyder 771a301f66 Commentary: Remove newlines, upsets some patches. No functional change. 2019-10-04 20:17:11 -04:00
Wilson Snyder 01ef7122e9 Internals: Add lcov code coverage markers. 2019-06-30 22:37:03 -04:00
Wilson Snyder 8a4aeddbb0 Copyright year update. 2019-01-03 19:17:22 -05:00
Wilson Snyder d87b9d25ca Internals: Cleanup and standardize include order. No functional change intended. 2018-10-14 13:59:40 -04:00
Kevin Kiningham b2ca4995ab Fix Mac OSX 10.13.6 / LLVM 9.1 compile issues, bug1348. 2018-09-17 18:35:23 -04:00
Wilson Snyder 0070520edb Fix cppcheck warnings. 2018-07-22 20:41:46 -04:00
Wilson Snyder e37dce9d85 Internals: Add new graph algs for future partitioning. 2018-07-15 22:09:27 -04:00