verilator

Author	SHA1	Message	Date
Wilson Snyder	fd12ab3413	Fix interface exposure with `--public-depth` or `--trace-depth` (#5758).	2025-09-23 22:05:51 -04:00
Geza Lore	327d55d13d	Internals: Fix remaining cppcheck errors (#6319) Fixed the non const-related issue and added suppressions for the const ones. With that `make cppcheck` should be clean.	2025-08-21 09:43:37 +01:00
Geza Lore	d1f71f2342	Internals: Improve V3Rtti for cppcheck (#6312) Rewrite with much less running around in the templates. Use private methods only + friend functions that do the actual type check. This avoids cppcheck warnings.	2025-08-19 23:05:34 +01:00
Wilson Snyder	680236b03e	Internals: Redo post-error additional information to be part of error calls.	2025-05-10 16:20:12 -04:00
Wilson Snyder	8fbb725f34	Copyright year update.	2025-01-01 08:30:25 -05:00
Wilson Snyder	0c820c3068	Internals: Standardize template argument names. No functional change.	2024-11-29 20:20:38 -05:00
Bartłomiej Chmiel	ffe76717c6	Thread pool rewrite (#5161) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com> Signed-off-by: Bartłomiej Chmiel <bchmiel@antmicro.com> Signed-off-by: Arkadiusz Kozdra <akozdra@antmicro.com> Co-authored-by: Krzysztof Bieganski <kbieganski@antmicro.com> Co-authored-by: Arkadiusz Kozdra <akozdra@antmicro.com> Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>	2024-08-23 08:36:49 -04:00
Geza Lore	98206a4f04	Improve V3List user interface (#4996)	2024-03-25 23:06:25 +00:00
Geza Lore	6ffff8565f	Use the same serial ordering within MTasks as we use in serial mode (#4994) The goal here is to use as single ordering heuristic (which can be improved later) within MTasks as we do for serial code ordering. The heuristic itself is factored out into the new OrderMoveGraphSerializer. This also yields slightly nicer ordering than the previously use GraphStream, so we end up with fewer trigger (domain) conditionals in the MTasks, this can be worth a few percent speedup. This has the somewhat nice side-effect of reusing OrderMoveGraphVertex for both serial and parallel mode, so MTaskMoveGraphVertex can be removed. Serial mode yields identical output.	2024-03-17 13:15:39 +00:00
Geza Lore	c0391990ad	Increase graph ParallelismReprot values to uint64_t	2024-03-10 18:56:31 +00:00
Geza Lore	a686e547cf	Factor out graph parallelism report into a generic algorithm (#4957) This is a generic algorithm parametrised by a cost function, so implement it as such for easy reuse.	2024-03-10 14:56:43 +00:00
Wilson Snyder	e76f29e5ba	Copyright year update	2024-01-01 03:19:59 -05:00
Wilson Snyder	9fd5634778	Internals: Remove unneeded private's. No functional change	2023-11-13 21:37:45 -05:00
Wilson Snyder	c8063e5732	Internals: Misc cleanups in V3Graph and V3Dead. No functional change.	2023-11-12 22:08:08 -05:00
Wilson Snyder	7ba6647c4f	Internals: Cleanup some V3Graph constructors/funcs and docs. No functional change.	2023-10-28 20:11:28 -04:00
Wilson Snyder	bcbe5059a9	Internal: V3Graph style cleanup. No functional change	2023-10-22 09:50:38 -04:00
Mariusz Glebocki	28bd7e5b19	Rework multithreading handling to separate by code units that use/never use it. (#4228)	2023-09-24 22:12:23 -04:00
Wilson Snyder	d72f1b89fc	Internals: Minor internal code coverage cleanups	2023-09-10 18:53:51 -04:00
Krzysztof Bieganski	ffbbd438ae	Internals: Use runtime type info instead of `dynamic_cast` for faster graph type checks (#4397)	2023-08-31 18:00:53 -04:00
Anthony Donlon	cf6566b9bc	Internal: Optimize program size by refactoring error reporting routines (#4446)	2023-08-29 16:54:32 -04:00
Kamil Rakoczy	93d50c4499	Internals: Add mutex to V3Error (#3680)	2023-02-09 22:15:37 -05:00
Wilson Snyder	b24d7c83d3	Copyright year update	2023-01-01 10:18:39 -05:00
Wilson Snyder	a0e7930036	docs: Fix spelling	2022-12-09 22:39:41 -05:00
Wilson Snyder	833780fac1	Internal: cppcheck fixes. No functional change intended.	2022-11-27 05:52:40 -05:00
Wilson Snyder	0c75d4eaca	Internals: Fix constructor style.	2022-11-10 22:58:27 -05:00
Geza Lore	050060b139	Make enum constructors and operators constexpr	2022-09-23 11:10:28 +01:00
Geza Lore	63c694f65f	Streamline dump control options - Rename `--dump-treei` option to `--dumpi-tree`, which itself is now a special case of `--dumpi-<tag>` where tag can be a magic word, or a filename - Control dumping via static `dump*()` functions, analogous to `debug()` - Make dumping independent of the value of `debug()` (so dumping always works even without the debug flag) - Add separate `--dumpi-graph` for dumping V3Graphs, which is again a special case of `--dumpi-<tag>` - Alias `--dump-<tag>` to `--dumpi-<tag> 3` as before	2022-09-22 17:24:41 +01:00
Geza Lore	38a8d7fb2e	Remove redundant 'inline' keywords from definitions Also add checks to t/t_dist_cppstyle	2022-09-16 15:52:25 +01:00
Geza Lore	9ac64d0b92	Improve performance of MTask coarsening Various optimizations to speed up MTasks coarsening (which is the long pole in the multi-threaded scheduling of very large designs). The biggest impact ones: - Use efficient hand written Pairing Heaps for implementing priority queues and the scoreboard, instead of the old SortByValueMap. This helps us avoid having to sort a lot of merge candidates that we will never actually consider and helps a lot in performance. - Remove unnecessary associative containers and store data structures (the heap nodes in particular) directly in the object they relate to. This eliminates a huge amount of lookups and helps a lot in performance. - Distribute storage for SiblingMC instances into the LogicMTask instances, and combine with the sibling maps. This again eliminates hash table lookups and makes storage structures smaller. - Remove some now bidirectional edge maps, keep only the forward map. There are also some other smaller optimizations: - Replaced more unnecessary dynamic_casts with static_casts - Templated some functions/classes to reduce the number of static branches in loops. - Improves sorting of edges for sibling candidate creation - Various micro-optimizations here and there This speeds up MTask coarsening by 3.8x on a large design, which translates to a 2.5x speedup of the ordering pass in multi-threaded mode. (Combined with the earlier optimizations, ordering is now 3x faster.) Due to the elimination of a lot of the auxiliary data structures, and ensuring a minimal size for the necessary ones, memory consumption of the MTask coarsening is also reduced (measured up to 4.4x reduction though the accuracy of this is low). The algorithm is identical except for minor alterations of the order some candidates are added or removed, this can cause perturbation in the output due to tied scores being broken based on IDs.	2022-08-20 21:18:50 +01:00
Geza Lore	4d81eb021d	Revert "Improve performance of MTask coarsening" This reverts commit `83475008d9`.	2022-08-19 18:03:45 +01:00
Geza Lore	83475008d9	Improve performance of MTask coarsening Various optimizations to speed up MTasks coarsening (which is the long pole in the multi-threaded scheduling of very large designs). The biggest impact ones: - Use efficient hand written Pairing Heaps for implementing priority queues and the scoreboard, instead of the old SortByValueMap. This helps us avoid having to sort a lot of merge candidates that we will never actually consider and helps a lot in performance. - Remove unnecessary associative containers and store data structures (the heap nodes in particular) directly in the object they relate to. This eliminates a huge amount of lookups and helps a lot in performance. - Distribute storage for SiblingMC instances into the LogicMTask instances, and combine with the sibling maps. This again eliminates hash table lookups and makes storage structures smaller. - Remove some now bidirectional edge maps, keep only the forward map. There are also some other smaller optimizations: - Replaced more unnecessary dynamic_casts with static_casts - Templated some functions/classes to reduce the number of static branches in loops. - Improves sorting of edges for sibling candidate creation - Various micro-optimizations here and there This speeds up MTask coarsening by 3.8x on a large design, which translates to a 2.5x speedup of the ordering pass in multi-threaded mode. (Combined with the earlier optimizations, ordering is now 3x faster.) Due to the elimination of a lot of the auxiliary data structures, and ensuring a minimal size for the necessary ones, memory consumption of the MTask coarsening is also reduced (measured up to 4.4x reduction though the accuracy of this is low). The algorithm is identical except for minor alterations of the order some candidates are added or removed, this can cause perturbation in the output due to tied scores being broken based on IDs.	2022-08-19 16:59:20 +01:00
Geza Lore	cd50949a7e	Reuse MTaskEdge instances in MT scheduling Instead of deleting then re-allocating MTaskEdge instances when merging two MTasks, just redirect the edged of the donor MTask to the recipient MTask. This is both faster as it avoids an allocation and a deletion, together with one update of the sibling maps, and also makes the algorithm more stable due to MergeCandidate IDs being stable and allocated up front for all MTaskEdges, before any SiblingMCs are allocated. Perturbations in output are expected as the IDs used to break ties between merge candidates with equal costs are not updated when redirecting an edge (on purpose). The relinking of only one end of the graph edges also perturbs the order in which they are enumerated, which does change candidate opportunities when the number of edges is larger than PART_SIBLING_EDGE_LIMIT. Confirmed output is identical when IDs are updated and edges are updated to appear in their original order.	2022-08-19 14:06:11 +01:00
Geza Lore	b436794773	Add specialized GraphStreamUnordered GraphStreamUnordered used to be GraphStream<std::less<const V3GraphVertex*>>, but a lot of performance improvements can be had by a specialized implementation, so added a highly optimized one. This helps a lot with --debug-partition.	2022-08-19 14:06:11 +01:00
Geza Lore	a2792785fe	Add V3GraphVertex::dotRank to add GraphViz ranks to graph dumps This is a simple debugging aid to allow constraining the graph layout via GraphViz rank directives. Note this is not related in any way to the vertex 'rank' attribute used by some of the graph algorithms. No functional change.	2022-05-02 10:27:26 +01:00
Geza Lore	2ba9eb4228	Speed up TSP sort implementation - More efficient comparison by pre-computing sorting keys. - Remove work items in algorithms known to be redundant earlier. This greatly reduces data structure sizes. - Use V3GraphVertex->user() for state tracking instead of unordered_map while both of these are constant time, they do add up. - In `makeMinSpanningTree`, instead of batch inserting outgoing edges of each visited vertex into an ordered set, keep an ordered set of sorted vectors of edges. This reduces the size of the ordered set significantly (it is now O(V) rather than O(E), and as the subject graph is a complete graph, V ~ sqrt(E), so this is a significant gain). - Use a vector + sorting in `perfectMatching` instead of an ordered set. This is faster on large working sets. This yields 3.8x speedup on the variable order pass and overall 14% verilation speed gain on a large design.	2022-01-07 12:05:52 +00:00
Wilson Snyder	ca42be982c	Copyright year update.	2022-01-01 08:26:40 -05:00
Geza Lore	987ce927eb	Remove unused code. No functional change.	2021-11-09 19:46:19 +00:00
Wilson Snyder	3a55600913	Internals: Restyle with C++11 using replacing typedef	2021-03-12 18:10:45 -05:00
Wilson Snyder	be31fdcfe4	Use Google-style-guide header guard naming, to avoid __ prefix.	2021-03-03 21:57:07 -05:00
Wilson Snyder	bd602d0e2d	Copyright year update	2021-01-01 10:29:54 -05:00
Wilson Snyder	c23de458ed	Misc internal coverage cleanups	2020-12-08 08:40:22 -05:00
Wilson Snyder	b6ded59c2b	Internals: Use and enforce class final for ~5% performance boost.	2020-11-18 21:32:16 -05:00
Wilson Snyder	1b0a48ea02	Internals: Use C++11 = default where obvious. No functional change intended.	2020-11-16 19:56:16 -05:00
Wilson Snyder	b67f1f0e94	Fix GCC warnings	2020-08-18 08:10:44 -04:00
Wilson Snyder	78aee6f4e7	C++11: Use sized enums (+4% performance).	2020-08-16 12:05:35 -04:00
Wilson Snyder	034737d2a8	C++11: Use member declaration initalizations (in nodes). No functional change intended.	2020-08-16 11:44:06 -04:00
Wilson Snyder	c0127599df	C++11: Use nullptr. No functional change.	2020-08-16 11:44:05 -04:00
Wilson Snyder	5c966ec510	clang-format many files. No functional change. Use nodist/clang_formatter to reformat files that are now clean.	2020-04-13 22:52:23 -04:00
Wilson Snyder	1ce360ed5b	Add SPDX license identifiers. No functional change.	2020-03-21 11:24:24 -04:00
Wilson Snyder	0aabe6ce00	Internals: Fix cppcheck warning including missing init.	2020-02-03 22:10:29 -05:00

100 Commits