verilator

Commit Graph

Author	SHA1	Message	Date
Geza Lore	03ac7ad730	Make PartPropagateCp specific to the MTask graph While keeping the client code abstract in PartPropagateCp is nice for testing, there is performance to be had removing the abstraction. As this code dominates in scheduling large designs, we eliminate the abstraction and re-work the testing to use the actual LogicMTask and MTaskEdge graph types. No functional change intended.	2022-08-19 14:06:11 +01:00
Geza Lore	cd50949a7e	Reuse MTaskEdge instances in MT scheduling Instead of deleting then re-allocating MTaskEdge instances when merging two MTasks, just redirect the edges of the donor MTask to the recipient MTask. This is both faster as it avoids an allocation and a deletion, together with one update of the sibling maps, and also makes the algorithm more stable due to MergeCandidate IDs being stable and allocated up front for all MTaskEdges, before any SiblingMCs are allocated. Perturbations in output are expected as the IDs used to break ties between merge candidates with equal costs are not updated when redirecting an edge (on purpose). The relinking of only one end of the graph edges also perturbs the order in which they are enumerated, which does change candidate opportunities when the number of edges is larger than PART_SIBLING_EDGE_LIMIT. Confirmed output is identical when IDs are updated and edges are updated to appear in their original order.	2022-08-19 14:06:11 +01:00
Geza Lore	f0040c7b9a	Remove reliance on pointer comparison in MT scheduling The critical path propagation used to rely on a pointer comparison to break equal scoring critical path updates. Use the corresponding mtask ids instead, which is deterministic across invocations.	2022-08-19 14:06:11 +01:00
Geza Lore	f8a0389e73	Do not use stepCost when gathering sibling merge candidates siblingPairFromRelatives gathers neighbours of a vertex, and sorts them. It then takes the N best nodes, and creates sibling merge candidates from them. We now use the unadjusted cost instead of the step cost of the vertices when sorting. This is both faster as we need not do the log-space rounding to compute stepCost, and will also make similar but yet cheaper nodes appear closer to the front as we don't lose precision in rounding, hence they are more likely to be entered as merge candidates. Note that when creating the merge candidate, we still use the stepCost, so its purpose of reducing the propagation of critical path updates is maintained in full. In summary, this should make both Verilator and the generated model very slightly faster, at least in theory, and I have observed minor improvement in places.	2022-08-19 14:06:11 +01:00
Geza Lore	96a4b3e5a5	Update clang-format config and apply - Regroup and sort #include directives (like we used to, but automatic) - Set AlwaysBreakTemplateDeclarations to true	2022-08-05 12:00:24 +01:00
Geza Lore	fac8e76923	Rework SortByValueMap for better performance Keep a single std::set of key/value pairs, and a single unordered_map from key to iterators into the set. Also improve some of the accessing mechanisms using modern C++. This speeds up multi-threaded ordering by about 10%.	2022-08-03 21:17:02 +01:00
Geza Lore	b864f5f5ba	V3Partition: use static_cast with LogicMTaskVertex dynamic_cast is not free, and the mtask graph contains only LogicMTaskVertex vertices, use static_cast instead for some speedup.	2022-08-03 17:05:01 +01:00
Wilson Snyder	b9d7819faa	Internals: Fix some cppcheck issues. Some dump functions fixed.	2022-07-30 10:01:39 -04:00
Geza Lore	87f1e06c41	Small algorithmic improvement of PartContraction::siblingPairFromRelatives Use std::partial_sort for the non-exhaustive case. This is O(n) instead of O(nlog(n)) in the size of the candidate list being sorted. (It actually is O(nlog(k)), but k is constant 6 in the non-exhaustive case).	2022-07-12 19:10:01 +01:00
Geza Lore	7e8bafd217	Remove static data use from PartContraction::siblingPairFromRelatives Use std::sort with lambda rather than qsort with static function and static data. Verilation performance neutral.	2022-07-12 19:09:40 +01:00
HungMingWu	880a9be3b1	Internal: Add C++20ish reverse_view for range loops. No functional change (#3388 ). Signed-off-by: HungMingWu <u9089000@gmail.com>	2022-04-18 13:03:56 -04:00
Geza Lore	fbd568dc47	Prep for multiple AstExecGraph. No functional change.	2022-04-10 12:00:17 +01:00
Wilson Snyder	e02f97854c	Deprecate 'vluint64_t' and similar types (#3255 ).	2022-03-27 15:27:40 -04:00
Geza Lore	b1b5b5dfe2	Improve run-time profiling The --prof-threads option has been split into two independent options: 1. --prof-exec, for collecting verilator_gantt and other execution related profiling data, and 2. --prof-pgo, for collecting data needed for PGO The implementation of execution profiling is extricated from VlThreadPool and is now a separate class VlExecutionProfiler. This means --prof-exec can now be used for single-threaded models (though it does not measure a lot of things just yet). For consistency VerilatedProfiler is renamed VlPgoProfiler. Both VlExecutionProfiler and VlPgoProfiler are in verilated_profiler.{h/cpp}, but can be used completely independently. Also re-worked the execution profile format so it now only emits events without holding onto any temporaries. This is in preparation for some future optimizations that would be hindered by the introduction of function locals via AstText. Also removed the Barrier event. Clearing the profile buffers is not notably more expensive as the profiling records are trivially destructible.	2022-03-27 15:57:30 +02:00
Wilson Snyder	e6857df5c6	Internals: Rename Ast on non-node classes (#3262 ). No functional change. This commit has the following replacements applied: s/\bAstUserInUseBase\b/VNUserInUseBase/g; s/\bAstAttrType\b/VAttrType/g; s/\bAstBasicDTypeKwd\b/VBasicDTypeKwd/g; s/\bAstDisplayType\b/VDisplayType/g; s/\bAstNDeleter\b/VNDeleter/g; s/\bAstNRelinker\b/VNRelinker/g; s/\bAstNVisitor\b/VNVisitor/g; s/\bAstPragmaType\b/VPragmaType/g; s/\bAstType\b/VNType/g; s/\bAstUser1InUse\b/VNUser1InUse/g; s/\bAstUser2InUse\b/VNUser2InUse/g; s/\bAstUser3InUse\b/VNUser3InUse/g; s/\bAstUser4InUse\b/VNUser4InUse/g; s/\bAstUser5InUse\b/VNUser5InUse/g; s/\bAstVarType\b/VVarType/g;	2022-01-02 14:03:20 -05:00
Wilson Snyder	24a0d2a0c9	Internals: Favor member assignment initialization. No functional change intended.	2022-01-01 11:46:49 -05:00
Wilson Snyder	ca42be982c	Copyright year update.	2022-01-01 08:26:40 -05:00
Wilson Snyder	cd737065f2	Internals: More const. No functional change intended.	2021-11-26 17:55:36 -05:00
Wilson Snyder	010084201a	Internals: Remove dead code.	2021-11-26 16:15:08 -05:00
Wilson Snyder	05e12ab60e	Internals: More const. No functional change intended.	2021-11-26 10:52:45 -05:00
Wilson Snyder	37e3c6da70	Internals: Add more const. No functional change intended.	2021-11-13 13:50:44 -05:00
Geza Lore	e69a8e838d	Improve memory usage of V3Partition. Only performance change intended. (#3192 )	2021-11-05 22:08:54 -04:00
Wilson Snyder	61612582e6	Improve memory usage of V3Partition. Only performance change intended.	2021-11-04 07:39:28 -04:00
Wilson Snyder	da5644211f	Improve memory usage of V3Partition. Only performance change intended.	2021-11-03 22:01:40 -04:00
Wilson Snyder	c1d7bfa617	Internals: Skip some asserts in fastpath partitioning.	2021-11-03 19:19:23 -04:00
Wilson Snyder	c26ce25cea	Internals: Add more const. No functional change.	2021-11-03 17:49:19 -04:00
Wilson Snyder	9029da5ab8	Add profile-guided optimization of mtasks (#3150 ).	2021-09-26 22:51:11 -04:00
Wilson Snyder	c2819923c5	Verilator_gantt now shows the predicted mtask times, eval times, and additional statistics.	2021-09-23 22:59:36 -04:00
Wilson Snyder	68f1432a68	Gantt: Subtract common start in slowpath to reduce collection measurement error.	2021-09-23 19:43:20 -04:00
Wilson Snyder	8ecdc85cf7	Internals: C++11 style cleanups. No functional change.	2021-07-11 18:42:01 -04:00
Wilson Snyder	c7499133b2	Internals: C++11 for bool. No functional change.	2021-07-11 10:42:32 -04:00
Morten Borup Petersen	fd0446f481	Internals: Add .dot graph visualization of ThreadSchedule (#3048 ) * Move MTaskState to ThreadSchedule MTaskState does not concern itself with sandbagging, and thus solely contains information related to the finalized schedule, i.e., completion time, thread ID and next MTask on thread. * Add .dot graph visualization of ThreadSchedule Follow-up to #2779. This commit adds the creation of .dot files - used by GraphViz - to visualize how mtasks are statically scheduled across the set of specified threads. We visualize each thread as a row, with nodes of a row being the mtasks scheduled for the given thread. The width of the mtask nodes are proportional to their cost. MTask dependencies are shown using an edge between the source and sink mtasks.	2021-07-06 07:06:00 -04:00
Geza Lore	708abe0dd1	Introduce model interface class, make $root part of Syms (#3036 ) This patch implements #3032. Verilator creates a module representing the SystemVerilog $root scope (V3LinkLevel::wrapTop). Until now, this was called the "TOP" module, which also acted as the user instantiated model class. Syms used to hold a pointer to this root module, but hold instances of any submodule. This patch renames this root scope module from "TOP" to "$root", and introduces a separate model class which is now an interface class. As the root module is no longer the user interface class, it can now be made an instance of Syms, just like any other submodule. This allows absolute references into the root module to avoid an additional pointer indirection resulting in a potential speedup (about 1.5% on OpenTitan). The model class now also contains all non design specific generated code (e.g.: eval loops, trace config, etc), which additionally simplifies Verilator internals. Please see the updated documentation for the model interface changes.	2021-06-30 16:35:40 +01:00
Wilson Snyder	512fe0a2d1	Internals: Add const. No functional change.	2021-06-20 18:33:13 -04:00
Geza Lore	a8f83d5758	Construct AstExecGraph implementation outside of V3EmitC. (#3022 ) The goal of this patch is to move functionality related to constructing the thread entry points and then invoking them out of V3EmitC (and into V3Partition). The long term goal being enabling V3EmitC to emit functions partitioned based on header dependencies. V3EmitC having to deal with only AstCFunc instances and no other magic will facilitate this. In this patch: - We construct AstCFuncs for each thread entry point in V3Partition::finalize and move AstMTaskBody nodes under these functions. - Add the invocation of the threads as text statements within the AstExecGraph, so they are still invoked where the exec graph is located. (the entry point functions are still referenced via AstCCall or AstAddOrCFunc, so lazy declarations of referenced functions are created automatically). - Explicitly handle MTask state variables (VlMTaskVertex in verilated_threads.h) within Verilator, so no need to text bash a lot of these any more (some text refs still remain but they are all created next to each other within V3Partition.cpp). The effect of all this on the emitted code should be nothing but some identifier/ordering changes. No functional change intended.	2021-06-16 12:18:56 +01:00
Wilson Snyder	3a55600913	Internals: Restyle with C++11 using replacing typedef	2021-03-12 18:10:45 -05:00
Wilson Snyder	404b323f8c	Internals: Remove some unnecessary typedefs. No functional change.	2021-03-12 17:26:53 -05:00
Wilson Snyder	9650aefa42	Internals: Cleanup unneeded {}. No functional change	2021-02-21 21:25:21 -05:00
Wilson Snyder	bd602d0e2d	Copyright year update	2021-01-01 10:29:54 -05:00
Wilson Snyder	fa20614277	Fix Ubuntu 16.04 LTS warning	2020-12-02 19:20:03 -05:00
Wilson Snyder	b6ded59c2b	Internals: Use and enforce class final for ~5% performance boost.	2020-11-18 21:32:16 -05:00
Wilson Snyder	c0888c1b0f	Internals: Use newline instead of endl to avoid unneeded flush.	2020-11-18 21:03:23 -05:00
Wilson Snyder	1b0a48ea02	Internals: Use C++11 = default where obvious. No functional change intended.	2020-11-16 19:56:16 -05:00
Wilson Snyder	f4ef4ad9f3	Internals: Favor std::array where easy. No functional change intended.	2020-11-15 16:21:26 -05:00
Wilson Snyder	79d33bf1ee	Use C++11 for loops, from clang-migrate. No functional change intended	2020-11-10 22:10:38 -05:00
Wilson Snyder	44eb362a18	clang-tidy cleanups. No functional change intended.	2020-11-10 21:40:14 -05:00
Wilson Snyder	45eccaecaf	Fix Travis/GCC warnings.	2020-08-27 18:48:26 -04:00
Wilson Snyder	6013b54f7b	clang-tidy cleanups. No functional change intended.	2020-08-16 14:55:46 -04:00
Wilson Snyder	d75a8624c1	C++11: constexpr replacing defines. No functional change intended.	2020-08-16 14:19:12 -04:00
Wilson Snyder	ee9d6dd63f	C++11: Favor auto, range for. No functional change intended.	2020-08-16 11:44:06 -04:00
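
Note on 87f1e06c41 and f0040c7b9a: the two ideas described in those messages (selecting only the best few candidates with std::partial_sort, and breaking cost ties by a stable id rather than by pointer value) can be illustrated with a minimal sketch. This is not the code from V3Partition.cpp; the Candidate struct, cheaperFirst, and cheapestK are hypothetical names chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a neighbouring vertex; not a Verilator type.
struct Candidate {
    int id;         // stable identifier, used to break ties deterministically
    unsigned cost;  // unadjusted cost used for ordering
};

// Order by cost first, then by id, so equal costs never depend on addresses.
inline bool cheaperFirst(const Candidate& a, const Candidate& b) {
    if (a.cost != b.cost) return a.cost < b.cost;
    return a.id < b.id;
}

// Return the k cheapest candidates in sorted order.
// std::partial_sort needs roughly n*log(k) comparisons, which is linear in n
// when k is a small constant.
inline std::vector<Candidate> cheapestK(std::vector<Candidate> v, std::size_t k) {
    const std::size_t n = std::min(k, v.size());
    const auto middle = v.begin() + static_cast<std::ptrdiff_t>(n);
    std::partial_sort(v.begin(), middle, v.end(), cheaperFirst);
    v.erase(middle, v.end());  // keep only the sorted prefix
    assert(std::is_sorted(v.begin(), v.end(), cheaperFirst));
    return v;
}
```

Calling cheapestK(candidates, 6) would mirror the constant k = 6 mentioned in the commit message; elements beyond the first k are left unsorted by std::partial_sort and are simply dropped here.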


84 Commits
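
Note on fac8e76923: the layout that message describes (one std::set of key/value pairs plus one unordered_map from key to iterators into the set) might look roughly like the sketch below. It is an illustration written under assumptions, not the actual SortByValueMap: the class name ValueOrderedMap and the methods put, erase, and minKey are hypothetical, and ties between equal values are broken by key, which assumes Key supports operator<.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <unordered_map>
#include <utility>

// Illustrative only: a map whose entries can be visited in value order.
template <typename Key, typename Value>
class ValueOrderedMap {
    using Entry = std::pair<Value, Key>;  // value first, so the set orders by value
    using SetIt = typename std::set<Entry>::iterator;

    std::set<Entry> m_byValue;                 // all entries, sorted by (value, key)
    std::unordered_map<Key, SetIt> m_byKey;    // key -> position in m_byValue

public:
    // Insert a key, or update the value of an existing key.
    void put(const Key& key, const Value& value) {
        const auto it = m_byKey.find(key);
        if (it != m_byKey.end()) {
            m_byValue.erase(it->second);  // drop the stale entry
            it->second = m_byValue.emplace(value, key).first;
        } else {
            m_byKey.emplace(key, m_byValue.emplace(value, key).first);
        }
    }

    // Remove a key; returns false if it was not present.
    bool erase(const Key& key) {
        const auto it = m_byKey.find(key);
        if (it == m_byKey.end()) return false;
        m_byValue.erase(it->second);
        m_byKey.erase(it);
        return true;
    }

    std::size_t size() const { return m_byValue.size(); }

    // Key holding the smallest value; the map must not be empty.
    const Key& minKey() const {
        assert(!m_byValue.empty());
        return m_byValue.begin()->second;
    }
};
```

The sketch relies on std::set iterators staying valid when other elements are inserted or erased, which is what allows the unordered_map to store them. Lookup by key is a hash lookup, while updates and removals cost one logarithmic set operation each.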