verilator

Commit Graph

Author	SHA1	Message	Date
Geza Lore	38a8d7fb2e	Remove redundant 'inline' keywords from definitions Also add checks to t/t_dist_cppstyle	2022-09-16 15:52:25 +01:00
Geza Lore	0c70a0dcbf	Remove redundant 'virtual' keywords from overridden methods 'virtual' is redundant when 'override' is present, so keep only 'override'. Add t/t_dist_cppstyle.pl to check for this.	2022-09-16 15:19:38 +01:00
Geza Lore	d16619fe86	astgen: Explicitly generate AstNode members Generate boilerplate members of AstNode sub-types directly via astgen. This is in preparation for generating additional members.	2022-09-16 11:18:20 +01:00
Wilson Snyder	2dc85a5acd	Internals: enum constructor cleanups. No functional change intended.	2022-09-15 19:58:10 -04:00
Kamil Rakoczy	dbe1348b4c	Tests: Fix earlier commit, add build jobs to stats (#3623 ) (#3626 )	2022-09-15 11:29:50 -04:00
Geza Lore	22846df03e	Merge branch 'master' into develop-v5	2022-09-15 14:01:19 +01:00
Wilson Snyder	d74536a4dc	Internals: Cleanup some constructors. No functional change intended.	2022-09-15 08:54:04 -04:00
Kamil Rakoczy	da20da264b	Add --build-jobs, and rework arguments for -j (#3623 )	2022-09-15 08:28:58 -04:00
Geza Lore	22b9dfb9c9	Split and re-order AstNode definitions (#3622 ) - Move DType representations into V3AstNodeDType.h - Move AstNodeMath and subclasses into V3AstNodeMath.h - Move any other AstNode subtypes into V3AstNodeOther.h - Fix up out-of-order definitions via inline methods and implementations in V3Inlines.h and V3AstNodes.cpp - Enforce declaration order of AstNode subtypes via astgen, which will now fail when definitions are mis-ordered.	2022-09-15 13:10:39 +01:00
Geza Lore	27031ed688	Merge branch 'master' into develop-v5	2022-09-15 10:28:35 +01:00
Wilson Snyder	d85b909054	Internals: Use std:: for mem and str functions.	2022-09-14 21:10:19 -04:00
Wilson Snyder	75fd71d7e5	Add --main to generate main() C++ (previously was experimental only) (#3265 ).	2022-09-14 20:18:40 -04:00
Ryszard Rozak	a3c58d7b70	Support IEEE constant signal strengths (#3601 ).	2022-09-14 07:39:27 -04:00
Kamil Rakoczy	ae466b1703	Internals: Improve Verilation peak memory usage in V3Subst (#3512 ).	2022-09-14 07:37:51 -04:00
Geza Lore	2564484429	astgen: Rewrite in a more OOP way, in preparation for extensions Rely less on strings and represent AstNode classes as a 'class Node', with all associated properties kept together, rather than distributed over multiple dictionaries or constructed at retrieval time. No functional change intended.	2022-09-13 21:54:12 +01:00
Kamil Rakoczy	93a044f587	Internals: Rework addFilesp towards parallel emit (#3620 ). No functional change intended.	2022-09-13 12:15:34 -04:00
Wilson Snyder	81fe35ee2e	Fix typedef'ed class conversion to boolean (#3616 ).	2022-09-12 18:03:56 -04:00
Geza Lore	08b6bdddf9	Update default --mod-prefix when --prefix is repeated Fixes #3603	2022-09-12 17:25:09 +01:00
Kamil Rakoczy	4d49db48a3	Internals: Remove usage of user1 from EmitCTrace (#3617 ). No functional change intended.	2022-09-12 12:00:41 -04:00
Kamil Rakoczy	9b2266f68c	Internals: Remove usage of global state in V3EmitCFunc (#3615 ). No functional change intended.	2022-09-12 11:59:14 -04:00
Wilson Snyder	752f425025	Tests: Process/Semaphore/Mailbox testing (all fail until supported)	2022-09-11 13:05:24 -04:00
Gustav Svensk	47262cd4ec	Fix arguments in non-static method call (#3547 ) (#3582 )	2022-09-11 12:33:31 -04:00
Wilson Snyder	47e64535d6	Commentary	2022-09-11 12:25:44 -04:00
Geza Lore	90ab746a42	Make it possible to parallelize ico and act scheduling sections Small fixup patch so the 'ico' and 'act' scheduling sections could be ordered as multi-threaded. However, we still only order these single threaded at the moment (but switching them to multi-threaded now works).	2022-09-06 16:01:13 +01:00
Geza Lore	fd6275a62b	Merge branch 'master' into develop-v5	2022-09-05 17:03:43 +01:00
Krzysztof Bieganski	6b6790fc50	Preserve return type of `AstNode::addNext` via templating (#3597 ) No functional change intended. Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 16:56:57 +01:00
Krzysztof Bieganski	fb931087ab	Add stats tracking for `V3Undriven`. (#3600 ) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 16:20:38 +01:00
Krzysztof Bieganski	a2e1b32a1c	Fix inlining of forks (#3594 ) Before this change, some forked processes were being inlined in `V3Timing` because they contained no `CAwait`s. This only works under the assumption that no `CAwait`s will be added there later, which is not true, as a function called by a forked process could be turned into a coroutine later. The call would be wrapped in a new `CAwait`, but the process itself would have already been inlined at this point. This commit moves the inlining to `transformForks` in `V3SchedTiming`, which is called at a point when all `CAwait`s are already in place. Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 15:19:19 +01:00
Krzysztof Bieganski	54f89bce42	Move `SenExprBuilder` to a header. (#3598 ) No functional change intended. Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 15:17:51 +01:00
Krzysztof Bieganski	8b19d02e3b	Fix `co_await VlNow{}` being added too many times (#3596 ) (or not at all) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-09-05 11:46:34 +01:00
Geza Lore	937e893b6d	Build verilator_bin with -O3 (#3592 ) This is consistently a few percent faster.	2022-09-03 22:10:07 +01:00
Geza Lore	d42a2d6494	Fix V3Gate crash on circular logic The recent patch to defer substitutions on V3Gate crashes on circular logic that has cycle length >= 3 with all inlineable signals (cycle length 2 is detected correctly and is not inlined). Fix by stopping recursion at the loop-back edge. Fixes #3543	2022-09-02 19:58:58 +01:00
Geza Lore	8e8f4b1e5c	Remove AstVarScope::valuep() and related code This is detritus from when V3TraceDecl used to run after V3Gate, today V3TraceDecl runs before V3Gate and this value has no function at all. No functional change intended.	2022-09-02 16:44:13 +01:00
Geza Lore	298f71f2b1	Merge branch 'master' into develop-v5	2022-09-02 12:19:35 +01:00
Geza Lore	2ba39b25f1	Replace dynamic_casts with static_casts dynamic_cast is not free. Replace obvious instances (where the result is unconditionally dereferenced) with static_cast in contexts with performance implications.	2022-09-02 12:08:34 +01:00
Geza Lore	5c828b7e60	V3Partition: use V3Lists to keep track of SiblingMCs Replace std::set<SiblingMC> with V3Lists to keep track of SiblingMCs associated with MTasks, use a std::set<LogicMTask*> for ensuring uniqueness. This yields a bit more speed in PartContraction.	2022-09-01 19:40:44 +01:00
Geza Lore	4640bea31a	V3Partition: More improvements for PartFixDataHazards - Remove redundant loop through the MTask graph - Gather variables directly from the OrderGraph, which is simpler and faster.	2022-09-01 16:30:04 +01:00
Geza Lore	875361d7ce	V3Partition: Reduce working set size of PartContraction (#3587 ) This yields an additional 25% speedup of MT scheduling.	2022-09-01 16:29:40 +01:00
Wilson Snyder	849bb5590a	Merge branch 'master' into develop-v5	2022-08-31 19:51:07 -04:00
Wilson Snyder	51daa64e9a	Fix --hierarchical with order-based pin connections (#3585 ).	2022-08-31 18:12:21 -04:00
Geza Lore	c0f9b0d8f6	V3Partition: Refactor initialization of MTask dependencies No functional change	2022-08-31 16:54:04 +01:00
Geza Lore	505bba14eb	Improve PartFixDataHazards for clarity and speed. - Use modern C++ - Implement OrderLogicVertex->LogicMTask map with OrderLogicVertex::userp(), instead of std::unordered_map - Simplify data structures - Simplify code and assert properties No functional change.	2022-08-31 16:52:05 +01:00
Geza Lore	ebbe24966c	Remove unnecessary virtual methods	2022-08-31 16:52:05 +01:00
Geza Lore	881c3f6e40	Minor optimization of PartContraction Remove rarely used debug code from initialization loop.	2022-08-31 16:52:05 +01:00
Geza Lore	546aeab9f2	V3Order: Minor refactoring for clarity Refactor ProcessMoveBuildGraph utilizing the fact that OrderGraph is a bipartite graph, also remove unnecessary unordered_map and distribute variable domain map. No functional change.	2022-08-31 16:52:05 +01:00
Geza Lore	8de21e9bb7	Document and ensure OrderGraph is bipartite Minor refactoring and documentation. No functional change.	2022-08-31 16:52:05 +01:00
Geza Lore	2ecda74471	Merge branch 'master' into develop-v5	2022-08-31 10:45:18 +01:00
Aleksander Kiryk	2136afde6b	Support negated properties (#3572 )	2022-08-30 06:33:42 -04:00
Wilson Snyder	ea55db7286	Internals: Cleanup some string constructors. No functional change.	2022-08-30 01:02:39 -04:00
Wilson Snyder	819e8741cc	Merge branch 'master' into develop-v5	2022-08-30 00:20:21 -04:00
Wilson Snyder	6a5f77b278	Internals: Cleanup some string/model constructors. No functional change.	2022-08-29 23:50:32 -04:00
Wilson Snyder	8658a0d7dc	Internals: Constructor format update. No functional change.	2022-08-29 23:05:52 -04:00
Wilson Snyder	c335aad25f	Fix --hierarchical with order-based pin connections (#3583 ).	2022-08-29 22:49:19 -04:00
Wilson Snyder	9d9d647c1f	Fix indentation of --protect import function SV code.	2022-08-29 22:28:02 -04:00
Wilson Snyder	d47a37fb76	Internals: Cleanup constructors etc. No functional change.	2022-08-29 22:17:27 -04:00
Aleksander Kiryk	24ec84851a	Support $sampled (#3569 )	2022-08-29 08:39:41 -04:00
Arkadiusz Kozdra	0a3a15a66e	Support class parameters (#2231 ) (#3541 )	2022-08-28 10:24:55 -04:00
Krzysztof Bieganski	2af5304884	Fix tracing of slow coroutines (#3576 part) (#3579 )	2022-08-26 05:11:44 -05:00
Varun Koyyalagunta	5869fdf7f6	Fix $dump systemtask with --output-split-cfuncs (#3495 ) (#3497 )	2022-08-25 18:29:11 -05:00
Krzysztof Bieganski	1a1d2ecfd9	Enable tracing in generated main (#3578 ) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-08-25 14:55:37 +01:00
Geza Lore	5c356a4680	Merge branch 'master' into develop-v5	2022-08-22 14:32:06 +01:00
Krzysztof Bieganski	39af5d020e	Timing support (#3363 ) Adds timing support to Verilator. It makes it possible to use delays, event controls within processes (not just at the start), wait statements, and forks. Building a design with those constructs requires a compiler that supports C++20 coroutines (GCC 10, Clang 5). The basic idea is to have processes and tasks with delays/event controls implemented as C++20 coroutines. This allows us to suspend and resume them at any time. There are five main runtime classes responsible for managing suspended coroutines: * `VlCoroutineHandle`, a wrapper over C++20's `std::coroutine_handle` with move semantics and automatic cleanup. * `VlDelayScheduler`, for coroutines suspended by delays. It resumes them at a proper simulation time. * `VlTriggerScheduler`, for coroutines suspended by event controls. It resumes them if its corresponding trigger was set. * `VlForkSync`, used for syncing `fork..join` and `fork..join_any` blocks. * `VlCoroutine`, the return type of all verilated coroutines. It allows for suspending a stack of coroutines (normally, C++ coroutines are stackless). There is a new visitor in `V3Timing.cpp` which: * scales delays according to the timescale, * simplifies intra-assignment timing controls and net delays into regular timing controls and assignments, * simplifies wait statements into loops with event controls, * marks processes and tasks with timing controls in them as suspendable, * creates delay, trigger scheduler, and fork sync variables, * transforms timing controls and fork joins into C++ awaits. There are new functions in `V3SchedTiming.cpp` (used by `V3Sched.cpp`) that integrate static scheduling with timing. This involves providing external domains for variables, so that the necessary combinational logic gets triggered after coroutine resumption, as well as statements that need to be injected into the design eval function to perform this resumption at the correct time. There is also a function that transforms forked processes into separate functions. See the comments in `verilated_timing.h`, `verilated_timing.cpp`, `V3Timing.cpp`, and `V3SchedTiming.cpp`, as well as the internals documentation for more details. Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-08-22 13:26:32 +01:00
Geza Lore	9ac64d0b92	Improve performance of MTask coarsening Various optimizations to speed up MTasks coarsening (which is the long pole in the multi-threaded scheduling of very large designs). The biggest impact ones: - Use efficient hand written Pairing Heaps for implementing priority queues and the scoreboard, instead of the old SortByValueMap. This helps us avoid having to sort a lot of merge candidates that we will never actually consider and helps a lot in performance. - Remove unnecessary associative containers and store data structures (the heap nodes in particular) directly in the object they relate to. This eliminates a huge amount of lookups and helps a lot in performance. - Distribute storage for SiblingMC instances into the LogicMTask instances, and combine with the sibling maps. This again eliminates hash table lookups and makes storage structures smaller. - Remove some now bidirectional edge maps, keep only the forward map. There are also some other smaller optimizations: - Replaced more unnecessary dynamic_casts with static_casts - Templated some functions/classes to reduce the number of static branches in loops. - Improves sorting of edges for sibling candidate creation - Various micro-optimizations here and there This speeds up MTask coarsening by 3.8x on a large design, which translates to a 2.5x speedup of the ordering pass in multi-threaded mode. (Combined with the earlier optimizations, ordering is now 3x faster.) Due to the elimination of a lot of the auxiliary data structures, and ensuring a minimal size for the necessary ones, memory consumption of the MTask coarsening is also reduced (measured up to 4.4x reduction though the accuracy of this is low). The algorithm is identical except for minor alterations of the order some candidates are added or removed, this can cause perturbation in the output due to tied scores being broken based on IDs.	2022-08-20 21:18:50 +01:00
Wilson Snyder	7cc89b8b42	Merge branch 'master' into develop-v5	2022-08-20 14:19:45 -04:00
Wilson Snyder	c6607724cb	Fix clang warning.	2022-08-20 14:19:00 -04:00
Wilson Snyder	ebb37b0156	Merge branch 'master' into develop-v5	2022-08-20 14:02:09 -04:00
Wilson Snyder	90dc04cf93	Add --future0 and --future1 options.	2022-08-20 14:01:13 -04:00
Krzysztof Bieganski	10cf492946	Add support for expressions in event controls (#3550 ) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-08-19 20:18:38 +02:00
Geza Lore	4d81eb021d	Revert "Improve performance of MTask coarsening" This reverts commit `83475008d9`.	2022-08-19 18:03:45 +01:00
Geza Lore	83475008d9	Improve performance of MTask coarsening Various optimizations to speed up MTasks coarsening (which is the long pole in the multi-threaded scheduling of very large designs). The biggest impact ones: - Use efficient hand written Pairing Heaps for implementing priority queues and the scoreboard, instead of the old SortByValueMap. This helps us avoid having to sort a lot of merge candidates that we will never actually consider and helps a lot in performance. - Remove unnecessary associative containers and store data structures (the heap nodes in particular) directly in the object they relate to. This eliminates a huge amount of lookups and helps a lot in performance. - Distribute storage for SiblingMC instances into the LogicMTask instances, and combine with the sibling maps. This again eliminates hash table lookups and makes storage structures smaller. - Remove some now bidirectional edge maps, keep only the forward map. There are also some other smaller optimizations: - Replaced more unnecessary dynamic_casts with static_casts - Templated some functions/classes to reduce the number of static branches in loops. - Improves sorting of edges for sibling candidate creation - Various micro-optimizations here and there This speeds up MTask coarsening by 3.8x on a large design, which translates to a 2.5x speedup of the ordering pass in multi-threaded mode. (Combined with the earlier optimizations, ordering is now 3x faster.) Due to the elimination of a lot of the auxiliary data structures, and ensuring a minimal size for the necessary ones, memory consumption of the MTask coarsening is also reduced (measured up to 4.4x reduction though the accuracy of this is low). The algorithm is identical except for minor alterations of the order some candidates are added or removed, this can cause perturbation in the output due to tied scores being broken based on IDs.	2022-08-19 16:59:20 +01:00
Geza Lore	03ac7ad730	Make PartPropagateCp specific to the MTask graph While keeping the client code abstract in PartPropagateCp is nice for testing, there is performance to be had removing the abstraction. As this code dominates in scheduling large designs, we eliminate the abstraction and re-work the testing to use the actual LogicMTask and MTaskEdge graph types. No functional change intended.	2022-08-19 14:06:11 +01:00
Geza Lore	cd50949a7e	Reuse MTaskEdge instances in MT scheduling Instead of deleting then re-allocating MTaskEdge instances when merging two MTasks, just redirect the edges of the donor MTask to the recipient MTask. This is both faster as it avoids an allocation and a deletion, together with one update of the sibling maps, and also makes the algorithm more stable due to MergeCandidate IDs being stable and allocated up front for all MTaskEdges, before any SiblingMCs are allocated. Perturbations in output are expected as the IDs used to break ties between merge candidates with equal costs are not updated when redirecting an edge (on purpose). The relinking of only one end of the graph edges also perturbs the order in which they are enumerated, which does change candidate opportunities when the number of edges is larger than PART_SIBLING_EDGE_LIMIT. Confirmed output is identical when IDs are updated and edges are updated to appear in their original order.	2022-08-19 14:06:11 +01:00
Geza Lore	f0040c7b9a	Remove reliance on pointer comparison in MT scheduling The critical path propagation used to rely on a pointer comparison to break equal scoring critical path updates. Use the corresponding mtask ids instead, which is deterministic across invocations.	2022-08-19 14:06:11 +01:00
Geza Lore	f8a0389e73	Do not use stepCost when gathering sibling merge candidates siblingPairFromRelatives gathers neighbours of a vertex, and sorts them. It then takes the N best nodes, and creates sibling merge candidates from them. We now use the unadjusted cost instead of the step cost of the vertices when sorting. This is both faster as we need not do the log-space rounding to compute stepCost, and will also make similar but yet cheaper nodes appear closer to the front as we don't lose precision in rounding, hence they are more likely to be entered as merge candidates. Note that when creating the merge candidate, we still use the stepCost, so its purpose of reducing the propagation of critical path updates is maintained in full. In summary, this should make both Verilator and the generated model very slightly faster, at least in theory, and I have observed minor improvement in places.	2022-08-19 14:06:11 +01:00
Geza Lore	b436794773	Add specialized GraphStreamUnordered GraphStreamUnordered used to be GraphStream<std::less<const V3GraphVertex*>>, but a lot of performance improvements can be had by a specialized implementation, so added a highly optimized one. This helps a lot with --debug-partition.	2022-08-19 14:06:11 +01:00
Geza Lore	1404319b28	Merge branch 'master' into develop-v5	2022-08-19 13:39:44 +01:00
Geza Lore	90d22cbec6	Fix `AstNode::exists` return type	2022-08-19 13:22:06 +01:00
Krzysztof Bieganski	33e2acfe61	Fix `AstNode::forall` return type (#3559 ) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-08-19 12:33:17 +01:00
Ryszard Rozak	db5fdfb0ee	Fix === with some tristate constants (#3551 ).	2022-08-18 07:03:05 -04:00
Krzysztof Bieganski	951cd73fe0	Handle MemberSel in V3EmitV.cpp (#3555 )	2022-08-18 06:33:45 -04:00
Arkadiusz Kozdra	0eeb40b975	Fix converting subclasses to string (#3552 )	2022-08-17 18:08:43 -04:00
Wilson Snyder	f435d96241	Fix case statement comparing string literal (#3544 ).	2022-08-15 21:56:09 -04:00
github action	d32e3f042f	Apply 'make format'	2022-08-12 10:56:12 +00:00
Mostafa Gamal	df5f95a5bd	Fix nested default assignment for struct pattern (#3511 ) (#3524 )	2022-08-12 06:55:07 -04:00
Drew Ranck	b0c475205b	Fix void-cast queue pop_front or pop_back (#3542 ) (#3364 ) Fix compile error for queue method usage, if it is the first statement in a block of code, and the return value is not used. Example: > if (foo) > void'(bar.pop_front());	2022-08-12 06:51:25 -04:00
Wilson Snyder	cbe1b8e266	Fix segfault exporting non-existent package (#3535 ).	2022-08-08 17:53:50 -04:00
Mariusz Glebocki	2b12fe5773	Internals: Construct V3Number with correct type instead of changing it manually. (#3529 )	2022-08-08 08:17:02 -04:00
Yutetsu TAKATSUKASA	d20f22beb1	Fix tristate logic when reading inout port in a module #3399 (#3523 ) * Tests: Add a test to reproduce #3399 * Fix #3399. When reading an inout port in a module, it should refer the original inout port, not the generated MODTEMP.	2022-08-07 21:12:57 +09:00
Mariusz Glebocki	122e89ffde	Fix V3Number::isMsbXZ(). (#3530 )	2022-08-05 19:12:52 +01:00
Geza Lore	c266739e9f	Merge branch 'master' into develop-v5	2022-08-05 12:17:57 +01:00
Geza Lore	96a4b3e5a5	Update clang-format config and apply - Regroup and sort #include directives (like we used to, but automatic) - Set AlwaysBreakTemplateDeclarations to true	2022-08-05 12:00:24 +01:00
Geza Lore	7403226a97	Merge branch 'master' into develop-v5	2022-08-04 10:03:38 +01:00
Geza Lore	fac8e76923	Rework SortByValueMap for better performance Keep a single std::set of key/value pairs, and a single unordered_map from key to iterators into the set. Also improve some of the accessing mechanisms using modern C++. This speeds up multi-threaded ordering by about 10%.	2022-08-03 21:17:02 +01:00
Geza Lore	b864f5f5ba	V3Partition: use static_cast with LogicMTaskVertex dynamic_cast is not free, and the mtask graph contains only LogicMTaskVertex vertices, use static_cast instead for some speedup.	2022-08-03 17:05:01 +01:00
Geza Lore	f9f66d787e	Fix integer overflow in V3Unroll (#3451 )	2022-08-03 09:41:30 +01:00
Geza Lore	bd211c87aa	astgen: split 'visit' method declarations from definitions Add definitions to V3Ast.cpp, and use static_cast. This fixes a lot of clang-tidy noise.	2022-08-02 17:53:19 +01:00
Geza Lore	6fc25dae9e	Fix clang-tidy warnings (#3522 )	2022-08-02 15:58:48 +01:00
Kamil Rakoczy	cfb6fd8b34	Reduce max RSS usage (#3483 ) By constant folding nodes earlier in V3Expand, we can save some max RSS on large designs.	2022-08-02 13:36:14 +01:00
Geza Lore	39d1a62f9e	Fix change detection on unpacked arrays Expand array assignment when creating the trigger, as V3Expand might mangle it otherwise.	2022-08-02 13:01:41 +01:00
Geza Lore	ba66fa7200	Merge branch 'master' into develop-v5	2022-08-02 11:16:35 +01:00
Geza Lore	cb60663d49	V3Gate: Defer substitutions until required as well Similarly to the earlier patch that defers constant folding on optimized logic, now we also defer the variable substitutions as well. This again eliminates a lot of traversals, and yields another ~10x speedup of V3Gate on a design where V3Gate used to dominate while producing identical results.	2022-08-01 12:54:41 +01:00
Geza Lore	0d2bf23d82	V3Gate: Defer constant folding until required Rather than constant folding each logic block after every substitution, only constant fold updated blocks when re-analysed, or at the end. This removes a lot of invocations of V3Const on large blocks that can be optimized well, and should yield the same result. This speeds up V3Gate by ~4x on a design where V3Gate dominates.	2022-07-31 20:42:04 +01:00
Geza Lore	682a60e325	Cleanup V3Gate, no functional change	2022-07-31 20:07:54 +01:00
Geza Lore	2ab6272cc7	Use AstNode::foreach in V3Gate This yields a little speedup.	2022-07-31 20:05:25 +01:00
Geza Lore	152a6cd886	Improve AstNode::foreach (also exists and forall) Speed improvements: - Use a direct, recursion-free implementation - Improve pre-fetching Functionality: - Support remove/replace of currently iterated node	2022-07-31 19:07:32 +01:00
Wilson Snyder	12925cd8b0	Internals: clang-tidy cleanups. No functional change intended.	2022-07-30 12:49:30 -04:00
Wilson Snyder	daac7cb90d	Merge branch 'master' into develop-v5	2022-07-30 12:09:05 -04:00
Wilson Snyder	a2d26b45bb	Internals: Fix some clang-tidy issues. No functional change intended.	2022-07-30 11:54:28 -04:00
Geza Lore	38e5b6c1ad	Replace __gcov_flush with __gcov_dump __gcov_flush was a private function and was removed from later GCC versions (at least from 11.2.0, possibly earlier). Replace with the documented public __gcov_dump.	2022-07-30 16:02:03 +01:00
Wilson Snyder	4859f5e1fa	Merge branch 'master' into develop-v5	2022-07-30 10:26:16 -04:00
Wilson Snyder	b9d7819faa	Internals: Fix some cppcheck issues. Some dump functions fixed.	2022-07-30 10:01:39 -04:00
Geza Lore	ad2fbfe62d	Merge branch 'master' into develop-v5	2022-07-29 12:04:24 +01:00
Yutetsu TAKATSUKASA	1f9323d086	Set correct dtype in replaceShiftSame() (#3520 ) * Tests: Add a test to reproduce bug3399 * Fix3399. Set the correct dtype in replaceShiftSame(). * Tests: update stats. * Update Changes	2022-07-29 07:05:04 +09:00
Geza Lore	574dbfded1	V3MergeCond: Fix incorrect merge of assignments to the condition	2022-07-28 15:50:02 +01:00
github action	e871cd8a44	Apply 'make format'	2022-07-25 21:47:29 +00:00
Mostafa Gamal	7b431b37c7	Fix struct pattern assignment (#2328 ) (#3517 ).	2022-07-25 17:46:22 -04:00
Geza Lore	ac4ec87942	Respect clang's default -fbracket-depth by default Set default value of --comp-limit-parens to 240, to respect default maximum nesting of parentheses in clang (which is controlled by -fbracket-depth and defaults to 256). For code generation consistency, also use the same default with gcc.	2022-07-25 12:59:26 +01:00
Geza Lore	290c2e0388	Mark FileLine::v3errorEndFatal as noreturn	2022-07-25 12:51:02 +01:00
Geza Lore	89924bda51	Always type '$clog2' as signed 32	2022-07-25 12:48:13 +01:00
Yutetsu TAKATSUKASA	60eab3eb8c	Fix wrong result of bit op tree optimization #3509 (#3516 ) * Tests: Add a test to reproduce #3509 * Tests: Compile without tautological-compare check because bit op tree optimization is disabled in the test. * Internals: Dedup code. No functional change is intended. * Fix #3509. "2'b10 == (2'b11 & {1'b0, val[0]})" and "2'b10 != (2'b11 & {1'b0, val[0]})" were wrongly optimized to "!val[0]" and "val[0]" respectively. Now properly optimize them to 1'b0 and 1'b1. * Commentary * Commentary: Update Changes	2022-07-24 19:54:37 +09:00
Geza Lore	31abe537a0	Fix DPI export trigger sensitivity in 'nba' Fixes #3508	2022-07-21 17:43:03 +01:00
Geza Lore	f9ecbdc70b	Merge branch 'master' into develop-v5	2022-07-21 09:56:14 +01:00
Arkadiusz Kozdra	542e324869	Wildcard index type support for associative arrays (#3501 ). Associative arrays that specify a wildcard index type may be indexed by integral expressions of any size, with leading zeros removed automatically. A natural representation for such expressions is a string, especially that the standard explicitly specifies automatic casts from string indices to bit vectors of equivalent size. The automatic cast part is done implicitly by the existing type system. A simpler way to just make this work would be to convert wildcard index type to a string type directly in the parser code, but several new AST classes are needed to make sure illegal method calls are detected. The verilated data structure implementation is reused, because there is no need for differentiating the behavior on C++ side.	2022-07-20 15:01:36 +02:00
Geza Lore	1c5e5704f5	Fix iteration fixup in AstNode::addHereThisAsNext Previous version broke verilator_ext_tests due to iteration order mismatch after `3fc8249429`	2022-07-20 13:08:51 +01:00
Geza Lore	1d400dd98c	Configure tracing at run-time, instead of compile time (#3504 ) All remaining use of conditional compilation in the tracing implementation of the run-time library are replaced with the use of VerilatedModel::traceConfig, and is now done at run-time.	2022-07-20 11:27:10 +01:00
Geza Lore	af70db88db	Remove unused method	2022-07-19 11:32:16 +01:00
Geza Lore	7ef033f876	Ensure generated Makefile for hierarchical build is stable. Avoid iterating unordered_map. Iterate sorted blocks instead.	2022-07-19 11:32:01 +01:00
Geza Lore	db59c07f27	Implement trace offloading with fewer ifdefs Step towards a proper run-time library. Reduce the amount of ifdefs in the implementation of offloaded tracing. There are still a very small number of ifdefs left, which will need more careful changes in order to keep user API compatibility.	2022-07-19 11:31:35 +01:00
Geza Lore	9085e34d70	Pass VerilatedModel at trace registration time	2022-07-19 11:00:09 +01:00
Arkadiusz Kozdra	0dfa7d3af5	Internals: const-qualify findDType function. No functional change. (#3502 )	2022-07-18 18:58:55 +02:00
Geza Lore	c28bf9ce24	Fix change detection over unpacked arrays.	2022-07-18 12:25:22 +01:00
Geza Lore	5a1f1796d7	Fix t/t_public_{clk,src}.pl after merge of master	2022-07-15 16:48:22 +01:00
Todd Strader	b0e796ca83	Public combo propagation issues (#2905 )	2022-07-15 11:44:32 -04:00
Geza Lore	3773e2ef95	Simplify primary input checks	2022-07-15 16:18:41 +01:00
Geza Lore	00c1f67c57	Make trigger dumping functions always Slow code	2022-07-14 16:28:09 +01:00
Geza Lore	3f19ba1554	Improve handling of extra triggers in V3Sched. Add utility class for allocation, and add human readable text to debug code.	2022-07-14 16:06:15 +01:00
Geza Lore	f37cc2353d	Fix standard library includes	2022-07-14 15:49:00 +01:00
Geza Lore	6a7bda6910	Correctly schedule combinational logic driven from DPI exports. Fixes #3429.	2022-07-14 15:35:49 +01:00
Geza Lore	ff1b9930fc	Handle multiple external domains in V3Order Make the external domains provider of ordering populate an output vector, which then allows us to add multiple external sensitivities to combinational logic.	2022-07-14 11:09:40 +01:00
Geza Lore	582da6df9a	Merge branch 'master' into develop-v5	2022-07-14 10:08:52 +01:00
Geza Lore	3bd830eacf	Minor clean up of initialization	2022-07-13 18:24:48 +01:00
Geza Lore	f4efcbde5c	Remove simple use of static data from V3OutFormatter::indentSpaces	2022-07-13 16:15:21 +01:00
Geza Lore	658819bb71	Trivial static const -> constexpr	2022-07-13 16:01:03 +01:00
Geza Lore	3fc8249429	Use AstNode::addHereThisAsNext in a few places	2022-07-13 13:57:00 +01:00
Geza Lore	e0a38ce2c2	Remove unnecessary AstNode::clearIter()	2022-07-13 13:57:00 +01:00
Geza Lore	178e1789b5	Make AstNode::addHereThisAsNext always O(1) Using unlinkFrBackWithNext is O(n) in the size of the list if unlinking from the middle, so addHereThisAsNext also had this complexity. This patch implements addHereThisAsNext directly, which is always O(1).	2022-07-13 12:13:40 +01:00
William D. Jones	108c900387	Fix unique_ptr memory header for MinGW64 (#3493 ).	2022-07-13 06:38:03 -04:00
Wilson Snyder	63507e8e29	Internals: Favor UASSERT_OBJ when have object.	2022-07-12 18:02:57 -04:00
Geza Lore	87f1e06c41	Small algorithmic improvement of PartContraction::siblingPairFromRelatives Use std::partial_sort for the non-exhaustive case. This is O(n) instead of O(nlog(n)) in the size of the candidate list being sorted. (It actually is O(nlog(k)), but k is constant 6 in the non-exhaustive case).	2022-07-12 19:10:01 +01:00
Geza Lore	7e8bafd217	Remove static data use from PartContraction::siblingPairFromRelatives Use std::sort with lambda rather than qsort with static function and static data. Verilation performance neutral.	2022-07-12 19:09:40 +01:00
Geza Lore	457ad07ade	Remove unnecessary static state from V3EmitCFunc	2022-07-12 17:51:17 +01:00
Geza Lore	c9ac9a75a6	Merge branch 'master' into develop-v5	2022-07-12 17:29:45 +01:00
Geza Lore	79c901c220	Tighten signatures/implementation of VerilatedModel abstract methods.	2022-07-12 16:06:08 +01:00
Geza Lore	b61d819fcb	Move contextp() under VerilatedModel	2022-07-12 16:06:08 +01:00
Geza Lore	f4038e3674	Move thread pool and execution profiler into the context. (#3477 ) Fixes #3454	2022-07-12 11:41:15 +01:00
Arkadiusz Kozdra	8377514127	Add support for $test$plusargs(expr) (#3489 )	2022-07-11 06:21:35 -04:00
Wilson Snyder	5f3316d3dc	* Fix empty string arguments to display (#3484 ).	2022-07-09 08:30:57 -04:00
Wilson Snyder	a4fddb3fbe	Fix table misoptimizing away display (#3488 ).	2022-07-09 07:55:46 -04:00
Wilson Snyder	3d71716a8a	Internals: Constructor style cleanup. No functional change.	2022-07-09 07:40:07 -04:00
Yutetsu TAKATSUKASA	9f37cef1bb	Fix #3470 of incorrect bit op tree optimization (#3476 ) * Tests: Add a test to reproduce #3470 * Update LSB during return path of traversal. No functional change is intended. * Introduce LeafInfo::m_msb * Update LeafInfo::m_msb when visiting AstCCast * Internals: Add comment, reorder. No functional change is intended. * Delete explicit from copy constructor to fix build error. * Update Changes * Internals: Remove unused parameter. No functional change is intended. * Tests: Add explanation to t_const_opt.	2022-07-06 08:33:37 +09:00
Geza Lore	0de1bbc85b	Add and use VL_CONSTEXPR_CXX17	2022-07-05 14:21:28 +01:00
Wilson Snyder	b25b798dbe	Merge branch 'master' into develop-v5	2022-07-04 13:20:03 -04:00
Mariusz Glebocki	2873dbe154	Optimize file writing by using a memory buffer. (#3461 )	2022-07-04 10:23:31 -04:00
Yutetsu TAKATSUKASA	ced39d0982	Internals: preparation for fixing #3470 (#3475 ) * Internals: Let LeafInfo class. No functional change is intended. * Internals: Rename LeafInfo::width -> LeafInfo::varWidth(). No functional change is intended.	2022-06-27 22:41:33 +09:00
Wilson Snyder	fc4d6a62af	Remove VL_PROFILER ifdef. Partial (#3454 ).	2022-06-22 20:06:23 -04:00
Unai Martinez-Corral	11032b1936	Fix bisonpre for MSYS2 (#3471 )	2022-06-20 11:59:27 -04:00
Wilson Snyder	e7ca4a69e3	Merge branch 'master' into develop-v5	2022-06-19 15:22:09 -04:00
Wilson Snyder	4f93ac6477	Internals: Style modernization. No functional change intended.	2022-06-15 18:49:32 -04:00
Krzysztof Bieganski	f7533010c6	Internals: Add `setNoopt()` function to `LifeVisitor` (#3468 )	2022-06-15 18:11:03 -04:00
Todd Strader	47b650d821	Fix public unpacked input ports (#3465 )	2022-06-15 07:41:59 -04:00
Geza Lore	0c2c097377	Add -fno-merge-cond-motion option This disables code motion during V3MergeCond, for debugging.	2022-06-13 14:16:11 +01:00
Kevin Kiningham	ea8aaa21e8	Fix compile error under strict C++11 mode (#3463 )	2022-06-13 12:14:02 +01:00
Kamil Rakoczy	660d1059b0	With --no-decoration, remove output whitespace (#3460 ) Signed-off-by: Kamil Rakoczy <krakoczy@antmicro.com>	2022-06-10 07:26:33 -04:00
Wilson Snyder	e7dc2de14b	Fix BLKANDNBLK on $readmem/$writemem (#3379 ).	2022-06-04 12:43:18 -04:00
github action	aca9fd3bed	Apply 'make format'	2022-06-04 16:30:41 +00:00
Wilson Snyder	09f3f40462	Fix clang-discovered missing comma.	2022-06-04 12:27:44 -04:00
Wilson Snyder	0f324c8309	Merge branch 'master' into develop-v5	2022-06-04 11:59:49 -04:00
Wilson Snyder	59dc2853e3	Support concat assignment to packed array (#3446 ).	2022-06-03 21:32:13 -04:00
Wilson Snyder	ada58465b2	Add -f<optimization> options to replace -O<letter> options (#3436 ).	2022-06-03 20:43:16 -04:00
Wilson Snyder	173f57c636	Changed --no-merge-const-pool to -fno-merge-const-pool (#3436 ).	2022-06-03 19:41:59 -04:00
Yutetsu TAKATSUKASA	d64f979f99	Fix BitOpTree optimization to consider polarity of frozen node (#3445 ) (#3459 ) * Tests: add a test to another failing case of #3445 * Consider polarity as lsb in BitOpTree optimization.	2022-06-01 09:26:16 +09:00
Yutetsu TAKATSUKASA	26b7452178	Fix #3445 of BitOpTreeOpt (#3453 ) * Tests: Check BitOpTree statistics in t_const_opt. * Tests: Add a test to reproduce #3445 * Fix #3445. Don't forget LSB of frozen node in BitOpTreeOpt. * Apply suggestions from code review Co-authored-by: Geza Lore <gezalore@gmail.com>	2022-05-30 19:33:06 +09:00
Geza Lore	b51f887567	Perform VCD tracing in parallel when using --threads (#3449 ) VCD tracing is now parallelized using the same thread pool as the model. We achieve this by breaking the top level trace functions into multiple top level functions (as many as --threads), and after emitting the time stamp to the VCD file on the main thread, we execute the tracing functions in parallel on the same thread pool as the model (which we pass to the trace file during registration), tracing into a secondary per thread buffer. The main thread will then stitch (memcpy) the buffers together into the output file. This makes the `--trace-threads` option redundant with `--trace`, which now only affects `--trace-fst`. FST tracing uses the previous offloading scheme. This obviously helps a lot in VCD tracing performance, and I have seen better than Amdahl speedup, namely I get 3.9x on XiangShan 4T (2.7x on OpenTitan 4T).	2022-05-29 19:08:39 +01:00
Geza Lore	0722f47539	Improve V3MergeCond by reordering statements (#3125 ) V3MergeCond merges consecutive conditional `_ = cond ? _ : _` and `if (cond) ...` statements. This patch adds an analysis and ordering phase that moves statements with identical conditions closer to each other, in order to enable more merging opportunities. This in turn eliminates a lot of repeated conditionals which reduced dynamic branch count and branch misprediction rate. Observed 6.5% improvement on multi-threaded large designs, at the cost of less than 2% increase in Verilation speed.	2022-05-27 16:57:51 +01:00
Geza Lore	3af5e7e8da	Remove scope pointer from OrderEitherVertex. For ordering, only the scope of logic vertices should be relevant, so remove the scope pointer from OrderEitherVertex and move it into OrderLogicVertex. This does not change single-threaded scheduling at all. Theoretically, multi-threaded scheduling should not be affected either though due to some implementation quirk depending on vertex order in a graph the MT schedule is perturbed by this change, but the performance effect of this is negligible on all benchmarks I have access to. No functional change intended. Fixes #3442	2022-05-25 20:32:32 +01:00
Geza Lore	160f3ee4a7	Remove dead code, no functional change	2022-05-25 19:11:20 +01:00
Krzysztof Bieganski	d7a75dc026	Merge branch 'master' into develop-v5	2022-05-25 11:06:38 +02:00
github action	a372e010bd	Apply 'make format'	2022-05-25 04:51:51 +00:00
Wilson Snyder	530817191e	Support non-ANSI interface port declarations (#3439 ).	2022-05-25 00:50:50 -04:00
Geza Lore	c7610ed044	Fix FST tracing thread in CMake build	2022-05-20 17:04:46 +01:00
Geza Lore	b130a8cfeb	Add -DVM_TRACE_VCD in model builds with Make with --trace	2022-05-20 16:44:38 +01:00
Geza Lore	551bd284dd	Rename some internals related to multi-threaded tracing Rename the implementation internals of current multi-threaded tracing to be "offload mode". No functional change, nor user interface change intended.	2022-05-20 16:44:35 +01:00
Krzysztof Bieganski	9edccfdffa	Initial support for intra-assignment timing controls, net delays (#3427 ) This is a pre-PR to #3363. Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-05-17 19:19:44 +01:00
Geza Lore	1a056f6db9	Fix invalid conditional merging when starting at 'c = c ? a : b' Fixes #3409.	2022-05-17 18:36:40 +01:00
Krzysztof Bieganski	e018eb7bac	Support AstClass::repairCache() after V3Class (#3431 ) This is a pre-PR to #3363. Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-05-17 09:22:43 -04:00
Geza Lore	282887d9c6	Fix code coverage holes Fixes #3422	2022-05-16 21:22:21 +01:00
Krzysztof Bieganski	3f7a248ed4	Refactor some of the Begin handling to a separate function (#3426 ) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-05-16 20:45:33 +01:00
Krzysztof Bieganski	ecaa07a72a	Rename AstTimingControl to AstEventControl (#3425 ) Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>	2022-05-16 20:44:41 +01:00
Geza Lore	0e62cd11da	Don't issue DEPRECATED for now no-op clock_enable attribute Fixes #3421	2022-05-16 18:57:51 +01:00
Geza Lore	599d23697d	IEEE compliant scheduler (#3384 ) This is a major re-design of the way code is scheduled in Verilator, with the goal of properly supporting the Active and NBA regions of the SystemVerilog scheduling model, as defined in IEEE 1800-2017 chapter 4. With this change, all internally generated clocks should simulate correctly, and there should be no more need for the `clock_enable` and `clocker` attributes for correctness in the absence of Verilator generated library models (`--lib-create`). Details of the new scheduling model and algorithm are provided in docs/internals.rst. Implements #3278	2022-05-15 16:03:32 +01:00

3638 Commits