Commit Graph

3638 Commits

Author SHA1 Message Date
Wilson Snyder 6a5f77b278 Internals: Cleanup some string/model constructors. No functional change. 2022-08-29 23:50:32 -04:00
Wilson Snyder 8658a0d7dc Internals: Constructor format update. No functional change. 2022-08-29 23:05:52 -04:00
Wilson Snyder c335aad25f Fix --hierarchical with order-based pin connections (#3583). 2022-08-29 22:49:19 -04:00
Wilson Snyder 9d9d647c1f Fix indentation of --protect import function SV code. 2022-08-29 22:28:02 -04:00
Wilson Snyder d47a37fb76 Internals: Cleanup constructors etc. No functional change. 2022-08-29 22:17:27 -04:00
Aleksander Kiryk 24ec84851a
Support $sampled (#3569) 2022-08-29 08:39:41 -04:00
Arkadiusz Kozdra 0a3a15a66e
Support class parameters (#2231) (#3541) 2022-08-28 10:24:55 -04:00
Krzysztof Bieganski 2af5304884
Fix tracing of slow coroutines (#3576 part) (#3579) 2022-08-26 05:11:44 -05:00
Varun Koyyalagunta 5869fdf7f6
Fix $dump systemtask with --output-split-cfuncs (#3495) (#3497) 2022-08-25 18:29:11 -05:00
Krzysztof Bieganski 1a1d2ecfd9
Enable tracing in generated main (#3578)
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-08-25 14:55:37 +01:00
Geza Lore 5c356a4680 Merge branch 'master' into develop-v5 2022-08-22 14:32:06 +01:00
Krzysztof Bieganski 39af5d020e
Timing support (#3363)
Adds timing support to Verilator. It makes it possible to use delays,
event controls within processes (not just at the start), wait
statements, and forks.

Building a design with those constructs requires a compiler that
supports C++20 coroutines (GCC 10, Clang 5).

The basic idea is to have processes and tasks with delays/event controls
implemented as C++20 coroutines. This allows us to suspend and resume
them at any time.

There are five main runtime classes responsible for managing suspended
coroutines:
* `VlCoroutineHandle`, a wrapper over C++20's `std::coroutine_handle`
  with move semantics and automatic cleanup.
* `VlDelayScheduler`, for coroutines suspended by delays. It resumes
  them at a proper simulation time.
* `VlTriggerScheduler`, for coroutines suspended by event controls. It
  resumes them if its corresponding trigger was set.
* `VlForkSync`, used for syncing `fork..join` and `fork..join_any`
  blocks.
* `VlCoroutine`, the return type of all verilated coroutines. It allows
  for suspending a stack of coroutines (normally, C++ coroutines are
  stackless).

There is a new visitor in `V3Timing.cpp` which:
  * scales delays according to the timescale,
  * simplifies intra-assignment timing controls and net delays into
    regular timing controls and assignments,
  * simplifies wait statements into loops with event controls,
  * marks processes and tasks with timing controls in them as
    suspendable,
  * creates delay, trigger scheduler, and fork sync variables,
  * transforms timing controls and fork joins into C++ awaits

There are new functions in `V3SchedTiming.cpp` (used by `V3Sched.cpp`)
that integrate static scheduling with timing. This involves providing
external domains for variables, so that the necessary combinational
logic gets triggered after coroutine resumption, as well as statements
that need to be injected into the design eval function to perform this
resumption at the correct time.

There is also a function that transforms forked processes into separate
functions.

See the comments in `verilated_timing.h`, `verilated_timing.cpp`,
`V3Timing.cpp`, and `V3SchedTiming.cpp`, as well as the internals
documentation for more details.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-08-22 13:26:32 +01:00
Geza Lore 9ac64d0b92 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-20 21:18:50 +01:00
Wilson Snyder 7cc89b8b42 Merge branch 'master' into develop-v5 2022-08-20 14:19:45 -04:00
Wilson Snyder c6607724cb Fix clang warning. 2022-08-20 14:19:00 -04:00
Wilson Snyder ebb37b0156 Merge branch 'master' into develop-v5 2022-08-20 14:02:09 -04:00
Wilson Snyder 90dc04cf93 Add --future0 and --future1 options. 2022-08-20 14:01:13 -04:00
Krzysztof Bieganski 10cf492946
Add support for expressions in event controls (#3550)
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-08-19 20:18:38 +02:00
Geza Lore 4d81eb021d Revert "Improve performance of MTask coarsening"
This reverts commit 83475008d9.
2022-08-19 18:03:45 +01:00
Geza Lore 83475008d9 Improve performance of MTask coarsening
Various optimizations to speed up MTasks coarsening (which is the long
pole in the multi-threaded scheduling of very large designs).

The biggest impact ones:
- Use efficient hand written Pairing Heaps for implementing priority
  queues and the scoreboard, instead of the old SortByValueMap. This
  helps us avoid having to sort a lot of merge candidates that we will
  never actually consider and helps a lot in performance.
- Remove unnecessary associative containers and store data structures
  (the heap nodes in particular) directly in the object they relate to.
  This eliminates a huge amount of lookups and helps a lot in
  performance.
- Distribute storage for SiblingMC instances into the LogicMTask
  instances, and combine with the sibling maps. This again eliminates
  hash table lookups and makes storage structures smaller.
- Remove some now bidirectional edge maps, keep only the forward map.

There are also some other smaller optimizations:
- Replaced more unnecessary dynamic_casts with static_casts
- Templated some functions/classes to reduce the number of static
  branches in loops.
- Improves sorting of edges for sibling candidate creation
- Various micro-optimizations here and there

This speeds up MTask coarsening by 3.8x on a large design, which
translates to a 2.5x speedup of the ordering pass in multi-threaded
mode. (Combined with the earlier optimizations, ordering is now 3x
faster.)

Due to the elimination of a lot of the auxiliary data structures, and
ensuring a minimal size for the necessary ones, memory consumption of
the MTask coarsening is also reduced (measured up to 4.4x reduction
though the accuracy of this is low).

The algorithm is identical except for minor alterations of the order
some candidates are added or removed, this can cause perturbation in the
output due to tied scores being broken based on IDs.
2022-08-19 16:59:20 +01:00
Geza Lore 03ac7ad730 Make PartPropagateCp specific to the MTask graph
While keeping the client code abstract in PartPropagateCp is nice for
testing, there is performance to be had removing the abstraction. As
this code dominates in scheduling large designs, we eliminate the
abstraction and re-work the testing to use the actual LogicMTask and
MTaskEdge graph types. No functional change intended.
2022-08-19 14:06:11 +01:00
Geza Lore cd50949a7e Reuse MTaskEdge instances in MT scheduling
Instead of deleting then re-allocating MTaskEdge instances when merging
two MTasks, just redirect the edged of the donor MTask to the recipient
MTask. This is both faster as it avoids an allocation and a deletion,
together with one update of the sibling maps, and also makes the
algorithm more stable due to MergeCandidate IDs being stable and
allocated up front for all MTaskEdges, before any SiblingMCs are
allocated.

Perturbations in output are expected as the IDs used to break ties
between merge candidates with equal costs are not updated when
redirecting an edge (on purpose). The relinking of only one end of the
graph edges also perturbs the order in which they are enumerated, which
does change candidate opportunities when the number of edges is larger
than PART_SIBLING_EDGE_LIMIT. Confirmed output is identical when
IDs are updated and edges are updated to appear in their original order.
2022-08-19 14:06:11 +01:00
Geza Lore f0040c7b9a Remove reliance on pointer comparison in MT scheduling
The critical path propagation used to rely on a pointer comparison to
break equal scoring critical path updates. Use the corresponding mtask
ids instead, which is deterministic across invocations.
2022-08-19 14:06:11 +01:00
Geza Lore f8a0389e73 Do not use stepCost when gathering sibling merge candidates
siblingPairFromRelatives gathers neighbours of a vertex, and sorts them.
It then takes the N best nodes, and creates sibling merge candidates
from them. We now use the unadjusted cost instead of the step cost of
the vertices when sorting. This is both faster as we need not do the
log-space rounding to compute stepCost, and will also make similar but
yet cheaper nodes appear closer to the front as we don't lose precision
in rounding, hence they are more likely to be entered as merge
candidates. Note that when creating the merge candidate, we still use
the stepCost, so it's purpose of reducing the propagation of critical
path updates is maintained in full. In summary, this should make both
Verilator and the generated model very slightly faster, at least in
theory, and I have observed minor improvement in places.
2022-08-19 14:06:11 +01:00
Geza Lore b436794773 Add specialized GraphStreamUnordered
GraphStreamUnordered used to be GraphStream<std::less<const
V3GraphVertex*>>, but a lot of performance improvements can be had by a
specialized implementation, so added a highly optimized one. This helps
a lot with --debug-partition.
2022-08-19 14:06:11 +01:00
Geza Lore 1404319b28 Merge branch 'master' into develop-v5 2022-08-19 13:39:44 +01:00
Geza Lore 90d22cbec6 Fix `AstNode::exists` return type 2022-08-19 13:22:06 +01:00
Krzysztof Bieganski 33e2acfe61
Fix `AstNode::forall` return type (#3559)
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-08-19 12:33:17 +01:00
Ryszard Rozak db5fdfb0ee
Fix === with some tristate constants (#3551). 2022-08-18 07:03:05 -04:00
Krzysztof Bieganski 951cd73fe0
Handle MemberSel in V3EmitV.cpp (#3555) 2022-08-18 06:33:45 -04:00
Arkadiusz Kozdra 0eeb40b975
Fix converting subclasses to string (#3552) 2022-08-17 18:08:43 -04:00
Wilson Snyder f435d96241 Fix case statement comparing string literal (#3544). 2022-08-15 21:56:09 -04:00
github action d32e3f042f Apply 'make format' 2022-08-12 10:56:12 +00:00
Mostafa Gamal df5f95a5bd
Fix nested default assignment for struct pattern (#3511) (#3524) 2022-08-12 06:55:07 -04:00
Drew Ranck b0c475205b
Fix void-cast queue pop_front or pop_back (#3542) (#3364)
Fix compile error for queue method usage, if it is the
first statement in a block of code, and the return
value is not used. Example:

>  if (foo)
>    void'(bar.pop_front());
2022-08-12 06:51:25 -04:00
Wilson Snyder cbe1b8e266 Fix segfault exporting non-existant package (#3535). 2022-08-08 17:53:50 -04:00
Mariusz Glebocki 2b12fe5773
Internals: Construct V3Number with correct type instead of changing it manually. (#3529) 2022-08-08 08:17:02 -04:00
Yutetsu TAKATSUKASA d20f22beb1
Fix tristate logic when reading inout port in a module #3399 (#3523)
* Tests: Add a test to reproduce #3399

* Fix #3399. When reading an inout port in a module, it should refer the
original inout port, not the generated MODTEMP.
2022-08-07 21:12:57 +09:00
Mariusz Glebocki 122e89ffde
Fix V3Number::isMsbXZ(). (#3530) 2022-08-05 19:12:52 +01:00
Geza Lore c266739e9f Merge branch 'master' into develop-v5 2022-08-05 12:17:57 +01:00
Geza Lore 96a4b3e5a5 Update clang-format config and apply
- Regroup and sort #include directives (like we used to, but automatic)
- Set AlwaysBreakTemplateDeclarations to true
2022-08-05 12:00:24 +01:00
Geza Lore 7403226a97 Merge branch 'master' into develop-v5 2022-08-04 10:03:38 +01:00
Geza Lore fac8e76923 Rework SortByValueMap for better performance
Keep a single std::set of key/value pairs, and a single unordered_map
from key to iterators into the set. Also improve some of the accessing
mechanisms using modern C++. This speeds up multi-threaded ordering by
about 10%.
2022-08-03 21:17:02 +01:00
Geza Lore b864f5f5ba V3Partition: use static_cast with LogicMTaskVertex
dynamic_cast is not free, and the mtask graph contains only
LogicMTaskVertex vertices, use static_cast instead for some speedup.
2022-08-03 17:05:01 +01:00
Geza Lore f9f66d787e Fix integer overflow in V3Unroll (#3451) 2022-08-03 09:41:30 +01:00
Geza Lore bd211c87aa astgen: split 'visit' method declarations from definitions
Add definitions to V3Ast.cpp, and use static_cast.
This fixes a lot of clang-tidy noise.
2022-08-02 17:53:19 +01:00
Geza Lore 6fc25dae9e Fix clang-tidy warnings (#3522) 2022-08-02 15:58:48 +01:00
Kamil Rakoczy cfb6fd8b34
Reduce max RSS usage (#3483)
By constant folding nodes earlier in V3Expand, we can save some max RSS on large designs.
2022-08-02 13:36:14 +01:00
Geza Lore 39d1a62f9e Fix change detection on unpacked arrays
Expand array assignment when creating the trigger, as V3Expand might
mangle it otherwise.
2022-08-02 13:01:41 +01:00
Geza Lore ba66fa7200 Merge branch 'master' into develop-v5 2022-08-02 11:16:35 +01:00
Geza Lore cb60663d49 V3Gate: Defer substitutions until required as well
Similarly to the earlier patch that defers constant folding on optimized
logic, now we also defer the variable substitutions as well. This again
eliminates a lot of traversals, and yields another ~10x speedup of V3Gate
on a design where V3Gate used to dominate while producing identical
results.
2022-08-01 12:54:41 +01:00
Geza Lore 0d2bf23d82 V3Gate: Defer constant folding until required
Rather than constant folding each logic block after every substitution,
only constant fold updated blocks when re-analysed, or at the end. This
removes a lot of invocations of V3Const on large blocks that can be
optimized well, and should yield the same result.

This speeds up V3Gate by ~4x on a design where V3Gate dominates.
2022-07-31 20:42:04 +01:00
Geza Lore 682a60e325 Cleanup V3Gate, no functional change 2022-07-31 20:07:54 +01:00
Geza Lore 2ab6272cc7 Use AstNode::foreach in V3Gate
This yields a little speedup.
2022-07-31 20:05:25 +01:00
Geza Lore 152a6cd886 Improve AstNode::foreach (also exists and forall)
Speed improvements:
- Use a direct, recursion-free implementation
- Improve pre-fetching

Functionality:
- Support remove/replace of currently iterated node
2022-07-31 19:07:32 +01:00
Wilson Snyder 12925cd8b0 Internals: clang-tidy cleanups. No functional change intended. 2022-07-30 12:49:30 -04:00
Wilson Snyder daac7cb90d Merge branch 'master' into develop-v5 2022-07-30 12:09:05 -04:00
Wilson Snyder a2d26b45bb Internals: Fix some clang-tidy issues. No functional change intended. 2022-07-30 11:54:28 -04:00
Geza Lore 38e5b6c1ad Replace __gcov_flush with __gcov_dump
__gcov_flush was a private function and was removed from later GCC
versions (at least from 11.2.0, possibly earlier). Replace with the
documented public __gcov_dump.
2022-07-30 16:02:03 +01:00
Wilson Snyder 4859f5e1fa Merge branch 'master' into develop-v5 2022-07-30 10:26:16 -04:00
Wilson Snyder b9d7819faa Internals: Fix some cppcheck issues. Some dump functions fixed. 2022-07-30 10:01:39 -04:00
Geza Lore ad2fbfe62d Merge branch 'master' into develop-v5 2022-07-29 12:04:24 +01:00
Yutetsu TAKATSUKASA 1f9323d086
Set correct dtype in replaceShiftSame() (#3520)
* Tests: Add a test to reproduce bug3399

* Fix3399. Set the correct dtype in replaceShiftSame().

* Tests: update stats.

* Update Changes
2022-07-29 07:05:04 +09:00
Geza Lore 574dbfded1 V3MergeCond: Fix incorrect merge of assignments to the condition 2022-07-28 15:50:02 +01:00
github action e871cd8a44 Apply 'make format' 2022-07-25 21:47:29 +00:00
Mostafa Gamal 7b431b37c7
Fix struct pattern assignment (#2328) (#3517). 2022-07-25 17:46:22 -04:00
Geza Lore ac4ec87942 Respect clang's default -fbracket-depth by default
Set default value of --comp-limit-parens to 240, to respect default
 maximum nesting of parentheses in clang (which is controlled by
 -fbracket-depth and defaults to 256). For code generation consistency,
 also use the same default with gcc.
2022-07-25 12:59:26 +01:00
Geza Lore 290c2e0388 Mark FileLine::v3errorEndFatal as noreturn 2022-07-25 12:51:02 +01:00
Geza Lore 89924bda51 Always type '$clog2' as signed 32 2022-07-25 12:48:13 +01:00
Yutetsu TAKATSUKASA 60eab3eb8c
Fix wrong result of bit op tree optimization #3509 (#3516)
* Tests: Add a test to reproduce #3509

* Tests: Compile without tautological-compare check because bit op tree optimization is disabled in the test.

* Internals: Dedup code. No functional change is intended.

* Fix #3509.

"2'b10 == (2'b11 & {1'b0, val[0]})"  and "2'b10 != (2'b11 & {1'b0, val[0]})" were
wrongly optimized to "!val[0]" and "val[0]" respectively.
Now properly optimize them to 1'b0 and 1'b1.

* Commentary

* Commentary: Update Changes
2022-07-24 19:54:37 +09:00
Geza Lore 31abe537a0 Fix DPI export trigger sensitivity in 'nba'
Fixes #3508
2022-07-21 17:43:03 +01:00
Geza Lore f9ecbdc70b Merge branch 'master' into develop-v5 2022-07-21 09:56:14 +01:00
Arkadiusz Kozdra 542e324869
Wildcard index type support for associative arrays (#3501).
Associative arrays that specify a wildcard index type may be indexed by
integral expressions of any size, with leading zeros removed
automatically.  A natural representation for such expressions is a
string, especially that the standard explicitly specifies automatic
casts from string indices to bit vectors of equivalent size.
The automatic cast part is done implicitly by the existing type system.

A simpler way to just make this work would be to convert wildcard index
type to a string type directly in the parser code, but several new AST
classes are needed to make sure illegal method calls are detected.
The verilated data structure implementation is reused, because there is
no need for differentiating the behavior on C++ side.
2022-07-20 15:01:36 +02:00
Geza Lore 1c5e5704f5 Fix iteration fixup in AstNode::addHereThisAsNext
Previous version broke verialor_ext_tests due to iteration order
mismatch after 3fc8249429
2022-07-20 13:08:51 +01:00
Geza Lore 1d400dd98c
Configure tracing at run-time, instead of compile time (#3504)
All remaining use of conditional compilation in the tracing
implementation of the run-time library are replaced with the use of
VerilatedModel::traceConfig, and is now done at run-time.
2022-07-20 11:27:10 +01:00
Geza Lore af70db88db Remove unused method 2022-07-19 11:32:16 +01:00
Geza Lore 7ef033f876 Ensure generated Makefile for hierarchical build is stable.
Avoid iterating unordered_map. Iterate sorted blocks instead.
2022-07-19 11:32:01 +01:00
Geza Lore db59c07f27 Implement trace offloading with fewer ifdefs
Step towards a proper run-time library. Reduce the amount of ifdefs in
the implementation of offloaded tracing. There are still a very small
number of ifdefs left, which will need more careful changes in order to
keep user API compatibility.
2022-07-19 11:31:35 +01:00
Geza Lore 9085e34d70 Pass VerilatedModel at trace registration time 2022-07-19 11:00:09 +01:00
Arkadiusz Kozdra 0dfa7d3af5
Internals: const-qualify findDType function. No functional change. (#3502) 2022-07-18 18:58:55 +02:00
Geza Lore c28bf9ce24 Fix change detection over unpacked arrays. 2022-07-18 12:25:22 +01:00
Geza Lore 5a1f1796d7 Fix t/t_public_{clk,src}.pl after merge of master 2022-07-15 16:48:22 +01:00
Todd Strader b0e796ca83
Public combo propagation issues (#2905) 2022-07-15 11:44:32 -04:00
Geza Lore 3773e2ef95 Simplify primary input checks 2022-07-15 16:18:41 +01:00
Geza Lore 00c1f67c57 Make trigger dumping functions always Slow code 2022-07-14 16:28:09 +01:00
Geza Lore 3f19ba1554 Improve handling of extra trigges in V3Sched.
Add utility class for allocation, and add human readable text to debug
code.
2022-07-14 16:06:15 +01:00
Geza Lore f37cc2353d Fix standard library incldues 2022-07-14 15:49:00 +01:00
Geza Lore 6a7bda6910 Correctly schedule combinational logic driven from DPI exports.
Fixes #3429.
2022-07-14 15:35:49 +01:00
Geza Lore ff1b9930fc Handle multiple external domains in V3Order
Make the external domains provider of ordering populate an output
vector, which then allows us to add multiple external sensitivities to
combinational logic.
2022-07-14 11:09:40 +01:00
Geza Lore 582da6df9a Merge branch 'master' into develop-v5 2022-07-14 10:08:52 +01:00
Geza Lore 3bd830eacf Minor clean up of initialization 2022-07-13 18:24:48 +01:00
Geza Lore f4efcbde5c Remove simple use of static data from V3OutFormatter::indentSpaces 2022-07-13 16:15:21 +01:00
Geza Lore 658819bb71 Trivial static const -> constexpr 2022-07-13 16:01:03 +01:00
Geza Lore 3fc8249429 Use AstNode::addHereThisAsNext in a few places 2022-07-13 13:57:00 +01:00
Geza Lore e0a38ce2c2 Remove unnecessary AstNode::clearIter() 2022-07-13 13:57:00 +01:00
Geza Lore 178e1789b5 Make AstNode::addHereThisAsNext always O(1)
Using unlinkFrBackWithNext is O(n) in the size of the list if unlinking
from the middle, so addHereThisAsNext also had this complexity. This
patch implements addHereThisAsNext directly, which is always O(1).
2022-07-13 12:13:40 +01:00
William D. Jones 108c900387
Fix unique_ptr memory header for MinGW64 (#3493). 2022-07-13 06:38:03 -04:00
Wilson Snyder 63507e8e29 Internals: Favor UASSERT_OBJ when have object. 2022-07-12 18:02:57 -04:00
Geza Lore 87f1e06c41 Small algorithmic improvement of PartContraction::siblingPairFromRelatives
Use std::partial_sort for the non-exhaustive case. This is O(n) instead
of O(n*log(n)) in the size of the candidate list being sorted. (It
actually is O(n*log(k)), but k is constant 6 in the non-exhaustive
case).
2022-07-12 19:10:01 +01:00
Geza Lore 7e8bafd217 Remove static data use from PartContraction::siblingPairFromRelatives
Use std::sort with lambda rather than qsort with static function and
static data. Verilation performance neutral.
2022-07-12 19:09:40 +01:00
Geza Lore 457ad07ade Remove unnecessary static state from V3EmitCFunc 2022-07-12 17:51:17 +01:00
Geza Lore c9ac9a75a6 Merge branch 'master' into develop-v5 2022-07-12 17:29:45 +01:00
Geza Lore 79c901c220 Tighten signatures/implementaion of VerilatedModel abstract methods. 2022-07-12 16:06:08 +01:00
Geza Lore b61d819fcb Move contextp() under VerilatedModel 2022-07-12 16:06:08 +01:00
Geza Lore f4038e3674
Move thread pool and execution profiler into the context. (#3477)
Fixes #3454
2022-07-12 11:41:15 +01:00
Arkadiusz Kozdra 8377514127
Add support for $test$plusargs(expr) (#3489) 2022-07-11 06:21:35 -04:00
Wilson Snyder 5f3316d3dc * Fix empty string arguments to display (#3484). 2022-07-09 08:30:57 -04:00
Wilson Snyder a4fddb3fbe Fix table misoptimizing away display (#3488). 2022-07-09 07:55:46 -04:00
Wilson Snyder 3d71716a8a Internals: Constructor style cleanup. No functional change. 2022-07-09 07:40:07 -04:00
Yutetsu TAKATSUKASA 9f37cef1bb
Fix #3470 of incorrect bit op tree optimization (#3476)
* Tests: Add a test to reproduce #3470

* Update LSB during return path of traversal. No functional change is intended.

* Introduce LeafInfo::m_msb

* Update LeafInfo::m_msb when visitin AstCCast

* Internals: Add comment, reorder. No functional change is intended.

* Delete explicit from copy constructor to fix build error.

* Update Changes

* Internals: Remove unused parameter. No functional change is intended.

* Tests: Add explanation to t_const_opt.
2022-07-06 08:33:37 +09:00
Geza Lore 0de1bbc85b Add and use VL_CONSTEXPR_CXX17 2022-07-05 14:21:28 +01:00
Wilson Snyder b25b798dbe Merge branch 'master' into develop-v5 2022-07-04 13:20:03 -04:00
Mariusz Glebocki 2873dbe154
Optimize file writing by using a memory buffer. (#3461) 2022-07-04 10:23:31 -04:00
Yutetsu TAKATSUKASA ced39d0982
Internals: preparation for fixing #3470 (#3475)
* Internals: Let LeafInfo class. No functional change is intended.

* Internals: Rename LeafInfo::width -> LeafInfo::varWidth(). No functional change is intende.
2022-06-27 22:41:33 +09:00
Wilson Snyder fc4d6a62af Remove VL_PROFILER ifdef. Partial (#3454). 2022-06-22 20:06:23 -04:00
Unai Martinez-Corral 11032b1936
Fix bisonpre for MSYS2 (#3471) 2022-06-20 11:59:27 -04:00
Wilson Snyder e7ca4a69e3 Merge branch 'master' into develop-v5 2022-06-19 15:22:09 -04:00
Wilson Snyder 4f93ac6477 Internals: Style modernization. No functional change intended. 2022-06-15 18:49:32 -04:00
Krzysztof Bieganski f7533010c6
Internals: Add `setNoopt()` function to `LifeVisitor` (#3468) 2022-06-15 18:11:03 -04:00
Todd Strader 47b650d821
Fix public unpacked input ports (#3465) 2022-06-15 07:41:59 -04:00
Geza Lore 0c2c097377 Add -fno-merge-cond-motion option
This disables code motion during V3MergeCond, for debugging.
2022-06-13 14:16:11 +01:00
Kevin Kiningham ea8aaa21e8 Fix compile error under strict C++11 mode (#3463) 2022-06-13 12:14:02 +01:00
Kamil Rakoczy 660d1059b0
With --no-decoration, remove output whitespace (#3460)
Signed-off-by: Kamil Rakoczy <krakoczy@antmicro.com>
2022-06-10 07:26:33 -04:00
Wilson Snyder e7dc2de14b Fix BLKANDNBLK on $readmem/$writemem (#3379). 2022-06-04 12:43:18 -04:00
github action aca9fd3bed Apply 'make format' 2022-06-04 16:30:41 +00:00
Wilson Snyder 09f3f40462 Fix clang-discovered missing comma. 2022-06-04 12:27:44 -04:00
Wilson Snyder 0f324c8309 Merge branch 'master' into develop-v5 2022-06-04 11:59:49 -04:00
Wilson Snyder 59dc2853e3 Support concat assignment to packed array (#3446). 2022-06-03 21:32:13 -04:00
Wilson Snyder ada58465b2 Add -f<optimization> options to replace -O<letter> options (#3436). 2022-06-03 20:43:16 -04:00
Wilson Snyder 173f57c636 Changed --no-merge-const-pool to -fno-merge-const-pool (#3436). 2022-06-03 19:41:59 -04:00
Yutetsu TAKATSUKASA d64f979f99
Fix BitOpTree optimization to consider polarity of frozen node (#3445) (#3459)
* Tests: add a test to another failing case of #3445

* Consider polarity as lsb in BitOpTree optimization.
2022-06-01 09:26:16 +09:00
Yutetsu TAKATSUKASA 26b7452178
Fix #3445 of BitOpTreeOpt (#3453)
* Tests: Check BitOpTree statistics in t_const_opt.

* Tests: Add a test to reproduce #3445

* Fix #3445. Don't forget LSB of frozen node in BitOpTreeOpt.

* Apply suggestions from code review

Co-authored-by: Geza Lore <gezalore@gmail.com>
2022-05-30 19:33:06 +09:00
Geza Lore b51f887567
Perform VCD tracing in parallel when using --threads (#3449)
VCD tracing is now parallelized using the same thread pool as the model.
We achieve this by breaking the top level trace functions into multiple
top level functions (as many as --threads), and after emitting the time
stamp to the VCD file on the main thread, we execute the tracing
functions in parallel on the same thread pool as the model (which we
pass to the trace file during registration), tracing into a secondary
per thread buffer. The main thread will then stitch (memcpy) the buffers
together into the output file.

This makes the `--trace-threads` option redundant with `--trace`, which
now only affects `--trace-fst`. FST tracing uses the previous offloading
scheme.

This obviously helps a lot in VCD tracing performance, and I have seen
better than Amdahl speedup, namely I get 3.9x on XiangShan 4T (2.7x on
OpenTitan 4T).
2022-05-29 19:08:39 +01:00
Geza Lore 0722f47539
Improve V3MergeCond by reordering statements (#3125)
V3MergeCond merges consecutive conditional `_ = cond ? _ : _` and
`if (cond) ...` statements. This patch adds an analysis and ordering
phase that moves statements with identical conditions closer to each
other, in order to enable more merging opportunities. This in turn
eliminates a lot of repeated conditionals which reduced dynamic branch
count and branch misprediction rate. Observed 6.5% improvement on
multi-threaded large designs, at the cost of less than 2% increase in
Verilation speed.
2022-05-27 16:57:51 +01:00
Geza Lore 3af5e7e8da Remove scope pointer from OrderEitherVertex.
For ordering, only the scope of logic vertices should be relevant, so
remove the scope pointer from OrderEitherVertex and move it into
OrderLogicVertex. This does not change single-threaded scheduling at
all. Theoretically, multi-threaded scheduling should not be affected
either though due to some implementation quirk depending on vertex order
in a graph the MT schedule is perturbed by this change, but the
performance effect of this is negligible on all benchmarks I have access
to.

No functional change intended.

Fixes #3442
2022-05-25 20:32:32 +01:00
Geza Lore 160f3ee4a7 Remove dead code, no functional change 2022-05-25 19:11:20 +01:00
Krzysztof Bieganski d7a75dc026 Merge branch 'master' into develop-v5 2022-05-25 11:06:38 +02:00
github action a372e010bd Apply 'make format' 2022-05-25 04:51:51 +00:00
Wilson Snyder 530817191e Support non-ANSI interface port declarations (#3439). 2022-05-25 00:50:50 -04:00
Geza Lore c7610ed044 Fix FST tracing thread in CMake build 2022-05-20 17:04:46 +01:00
Geza Lore b130a8cfeb Add -DVM_TRACE_VCD in model builds with Make with --trace 2022-05-20 16:44:38 +01:00
Geza Lore 551bd284dd Rename some internals related to multi-threaded tracing
Rename the implementation internals of current multi-threaded tracing to
be "offload mode". No functional change, nor user interface change
intended.
2022-05-20 16:44:35 +01:00
Krzysztof Bieganski 9edccfdffa
Initial support for intra-assignment timing controls, net delays (#3427)
This is a pre-PR to #3363.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-05-17 19:19:44 +01:00
Geza Lore 1a056f6db9 Fix invalid conditional merging when starting at 'c = c ? a : b'
Fixes #3409.
2022-05-17 18:36:40 +01:00
Krzysztof Bieganski e018eb7bac
Support AstClass::repairCache() after V3Class (#3431)
This is a pre-PR to #3363.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-05-17 09:22:43 -04:00
Geza Lore 282887d9c6 Fix code coverage holes
Fixes #3422
2022-05-16 21:22:21 +01:00
Krzysztof Bieganski 3f7a248ed4
Refactor some of the Begin handling to a separate function (#3426)
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-05-16 20:45:33 +01:00
Krzysztof Bieganski ecaa07a72a
Rename AstTimingControl to AstEventControl (#3425)
Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-05-16 20:44:41 +01:00
Geza Lore 0e62cd11da Don't issue DEPRECATED for now no-op clock_enable attribute
Fixes #3421
2022-05-16 18:57:51 +01:00
Geza Lore 599d23697d
IEEE compliant scheduler (#3384)
This is a major re-design of the way code is scheduled in Verilator,
with the goal of properly supporting the Active and NBA regions of the
SystemVerilog scheduling model, as defined in IEEE 1800-2017 chapter 4.

With this change, all internally generated clocks should simulate
correctly, and there should be no more need for the `clock_enable` and
`clocker` attributes for correctness in the absence of Verilator
generated library models (`--lib-create`).

Details of the new scheduling model and algorithm are provided in
docs/internals.rst.

Implements #3278
2022-05-15 16:03:32 +01:00
Wilson Snyder 3c4131d45d Fix 'with' operator with type casting (#3387). 2022-05-15 09:53:48 -04:00
Wilson Snyder ae8d8ee1ac Fix crash with misuse of display. 2022-05-15 09:29:45 -04:00
Geza Lore 89ec3d16dc Allow const nodes in VNRef
No functional change.
2022-05-15 13:30:07 +01:00
HungMingWu 560efb2c9e
Internals: Fix memory leak in V3FileLine (#3407) (#3408). No functional change intended. 2022-05-14 18:15:38 -04:00
Wilson Snyder 38438b3373 Internals: Cleanup some defaults. No functional change. 2022-05-12 23:30:39 -04:00
Wilson Snyder 71dedccbbe Support compile time trace signal selection with tracing_on/off (#3323). 2022-05-12 22:28:08 -04:00
Wilson Snyder bdfdc737a0 Internals: Cleanup V3Config. No functional change intended. 2022-05-11 00:47:52 -04:00
Wilson Snyder 3d045c3aee Internals: Cleanup some verilog.y formatting. No functional change. 2022-05-09 00:37:51 -04:00
HungMingWu 9583f152ee
Fix compile error when enable VL_LEAK_CHECKS (#3411).
Signed-off-by: HungMingWu <u9089000@gmail.com>
2022-05-08 20:49:13 -04:00
Wilson Snyder 5b2755d28d Untabify verilog.y (#3412). No functional change. 2022-05-08 20:46:18 -04:00
Kamil Rakoczy 9378259779
Fix UNOPTFLAT warning from initial static var (#3406)
Signed-off-by: Kamil Rakoczy <krakoczy@antmicro.com>
2022-05-06 10:24:03 +02:00
Wilson Snyder 3d762282b9 Fix hang with large case statement optimization (#3405). 2022-05-05 07:02:52 -04:00
Geza Lore a2792785fe Add V3GraphVertex::dotRank to add GraphViz ranks to graph dumps
This is a simple debugging aid to allow constraining the graph layout
via GraphViz rank directives. Note this is not related in any way to the
vertex 'rank' attribute used by some of the graph algorithms.

No functional change.
2022-05-02 10:27:26 +01:00
Geza Lore 49c90ecbce Issue consistent INITIALDLY/COMBDLY/BLKSEQ warnings
Some cases of warnings about the use of blocking and non-blocking
assignments in combinational vs sequential processes were suppressed in
a way that is inconsistent with the *actual* current execution model of
Verilator. Turning these back on to, well, warn the user that these might
cause unexpected results. V5 will clean these up, but until then err on
the side of caution.

Fixes #864.
2022-04-29 17:05:44 +01:00
Geza Lore 8395004d25 Add AstNode::exists and AstNode::forall predicates 2022-04-29 15:44:22 +01:00
Kamil Rakoczy 5de1c619c8
Fix foreach segmentation fault (#3400). 2022-04-28 06:11:31 -04:00
Yoda Lee a6d678d41d
Fix hang in generate symbol references (#3391) (#3398) 2022-04-27 18:40:36 -04:00
Aliaksei Chapyzhenka 2b91d764b5
Added missing #include <memory> (#3392)
Fixes #3390
2022-04-23 20:11:46 +01:00
Geza Lore 9abab2c366 Add separate AstInitialStatic node for static initializers
Static variable initializers run before initial blocks, so use an
explicitly different procedure type for them. This also enables us to
now raise errors for assignments to const variables in initial blocks.
2022-04-23 15:12:49 +01:00
Geza Lore b22e368b25 Add default parameters to some Ast nodes for convenience
Also update usage to utilize. No functional change.
2022-04-23 14:47:16 +01:00
Geza Lore a9cd2998e5 Don't mangle run-time library method names. 2022-04-23 14:47:16 +01:00
Geza Lore f1ea30f257 Use iterate*Const V3EmitV visitors. No functional change. 2022-04-23 14:47:12 +01:00
Geza Lore 0b74e9b354 Ensure topological ordering of module list.
At the end of V3Param, fix up the module list to be topologically
sorted. We need to do this at the end as a later instantiation of a
recursive module might instantiate an earlier specialization, which we
cannot know until we processed everything. The rest of the compiler
depends on the module list being topologically sorted.

Fixes #3393
2022-04-23 13:25:27 +01:00
Geza Lore 8189416d0c Partial cleanup of V3Param. No functional change. 2022-04-23 13:03:52 +01:00
Geza Lore 5f0e1fae7f Simplify and clarify reporting of enclosing instance
Rename AstNodeModule::hierName -> someInstanceName and explain that this
is only used for user messages.

Rename AstNode::locationStr -> instanceStr and simplify implementation.
In particular, do not report an instance if we can't find a reasonable
guess.
2022-04-22 23:38:23 +01:00
HungMingWu 880a9be3b1
Internal: Add C++20ish reverse_view for range loops. No functional change (#3388).
Signed-off-by: HungMingWu <u9089000@gmail.com>
2022-04-18 13:03:56 -04:00
Wilson Snyder 7bfc1a00a7 Fix tracing interfaces inside interfaces (#3309). 2022-04-14 09:14:44 -04:00
Julien Margetts baff64a43d
Add VK_USER_OBJS dependency to --create-lib library (#3370) (#3382). 2022-04-12 07:04:31 -04:00
github action b7f2bb0e80 Apply 'make format' 2022-04-12 10:54:48 +00:00
HungMingWu 08e0a397d3
Fix debugi-V3Param null pointer fault (#3380) (#3381)
Signed-off-by: HungMingWu <u9089000@gmail.com>
2022-04-12 06:53:52 -04:00
Wilson Snyder 5f333be947 Internals: Dump TraceDecl codes. 2022-04-10 19:40:27 -04:00
Wilson Snyder f5f4e15ce2 Fix filenames with dots overwriting debug .vpp files (#3373). 2022-04-10 10:33:16 -04:00
Geza Lore fbd568dc47 Prep for multiple AstExecGraph. No functional change. 2022-04-10 12:00:17 +01:00
Geza Lore c79ea88576 Fix incorrect localization when encountering non-leaf functions.
Fixes #3286.
2022-04-09 20:30:39 +01:00
Wilson Snyder 9be4e7b576 Fix Bison 3.8.2 error (#3366). 2022-03-31 19:14:13 -04:00
Wilson Snyder 33105f017c Commentary 2022-03-30 20:17:59 -04:00
Wilson Snyder e02f97854c Deprecate 'vluint64_t' and similar types (#3255). 2022-03-27 15:27:40 -04:00
Wilson Snyder 3f7bf3d2dc Fix MSVC localtime_s (#3124). 2022-03-27 13:59:18 -04:00
Geza Lore f9e69984ff Set vlSymsp in modules at construction time.
This ensures it's available from very early on. No functional change.
2022-03-27 16:10:20 +01:00
Geza Lore b1b5b5dfe2 Improve run-time profiling
The --prof-threads option has been split into two independent options:
1. --prof-exec, for collecting verilator_gantt and other execution
related profiling data, and
2. --prof-pgo, for collecting data needed for PGO

The implementation of execution profiling is extricated from
VlThreadPool and is now a separate class VlExecutionProfiler. This means
--prof-exec can now be used for single-threaded models (though it does
not measure a lot of things just yet). For consistency VerilatedProfiler
is renamed VlPgoProfiler. Both VlExecutionProfiler and VlPgoProfiler are
in verilated_profiler.{h/cpp}, but can be used completely independently.

Also re-worked the execution profile format so it now only emits events
without holding onto any temporaries. This is in preparation for some
future optimizations that would be hindered by the introduction of function
locals via AstText.

Also removed the Barrier event. Clearing the profile buffers is not
notably more expensive as the profiling records are trivially
destructible.
2022-03-27 15:57:30 +02:00
Wilson Snyder 4eaa6fdd06 Internals: Use python pass appropriately. No functional change intended. 2022-03-26 15:57:52 -04:00
Yutetsu TAKATSUKASA 47226236f4
Internals: Resolve potential SEGV risk (#3350) 2022-03-13 18:13:51 +09:00
Drew Ranck 90fb2e5487
Fix ++/-- tree fix in case statements (#3346) (#3349). 2022-03-12 11:24:32 -05:00
Wilson Snyder f211616a4c Fix missing debug, and code cleanup in V3LinkInc. 2022-03-11 07:34:11 -05:00
github action 181b9a5795 Apply 'make format' 2022-03-06 22:17:42 +00:00
Wilson Snyder 9baf9c55c2 Commentary 2022-03-06 17:16:41 -05:00
Yutetsu TAKATSUKASA 999751c422
Count non-empty always blocks in V3Split (#3337)
"Optimizations, Split always" in stats now means the number of newly added always.
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
2022-03-06 12:56:34 +09:00
Wilson Snyder 22656d6fdd Fix Vdeeptemp error with --threads and --compiler clang (#3338). 2022-03-05 20:17:36 -05:00
Wilson Snyder 90c61c79d6 Fix unnamedblk error on foreach (#3321). 2022-03-05 17:04:52 -05:00
Wilson Snyder 4ba3bff87f Fix class stringification on wide arrays (#3312). 2022-03-05 16:32:30 -05:00
Wilson Snyder c3dd6f5344 Fix public function arguments that are arrayed (#3316). 2022-03-05 16:19:53 -05:00
Geza Lore 3737d209f6 Keep recursive module list topologically (#3324).
Fixes (#3324).
2022-03-05 15:04:13 +00:00
Todd Strader 29c4b0a141
Fix cast to array types (#3333) 2022-03-03 07:48:04 -05:00
Geza Lore 5b9806ae6d Improve V3Combine
- Always use a fast function to replace a slow one if available
- Iterate to fixed point (i.e.: if combining made more functions
identical, combine those too). This will be more useful in the future.
- Use only single, const traversal
2022-02-27 20:40:58 +00:00
Geza Lore 665fa140a8 V3Combine: Fix crash if CCall in expression position 2022-02-27 12:52:40 +00:00
Yutetsu TAKATSUKASA 32f843a214
Internals: Don't show "Split always" statistics twice. (Split and Reorder were shown). (#3328) 2022-02-27 20:33:54 +09:00
github action 47069dfe52 Apply 'make format' 2022-02-27 07:53:05 +00:00
HungMingWu 43a84d7ad8
Internals: Fix VL_RESTORER behavior on passing a lvalue reference (#3326)
Signed-off-by: HungMingWu <u9089000@gmail.com>
2022-02-27 07:52:11 +00:00
Geza Lore decfa6bd7a V3Order: Use unique ordinals per function name
This helps diffing generated code after reordering output, otherwise no
functional change.
2022-02-16 18:36:40 +00:00
Geza Lore 8931bd37e2 Cleanup V3Changed and V3GenClk 2022-02-16 18:09:19 +00:00
Geza Lore 4b79d23d00 Replace SenTreeSet with generic collection
Introduce VNRef that can be used to wrap AstNode keys in STL
collections, resulting in equality comparisons rather than identity
comparisons. This can then replace the SenTreeSet data-structure.
2022-02-16 18:09:19 +00:00
github action 77fe7c426e Apply 'make format' 2022-02-16 05:11:38 +00:00
Raynard Qiao 331c2244fc
Fixed signed number operation (#3294) (#3308) 2022-02-16 00:10:34 -05:00
Wilson Snyder 77e68acf54 Suppress WIDTH warning on negate using carry bit (#2395). [Peter Monsson] 2022-02-13 15:27:31 -05:00
Wilson Snyder 7a355d448a Fix skipping public enum values with four-state values (#3303). 2022-02-10 19:27:28 -05:00
Geza Lore fb9119ff49 Rename AstCFunc attribute for clarity.
'formCallTree' -> 'isFinal'. No functional change.
2022-01-28 16:18:50 +00:00
Geza Lore 26bdfc3474 Commentary 2022-01-21 05:53:42 +00:00
Wilson Snyder 0e91d8a10e Internal: Rename for clarity. No functional change. 2022-01-19 19:14:09 -05:00
Wilson Snyder 434c3c3ef3 Removed the deprecated "fl" attribute in XML output; use "loc" attribute instead. 2022-01-17 16:22:07 -05:00
Wilson Snyder 21e05c43dd Removed the deprecated lint_off flag -msg; use -rule instead. 2022-01-17 16:04:06 -05:00
Geza Lore f8c0169e82 Implement 'forceable' attribute
Using the 'forceable' directive in a configuration file, or the /*
verilator forceable */ metacomment on a variable declaration will
generate additional public signals that allow the specified signals to
be forced/released from the C++ code.
2022-01-16 15:31:37 +00:00
Geza Lore 539c9d4c63 Merge alternate 'force'/'release' implementation
- Add more tests, including for tracing.
- Apply some cleaner, more generic abstractions in the implementation.
- Use clearer AstRelease which is not an assignment.
2022-01-16 15:31:37 +00:00
Geza Lore b4d8220cbb
Deprecate --cdc (#3279) 2022-01-16 15:30:44 +00:00
Wilson Snyder e931c6230a Run EmitV test after all stages, and fix resulting fallout 2022-01-09 18:11:24 -05:00
Geza Lore 64a6e1ac8b
Add AstNode::foreach method for simple pre-order traversal (#3276) 2022-01-09 22:34:10 +00:00
Wilson Snyder 50094ca296 Internals: Add cpplint control file and related cleanups 2022-01-09 16:49:38 -05:00
Wilson Snyder 15b32dc140 Internals: cpplint cleanups. No functional change. 2022-01-08 12:01:39 -05:00
Wilson Snyder 441ecfedc9 Internals: Make all .h files compilable 2022-01-08 11:18:23 -05:00
HungMingWu 78147ee8d7 Fix compile error at GCC11
Fixes #3273

Signed-off-by: HungMingWu <u9089000@gmail.com>
2022-01-08 10:40:51 +00:00
Geza Lore 8c58612a3b Improve V3Inline speed and memory consumption
Avoid cloning the module when inlining the last instance that references
that module. This saves a lot of memory because it saves cloning
singleton modules (those with a single instance), which we always
inline. The top few levels of the hierarchy are often simple wrappers,
including the one added by Verilator in V3LinkLevel::wrapTop. Cloning
these and putting off deleting the originals can be very expensive
because they often have a lot of contents inlined into them, so each
layer of wrapper that is inlined would essentially add a whole new clone
of the large top-level. Directly inlining the module for the last cell
without cloning saves us from all this duplicate memory consumption and
also from having to create the clones in the first place.

Also added minor traversal speedups

This reduces the memory consumption of V3Inline by 80% and peak memory
consumption of Verilator by about 66% on a large design, while speeding
up the V3Inline pass by ~3.5x and the whole of Verilator by ~8% while
producing identical output.
2022-01-07 12:11:10 +00:00
Geza Lore 56f9d244de Cleanup V3Inline. No functional change. 2022-01-07 12:08:17 +00:00
Geza Lore 2ba9eb4228 Speed up TSP sort implementation
- More efficient comparison by pre-computing sorting keys.
- Remove work items in algorithms known to be redundant earlier.
  This greatly reduces data structure sizes.
- Use V3GraphVertex->user() for state tracking instead of unordered_map
  while both of these are constant time, they do add up.
- In `makeMinSpanningTree`, instead of batch inserting outgoing edges of
  each visited vertex into an ordered set, keep an ordered set of sorted
  vectors of edges. This reduces the size of the ordered set
  significantly (it is now O(V) rather than O(E), and as the subject
  graph is a complete graph, V ~ sqrt(E), so this is a significant gain).
- Use a vector + sorting in `perfectMatching` instead of an ordered set.
  This is faster on large working sets.

This yields 3.8x speedup on the variable order pass and overall 14%
verilation speed gain on a large design.
2022-01-07 12:05:52 +00:00
Geza Lore 9a8c878f2d Avoid repeated traversal for SC text sections in emit when not needed
Repeatedly traversing whole modules in emit (due to file splitting)
looking for `systemc_* sections can add up to a lot of time on large
designs that have been flattened and need to be split into many files.
Assuming `systemc_* is a rarely used feature, just don't bother if we
don't need to. This gain 9% verilation speed improvement on a large
benchmark.
2022-01-07 12:05:50 +00:00
Wilson Snyder 41a563bdc8 Internal cleanups towards recursive functions (#3267) 2022-01-04 20:19:58 -05:00
Yutetsu TAKATSUKASA 4e5f30858b
Fix #3258 of internal error with inout port (#3268)
* Tests: Modify t_tri_inout to reproduce #3258

* Set direction of __en accorting to its main signal direction

* Update Changes
2022-01-05 08:37:20 +09:00
Wilson Snyder b989ac6db5 Internals: Support linking recursive function calls (but not later stages) 2022-01-03 18:50:41 -05:00
Wilson Snyder 4d1f4bbf49 Backout last commit; is unstable. 2022-01-03 13:04:47 -05:00
Wilson Snyder e9ad665d32 Internals: Support linking recursive function calls (but not later stages) 2022-01-03 12:25:50 -05:00
Wilson Snyder ebf5c11e03 Internals: In astgen text output, pickup missing node references 2022-01-02 20:54:39 -05:00
Wilson Snyder 7e355e211c Fix dangling node on error 2022-01-02 20:54:13 -05:00
Wilson Snyder f36461e696 Internals: Remove dead code 2022-01-02 18:38:07 -05:00
github action 88d7ca01b0 Apply 'make format' 2022-01-02 20:13:16 +00:00
Wilson Snyder 2e2b82c052 Support class static members (#2233). 2022-01-02 15:09:07 -05:00
Wilson Snyder f1bb0544be Internals: Cleanups towards static class members. No functional change intended. 2022-01-02 15:03:57 -05:00
Wilson Snyder e6857df5c6 Internals: Rename Ast on non-node classes (#3262). No functional change.
This commit has the following replacements applied:

	s/\bAstUserInUseBase\b/VNUserInUseBase/g;
        s/\bAstAttrType\b/VAttrType/g;
        s/\bAstBasicDTypeKwd\b/VBasicDTypeKwd/g;
        s/\bAstDisplayType\b/VDisplayType/g;
        s/\bAstNDeleter\b/VNDeleter/g;
        s/\bAstNRelinker\b/VNRelinker/g;
        s/\bAstNVisitor\b/VNVisitor/g;
        s/\bAstPragmaType\b/VPragmaType/g;
        s/\bAstType\b/VNType/g;
        s/\bAstUser1InUse\b/VNUser1InUse/g;
        s/\bAstUser2InUse\b/VNUser2InUse/g;
        s/\bAstUser3InUse\b/VNUser3InUse/g;
        s/\bAstUser4InUse\b/VNUser4InUse/g;
        s/\bAstUser5InUse\b/VNUser5InUse/g;
        s/\bAstVarType\b/VVarType/g;
2022-01-02 14:03:20 -05:00
github action 73374a0303 Apply 'make format' 2022-01-02 18:36:52 +00:00
Wilson Snyder e334740dd6 Add AstInitialAutomatic as prep for static class members 2022-01-02 12:35:44 -05:00
Wilson Snyder 84ee833ea7 Ignore --x-initial unique inside classes. 2022-01-02 12:26:10 -05:00
Wilson Snyder b7ad1e6d61 Internals: Rename some non-nodes to avoid Ast prefix. No functional change. 2022-01-02 10:37:20 -05:00
github action 340efe3a3a Apply 'make format' 2022-01-02 14:46:15 +00:00