Commit Graph

21 Commits

Author SHA1 Message Date
Veripool API Bot 1f67080a1f Tests: Verilog format 2026-03-09 21:39:16 -04:00
Geza Lore 505d33b35a
Support #0 delays with IEEE-1800 compliant semantics (#7079)
This patch adds IEEE-1800 compliant scheduling support for the Inactive
scheduling region used for #0 delays.

Implementing this requires that **all** IEEE-1800 active region events
are placed in the internal 'act' section. This has simulation
performance implications. It prevents some optimizations (e.g.
V3LifePost), which reduces single threaded performance. It also reduces
the available work and parallelism in the internal 'nba' section, which
reduced the effectiveness of multi-threading severely.

Performance impact on RTLMeter when using scheduling adjusted to support
proper #0 delays is ~10-20% slowdown in single-threaded mode, and ~100%
(2x slower) with --threads 4.

To avoid paying this performance penalty unconditionally, the scheduling
is only adjusted if either:
1. The input contains a statically known #0 delay
2. The input contains a variable #x delay unknown at compile time

If no #0 is present, but #x variable delays are, a ZERODLY warning is
issued advising the use of '--no-sched-zero-delay' which is a promise
by the user that none of the variable delays will evaluate to a zero
delay at run-time. This warning is turned off if '--sched-zero-delay'
is explicitly given. This is similar to the '--timing' option.

If '--no-sched-zero-delay' was used at compile time, then executing
a zero delay will fail at runtime.

A ZERODLY warning is also issued if a static #0 if found, but the user
specified '--no-sched-zero-delay'. In this case the scheduling is not
adjusted to support #0, so executing it will fail at runtime. Presumably
the user knows it won't be executed.

The intended behaviour with all this is the following:

No #0, no #var in the design (#constant is OK)
-> Same as current behaviour, scheduling not adjusted,
   same code generated as before

Has static #0 and '--no-sched-zero-delay' is NOT given:
-> No warnings, scheduling adjusted so it just works, runs slow

Has static #0 and '--no-sched-zero-delay' is given:
-> ZERODLY on the #0, scheduling not adjusted, fails at runtime if hit

No static #0, but has #var and no option is given:
-> ZERODLY on the #var advising use of '--no-sched-zero-delay' or
   '--sched-zero-delay' (similar to '--timing'), scheduling adjusted
   assuming it can be a zero delay and it just works

No static #0, but has #var and '--no-sched-zero-delay' is given:
-> No warning, scheduling not adjusted, fails at runtime if zero delay

No static #0, but has #var and '--sched-zero-delay' is given:
-> No warning, scheduling adjusted so it just works
2026-02-16 03:55:55 +00:00
Igor Zaworski 446bec3d1a
Fix event triggering (#6932) 2026-02-11 10:35:59 -08:00
Geza Lore 2e502aead8
Internals: Make all scheduling region use a single trigger vector. (#6620)
The 'act' region used to have 2 trigger vectors ('act' and 'pre'), now
it uses a single "extended" trigger vector where the top bits are what
used to be the used bits in the 'pre' trigger vector. Please see the
description above `TriggerKit`. Also move the extra triggers from the
low end to the high end in the trigger vectors.
2025-11-01 15:43:20 +00:00
Geza Lore 922223a9c3
Internals: Replace VlTriggerVec with unpacked array (#6616)
Removed the VlTriggerVec type, and refactored to use an unpacked array
of 64-bit words instead. This means the trigger vector and its
operations are now the same as for any other unpacked array. The few
special functions required for operating on a trigger vector are now
generated in V3SchedTrigger as regular AstCFunc if needed.

No functional change intended, performance should be the same.
2025-10-31 18:29:11 +00:00
Geza Lore 636a6b8cd2
Optimize complex combinational logic in DFG (#6298)
This patch adds DfgLogic, which is a vertex that represents a whole,
arbitrarily complex combinational AstAlways or AstAssignW in the
DfgGraph.

Implementing this requires computing the variables live at entry to the
AstAlways (variables read by the block), so there is a new
ControlFlowGraph data structure and a classical data-flow analysis based
live variable analysis to do that at the variable level (as opposed to
bit/element level).

The actual CFG construction and live variable analysis is best effort,
and might fail for currently unhandled constructs or data types. This
can be extended later.

V3DfgAstToDfg is changed to convert the Ast into an initial DfgGraph
containing only DfgLogic, DfgVertexSplice and DfgVertexVar vertices.

The DfgLogic are then subsequently synthesized into primitive operations
by the new V3DfgSynthesize pass, which is a combination of the old
V3DfgAstToDfg conversion and new code to handle AstAlways blocks with
complex flow control.

V3DfgSynthesize by default will synthesize roughly the same constructs
as V3DfgAstToDfg used to handle before, plus any logic that is part of a
combinational cycle within the DfgGraph. This enables breaking up these
cycles, for which there are extensions to V3DfgBreakCycles in this patch
as well. V3DfgSynthesize will then delete all non synthesized or non
synthesizable DfgLogic vertices and the rest of the Dfg pipeline is
identical, with minor changes to adjust for the changed representation.

Because with this change we can now eliminate many more UNOPTFLAT, DFG
has been disabled in all the tests that specifically target testing the
scheduling and reporting of circular combinational logic.
2025-08-19 15:06:38 +01:00
Wilson Snyder 15ebbd309f Fix always processes ignoring $finish (#5971). 2025-05-02 07:36:42 -04:00
Geza Lore 3bc09d49fb
Generate one trigger per SenItem instead of per SenTree (#5483) 2024-09-25 10:35:50 +01:00
Krzysztof Bieganski afb8428db4
Support IEEE-compliant intra-assign delays (#3711) (#5441) 2024-09-06 18:13:52 -04:00
Jinyan Xu 4650105d90
Fix conflicted namespace for coroutines (#4701) (#4707) 2023-11-20 21:02:10 -05:00
Wilson Snyder 9a0254a118
Optimize timing-delayed queue (#4584). (#4669) 2023-11-11 10:04:10 -05:00
Geza Lore 2cba167634 Make eval loop construction more unified and the output more readable 2023-10-28 08:48:04 +01:00
Geza Lore a09506a0ad Trivial simplification of V3EmitCModel
Still some remains of the --threads 0 mode. Remove unnecessary complexity
from V3EmitCModel. (Also don't pretend there is an MTask in single
threaded mode, when there really isn't.)
2023-10-21 20:41:46 +01:00
Wilson Snyder 6d5dde8645 Fix duplicate Vfork functions (#4418) 2023-08-30 17:59:25 -04:00
Aleksander Kiryk 7c7c92d2dd
Fix coroutine handle movement during queue manipulation (#4431) 2023-08-21 10:22:09 -04:00
Ryszard Rozak 4522834f7a
Fix duplicate fork names (#4295) 2023-06-22 06:51:53 -04:00
Adam Bagley 003a8cfe75
Add lint warning on always_comb multidriven (#3888) (#3939) 2023-02-23 05:36:28 -05:00
Kamil Rakoczy d6126c4b32
Remove --no-threads; require --threads 1 for single threaded (#3703). 2022-11-05 08:47:34 -04:00
Krzysztof Bieganski 5688d1a935
Internals: Add `V3UniqueNames` consistency assertion (#3692) 2022-10-21 07:05:38 -04:00
Wilson Snyder 916a3d9066 Fix --main --trace missing initial timestep (#3678). 2022-10-15 13:24:38 -04:00
Krzysztof Bieganski 39af5d020e
Timing support (#3363)
Adds timing support to Verilator. It makes it possible to use delays,
event controls within processes (not just at the start), wait
statements, and forks.

Building a design with those constructs requires a compiler that
supports C++20 coroutines (GCC 10, Clang 5).

The basic idea is to have processes and tasks with delays/event controls
implemented as C++20 coroutines. This allows us to suspend and resume
them at any time.

There are five main runtime classes responsible for managing suspended
coroutines:
* `VlCoroutineHandle`, a wrapper over C++20's `std::coroutine_handle`
  with move semantics and automatic cleanup.
* `VlDelayScheduler`, for coroutines suspended by delays. It resumes
  them at a proper simulation time.
* `VlTriggerScheduler`, for coroutines suspended by event controls. It
  resumes them if its corresponding trigger was set.
* `VlForkSync`, used for syncing `fork..join` and `fork..join_any`
  blocks.
* `VlCoroutine`, the return type of all verilated coroutines. It allows
  for suspending a stack of coroutines (normally, C++ coroutines are
  stackless).

There is a new visitor in `V3Timing.cpp` which:
  * scales delays according to the timescale,
  * simplifies intra-assignment timing controls and net delays into
    regular timing controls and assignments,
  * simplifies wait statements into loops with event controls,
  * marks processes and tasks with timing controls in them as
    suspendable,
  * creates delay, trigger scheduler, and fork sync variables,
  * transforms timing controls and fork joins into C++ awaits

There are new functions in `V3SchedTiming.cpp` (used by `V3Sched.cpp`)
that integrate static scheduling with timing. This involves providing
external domains for variables, so that the necessary combinational
logic gets triggered after coroutine resumption, as well as statements
that need to be injected into the design eval function to perform this
resumption at the correct time.

There is also a function that transforms forked processes into separate
functions.

See the comments in `verilated_timing.h`, `verilated_timing.cpp`,
`V3Timing.cpp`, and `V3SchedTiming.cpp`, as well as the internals
documentation for more details.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-08-22 13:26:32 +01:00