Commit Graph

52 Commits

Author SHA1 Message Date
Mariusz Glebocki 28bd7e5b19
Rework multithreading handling to separate by code units that use/never use it. (#4228) 2023-09-24 22:12:23 -04:00
Yinan Xu b4b74d72f0
Add prepareClone and atClone APIs for Verilated models (#3503) (#4444)
This API is used if the user copies the process using `fork`
and similar OS-level mechanisms. The `at_clone` member function
ensures that all model-allocated resources are re-allocated, such
that the copied child process/model can simulate correctly.

A typical allocated resource is the thread pool, which every model
has its own pool.
2023-08-30 07:02:55 -04:00
Wilson Snyder dc25be536c Internals: Deprecate VL_ATTR_ALIGNED, use alignas instead. 2023-05-02 21:21:10 -04:00
Wilson Snyder f2aac8c49a Internals: Use VNVisitorConst where possible, for better performance. No functional change indended. 2023-03-18 12:23:17 -04:00
Wilson Snyder b24d7c83d3 Copyright year update 2023-01-01 10:18:39 -05:00
Wilson Snyder 352d0b4582 Internals: Fix constructor style. 2022-11-20 13:11:01 -05:00
Wilson Snyder 3cd2c8532d Internals: Cleanup spacing of Vi for loops. 2022-10-15 18:47:10 -04:00
Krzysztof Bieganski 1a8188e1b4
Fix linker errors in user-facing timing functions (#3657)
Before this change, a design verilated with `--timing` that does not
actually use timing features would be emitted with `eventsPending` and
`nextTimeSlot` declared in the top class. However, their definitions
would be missing, leading to linker errors during design compilation.
This patch makes Verilator always emit the definitions, which prevents
linker errors. Trying to use `nextTimeSlot` without delays in the design
will result in an error at runtime.
2022-10-05 18:16:05 -04:00
Krzysztof Bieganski 9c2ead90d5
Add custom memory management for verilated classes (#3595)
This change introduces a custom reference-counting pointer class that
allows creating such pointers from 'this'. This lets us keep the
receiver object around even if all references to it outside of a class
method no longer exist. Useful for coroutine methods, which may outlive
all external references to the object.

The deletion of objects is deferred until the next time slot. This is to
make clearing the triggered flag on named events in classes safe
(otherwise freed memory could be accessed).
2022-09-28 18:54:18 -04:00
Geza Lore ddb678cc5b Merge branch 'master' into develop-v5 2022-09-22 17:33:36 +01:00
Geza Lore 63c694f65f Streamline dump control options
- Rename `--dump-treei` option to `--dumpi-tree`, which itself is now a
  special case of `--dumpi-<tag>` where tag can be a magic word, or a
  filename
- Control dumping via static `dump*()` functions, analogous to `debug()`
- Make dumping independent of the value of `debug()` (so dumping always
  works even without the debug flag)
- Add separate `--dumpi-graph` for dumping V3Graphs, which is again a
  special case of `--dumpi-<tag>`
- Alias `--dump-<tag>` to `--dumpi-<tag> 3` as before
2022-09-22 17:24:41 +01:00
Wilson Snyder d162619bd3 Merge branch 'master' into develop-v5 2022-09-20 20:06:21 -04:00
Yu-Sheng Lin bba800f2d6
Fix calling trace() after open() segfault (#3610) (#3627) 2022-09-20 16:45:09 -04:00
Geza Lore af305bf280 Merge branch 'master' into develop-v5 2022-09-16 16:24:36 +01:00
Geza Lore 0c70a0dcbf Remove redundant 'virtual' keywords from overridden methods
'virtual' is redundant when 'override' is present, so keep only
'override'.

Add t/t_dist_cppstyle.pl to check for this.
2022-09-16 15:19:38 +01:00
Geza Lore 27031ed688 Merge branch 'master' into develop-v5 2022-09-15 10:28:35 +01:00
Wilson Snyder d85b909054 Internals: Use std:: for mem and str functions. 2022-09-14 21:10:19 -04:00
Krzysztof Bieganski 39af5d020e
Timing support (#3363)
Adds timing support to Verilator. It makes it possible to use delays,
event controls within processes (not just at the start), wait
statements, and forks.

Building a design with those constructs requires a compiler that
supports C++20 coroutines (GCC 10, Clang 5).

The basic idea is to have processes and tasks with delays/event controls
implemented as C++20 coroutines. This allows us to suspend and resume
them at any time.

There are five main runtime classes responsible for managing suspended
coroutines:
* `VlCoroutineHandle`, a wrapper over C++20's `std::coroutine_handle`
  with move semantics and automatic cleanup.
* `VlDelayScheduler`, for coroutines suspended by delays. It resumes
  them at a proper simulation time.
* `VlTriggerScheduler`, for coroutines suspended by event controls. It
  resumes them if its corresponding trigger was set.
* `VlForkSync`, used for syncing `fork..join` and `fork..join_any`
  blocks.
* `VlCoroutine`, the return type of all verilated coroutines. It allows
  for suspending a stack of coroutines (normally, C++ coroutines are
  stackless).

There is a new visitor in `V3Timing.cpp` which:
  * scales delays according to the timescale,
  * simplifies intra-assignment timing controls and net delays into
    regular timing controls and assignments,
  * simplifies wait statements into loops with event controls,
  * marks processes and tasks with timing controls in them as
    suspendable,
  * creates delay, trigger scheduler, and fork sync variables,
  * transforms timing controls and fork joins into C++ awaits

There are new functions in `V3SchedTiming.cpp` (used by `V3Sched.cpp`)
that integrate static scheduling with timing. This involves providing
external domains for variables, so that the necessary combinational
logic gets triggered after coroutine resumption, as well as statements
that need to be injected into the design eval function to perform this
resumption at the correct time.

There is also a function that transforms forked processes into separate
functions.

See the comments in `verilated_timing.h`, `verilated_timing.cpp`,
`V3Timing.cpp`, and `V3SchedTiming.cpp`, as well as the internals
documentation for more details.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-08-22 13:26:32 +01:00
Geza Lore c266739e9f Merge branch 'master' into develop-v5 2022-08-05 12:17:57 +01:00
Geza Lore 96a4b3e5a5 Update clang-format config and apply
- Regroup and sort #include directives (like we used to, but automatic)
- Set AlwaysBreakTemplateDeclarations to true
2022-08-05 12:00:24 +01:00
Wilson Snyder 4859f5e1fa Merge branch 'master' into develop-v5 2022-07-30 10:26:16 -04:00
Wilson Snyder b9d7819faa Internals: Fix some cppcheck issues. Some dump functions fixed. 2022-07-30 10:01:39 -04:00
Geza Lore f9ecbdc70b Merge branch 'master' into develop-v5 2022-07-21 09:56:14 +01:00
Geza Lore 1d400dd98c
Configure tracing at run-time, instead of compile time (#3504)
All remaining use of conditional compilation in the tracing
implementation of the run-time library are replaced with the use of
VerilatedModel::traceConfig, and is now done at run-time.
2022-07-20 11:27:10 +01:00
Geza Lore 9085e34d70 Pass VerilatedModel at trace registration time 2022-07-19 11:00:09 +01:00
Geza Lore c9ac9a75a6 Merge branch 'master' into develop-v5 2022-07-12 17:29:45 +01:00
Geza Lore 79c901c220 Tighten signatures/implementaion of VerilatedModel abstract methods. 2022-07-12 16:06:08 +01:00
Geza Lore b61d819fcb Move contextp() under VerilatedModel 2022-07-12 16:06:08 +01:00
Geza Lore f4038e3674
Move thread pool and execution profiler into the context. (#3477)
Fixes #3454
2022-07-12 11:41:15 +01:00
Geza Lore 599d23697d
IEEE compliant scheduler (#3384)
This is a major re-design of the way code is scheduled in Verilator,
with the goal of properly supporting the Active and NBA regions of the
SystemVerilog scheduling model, as defined in IEEE 1800-2017 chapter 4.

With this change, all internally generated clocks should simulate
correctly, and there should be no more need for the `clock_enable` and
`clocker` attributes for correctness in the absence of Verilator
generated library models (`--lib-create`).

Details of the new scheduling model and algorithm are provided in
docs/internals.rst.

Implements #3278
2022-05-15 16:03:32 +01:00
Geza Lore b1b5b5dfe2 Improve run-time profiling
The --prof-threads option has been split into two independent options:
1. --prof-exec, for collecting verilator_gantt and other execution
related profiling data, and
2. --prof-pgo, for collecting data needed for PGO

The implementation of execution profiling is extricated from
VlThreadPool and is now a separate class VlExecutionProfiler. This means
--prof-exec can now be used for single-threaded models (though it does
not measure a lot of things just yet). For consistency VerilatedProfiler
is renamed VlPgoProfiler. Both VlExecutionProfiler and VlPgoProfiler are
in verilated_profiler.{h/cpp}, but can be used completely independently.

Also re-worked the execution profile format so it now only emits events
without holding onto any temporaries. This is in preparation for some
future optimizations that would be hindered by the introduction of function
locals via AstText.

Also removed the Barrier event. Clearing the profile buffers is not
notably more expensive as the profiling records are trivially
destructible.
2022-03-27 15:57:30 +02:00
Wilson Snyder ca42be982c Copyright year update. 2022-01-01 08:26:40 -05:00
Yutetsu TAKATSUKASA 0658a7654f
Add tests of tracing SystemC model verilated with --hierarchical (#3252)
* Tests: Add t_hier_block_sc_trace(fst|vcd) that tests tracing hierarchical block on SystemC.

* Add a check that elaboration is done before a trace file is opened.

* Add a check that elaboration is done before trace() is called to verilated SystemC model.

* Tests: call sc_core::sc_start(sc_core::SC_ZERO_TIME) before opening a trace file

* Tests: Fix t_trace_two_sc to call sc_start before opening trace

* Use vl_fatal as suggested in PR review.
2021-12-23 08:41:11 +09:00
Geza Lore ff425369ac
Reduce .rodata footprint of trace initialization (#3250)
Trace initialization (tracep->decl* functions) used to explicitly pass
the complete hierarchical names of signals as string constants. This
contains a lot of redundancy (path prefixes), does not scale well with
large designs and resulted in .rodata sections (the string constants) in
ELF executables being extremely large.

This patch changes the API of trace initialization that allows pushing
and popping name prefixes as we walk the hierarchy tree, which are
prepended to declared signal names at run-time during trace
initialization. This in turn allows us to emit repeat path/name
components only once, effectively removing all duplicate path prefixes.
On SweRV EH1 this reduces the .rodata section in a --trace build by 94%.

Additionally, trace declarations are now emitted in lexical order by
hierarchical signal names, and the top level trace initialization
function respects --output-split-ctrace.
2021-12-19 15:15:07 +00:00
Wilson Snyder 04e0c7e4f1 Support tracing through --hierarchical/--lib-create libraries (#3200). 2021-11-27 17:07:27 -05:00
Wilson Snyder cd737065f2 Internals: More const. No functional change intended. 2021-11-26 17:55:36 -05:00
Wilson Snyder cfa401e01d Internals: Add lib-create const and improve comments. 2021-11-14 12:01:02 -05:00
Wilson Snyder d6195e354f Add some missing COLD attributes. 2021-11-14 11:05:20 -05:00
Geza Lore 381c87a1eb Remove VN_CAST_CONST and VN_AS_CONST.
The _CONST suffix on these macros is only lexical notation, pointer
constness can be preserved by overloading the underlying
implementations appropriately. Given that the compiler will catch
invalid const usage (trying to assign a non-const pointer to a const
pointer variable, etc.), and that the declarations of symbols should
make their constness obvious, I see no reason to keep the _CONST
flavours.
2021-10-24 11:43:48 +01:00
Wilson Snyder c2819923c5 Verilator_gantt now shows the predicted mtask times, eval times, and additional statistics. 2021-09-23 22:59:36 -04:00
Wilson Snyder 68f1432a68 Gantt: Subtract common start in slowpath to reduce collection measurement error. 2021-09-23 19:43:20 -04:00
Wilson Snyder 97d8d32049 Commentary 2021-09-17 18:52:12 -04:00
Wilson Snyder b90fce55f4 Includes: Refactor verilated.h and deprecate verilated_heavy.h (#2701). 2021-07-24 10:00:33 -04:00
Wilson Snyder 13933743ad Suppress creating change_request if not needed. 2021-07-22 20:50:03 -04:00
Wilson Snyder fa64f8fb32 Internals: C++11 cleanup. No functional change intended. 2021-07-22 20:05:54 -04:00
Geza Lore ab4063f098 Emit implementations into separate files based on required headers.
This patch partitions AstCFuncs under an AstNodeModule based on which
header files they require for their implementation, and emits them
into separate files based on the distinct dependency sets. This helps
with incremental recompilation of the output C++.
2021-07-22 18:01:07 +01:00
Geza Lore 4081a1a539 Internals: Separate emitting of C++ headers and implementation
Internal AstNodeModule headers (.h) and implementation (.cpp) files are
now emitted separately in V3EmitC::emitcHeaders() and
V3EmitC::emitcImp() respectively. No functional change intended
2021-07-13 17:43:44 +01:00
Geza Lore 17cc452f79 Add V3VariableOrder pass
A separate V3VariableOrder pass is now used to order module variables
before Emit. All variables are now ordered together, without
consideration for whether they are ports, signals form the design, or
additional internal variables added by Verilator (which used to be
ordered and emitted as separate groups in Emit). For single threaded
models, this is performance neutral. For multi-threaded models, the
MTask affinity based sorting was slightly modified, so variables with no
MTask affinity are emitted last, otherwise the MTask affinity sets are
sorted using the TSP sorter as before, but again, ports, signals, and
internal variables are not differentiated. This yields a 2%+ speedup for
the multithreaded model on OpenTitan.
2021-07-12 14:53:40 +01:00
Wilson Snyder 8ecdc85cf7 Internals: C++11 style cleanups. No functional change. 2021-07-11 18:42:01 -04:00
Geza Lore 76b3776fa3 Change generated tracing routines to use snake_case
For consistency with the rest of the generated code, generated methods
related to tracing now use snake_case instead of camelCase. No
functional change intended.
2021-07-08 02:08:09 +01:00