Commit Graph

19 Commits

Author SHA1 Message Date
Wilson Snyder 5bee8ef15b Remove long deprecated verilated_heavy.h 2023-06-17 08:45:13 -04:00
Krzysztof Bieganski 39af5d020e
Timing support (#3363)
Adds timing support to Verilator. It makes it possible to use delays,
event controls within processes (not just at the start), wait
statements, and forks.

Building a design with those constructs requires a compiler that
supports C++20 coroutines (GCC 10, Clang 5).

The basic idea is to have processes and tasks with delays/event controls
implemented as C++20 coroutines. This allows us to suspend and resume
them at any time.

There are five main runtime classes responsible for managing suspended
coroutines:
* `VlCoroutineHandle`, a wrapper over C++20's `std::coroutine_handle`
  with move semantics and automatic cleanup.
* `VlDelayScheduler`, for coroutines suspended by delays. It resumes
  them at a proper simulation time.
* `VlTriggerScheduler`, for coroutines suspended by event controls. It
  resumes them if its corresponding trigger was set.
* `VlForkSync`, used for syncing `fork..join` and `fork..join_any`
  blocks.
* `VlCoroutine`, the return type of all verilated coroutines. It allows
  for suspending a stack of coroutines (normally, C++ coroutines are
  stackless).

There is a new visitor in `V3Timing.cpp` which:
  * scales delays according to the timescale,
  * simplifies intra-assignment timing controls and net delays into
    regular timing controls and assignments,
  * simplifies wait statements into loops with event controls,
  * marks processes and tasks with timing controls in them as
    suspendable,
  * creates delay, trigger scheduler, and fork sync variables,
  * transforms timing controls and fork joins into C++ awaits

There are new functions in `V3SchedTiming.cpp` (used by `V3Sched.cpp`)
that integrate static scheduling with timing. This involves providing
external domains for variables, so that the necessary combinational
logic gets triggered after coroutine resumption, as well as statements
that need to be injected into the design eval function to perform this
resumption at the correct time.

There is also a function that transforms forked processes into separate
functions.

See the comments in `verilated_timing.h`, `verilated_timing.cpp`,
`V3Timing.cpp`, and `V3SchedTiming.cpp`, as well as the internals
documentation for more details.

Signed-off-by: Krzysztof Bieganski <kbieganski@antmicro.com>
2022-08-22 13:26:32 +01:00
Geza Lore 3aa8624658 Set 'threads' in tests via parameter to compile
This is in preparation to #3454.
2022-07-05 12:33:41 +01:00
Geza Lore 383e384739 Remove always true cfg_with_threaded from test driver 2022-06-27 15:23:32 +01:00
Geza Lore b1b5b5dfe2 Improve run-time profiling
The --prof-threads option has been split into two independent options:
1. --prof-exec, for collecting verilator_gantt and other execution
related profiling data, and
2. --prof-pgo, for collecting data needed for PGO

The implementation of execution profiling is extricated from
VlThreadPool and is now a separate class VlExecutionProfiler. This means
--prof-exec can now be used for single-threaded models (though it does
not measure a lot of things just yet). For consistency VerilatedProfiler
is renamed VlPgoProfiler. Both VlExecutionProfiler and VlPgoProfiler are
in verilated_profiler.{h/cpp}, but can be used completely independently.

Also re-worked the execution profile format so it now only emits events
without holding onto any temporaries. This is in preparation for some
future optimizations that would be hindered by the introduction of function
locals via AstText.

Also removed the Barrier event. Clearing the profile buffers is not
notably more expensive as the profiling records are trivially
destructible.
2022-03-27 15:57:30 +02:00
Wilson Snyder 9029da5ab8 Add profile-guided optmization of mtasks (#3150). 2021-09-26 22:51:11 -04:00
Wilson Snyder f1b8b1d99b Format: perltidy spacing cleanup. No functional change. 2021-09-08 18:45:25 -04:00
Wilson Snyder b90fce55f4 Includes: Refactor verilated.h and deprecate verilated_heavy.h (#2701). 2021-07-24 10:00:33 -04:00
Tim Snyder a57262d6e7
Fix use /usr/bin/env perl in lieu of /usr/bin/perl (#2306)
Enables scripts to work where perl is not installed at /usr/bin/perl
2020-05-04 18:42:15 -04:00
Geza Lore c52f3349d1
Initial implementation of generic multithreaded tracing (#2269)
The --trace-threads option can now be used to perform tracing on a
thread separate from the main thread when using VCD tracing (with
--trace-threads 1). For FST tracing --trace-threads can be 1 or 2, and
--trace-fst --trace-threads 1 is the same a what --trace-fst-threads
used to be (which is now deprecated).

Performance numbers on SweRV EH1 CoreMark, clang 6.0.0, Intel i7-3770 @
3.40GHz, IO to ramdisk, with numactl set to schedule threads on different
physical cores. Relative speedup:

--trace     ->  --trace --trace-threads 1      +22%
--trace-fst ->  --trace-fst --trace-threads 1  +38% (as --trace-fst-thread)
--trace-fst ->  --trace-fst --trace-threads 2  +93%

Speed relative to --trace with no threaded tracing:
--trace                                 1.00 x
--trace --trace-threads 1               0.82 x
--trace-fst                             1.79 x
--trace-fst --trace-threads 1           1.23 x
--trace-fst --trace-threads 2           0.87 x

This means FST tracing with 2 extra threads is now faster than single
threaded VCD tracing, and is on par with threaded VCD tracing. You do
pay for it in total compute though as --trace-fst --trace-threads 2 uses
about 240% CPU vs 150% for --trace-fst --trace-threads 1, and 155% for
--trace --trace threads 1. Still for interactive use it should be
helpful with large designs.
2020-04-21 23:49:07 +01:00
Wilson Snyder 1ce360ed5b Add SPDX license identifiers. No functional change. 2020-03-21 11:24:24 -04:00
Wilson Snyder f0cdae129e Removed --trace-lxt2, use --trace-fst instead. 2018-12-06 19:06:20 -05:00
Sergi Granell a5aa0e2b0a Add GTKWave FST native tracing, bug1356.
Signed-off-by: Wilson Snyder <wsnyder@wsnyder.org>
2018-10-02 18:42:53 -04:00
johnjohnlin acf4a3fa99 Add GTKWave LXT2 native tracing, bug1333.
Signed-off-by: Wilson Snyder <wsnyder@wsnyder.org>
2018-08-28 06:41:17 -04:00
Wilson Snyder ec8dbbffed MAJOR: Add multithreaded model generation. 2018-07-22 20:54:28 -04:00
John Coiner c124fe6d0c Add clones of std::unordered_map and std::unordered_set for pre-C++11 compilers. 2018-06-11 22:05:15 -04:00
Wilson Snyder c29e7619eb Tests: Support multiple scenario testing. 2018-05-07 20:42:28 -04:00
Wilson Snyder a265417727 Tests: Cleanup Perl indentations. No functional change. 2018-05-06 22:39:18 -04:00
Wilson Snyder e403db5877 tests: Simplify includes; HARNESS_UPDATE_GOLDEN to whitespace fix 2017-10-24 19:58:52 -04:00