Commit Graph

115 Commits

Author SHA1 Message Date
Geza Lore f4038e3674
Move thread pool and execution profiler into the context. (#3477)
Fixes #3454
2022-07-12 11:41:15 +01:00
Wilson Snyder fc4d6a62af Remove VL_PROFILER ifdef. Partial (#3454). 2022-06-22 20:06:23 -04:00
Geza Lore b51f887567
Perform VCD tracing in parallel when using --threads (#3449)
VCD tracing is now parallelized using the same thread pool as the model.
We achieve this by breaking the top level trace functions into multiple
top level functions (as many as --threads), and after emitting the time
stamp to the VCD file on the main thread, we execute the tracing
functions in parallel on the same thread pool as the model (which we
pass to the trace file during registration), tracing into a secondary
per thread buffer. The main thread will then stitch (memcpy) the buffers
together into the output file.

This makes the `--trace-threads` option redundant with `--trace`, which
now only affects `--trace-fst`. FST tracing uses the previous offloading
scheme.

This obviously helps a lot in VCD tracing performance, and I have seen
better than Amdahl speedup, namely I get 3.9x on XiangShan 4T (2.7x on
OpenTitan 4T).
2022-05-29 19:08:39 +01:00
Geza Lore b130a8cfeb Add -DVM_TRACE_VCD in model builds with Make with --trace 2022-05-20 16:44:38 +01:00
Geza Lore 551bd284dd Rename some internals related to multi-threaded tracing
Rename the implementation internals of current multi-threaded tracing to
be "offload mode". No functional change, nor user interface change
intended.
2022-05-20 16:44:35 +01:00
HungMingWu 880a9be3b1
Internal: Add C++20ish reverse_view for range loops. No functional change (#3388).
Signed-off-by: HungMingWu <u9089000@gmail.com>
2022-04-18 13:03:56 -04:00
Julien Margetts baff64a43d
Add VK_USER_OBJS dependency to --create-lib library (#3370) (#3382). 2022-04-12 07:04:31 -04:00
Geza Lore b1b5b5dfe2 Improve run-time profiling
The --prof-threads option has been split into two independent options:
1. --prof-exec, for collecting verilator_gantt and other execution
related profiling data, and
2. --prof-pgo, for collecting data needed for PGO

The implementation of execution profiling is extricated from
VlThreadPool and is now a separate class VlExecutionProfiler. This means
--prof-exec can now be used for single-threaded models (though it does
not measure a lot of things just yet). For consistency VerilatedProfiler
is renamed VlPgoProfiler. Both VlExecutionProfiler and VlPgoProfiler are
in verilated_profiler.{h/cpp}, but can be used completely independently.

Also re-worked the execution profile format so it now only emits events
without holding onto any temporaries. This is in preparation for some
future optimizations that would be hindered by the introduction of function
locals via AstText.

Also removed the Barrier event. Clearing the profile buffers is not
notably more expensive as the profiling records are trivially
destructible.
2022-03-27 15:57:30 +02:00
Wilson Snyder ca42be982c Copyright year update. 2022-01-01 08:26:40 -05:00
Wilson Snyder cd737065f2 Internals: More const. No functional change intended. 2021-11-26 17:55:36 -05:00
Wilson Snyder 899de9a282 Add --lib-create, similar to --protect-lib but without protections (#3200). 2021-11-14 09:39:31 -05:00
Wilson Snyder 37e3c6da70 Internals: Add more const. No functional change intended. 2021-11-13 13:50:44 -05:00
Geza Lore dae9fa5053 Use VN_AS wherever possible and obvious. No functional change. 2021-10-22 14:06:00 +01:00
Wilson Snyder c7499133b2 Internals: C++11 for bool. No functional change. 2021-07-11 10:42:32 -04:00
Wilson Snyder 36599133bf Add --prof-c to pass profiling to compiler (#3059). 2021-07-07 19:12:52 -04:00
Wilson Snyder 512fe0a2d1 Internals: Add const. No functional change. 2021-06-20 18:33:13 -04:00
Àlex Torregrosa a29ac44af9
Add FST SystemC tracing (#2806) 2021-04-06 16:18:58 -04:00
Wilson Snyder 9483ebefae Internal code coverage cleanups. 2021-03-07 21:05:15 -05:00
Wilson Snyder 9650aefa42 Internals: Cleanup unneeded {}. No functional change 2021-02-21 21:25:21 -05:00
Wilson Snyder bcf9abf490 Internals: Var rename. No functional change. 2021-01-11 22:42:14 -05:00
Wilson Snyder bd602d0e2d Copyright year update 2021-01-01 10:29:54 -05:00
Wilson Snyder b7a533109d Fix cppcheck warnings. No functional change intended. 2020-12-23 15:22:02 -05:00
Wilson Snyder b6ded59c2b Internals: Use and enforce class final for ~5% performance boost. 2020-11-18 21:32:16 -05:00
Wilson Snyder 1b0a48ea02 Internals: Use C++11 = default where obvious. No functional change intended. 2020-11-16 19:56:16 -05:00
Wilson Snyder 79d33bf1ee Use C++11 for loops, from clang-migrate. No functional change intended 2020-11-10 22:10:38 -05:00
Wilson Snyder 44eb362a18 clang-tidy cleanups. No functional change intended. 2020-11-10 21:40:14 -05:00
Markus Krause 0a9ae154be
introduce define for FST tracing (#2592)
This is to allow C++ verilator toplevel to support
multiple modes of waveform tracing
VM_TRACE_FST can be used inside a #if VM_TRACE
section to switch between classic .vcd tracing and the
more compact .fst format supported by GTKWAVE
2020-10-10 21:17:39 -04:00
Yutetsu TAKATSUKASA 70eb99b050
Fix double-free on shared protect-lib (#2526)
* Add a test to use shared object of protect-lib
* Add a guard to call ctor/dtor just once even when a protec-lib is shared object.
* Pass .a to linker in leaf-last order for older ld.
* Add -flat_namespace for mac
2020-08-31 08:22:31 -04:00
Yutetsu TAKATSUKASA 4f88ec3518
Fix hier test failure on mac (#2524)
* No need to link the intermediate .so in hierarchical verilation

* apply clang-format

* run all tests in ci instead of cron. DONT_MERGE

* Add -undefined dynamic_lookup for mac environment when linking a shared protect-lib.

* Let's just check on mac for now. DONT_MERGE

* Revert "Let's just check on mac for now. DONT_MERGE"

This reverts commit 533fac6f9f.

* Revert "run all tests in ci instead of cron. DONT_MERGE"

This reverts commit fb4ac1fb42.
2020-08-29 07:56:06 -04:00
Wilson Snyder ea9b65fe6d Hardcode VM_C11 as always need C++11 now 2020-08-16 15:10:43 -04:00
Wilson Snyder ac04e85a1c C++11: More range for. No functional change intended. 2020-08-16 12:54:32 -04:00
Wilson Snyder ee9d6dd63f C++11: Favor auto, range for. No functional change intended. 2020-08-16 11:44:06 -04:00
Wilson Snyder 72d2cff0a1 C++11: Use member declaration initalizations. No functional change intended. 2020-08-16 11:44:06 -04:00
Yutetsu TAKATSUKASA 953a442827
Support hierarchical verilation using protect lib (#2206) 2020-08-15 09:43:53 -04:00
Wilson Snyder 6de78d58fa Add new UNSUPPORTED error code to replace most previous Unsupported: messages. 2020-06-09 19:20:16 -04:00
Wilson Snyder 279f21bb5b Configure now enables SystemC if it is installed as a system headers. 2020-05-28 18:51:46 -04:00
Geza Lore 72858175a2 Only emit VM_PARALLEL_BUILDS=1 iff --output-split caused a split.
Previously we set VM_PARALLEL_BUILDS=1 if the --output-split option was
provided. Now we only do it iff it actually causes a split.
2020-05-26 01:22:10 +01:00
Geza Lore dd967f7769 Improve trace buffer memory utilization and performance.
Convert trace buffer to 32-bit entries, rather than a union containing a
pointer type. Also tweaked trace entry layouts for a bit more
performance. This gains another 10% on SweRV EH1 CoreMark.
2020-04-27 19:00:17 +01:00
Geza Lore 6ed10b7fde
Fix --protect-lib generated library link rules (#2279)
We used to include a .cpp file on the link line for the shared library,
which was ignored, but generated a .d file for the .so which contained
the header files required by the .cpp file. This then caused a rebuild
where we included the .d in verilated.mk to included in the .h headers
among the prerequisites of the .so, yielding a clang error about treating
.h files as c++-header rather than c-header... Long story short, we don't
do that anymore. This used t cause t_a4_examples to fail on occasion.

Note there is no need for a separate compilation rule for the
<--protect-lib>.cpp, as it will jsut pick up the standard OPT_FAST rule.
2020-04-23 17:30:23 -04:00
Geza Lore c52f3349d1
Initial implementation of generic multithreaded tracing (#2269)
The --trace-threads option can now be used to perform tracing on a
thread separate from the main thread when using VCD tracing (with
--trace-threads 1). For FST tracing --trace-threads can be 1 or 2, and
--trace-fst --trace-threads 1 is the same a what --trace-fst-threads
used to be (which is now deprecated).

Performance numbers on SweRV EH1 CoreMark, clang 6.0.0, Intel i7-3770 @
3.40GHz, IO to ramdisk, with numactl set to schedule threads on different
physical cores. Relative speedup:

--trace     ->  --trace --trace-threads 1      +22%
--trace-fst ->  --trace-fst --trace-threads 1  +38% (as --trace-fst-thread)
--trace-fst ->  --trace-fst --trace-threads 2  +93%

Speed relative to --trace with no threaded tracing:
--trace                                 1.00 x
--trace --trace-threads 1               0.82 x
--trace-fst                             1.79 x
--trace-fst --trace-threads 1           1.23 x
--trace-fst --trace-threads 2           0.87 x

This means FST tracing with 2 extra threads is now faster than single
threaded VCD tracing, and is on par with threaded VCD tracing. You do
pay for it in total compute though as --trace-fst --trace-threads 2 uses
about 240% CPU vs 150% for --trace-fst --trace-threads 1, and 155% for
--trace --trace threads 1. Still for interactive use it should be
helpful with large designs.
2020-04-21 23:49:07 +01:00
Wilson Snyder f3308d236b clang-format remaining sources. No functional change. 2020-04-15 07:58:34 -04:00
Wilson Snyder 9fdb026e95 Add VM_C11 for future need of C++11 2020-04-04 20:48:03 -04:00
Wilson Snyder 1ce360ed5b Add SPDX license identifiers. No functional change. 2020-03-21 11:24:24 -04:00
Wilson Snyder 8ccc17f30b Add setting VM_PARALLEL_BUILDS=1 when using --output-split, #2185. 2020-03-08 09:03:29 -04:00
Wilson Snyder 30a33a6104 Add support for and , #2126. 2020-03-01 21:39:23 -05:00
Wilson Snyder 609a5dc26d Fix cppcheck warnings. No functional change intended. 2020-02-03 23:21:56 -05:00
Wilson Snyder a4e8d39932 Spelling fixes 2020-01-24 20:10:44 -05:00
Geza Lore 220daa5f33 Internals: Restore AstNode naming property. #2133.
The intention was that all subclasses of AstNode which are
intermediate must be abstract as well and called AstNode*. This was
violated recently by 28b9db1903. This
patch restores that property by:
- Renaming AstFile to AstNodeFile
- Introducing AstNodeSimpleText as the common base of AstText and
  AstTextBlock, rather than AstTextBlock deriving from AstText.
2020-01-21 19:54:14 -05:00
Wilson Snyder f23fe8fd84 Update copyright year. 2020-01-06 18:05:53 -05:00
Wilson Snyder 5811ec07e6 Update URLs to https://verilator.org 2019-11-07 22:33:59 -05:00