Commit Graph

134 Commits

Author SHA1 Message Date
Geza Lore 95534fa5c5
Remove unused headers (#2389) 2020-05-31 20:21:07 +01:00
Wilson Snyder 5089ac6119 Remove VL_ULL as ULL now in MSVC & C++11 2020-05-28 20:32:07 -04:00
Wilson Snyder 773ed97504 Internals Most VerilatedLockGuard can be const. No functional change intended. 2020-05-28 18:23:46 -04:00
Wilson Snyder 6a882f9dc6 Internal code coverage improvements. No functional change intended. 2020-05-23 10:34:58 -04:00
Wilson Snyder c4f31d3bb6 Tracing: Remove dead code. No functional change intended. 2020-05-17 09:52:03 -04:00
Wilson Snyder 6fd7f45cef Internals: Remove dead needHInlines code 2020-05-16 07:53:27 -04:00
Wilson Snyder 57a937df03 Misc internal coverage cleanups 2020-05-16 07:43:22 -04:00
Wilson Snyder 35a53d9adb Add t_trace_c_api test. 2020-05-15 20:38:08 -04:00
Geza Lore aa9cde22c8
Use SIMD intrinsics to render VCD traces (#2289)
Use SIMD intrinsics to render VCD traces.

I have measured 10-40% single threaded performance increase with VCD
tracing on SweRV EH1 and lowRISC Ibex using SSE2 intrinsics to render
the trace. Also helps a tiny bit with FST, but now almost all of the FST
overhead is in the FST library.

I have reworked the tracing routines to use more precisely sized
arguments. The nice thing about this is that the performance without the
intrinsics is pretty much the same as it was before, as we do at most 2x
as much work as necessary, but in exchange there are no data dependent
branches at all.
2020-04-30 00:09:09 +01:00
Geza Lore b79ef672e1 Various minor optimizations of VCD trace routines
- Change templated trace routines to branch table.

Removed templating from trace chgBus and fullBus and replaced them with
a branch table like the other there is a very small (< 1%) penalty for
this on SwerRV EH1 CoreMark, but this is less than the variability of
disk IO so it's worth it to keep the code simpler and smaller.

- Prefetch VCD suffix buffer at the top of emit*

- Increase ILP in VCD emit* routines

- Use a 64-bit unaligned store to emit the VCD suffix (on x86 only)

The performance difference with these is very small, but the changes
hopefully make this code more performance-portable across various
micro-architectures.
2020-04-27 18:44:53 +01:00
Geza Lore 9991b19610 Another attempt at flushing threaded VCD correctly. 2020-04-25 18:40:09 +01:00
Geza Lore c1665818b9
Fix missing flush with threaded VCD tracing. (#2282)
VerilatedVcdC::openNext() failed to flush the tracing thread before
opening the next output file, which caused t_trace_cat.pl to fail
with --vltmt on occasion.
2020-04-24 03:09:26 +01:00
Geza Lore c52f3349d1
Initial implementation of generic multithreaded tracing (#2269)
The --trace-threads option can now be used to perform tracing on a
thread separate from the main thread when using VCD tracing (with
--trace-threads 1). For FST tracing --trace-threads can be 1 or 2, and
--trace-fst --trace-threads 1 is the same a what --trace-fst-threads
used to be (which is now deprecated).

Performance numbers on SweRV EH1 CoreMark, clang 6.0.0, Intel i7-3770 @
3.40GHz, IO to ramdisk, with numactl set to schedule threads on different
physical cores. Relative speedup:

--trace     ->  --trace --trace-threads 1      +22%
--trace-fst ->  --trace-fst --trace-threads 1  +38% (as --trace-fst-thread)
--trace-fst ->  --trace-fst --trace-threads 2  +93%

Speed relative to --trace with no threaded tracing:
--trace                                 1.00 x
--trace --trace-threads 1               0.82 x
--trace-fst                             1.79 x
--trace-fst --trace-threads 1           1.23 x
--trace-fst --trace-threads 2           0.87 x

This means FST tracing with 2 extra threads is now faster than single
threaded VCD tracing, and is on par with threaded VCD tracing. You do
pay for it in total compute though as --trace-fst --trace-threads 2 uses
about 240% CPU vs 150% for --trace-fst --trace-threads 1, and 155% for
--trace --trace threads 1. Still for interactive use it should be
helpful with large designs.
2020-04-21 23:49:07 +01:00
Geza Lore 39d903375b
Factor out trace implementation common to all formats. (#2268)
This patch de-duplicates common functionality between the VCD and FST
trace implementation. It also enables adding new trace formats more
easily and consistently.

No functional nor performance change intended.
2020-04-19 23:57:36 +01:00
Wilson Snyder d4f7f5297a
Support IEEE time units and time precisions, #234. (#2253)
Includes `timescale, $printtimescale, $timeformat.
VL_TIME_MULTIPLIER, VL_TIME_PRECISION, VL_TIME_UNIT have been removed
and the time precision must now match the SystemC time precision.
To get closer behavior to older versions, use e.g. --timescale-override
"1ps/1ps".
2020-04-15 19:39:03 -04:00
Geza Lore dc5c259069
Improve tracing performance. (#2257)
* Improve tracing performance.

Various tactics used to improve performance of both VCD and FST tracing:
- Both: Change tracing functions to templates to take variable widths as
  template parameters. For VCD, subsequently specialize these to the
  values used by Verilator. This avoids redundant instructions and hard
  to predict branches.
- Both: Check for value changes via direct pointer access into the
  previous signal value buffer. This eliminates a lot of simple pointer
  arithmetic instructions form the tracing code.
- Both: Verilator provides clean input, no need to mask out used bits.
- VCD: pre-compute identifier codes and use memory copy instead of
  re-computing them every time a code is emitted. This saves a lot of
  instructions and hard to predict branches. The added D-cache misses
  are cheaper than the removed branches/instructions.
- VCD: re-write the routines emitting the changes to be more efficient.
- FST: Use previous signal value buffer the same way as the VCD tracing
  code, and only call the FST API when a change is detected.

Performance as measured on SweRV EH1, with the pre-canned CoreMark
benchmark running from DCCM/ICCM, clang 6.0.0, Intel i7-3770 @ 3.40GHz,
and IO to ramdisk:

            +--------------+---------------+----------------------+
            | VCD          | FST           | FST separate thread  |
            | (--trace)    | (--trace-fst) | (--trace-fst-thread) |
------------+-----------------------------------------------------+
Before      |  30.2 s      | 121.1 s       |  69.8 s              |
============+==============+===============+======================+
After       |  24.7 s      |  45.7 s       |  32.4 s              |
------------+--------------+---------------+----------------------+
Speedup     |    22 %      |   256 %       |   215 %              |
------------+--------------+---------------+----------------------+
Rel. to VCD |     1 x      |  1.85 x       |  1.31 x              |
------------+--------------+---------------+----------------------+

In addition, FST trace size for the above reduced by 48%.
2020-04-14 00:13:10 +01:00
Geza Lore 05f213c266
VCD tracing speed improvements (#2246)
* Don't inline VCD dump functions

Improves model speed with tracing. Measured on SweRW cmark:
- GCC 5.5      ~3% faster
- Clang 6.0    ~12% faster (!)

* Remove redundant test from VCD bit tracing.

Improves model speed with tracing. Measured on SweRW cmark:
 - GCC 5.5      ~7.5% faster
 - Clang 6.0    ~1.5% faster
2020-04-09 08:19:26 -04:00
Wilson Snyder e07e9390f6 Internals: clang-format cleanups. No functional change. 2020-04-04 14:09:21 -04:00
Wilson Snyder a13eab55f5 Internals: Add missing VL_DO_CLEARs. No functional change. 2020-04-04 13:06:31 -04:00
Wilson Snyder 38a31ae168 Cleanup misc clang-tidy warnings. No functional change intended 2020-04-03 22:31:54 -04:00
Wilson Snyder 08a51e3e09 Fix VCD open with empty filename, #2198. 2020-03-24 17:32:47 -04:00
Wilson Snyder 1ce360ed5b Add SPDX license identifiers. No functional change. 2020-03-21 11:24:24 -04:00
Wilson Snyder 6c6d70a5e5 Fix FST when multi-model tracing. 2020-03-07 18:39:58 -05:00
Wilson Snyder 30a33a6104 Add support for and , #2126. 2020-03-01 21:39:23 -05:00
Wilson Snyder 0ca0e07354 Internals: Vcd doesn't need code when registering. No functional change intended. 2020-02-29 20:42:52 -05:00
Wilson Snyder 9978cbfa5c Fix tracing -1 index arrays. Closes #2090. 2020-01-08 07:32:31 -05:00
Wilson Snyder f23fe8fd84 Update copyright year. 2020-01-06 18:05:53 -05:00
Wilson Snyder 35e9489f33 Includes: Allow VCD trace of 64 bit array. 2019-12-05 22:25:30 -05:00
Maarten De Braekeleer 977a767477 Avoid tabs in C output.
Signed-off-by: Wilson Snyder <wsnyder@wsnyder.org>
2019-10-04 21:10:53 -04:00
Wilson Snyder 6ffbb7cabf Internals: Remove extra semicolons. No functional change. 2019-06-11 18:31:06 -04:00
Wilson Snyder 96c70ea2df Internals: Fix some long lines in include files. No functional change.
When merging, recommend using "git merge -Xignore-all-space"
2019-05-14 22:49:32 -04:00
Wilson Snyder 8a4aeddbb0 Copyright year update. 2019-01-03 19:17:22 -05:00
Wilson Snyder 304a24d03a Internals: Fix many clang-tidy issues. No functional change intended. 2018-10-14 18:39:33 -04:00
Wilson Snyder 595419b370 Internals: Sort includes for clang-tidy. No functional change intended. 2018-10-14 07:04:18 -04:00
Wilson Snyder e8b8b33ff6 Internals: Cleanup find with chars for clang-tidy. No functional change. 2018-10-13 22:28:59 -04:00
Wilson Snyder f8ae08c0c2 Internals: Fix whitespace. 2018-10-02 06:31:11 -04:00
Wilson Snyder 4684f1d4cc Include sys/types to make sure __WORDSIZE is set, bug1333. 2018-08-31 06:52:12 -04:00
Wilson Snyder 3ed8d968ff Commentary and spacing. No functional change. 2018-08-28 06:30:30 -04:00
Wilson Snyder 0b84222500 Includes: Fix spacing and style in prep for new LXT2. No functional change. 2018-08-22 19:14:06 -04:00
Wilson Snyder 0eb1d0a84e Fix cppcheck warnings. No functional change intended. 2018-06-14 18:59:24 -04:00
Wilson Snyder 02a22c12ea Internals: Add missing [] to delete call in verilated_vcd_c.cpp, bug1309 2018-05-17 07:08:11 -04:00
Wilson Snyder fe917ba7f4 include: Merge misc thread runtime support. 2018-05-13 19:30:51 -04:00
Wilson Snyder 770045676f Internals: Split some extremely long lines. No functional change. 2018-03-10 16:32:04 -05:00
Wilson Snyder 8e65d93d6d Copyright year update. No functional change. 2018-01-02 18:05:06 -05:00
Wilson Snyder add5cc8b56 Internals: Add VL_UNCOPYABLE to make classes uncopyable. No functional change intended. 2017-11-01 18:51:41 -04:00
Wilson Snyder f91bac7b31 Rewrite include libraries to support VL_THREADED towards future threading 2017-10-26 21:51:51 -04:00
Wilson Snyder 5ead61dc7b Unify format of VL_DEBUG print messages 2017-10-24 22:56:58 -04:00
Wilson Snyder 32874fa848 Internals: Misc VCD code cleanups. No functional change. 2017-10-21 17:53:23 -04:00
Wilson Snyder f4b00d3c64 Call VL_PRINTF/vl_stop/vl_finish/vl_fatal through wrappers as hook for future MT use. 2017-10-19 19:40:51 -04:00
Wilson Snyder d21824cbae Internals: Refactor verilated_vcd to move singleton into only .cpp. No functional change intended. 2017-10-14 13:00:25 -04:00
Wilson Snyder de35c90847 Fix float-conversion warning, bug1229. 2017-10-11 19:01:37 -04:00
Wilson Snyder ac4690d74b Internals: Add const's. No functional change 2017-10-10 22:22:17 -04:00
Wilson Snyder 0ac116bb4e Internals: Favor C++ cast style. No functional change intended. 2017-10-07 15:01:19 -04:00
Wilson Snyder 89ac6ab594 Fix memory leak in VerilatedVcd dumps, bug1222 partial. 2017-10-02 18:49:00 -04:00
Wilson Snyder 3f14e649b5 Internals: Use singleton in place of global 2017-09-23 07:44:52 -04:00
Wilson Snyder c2e8062f84 Verilated headers no longer "use namespace std;" 2017-09-23 07:32:37 -04:00
Wilson Snyder 074689b5de SystemPerl mode (-sp-deprecated) has been removed. 2017-09-07 21:08:49 -04:00
Wilson Snyder 70daadf987 Fix cpp-check warnings; support XML format 2 2017-07-06 20:25:59 -04:00
Wilson Snyder e6d7e7e329 Version bump 2017-01-15 12:13:13 -05:00
Wilson Snyder b738d1960a Copyright year update 2016-01-06 20:36:41 -05:00
Wilson Snyder 1891cfd79a Fix rounding in trace , bug946. 2015-07-21 13:22:08 -04:00
Wilson Snyder c0df07c86f Commentary: Update contributor list 2015-03-13 07:38:17 -04:00
Wilson Snyder a0fd065dcf Add VerilatedVcdFile to allow real-time waveforms, bug890. 2015-03-05 08:54:57 -05:00
Wilson Snyder 8323092a0c Fix cppcheck warnings. No functional change. 2015-02-09 21:05:27 -05:00
Wilson Snyder 4c91ade61d Copyright year update 2015-01-07 18:25:53 -05:00
Wilson Snyder d33ad7600b Commentary. Cleanup stale SystemPerl references. 2014-11-23 22:00:00 -05:00
Wilson Snyder 3234fa15ef Fix trace overflow on huge arrays, bug834. 2014-11-05 22:22:27 -05:00
Wilson Snyder 27af9b6b06 Fix clang warnings, bug818. 2014-09-11 21:28:53 -04:00
Wilson Snyder 6cf50e6579 Fix string corruption, bug780. 2014-06-08 21:36:18 -04:00
Wilson Snyder 4422de0c6c Copyright year update. 2014-01-06 19:28:57 -05:00
Wilson Snyder e6808a787c Fix opening a VerilatedVcdC file multiple times, msg1021. 2013-02-23 21:10:25 -05:00
Wilson Snyder f07f6a26a8 cppcheck fixes 2013-02-03 13:27:37 -05:00
Wilson Snyder a8bbf7231b Copyright year update. 2013-01-01 09:42:59 -05:00
Wilson Snyder 50edef4ab2 Add Emacs indentation line. No functional change 2012-04-12 21:08:20 -04:00
Wilson Snyder c2c7c7bd9a Copyright year update 2012-01-15 10:26:28 -05:00
Wilson Snyder 71c1f00ec2 Copyright year update 2011-01-01 18:21:19 -05:00
Wilson Snyder 57d00946be Fix MSVC compile issues 2010-04-10 06:46:24 -04:00
Wilson Snyder ef51de72c9 Fix word size to match uint64_t on -m64 systems, bug238. 2010-04-09 21:51:15 -04:00
Wilson Snyder 495585830d Fix trace files with empty modules crashing some viewers. 2010-03-22 18:38:24 -04:00
Wilson Snyder d780d0aabb Fix flushing VCD buffers on . 2010-03-12 20:00:08 -05:00
Wilson Snyder e40fbf1470 Verilated: Add missing VL_UNLIKELYs 2010-02-03 06:56:20 -05:00
Wilson Snyder 2fd5a6d47f Internals: enum size doesn't apply to non-SystemPerl 2010-02-01 18:37:18 -05:00
Wilson Snyder 8dca56521b Fix MinGW compilation printing %lls, bug214 2010-02-01 09:28:53 -05:00
Wilson Snyder 11e702c430 SystemPerl is no longer required for tracing.
Applications must use VerilatedVcdC class in place of SpTraceVcdC.
2010-01-24 18:37:01 -05:00