The -G option now correctly parses simple integer literals as signed
numbers, which is in line with the standard and is significant when
overriding parameters without a type specifier.
Fixes#3060
* Move MTaskState to ThreadSchedule
MTaskState does not concern itself with sandbagging, and thus solely contains information related to the finalized schedule, i.e., completion time, thread ID and next MTask on thread.
* Add .dot graph visualization of ThreadSchedule
Follow-up to #2779.
This commit adds the creation of .dot files - used by GraphViz - to visualize how mtasks are statically scheduled across the set of specified threads.
We visualize each thread as a row, with nodes of a row being the mtasks scheduled for the given thread. The width of the mtask nodes are proportional to their cost. MTask dependencies are shown using an edge between the source and sink mtasks.
Tests used to silently pass when vcddiff aborted. Now fixed. Updated
large array trace reference files for FST, added same reference files
for VCD.
Developers need to update their local vcddiff.
We incorrectly treated two different struct types the same when passed
as an actual parameter to a `parameter type` parameter in an instance,
if the actual parameter expression both hash to the same value and the
structs have the same struct name. This is now corrected.
Fixes#3055.
This commit adds the '--simbenchmark' option to the regression test compile command.
The option is not intended as a fully-fledged benchmarking infrastructure, but rather a
utility for easily generating cycle- and execution time information when executing a verilated test.
As an example use case, the included test file shows how optimization level is varied across
three different builds+simulations, with the statistics for each run output to the same file in
the output directory.
Future work:
- 'sim_time' in the generated top-level main file should be a parameter.
- Given the above, the test execution script from verilog-sim-benchmark can be integrated
to generate better estimates of cycles/second through varying 'sim_time' over multiple executions.
This patch implements #3032. Verilator creates a module representing the
SystemVerilog $root scope (V3LinkLevel::wrapTop). Until now, this was
called the "TOP" module, which also acted as the user instantiated model
class. Syms used to hold a pointer to this root module, but hold
instances of any submodule. This patch renames this root scope module
from "TOP" to "$root", and introduces a separate model class which is
now an interface class. As the root module is no longer the user
interface class, it can now be made an instance of Syms, just like any
other submodule. This allows absolute references into the root module to
avoid an additional pointer indirection resulting in a potential speedup
(about 1.5% on OpenTitan). The model class now also contains all non
design specific generated code (e.g.: eval loops, trace config, etc),
which additionally simplifies Verilator internals.
Please see the updated documentation for the model interface changes.
These are necessary to link the executables. So far we have been saved
by one of the generated headers forward declaring these functions with
extern "C", but changing that header would break these tests.
* Add a test to reproduce #3023. Also applied verilog-mode formatting.
* use unique_ptr. No functional change is intended.
* Introduce restorer that reverts changes during iterate() if failed.
The goal of this patch is to move functionality related to constructing
the thread entry points and then invoking them out of V3EmitC (and into
V3Partition). The long term goal being enabling V3EmitC to emit
functions partitioned based on header dependencies. V3EmitC having to
deal with only AstCFunc instances and no other magic will facilitate
this.
In this patch:
- We construct AstCFuncs for each thread entry point in
V3Partition::finalize and move AstMTaskBody nodes under these functions.
- Add the invocation of the threads as text statements within the
AstExecGraph, so they are still invoked where the exec graph is located.
(the entry point functions are still referenced via AstCCall or
AstAddOrCFunc, so lazy declarations of referenced functions are created
automatically).
- Explicitly handle MTask state variables (VlMTaskVertex in
verilated_threads.h) within Verilator, so no need to text bash a lot of
these any more (some text refs still remain but they are all created
next to each other within V3Partition.cpp).
The effect of all this on the emitted code should be nothing but some
identifier/ordering changes. No functional change intended.
What previously used to be per module static constants created in
V3Table and V3Prelim are now merged globally within the whole model and
emitted as part of a separate constant pool. Members of the constant
pool are global variables which are declared lazily when used (similar to
loose methods).
This patch introduces the concept of 'loose' methods, which semantically
are methods, but are declared as global functions, and are passed an
explicit 'self' pointer. This enables these methods to be declared
outside the class, only when they are needed, therefore removing the
header dependency. The bulk of the emitted model implementation now uses
loose methods.
Using the standard model Makefile, when in addition to an explicit
target, the target 'ccache-report' is also given, a summary of ccache
hits/misses during this invocation of 'make' will be prited at the end
of the build.
A few names were incorrectly mangled, which made --protect-ids produce
invalid output when certain SV class constructs were uses. Now fixed and
added a few extra tests to catch this.
V3Reloop now can roll up indexed assignments between arrays if there is a
constant offset between indices on the left and right hand sides, e.g.:
a[0] = b[2];
a[1] = b[3];
...
a[x] = b[x + 2];
* Tests: Add more case that does not match native C++ width (8, 16, 32 or 64).
* Use AstVarRef::same() instead of AstNode::sameGateTree() because the latter checks dtype in addition to scope.
AstVarRef may have different minWidth in some cases,
but the difference should be ignored in the context of bitOpTree optimization.
This support code merely adds the capability to skip over the encrypted
parts. Many models have unencrypted module interfaces with ports, and
only encrypt the critical parts.
* Tests: Add another testcase that triggers assertion failure in bitOpTree opt.
* Fix assertion failure in bitOpTree opt reported in #2891. Consider the follwoing case.
CCast -> WordSel -> VarRef(leaf)
* Make sure that m_bitPolarity is expanded enough.
The intention was to not merge impure assignments, but the actual
predicate failed if the assignment was indeed pure.
This fix gains 1.5% speed on SweRV EH1.
This will still warn if a case item is completely covered by previous
items, but will no longer complain about overlaps like this:
priority casez (foo_i)
2'b ?1: bar_o = 3'd0;
2'b 1?: bar_o = 3'd1;
Before, there was a warning for the second statement because the first
two patterns match 2'b11.
Add V3OptionsParser that can suggest correct option.
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
Co-authored-by: github action <action@example.com>
** Add simulation context (VerilatedContext) to allow multiple fully independent
models to be in the same process. Please see the updated examples.
** Add context->time() and context->timeInc() API calls, to set simulation time.
These now are recommended in place of the legacy sc_time_stamp().
* Tests:Add some more signals to t_const_opt_red.v
* Optimize bit op trees such as ~a[2] & a[1] & ~a[0] to 3'b010 == (3'111 & a)
* Update src/V3Const.cpp
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Update src/V3Const.cpp
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Update src/V3Const.cpp
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Apply clang-format
* Don't edit and-or tree.
* Call matchBitOpTree() after V3Expand does its job.
* Internals: Rename newNodep -> newp. No functional change is intended.
* Internals: Remove m_sels. No functional change is intended.
* Internals: Remove stringstream. No functional change is intended.
* Internals: Use V3Number::bitIs1(). No functional change is intended.
* Internals: Use V3Number instead of std::map. Resolved laek. Result should be same.
* Internals: Resolve overload of setPolarity. No functional change is intended.
* Internals: Pass failure reason. No functional change is intended.
* Internals: Add VNUser::toPtr(). No functional change is intended.
* Internals: Use user4 instead of std::map. No functional change is intended.
* Catch up with the AST style aftre V3Expand
* Internals: Rename Context to VarInfo. No functional change is intended.
* Add some more test case
* tests:Add stats to tests
Update stats in t_merge_cond.pl as matchBitopTree does some of them.
* insert CCast if necessary
* small optimization to remove redundant bit mask
* No quick exit even when unoptimizable node is found.
* Simplify removing redundant And
* simplify
* Revert "Internals: Add VNUser::toPtr(). No functional change is intended."
This reverts commit f98dce10db.
* Consider AstWordSel and cleanup
* Update test
* Update src/V3Const.cpp
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Update src/V3Const.cpp
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Update src/V3Const.cpp
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Update src/V3Const.cpp
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Update src/V3Const.cpp
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Update src/V3Const.cpp
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Apply clang-format
* Internals: rename variables. No functional change is intended.
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
Co-authored-by: github action <action@example.com>
When using a "if" statement inside an always block, part of the code may
be unreachable. This can be used to avoid errors, but it generated an
error, this commit demotes this to a warning. Partly fixes#2625.
* Test hierarchical block that is explicitly set its default parameter value.
* Fix hierarchical verilation when a hierarchical block is instantiated with explicit setting of the default value.
Parameterized hierarchical block must have mangled name even when all parameters have default value,
otherwise the parameterized module will be hidden by protect-lib wrapper.
* rename variable names. No functional change is intended.
* Add a test for hierarchical verilation without timescale
* Emit timeunit in hierarchical wrapper only when it is specified in the input design or command line option.
* Update src/V3AstNodes.h
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
Co-authored-by: Wilson Snyder <wsnyder@wsnyder.org>
* Fix memory leak of t_trace_cat and t_trace_cat_renew
* Fix memory leak of t_trace_c_api
* Fix memory leak in t_trace_public_func and t_trace_public_func_vlt
* Fix memory leaks in t_flat_build (and probably more).
* Use unique_ptr in testcases
* Test that indexing into memory word fails
* VPI: don't index into memory word
Memory and memory words share a VerilatedVar, so check for memory word before attempting to index.
* Test that range iterator exhausts after 1 dimension
* Separate range iterator from range object
Range iterators should return NULL after all ranges have been returned
* Add a test to use unpacked array port in hierarchical verilation and protect-lib.
* V3EmitV supports unpacked array variables
* Can Emit local unpacked array properly
* Update golden of t_debug_emitv
* Support unpacked array port in protect-lib
* Remove t_prot_lib_unpacked_bad test as unpacked array is supported now.
* Add tests for unpacked array in DPI-C
* Add more generic parameter generator to AstNodes
* Supports multi dimensional array in DPI ( DPI argmuments <=> Verilator internal type conversion)
consider typedef in V3Task
fix export test
fix inout for scalar
support export func of time
* V3Premit does not show an error for wide words nor ArraySel
* Unnecessary pack func for unapcked array does not appear anymore
* Support unpacked array in runtime header
- Add an overload for lvalue VL_CVT_PACK_STR_NN
- Allow conversion from void *
* touch up tests for codacy advices
* resolve free functions. no functional change intended.
The end iterator always points to an element past the end of the list.
When new elements are added to the back of the list, they are inserted
before the end iterator.
Instead, track the last element in the list at the start of processing
and stop after it's been processed.
This patch normalizes what the tests do before exiting. After this
change each test should call final on the top module and explicitly
free the top module object before exiting.
* Add a test to use string for $fgets
* Use dedicated function for $fgets to std::string
* share the implementation of $fgets
* Pass -1 for bitwidth of std::string to distinguish from POD
* add checks for scanf with string
* apply clang-format
* hier_block with cmake test doesn't assume prefix now.
* Add space between files
* don't set -Mdir on cmake build as it will be set by DIRECTORY option
* Use top target name instead of prefix
* Add a test to make sure that lib modules (loaded via -y option) can be a hier_block.
* Add HDL file of the hier_block to the source list if the module is loaded via -y option.
(Each hier_block is treated as a top module when processing the hier_block.)
* Use "\n" for delimiter as the other files
* Add a test to use shared object of protect-lib
* Add a guard to call ctor/dtor just once even when a protec-lib is shared object.
* Pass .a to linker in leaf-last order for older ld.
* Add -flat_namespace for mac
* Don't use thin-archive to merge multiple archives because it is gnu-ar specific.
* remove -time option that may caues -Wunused-command-line-argument warning
* test:Add more tests for checking split_var for unpacked array.
* Fix range calculation of SliceSel and change to UASSERT_OBJ because the range check is done in V3Width beforehand.
cint.mainInt(nodep) walks the tree and populates m_ctorVarsVec.
Reuse EmitCImp cint for the slow mainImp() emition steps to make sure
we emit constructor calls to setup SystemC sc_module names.
* add a test for protect-lib without sequential logic that cause the follwoing error.
%Error: obj_vlt/t_prot_lib_comb/secret/secret.sv:78:14: syntax error, unexpected ')'
78 | always @() begin
| ^
* protect-lib wrapper must contain sequential signal related statements
only if the protected design is a sequential logic.
Diff looks big, but actually just add "if (m_hasClk)" condition.
Signed-off-by: Yutetsu TAKATSUKASA <y.takatsukasa@gmail.com>
- Do try to merge after assignment to condition when possible.
- Do not try to merge reduced form if not the expected statement.
This used to cause a crash.
Change the Travis builds to use workspaces and persistent ccache
We proceed in 2 stages (as before, but using workspaces for
persistence):
1. In the 'build' stage, we clone the repo, build it and
save the whole checkout ($TRAVIS_BUILD_DIR) as a workspace
2. In the 'test' stage, rather than cloning the repo, multiple jobs
pull down the same workspace we built to run the tests from
This enables:
- Reuse of the build in multiple test jobs (this is what we used the Travis
cache for before)
- Each job having a separate persistent Travis cache, which now only
contains the ccache. This means all jobs, including 'build' and 'test'
jobs can make maximum use of ccache across runs. This drastically cuts
down build times when the ccache hits, which is very often the case for
'test' jobs. Also, the separate caches only store the objects build by
the particular job that owns the cache, so we can keep the per job
ccache small.
If the commit message contains '[travis ccache clear]', the ccache will
be cleared at the beginning of the build. This can be used to test build
complete within the 50 minute timeout imposed by Travis, even without a
persistent ccache.
This provides minor simulation performance benefit, but can provide
large C++ compilation time improvement, notably with Clang (4x).
This patch implements #2366 .
This allows compiling the run-time library with optimization even when OPT_FAST is not used in order to imporove model build speed, possibly during debug cycles.
This adds the flag --generate-waivefile <filename>. This will generate
a verilator config file with the proper lint_off statemens to turn off
warnings emitted during this particular run.
This feature can be used to start with using Verilator as linter and
systematically capture all known lint warning for further
elimination. It hopefully helps people turning of -Wno-fatal or
-Wno-lint and gradually improve their code base.
Signed-off-by: Stefan Wallentowitz <stefan.wallentowitz@hm.edu>
--output-split is now on by default with value 20000.
--output-split-cfuncs and --output-split-ctrace now defaults to the
value of --output-split unless explicitly specified.
Instead of __ALLfast.cpp and __ALLslow.cpp, we now create only a single
__ALL.cpp and compile it with OPT_FAST, this speeds up small builds
where the C compiler does not dominate. A separate patch will follow
turning VM_PARALLEL_BUILDS on by default at a certain size.
Given this change to the build there is now no point in emitting both
fast and slow routines into the same .cpp file when --output-split is
not set as they will be just included in the same __ALL.cpp file. To
keep things simpler and the output easier to comprehend, V3EmitC has
also been changed to always emit the fast and slow files separately.
Also change verilated.mk to apply OPT_SLOW to all slow files, not just
ones called *__Slow.cpp. This change in particular ensures __Syms.cpp
is build as slow.
Part of #2360.
- Improve flag pruning heuristic
- Set all trace activity flags in slow code. This in turns enables us
to remove checking the slow flag on the fast path.
Firstly, we always use a byte array for fine grained activity flags
instead of a bit vector (we used to use a byte array only if we had
parallel mtasks). The byte vector can be set more cheaply in eval,
closing about 1/3 of the gap in performance between compiling with
or without --trace on SweRV EH1. The speed of tracing itself is not
measurably different.
Secondly, we prune the activity tracking such that if a set of activity
flag combinations only guard a small number of signals, we will turn
those signals into awayls traced signals. This avoids code which
sometimes tests dozens of activity flags just to subsequently check one
signal and dump it if it's value changed. We can just check the signal
state straight instead, and not bother with the flags. This removes
about 30% of activity flags in SweRV EH1, and makes both single threaded
VCD and FST tracing 8-9% faster.