__vpiVThrVec4Stack::vpi_get_value_int_ was always treating the thread variable
as unsigned, rather than observing the value of __vpiVThrVec4Stack::signed_flag_.
Not sure why this was done - none of the regression tests broke when I changed
this.
A disable statement can terminate a thread whilst it still has
local variables on the stack (e.g. the loop counter for a repeat
statement). We need to clear the thread stacks when this happens.
Using a UL constant in a unit64_t context does not work on a 32-bit
machine since UL is 32-bits. Instead create uint64_t constants using
static casts and the appropriate bit operators.
These bypass the vec4 stack in some common cases, saving instructions
and vec4 manipulations.
Also, minor improvement to the %flag/set/vec4 statement.
Kill a few warnings.
- Have %pushi/vec4 handle some special cases optimally.
- Eliminate some duplicated method calls in %load/vec4.
- Optimize the vvp_vector4_t::copy_from_ method by inlining
some parts.
The code to get the correct modpath delay for a given edge had the X and Z
entries swapped.
When putting a delay from the VPI the 2 delay to twelve delay mapping was
incorrect and the to/from X delays were also not being calculated correctly.
This adds the runtime support for class properties that are classes
to be arrayed. Add a means to define the dimensions of a property
in the vvp format, and add functions for setting/extracting elements
of a property.
This goes all the way down to the vvp level, where we create support
for arrays of objects, generate the new code in the -tvvp code
generator, and elaborate the arrays in the first place.
Internally, treat the "$" as a special expression type that takes
as an argument the signal that is being indexed. In the vvp target,
use the $last system function to implement this.
This works by translating it to a $size() system function call.
The $size function is already implemented for dynamic queues and
it is easy enough to expand it for queues.
When for example assigning to foo[<x>] within a contitional, and
doing synthesis, we need to create a NetSubstitute device to manage
the l-value bit selects.
Testing with 32-bit clang 3.3, with additional compiler flags
-Wsign-compare -Wundef
this patch eliminates the following warnings:
config.h💯6: warning: 'UINT64_T_AND_ULONG_SAME' is not defined, evaluates to 0 [-Wundef]
vcd_priv2.cc:233:12: warning: duplicate 'extern' declaration specifier [-Wduplicate-decl-specifier]
parse.cc:6496:9: warning: comparison of 0 <= unsigned expression is always true [-Wtautological-compare]
parse.cc:6499:13: warning: comparison of 0 <= unsigned expression is always true [-Wtautological-compare]
parse.cc:6502:9: warning: comparison of 0 <= unsigned expression is always true [-Wtautological-compare]
parse.cc:6510:53: warning: comparison of integers of different signs: 'const unsigned int' and 'int' [-Wsign-compare]
Changing the vlltype elements from unsigned to int reconciles their
type with the native bison YYLTYPE structure.
Before this patch, WARNING_FLAGS applied to both C and C++,
and WARNING_FLAGS_CXX applied to C++ only.
This patch adds a WARNING_FLAGS_CC that applies to C only.
That change should be generally useful; in particular the C
code is almost ready for -Wstrict-prototypes, which does not
apply to C++.
-Wextra (or -W) used to only apply to C++ via WARNING_FLAGS_CXX.
This patch moves it to WARNING_FLAGS, to apply to both C and C++.
Unfortunately, that triggers a ton of warnings.
For now, cover most of the new warnings up by adding
-Wno-unused -Wno-sign-compare -Wno-type-limits
to WARNING_FLAGS_CC. In the long run, I want to change the C coding
style, and take off these disable-warning flags. But those changes
can dribble in as separate commits; this patch is big enough already.
Actually fix a couple missing-field-initializers in libveriuser/veriusertfs.c.
119 formal void parameters added to keep -Wstrict-prototypes happy.
Process found one real missing prototype in vpi/vcd_priv.h:
EXTERN void vcd_names_delete(struct vcd_names_list_s*tab);
8 such warnings left, all in Tony's code
This generates an EQZ LPM device that carries the case-z-ness to
the code generator.
Also add to the vvp code generator support for the EQZ device so
that the synthesis results can be simulated.
Account for the wildcard devices in the sizer.
Second try cleaning up cast-alignment problems surrounding need_result_buf().
Clang gave a bunch of warnings like
vvp/vpi_const.cc:196:34: warning: cast from 'char *' to 'p_vpi_vecval' (aka 't_vpi_vecval *') increases required alignment from 1 to 4 [-Wcast-align]
This version is verbose and changes the prototype for need_result_buf().
But it is semantically (c++) correct, and makes need_result_buf() feel like malloc().
This function was apparently not well tested, because any use of
acc_handle_object() triggered a use of vpi_handle_by_name that was
buggy.
The implementation was awkwardly written, to parts of it were redone.
%exec_ufunc assumed that because a function can never block, a call to
vthread_run() on the function code would only return when the final %end
instruction had been executed. This is not true if the function contains
a named block, which will be executed via a %fork instruction, allowing
the main function thread to suspend after a %join instruction. The fix
is to break %exec_ufunc into two instructions, the first setting the
function inputs and executing the function code, the second collecting
the function result. This provides the opportunity for the parent thread
to suspend after the %exec_ufunc instruction until all its children have
completed.
When performing the initial assignment for a procedural continuous
assignment, any previous continuous assignment to the destination
signal must be unlinked first, otherwise the initial value for the
assignment will propagate to any other nets that are driven by the
original source signal.
In the case that the RHS of a procedural continuous assignment is a simple
vector that is wider than the LHS, changes to the RHS vector cause the
entire vector to be sent to port 1 of the LHS vvp_fun_signal object. This
vector needs to be coerced to the size of the LHS. Note that this is a
stopgap fix until vvp handles arbitrary expressions on the RHS of a
procedural continuous assignment.
Signed vector power operations were being implemented using the double
pow() function. This gave inaccurate results when the operands or
result were not exactly representable by a 64-bit floating point number.
The vvp_vector2_t::pow() function is recursive, and performs a multiplication
operation on each step. The multiplication operator was expanding the result
vector to accomodate the maximum possible result value for the given operand
vectors, thus causing the execution time of the power operation to be
exponentially proportional to the exponent value. Both in this case and
in general, it is unnecessary for the multiplication result vector to be
expanded, as the compiler has already determined the required vector width
during elaboration, and sizes the operand vectors to match.
The vvp_vector2_t constructor that takes a vvp_vector4_t value was
documented as creating a NaN value if the supplied vector contained
any X or Z bits, but instead used the standard Verilog 4-state to
2-state conversion semantics (X or Z translate to 0). I've added an
optional second parameter to the constructor to allow the user to
choose which semantics they want, as both are needed.
This means using some of the new vec4 infrastructure to get at the
data, instead of using the old thread bit pointers. In the process,
remove the vbit and vwid members that pointed to thread bits. Those
bits no longer exist.
Redsign the handling of the return value, including a rework of
the %vpi_func syntax to carry the needed information.
Add a few more arithmetic operator instructions.
The standard explicitly states that only object with a full name
can be searched for by name. A port does not have a full name and
hence should be skipped so that a different object (the signal,
etc.) can be returned. This patch adds code to skip ports when
searching for an object handle by name.
This patch fixes some leaks in the object stack when getting various
class properties. With this fix an assert can be added to verify that
the object stack is clean when a thread is exiting.
This allows for syntax like a.b.c where a is a class with member
b, which is a class with member c, and so on. The handling is mostly
for the support of compound objects like classes.
When a thread that has detached children is reaped the detached children
need to be fully detached so they can be reaped correctly. If they are not
fully detached then they may reference a parent that has already been
reaped (memory freed). Found with valgrind.
Currently vvp only applies the pullup/pulldown for tri1/tri0 nets when
the net is not driven. The correct behaviour is to treat the pullup/
pulldown as an extra driver (with pull strength).
This option is intended to make it easier to compare results from
Icarus with results from other simulators. For now, the only effect
it has is to change the default format for displaying real numbers
when no format string is supplied.
This includes adding support for returning strings from functions,
adding initializing new darray with array_pattern strings, and
assigning an array_pattern of strings to a preallocated darray.
Also fix up support for initializing array with simple string
expression.
When you have an expression like this (extreme example):
a[idx[1]][idx[2]*4 +: 4] <= #(idx[3]) 4'ha;
where a is a reg array and idx is a reg or net array. The retrieval
of idx[2] was clobbering index register 3, which was set before
evaluating the part offset expression, then used in the %set/av of the
array value. (likewise for idx[1] and idx[3]])
To avoid this issue, this patch adds and uses a new instruction
%ix/mov which simply copies one indexed register to another. When
necessary, expressions are first evaluated into temporary registers to
avoid clobbering, then moved in to place before the %*/av instruction.
When a fork/join contains a task, the task completion may become
confused with the completion of another thread if any of the
threads are embedded in the main thread. So always create threads
for all the fork paths, and joins to match.
Instead of just translating a generate scope to a named begin/end scope
this patch creates a generate specific scope (vpiScopeGenerate) that is
of the vpiGenScope type. This may not match the standard 100%, but does
allow the FST dumper to denote generate scopes differently than the
other scope types. Most of the VPI code treats a vpiGenScope just like a
named block so only the FST dumper should have different behavior.
If a VPI call with real arguments has no calltf function, we still
need to pop the arguments off the vthread stack. Similarly, if it
has a real result, we need to push a value onto the vthread stack.
This patch adds support for implicit casts to the elaborate_rval_expr()
function. This will handle the majority of cases where an implicit cast
can occur.
Currently, when a variable expression is passed to a system task,
the expression value is stored in thread memory. Values stored
in thread memory cannot safely be passed to $strobe or $monitor,
because the thread memory may get reused or deallocated before
the $strobe or $monitor task actually executes. As a temporary
measure, we just trap this case and terminate with a "sorry"
message. A proper fix would require the expression value to be
calculated at the time the $strobe or $monitor executes, not at
the time it is called.
This is rather a pointless sort of thing, but it does turn from
from time to time, for example when constant literals (with no x or
z bits) are given strengths. So handle .net8/2s and .net8/2u the
same as .net8.s and .net8 objects.
If a strength aware net has an unambiguous HiZ1 strength, VVP treats
it as a logic '1'. It should be treated as a logic 'z'. An ambiguous
HiZ1/HiZ0 strength should also be treated as a logic 'z'.
When VVP compiles a .array statement for a net array, it does not
know the data type, so initialises the array signed_flag to false.
We need to set the signed_flag to the correct value once we know
the data type, to allow the VPI routines to correctly format the
data.
When checking with valgrind clean up the following:
The arguments for invalid task/function calls.
The simulation callback queues (only needed when the runtime aborts).
Call pthread_exit(NULL) just before exiting to cleanup dynamic loading.
The vvp_darray_real class cal be used for static arrays as well
and this is a more general solution anyhow. Kill the now useless
vvp_realarray_t class.
This provides the ivl_target.h interface for class definitions
and expressions, the vvp code generator support for class objects
and properties, and the vvp run time support. Trivial class objects
now seem to work.
Create stub class objects at the vvp level and generate the code
to invoke that stub. Implement the routines needed to implement
a test for null object references.
This will hopefully improve performance slightly, but also this
intended as a model for what to do when I get around to doing the
same thing to other data types.
Strings, when put into dynamic arrays, are treated as first class
types much line reals. Add the code generator and vvp support for
this situation. Also fix a bug distinguishing between character
selects from strings and select form arrays of strings.
This involves working out the code to get the base type of a select
expression of a darray. Also added the runtime support for darrays
with real value elements.
Clean up the vector4_to_value to use templates and explicit
instantiations. This makes the interface much cleaner for a
wider variety of integral types.
This patch updates the vvp code so it will compile with the valgrind hooks
again. There are still new constructs that need to be cleaned up correctly
and some old constructs were changed enough that the old code no longer
works, but the rest of this can be done as an incremental improvement.
Windows and hence mingw does not follow the standard regarding the return
value of vsnprintf(). The mingw code needs to iteratively search for a
buffer large enough to print the output.
The second call to vsnprintf() needs to have a copy of the argument list
so it can run correctly. On some system vsnprintf() destorys the original
argument list.
When sending a string to a system task/function allocate the space needed
to avoid truncating the string vs using a large fixed buffer.
In vvp allocate and use an exactly sized buffer for the MCD print routine if
the fixed buffer is not large enough. Using a fixed buffer keeps normal
printing fast.
This patch implements the $countdrivers system function. It does not
yet support wires connected to islands (and outputs a suitable "sorry"
message when this is detected).
To implement the $countdrivers system function, we need to be able to
find all the driver values for a given wire. Currently, if a wire has
has more than four drivers, the compiler builds a resolver tree out
of 4-input nodes to resolve the driven values, and there is no way at
run time to work back from the output node to the original driver
values. This patch moves the implementation of the resolver tree into
a single vvp functor (using a mechanism similar to wide functors to
support more than 4 inputs), thus gathering all the driver values into
a single place.
Implement through the ivl core to the ivl_target.h API.
Also draft implementation of creating and storing arrays
in the vvp runtime and code generator.
When string[x] is an l-value, generate code to implement something
like the string.putc(x, ...) method.
Also handle when string[x] is the argument of a system task. In that
case resort to treating it as a calculated 8-bit vector, because that
is what it is.
This also advances support for string expressions in general.
Handle assignments to string variables in the code generator by
trying to calculate a string expression. This involves the new
string object thread details.
In vvp, create the .var/str variable for representing strings, and
handle strings in the $display system task.
Add to vvp threads the concept of a stack of strings. This is going to
be how complex objects are to me handled in the future: forth-like
operation stacks. Also add the first two instructions to minimally get
strings to work.
In the parser, handle the variable declaration and make it available
to the ivl_target.h code generator. The vvp code generator can use this
information to generate the code for new vvp support.
Added: basic vpiPort VPI Objects for vpiModulkes
vpiDirection, vpiPortIndex, vpiName, vpiSize attributes
Since ports do not exist as net-like entities (nets either side
module instance boundaries are in effect connect directly in
the language front-ends internal representation) the port information
is effectively just meta-data passed through t-dll interface and
output as a additional annotation of module scopes in vvp.
Added: vpiLocalParam attribute for vpiParameter VPI objects
Added: support build for 32-bit target on 64-bit host (--with-m32
option to configure.in and minor tweaks to Makefiles and systemc-vpi).
The fork/join list did not adequately support the tree of processes
that can happen in Verilog, so this patch reworks that support to
make it all more natural.