When performing a signed division or modulus operation using native
arithmetic, trap the special case that the numerator is the minimum
integer value and the denominator is -1, as this gives an undefined
result in C++.
If a thread becomes detached due to a join_any statement, that
thread must not attempt to join its parent, even if the parent
is waiting on a subsequent join statement.
SystemVerilog allows a mixture of procedural and continuous assignments
to be applied to different parts of the same vector. The previous attempt
to make this work for non-blocking assignments was flawed (see preceding
fix for vvp_fun_part_pv::recv_vec4_pv). Instead, handle this case by
converting the non-blocking assignment into a delayed force statement,
which matches the way mixed continuous and blocking assignments are
handled.
The .scope needs to be aware of return types so that the %call/vec4
function knows how to intialize the return value. We also need to
extend the %ret/vec4 to support writing parts of the return value.
Create The %callf/* opcodes to invoke user defined functions in a
more specialized way. This allows for some sanity checking on the
way, and also is a step towards keeping return values on stacks.
"child->delay_delete = 1;" was added, for when building with "Microsoft Visual Studio Express 2015 RC Web" in DEBUG mode, so that pr2909555.v would pass with -strict, otherwise it would cause memory access error will trying to access the previously deleted "child" variable.
vvp_net_t::force_vec4 propagates all bits of the forced value passed
to it, regardless of the mask value. I can't see any way to fix this
directly, so instead make sure anything that calls force_vec4 sets
the unforced bits of the passed value to the correct value.
A disable statement can terminate a thread whilst it still has
local variables on the stack (e.g. the loop counter for a repeat
statement). We need to clear the thread stacks when this happens.
Using a UL constant in a unit64_t context does not work on a 32-bit
machine since UL is 32-bits. Instead create uint64_t constants using
static casts and the appropriate bit operators.
These bypass the vec4 stack in some common cases, saving instructions
and vec4 manipulations.
Also, minor improvement to the %flag/set/vec4 statement.
Kill a few warnings.
- Have %pushi/vec4 handle some special cases optimally.
- Eliminate some duplicated method calls in %load/vec4.
- Optimize the vvp_vector4_t::copy_from_ method by inlining
some parts.
This adds the runtime support for class properties that are classes
to be arrayed. Add a means to define the dimensions of a property
in the vvp format, and add functions for setting/extracting elements
of a property.
This goes all the way down to the vvp level, where we create support
for arrays of objects, generate the new code in the -tvvp code
generator, and elaborate the arrays in the first place.
Internally, treat the "$" as a special expression type that takes
as an argument the signal that is being indexed. In the vvp target,
use the $last system function to implement this.
This works by translating it to a $size() system function call.
The $size function is already implemented for dynamic queues and
it is easy enough to expand it for queues.
%exec_ufunc assumed that because a function can never block, a call to
vthread_run() on the function code would only return when the final %end
instruction had been executed. This is not true if the function contains
a named block, which will be executed via a %fork instruction, allowing
the main function thread to suspend after a %join instruction. The fix
is to break %exec_ufunc into two instructions, the first setting the
function inputs and executing the function code, the second collecting
the function result. This provides the opportunity for the parent thread
to suspend after the %exec_ufunc instruction until all its children have
completed.
When performing the initial assignment for a procedural continuous
assignment, any previous continuous assignment to the destination
signal must be unlinked first, otherwise the initial value for the
assignment will propagate to any other nets that are driven by the
original source signal.
Signed vector power operations were being implemented using the double
pow() function. This gave inaccurate results when the operands or
result were not exactly representable by a 64-bit floating point number.
The vvp_vector2_t::pow() function is recursive, and performs a multiplication
operation on each step. The multiplication operator was expanding the result
vector to accomodate the maximum possible result value for the given operand
vectors, thus causing the execution time of the power operation to be
exponentially proportional to the exponent value. Both in this case and
in general, it is unnecessary for the multiplication result vector to be
expanded, as the compiler has already determined the required vector width
during elaboration, and sizes the operand vectors to match.
The vvp_vector2_t constructor that takes a vvp_vector4_t value was
documented as creating a NaN value if the supplied vector contained
any X or Z bits, but instead used the standard Verilog 4-state to
2-state conversion semantics (X or Z translate to 0). I've added an
optional second parameter to the constructor to allow the user to
choose which semantics they want, as both are needed.