These bypass the vec4 stack in some common cases, saving instructions
and vec4 manipulations.
Also, minor improvement to the %flag/set/vec4 statement.
Kill a few warnings.
- Have %pushi/vec4 handle some special cases optimally.
- Eliminate some duplicated method calls in %load/vec4.
- Optimize the vvp_vector4_t::copy_from_ method by inlining
some parts.
This adds the runtime support for class properties that are classes
to be arrayed. Add a means to define the dimensions of a property
in the vvp format, and add functions for setting/extracting elements
of a property.
This goes all the way down to the vvp level, where we create support
for arrays of objects, generate the new code in the -tvvp code
generator, and elaborate the arrays in the first place.
Internally, treat the "$" as a special expression type that takes
as an argument the signal that is being indexed. In the vvp target,
use the $last system function to implement this.
This works by translating it to a $size() system function call.
The $size function is already implemented for dynamic queues and
it is easy enough to expand it for queues.
%exec_ufunc assumed that because a function can never block, a call to
vthread_run() on the function code would only return when the final %end
instruction had been executed. This is not true if the function contains
a named block, which will be executed via a %fork instruction, allowing
the main function thread to suspend after a %join instruction. The fix
is to break %exec_ufunc into two instructions, the first setting the
function inputs and executing the function code, the second collecting
the function result. This provides the opportunity for the parent thread
to suspend after the %exec_ufunc instruction until all its children have
completed.
When performing the initial assignment for a procedural continuous
assignment, any previous continuous assignment to the destination
signal must be unlinked first, otherwise the initial value for the
assignment will propagate to any other nets that are driven by the
original source signal.
Signed vector power operations were being implemented using the double
pow() function. This gave inaccurate results when the operands or
result were not exactly representable by a 64-bit floating point number.
The vvp_vector2_t::pow() function is recursive, and performs a multiplication
operation on each step. The multiplication operator was expanding the result
vector to accomodate the maximum possible result value for the given operand
vectors, thus causing the execution time of the power operation to be
exponentially proportional to the exponent value. Both in this case and
in general, it is unnecessary for the multiplication result vector to be
expanded, as the compiler has already determined the required vector width
during elaboration, and sizes the operand vectors to match.
The vvp_vector2_t constructor that takes a vvp_vector4_t value was
documented as creating a NaN value if the supplied vector contained
any X or Z bits, but instead used the standard Verilog 4-state to
2-state conversion semantics (X or Z translate to 0). I've added an
optional second parameter to the constructor to allow the user to
choose which semantics they want, as both are needed.
Redsign the handling of the return value, including a rework of
the %vpi_func syntax to carry the needed information.
Add a few more arithmetic operator instructions.
This patch fixes some leaks in the object stack when getting various
class properties. With this fix an assert can be added to verify that
the object stack is clean when a thread is exiting.
This allows for syntax like a.b.c where a is a class with member
b, which is a class with member c, and so on. The handling is mostly
for the support of compound objects like classes.
When a thread that has detached children is reaped the detached children
need to be fully detached so they can be reaped correctly. If they are not
fully detached then they may reference a parent that has already been
reaped (memory freed). Found with valgrind.
This includes adding support for returning strings from functions,
adding initializing new darray with array_pattern strings, and
assigning an array_pattern of strings to a preallocated darray.
Also fix up support for initializing array with simple string
expression.
When you have an expression like this (extreme example):
a[idx[1]][idx[2]*4 +: 4] <= #(idx[3]) 4'ha;
where a is a reg array and idx is a reg or net array. The retrieval
of idx[2] was clobbering index register 3, which was set before
evaluating the part offset expression, then used in the %set/av of the
array value. (likewise for idx[1] and idx[3]])
To avoid this issue, this patch adds and uses a new instruction
%ix/mov which simply copies one indexed register to another. When
necessary, expressions are first evaluated into temporary registers to
avoid clobbering, then moved in to place before the %*/av instruction.
When a fork/join contains a task, the task completion may become
confused with the completion of another thread if any of the
threads are embedded in the main thread. So always create threads
for all the fork paths, and joins to match.
This provides the ivl_target.h interface for class definitions
and expressions, the vvp code generator support for class objects
and properties, and the vvp run time support. Trivial class objects
now seem to work.
Create stub class objects at the vvp level and generate the code
to invoke that stub. Implement the routines needed to implement
a test for null object references.