These bypass the vec4 stack in some common cases, saving instructions
and vec4 manipulations.
Also, minor improvement to the %flag/set/vec4 statement.
Kill a few warnings.
- Have %pushi/vec4 handle some special cases optimally.
- Eliminate some duplicated method calls in %load/vec4.
- Optimize the vvp_vector4_t::copy_from_ method by inlining
some parts.
This adds the runtime support for class properties that are classes
to be arrayed. Add a means to define the dimensions of a property
in the vvp format, and add functions for setting/extracting elements
of a property.
This goes all the way down to the vvp level, where we create support
for arrays of objects, generate the new code in the -tvvp code
generator, and elaborate the arrays in the first place.
Internally, treat the "$" as a special expression type that takes
as an argument the signal that is being indexed. In the vvp target,
use the $last system function to implement this.
This works by translating it to a $size() system function call.
The $size function is already implemented for dynamic queues and
it is easy enough to expand it for queues.
%exec_ufunc assumed that because a function can never block, a call to
vthread_run() on the function code would only return when the final %end
instruction had been executed. This is not true if the function contains
a named block, which will be executed via a %fork instruction, allowing
the main function thread to suspend after a %join instruction. The fix
is to break %exec_ufunc into two instructions, the first setting the
function inputs and executing the function code, the second collecting
the function result. This provides the opportunity for the parent thread
to suspend after the %exec_ufunc instruction until all its children have
completed.
When performing the initial assignment for a procedural continuous
assignment, any previous continuous assignment to the destination
signal must be unlinked first, otherwise the initial value for the
assignment will propagate to any other nets that are driven by the
original source signal.
Signed vector power operations were being implemented using the double
pow() function. This gave inaccurate results when the operands or
result were not exactly representable by a 64-bit floating point number.
The vvp_vector2_t::pow() function is recursive, and performs a multiplication
operation on each step. The multiplication operator was expanding the result
vector to accomodate the maximum possible result value for the given operand
vectors, thus causing the execution time of the power operation to be
exponentially proportional to the exponent value. Both in this case and
in general, it is unnecessary for the multiplication result vector to be
expanded, as the compiler has already determined the required vector width
during elaboration, and sizes the operand vectors to match.
The vvp_vector2_t constructor that takes a vvp_vector4_t value was
documented as creating a NaN value if the supplied vector contained
any X or Z bits, but instead used the standard Verilog 4-state to
2-state conversion semantics (X or Z translate to 0). I've added an
optional second parameter to the constructor to allow the user to
choose which semantics they want, as both are needed.
Redsign the handling of the return value, including a rework of
the %vpi_func syntax to carry the needed information.
Add a few more arithmetic operator instructions.
This patch fixes some leaks in the object stack when getting various
class properties. With this fix an assert can be added to verify that
the object stack is clean when a thread is exiting.
This allows for syntax like a.b.c where a is a class with member
b, which is a class with member c, and so on. The handling is mostly
for the support of compound objects like classes.
When a thread that has detached children is reaped the detached children
need to be fully detached so they can be reaped correctly. If they are not
fully detached then they may reference a parent that has already been
reaped (memory freed). Found with valgrind.
This includes adding support for returning strings from functions,
adding initializing new darray with array_pattern strings, and
assigning an array_pattern of strings to a preallocated darray.
Also fix up support for initializing array with simple string
expression.
When you have an expression like this (extreme example):
a[idx[1]][idx[2]*4 +: 4] <= #(idx[3]) 4'ha;
where a is a reg array and idx is a reg or net array. The retrieval
of idx[2] was clobbering index register 3, which was set before
evaluating the part offset expression, then used in the %set/av of the
array value. (likewise for idx[1] and idx[3]])
To avoid this issue, this patch adds and uses a new instruction
%ix/mov which simply copies one indexed register to another. When
necessary, expressions are first evaluated into temporary registers to
avoid clobbering, then moved in to place before the %*/av instruction.
When a fork/join contains a task, the task completion may become
confused with the completion of another thread if any of the
threads are embedded in the main thread. So always create threads
for all the fork paths, and joins to match.
This provides the ivl_target.h interface for class definitions
and expressions, the vvp code generator support for class objects
and properties, and the vvp run time support. Trivial class objects
now seem to work.
Create stub class objects at the vvp level and generate the code
to invoke that stub. Implement the routines needed to implement
a test for null object references.
This will hopefully improve performance slightly, but also this
intended as a model for what to do when I get around to doing the
same thing to other data types.
Strings, when put into dynamic arrays, are treated as first class
types much line reals. Add the code generator and vvp support for
this situation. Also fix a bug distinguishing between character
selects from strings and select form arrays of strings.
This involves working out the code to get the base type of a select
expression of a darray. Also added the runtime support for darrays
with real value elements.
Clean up the vector4_to_value to use templates and explicit
instantiations. This makes the interface much cleaner for a
wider variety of integral types.
Implement through the ivl core to the ivl_target.h API.
Also draft implementation of creating and storing arrays
in the vvp runtime and code generator.
When string[x] is an l-value, generate code to implement something
like the string.putc(x, ...) method.
Also handle when string[x] is the argument of a system task. In that
case resort to treating it as a calculated 8-bit vector, because that
is what it is.
This also advances support for string expressions in general.
Handle assignments to string variables in the code generator by
trying to calculate a string expression. This involves the new
string object thread details.
In vvp, create the .var/str variable for representing strings, and
handle strings in the $display system task.
Add to vvp threads the concept of a stack of strings. This is going to
be how complex objects are to me handled in the future: forth-like
operation stacks. Also add the first two instructions to minimally get
strings to work.
In the parser, handle the variable declaration and make it available
to the ivl_target.h code generator. The vvp code generator can use this
information to generate the code for new vvp support.
The fork/join list did not adequately support the tree of processes
that can happen in Verilog, so this patch reworks that support to
make it all more natural.
The clang compiler does not like using struct to reference a class object.
This patch removes all the struct keywords for __vpiNamedEvent objects
since they are now a class and can be called without a struct/class
qualifier.
This patch also removes all the extra class qualifiers from the rest of
the source code.
The clang compiler does not like mixing class and struct references. This
patch updates all the struct __vpiHandle, etc. to use class since that is
how they are now defined.
Instead of C-like data structures where the __vpiHandle base is a
leading member, make the __vpiHandle a derived class. Give the base
class a virtual destructor so that dynamic_cast works reliably, and
now pretty much all of the junk for testing if an object really is
of the derived class goes away. Also, problems with casting up to
a vpiHandle become trivial non-issues.
Now we have a code generator that can handle compressed assignments
as they have been re-imagined in elaboration. There are some cases
that are not yet supported, we'll patch them up in due course.
The power operator defines 2**-1 and -2**-1 to be zero. This patch fixes
both the procedural and continuous assignments to work correctly. It also
fixes a problem in the compiler power code so that the one constant value
always has at least two bits.
This patch adds support for tracing procedural statement execution in vvp.
This is accomplished by adding a new opcode that is inserted before the
code that represents a procedural statement. These opcodes also trigger
a message whenever time advances. By default these opcodes are not added.
To add them, pass the -pfileline=1 flag to the compiler. In the future we
may add support for turning the debug output on and off once the opcodes
have been added with a system task or from the interactive prompt.
Currently the vvp target emits multiple single bit %mov instructions
to perform sign extension. This patch adds a new %pad instruction
that allows sign extension to be performed with just one instruction.
This patch changes all the iterator code to use a prefix ++ instead
of postfix since it is more efficient (no need for a temporary). It
is likely that the compiler could optimize this away, but lets make
it efficient from the start.
This patch adds -Wextra to the compilation flags for C++ files in
the vvp and vpi subdirectories. It also fixes all the problems
found while adding -Wextra. This mostly entailed removing some of
the unused arguments, removing the name for others and using the
correct number of initializers.
The functions (malloc, free, etc.) that used to be provided in
malloc.h are now provided in cstdlib for C++ files and stdlib.h for
C files. Since we require a C99 compliant compiler it makes sense
that malloc.h is no longer needed.
This patch also modifies all the C++ files to use the <c...>
version of the standard C header files (e.g. <cstdlib> vs
<stdlib.h>). Some of the files used the C++ version and others did
not. There are still a few other header changes that could be done,
but this takes care of much of it.
Tran islands must do their calculations using the forced values,
if any. But the output from a port must also be subject to force
filtering. It's a little ugly, but hopefully won't hurt the more
normal case.
The vpi_get_value() function should not crash when called during
the compiletf phase. This patch fixes this by returning 'bx for
any vectors in thread space. It also fixes some other minor things
that my test code uncovered. Most of the other objects work as
expected.
The scope thread rework broek --with-valgrind builds due to the
different handling of the list of threads. Rework valgrind enabled
handling of the thread set within a scope.
The scope contains the threads running within. The rework of this
patch allows all threads to know their scope, and cleans up the
handling of threads listed in the scope.