Clarify that operands are typically 32bits, and have the code generator
make better use of this.
Also improve the %movi implementation to work well with marger vectors.
Add the %andi instruction to use immediate operands.
Use high radix long division to take advantage of the divide hardware
of the host computer. It looks brute force at first glance, but since
it is using the optimized arithmetic of the host processor, it is much
faster then implementing "fast" algorithms the hard way.
This instruction adds an integer value to the value being loaded. This
optimization uses subarrays instead of the += operator. This is faster
because the value is best loaded into the vector as a subarray anyhow.
The AND and OR operators for vvp_bit4_t are slightly tweaked to be
lighter and inlinable.
The vvp_vector4_t::set_bit is optimized to do less silly mask fiddling.
When processing wide vectors of these operations, it pays to process
them as vectors. This improves run-time performance. Have the run time
select vectorized or not based on the vector width.
Improve vvp_vector4_t methods copy_bits and the part selecting constructor
to make better use of vector words. Eliminate bit-by-bit processing by
these methods to take advantage of host processor words.
Improve vthread_bits_to_vector to use these improved methods and Update
the %load/av and %set/v instructions to take advantage of these changes.
The MinGW system() implementation appears to return the straight
return value instead of the waitpid() like result that more
normal systems return. Because of this just return the system()
result without processing for MinGW compilations.
Older version of the MinGW runtime (pre 3.14) just used the
underlying vsnprintf(). Which has some problems. The 3.14 version
has some nice improvements, but it has a sever bug when processing
"%*.*f", -1, -1, <some_real_value>. Because of this we need to use
the underlying version without the enhancements for now.
snprintf prints %p differently than the other printf routines
so use _snprintf to get consistent results.
Only build the PDF files if both man and ps2pdf exist.
MinGW does not know about the z modifier for %d, %u, etc.
Add some missing Makefile check targets.
These instructions can take advantage of the much optimized
vector_to_array function to do their arithmetic work quickly and
punt on X very quickly if needed. This helps some benchmarks.
Functions like $monitor need to attach callbacks to array words if
those words are to be monitored. Have the array hold all the callbacks
for words in the array, under the assumption that the monitored words
are sparse.
Array words don't have a vpiHandle with a label, so the %vpi_call
needs a special syntac for arguments that reference array words.
This syntax creates an array word reference that persists and can
be used at a VPI object by system tasks.
It is possible for the code generator to create .array/port objects
before the .array object that the port refereces, so use the resolv_list
to arrange for binding during cleanup.
Save tons of space per memory word by not creating a vpi handle for
each and every word of a variable array. (Net arrays still get a
vpiHandle for every word.) The consequence of this is that all
accesses to a variable array need to go through the indexing.
This commit handles the most common places where this impacts, but
there are still problems.
vpi_handle_by_name() was assuming it was always given a valid scope
object. In the context of vpi_chk_error() this is not required and
some users use/abuse the interface by calling the function with invalid
objects expecting a 0 return value. This patch adds an explicit check
for the supported types vpiScope and as an extension vpiModule.
Anything else should be flagged as an error once we have vpi_chk_error()
implemented, but for now it just returns 0.
The abs() function needs to be able to turn -0.0 into 0.0. This proved
to be too clunky (and perhaps impossible) to do with tests and jumps,
so add an %abs/wr opcode to do it using fabs().
The min/max functions need to take special care with the handling
of NaN operands. These matter, so generate the extra code to handle
them.
This patch adds file and line information for parameters and
local parameters. It also adds file/line stubs for signals in
the tgt-* files. It adds the pform code needed to eventually
do genvar checks and passing of genvar file/line information.
It verifies that a genvar does not have the same name as a
parameter/local parameter.
This patch makes sure that objects either support vpiFile
and vpiLineNo or adds dummy code so that a runtime error
will not occur when accessing these properties. It also
returns 1 for the size of real variables and adds a
simplified vpiIndex that matches the Memory interface.
This patch adds code to push the file and line information
for scope objects (modules, functions, tasks, etc.) to the
runtime. For modules, this includes the definition fields.
This patch adds ifnone functionality. It does not produce an
error when both an ifnone and an unconditional simple module
path are given. For this case the ifnone delays are ignored.
Fix handling of cases where multiple specified delays are activated
for a given output. Need to apply the standard selection criteria
that gets the minimum value.
The %sub instruction didn't have the efficent implementation that
the %add instructions used. Update subtraction to use the array
method, so that it gets the same performance benefits.
The vvp_vector4_t often receives the results of vector arithmetic.
Add an optimized method for setting that data into the vector. Take
into account that arithmetic results have no X/Z bits, etc.
On some systems, 1UL<<X will make a mess if X is the size of
an unsigned long. This especially seems to be a problem on i386
systems. Protect those shifts in the vvp_net.cc.
By slightly altering the vvp_bit4_t encoding, a few simple
optimizations become possible. By making Z==2 and X==3, the
conversion from X/Z to X is a simple shift-or, and this can
be used to reduce the size of some of the bit4 operators.
In previous incarnations of the vvp runtime, bit vectors were passed
around as arrays of unsigned char that charried bit4 vectors. That
is no longer used. Remove the last vestiges of that dead code.
Remove dependencies on vvp_bit4_encoding outside of the vvp_net
core types. The table_functor_s class was the worst offfender and
was barely used, so it is now removed completely. There are a few
opcodes in vhtread.cc that also make vvvp_bit4_t encoding
assumptions (and used casts) and those have been fixed. There
were also various VPI interface functions that are fixed.
The vvp_vector4_t holds 4-value logic. This patch changes the encoding
of 4-value bits in the vector to use separate A- and B bit vectors,
with the B- vector signaling the A- bits that are not 0/1. This
allows rapid conversion to 2-value logic, and rapid tests for X
and Z values.
This patch fixes deassign to allow it to unlink from a driver.
It also zeros the cassign_link and force_link pointers after
they have been unlinked. Not doing this will cause an assert
if deassign/release are called multiple times (variable only).
This patch adds the ability to assign/deassign a bit or part select.
It also cleans up the code and fixes some problem in the forcing of
strength aware nets.
Rework the MUXZ and MUXR code to use an enum instead of plain
integers for the select input state. This makes it more obvious
what is actually going on.
This patch adds a flag to the MUXZ object to make sure that it will
run at time zero if needed. If this is not done the default Z result
may not be overridden by an X result.
This patch adds a %assign/av/d opcode. This is a version of %assign/av
that allows a delay expression. Ultimately this allows a dynamically
indexed array to have a delay expression (non-constant delay value).