Arrays of vvp_vector4_t values redundantly store some fields in every
word. Create a special type that stores vvp_vector4_t values in a form
that does not duplicate the width of all the items. This can save a lot
of space when big memories are simulated.
The schedule_assign_plucked_vector is a better way to implement the
schedule_assign_vector, or at least no worse, so remove the now
redundent schedule_assign_vector.
It is legal (though worthy of a warning, I think) for the part select
of an l-value to me out of bounds, so replace the error message with
a warning, and generate the appropriate code. In the process, clean
up some of the code for signal l-values to divide out the various kinds
of processing that can be done. This cleans things up a bit.
This is a major rework on the sys_fileio routines. They now have
improved compiletf routines and the calltf routines are now
standardized. Along the way a few bug were fixed as well. Some
updates to other vpi files as well.
I changed the order of the $fputc() arguments to match C and the
rest of the system functions like it ($fungetc, etc.). I recently
fixed $fungetc() so I'm assuming the $fputc() needs the same fix.
It's an Icarus specific function.
The load-and-add for vectors %load/vp0/s can be combined with the
load-and-add for array words, and the %load/avp0/s added to round
out the combinations. This can make for fewer instructions when
words are padded in arithmetic expressions.
The load-and-add for vectors %load/vp0/s can be combined with the
load-and-add for array words, and the %load/avp0/s added to round
out the combinations. This can make for fewer instructions when
words are padded in arithmetic expressions.
The functor_ref_lookup() function fills its argument in with the
vvp_net_t* pointer that matches the var name, so there is no need
to create the vvp_net_t object before then.
The vpiIndex is really just a different view into the same object,
so implement the trickery needed to support a vpiIndex with the
absolute minimum memory cost.
The %load/vp0 instruction adds a signed value to the signal value being
loaded, but it doesn't allow for a signed source vector. Add the
%load/vp0/s instruction that pads the loaded vector, and add the code
generator details to properly use it.
The vvp_net_fun_t objects, and derived objects, are small, and are
created in large quantities. Tightly pack them into permanently
allocated space in order to save on system allocation overhead, and
thus save overall on memory.
Actually, the immediate value handling is a little chaotic and should
be cleaned up. This patch opens the door for allowing signed immediate
values, and uses them in a few places where they are explicitly handled.
We must go through the opcodes that can take immediate values and make
explicit whether they are signed/unsigned/etc, and what their size
limits are.
Part select of an entire signal returns just the NetESignal itself,
but since part selects are always unsigned, even if the selected
signal is signed, we need to cast the NetESignal to unsigned first.
The part select of a vector is converted by the compiler during
elaboration to a 0-based canonical address. But since it is legal
to address bits below the LSB, the canonical address can be negative.
So make the part select base for selecting from signals work with
signed arithmetic and make the code generator generate negative
indices when needed.
Scheduler cells are small objects that come and go in great quantities.
Even though they are allocated and deallocated a lot, they tend to a
steady state quantity, so put together a heap that is unique for each
cell type.
This heap actually saves memory overall because cells are allocated in
chunks, thus eliminating allocator overhead, and they are pulled/pushed
from/to a heap very quickly so that what overhead remains is slight and
bounded.
The vvp_net_t objects are never deleted, so overload the new operator
to do a more space efficient permanent allocation.
The %assign/v instruction copied the vvp_vector4_t object needlessly
on its way to the scheduler. Eliminate that duplication.(cherry picked from commit d0f303463d)
The vvp_vector8_t constructor and destructor involve memory allocation
so it is best to pass these objects by reference as much as possible.
Also have the islands take more care not to perform resolution if the
inputs aren't really different.
NOTE: This is a port of commit 2f4e5bf5b6
from the "performance" branch, without the resolver scheduling changes.
This was causing test suite variances with pr1820472.v. It looks like
there might be a race in that program anyhow, but for now leave out the
resolver scheduling changes so that the rest of this commit can go in.