We really want lazy processing of concatenation because it has multiple
inputs and lazy processing should (in theory) prevent redundant and
useless propagations through the net.
But enabling it seems to cause many tests in the regression test suite
to produce results that no longer match. Many tests contain races that
interact badly with this feature. So for now, ifdef it out.
It doesn't really make any sense to do lazy processing of part selects,
but it is possible to use the part select position to more thoroughly
check for changes in output and suppress non-changes. In particular,
we only need to check that the output part actually changes, and we
only need to save those bits for the next go-round.
We do want to make sure that the very first input causes an output,
though, so that time-0 values get propagated.
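
A minimal sketch of the part select filtering described above. The
class name, the bits_t typedef and the string-per-bit representation
are assumptions made for illustration, not the actual vvp types:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a 4-state vector: one char per bit
// ('0', '1', 'x' or 'z'), with index 0 as the least significant bit.
typedef std::string bits_t;

class part_select_filter {
  public:
    part_select_filter(std::size_t base, std::size_t wid)
    : base_(base), wid_(wid), seen_input_(false) { }

    // Returns true and fills "out" when the selected part must be
    // propagated; returns false when the change can be suppressed.
    bool recv(const bits_t&input, bits_t&out)
    {
        assert(base_ + wid_ <= input.size());
        bits_t part = input.substr(base_, wid_);
        // The very first input always propagates, so that time-0
        // values get out. After that, only real changes do.
        if (seen_input_ && part == saved_)
            return false;
        seen_input_ = true;
        saved_ = part;   // only the selected bits are kept
        out = part;
        return true;
    }

  private:
    std::size_t base_;
    std::size_t wid_;
    bool seen_input_;
    bits_t saved_;
};
```

Changes to input bits outside the selected part never reach the
output, and the saved state is only as wide as the part itself.
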
This is not a solution to all the problems, but is a better catch-all
than what is currently there. Allow the index field to be a T<> that
accesses the thread to get the address index.
Note that the lexor.lex currently returns the T<> as a T_SYMBOL, and the
users of T_SYMBOL objects need to interpret the meaning. This is
probably not the best idea, in light of all the other *<> formats that
now exist.
Part select of parameter names is fixed up to be structurally similar
to part select of signals, and also to behave similarly. (Though not
identically, for reasons.)
Part selects to signals are allowed to be off the ends of the signal
itself. The bits that are beyond the vector return X. This may mean
creating constant X bits on one or both ends of the result.
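
As a rough illustration of the padding rule (not the elaboration code
itself), assuming a hypothetical helper that keeps one character per
bit with index 0 as the least significant bit:

```cpp
#include <cstddef>
#include <string>

// Select "wid" bits starting at "base" from "sig". The base may be
// negative and the select may run past the top of the signal. Bits
// that fall outside the signal read as 'x'.
std::string part_select_padded(const std::string&sig, long base,
                               std::size_t wid)
{
    std::string res(wid, 'x');
    for (std::size_t idx = 0 ; idx < wid ; idx += 1) {
        long src = base + static_cast<long>(idx);
        if (src >= 0 && src < static_cast<long>(sig.size()))
            res[idx] = sig[static_cast<std::size_t>(src)];
    }
    return res;
}
```

A select that starts one bit below a 4-bit signal and is 6 bits wide
gets an X on each end of the result.
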
This patch cleans up some of the code to use common compiletf
routines where appropriate. It also adds code to print the
number of extra arguments and cleans up the messages a bit.
The vvp_net_t objects are never deleted, so overload the new operator
to do a more space efficient permanent allocation.
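
A sketch of what such a permanent allocator can look like. The
net_node type and the chunk size are assumptions for illustration;
the real vvp_net_t has different members:

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for an object that is never deleted.
struct net_node final {
    void*out;
    void*port[4];

    net_node() : out(nullptr), port{nullptr, nullptr, nullptr, nullptr} { }

    static void* operator new(std::size_t size);
    // Objects are permanent, so delete intentionally does nothing.
    static void operator delete(void*) noexcept { }
};

void* net_node::operator new(std::size_t size)
{
    // Carve objects out of large chunks. This avoids the per-object
    // bookkeeping overhead of the general purpose allocator.
    static const std::size_t CHUNK_COUNT = 4096;
    static char*pool = nullptr;
    static std::size_t remaining = 0;

    assert(size == sizeof(net_node));
    if (remaining == 0) {
        pool = static_cast<char*>(::operator new(CHUNK_COUNT * size));
        remaining = CHUNK_COUNT;
    }
    void*ptr = pool;
    pool += size;
    remaining -= 1;
    return ptr;
}
```

Consecutive allocations land in adjacent slots of the same chunk, so
the space cost per object is just sizeof the object.
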
The %assign/v instruction copied the vvp_vector4_t object needlessly
on its way to the scheduler. Eliminate that duplication.
The concat and resolv functors are best evaluated lazily, because each
evaluation is costly and there is a high probability that an evaluation
will be invalidated when new input comes in.
Also optimize the recv_vec4_pv method of the resolver, which is
commonly used, and adjust the order of handling of vvp_fun_part to
work more efficiently.
The vvp_vector8_t constructor and destructor involve memory allocation
so it is best to pass these objects by reference as much as possible.
Also rework the resolver functor to only perform resolution after inputs
are in so that it doesn't get needlessly repeated. This eliminates many
resolve function calls, as well as activations throughout the net.
Also have the islands take more care not to perform resolution if the
inputs aren't really different.
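
The deferred resolution can be sketched as follows. The toy scheduler,
the lazy_resolver class and the strength-free resolution rule are all
simplifications assumed for this example, not the vvp implementation:

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Toy event queue standing in for the simulator scheduler.
struct scheduler {
    std::queue<std::function<void()> > pending;
    void run()
    {
        while (!pending.empty()) {
            std::function<void()> fun = pending.front();
            pending.pop();
            fun();
        }
    }
};

class lazy_resolver {
  public:
    lazy_resolver(scheduler&sched, std::size_t nports)
    : sched_(sched), in_(nports, 'z'), out_('z'),
      scheduled_(false), calls_(0) { }

    void recv(std::size_t port, char val)
    {
        if (in_[port] == val) return;   // input not really different
        in_[port] = val;
        if (scheduled_) return;         // a resolve is already pending
        scheduled_ = true;
        sched_.pending.push([this]() { resolve_(); });
    }

    char value() const { return out_; }
    unsigned resolve_calls() const { return calls_; }

  private:
    // Simplified wired resolution: 'z' yields, conflicts give 'x'.
    void resolve_()
    {
        scheduled_ = false;
        calls_ += 1;
        char res = 'z';
        for (char val : in_) {
            if (val == 'z') continue;
            if (res == 'z') res = val;
            else if (res != val) res = 'x';
        }
        out_ = res;
    }

    scheduler&sched_;
    std::vector<char> in_;
    char out_;
    bool scheduled_;
    unsigned calls_;
};
```

Three inputs arriving in the same time step cost one resolution
instead of three, and a repeated identical input costs none.
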
Parameter value ranges support excluding a single point as well as a
range, so add the syntax to support that case. Internally it is
handled as a degenerate range, but the parse and initial elaboration
need to know about it.
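
To illustrate the degenerate range idea, here is a sketch with assumed
names and with integer, inclusive bounds only. The real range records
also carry open/closed end information:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified range record.
struct value_range {
    bool exclude;   // true for "exclude", false for "from"
    long low;
    long high;      // inclusive bounds
};

// An excluded point is a range whose two ends are the same value.
value_range exclude_point(long val)
{
    value_range res = { true, val, val };
    return res;
}

// A value is allowed if no exclude range contains it and, when any
// "from" ranges exist, at least one of them contains it.
bool value_allowed(const std::vector<value_range>&ranges, long val)
{
    bool have_from = false;
    bool in_from = false;
    for (std::size_t idx = 0 ; idx < ranges.size() ; idx += 1) {
        const value_range&cur = ranges[idx];
        bool hit = cur.low <= val && val <= cur.high;
        if (cur.exclude) {
            if (hit) return false;
        } else {
            have_from = true;
            if (hit) in_from = true;
        }
    }
    return !have_from || in_from;
}
```

With this representation the range checking code needs no special
case for points. Only the parser has to tell the two forms apart.
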
If a memory word was accessed before it was defined the
code was returning a zero width vector result. Now it
returns an appropriately sized vector of 'x'.
The memory opcodes %assign/mv, %load/mv and %set/mv
were removed by a previous patch. This one removes
the documentation from opcodes.txt. It also removes
the documentation for the .mem* statements for the
same reason.
Recursive branch resolution was scanning every branch end, even though
many branch ends share ports and need not be repeatedly scanned. Handle
marks and flags to cut off recursion where it is not needed so as to
save much run time.
Conflicts:
tgt-vvp/vvp_scope.c
Note that draw_net_input.c takes in a lot of the code that used
to be in vvp_scope.c, so some changes may have been lost.
This patch cleans up the dump routines and adds file and
line number information for errors. It also adds some of
the missing MemoryWord properties so they can now be
dumped and monitored correctly.
This patch adds $simparam and $simparam$str from Verilog-A.
The analog simulator parameters return 0.0 or N/A. The
vvp_cpu_wordsize system function has been moved into the
$simparam call and is now named CPUWordSize.
This patch also starts the factoring of common code in the
vpi directory. Some routines were renamed.
The priv.c file was renamed to sys_priv.c to match the
include file.
System functions can now have strings put to their output.
Rather than join islands while branches are initially created, save the
island creation for the end. This way, the process is actually recursive
and greedy, reliably collecting branches into islands without conflict.
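
One way to picture the deferred, greedy collection is the sketch
below. The island_builder class and the node-pair view of a branch are
assumptions for the example. It uses a simple quadratic scan rather
than whatever bookkeeping the real code keeps:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

class island_builder {
  public:
    // Branches are only recorded here. No islands are made yet.
    std::size_t add_branch(std::size_t node_a, std::size_t node_b)
    {
        branches_.push_back(std::make_pair(node_a, node_b));
        island_.push_back(-1);
        return branches_.size() - 1;
    }

    // Called once, after all branches exist. Returns the island count.
    int join_islands()
    {
        int count = 0;
        for (std::size_t idx = 0 ; idx < branches_.size() ; idx += 1) {
            if (island_[idx] < 0) {
                collect_(idx, count);
                count += 1;
            }
        }
        return count;
    }

    int island_of(std::size_t branch) const { return island_[branch]; }

  private:
    // Greedily pull every connected, unclaimed branch into island "id".
    void collect_(std::size_t cur, int id)
    {
        if (island_[cur] >= 0) return;
        island_[cur] = id;
        for (std::size_t idx = 0 ; idx < branches_.size() ; idx += 1) {
            if (island_[idx] < 0 && shares_node_(cur, idx))
                collect_(idx, id);
        }
    }

    bool shares_node_(std::size_t lef, std::size_t rig) const
    {
        const std::pair<std::size_t,std::size_t>&a = branches_[lef];
        const std::pair<std::size_t,std::size_t>&b = branches_[rig];
        return a.first == b.first || a.first == b.second
            || a.second == b.first || a.second == b.second;
    }

    std::vector<std::pair<std::size_t,std::size_t> > branches_;
    std::vector<int> island_;
};
```

Because a branch is claimed before its neighbors are visited, no
branch can end up in two islands and no islands ever need merging.
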
Fold the bi-directional part select into the pass switch (tran) support
so that it can be really bi-directional. This involves adding a new
tranvp device that does part select in tran islands, and reworking the
tran island resolution to handle non-identical nodes. This will be needed
for resistive tran devices anyhow.
The draw_net_input function is modified to account for nexus that is
a port of an island. Draw the ports (and the islands if necessary)
to the island and use the port output for the nexus instead of the
port input. This allows the bi-directional behavior of the port to
interpose itself in the data flow.
In the process of these changes, the draw_net_input function was
reorganized, and the considerable amount of code for it was
moved to a file of its own. (vvp_scope.c is pretty unruly.)
NetTran devices must be collected into islands because they are all
a bi-directional mass. This is how vvp will process them and the code
generator will need a head start organizing them.
The vvp_island classes are added, as well as support for tranif nodes
that use this concept. The result is a working implementation for
tranif0 and tranif1.
In the process, the symbol table functions were cleaned up and made
into templates for better type safety, and the vvp_net_ptr_t was
generalized so that it can be used by the branches in the island
implementation.
Also fix up the array handling to use the better symbol table support,
and to remember to clear its own table when linking is done.
The code generator was reading the wrong node of a bi-directional
part select. This happens exclusively with part selects passed to
bi-directional ports, so it was rare. The result was that the non-part
selected part could get an incorrect value.
The multiply does not need to run through all the combinations of digit
products, because the higher ones cannot add into the result. Fix the
iteration to limit the scan.
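
A sketch of the limited scan, assuming 32-bit digits stored least
significant first and a result that is as wide as the operands. The
function name and digit layout are illustrative only:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Multiply two equal-width multi-digit numbers and keep only the
// low digits, i.e. the product modulo 2**(32*width).
std::vector<std::uint32_t> multiply_truncated(
    const std::vector<std::uint32_t>&a,
    const std::vector<std::uint32_t>&b)
{
    assert(a.size() == b.size());
    const std::size_t cnt = a.size();
    std::vector<std::uint32_t> res(cnt, 0);

    for (std::size_t i = 0 ; i < cnt ; i += 1) {
        std::uint64_t carry = 0;
        // A digit product a[i]*b[j] lands at position i+j or higher,
        // so products with i+j >= cnt cannot add into the result.
        for (std::size_t j = 0 ; i + j < cnt ; j += 1) {
            std::uint64_t tmp = static_cast<std::uint64_t>(a[i]) * b[j]
                              + res[i+j] + carry;
            res[i+j] = static_cast<std::uint32_t>(tmp);
            carry = tmp >> 32;
        }
    }
    return res;
}
```

The inner loop bound shrinks as i grows, which roughly halves the
number of digit products compared to scanning every combination.
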