Change from equality check to >= to allow cells where output
is wider than natural width (zero-extended). Only reject cells
with Y_WIDTH < natural width (truncated).
This fixes test failures while still preventing the truncation
issue identified in widlarizer's feedback.
Only allow rebalancing of cells with "natural" output widths (no truncation).
This prevents equivalence failures when moving operands between adders
with different intermediate truncation points.
For each operation type, the natural width is:
- Addition: max(A_WIDTH, B_WIDTH) + 1 (for carry bit)
- Multiplication: A_WIDTH + B_WIDTH
- Logic ops: max(A_WIDTH, B_WIDTH)
Fixes widlarizer's counterexample in YosysHQ/yosys#5605 where an 8-bit
intermediate wire was intentionally truncating adder results, and
rebalancing would change where that truncation occurred.
This pass converts cascaded chains of arithmetic and logic cells ($add,
$mul, $and, $or, $xor) into balanced binary trees to improve timing
performance in hardware synthesis.
The optimization uses a breadth-first search approach to identify chains
of compatible cells, then recursively constructs balanced trees that
reduce the critical path depth.
Features:
- Supports arithmetic cells: $add, $mul
- Supports logic cells: $and, $or, $xor
- Command-line options: -arith (arithmetic only), -logic (logic only)
- Preserves signed/unsigned semantics
- Comprehensive test suite with 30 test cases
Original implementation by Akash Levy <akash@silimate.com> for Silimate.
Upstreamed from https://github.com/Silimate/yosys
I'm not sure why but this is actually faster than existing `opt_merge` even with
YOSYS_MAX_THREADS=1, for the jpeg synthesis test. 16.0s before, 15.5s after for
end-to-end synthesis.
This avoids changing `assign_map` and `initvals`, which are inputs to the hash function for `known_cells`,
while `known_cells` exists. Changing the hash function for a hashtable while it exists leads to
confusing behavior. That also means the exact behavior of `opt_merge` cannot be reproduced by a
parallel implementation.
This code is quite confusing because there are two "is the cell known" filters
applied, one while building the cell vector and one after building the cell
vector, and they're subtly different. I'm preserving the actual behaviour here
but it looks like there is, or was, a bug here.
This ensures that entering and leaving bufnorm followed by `opt_clean`
is equivalent to just running `opt_clean`.
Also make sure that 'z-$buf cells get techmapped in a compatible way.
We call 'assign_map.set()' below which wipes the `SigMap` and reconstructs it.
This operation is expensive because it scans the whole module. I think it's
better to make heavyweight operations more visible so I'm removing
the less obvious operation.
The previous commit introduced code that optimizes the activation
patterns to be able to generate smaller activation logic. The resulting
supercell was then enqueued as shareable using those optimized
activation patterns. The condition represented by the optimized patterns
is an over-approximation of the actual activiation condition. This means
using it as activiation for the supercell loses precision and pessimises
sharing of the supercell with further cells, breaking the sat/share
test.
This commit fixes that by using the optimized activiation patterns only
for the generation of activation logic and using the original patterns
for enqueuing the supercell.
In case the two sets of activation patterns are mutually exclusive
without considering the logic feeding into the activation signals, an
activation condition can only be relevant if present in both sets with
opposite polarity.
This detects pattern-only mutual exclusion by running an additional SAT
query before importing the input cone logic. If that is already UNSAT,
we remove all non-relevant condition and re-simplify the remaining
patterns.
In cases of pattern-only mutual exclusion, this will often produce much
smaller selection logic and avoid the more costly SAT query that
includes the input cones.
If all the (non-select) inputs of a `$_MUX{4,8,16}_` are undefined, replace it, just like we do for `$mux` and `$_MUX_`.
Add `tests/opt/opt_expr_mux_undef.ys` to verify this.
This doesn't do any const folding on the wide muxes, or shrinking to less wide muxes. It only handles the case where all inputs are 'x and the mux can be completely removed.
- Techlib pmgens are now in relevant techlibs/*.
- `peepopt` pmgens are now in passes/opt.
- `test_pmgen` is still in passes/pmgen.
- Update `Makefile.inc` and `.gitignore` file(s) to match new `*_pm.h` location,
as well as the `#include`s.
- Change default `%_pm.h` make target to `techlibs/%_pm.h` and move it to the
top level Makefile.
- Update pmgen target to use `$(notdir $*)` (where `$*` is the part of the file
name that matched the '%' in the target) instead of `$(subst _pm.h,,$(notdir
$@))`.
The B port is for single-bit summands. These can just as well be
represented as an additional summand on the A port (which supports
summands of arbitrary width). An upcoming `$macc_v2` cell won't be
special-casing single-bit summands in any way.
In preparation, make the following changes:
* remove the `bit_ports` field from the `Macc` helper (instead add any
single-bit summands to `ports` next to other summands)
* leave `B` empty on cells emitted from `Macc::to_cell`