verilator/docs/guide
Thomas Santerre bd6b9161dc
Optimize bit-scan loops into $mostsetbitp1 / $countones (#7822)
Recognize the common single-bit scan loop idioms in V3Unroll (before it
unrolls) and lower them to bit-reduction primitives, replacing a literal
W-iteration loop with one intrinsic-backed expression:

  target=0; for (i=0;i<W;i++) if (vec[i]) target = i + 1;      -> $mostsetbitp1(vec)
  target=0; for (i=0;i<W;i++) if (vec[i]) target = target + 1; -> $countones(vec)
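
As a rough illustration of what the two idioms compute, the loops can be modelled in C++ on a vector of at most 64 bits. The function names here are hypothetical and are not part of Verilator; this is only a sketch of the loop semantics that the lowering replaces.

```cpp
#include <cstdint>

// Model of: target=0; for (i=0;i<W;i++) if (vec[i]) target = i + 1;
// Ends as (index of the highest set bit) + 1, or 0 when no bit is set.
// Assumes width <= 64.
inline unsigned scanMostSetBitP1(uint64_t vec, unsigned width) {
    unsigned target = 0;
    for (unsigned i = 0; i < width; ++i) {
        if ((vec >> i) & 1U) target = i + 1;
    }
    return target;
}

// Model of: target=0; for (i=0;i<W;i++) if (vec[i]) target = target + 1;
// Ends as the number of set bits in the low 'width' bits.
inline unsigned scanCountOnes(uint64_t vec, unsigned width) {
    unsigned target = 0;
    for (unsigned i = 0; i < width; ++i) {
        if ((vec >> i) & 1U) target = target + 1;
    }
    return target;
}
```

For vec = 0x28 (bits 3 and 5 set) and width 8, the first loop yields 6 and the second yields 2, which are the values $mostsetbitp1 and $countones are expected to return for that vector.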

The leading-one form lowers to a new AstMostSetBitP1 node, emitted as
VL_MOSTSETBITP1_{I,Q,W}; those runtime helpers now use __builtin_clz where
available (same pattern as VL_REDXOR's __builtin_parity), with the existing
bit scan as fallback.  The count-ones form reuses AstCountOnes ($countones,
popcount); as the DFG requires a 32-bit countones result it is built at 32
bits and narrowed to the accumulator width with a select.
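
A minimal sketch of the clz-with-fallback pattern follows. It assumes a GCC/Clang-style __builtin_clz and a 32-bit unsigned int. The function names are hypothetical stand-ins and not the actual VL_MOSTSETBITP1_{I,Q,W} implementations, whose signatures are not shown here.

```cpp
#include <cstdint>

// Hypothetical 32-bit helper: position of the highest set bit plus one, 0 if none.
inline unsigned mostSetBitP1_32(uint32_t v) {
    if (v == 0) return 0;  // __builtin_clz(0) is undefined, so handle zero first
#if defined(__GNUC__) || defined(__clang__)
    return 32U - static_cast<unsigned>(__builtin_clz(v));
#else
    unsigned pos = 0;  // Fallback: plain bit scan
    while (v) {
        ++pos;
        v >>= 1;
    }
    return pos;
#endif
}

// Hypothetical 64-bit helper, same idea with __builtin_clzll.
inline unsigned mostSetBitP1_64(uint64_t v) {
    if (v == 0) return 0;
#if defined(__GNUC__) || defined(__clang__)
    return 64U - static_cast<unsigned>(__builtin_clzll(v));
#else
    unsigned pos = 0;
    while (v) {
        ++pos;
        v >>= 1;
    }
    return pos;
#endif
}

// Hypothetical wide helper over little-endian 32-bit words (words[0] holds bits 31:0).
// Scans from the top word down and defers to the 32-bit helper on the first nonzero word.
inline unsigned mostSetBitP1_wide(const uint32_t* words, int nwords) {
    for (int w = nwords - 1; w >= 0; --w) {
        if (words[w]) return static_cast<unsigned>(w) * 32U + mostSetBitP1_32(words[w]);
    }
    return 0;
}
```

For an 80-bit vector stored in three words, a single set bit at position 70 lands in bit 6 of words[2], so the wide sketch returns 2*32 + 7 = 71.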

Matching is structural to stay sound: the index must start at 0, increment
by exactly 1, and scan all W==width(vec) bits via a single 1-bit select of a
distinct vector, with the target pre-zeroed and no else branch.  The loop
bound must be a strict ascending comparison, written either as 'idx < W' or
'W > idx', signed or unsigned (Lt/LtS/Gt/GtS).  Gated by -fbit-scan-loops
(on at -O).
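
The conditions above can be summarized as a single predicate over a description of the loop. The sketch below is purely illustrative: ScanLoopShape, CmpOp and isBitScanCandidate are hypothetical names invented for this example, and the real matcher in V3Unroll works on AST nodes rather than on a flat struct like this.

```cpp
// Hypothetical comparison kinds, mirroring the Lt/LtS/Gt/GtS forms named above.
enum class CmpOp { Lt, LtS, Gt, GtS, Other };

// Hypothetical flattened description of a candidate loop.
struct ScanLoopShape {
    long long start;        // Initial index value
    long long step;         // Index increment per iteration
    CmpOp cmp;              // Comparison used in the loop condition
    bool idxOnLeft;         // True for 'idx < W', false for 'W > idx'
    unsigned bound;         // The constant W in the condition
    unsigned vecWidth;      // Width of the scanned vector
    unsigned selectWidth;   // Width of the select used in the 'if'
    bool selectIndexIsIdx;  // Select index is exactly idx (not idx+1, idx*2+1, ...)
    bool vecDistinct;       // Scanned vector is distinct from the target
    bool targetPreZeroed;   // target=0 precedes the loop
    bool hasElse;           // The 'if' has an else branch
};

// Strict ascending bound: 'idx < W' (Lt/LtS) or 'W > idx' (Gt/GtS).
inline bool boundIsStrictAscending(const ScanLoopShape& s) {
    const bool lt = (s.cmp == CmpOp::Lt || s.cmp == CmpOp::LtS);
    const bool gt = (s.cmp == CmpOp::Gt || s.cmp == CmpOp::GtS);
    return (lt && s.idxOnLeft) || (gt && !s.idxOnLeft);
}

// All listed conditions must hold for the loop to be a lowering candidate.
inline bool isBitScanCandidate(const ScanLoopShape& s) {
    return s.start == 0 && s.step == 1 && boundIsStrictAscending(s)
           && s.bound == s.vecWidth && s.selectWidth == 1 && s.selectIndexIsIdx
           && s.vecDistinct && s.targetPreZeroed && !s.hasElse;
}
```

Under this model a step-2 loop, a loop starting at 1, or a loop whose bound differs from the vector width is rejected, which matches the negative cases the commit says are left unlowered.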

Adds t_bit_scan_loops (I/Q/W, count-ones and unsigned-index positives;
step-2, start-1, idx*2+1, vec[idx+1], target=idx and W!=width negatives, all
self-checked and asserted via --stats not to lower) plus t_bit_scan_loops_off
for the disable flag.

Motivated by a transformer inference design whose 80-bit leading-one detector
ran every cycle (~37% of runtime); the lowering is worth ~39% there.
2026-06-24 10:43:05 +01:00
figures Verilator_gantt now shows the predicted mtask times, eval times, and additional statistics. 2021-09-23 22:59:36 -04:00
changes.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
conf.py Commentary: Changes update 2026-04-23 00:44:50 -04:00
connecting.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
contributing.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
contributors.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
control.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
copyright.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
deprecations.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
environment.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
example_binary.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
example_cc.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
example_common_install.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
example_dist.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
example_sc.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
examples.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
exe_sim.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
exe_verilator.rst Optimize bit-scan loops into $mostsetbitp1 / $countones (#7822) 2026-06-24 10:43:05 +01:00
exe_verilator_coverage.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
exe_verilator_gantt.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
exe_verilator_profcfunc.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
executables.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
extensions.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
faq.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
files.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
index.rst Commentary: Use standard multiline rst comments, other cleanups 2026-06-18 21:58:01 -04:00
install-cmake.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
install.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
languages.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
overview.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
simulating.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
verilating.rst Commentary: Make RST documents round-trip clean. No output change intended. 2026-06-21 10:15:47 -04:00
warnings.rst Support NBAs in initial blocks (#7754) 2026-06-20 17:23:05 -04:00