Remove AstJumpLabel
AstJumpGo now references one if its enclosing AstJumpBlocks, and
branches straight after the referenced block.
That is:
```
JumpBlock a {
...
JumpGo(a);
...
}
// <--- the JumpGo(a) goes here
```
This is sufficient for all use cases and makes control flow much easier to
reason about. As a result, V3Const can optimize a bit more aggressively.
Second half of, and fixes#6216
Somewhat commonly, there is code out there that compares an expression (or
variable) against many different constants, e.g. a one-hot decoder:
```systemverilog
assign oneHot = {x == 3, x == 2, x == 1, x == 0};
```
If the width of the expression is sufficiently large, this can blow up
a GCC pass and take an egregious amount of memory and time to compile.
Adding a new DFG pass that will generate a cheap one-hot decoder:
to compute:
```systemverilog
wire [$bits(x)-1:0] idx = <the expression being compared many times>
reg tab [1<<$bits(x)] = '{default: 0};
reg [$bits(x)-1:0] pre = '0;
always_comb begin
tab[pre] = 0;
tab[idx] = 1;
pre = idx ; // This assignment marked to avoid a false UNOPFTLAT
end
```
We then replace the comparisons `x == CONST` with `tab[CONST]`.
This is generally performance neutral, but avoids the compile time and memory
blowup with GCC (128GB+ -> 1GB in one example).
We do not apply this if the comparisons seem to be part of a `COMPARE ?
val : COND` conditional tree, which the C++ compilers can turn into jump
tables.
This enables all XiangShan configurations from RTLMeter to now build with GCC,
so in this patch we enabled those in the nightly runs.
Having many triggers still hits a bottleneck in LLVM leading to long
compile times.
Instead of setting triggers bit-wise, set them as a whole 64-bit word
when possible. This improves C++ compile times by ~4x on some large
designs and has minor run-time performance benefit.