This code is not thread safe. Specifically AstNode constructors are not
thread safe, as they may create entries in the shared Dtype table via
which can be racy.
Add a new pass to split up (recursively):
foo = {l, r};
into the following, with the right indices, iff the concatenation
straddles a wide word boundary.
foo[_:_] = r;
foo[_:_] = l;
This eliminates more wide temporaries.
Another 23% speedup on VeeR EH2 high_perf. Also brings the predicted
stack size from 8M to 40k.