Previously V3InlineCFuncs inlined call sites but never deleted the now
dead callees. Also missed a lot of opportunities due to evaluation order.
Rewrite using a graph based algorithm, using only a single traversal of
the netlist. This is clearer, more accurate, and faster at compile time.
Also add a clean -fno-inline-cfuncs disable. Setting the limits to 0
still disables inlining, except of empty functions, which can be inlined
with 0 limits (they are no ops). It will also prune unused functions
without -fno-inline-cfuncs.
Pass now also respects `--output-split`