mirror of https://github.com/YosysHQ/yosys.git
295 lines
8.6 KiB
Plaintext
295 lines
8.6 KiB
Plaintext
Clock gating
|
||
|
||
|
||
|
||
need to determine when the D == Q
|
||
|
||
Make sure that the flip flop has a clock but not ce: !ff.has_ce && ff.has_clk
|
||
check if
|
||
|
||
USE sat to determine when the Q is the same as D
|
||
|
||
Q is the feedback, and then there's also the D which does in and select for the mux
|
||
becomes the enable
|
||
|
||
|
||
D = f(Q, other_inputs)
|
||
|
||
!(D ^ f(Q, other_inputs))
|
||
|
||
|
||
Look somewhat like this:
|
||
Q is always going to be D
|
||
But D itself has an enable built into it
|
||
|
||
|
||
So what D really is is:
|
||
Mux (D_r, Q, en) where en is the enable signal
|
||
|
||
Our goal is to find en
|
||
|
||
Equation:
|
||
Q_next = (en ∧ D_r) ∨ (¬en ∧ Q)
|
||
|
||
Equality question:
|
||
!((Q_next) ^ ((en ∧ D_r) ∨ (¬en ∧ Q)))
|
||
|
||
Universal Quantization
|
||
FA(Q_next, D_r, Q) !((Q_next) ^ ((en ∧ D_r) ∨ (¬en ∧ Q)))
|
||
|
||
SAT Equation
|
||
FA(Q_next, D_r, Q) !((Q_next) ^ ((en ∧ D_r) ∨ (¬en ∧ Q)))
|
||
|
||
Now the issue is to determine what the en is and what the D_r in. In order
|
||
to continue using this approach, a way of differentiating that would be needed.
|
||
|
||
Need a simpler approach which just considers all of the inputs and then performs
|
||
SAT. Here's the approach:
|
||
Consider the flip flops which go into D. Consider all of those inputs and seen
|
||
if D != Q is UNSAT. Meaning for that set of inputs into D, D is going to be Q.
|
||
Then try to minimize this set (optimization phase).
|
||
|
||
|
||
1) Find the input's into D and determine if there's any level at which you can
|
||
determine if (Exists(x1, x2, x3, ..., xn) | D != Q) == UNSAT (menaing for that
|
||
combination of inputs of x1, x2, x3, ..., xn, D is always = Q.
|
||
|
||
Algorithm version one:
|
||
This version doesn't take into accound the threshold (doesn't try and insert)
|
||
the same CE into multiple different clocks, it also doesn't do any pre SAT simulation
|
||
optimization. Fruthermore, it also doesn't try and find the minimal set, just a set.
|
||
Lastly, this also does a form of safe clock gating which means that:
|
||
input_set == 0 -> D == Q
|
||
Which means there's cases when input_set may be 1 but D != Q
|
||
|
||
// determines if the input set serves as an enable
|
||
input_set_is_en(input_set, D, Q):
|
||
return (Exists(x1, x2, x3, ..., xn) | D != Q) == UNSAT
|
||
|
||
// determines the input set
|
||
determine_en_rec (input_set&, D, Q):
|
||
if (count(input_set) > N):
|
||
return false:
|
||
if input_set_is_en(input_set, D, Q):
|
||
// for now this returns. Later, when optimizing, this will try and find a smaller subset within the set
|
||
return true
|
||
else:
|
||
// Detemine set of inputs
|
||
input_set_new = Do a BFS on the Data pin in the clock and add those pins to set
|
||
determine_en_rec(input_set_new, D, Q)
|
||
|
||
// create the CE based on the input set
|
||
// adds the CE into the clock
|
||
create_ce_logic(input_set, D, Q, ffData):
|
||
// CE = OR(all signals in input_set)
|
||
// When any input is 1 → CE=1 (update register)
|
||
// When all inputs are 0 → CE=0 (hold, since SAT proved D==Q)
|
||
|
||
if input_set.size() == 1:
|
||
ce_signal = input_set[0]
|
||
else:
|
||
ce_wire = module.addWire(NEW_ID)
|
||
module.addReduceOr(NEW_ID, input_set, ce_wire)
|
||
ce_signal = ce_wire
|
||
|
||
ffData.has_ce = true
|
||
ffData.sig_ce = ce_signal
|
||
ffData.pol_ce = true // active high
|
||
ffData.emit() // rebuild the FF with CE
|
||
|
||
set_ff_ces(design):
|
||
for module in design:
|
||
for cell in module:
|
||
if cell is_builtin_ff:
|
||
ffFata ff = cell;
|
||
if (!ff.has_ce && ff.has_clk && ff.has_d && ff.has_Q):
|
||
input_set = {}
|
||
if (determine_en_rec(input_set, D, Q)):
|
||
create_ce_logic(input_set, D, Q, ffData)
|
||
|
||
// insert the ICG gates based on the new CEs inserted
|
||
pass::call("clockgate", design);
|
||
|
||
|
||
|
||
(input_set=0) AND (D≠Q) == UNSAT
|
||
-> if one of them is 1 and D = Q
|
||
Scenario What happens OK?
|
||
CE=0, D==Q Gate clock, hold value Correct (power saved)
|
||
CE=0, D≠Q Gate clock, lose data BUG
|
||
CE=1, D==Q Clock passes, write same value Correct (wasted power)
|
||
CE=1, D≠Q Clock passes, update register Correct
|
||
|
||
|
||
|
||
|
||
|
||
(input_set=0) AND (D≠Q) == UNSAT ->
|
||
Existential Quantization of input_set such that
|
||
To ensure that if (CE=0) then D==Q:
|
||
((combination of inputs) AND (D≠Q)) = UNSAT
|
||
- This is functionally accurate but there might be cases when CE=1 but D==Q which
|
||
is a waste of power
|
||
To ensure that if (CE=1) then D!=Q:
|
||
((combination of inputs) AND (D==Q)) = UNSAT
|
||
- This alone is risky since there might be combinations such that CE=0 but D!=Q
|
||
which is incorrect behaviour
|
||
|
||
Need to ensure CE=1 <-> D!=Q:
|
||
BOTH conditions must hold:
|
||
1) ((CE=0) AND (D≠Q)) = UNSAT // CE=0 → D==Q (safe to gate)
|
||
2) ((CE=1) AND (D==Q)) = UNSAT // CE=1 → D≠Q (no wasted power)
|
||
|
||
Combined: CE must be the exact boolean function where CE = (D ≠ Q)
|
||
For MUX pattern D = sel ? new_val : Q, CE = sel satisfies both.
|
||
|
||
|
||
In order words, we need
|
||
Exist a combination of inputs such that UNSAT((combination of inputs) ^ (D==Q))
|
||
Which means
|
||
((!COI) && (D!=Q)) && ((COI) && (D==Q))
|
||
|
||
Exit a set of inputs such that COI <-> D != Q
|
||
(COI && (D != Q)) && (!COI && (D == Q))
|
||
|
||
|
||
|
||
The final equation for the UNSAT is:
|
||
((D != Q) != (COI)) -> UNSAT => COI = (D != Q)
|
||
Exists COI such that ((D ^ Q) ^ (COI)) -> UNSAT
|
||
|
||
|
||
So in the new pseudo code algorithm, once the pool<SigBit> is populated (the input_set),
|
||
we create this miter with the exestential quantization with the input_set.
|
||
|
||
The issue is that since the input set isn't a boolean function (is a BFS traversal),
|
||
we need to manually create a boolean function out of the input set.
|
||
|
||
Another issue is that we must ensure that these somehow impact the D and/or the Q. This
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
Future TODOs:
|
||
1) Recursively minimize the set which is actually needed
|
||
2) Add threshold (how many flops depend on the same enable signal)
|
||
3) Add different setting for type of mitter (maybe add just the !COI -> D == Q)
|
||
for a weaker clock gate (still consumes power if COI and D == Q but might be faster)
|
||
4) Deal with posede vs negedge of the clocks
|
||
5) Experiment with different logical combinations of the COI set (rather than just
|
||
or-ing them all together)
|
||
6) Consider pruning ezSAT expressions list — accumulates across queries, may cause memory growth
|
||
|
||
|
||
=== FIXED input_set_is_enable IMPLEMENTATIONS ===
|
||
|
||
// VERSION 1: Safe clock gating (current approach, cleaned up)
|
||
// Checks: (input_set=0) AND (D≠Q) == UNSAT
|
||
// Meaning: when all inputs are 0, D is guaranteed to equal Q
|
||
bool input_set_is_enable_safe(const pool<SigBit> &input_set, SigSpec sig_d, SigSpec sig_q)
|
||
{
|
||
if (input_set.empty())
|
||
return false;
|
||
|
||
ezSatPtr ez;
|
||
SatGen satgen(ez.get(), &sigmap);
|
||
|
||
// Import circuit behavior
|
||
for (auto cell : module->cells())
|
||
satgen.importCell(cell);
|
||
|
||
// Import D and Q
|
||
std::vector<int> d_vec = satgen.importSigSpec(sig_d);
|
||
std::vector<int> q_vec = satgen.importSigSpec(sig_q);
|
||
|
||
// Constraint 1: All input_set bits = 0
|
||
for (auto bit : input_set) {
|
||
int bit_var = satgen.importSigSpec(SigSpec(bit))[0];
|
||
ez->assume(ez->NOT(bit_var));
|
||
}
|
||
|
||
// Constraint 2: D != Q
|
||
ez->assume(ez->vec_ne(d_vec, q_vec));
|
||
|
||
// If UNSAT: no way for D≠Q when inputs=0 → valid enable
|
||
return !ez->solve();
|
||
}
|
||
|
||
// VERSION 2: Exact clock gating (stronger, no wasted power)
|
||
// Checks: (D≠Q) XOR (OR(input_set)) == UNSAT
|
||
// Meaning: COI is exactly equivalent to D≠Q
|
||
bool input_set_is_enable_exact(const pool<SigBit> &input_set, SigSpec sig_d, SigSpec sig_q)
|
||
{
|
||
if (input_set.empty())
|
||
return false;
|
||
|
||
ezSatPtr ez;
|
||
SatGen satgen(ez.get(), &sigmap);
|
||
|
||
// Import circuit behavior
|
||
for (auto cell : module->cells())
|
||
satgen.importCell(cell);
|
||
|
||
// Import D and Q
|
||
std::vector<int> d_vec = satgen.importSigSpec(sig_d);
|
||
std::vector<int> q_vec = satgen.importSigSpec(sig_q);
|
||
|
||
// Build COI = OR(input_set)
|
||
std::vector<int> input_vars;
|
||
for (auto bit : input_set)
|
||
input_vars.push_back(satgen.importSigSpec(SigSpec(bit))[0]);
|
||
int coi = ez->expression(ezSAT::OpOr, input_vars);
|
||
|
||
// Build D != Q (single bit: is any bit different?)
|
||
int d_ne_q = ez->vec_ne(d_vec, q_vec);
|
||
|
||
// Constraint: COI XOR (D≠Q) — want this UNSAT (meaning COI ↔ D≠Q)
|
||
ez->assume(ez->XOR(coi, d_ne_q));
|
||
|
||
// If UNSAT: COI is exactly when D≠Q → perfect enable
|
||
return !ez->solve();
|
||
}
|
||
|
||
|
||
|
||
|
||
Setting up the SAT condition:
|
||
- Need to have the equation for Q, need to have the equation for D
|
||
need to XOR those equations, need to XOR that equation with the new
|
||
one. Need to make sure that that's never SAT.
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
|
||
Ok so I have this idea:
|
||
|
||
|
||
|
||
|
||
I'm doing to take the input variables, universally quantize them and also take D and Q and universilly quantise them
|
||
|
||
Then I have this formula SAT((D^Q) !^ MUX))
|
||
|
||
Where the MUX has (for the select inputs the input values, and then has random variables d_0 to D_2^n when there's n inputs.
|
||
|
||
SAT returns the combination of the values for d which make this work.
|
||
|
||
But in this case, how can I go from determining the values of the Ds to determine the gates and converting that to combinational logic? And also, how can I Universially Quantize the other values? |