mirror of https://github.com/YosysHQ/yosys.git
164 lines
5.0 KiB
Plaintext
164 lines
5.0 KiB
Plaintext
Clock gating
|
||
|
||
|
||
|
||
need to determine when the D == Q
|
||
|
||
Make sure that the flip flop has a clock but not ce: !ff.has_ce && ff.has_clk
|
||
check if
|
||
|
||
USE sat to determine when the Q is the same as D
|
||
|
||
Q is the feedback, and then there's also the D which does in and select for the mux
|
||
becomes the enable
|
||
|
||
|
||
D = f(Q, other_inputs)
|
||
|
||
!(D ^ f(Q, other_inputs))
|
||
|
||
|
||
Look somewhat like this:
|
||
Q is always going to be D
|
||
But D itself has an enable built into it
|
||
|
||
|
||
So what D really is is:
|
||
Mux (D_r, Q, en) where en is the enable signal
|
||
|
||
Our goal is to find en
|
||
|
||
Equation:
|
||
Q_next = (en ∧ D_r) ∨ (¬en ∧ Q)
|
||
|
||
Equality question:
|
||
!((Q_next) ^ ((en ∧ D_r) ∨ (¬en ∧ Q)))
|
||
|
||
Universal Quantization
|
||
FA(Q_next, D_r, Q) !((Q_next) ^ ((en ∧ D_r) ∨ (¬en ∧ Q)))
|
||
|
||
SAT Equation
|
||
FA(Q_next, D_r, Q) !((Q_next) ^ ((en ∧ D_r) ∨ (¬en ∧ Q)))
|
||
|
||
Now the issue is to determine what the en is and what the D_r in. In order
|
||
to continue using this approach, a way of differentiating that would be needed.
|
||
|
||
Need a simpler approach which just considers all of the inputs and then performs
|
||
SAT. Here's the approach:
|
||
Consider the flip flops which go into D. Consider all of those inputs and seen
|
||
if D != Q is UNSAT. Meaning for that set of inputs into D, D is going to be Q.
|
||
Then try to minimize this set (optimization phase).
|
||
|
||
|
||
1) Find the input's into D and determine if there's any level at which you can
|
||
determine if (Exists(x1, x2, x3, ..., xn) | D != Q) == UNSAT (menaing for that
|
||
combination of inputs of x1, x2, x3, ..., xn, D is always = Q.
|
||
|
||
Algorithm version one:
|
||
This version doesn't take into accound the threshold (doesn't try and insert)
|
||
the same CE into multiple different clocks, it also doesn't do any pre SAT simulation
|
||
optimization. Fruthermore, it also doesn't try and find the minimal set, just a set.
|
||
Lastly, this also does a form of safe clock gating which means that:
|
||
input_set == 0 -> D == Q
|
||
Which means there's cases when input_set may be 1 but D != Q
|
||
|
||
// determines if the input set serves as an enable
|
||
input_set_is_en(input_set, D, Q):
|
||
return (Exists(x1, x2, x3, ..., xn) | D != Q) == UNSAT
|
||
|
||
// determines the input set
|
||
determine_en_rec (input_set&, D, Q):
|
||
if (count(input_set) > N):
|
||
return false:
|
||
if input_set_is_en(input_set, D, Q):
|
||
// for now this returns. Later, when optimizing, this will try and find a smaller subset within the set
|
||
return true
|
||
else:
|
||
// Detemine set of inputs
|
||
input_set_new = Do a BFS on the Data pin in the clock and add those pins to set
|
||
determine_en_rec(input_set_new, D, Q)
|
||
|
||
// create the CE based on the input set
|
||
// adds the CE into the clock
|
||
create_ce_logic(input_set, D, Q, ffData):
|
||
// CE = OR(all signals in input_set)
|
||
// When any input is 1 → CE=1 (update register)
|
||
// When all inputs are 0 → CE=0 (hold, since SAT proved D==Q)
|
||
|
||
if input_set.size() == 1:
|
||
ce_signal = input_set[0]
|
||
else:
|
||
ce_wire = module.addWire(NEW_ID)
|
||
module.addReduceOr(NEW_ID, input_set, ce_wire)
|
||
ce_signal = ce_wire
|
||
|
||
ffData.has_ce = true
|
||
ffData.sig_ce = ce_signal
|
||
ffData.pol_ce = true // active high
|
||
ffData.emit() // rebuild the FF with CE
|
||
|
||
set_ff_ces(design):
|
||
for module in design:
|
||
for cell in module:
|
||
if cell is_builtin_ff:
|
||
ffFata ff = cell;
|
||
if (!ff.has_ce && ff.has_clk && ff.has_d && ff.has_Q):
|
||
input_set = {}
|
||
if (determine_en_rec(input_set, D, Q)):
|
||
create_ce_logic(input_set, D, Q, ffData)
|
||
|
||
// insert the ICG gates based on the new CEs inserted
|
||
pass::call("clockgate", design);
|
||
|
||
|
||
|
||
(input_set=0) AND (D≠Q) == UNSAT
|
||
-> if one of them is 1 and D = Q
|
||
Scenario What happens OK?
|
||
CE=0, D==Q Gate clock, hold value Correct (power saved)
|
||
CE=0, D≠Q Gate clock, lose data BUG
|
||
CE=1, D==Q Clock passes, write same value Correct (wasted power)
|
||
CE=1, D≠Q Clock passes, update register Correct
|
||
|
||
|
||
|
||
|
||
|
||
(input_set=0) AND (D≠Q) == UNSAT ->
|
||
Existential Quantization of input_set such that
|
||
To ensure that if (CE=0) then D==Q:
|
||
((combination of inputs) AND (D≠Q)) = UNSAT
|
||
- This is functionally accurate but there might be cases when CE=1 but D==Q which
|
||
is a waste of power
|
||
To ensure that if (CE=1) then D!=Q:
|
||
((combination of inputs) AND (D==Q)) = UNSAT
|
||
- This alone is risky since there might be combinations such that CE=0 but D!=Q
|
||
which is incorrect behaviour
|
||
|
||
Need to ensure CE=1 <-> D!=Q:
|
||
BOTH conditions must hold:
|
||
1) ((CE=0) AND (D≠Q)) = UNSAT // CE=0 → D==Q (safe to gate)
|
||
2) ((CE=1) AND (D==Q)) = UNSAT // CE=1 → D≠Q (no wasted power)
|
||
|
||
Combined: CE must be the exact boolean function where CE = (D ≠ Q)
|
||
For MUX pattern D = sel ? new_val : Q, CE = sel satisfies both.
|
||
|
||
|
||
In order words, we need
|
||
Exist a combination of inputs such that UNSAT((combination of inputs) ^ (D==Q))
|
||
Which means
|
||
((!COI) && (D!=Q)) && ((COI) && (D==Q))
|
||
|
||
Exit a set of inputs such that COI <-> D != Q
|
||
(COI && (D != Q)) && (!COI && (D == Q))
|
||
|
||
|
||
|
||
The final equation for the UNSAT is:
|
||
((D != Q) != (COI)) -> UNSAT => COI = (D != Q)
|
||
Exists COI such that ((D ^ Q) ^ (COI)) -> UNSAT
|
||
|
||
|
||
So in the new pseudo code algorithm, once the pool<SigBit> is populated (the input_set),
|
||
we create this miter with the exestential quantization with the input_set.
|