diff --git a/notes.txt b/notes.txt deleted file mode 100644 index 1c2755f2b..000000000 --- a/notes.txt +++ /dev/null @@ -1,373 +0,0 @@ -Clock gating - - - -need to determine when the D == Q - -Make sure that the flip flop has a clock but not ce: !ff.has_ce && ff.has_clk -check if - -USE sat to determine when the Q is the same as D - -Q is the feedback, and then there's also the D which does in and select for the mux -becomes the enable - - -D = f(Q, other_inputs) - -!(D ^ f(Q, other_inputs)) - - -Look somewhat like this: - Q is always going to be D - But D itself has an enable built into it - - - So what D really is is: - Mux (D_r, Q, en) where en is the enable signal - - Our goal is to find en - - Equation: - Q_next = (en ∧ D_r) ∨ (¬en ∧ Q) - - Equality question: - !((Q_next) ^ ((en ∧ D_r) ∨ (¬en ∧ Q))) - - Universal Quantization - FA(Q_next, D_r, Q) !((Q_next) ^ ((en ∧ D_r) ∨ (¬en ∧ Q))) - - SAT Equation - FA(Q_next, D_r, Q) !((Q_next) ^ ((en ∧ D_r) ∨ (¬en ∧ Q))) - -Now the issue is to determine what the en is and what the D_r in. In order -to continue using this approach, a way of differentiating that would be needed. - -Need a simpler approach which just considers all of the inputs and then performs -SAT. Here's the approach: - Consider the flip flops which go into D. Consider all of those inputs and seen -if D != Q is UNSAT. Meaning for that set of inputs into D, D is going to be Q. -Then try to minimize this set (optimization phase). - - -1) Find the input's into D and determine if there's any level at which you can -determine if (Exists(x1, x2, x3, ..., xn) | D != Q) == UNSAT (menaing for that -combination of inputs of x1, x2, x3, ..., xn, D is always = Q. - -Algorithm version one: -This version doesn't take into accound the threshold (doesn't try and insert) -the same CE into multiple different clocks, it also doesn't do any pre SAT simulation -optimization. Fruthermore, it also doesn't try and find the minimal set, just a set. -Lastly, this also does a form of safe clock gating which means that: - input_set == 0 -> D == Q -Which means there's cases when input_set may be 1 but D != Q - -// determines if the input set serves as an enable -input_set_is_en(input_set, D, Q): - return (Exists(x1, x2, x3, ..., xn) | D != Q) == UNSAT - -// determines the input set -determine_en_rec (input_set&, D, Q): - if (count(input_set) > N): - return false: - if input_set_is_en(input_set, D, Q): - // for now this returns. Later, when optimizing, this will try and find a smaller subset within the set - return true - else: - // Detemine set of inputs - input_set_new = Do a BFS on the Data pin in the clock and add those pins to set - determine_en_rec(input_set_new, D, Q) - -// create the CE based on the input set -// adds the CE into the clock -create_ce_logic(input_set, D, Q, ffData): - // CE = OR(all signals in input_set) - // When any input is 1 → CE=1 (update register) - // When all inputs are 0 → CE=0 (hold, since SAT proved D==Q) - - if input_set.size() == 1: - ce_signal = input_set[0] - else: - ce_wire = module.addWire(NEW_ID) - module.addReduceOr(NEW_ID, input_set, ce_wire) - ce_signal = ce_wire - - ffData.has_ce = true - ffData.sig_ce = ce_signal - ffData.pol_ce = true // active high - ffData.emit() // rebuild the FF with CE - -set_ff_ces(design): - for module in design: - for cell in module: - if cell is_builtin_ff: - ffFata ff = cell; - if (!ff.has_ce && ff.has_clk && ff.has_d && ff.has_Q): - input_set = {} - if (determine_en_rec(input_set, D, Q)): - create_ce_logic(input_set, D, Q, ffData) - - // insert the ICG gates based on the new CEs inserted - pass::call("clockgate", design); - - - -(input_set=0) AND (D≠Q) == UNSAT - -> if one of them is 1 and D = Q -Scenario What happens OK? -CE=0, D==Q Gate clock, hold value Correct (power saved) -CE=0, D≠Q Gate clock, lose data BUG -CE=1, D==Q Clock passes, write same value Correct (wasted power) -CE=1, D≠Q Clock passes, update register Correct - - - - - -(input_set=0) AND (D≠Q) == UNSAT -> -Existential Quantization of input_set such that -To ensure that if (CE=0) then D==Q: - ((combination of inputs) AND (D≠Q)) = UNSAT - - This is functionally accurate but there might be cases when CE=1 but D==Q which - is a waste of power -To ensure that if (CE=1) then D!=Q: - ((combination of inputs) AND (D==Q)) = UNSAT - - This alone is risky since there might be combinations such that CE=0 but D!=Q - which is incorrect behaviour - -Need to ensure CE=1 <-> D!=Q: - BOTH conditions must hold: - 1) ((CE=0) AND (D≠Q)) = UNSAT // CE=0 → D==Q (safe to gate) - 2) ((CE=1) AND (D==Q)) = UNSAT // CE=1 → D≠Q (no wasted power) - - Combined: CE must be the exact boolean function where CE = (D ≠ Q) - For MUX pattern D = sel ? new_val : Q, CE = sel satisfies both. - - -In order words, we need -Exist a combination of inputs such that UNSAT((combination of inputs) ^ (D==Q)) -Which means -((!COI) && (D!=Q)) && ((COI) && (D==Q)) - -Exit a set of inputs such that COI <-> D != Q -(COI && (D != Q)) && (!COI && (D == Q)) - - - -The final equation for the UNSAT is: - ((D != Q) != (COI)) -> UNSAT => COI = (D != Q) - Exists COI such that ((D ^ Q) ^ (COI)) -> UNSAT - - -So in the new pseudo code algorithm, once the pool is populated (the input_set), -we create this miter with the exestential quantization with the input_set. - -The issue is that since the input set isn't a boolean function (is a BFS traversal), -we need to manually create a boolean function out of the input set. - -Another issue is that we must ensure that these somehow impact the D and/or the Q. This - - - - - - - - - - - - - - - - - -Future TODOs: -1) Recursively minimize the set which is actually needed -2) Add threshold (how many flops depend on the same enable signal) -3) Add different setting for type of mitter (maybe add just the !COI -> D == Q) - for a weaker clock gate (still consumes power if COI and D == Q but might be faster) -4) Deal with posede vs negedge of the clocks -5) Experiment with different logical combinations of the COI set (rather than just - or-ing them all together) -6) Consider pruning ezSAT expressions list — accumulates across queries, may cause memory growth - - -=== FIXED input_set_is_enable IMPLEMENTATIONS === - -// VERSION 1: Safe clock gating (current approach, cleaned up) -// Checks: (input_set=0) AND (D≠Q) == UNSAT -// Meaning: when all inputs are 0, D is guaranteed to equal Q -bool input_set_is_enable_safe(const pool &input_set, SigSpec sig_d, SigSpec sig_q) -{ - if (input_set.empty()) - return false; - - ezSatPtr ez; - SatGen satgen(ez.get(), &sigmap); - - // Import circuit behavior - for (auto cell : module->cells()) - satgen.importCell(cell); - - // Import D and Q - std::vector d_vec = satgen.importSigSpec(sig_d); - std::vector q_vec = satgen.importSigSpec(sig_q); - - // Constraint 1: All input_set bits = 0 - for (auto bit : input_set) { - int bit_var = satgen.importSigSpec(SigSpec(bit))[0]; - ez->assume(ez->NOT(bit_var)); - } - - // Constraint 2: D != Q - ez->assume(ez->vec_ne(d_vec, q_vec)); - - // If UNSAT: no way for D≠Q when inputs=0 → valid enable - return !ez->solve(); -} - -// VERSION 2: Exact clock gating (stronger, no wasted power) -// Checks: (D≠Q) XOR (OR(input_set)) == UNSAT -// Meaning: COI is exactly equivalent to D≠Q -bool input_set_is_enable_exact(const pool &input_set, SigSpec sig_d, SigSpec sig_q) -{ - if (input_set.empty()) - return false; - - ezSatPtr ez; - SatGen satgen(ez.get(), &sigmap); - - // Import circuit behavior - for (auto cell : module->cells()) - satgen.importCell(cell); - - // Import D and Q - std::vector d_vec = satgen.importSigSpec(sig_d); - std::vector q_vec = satgen.importSigSpec(sig_q); - - // Build COI = OR(input_set) - std::vector input_vars; - for (auto bit : input_set) - input_vars.push_back(satgen.importSigSpec(SigSpec(bit))[0]); - int coi = ez->expression(ezSAT::OpOr, input_vars); - - // Build D != Q (single bit: is any bit different?) - int d_ne_q = ez->vec_ne(d_vec, q_vec); - - // Constraint: COI XOR (D≠Q) — want this UNSAT (meaning COI ↔ D≠Q) - ez->assume(ez->XOR(coi, d_ne_q)); - - // If UNSAT: COI is exactly when D≠Q → perfect enable - return !ez->solve(); -} - - - - -Setting up the SAT condition: - - Need to have the equation for Q, need to have the equation for D - need to XOR those equations, need to XOR that equation with the new - one. Need to make sure that that's never SAT. - - - - - - - - - -Ok so I have this idea: - - - - -I'm doing to take the input variables, universally quantize them and also take D and Q and universilly quantise them - -Then I have this formula SAT((D^Q) !^ MUX)) - -Where the MUX has (for the select inputs the input values, and then has random variables d_0 to D_2^n when there's n inputs. - -SAT returns the combination of the values for d which make this work. - -But in this case, how can I go from determining the values of the Ds to determine the gates and converting that to combinational logic? And also, how can I Universially Quantize the other values? - - - -This is difficult due to QBF (Quantified Boolean Format) engines being very expensive and slow. -Rather than this, potentially trying CEGAR (not sure if this is practical). Idea is this: - 1. Start with a CANDIDATE solution (guess/abstraction) - 2. CHECK: Does candidate work for ALL inputs? (via SAT) (UNSAT for XOR means they are the same) - - If YES → Done, return candidate ✓ - - If NO → SAT gives a COUNTEREXAMPLE (inputs where it fails) - 3. REFINE: Use counterexample to improve candidate - 4. GOTO 2 - - - - The approach by this paper: https://dl.acm.org/doi/epdf/10.1145/1391469.1391637 - -Try each signal individually → collect ones that work → OR the winners -Your approach: "OR(all inputs) == enable?" -Paper's approach: "Which individual signals could BE the enable?" -Paper's Algorithm in Pseudocode (Yosys-feasible): -SigBit find_clock_enable(SigSpec sig_d, SigSpec sig_q) { pool candidates = get_cone_signals(sig_d); // All signals in fanin pool valid_enables; // STEP 1: Prune with simulation (fast) for (auto sig : candidates) { if (!simulation_suggests_valid(sig, sig_d, sig_q)) candidates.erase(sig); // Quick reject } // STEP 2: Prove with SAT (slow but conclusive) for (auto sig : candidates) { // Check: sig=1 → D==Q (safe to gate) // SAT query: (sig ∧ (D ⊕ Q)) — want UNSAT if (sat_proves_valid(sig, sig_d, sig_q)) valid_enables.insert(sig); } // STEP 3: Pick best (by estimated power savings) SigBit best = select_best_coverage(valid_enables); return best; // Or OR multiple if needed}bool sat_proves_valid(SigBit candidate, SigSpec sig_d, SigSpec sig_q) { // "Can candidate=1 while D≠Q?" — if UNSAT, candidate is valid int cand = satgen.importSigSpec(SigSpec(candidate))[0]; int d_ne_q = ez->vec_ne(d_vec, q_vec); ez->assume(cand); ez->assume(d_ne_q); return !ez->solve(); // UNSAT = valid enable} - - -=== SAFE GATING vs EXACT GATING === - -Safe Gating: - SAT check: sig ∧ (D≠Q) = UNSAT - Meaning: When sig=1, D is guaranteed to equal Q (safe to gate clock) - - sig=1 → gate clock, hold register (D==Q guaranteed) - - sig=0 → clock runs freely (D may or may not equal Q) - Allows wasted power (clock runs when D==Q but sig=0), but NEVER loses data. - -Exact Gating: - SAT check: sig ⊕ (D≠Q) = UNSAT - Meaning: sig is EXACTLY equivalent to (D≠Q) - - sig=1 ↔ D≠Q (perfect bidirectional match) - No wasted power, but much harder to find matching signals. - -Comparison: - | Type | SAT Check | Finds more? | Power optimal? | - |--------|-----------------------|-------------|----------------| - | Safe | sig ∧ (D≠Q) = UNSAT | Yes | No (some waste)| - | Exact | sig ⊕ (D≠Q) = UNSAT | No | Yes (perfect) | - -Recommendation: - • Use SAFE GATING — faster (weaker SAT query), finds more candidates - • Safe gating is industry standard (used in paper, commercial tools) - • Exact gating rarely finds matches unless design has explicit MUX-with-Q pattern - • Power difference is minor — safe gating still saves most power - • Safe gating has better QoR: more FFs get clock-gated - - There's also clock as_enable and as_disable - as_enable = true (clock enable): - Signal high → clock runs. Signal low → clock blocked. Check: (!enable ∧ D≠Q) must be UNSAT. - as_enable = false (clock disable): - Signal high → clock blocked. Signal low → clock runs. Check: (disable ∧ D≠Q) must be UNSAT. - - - - - -TODOs: -1) Convert from the string hash to an integer hash -3) See why this path is needed: - if (ff.has_ce) { - // Already has CE, AND with new condition - Wire *combined_ce = module->addWire(NEW_ID); - module->addAnd(NEW_ID, ff.sig_ce, gating_signal, combined_ce); - ff.sig_ce = combined_ce; - } else { -4) Print the netlist before and after (checkout ways to determine # of flipflips) -5) Power analysis -6) Remove redundant vectors (visited and result) from getDownstreamSignals - and getUpstreamSignals -7) Check recursion -8) Check isValidGatingSet and findGatingCondition - -Add a new feature to not do simulation or SAT based on the false paths \ No newline at end of file