this is the optimized delay-tracking mechanism on which the 32-bit shift regs is replaced by a 4-bit counter plus a 4-bit mask. This uses lower resources but still able to track the delays and the exact slot number where the delay is already satisfied (hence no added latency)
There are 4 delays being tracked (delay_before_precharge, delay_before_activate, delay_before_read, and delay_before_write) and 8 banks, that means 32x4x8 = 1024 bits needed for this tracking delay mechanism (totally wasteful!)
- completed (mostly) the reset sequence
- added formal assertions and cover statements for reset sequence logic
- moved all parameters to this file
- fixed port widths
- converted IO ports to ANSI