Merge pull request #118 from mcmasterg/timfuz

fabric timing fuzzer
This commit is contained in:
John McMaster 2018-09-19 11:09:53 -07:00 committed by GitHub
commit e77842bec9
69 changed files with 8542 additions and 25 deletions


@ -1,10 +0,0 @@
.NOTPARALLEL:
SUBDIRS := $(patsubst %/,%, $(wildcard */))
.PHONY: $(SUBDIRS)
$(MAKECMDGOALS): $(SUBDIRS)
$(SUBDIRS):
$(MAKE) -C $@ -f ../Makefile.database $(MAKECMDGOALS)


@ -1,15 +0,0 @@
.PHONY: all clean
all: tilegrid.json
# Small dance to say that there is a single recipe that is run once to generate
# multiple files.
tileconn.json tilegrid.json: fuzzers
.INTERMEDIATE: fuzzers
fuzzers: SHELL:=/bin/bash
fuzzers: settings.sh
source settings.sh && $(MAKE) -C ../../fuzzers all
clean:
rm -f *.db tileconn.json tilegrid.json *.yaml
rm -rf html

fuzzers/007-timing/.gitignore

@ -0,0 +1,9 @@
/.Xil
/design/
/design.bit
/design.bits
/design.dcp
/usage_statistics_webtalk.*
/vivado*
/specimen_*
build_speed


@ -0,0 +1,26 @@
# for now hammering on just picorv32
# consider instead aggregating multiple projects
PRJ?=picorv32
PRJN?=8
all: build/timgrid-v.json
clean:
rm -rf build
cd speed && $(MAKE) clean
cd timgrid && $(MAKE) clean
cd projects/$(PRJ) && $(MAKE) clean
speed/build/speed.json:
cd speed && $(MAKE)
timgrid/build/timgrid.json:
cd timgrid && $(MAKE)
build/timgrid-v.json: projects/$(PRJ)/build/timgrid-v.json
mkdir -p build
cp projects/$(PRJ)/build/timgrid-v.json build/timgrid-v.json
projects/$(PRJ)/build/timgrid-v.json: speed/build/speed.json timgrid/build/timgrid.json
cd projects/$(PRJ) && $(MAKE) N=$(PRJN)


@ -0,0 +1,267 @@
# Timing analysis fuzzer (timfuz)
WIP: 2018-09-10: this process is just coming together and is going to get significant cleanup. But here's the general idea.
This runs various designs through Vivado and processes the
resulting timing information in order to create very simple timing models.
While Vivado might have more involved models (say RC delays, fanout, etc),
timfuz creates simple models that bound realistic min and max element delays.
Currently this document focuses exclusively on fabric timing delays.
## Quick start
```
make -j$(nproc)
```
This will take a relatively long time (say 45 min) and generate build/timgrid-v.json.
You can do a quicker test run (say 3 min) using:
```
make PRJ=oneblinkw PRJN=1 -j$(nproc)
```
## Vivado background
Examples are for an XC750T on Vivado 2017.2.
TODO maybe move to: https://github.com/SymbiFlow/prjxray/wiki/Timing
### Speed index
Vivado seems to associate each delay model with a "speed index".
The fabric has these in two elements: wires (i.e. one delay element per tile) and pips.
For example, LUT output node A (ex: CLBLL_L_X12Y100/CLBLL_LL_A) has a single wire, also called CLBLL_L_X12Y100/CLBLL_LL_A.
This has speed index 733. Speed models can be queried and we find this corresponds to model C_CLBLL_LL_A.
There are various speed model types:
* bel_delay
* buffer
* buffer_switch
* content_version
* functional
* inpin
* outpin
* parameters
* pass_transistor
* switch
* table_lookup
* tl_buffer
* vt_limits
* wire
IIRC the interconnect is only composed of switch and wire types.
Indices with value 65535 (0xFFFF) never appear in the fabric; presumably this value marks unused models.
It is assigned to some special models, such as those of type "content_version".
For example, the "xilinx" model is of type "content_version".
There are also "cost codes", but these seem to be very coarse (only around 30 of them)
and are suspected to be related more to PnR than to the timing model.
### Timing paths
The Vivado timing analyzer can easily output the following:
* Full: delay from BEL pin to BEL pin
* Interconnect only (ICO): delay from BEL pin to BEL pin, but only report interconnect delays (i.e. exclude site delays)
There is also theoretically an option to report delays up to a specific pip,
but this option is poorly documented and I was unable to get it to work.
Each timing path reports min and max values for both a fast and a slow process, so four process values are reported in total:
* fast_max
* fast_min
* slow_max
* slow_min
For example, if the device is end of life, was poorly made, and at an extreme temperature, the delay may be up to the slow_max value.
Since ICO can be reported for each of these, fully analyzing a timing path results in 8 values.
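These corner values flow through the rest of the pipeline as simple CSV rows (header: `ico,fast_max fast_min slow_max slow_min,rows...`). A minimal parsing sketch follows; the field layout is inferred from the CSV writers in this PR, and the sample row is hypothetical:
```
# Sketch: parse one row of the corner CSV format used by this PR's tools.
def parse_row(line):
    fields = line.strip().split(',')
    ico = int(fields[0])  # 1 = interconnect-only delays
    # Four process corner values: fast_max fast_min slow_max slow_min
    corners = [float(x) for x in fields[1].split()]
    # Remaining fields are "count name" pairs: delay elements on the path
    terms = {}
    for field in fields[2:]:
        count, name = field.split(' ', 1)
        terms[name] = int(count)
    return ico, corners, terms

# Hypothetical row: two C_CLBLL_LL_A wires and one R_ZERO pip on the path
print(parse_row('1,120 80 250 150,2 C_CLBLL_LL_A,1 R_ZERO'))
```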
Finally, part of this work was analyzing tile regularity to discover what a reasonably compact timing model would look like.
We verified that all tiles of the same type have exactly the same delay elements.
## Methodology
Make sure you've read the Vivado background section first.
### Background
This section briefly describes some of the mathematics used by this technique that readers may not be familiar with.
These definitions are intended to be good enough to provide a high level understanding and may not be precise.
Numerical analysis: the study of algorithms that use numerical approximation (as opposed to general symbolic manipulations)
numpy: a popular numerical analysis python library. Often written np (import numpy as np).
scipy: provides higher level functionality on top of numpy
sympy ("symbolic python"): like numpy, but designed to work with exact rational numbers.
For example, python actually stores 0.1 as 0.1000000000000000055511151231257827021181583404541015625.
However, sympy can represent this as the fraction 1/10, eliminating numerical approximation issues.
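A minimal sketch of the difference:
```
from fractions import Fraction
from decimal import Decimal

# The float literal 0.1 is really the nearest binary fraction:
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625

# Rational arithmetic stays exact, so repeated row operations
# (as in RREF) accumulate no rounding error:
print(Fraction(1, 10) * 3 == Fraction(3, 10))  # True
print(0.1 * 3 == 0.3)                          # False
```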
Least squares (ex: scipy.optimize.least_squares): approximation method to do a best fit of several variables to a set of equations.
For example, given the equations "x = 1" and "x = 2" there isn't an exact solution.
However, "x = 1.5" is a good compromise since its reasonably solves both equations.
Linear programming (ex: scipy.optimize.linprog aka linprog): optimization method that finds a set of variables satisfying a set of inequalities, typically while minimizing a linear objective.
For example, given "t0 >= 10" and "t0 + t1 >= 100", linprog finds values for t0 and t1 that satisfy both inequalities.
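A minimal sketch using those constraints:
```
from scipy.optimize import linprog

# linprog wants A_ub @ x <= b_ub, so ">=" rows are negated:
#   t0 >= 10        ->  -t0      <= -10
#   t0 + t1 >= 100  ->  -t0 - t1 <= -100
A_ub = [[-1, 0], [-1, -1]]
b_ub = [-10, -100]
c = [1, 1]  # minimize total delay t0 + t1 (variables default to >= 0)

res = linprog(c, A_ub=A_ub, b_ub=b_ub)
print(res.x)  # any split with t0 >= 10 and t0 + t1 = 100 is optimal
```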
Reduced row echelon form (RREF, ex: sympy.Matrix.rref): the simplest form that a system of linear equations can be solved to.
For example, given "x = 1" and "x + y = 9", one can solve for "x = 1" and "y = 8".
However, given "x + y = 1" and "x + y + z = 9", there aren't enough variables to solve this fully.
In this case RREF provides a best effort by giving the ratios between correlated variables.
One variable is normalized to 1 in each of these ratios and is called the "pivot".
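A minimal sketch of that example (note the fractional inputs, per the process notes below):
```
import sympy
from fractions import Fraction

# Augmented matrix for "x + y = 1" and "x + y + z = 9"
rows = [[1, 1, 0, 1], [1, 1, 1, 9]]
m = sympy.Matrix([[Fraction(v) for v in row] for row in rows])
rref, pivots = m.rref()
sympy.pprint(rref)  # [1 1 0 1; 0 0 1 8]: x + y = 1 and z = 8
print(pivots)       # (0, 2): x and z are pivots, y stays free
```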
Note that if numpy.linalg.solve encounters an unsolvable matrix it may either complain
or generate a false solution due to numerical approximation issues.
### What didn't work
First some quick background on things that didn't work to illustrate why the current approach was chosen.
I first tried to throw things directly into linprog, but it unfairly weighted towards arbitrary shared variables. For example, feeding in:
* t0 >= 10
* t0 + t1 >= 100
It would declare "t0 = 100", "t1 = 0" instead of the more intuitive "t0 = 10", "t1 = 90".
I tried to work around this in several ways, notably subtracting equations from each other to produce additional constraints.
This worked okay, but was relatively slow and wasn't converging on nearly solved solutions, even when throwing a lot of data at it.
Next we tried randomly combining a bunch of the equations together and solving them like a regular linear algebra matrix (numpy.linalg.solve).
However, this illustrated that the system was under-constrained.
Further analysis revealed that there are some delay element combinations that simply can't be linearly separated.
This was checked primarily using numpy.linalg.matrix_rank, with some use of numpy.linalg.slogdet.
matrix_rank was preferred over slogdet since it's more flexible with non-square matrices.
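A minimal sketch of the rank check on an under-constrained system:
```
import numpy as np

# Columns are delay variables; each row is an observed path.
# t0 and t1 always appear together, so they can't be separated:
A = np.array([[1, 1, 0],
              [1, 1, 1]])
print(np.linalg.matrix_rank(A), A.shape[1])  # 2 < 3: not fully solvable
```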
### Process
The above ultimately led to the idea that we should come up with a set of substitutions that would make the system solvable. This has several advantages:
* Easy to evaluate which variables aren't covered well enough by source data
* Easy to evaluate which variables weren't solved properly (if it's fully constrained it should have had a non-zero delay)
At a high level, the above learnings gave this process:
* Find correlated variables by using RREF (sympy.Matrix.rref) to create variable groups
- Note pivots
- You must input a fractional type (ex: fractions.Fraction, but surprisingly not int) to get exact results, otherwise it seems to fall back to numerical approximation
- This is by far the most computationally expensive step
- Mixing RREF substitutions from one data set to another may not be recommended
* Use RREF result to substitute groups on input data, creating new meta variables, but ultimately reducing the number of columns
* Pick a corner
- Examples assume fast_max, but other corners are applicable with appropriate column and sign changes
* De-duplicate by removing equations that are less constrained (see the sketch after this list)
- Ex: if solving for a max corner and given:
- t0 + t1 >= 10
- t0 + t1 >= 12
- The first equation is redundant since the second provides a stricter constraint
- This significantly reduces computational time
* Use least squares (scipy.optimize.least_squares) to fit variables near input constraints
- Helps fairly weight delays vs the original input constraints
- Does not guarantee all constraints are met. For example, if this were put in (ignoring that these would have been de-duplicated):
- t0 = 10
- t0 = 12
- It may decide something like t0 = 11, which means that the second constraint was not satisfied given we actually want t0 >= 12
* Use linear programming (scipy.optimize.linprog aka linprog) to formally meet all remaining constraints
- Start by filtering out all constraints that are already met. This should eliminate nearly all equations
* Map resulting constraints onto different tile types
- Group delays map onto the group pivot variable, typically setting other elements to 0 (if the processed set is not the one used to create the pivots they may be non-zero)
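A rough sketch of the de-duplication step referenced above; this is illustrative, not the exact simplify_rows implementation (Ads rows are name:count dicts, b the corresponding bounds, assuming a max corner):
```
def dedup_max_corner(Ads, b):
    # For "t0 + t1 >= 10" and "t0 + t1 >= 12", only ">= 12" survives:
    # identical left-hand sides keep the largest lower bound.
    best = {}
    for row_ds, row_b in zip(Ads, b):
        key = tuple(sorted(row_ds.items()))
        if key not in best or row_b > best[key][1]:
            best[key] = (row_ds, row_b)
    rows = list(best.values())
    return [ds for ds, _b in rows], [rb for _ds, rb in rows]
```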
## TODO
Milestone 1 (MVP)
* DONE
* Provide any process corner with at least some of the fabric
Milestone 2
* Provide all four fabric corners
* Simple makefile based flow
* Cleanup/separate fabric input targets
Milestone 3
* Create site delay model
Final
* Investigate ZERO
* Investigate virtual switchboxes
* Compare our vs Xilinx output on random designs
### Improve test cases
Test cases are somewhat random right now. We could make much more targeted cases using custom routing to improve various fanout estimates and such.
Also there are a lot more elements that are not covered.
At a minimum these should be moved to their own directory.
### ZERO models
Background: there are a number of speed models with the name ZERO in them.
These generally seem to be zero delay, although this needs more investigation.
Example: see virtual switchbox item below
The timing models will probably significantly improve if these are removed.
In the past I was removing them, but decided to keep them in for now in the spirit of being more conservative.
They include:
* _BSW_CLK_ZERO
* BSW_CLK_ZERO
* _BSW_ZERO
* BSW_ZERO
* _B_ZERO
* B_ZERO
* C_CLK_ZERO
* C_DSP_ZERO
* C_ZERO
* I_ZERO
* _O_ZERO
* O_ZERO
* RC_ZERO
* _R_ZERO
* R_ZERO
### Virtual switchboxes
Background: several low level configuration details are abstracted with virtual configurable elements.
For example, LUT inputs can be rearranged to reduce routing congestion.
However, the LUT configuration must be changed to match the switched inputs.
This is handled by the CLBLL_L_INTER switchbox, which doesn't encode any physical configuration bits.
However, this contains PIPs with delay models.
For example, LUT A input A1 has node CLBLM_M_A1 coming from pip junction CLBLM_M_A1, which has PIP CLBLM_IMUX7->CLBLM_M_A1
with speed index 659 (R_ZERO).
This may be further evidence for the related point that ZERO models should probably be removed.
### Incorporate fanout
We could probably significantly improve model granularity by studying the impact of fanout on delay.
### Investigate RC delays
We suspect accuracy could be significantly improved by moving to SPICE-based models, but this will take significantly more characterization.
### Characterize real hardware
A few people have expressed interest in running tests on real hardware. This will take some thought given we don't have direct access.
### Review approximation errors
Ex: one known issue is that the objective function linearly weights small and large delays.
This is only recommended when variables are approximately the same order of magnitude.
For example, carry chain delays are on the order of 7 ps while other delays are 100 ps.
It's very easy to put a large delay on the carry chain when it could have been more appropriately placed elsewhere.
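One possible mitigation, sketched below, is to weight each residual by its measured path delay so 7 ps and 100 ps elements are judged proportionally; this is an idea to investigate, not something implemented in this PR:
```
import numpy as np
from scipy.optimize import least_squares

def solve_relative(Anp, b, x0):
    # Anp: rows of element counts, b: measured path delays (ps).
    # Dividing by b makes errors relative, so small carry-chain
    # delays aren't drowned out by large routing delays.
    def residuals(x):
        return (Anp @ x - b) / b
    return least_squares(residuals, x0)
```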


@ -0,0 +1,73 @@
'''
pr0ntools
Benchmarking utility
Copyright 2010 John McMaster
'''
import time
def time_str(delta):
fraction = delta % 1
delta -= fraction
delta = int(delta)
seconds = delta % 60
# integer division: this helper is imported from Python 3 scripts
delta //= 60
minutes = delta % 60
delta //= 60
hours = delta
return '%02d:%02d:%02d.%04d' % (hours, minutes, seconds, fraction * 10000)
class Benchmark:
start_time = None
end_time = None
def __init__(self, max_items=None):
# For the lazy
self.start_time = time.time()
self.end_time = None
self.max_items = max_items
self.cur_items = 0
def start(self):
self.start_time = time.time()
self.end_time = None
self.cur_items = 0
def stop(self):
self.end_time = time.time()
def advance(self, n=1):
self.cur_items += n
def set_cur_items(self, n):
self.cur_items = n
def delta_s(self):
if self.end_time:
return self.end_time - self.start_time
else:
return time.time() - self.start_time
def __str__(self):
if self.end_time:
return time_str(self.end_time - self.start_time)
elif self.max_items:
cur_time = time.time()
delta_t = cur_time - self.start_time
rate_s = 'N/A'
if delta_t > 0.000001:
rate = self.cur_items / (delta_t)
rate_s = '%f items / sec' % rate
if rate == 0:
eta_str = 'inf'
else:
remaining = (self.max_items - self.cur_items) / rate
eta_str = time_str(remaining)
else:
eta_str = "indeterminate"
return '%d / %d, ETA: %s @ %s' % (
self.cur_items, self.max_items, eta_str, rate_s)
else:
return time_str(time.time() - self.start_time)
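# Usage sketch (not part of the original file): time a loop with ETA reporting.
#   bench = Benchmark(max_items=1000)
#   for i in range(1000):
#       do_work(i)
#       bench.advance()
#       if i % 100 == 0:
#           print(bench)  # "n / 1000, ETA: hh:mm:ss.ffff @ r items / sec"
#   bench.stop()
#   print('Took %s' % bench)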


@ -0,0 +1,134 @@
#!/usr/bin/env python3
from timfuz import Benchmark, Ar_di2np, Ar_ds2t, A_di2ds, A_ds2di, loadc_Ads_b, index_names, A_ds2np, load_sub, run_sub_json
import numpy as np
import glob
import json
import math
import random  # needed by Adi2matrix_random's random row assignment
from collections import OrderedDict
from fractions import Fraction
def Adi2matrix_random(A_ubd, b_ub, names):
# random assignment
# was making some empty rows
A_ret = [np.zeros(len(names)) for _i in range(len(names))]
b_ret = np.zeros(len(names))
for row, b in zip(A_ubd, b_ub):
# Randomly assign to a row
dst_rowi = random.randint(0, len(names) - 1)
rownp = Ar_di2np(row, cols=len(names), sf=1)
A_ret[dst_rowi] = np.add(A_ret[dst_rowi], rownp)
b_ret[dst_rowi] += b
return A_ret, b_ret
def Ads2matrix_linear(Ads, b):
names, Adi = A_ds2di(Ads)
cols = len(names)
rows_out = len(b)
A_ret = [np.zeros(cols) for _i in range(rows_out)]
b_ret = np.zeros(rows_out)
dst_rowi = 0
for row_di, row_b in zip(Adi, b):
row_np = Ar_di2np(row_di, cols)
A_ret[dst_rowi] = np.add(A_ret[dst_rowi], row_np)
b_ret[dst_rowi] += row_b
dst_rowi = (dst_rowi + 1) % rows_out
return A_ret, b_ret
def pmatrix(Anp, s):
import sympy
msym = sympy.Matrix(Anp)
print(s)
sympy.pprint(msym)
def pds(Ads, s):
names, Anp = A_ds2np(Ads)
pmatrix(Anp, s)
print('Names: %s' % (names, ))
def run(fns_in, sub_json=None, verbose=False):
assert len(fns_in) > 0
# arbitrary corner...data is thrown away
Ads, b = loadc_Ads_b(fns_in, "slow_max", ico=True)
if sub_json:
print('Subbing JSON %u rows' % len(Ads))
#pds(Ads, 'Orig')
names_old = index_names(Ads)
run_sub_json(Ads, sub_json, verbose=verbose)
names_new = index_names(Ads)
print("Sub: %u => %u names" % (len(names_old), len(names_new)))
print(names_new)
print('Subbed JSON %u rows' % len(Ads))
names = names_new
#pds(Ads, 'Sub')
else:
names = index_names(Ads)
# Squash into a matrix
# A_ub2, b_ub2 = Adi2matrix_random(A_ubd, b, names)
Amat, _bmat = Ads2matrix_linear(Ads, b)
#pmatrix(Amat, 'Matrix')
'''
The matrix must be fully ranked to even be considered reasonable
Even then, floating point error could *possibly* make it appear fully ranked, although probably not since we have whole numbers
Hence the slogdet check
'''
print()
# https://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.linalg.matrix_rank.html
rank = np.linalg.matrix_rank(Amat)
print('rank: %s / %d col' % (rank, len(names)))
# doesn't work on non-square matrices
if 0:
# https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.linalg.slogdet.html
sign, logdet = np.linalg.slogdet(Amat)
# If the determinant is zero, then sign will be 0 and logdet will be -Inf
if sign == 0 and logdet == float('-inf'):
print('slogdet :( : 0')
else:
print('slogdet :) : %s, %s' % (sign, logdet))
if rank != len(names):
raise Exception(
"Matrix not fully ranked w/ %u / %u" % (rank, len(names)))
def main():
import argparse
parser = argparse.ArgumentParser(
description='Check if sub.json would make a linear equation solvable')
parser.add_argument('--verbose', action='store_true', help='')
parser.add_argument('--sub-json', help='')
parser.add_argument('fns_in', nargs='*', help='timing3.csv input files')
args = parser.parse_args()
# Store options in dict to ease passing through functions
bench = Benchmark()
fns_in = args.fns_in
if not fns_in:
fns_in = glob.glob('specimen_*/timing3.csv')
sub_json = None
if args.sub_json:
sub_json = load_sub(args.sub_json)
try:
run(sub_json=sub_json, fns_in=fns_in, verbose=args.verbose)
finally:
print('Exiting after %s' % bench)
if __name__ == '__main__':
main()


@ -0,0 +1,58 @@
#!/usr/bin/env python3
from timfuz import Benchmark, simplify_rows, loadc_Ads_b
import glob
def run(fout, fns_in, corner, verbose=0):
Ads, b = loadc_Ads_b(fns_in, corner, ico=True)
Ads, b = simplify_rows(Ads, b, corner=corner)
fout.write('ico,fast_max fast_min slow_max slow_min,rows...\n')
for row_b, row_ds in zip(b, Ads):
# write in same format, but just stick to this corner
out_b = [str(row_b) for _i in range(4)]
ico = '1'
items = [ico, ' '.join(out_b)]
for k, v in sorted(row_ds.items()):
items.append('%u %s' % (v, k))
fout.write(','.join(items) + '\n')
def main():
import argparse
parser = argparse.ArgumentParser(
description='Create a .csv with a single process corner')
parser.add_argument('--verbose', type=int, help='')
parser.add_argument(
'--auto-name', action='store_true', help='timing3.csv => timing3c.csv')
parser.add_argument('--out', default=None, help='Output csv')
parser.add_argument('--corner', help='Process corner (fast_max, fast_min, slow_max, slow_min)')
parser.add_argument('fns_in', nargs='+', help='timing3.csv input files')
args = parser.parse_args()
bench = Benchmark()
fnout = args.out
if fnout is None:
if args.auto_name:
assert len(args.fns_in) == 1
fnin = args.fns_in[0]
fnout = fnin.replace('timing3.csv', 'timing3c.csv')
assert fnout != fnin, 'Expect timing3.csv in'
else:
fnout = '/dev/stdout'
print("Writing to %s" % fnout)
fout = open(fnout, 'w')
fns_in = args.fns_in
if not fns_in:
fns_in = glob.glob('specimen_*/timing3.csv')
run(fout=fout, fns_in=fns_in, corner=args.corner, verbose=args.verbose)
if __name__ == '__main__':
main()


@ -0,0 +1,71 @@
#!/usr/bin/env python3
from timfuz import Benchmark, loadc_Ads_bs, index_names, load_sub, run_sub_json, instances
import glob  # used by the specimen_*/timing3.csv fallback in main()
def gen_group(fnin, sub_json, strict=False, verbose=False):
print('Loading data')
Ads, bs = loadc_Ads_bs([fnin], ico=True)
print('Sub: %u rows' % len(Ads))
iold = instances(Ads)
names_old = index_names(Ads)
run_sub_json(Ads, sub_json, strict=strict, verbose=verbose)
names = index_names(Ads)
print("Sub: %u => %u names" % (len(names_old), len(names)))
print('Sub: %u => %u instances' % (iold, instances(Ads)))
for row_ds, row_bs in zip(Ads, bs):
yield row_ds, row_bs
def run(fns_in, fnout, sub_json, strict=False, verbose=False):
with open(fnout, 'w') as fout:
fout.write('ico,fast_max fast_min slow_max slow_min,rows...\n')
for fn_in in fns_in:
for row_ds, row_bs in gen_group(fn_in, sub_json, strict=strict):
row_ico = 1
items = [str(row_ico), ' '.join([str(x) for x in row_bs])]
for k, v in sorted(row_ds.items()):
items.append('%u %s' % (v, k))
fout.write(','.join(items) + '\n')
def main():
import argparse
parser = argparse.ArgumentParser(
description='Substitute .csv to group correlated variables')
parser.add_argument('--verbose', action='store_true', help='')
parser.add_argument('--strict', action='store_true', help='')
parser.add_argument('--sub-csv', help='')
parser.add_argument(
'--sub-json',
required=True,
help='Group substitutions to make fully ranked')
parser.add_argument('--out', help='Output sub.json substitution result')
parser.add_argument('fns_in', nargs='+', help='timing3.csv input files')
args = parser.parse_args()
# Store options in dict to ease passing through functions
bench = Benchmark()
fns_in = args.fns_in
if not fns_in:
fns_in = glob.glob('specimen_*/timing3.csv')
sub_json = load_sub(args.sub_json)
try:
run(
fns_in,
args.out,
sub_json=sub_json,
strict=args.strict,
verbose=args.verbose)
finally:
print('Exiting after %s' % bench)
if __name__ == '__main__':
main()


@ -0,0 +1,98 @@
#!/usr/bin/env python3
from timfuz import Benchmark, loadc_Ads_bs, load_sub, Ads2bounds, corners2csv, corner_s2i
def gen_flat(fns_in, sub_json, corner=None):
Ads, bs = loadc_Ads_bs(fns_in, ico=True)
bounds = Ads2bounds(Ads, bs)
zeros = set()
nonzeros = set()
for bound_name, bound_bs in bounds.items():
sub = sub_json['subs'].get(bound_name, None)
if sub:
# put entire delay into pivot
pivot = sub_json['pivots'][bound_name]
assert pivot not in zeros
nonzeros.add(pivot)
non_pivot = set(sub.keys() - set([pivot]))
#for name in non_pivot:
# assert name not in nonzeros, (pivot, name, nonzeros)
zeros.update(non_pivot)
yield pivot, bound_bs
else:
nonzeros.add(bound_name)
yield bound_name, bound_bs
# non-pivots can appear multiple times, but they should always be zero
# however, due to substitution limitations, just warn
violations = zeros.intersection(nonzeros)
if len(violations):
print('WARNING: %s non-0 non-pivot' % (len(violations)))
# XXX: how to best handle these?
# should they be fixed 0?
if corner:
zero_row = [None, None, None, None]
zero_row[corner_s2i[corner]] = 0
for zero in zeros - violations:
yield zero, zero_row
def run(fns_in, fnout, sub_json, corner=None, sort=False, verbose=False):
'''
if sort:
sortf = sorted
else:
sortf = lambda x: x
'''
with open(fnout, 'w') as fout:
fout.write('ico,fast_max fast_min slow_max slow_min,rows...\n')
#for name, corners in sortf(gen_flat(fnin, sub_json)):
for name, corners in gen_flat(fns_in, sub_json, corner=corner):
row_ico = 1
items = [str(row_ico), corners2csv(corners)]
items.append('%u %s' % (1, name))
fout.write(','.join(items) + '\n')
def main():
import argparse
parser = argparse.ArgumentParser(
description='Substitute .csv to ungroup correlated variables')
parser.add_argument('--verbose', action='store_true', help='')
#parser.add_argument('--sort', action='store_true', help='')
parser.add_argument('--sub-csv', help='')
parser.add_argument(
'--sub-json',
required=True,
help='Group substitutions to make fully ranked')
parser.add_argument('--corner', default=None, help='')
parser.add_argument('--out', default=None, help='output timing delay .csv')
parser.add_argument(
'fns_in',
nargs='+',
help='input timing delay .csv (NOTE: must be single column)')
args = parser.parse_args()
# Store options in dict to ease passing through functions
bench = Benchmark()
sub_json = load_sub(args.sub_json)
try:
run(
args.fns_in,
args.out,
sub_json=sub_json,
#sort=args.sort,
verbose=args.verbose,
corner=args.corner)
finally:
print('Exiting after %s' % bench)
if __name__ == '__main__':
main()


@ -0,0 +1,3 @@
specimen_*
build


@ -0,0 +1,4 @@
all:
echo "FIXME: tie projects together"
false


@ -0,0 +1,47 @@
# Run corner specific calculations
TIMFUZ_DIR=$(XRAY_DIR)/fuzzers/007-timing
CORNER=slow_max
ALLOW_ZERO_EQN?=N
BADPRJ_OK?=N
all: build/$(CORNER)/timgrid-vc.json
run:
$(MAKE) clean
$(MAKE) all
touch run.ok
clean:
rm -rf specimen_[0-9][0-9][0-9]/ seg_clblx.segbits __pycache__ run.ok
rm -rf vivado*.log vivado_*.str vivado*.jou design *.bits *.dcp *.bit
rm -rf build
.PHONY: all run clean
build/$(CORNER):
mkdir -p build/$(CORNER)
build/checksub:
# checksub is produced by project.mk; corner.mk is not meant to run standalone
false
build/$(CORNER)/leastsq.csv: build/sub.json build/grouped.csv build/checksub build/$(CORNER)
# Create a rough timing model that approximately fits the given paths
python3 $(TIMFUZ_DIR)/solve_leastsq.py --sub-json build/sub.json build/grouped.csv --corner $(CORNER) --out build/$(CORNER)/leastsq.csv.tmp
mv build/$(CORNER)/leastsq.csv.tmp build/$(CORNER)/leastsq.csv
build/$(CORNER)/linprog.csv: build/$(CORNER)/leastsq.csv build/grouped.csv
# Tweak rough timing model, making sure all constraints are satisfied
ALLOW_ZERO_EQN=$(ALLOW_ZERO_EQN) python3 $(TIMFUZ_DIR)/solve_linprog.py --sub-json build/sub.json --sub-csv build/$(CORNER)/leastsq.csv --massage build/grouped.csv --corner $(CORNER) --out build/$(CORNER)/linprog.csv.tmp
mv build/$(CORNER)/linprog.csv.tmp build/$(CORNER)/linprog.csv
build/$(CORNER)/flat.csv: build/$(CORNER)/linprog.csv
# Take separated variables and back-annotate them to the original timing variables
python3 $(TIMFUZ_DIR)/csv_group2flat.py --sub-json build/sub.json --corner $(CORNER) --out build/$(CORNER)/flat.csv.tmp build/$(CORNER)/linprog.csv
mv build/$(CORNER)/flat.csv.tmp build/$(CORNER)/flat.csv
build/$(CORNER)/timgrid-vc.json: build/$(CORNER)/flat.csv
# Final processing
# Insert timing delays into actual tile layouts
python3 $(TIMFUZ_DIR)/tile_annotate.py --timgrid-s $(TIMFUZ_DIR)/timgrid/build/timgrid-s.json --out build/$(CORNER)/timgrid-vc.json build/$(CORNER)/flat.csv


@ -0,0 +1,9 @@
#!/bin/bash
source ${XRAY_GENHEADER}
TIMFUZ_DIR=$XRAY_DIR/fuzzers/007-timing
timing_txt2csv () {
python3 $TIMFUZ_DIR/timing_txt2csv.py --speed-json $TIMFUZ_DIR/speed/build/speed.json --out timing3.csv timing3.txt
}


@ -0,0 +1,4 @@
# so simple that likely no linprog adjustments will be needed
ALLOW_ZERO_EQN?=Y
include ../project.mk


@ -0,0 +1,2 @@
Minimal, simple test


@ -0,0 +1,8 @@
#!/bin/bash
set -ex
source ../generate.sh
vivado -mode batch -source ../generate.tcl
timing_txt2csv


@ -0,0 +1,47 @@
source ../../../../../utils/utils.tcl
source ../../project.tcl
proc build_design {} {
create_project -force -part $::env(XRAY_PART) design design
read_verilog ../top.v
synth_design -top top
puts "Locking pins"
set_property LOCK_PINS {I0:A1 I1:A2 I2:A3 I3:A4 I4:A5 I5:A6} \
[get_cells -quiet -filter {REF_NAME == LUT6} -hierarchical]
puts "Package stuff"
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_00) IOSTANDARD LVCMOS33" [get_ports clk]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_01) IOSTANDARD LVCMOS33" [get_ports stb]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_02) IOSTANDARD LVCMOS33" [get_ports di]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_03) IOSTANDARD LVCMOS33" [get_ports do]
puts "pblocking"
create_pblock roi
set roipb [get_pblocks roi]
set_property EXCLUDE_PLACEMENT 1 $roipb
add_cells_to_pblock $roipb [get_cells roi]
resize_pblock $roipb -add "$::env(XRAY_ROI)"
puts "randplace"
#randplace_pblock 50 roi
set_property CFGBVS VCCO [current_design]
set_property CONFIG_VOLTAGE 3.3 [current_design]
set_property BITSTREAM.GENERAL.PERFRAMECRC YES [current_design]
puts "dedicated route"
set_property CLOCK_DEDICATED_ROUTE FALSE [get_nets clk_IBUF]
place_design
route_design
write_checkpoint -force design.dcp
# disable combinatorial loop
# set_property IS_ENABLED 0 [get_drc_checks {LUTLP-1}]
#write_bitstream -force design.bit
}
build_design
write_info3


@ -0,0 +1,37 @@
module roi (
input wire clk,
output wire out);
reg [23:0] counter;
assign out = counter[23] ^ counter[22] ^ counter[2] && counter[1] || counter[0];
always @(posedge clk) begin
counter <= counter + 1;
end
endmodule
module top(input wire clk, input wire stb, input wire di, output wire do);
localparam integer DIN_N = 0;
localparam integer DOUT_N = 1;
reg [DIN_N-1:0] din;
wire [DOUT_N-1:0] dout;
reg [DIN_N-1:0] din_shr;
reg [DOUT_N-1:0] dout_shr;
always @(posedge clk) begin
din_shr <= {din_shr, di};
dout_shr <= {dout_shr, din_shr[DIN_N-1]};
if (stb) begin
din <= din_shr;
dout_shr <= dout;
end
end
assign do = dout_shr[DOUT_N-1];
roi roi(
.clk(clk),
.out(dout[0])
);
endmodule


@ -0,0 +1,2 @@
include ../project.mk


@ -0,0 +1,3 @@
picorv32 uses more advanced structures than simple LUT/FF tests (such as carry chains).
It attempts to provide some realistic timing data with respect to fanout and used fabric.


@ -0,0 +1,8 @@
#!/bin/bash
set -ex
source ../generate.sh
vivado -mode batch -source ../generate.tcl
timing_txt2csv


@ -0,0 +1,48 @@
source ../../../../../utils/utils.tcl
source ../../project.tcl
proc build_design {} {
create_project -force -part $::env(XRAY_PART) design design
read_verilog ../../../src/picorv32.v
read_verilog ../top.v
synth_design -top top
puts "Locking pins"
set_property LOCK_PINS {I0:A1 I1:A2 I2:A3 I3:A4 I4:A5 I5:A6} \
[get_cells -quiet -filter {REF_NAME == LUT6} -hierarchical]
puts "Package stuff"
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_00) IOSTANDARD LVCMOS33" [get_ports clk]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_01) IOSTANDARD LVCMOS33" [get_ports stb]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_02) IOSTANDARD LVCMOS33" [get_ports di]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_03) IOSTANDARD LVCMOS33" [get_ports do]
puts "pblocking"
create_pblock roi
set roipb [get_pblocks roi]
set_property EXCLUDE_PLACEMENT 1 $roipb
add_cells_to_pblock $roipb [get_cells roi]
resize_pblock $roipb -add "$::env(XRAY_ROI)"
puts "randplace"
randplace_pblock 50 roi
set_property CFGBVS VCCO [current_design]
set_property CONFIG_VOLTAGE 3.3 [current_design]
set_property BITSTREAM.GENERAL.PERFRAMECRC YES [current_design]
puts "dedicated route"
set_property CLOCK_DEDICATED_ROUTE FALSE [get_nets clk_IBUF]
place_design
route_design
write_checkpoint -force design.dcp
# disable combinatorial loop
# set_property IS_ENABLED 0 [get_drc_checks {LUTLP-1}]
#write_bitstream -force design.bit
}
build_design
write_info3


@ -0,0 +1,109 @@
//move some stuff to minitests/ncy0
`define SEED 32'h12345678
module top(input clk, stb, di, output do);
localparam integer DIN_N = 42;
localparam integer DOUT_N = 79;
reg [DIN_N-1:0] din;
wire [DOUT_N-1:0] dout;
reg [DIN_N-1:0] din_shr;
reg [DOUT_N-1:0] dout_shr;
always @(posedge clk) begin
din_shr <= {din_shr, di};
dout_shr <= {dout_shr, din_shr[DIN_N-1]};
if (stb) begin
din <= din_shr;
dout_shr <= dout;
end
end
assign do = dout_shr[DOUT_N-1];
roi #(.DIN_N(DIN_N), .DOUT_N(DOUT_N))
roi (
.clk(clk),
.din(din),
.dout(dout)
);
endmodule
module roi(input clk, input [DIN_N-1:0] din, output [DOUT_N-1:0] dout);
parameter integer DIN_N = -1;
parameter integer DOUT_N = -1;
/*
//Take out for now to make sure LUTs are more predictable
picorv32 picorv32 (
.clk(clk),
.resetn(din[0]),
.mem_valid(dout[0]),
.mem_instr(dout[1]),
.mem_ready(din[1]),
.mem_addr(dout[33:2]),
.mem_wdata(dout[66:34]),
.mem_wstrb(dout[70:67]),
.mem_rdata(din[33:2])
);
*/
/*
randluts randluts (
.din(din[41:34]),
.dout(dout[78:71])
);
*/
randluts #(.N(150)) randluts (
.din(din[41:34]),
.dout(dout[78:71])
);
endmodule
module randluts(input [7:0] din, output [7:0] dout);
parameter integer N = 250;
function [31:0] xorshift32(input [31:0] xorin);
begin
xorshift32 = xorin;
xorshift32 = xorshift32 ^ (xorshift32 << 13);
xorshift32 = xorshift32 ^ (xorshift32 >> 17);
xorshift32 = xorshift32 ^ (xorshift32 << 5);
end
endfunction
function [63:0] lutinit(input [7:0] a, b);
begin
lutinit[63:32] = xorshift32(xorshift32(xorshift32(xorshift32({a, b} ^ `SEED))));
lutinit[31: 0] = xorshift32(xorshift32(xorshift32(xorshift32({b, a} ^ `SEED))));
end
endfunction
wire [(N+1)*8-1:0] nets;
assign nets[7:0] = din;
assign dout = nets[(N+1)*8-1:N*8];
genvar i, j;
generate
for (i = 0; i < N; i = i+1) begin:is
for (j = 0; j < 8; j = j+1) begin:js
localparam integer k = xorshift32(xorshift32(xorshift32(xorshift32((i << 20) ^ (j << 10) ^ `SEED)))) & 255;
(* KEEP, DONT_TOUCH *)
LUT6 #(
.INIT(lutinit(i, j))
) lut (
.I0(nets[8*i+(k+0)%8]),
.I1(nets[8*i+(k+1)%8]),
.I2(nets[8*i+(k+2)%8]),
.I3(nets[8*i+(k+3)%8]),
.I4(nets[8*i+(k+4)%8]),
.I5(nets[8*i+(k+5)%8]),
.O(nets[8*i+8+j])
);
end
end
endgenerate
endmodule


@ -0,0 +1,2 @@
include ../project.mk


@ -0,0 +1,2 @@
LUTs are physically laid out in an array and directly connected to a test pattern generator.


@ -0,0 +1,92 @@
#!/usr/bin/env python
import argparse
parser = argparse.ArgumentParser(description='')
parser.add_argument('--sdx', default='8', help='')
parser.add_argument('--sdy', default='4', help='')
args = parser.parse_args()
'''
Generate in pairs
Fill up switchbox quad for now
Create random connections between the LUTs
See how much routing pressure we can generate
Start with non-random connections to the LFSR for solver comparison
Start at SLICE_X16Y102
'''
SBASE = (16, 102)
SDX = int(args.sdx, 0)
SDY = int(args.sdy, 0)
nlut = 4 * SDX * SDY
nin = 6 * nlut
nout = nlut
print('//placelut simple')
print('//SBASE: %s' % (SBASE, ))
print('//SDX: %s' % (SDX, ))
print('//SDY: %s' % (SDY, ))
print('//nlut: %s' % (nlut, ))
print(
'''\
module roi (
input wire clk,
input wire [%u:0] ins,
output wire [%u:0] outs);''') % (nin - 1, nout - 1)
ini = 0
outi = 0
for lutx in xrange(SBASE[0], SBASE[0] + SDX):
for luty in xrange(SBASE[1], SBASE[1] + SDY):
loc = "SLICE_X%uY%u" % (lutx, luty)
for belc in 'ABCD':
bel = '%c6LUT' % belc
print(
'''\
(* KEEP, DONT_TOUCH, LOC="%s", BEL="%s" *)
LUT6 #(
.INIT(64'hBAD1DEA_1DEADCE0)
) %s (''') % (loc, bel, 'lut_x%uy%u_%c' % (lutx, luty, belc))
for i in xrange(6):
print('''\
.I%u(ins[%u]),''' % (i, ini))
ini += 1
print('''\
.O(outs[%u]));''') % (outi, )
outi += 1
assert nin == ini
assert nout == outi
print(
'''
endmodule
module top(input wire clk, input wire stb, input wire di, output wire do);
localparam integer DIN_N = %u;
localparam integer DOUT_N = %u;
reg [DIN_N-1:0] din;
wire [DOUT_N-1:0] dout;
reg [DIN_N-1:0] din_shr;
reg [DOUT_N-1:0] dout_shr;
always @(posedge clk) begin
din_shr <= {din_shr, di};
dout_shr <= {dout_shr, din_shr[DIN_N-1]};
if (stb) begin
din <= din_shr;
dout_shr <= dout;
end
end
assign do = dout_shr[DOUT_N-1];
roi roi(
.clk(clk),
.ins(din),
.outs(dout)
);
endmodule''') % (nin, nout)


@ -0,0 +1,11 @@
#!/bin/bash
set -ex
source ${XRAY_GENHEADER}
TIMFUZ_DIR=$XRAY_DIR/fuzzers/007-timing
python ../generate.py --sdx 4 --sdy 4 >top.v
vivado -mode batch -source ../generate.tcl
python3 $TIMFUZ_DIR/timing_txt2csv.py --speed-json $TIMFUZ_DIR/speed/build/speed.json --out timing3.csv timing3.txt


@ -0,0 +1,47 @@
source ../../../../../utils/utils.tcl
source ../../project.tcl
proc build_design {} {
create_project -force -part $::env(XRAY_PART) design design
read_verilog top.v
synth_design -top top
puts "Locking pins"
set_property LOCK_PINS {I0:A1 I1:A2 I2:A3 I3:A4 I4:A5 I5:A6} \
[get_cells -quiet -filter {REF_NAME == LUT6} -hierarchical]
puts "Package stuff"
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_00) IOSTANDARD LVCMOS33" [get_ports clk]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_01) IOSTANDARD LVCMOS33" [get_ports stb]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_02) IOSTANDARD LVCMOS33" [get_ports di]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_03) IOSTANDARD LVCMOS33" [get_ports do]
puts "pblocking"
create_pblock roi
set roipb [get_pblocks roi]
set_property EXCLUDE_PLACEMENT 1 $roipb
add_cells_to_pblock $roipb [get_cells roi]
resize_pblock $roipb -add "$::env(XRAY_ROI)"
puts "randplace"
randplace_pblock 50 roi
set_property CFGBVS VCCO [current_design]
set_property CONFIG_VOLTAGE 3.3 [current_design]
set_property BITSTREAM.GENERAL.PERFRAMECRC YES [current_design]
puts "dedicated route"
set_property CLOCK_DEDICATED_ROUTE FALSE [get_nets clk_IBUF]
place_design
route_design
write_checkpoint -force design.dcp
# disable combinatorial loop
# set_property IS_ENABLED 0 [get_drc_checks {LUTLP-1}]
#write_bitstream -force design.bit
}
build_design
write_info3


@ -0,0 +1,2 @@
include ../project.mk


@ -0,0 +1,2 @@
LUTs are physically laid out in an array, connected to a test pattern generator, and fed back to other LUTs.


@ -0,0 +1,103 @@
#!/usr/bin/env python
'''
Note: vivado will (by default) fail bitgen DRC on LUT feedback loops
Looks like can probably be disabled, but we actually don't need a bitstream for timing analysis
'''
import argparse
import random
random.seed()
parser = argparse.ArgumentParser(description='')
parser.add_argument('--sdx', default='8', help='')
parser.add_argument('--sdy', default='4', help='')
args = parser.parse_args()
'''
Generate in pairs
Fill up switchbox quad for now
Create random connections between the LUTs
See how much routing pressure we can generate
Start with non-random connections to the LFSR for solver comparison
Start at SLICE_X16Y102
'''
SBASE = (16, 102)
SDX = int(args.sdx, 0)
SDY = int(args.sdy, 0)
nlut = 4 * SDX * SDY
nin = 6 * nlut
nout = nlut
print('//placelut w/ feedback')
print('//SBASE: %s' % (SBASE, ))
print('//SDX: %s' % (SDX, ))
print('//SDY: %s' % (SDY, ))
print('//nlut: %s' % (nlut, ))
print(
'''\
module roi (
input wire clk,
input wire [%u:0] ins,
output wire [%u:0] outs);''') % (nin - 1, nout - 1)
ini = 0
outi = 0
for lutx in xrange(SBASE[0], SBASE[0] + SDX):
for luty in xrange(SBASE[1], SBASE[1] + SDY):
loc = "SLICE_X%uY%u" % (lutx, luty)
for belc in 'ABCD':
bel = '%c6LUT' % belc
print(
'''\
(* KEEP, DONT_TOUCH, LOC="%s", BEL="%s" *)
LUT6 #(
.INIT(64'hBAD1DEA_1DEADCE0)
) %s (''') % (loc, bel, 'lut_x%uy%u_%c' % (lutx, luty, belc))
for i in xrange(6):
if random.randint(0, 9) < 1:
wfrom = 'ins[%u]' % ini
ini += 1
else:
wfrom = 'outs[%u]' % random.randint(0, nout - 1)
print('''\
.I%u(%s),''' % (i, wfrom))
print('''\
.O(outs[%u]));''') % (outi, )
outi += 1
#assert nin == ini
assert nout == outi
print(
'''
endmodule
module top(input wire clk, input wire stb, input wire di, output wire do);
localparam integer DIN_N = %u;
localparam integer DOUT_N = %u;
reg [DIN_N-1:0] din;
wire [DOUT_N-1:0] dout;
reg [DIN_N-1:0] din_shr;
reg [DOUT_N-1:0] dout_shr;
always @(posedge clk) begin
din_shr <= {din_shr, di};
dout_shr <= {dout_shr, din_shr[DIN_N-1]};
if (stb) begin
din <= din_shr;
dout_shr <= dout;
end
end
assign do = dout_shr[DOUT_N-1];
roi roi(
.clk(clk),
.ins(din),
.outs(dout)
);
endmodule''') % (nin, nout)


@ -0,0 +1,9 @@
#!/bin/bash
set -ex
source ../generate.sh
python ../generate.py --sdx 4 --sdy 4 >top.v
vivado -mode batch -source ../generate.tcl
timing_txt2csv


@ -0,0 +1,47 @@
source ../../../../../utils/utils.tcl
source ../../project.tcl
proc build_design {} {
create_project -force -part $::env(XRAY_PART) design design
read_verilog top.v
synth_design -top top
puts "Locking pins"
set_property LOCK_PINS {I0:A1 I1:A2 I2:A3 I3:A4 I4:A5 I5:A6} \
[get_cells -quiet -filter {REF_NAME == LUT6} -hierarchical]
puts "Package stuff"
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_00) IOSTANDARD LVCMOS33" [get_ports clk]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_01) IOSTANDARD LVCMOS33" [get_ports stb]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_02) IOSTANDARD LVCMOS33" [get_ports di]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_03) IOSTANDARD LVCMOS33" [get_ports do]
puts "pblocking"
create_pblock roi
set roipb [get_pblocks roi]
set_property EXCLUDE_PLACEMENT 1 $roipb
add_cells_to_pblock $roipb [get_cells roi]
resize_pblock $roipb -add "$::env(XRAY_ROI)"
puts "randplace"
randplace_pblock 50 roi
set_property CFGBVS VCCO [current_design]
set_property CONFIG_VOLTAGE 3.3 [current_design]
set_property BITSTREAM.GENERAL.PERFRAMECRC YES [current_design]
puts "dedicated route"
set_property CLOCK_DEDICATED_ROUTE FALSE [get_nets clk_IBUF]
place_design
route_design
write_checkpoint -force design.dcp
# disable combinatorial loop
# set_property IS_ENABLED 0 [get_drc_checks {LUTLP-1}]
#write_bitstream -force design.bit
}
build_design
write_info3


@ -0,0 +1,2 @@
include ../project.mk


@ -0,0 +1,3 @@
LUTs are physically laid out in an array and connected to test pattern generator and fed back to other LUTs.
FFs are randomly inserted between connections.


@ -0,0 +1,127 @@
#!/usr/bin/env python
'''
Note: vivado will (by default) fail bitgen DRC on LUT feedback loops
Looks like can probably be disabled, but we actually don't need a bitstream for timing analysis
ERROR: [Vivado 12-2285] Cannot set LOC property of instance 'roi/lut_x22y102_D', Instance roi/lut_x22y102_D can not be placed in D6LUT of site SLICE_X18Y103 because the bel is occupied by roi/lut_x18y103_D(port:). This could be caused by bel constraint conflict
Resolution: When using BEL constraints, ensure the BEL constraints are defined before the LOC constraints to avoid conflicts at a given site.
'''
import argparse
import random
random.seed()
parser = argparse.ArgumentParser(description='')
parser.add_argument('--sdx', default='8', help='')
parser.add_argument('--sdy', default='4', help='')
args = parser.parse_args()
'''
Generate in pairs
Fill up switchbox quad for now
Create random connections between the LUTs
See how much routing pressure we can generate
Start with non-random connections to the LFSR for solver comparison
Start at SLICE_X16Y102
'''
SBASE = (16, 102)
SDX = int(args.sdx, 0)
SDY = int(args.sdy, 0)
nlut = 4 * SDX * SDY
nin = 6 * nlut
nout = nlut
print('//placelut w/ FF + feedback')
print('//SBASE: %s' % (SBASE, ))
print('//SDX: %s' % (SDX, ))
print('//SDY: %s' % (SDY, ))
print('//nlut: %s' % (nlut, ))
print(
'''\
module roi (
input wire clk,
input wire [%u:0] ins,
output wire [%u:0] outs);''') % (nin - 1, nout - 1)
ini = 0
outi = 0
for lutx in xrange(SBASE[0], SBASE[0] + SDX):
for luty in xrange(SBASE[1], SBASE[1] + SDY):
loc = "SLICE_X%uY%u" % (lutx, luty)
for belc in 'ABCD':
bel = '%c6LUT' % belc
name = 'lut_x%uy%u_%c' % (lutx, luty, belc)
print(
'''\
(* KEEP, DONT_TOUCH, LOC="%s", BEL="%s" *)
LUT6 #(
.INIT(64'hBAD1DEA_1DEADCE0)
) %s (''') % (loc, bel, name)
for i in xrange(6):
rval = random.randint(0, 9)
if rval < 3:
wfrom = 'ins[%u]' % ini
ini += 1
#elif rval < 6:
# wfrom = 'outsr[%u]' % random.randint(0, nout - 1)
else:
wfrom = 'outs[%u]' % random.randint(0, nout - 1)
print('''\
.I%u(%s),''' % (i, wfrom))
out_w = name + '_o'
print('''\
.O(%s));''') % (out_w, )
outs_w = "outs[%u]" % outi
if random.randint(0, 9) < 5:
print(' assign %s = %s;' % (outs_w, out_w))
else:
out_r = name + '_or'
print(
'''\
reg %s;
assign %s = %s;
always @(posedge clk) begin
%s = %s;
end
''' % (out_r, outs_w, out_r, out_r, out_w))
outi += 1
#assert nin == ini
assert nout == outi
print(
'''
endmodule
module top(input wire clk, input wire stb, input wire di, output wire do);
localparam integer DIN_N = %u;
localparam integer DOUT_N = %u;
reg [DIN_N-1:0] din;
wire [DOUT_N-1:0] dout;
reg [DIN_N-1:0] din_shr;
reg [DOUT_N-1:0] dout_shr;
always @(posedge clk) begin
din_shr <= {din_shr, di};
dout_shr <= {dout_shr, din_shr[DIN_N-1]};
if (stb) begin
din <= din_shr;
dout_shr <= dout;
end
end
assign do = dout_shr[DOUT_N-1];
roi roi(
.clk(clk),
.ins(din),
.outs(dout)
);
endmodule''') % (nin, nout)


@ -0,0 +1,9 @@
#!/bin/bash
set -ex
source ../generate.sh
python ../generate.py --sdx 4 --sdy 4 >top.v
vivado -mode batch -source ../generate.tcl
timing_txt2csv


@ -0,0 +1,47 @@
source ../../../../../utils/utils.tcl
source ../../project.tcl
proc build_design {} {
create_project -force -part $::env(XRAY_PART) design design
read_verilog top.v
synth_design -top top
puts "Locking pins"
set_property LOCK_PINS {I0:A1 I1:A2 I2:A3 I3:A4 I4:A5 I5:A6} \
[get_cells -quiet -filter {REF_NAME == LUT6} -hierarchical]
puts "Package stuff"
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_00) IOSTANDARD LVCMOS33" [get_ports clk]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_01) IOSTANDARD LVCMOS33" [get_ports stb]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_02) IOSTANDARD LVCMOS33" [get_ports di]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_03) IOSTANDARD LVCMOS33" [get_ports do]
puts "pblocking"
create_pblock roi
set roipb [get_pblocks roi]
set_property EXCLUDE_PLACEMENT 1 $roipb
add_cells_to_pblock $roipb [get_cells roi]
resize_pblock $roipb -add "$::env(XRAY_ROI)"
puts "randplace"
randplace_pblock 50 roi
set_property CFGBVS VCCO [current_design]
set_property CONFIG_VOLTAGE 3.3 [current_design]
set_property BITSTREAM.GENERAL.PERFRAMECRC YES [current_design]
puts "dedicated route"
set_property CLOCK_DEDICATED_ROUTE FALSE [get_nets clk_IBUF]
place_design
route_design
write_checkpoint -force design.dcp
# disable combinatorial loop
# set_property IS_ENABLED 0 [get_drc_checks {LUTLP-1}]
#write_bitstream -force design.bit
}
build_design
write_info3


@ -0,0 +1,77 @@
# project.mk: build specimens (run vivado), compute rref
# corner.mk: run corner specific calculations
N := 1
SPECIMENS := $(addprefix specimen_,$(shell seq -f '%03.0f' $(N)))
SPECIMENS_OK := $(addsuffix /OK,$(SPECIMENS))
CSVS := $(addsuffix /timing3.csv,$(SPECIMENS))
TIMFUZ_DIR=$(XRAY_DIR)/fuzzers/007-timing
RREF_CORNER=slow_max
ALLOW_ZERO_EQN?=N
BADPRJ_OK?=N
TIMGRID_VCS=build/fast_max/timgrid-vc.json build/fast_min/timgrid-vc.json build/slow_max/timgrid-vc.json build/slow_min/timgrid-vc.json
all: build/timgrid-v.json
# make build/checksub first
build/fast_max/timgrid-vc.json: build/checksub
$(MAKE) -f $(TIMFUZ_DIR)/projects/corner.mk CORNER=fast_max
build/fast_min/timgrid-vc.json: build/checksub
$(MAKE) -f $(TIMFUZ_DIR)/projects/corner.mk CORNER=fast_min
build/slow_max/timgrid-vc.json: build/checksub
$(MAKE) -f $(TIMFUZ_DIR)/projects/corner.mk CORNER=slow_max
build/slow_min/timgrid-vc.json: build/checksub
$(MAKE) -f $(TIMFUZ_DIR)/projects/corner.mk CORNER=slow_min
$(SPECIMENS_OK):
bash generate.sh $(subst /OK,,$@) || (if [ "$(BADPRJ_OK)" != 'Y' ] ; then exit 1; fi; exit 0)
touch $@
run:
$(MAKE) clean
$(MAKE) all
touch run.ok
clean:
rm -rf specimen_[0-9][0-9][0-9]/ seg_clblx.segbits __pycache__ run.ok
rm -rf vivado*.log vivado_*.str vivado*.jou design *.bits *.dcp *.bit
rm -rf build
.PHONY: all run clean
# Normally require all projects to complete
# If BADPRJ_OK is allowed, only take projects that were successful
# FIXME: couldn't get call to work
exist_csvs = \
for f in $(CSVS); do \
if [ "$(BADPRJ_OK)" != 'Y' -o -f $$f ] ; then \
echo $$f; \
fi; \
done
# rref should be the same regardless of corner
build/sub.json: $(SPECIMENS_OK)
mkdir -p build
# Discover which variables can be separated
# This is typically the longest running operation
\
csvs=$$(for f in $(CSVS); do if [ "$(BADPRJ_OK)" != 'Y' -o -f $$f ] ; then echo $$f; fi; done) ; \
python3 $(TIMFUZ_DIR)/rref.py --corner $(RREF_CORNER) --simplify --out build/sub.json.tmp $$csvs
mv build/sub.json.tmp build/sub.json
build/grouped.csv: $(SPECIMENS_OK) build/sub.json
# Separate variables
\
csvs=$$(for f in $(CSVS); do if [ "$(BADPRJ_OK)" != 'Y' -o -f $$f ] ; then echo $$f; fi; done) ; \
python3 $(TIMFUZ_DIR)/csv_flat2group.py --sub-json build/sub.json --strict --out build/grouped.csv.tmp $$csvs
mv build/grouped.csv.tmp build/grouped.csv
build/checksub: build/grouped.csv build/sub.json
# Verify sub.json makes a cleanly solvable solution with no non-pivot leftover
python3 $(TIMFUZ_DIR)/checksub.py --sub-json build/sub.json build/grouped.csv
touch build/checksub
build/timgrid-v.json: $(TIMGRID_VCS)
python3 $(TIMFUZ_DIR)/timgrid_vc2v.py --out build/timgrid-v.json $(TIMGRID_VCS)


@ -0,0 +1,174 @@
proc pin_info {pin} {
set cell [get_cells -of_objects $pin]
set bel [get_bels -of_objects $cell]
set site [get_sites -of_objects $bel]
return "$site $bel"
}
proc pin_bel {pin} {
set cell [get_cells -of_objects $pin]
set bel [get_bels -of_objects $cell]
return $bel
}
# Changed to group wires and nodes
# This allows tracing the full path along with pips
proc write_info3 {} {
set outdir "."
set fp [open "$outdir/timing3.txt" w]
# bel as site/bel, so don't bother with site
puts $fp "net src_bel dst_bel ico fast_max fast_min slow_max slow_min pips inodes wires"
set TIME_start [clock clicks -milliseconds]
set verbose 0
set equations 0
set site_src_nets 0
set site_dst_nets 0
set neti 0
set nets [get_nets -hierarchical]
set nnets [llength $nets]
foreach net $nets {
incr neti
#if {$neti >= 10} {
# puts "Debug break"
# break
#}
puts "Net $neti / $nnets: $net"
# The semantics of get_pins -leaf is kind of odd
# When no passthrough LUTs exist, it has no effect
# When passthrough LUT present:
# -w/o -leaf: some pins + passthrough LUT pins
# -w/ -leaf: different pins + passthrough LUT pins
# With OUT filter this seems to be sufficient
set src_pin [get_pins -leaf -filter {DIRECTION == OUT} -of_objects $net]
set src_bel [pin_bel $src_pin]
set src_site [get_sites -of_objects $src_bel]
# Only one net driver
set src_site_pins [get_site_pins -filter {DIRECTION == OUT} -of_objects $net]
# Sometimes this is empty for reasons that escape me
# Emitting direction doesn't help
if {[llength $src_site_pins] < 1} {
if $verbose {
puts " Ignoring site internal net"
}
incr site_src_nets
continue
}
set dst_site_pins_net [get_site_pins -filter {DIRECTION == IN} -of_objects $net]
if {[llength $dst_site_pins_net] < 1} {
puts " Skipping site internal source net"
incr site_dst_nets
continue
}
foreach src_site_pin $src_site_pins {
if $verbose {
puts "Source: $src_pin at site $src_site:$src_bel, spin $src_site_pin"
}
# Run with and without interconnect only
foreach ico "0 1" {
set ico_flag ""
if $ico {
set ico_flag "-interconnect_only"
set delays [get_net_delays $ico_flag -of_objects $net]
} else {
set delays [get_net_delays -of_objects $net]
}
foreach delay $delays {
set delaystr [get_property NAME $delay]
set dst_pins [get_property TO_PIN $delay]
set dst_pin [get_pins $dst_pins]
#puts " $delaystr: $src_pin => $dst_pin"
set dst_bel [pin_bel $dst_pin]
set dst_site [get_sites -of_objects $dst_bel]
if $verbose {
puts " Dest: $dst_pin at site $dst_site:$dst_bel"
}
set dst_site_pins [get_site_pins -of_objects $dst_pin]
# Some nets are internal
# But should this happen on dest if we've already filtered source?
if {"$dst_site_pins" eq ""} {
continue
}
# Also, apparently you can have multiple of these as well
foreach dst_site_pin $dst_site_pins {
set fast_max [get_property "FAST_MAX" $delay]
set fast_min [get_property "FAST_MIN" $delay]
set slow_max [get_property "SLOW_MAX" $delay]
set slow_min [get_property "SLOW_MIN" $delay]
# Want:
# Site / BEL at src
# Site / BEL at dst
# Pips in between
# Walk net, looking for interesting elements in between
set pips [get_pips -of_objects $net -from $src_site_pin -to $dst_site_pin]
if $verbose {
foreach pip $pips {
puts " PIP $pip"
}
}
set nodes [get_nodes -of_objects $net -from $src_site_pin -to $dst_site_pin]
#set wires [get_wires -of_objects $net -from $src_site_pin -to $dst_site_pin]
set wires [get_wires -of_objects $nodes]
# puts $fp "$net $src_bel $dst_bel $ico $fast_max $fast_min $slow_max $slow_min $pips"
puts -nonewline $fp "$net $src_bel $dst_bel $ico $fast_max $fast_min $slow_max $slow_min"
# Write pips w/ speed index
puts -nonewline $fp " "
set needspace 0
foreach pip $pips {
if $needspace {
puts -nonewline $fp "|"
}
set speed_index [get_property SPEED_INDEX $pip]
puts -nonewline $fp "$pip:$speed_index"
set needspace 1
}
# Write nodes
#set nodes_str [string map {" " "|"} $nodes]
#puts -nonewline $fp " $nodes_str"
puts -nonewline $fp " "
set needspace 0
foreach node $nodes {
if $needspace {
puts -nonewline $fp "|"
}
set nwires [llength [get_wires -of_objects $node]]
puts -nonewline $fp "$node:$nwires"
set needspace 1
}
# Write wires
puts -nonewline $fp " "
set needspace 0
foreach wire $wires {
if $needspace {
puts -nonewline $fp "|"
}
set speed_index [get_property SPEED_INDEX $wire]
puts -nonewline $fp "$wire:$speed_index"
set needspace 1
}
puts $fp ""
incr equations
break
}
}
}
}
}
close $fp
set TIME_taken [expr [clock clicks -milliseconds] - $TIME_start]
puts "Took ms: $TIME_taken"
puts "Generated $equations equations"
puts "Skipped $site_src_nets (+ $site_dst_nets) site nets"
}

fuzzers/007-timing/rref.py

@ -0,0 +1,237 @@
#!/usr/bin/env python3
from timfuz import Benchmark, Ar_di2np, loadc_Ads_b, index_names, A_ds2np, simplify_rows
import numpy as np
import glob
import math
import json
import sympy
from collections import OrderedDict
from fractions import Fraction
def fracr_quick(r):
return [Fraction(numerator=int(x), denominator=1) for x in r]
def fracm_quick(m):
'''Convert integer matrix to Fraction matrix'''
t = type(m[0][0])
print('fracm_quick type: %s' % t)
return [fracr_quick(r) for r in m]
class State(object):
def __init__(self, Ads, drop_names=[]):
self.Ads = Ads
self.names = index_names(self.Ads)
# known zero delay elements
self.drop_names = set(drop_names)
# active names in rows
# includes sub variables, excludes variables that have been substituted out
self.base_names = set(self.names)
self.names = set(self.base_names)
# List of variable substitutions
# k => dict of v:n entries that it came from
self.subs = {}
self.verbose = True
def print_stats(self):
print("Stats")
print(" Substitutions: %u" % len(self.subs))
if self.subs:
print(
" Largest: %u" % max([len(x) for x in self.subs.values()]))
print(" Rows: %u" % len(self.Ads))
print(
" Cols (in): %u" % (len(self.base_names) + len(self.drop_names)))
print(" Cols (preprocessed): %u" % len(self.base_names))
print(" Drop names: %u" % len(self.drop_names))
print(" Cols (out): %u" % len(self.names))
print(" Solvable vars: %u" % len(self.names & self.base_names))
assert len(self.names) >= len(self.subs)
@staticmethod
def load(fn_ins, simplify=False, corner=None):
Ads, b = loadc_Ads_b(fn_ins, corner=corner, ico=True)
if simplify:
print('Simplifying corner %s' % (corner, ))
Ads, b = simplify_rows(Ads, b, remove_zd=False, corner=corner)
return State(Ads)
def write_state(state, fout):
j = {
'names': dict([(x, None) for x in state.names]),
'drop_names': list(state.drop_names),
'base_names': list(state.base_names),
'subs': dict([(name, values) for name, values in state.subs.items()]),
'pivots': state.pivots,
}
json.dump(j, fout, sort_keys=True, indent=4, separators=(',', ': '))
def Anp2matrix(Anp):
'''
Original idea was to make into a square matrix
but this loses too much information
so now this actually isn't doing anything and should probably be eliminated
'''
ncols = len(Anp[0])
A_ub2 = [np.zeros(ncols) for _i in range(ncols)]
dst_rowi = 0
for rownp in Anp:
A_ub2[dst_rowi] = np.add(A_ub2[dst_rowi], rownp)
dst_rowi = (dst_rowi + 1) % ncols
return A_ub2
def row_np2ds(rownp, names):
ret = {}
assert len(rownp) == len(names), (len(rownp), len(names))
for namei, name in enumerate(names):
v = rownp[namei]
if v:
ret[name] = v
return ret
def row_sym2dsf(rowsym, names):
'''Convert a sympy row into a dictionary of keys to (numerator, denominator) tuples'''
from sympy import fraction
ret = {}
assert len(rowsym) == len(names), (len(rowsym), len(names))
for namei, name in enumerate(names):
v = rowsym[namei]
if v:
(num, den) = fraction(v)
ret[name] = (int(num), int(den))
return ret
def state_rref(state, verbose=False):
print('Converting rows to integer keys')
names, Anp = A_ds2np(state.Ads)
print('np: %u rows x %u cols' % (len(Anp), len(Anp[0])))
if 0:
print('Combining rows into matrix')
mnp = Anp2matrix(Anp)
else:
mnp = Anp
print('Matrix: %u rows x %u cols' % (len(mnp), len(mnp[0])))
print('Converting np to sympy matrix')
mfrac = fracm_quick(mnp)
# doesn't seem to change anything
#msym = sympy.MutableSparseMatrix(mfrac)
msym = sympy.Matrix(mfrac)
# internal encoding has significant performance implications
#assert type(msym[0]) is sympy.Integer
if verbose:
print('names')
print(names)
print('Matrix')
sympy.pprint(msym)
print('Making rref')
rref, pivots = msym.rref(normalize_last=False)
if verbose:
print('Pivots')
sympy.pprint(pivots)
print('rref')
sympy.pprint(rref)
state.pivots = {}
def row_solved(rowsym, row_pivot):
for ci, c in enumerate(rowsym):
if ci == row_pivot:
continue
if c != 0:
return False
return True
#rrefnp = np.array(rref).astype(np.float64)
#print('Computing groups w/ rref %u row x %u col' % (len(rrefnp), len(rrefnp[0])))
#print(rrefnp)
# rows that have a single 1 are okay
# anything else requires substitution (unless all 0)
# pivots may be fewer than the rows
# remaining rows should be 0s
for row_i, row_pivot in enumerate(pivots):
rowsym = rref.row(row_i)
# yippee! nothing to report
if row_solved(rowsym, row_pivot):
continue
# a grouping
group_name = "GRP_%u" % row_i
rowdsf = row_sym2dsf(rowsym, names)
state.subs[group_name] = rowdsf
# Add the new variables
state.names.add(group_name)
# Remove substituted variables
# Note: variables may appear multiple times
state.names.difference_update(set(rowdsf.keys()))
pivot_name = names[row_pivot]
state.pivots[group_name] = pivot_name
if verbose:
print("%s (%s): %s" % (group_name, pivot_name, rowdsf))
return state
def run(fnout, fn_ins, simplify=False, corner=None, verbose=0):
print('Loading data')
assert len(fn_ins) > 0
state = State.load(fn_ins, simplify=simplify, corner=corner)
state_rref(state, verbose=verbose)
state.print_stats()
if fnout:
with open(fnout, 'w') as fout:
write_state(state, fout)
def main():
import argparse
parser = argparse.ArgumentParser(
description=
'Compute reduced row echelon form (RREF) to produce sub.json (variable groups)'
)
parser.add_argument('--verbose', action='store_true', help='')
parser.add_argument('--simplify', action='store_true', help='')
parser.add_argument('--corner', default="slow_max", help='')
parser.add_argument(
'--speed-json',
default='build_speed/speed.json',
help='Provides speed index to name translation')
parser.add_argument('--out', help='Output sub.json substitution result')
parser.add_argument('fns_in', nargs='*', help='timing3.csv input files')
args = parser.parse_args()
bench = Benchmark()
fns_in = args.fns_in
if not fns_in:
fns_in = glob.glob('specimen_*/timing3.csv')
try:
run(
fnout=args.out,
fn_ins=fns_in,
simplify=args.simplify,
corner=args.corner,
verbose=args.verbose)
finally:
print('Exiting after %s' % bench)
if __name__ == '__main__':
main()

View File

@ -0,0 +1,190 @@
#!/usr/bin/env python3
from timfuz import Benchmark, load_sub, corner_s2i, acorner2csv
import timfuz
import numpy as np
import math
import sys
import os
import time
import glob
import timfuz_solve
import scipy.optimize as optimize
def mkestimate(Anp, b):
'''
Ballpark upper bound estimate assuming variables contribute all of the delay in their respective row
Return the min over all of the occurrences
XXX: should this be corner adjusted?
'''
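# e.g. (illustrative): given rows x0 + x1 >= 10 and x0 >= 4, x0 starts at
# 1e3, is lowered to 10 by the first row, then to 4 by the second;
# x1 settles at 10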
cols = len(Anp[0])
x0 = np.array([1e3 for _x in range(cols)])
for row_np, row_b in zip(Anp, b):
for coli, val in enumerate(row_np):
if val:
# Scale by the number of occurrences
ub = row_b / val
if ub >= 0:
x0[coli] = min(x0[coli], ub)
return x0
def save(outfn, xvals, names, corner):
# ballpark minimum actual observed delay is around 7 (carry chain)
# anything less than one is probably a solver artifact
delta = 0.5
corneri = corner_s2i[corner]
# Round conservatively
roundf = {
'fast_max': math.ceil,
'fast_min': math.floor,
'slow_max': math.ceil,
'slow_min': math.floor,
}[corner]
print('Writing results')
skips = 0
keeps = 0
with open(outfn, 'w') as fout:
# write as one variable per line
# this natively forms a bound if fed into linprog solver
fout.write('ico,fast_max fast_min slow_max slow_min,rows...\n')
for xval, name in zip(xvals, names):
row_ico = 1
# also review ceil vs floor choice for min vs max
# let's be more conservative for now
if xval < delta:
#print('Skipping %s: %0.6f' % (name, xval))
skips += 1
continue
keeps += 1
#xvali = round(xval)
items = [str(row_ico), acorner2csv(roundf(xval), corneri)]
items.append('%u %s' % (1, name))
fout.write(','.join(items) + '\n')
print(
'Wrote: skipped %u, kept %u / %u valid delays' % (skips, keeps, len(names)))
assert keeps, 'Failed to estimate delay'
def run_corner(
Anp, b, names, corner, verbose=False, opts={}, meta={}, outfn=None):
# Given timing scores for above delays (-ps)
assert type(Anp[0]) is np.ndarray, type(Anp[0])
assert type(b) is np.ndarray, type(b)
#check_feasible(Anp, b)
'''
Be mindful of signs
Have something like
timing1, timing2 are constants
delay1 + delay2 + delay4 >= timing1
delay2 + delay3 >= timing2
But need it in compliant form:
-delay1 + -delay2 + -delay4 <= -timing1
-delay2 + -delay3 <= -timing2
'''
rows = len(Anp)
cols = len(Anp[0])
print('Unique delay elements: %d' % len(names))
print('Input paths')
print(' # timing scores: %d' % len(b))
print(' Rows: %d' % rows)
'''
You must have at least as many things to optimize as variables
That is, the system must be plausibly constrained for it to attempt a solve
If not, you'll get a message like
TypeError: Improper input: N=3 must not exceed M=2
'''
if rows < cols:
raise Exception("rows must be >= cols")
tlast = [None]
iters = [0]
printn = [0]
def progress_print():
iters[0] += 1
if tlast[0] is None:
tlast[0] = time.time()
if time.time() - tlast[0] > 1.0:
sys.stdout.write('I:%d ' % iters[0])
tlast[0] = time.time()
printn[0] += 1
if printn[0] % 10 == 0:
sys.stdout.write('\n')
sys.stdout.flush()
def func(params):
progress_print()
return (b - np.dot(Anp, params))
print('')
# Now find smallest values for delay constants
# Due to input bounds (ex: column limit), some delay elements may get eliminated entirely
print('Running leastsq w/ %d r, %d c (%d name)' % (rows, cols, len(names)))
# starting at 0 completes quickly, but gives a solution near 0 with terrible results
# so instead seed x0 with an estimate from the smallest net delay containing each variable
#x0 = np.array([1000.0 for _x in range(cols)])
print('Creating x0 estimate')
x0 = mkestimate(Anp, b)
print('Solving')
res = optimize.least_squares(func, x0, bounds=(0, float('inf')))
print('Done')
if outfn:
save(outfn, res.x, names, corner)
def main():
import argparse
parser = argparse.ArgumentParser(
description=
'Solve timing solution using least squares objective function')
parser.add_argument('--verbose', action='store_true', help='')
parser.add_argument(
'--massage',
action='store_true',
help='Derive additional constraints to improve solution')
parser.add_argument(
'--sub-json', help='Group substitutions to make fully ranked')
parser.add_argument('--corner', required=True, default="slow_max", help='')
parser.add_argument(
'--out', default=None, help='output timing delay .json')
parser.add_argument('fns_in', nargs='+', help='timing3.csv input files')
args = parser.parse_args()
# Store options in dict to ease passing through functions
bench = Benchmark()
fns_in = args.fns_in
if not fns_in:
fns_in = glob.glob('specimen_*/timing3.csv')
sub_json = None
if args.sub_json:
sub_json = load_sub(args.sub_json)
try:
timfuz_solve.run(
run_corner=run_corner,
sub_json=sub_json,
fns_in=fns_in,
corner=args.corner,
massage=args.massage,
outfn=args.out,
verbose=args.verbose)
finally:
print('Exiting after %s' % bench)
if __name__ == '__main__':
main()

View File

@ -0,0 +1,229 @@
#!/usr/bin/env python3
import scipy.optimize as optimize
from timfuz import Benchmark, load_sub, A_ub_np2d, acorner2csv, corner_s2i
import numpy as np
import glob
import json
import math
import sys
import os
import time
import timfuz_solve
def save(outfn, xvals, names, corner):
# ballpark minimum actual observed delay is around 7 (carry chain)
# anything less than one is probably a solver artifact
delta = 0.5
corneri = corner_s2i[corner]
roundf = {
'fast_max': math.ceil,
'fast_min': math.floor,
'slow_max': math.ceil,
'slow_min': math.floor,
}[corner]
print('Writing results')
zeros = 0
with open(outfn, 'w') as fout:
# write as one variable per line
# this natively forms a bound if fed into linprog solver
fout.write('ico,fast_max fast_min slow_max slow_min,rows...\n')
for xval, name in zip(xvals, names):
row_ico = 1
# FIXME: only report for the given corner?
# also review ceil vs floor choice for min vs max
# let's be more conservative for now
if xval < delta:
print('WARNING: near 0 delay on %s: %0.6f' % (name, xval))
zeros += 1
#continue
items = [str(row_ico), acorner2csv(roundf(xval), corneri)]
items.append('%u %s' % (1, name))
fout.write(','.join(items) + '\n')
nonzeros = len(names) - zeros
print(
'Wrote: %u / %u constrained delays, %u zeros' %
(nonzeros, len(names), zeros))
def run_corner(
Anp, b, names, corner, verbose=False, opts={}, meta={}, outfn=None):
if len(Anp) == 0:
print('WARNING: zero equations')
if outfn:
save(outfn, [], [], corner)
return
maxcorner = {
'slow_max': True,
'slow_min': False,
'fast_max': True,
'fast_min': False,
}[corner]
# Given timing scores for above delays (-ps)
assert type(Anp[0]) is np.ndarray, type(Anp[0])
assert type(b) is np.ndarray, type(b)
#check_feasible(Anp, b)
'''
Be mindful of signs
t1, t2: total delay constants
d1, d2..: variables to solve for
Max corner intuitive form:
d1 + d2 + d4 >= t1
d2 + d3 >= t2
But need it in compliant form:
-d1 + -d2 + -d4 <= -t1
-d2 + -d3 <= -t2
Minimize delay elements
Min corner intuitive form:
d1 + d2 + d4 <= t1
d2 + d3 <= t2
Maximize delay elements
'''
rows = len(Anp)
cols = len(Anp[0])
if maxcorner:
print('maxcorner => scaling to solution form...')
b_ub = -1.0 * b
#A_ub = -1.0 * Anp
A_ub = [-1.0 * x for x in Anp]
else:
print('mincorner => no scaling required')
b_ub = b
A_ub = Anp
print('Creating misc constants...')
# Minimization function scalars
# Treat all logic elements as equally important
if maxcorner:
# Best result are min delays
c = [1 for _i in range(len(names))]
else:
# Best result are max delays
c = [-1 for _i in range(len(names))]
# Delays cannot be negative
# (this is also the default constraint)
#bounds = [(0, None) for _i in range(len(names))]
# Also you can provide one to apply to all
bounds = (0, None)
# Seems to take about rows + 3 iterations
# Give some margin
#maxiter = int(1.1 * rows + 100)
#maxiter = max(1000, int(1000 * rows + 1000))
# Most of the time I want it to just keep going unless I ^C it
maxiter = 1000000
if verbose >= 2:
print('b_ub', b_ub)
print('Unique delay elements: %d' % len(names))
print(' # delay minimization weights: %d' % len(c))
print(' # delay constraints: %d' % len(bounds))
print('Input paths')
print(' # timing scores: %d' % len(b))
print(' Rows: %d' % rows)
tlast = [time.time()]
iters = [0]
printn = [0]
def callback(xk, **kwargs):
iters[0] = kwargs['nit']
if time.time() - tlast[0] > 1.0:
sys.stdout.write('I:%d ' % kwargs['nit'])
tlast[0] = time.time()
printn[0] += 1
if printn[0] % 10 == 0:
sys.stdout.write('\n')
sys.stdout.flush()
print('')
# Now find smallest values for delay constants
# Due to input bounds (ex: column limit), some delay elements may get eliminated entirely
# https://docs.scipy.org/doc/scipy-0.18.1/reference/generated/scipy.optimize.linprog.html
print('Running linprog w/ %d r, %d c (%d name)' % (rows, cols, len(names)))
res = optimize.linprog(
c,
A_ub=A_ub,
b_ub=b_ub,
bounds=bounds,
callback=callback,
options={
"disp": True,
'maxiter': maxiter,
'bland': True,
'tol': 1e-6,
})
nonzeros = 0
print('Ran %d iters' % iters[0])
if res.success:
print('Result sample (%d elements)' % (len(res.x)))
plim = 3
for xi, (name, x) in enumerate(zip(names, res.x)):
nonzero = x >= 0.001
if nonzero:
nonzeros += 1
#if nonzero and (verbose >= 1 or xi > 30):
if nonzero and (verbose or (
(nonzeros < 100 or nonzeros % 20 == 0) and nonzeros <= plim)):
print(' % 4u % -80s % 10.1f' % (xi, name, x))
print('Delay on %d / %d' % (nonzeros, len(res.x)))
if outfn:
save(outfn, res.x, names, corner)
def main():
import argparse
parser = argparse.ArgumentParser(
description=
'Solve timing solution using linear programming inequalities')
parser.add_argument('--verbose', action='store_true', help='')
parser.add_argument('--massage', action='store_true', help='')
parser.add_argument('--sub-csv', help='')
parser.add_argument(
'--sub-json', help='Group substitutions to make fully ranked')
parser.add_argument('--corner', required=True, default="slow_max", help='')
parser.add_argument(
'--out', default=None, help='output timing delay .json')
parser.add_argument('fns_in', nargs='+', help='timing3.csv input files')
args = parser.parse_args()
# Store options in dict to ease passing through functions
bench = Benchmark()
fns_in = args.fns_in
if not fns_in:
fns_in = glob.glob('specimen_*/timing3.csv')
sub_json = None
if args.sub_json:
sub_json = load_sub(args.sub_json)
try:
timfuz_solve.run(
run_corner=run_corner,
sub_json=sub_json,
sub_csv=args.sub_csv,
fns_in=fns_in,
corner=args.corner,
massage=args.massage,
outfn=args.out,
verbose=args.verbose)
finally:
print('Exiting after %s' % bench)
if __name__ == '__main__':
main()

2
fuzzers/007-timing/speed/.gitignore vendored Normal file
View File

@ -0,0 +1,2 @@
build

View File

@ -0,0 +1,12 @@
all: build/speed.json
build/node.txt: speed_json.py generate.tcl
mkdir -p build
cd build && vivado -mode batch -source ../generate.tcl
build/speed.json: build/node.txt
cd build && python ../speed_json.py speed_model.txt node.txt speed.json
clean:
rm -rf build

View File

@ -0,0 +1,4 @@
Generates speed.json, which describes the speed index to name translation.
These indices appear to be fixed between runs.
It is unknown how stable they are across Vivado versions.
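
The output has three top-level sections (speed_model, speed_type, cost_code),
as built by speed_json.py. A minimal sketch with illustrative entries
(SOME_WIRE_MODEL is a hypothetical name; fields abbreviated):

```
{
    "speed_model": {
        "SOME_WIRE_MODEL": {
            "name": "SOME_WIRE_MODEL",
            "speed_index": 123,
            "type": "wire"
        }
    },
    "speed_type": {
        "wire": {}
    },
    "cost_code": {
        "SLOWSINGLE": {
            "name": "SLOWSINGLE",
            "code": 4
        }
    }
}
```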

View File

@ -0,0 +1,173 @@
proc pin_info {pin} {
set cell [get_cells -of_objects $pin]
set bel [get_bels -of_objects $cell]
set site [get_sites -of_objects $bel]
return "$site $bel"
}
proc pin_bel {pin} {
set cell [get_cells -of_objects $pin]
set bel [get_bels -of_objects $cell]
return $bel
}
proc build_design_full {} {
create_project -force -part $::env(XRAY_PART) design design
read_verilog ../top.v
read_verilog ../../src/picorv32.v
synth_design -top top
#set_property LOCK_PINS {I0:A1 I1:A2 I2:A3 I3:A4 I4:A5 I5:A6} \
# [get_cells -quiet -filter {REF_NAME == LUT6} -hierarchical]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_00) IOSTANDARD LVCMOS33" [get_ports clk]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_01) IOSTANDARD LVCMOS33" [get_ports stb]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_02) IOSTANDARD LVCMOS33" [get_ports di]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_03) IOSTANDARD LVCMOS33" [get_ports do]
set_property CFGBVS VCCO [current_design]
set_property CONFIG_VOLTAGE 3.3 [current_design]
set_property BITSTREAM.GENERAL.PERFRAMECRC YES [current_design]
set_property CLOCK_DEDICATED_ROUTE FALSE [get_nets clk_IBUF]
place_design
route_design
write_checkpoint -force design.dcp
write_bitstream -force design.bit
}
proc build_design_synth {} {
create_project -force -part $::env(XRAY_PART) design design
read_verilog ../top.v
read_verilog ../picorv32.v
synth_design -top top
}
# WARNING: [Common 17-673] Cannot get value of property 'FORWARD' because this property is not valid in conjunction with other property setting on this object.
# WARNING: [Common 17-673] Cannot get value of property 'REVERSE' because this property is not valid in conjunction with other property setting on this object.
proc speed_models1 {} {
set outdir "."
set fp [open "$outdir/speed_model.txt" w]
# list_property [lindex [get_speed_models] 0]
set speed_models [get_speed_models]
set properties [list_property [lindex $speed_models 0]]
# "CLASS DELAY FAST_MAX FAST_MIN IS_INSTANCE_SPECIFIC NAME NAME_LOGICAL SLOW_MAX SLOW_MIN SPEED_INDEX TYPE"
puts $fp $properties
set needspace 0
foreach speed_model $speed_models {
foreach property $properties {
if $needspace {
puts -nonewline $fp " "
}
puts -nonewline $fp [get_property $property $speed_model]
set needspace 1
}
puts $fp ""
}
close $fp
}
proc speed_models2 {} {
set outdir "."
set fp [open "$outdir/speed_model.txt" w]
# list_property [lindex [get_speed_models] 0]
set speed_models [get_speed_models]
puts "Items: [llength $speed_models]"
set needspace 0
# Not all objects have the same properties
# But they do seem to have the same list
set properties [list_property [lindex $speed_models 0]]
foreach speed_model $speed_models {
set needspace 0
foreach property $properties {
set val [get_property $property $speed_model]
if {"$val" ne ""} {
if $needspace {
puts -nonewline $fp " "
}
puts -nonewline $fp "$property:$val"
set needspace 1
}
}
puts $fp ""
}
close $fp
}
# For cost codes
# Items: 2663055
# Hmm, too much
# Let's filter down to the items we really want
proc nodes_all {} {
set outdir "."
set fp [open "$outdir/node_all.txt" w]
set items [get_nodes]
puts "Items: [llength $items]"
set needspace 0
set properties [list_property [lindex $items 0]]
foreach item $items {
set needspace 0
foreach property $properties {
set val [get_property $property $item]
if {"$val" ne ""} {
if $needspace {
puts -nonewline $fp " "
}
puts -nonewline $fp "$property:$val"
set needspace 1
}
}
puts $fp ""
}
close $fp
}
# Only writes out items with unique cost codes
# (much faster)
proc nodes_unique_cc {} {
set outdir "."
set fp [open "$outdir/node.txt" w]
set items [get_nodes]
puts "Computing cost codes with [llength $items] items"
set needspace 0
set properties [list_property [lindex $items 0]]
set cost_codes_known [dict create]
set itemi 0
foreach item $items {
incr itemi
set cost_code [get_property COST_CODE $item]
if {[ dict exists $cost_codes_known $cost_code ]} {
continue
}
puts " Adding cost code $cost_code @ item $itemi"
dict set cost_codes_known $cost_code 1
set needspace 0
foreach property $properties {
set val [get_property $property $item]
if {"$val" ne ""} {
if $needspace {
puts -nonewline $fp " "
}
puts -nonewline $fp "$property:$val"
set needspace 1
}
}
puts $fp ""
}
close $fp
}
build_design_full
speed_models2
nodes_unique_cc

View File

@ -0,0 +1,94 @@
import json
def load_speed(fin):
speed_models = {}
speed_types = {}
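# Each input line is whitespace separated PROPERTY:VALUE pairs as written
# by generate.tcl, e.g. (illustrative values):
#   NAME:SOME_MODEL NAME_LOGICAL:SOME_MODEL SPEED_INDEX:123 TYPE:wire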
for l in fin:
delay = {}
l = l.strip()
for kvs in l.split():
name, value = kvs.split(':')
name = name.lower()
if name in ('class', ):
continue
if name in ('speed_index', ):
value = int(value)
if name == 'type':
speed_types.setdefault(value, {})
delay[name] = value
delayk = delay['name']
if delayk in speed_models:
raise Exception("Duplicate name")
if "name" in delay and "name_logical" in delay:
# Always true
if delay['name'] != delay['name_logical']:
raise Exception("nope!")
# Found a counter example
if 0 and delay['name'] != delay['forward']:
# ('BSW_NONTLFW_TLRV', '_BSW_LONG_NONTLFORWARD')
print(delay['name'], delay['forward'])
raise Exception("nope!")
# Found a counter example
if 0 and delay['forward'] != delay['reverse']:
# _BSW_LONG_NONTLFORWARD _BSW_LONG_TLREVERSE
print(delay['forward'], delay['reverse'])
raise Exception("nope!")
speed_models[delayk] = delay
return speed_models, speed_types
def load_cost_code(fin):
# COST_CODE:4 COST_CODE_NAME:SLOWSINGLE
cost_codes = {}
for l in fin:
lj = {}
l = l.strip()
for kvs in l.split():
name, value = kvs.split(':')
name = name.lower()
lj[name] = value
cost_code = {
'name': lj['cost_code_name'],
'code': int(lj['cost_code']),
# Hmm is this unique per type?
#'speed_class': int(lj['speed_class']),
}
cost_codes[cost_code['name']] = cost_code
return cost_codes
def run(speed_fin, node_fin, fout, verbose=0):
print('Loading data')
speed_models, speed_types = load_speed(speed_fin)
cost_codes = load_cost_code(node_fin)
j = {
'speed_model': speed_models,
'speed_type': speed_types,
'cost_code': cost_codes,
}
json.dump(j, fout, sort_keys=True, indent=4, separators=(',', ': '))
if __name__ == '__main__':
import argparse
parser = argparse.ArgumentParser(description='Timing fuzzer')
parser.add_argument('--verbose', type=int, help='')
parser.add_argument(
'speed_fn_in', default='/dev/stdin', nargs='?', help='Input file')
parser.add_argument(
'node_fn_in', default='/dev/stdin', nargs='?', help='Input file')
parser.add_argument(
'fn_out', default='/dev/stdout', nargs='?', help='Output file')
args = parser.parse_args()
run(
open(args.speed_fn_in, 'r'),
open(args.node_fn_in, 'r'),
open(args.fn_out, 'w'),
verbose=args.verbose)

View File

@ -0,0 +1,8 @@
module top(input clk, stb, di, output do);
reg dor;
always @(posedge clk) begin
dor <= stb & di;
end
assign do = dor;
endmodule

File diff suppressed because it is too large

View File

@ -0,0 +1,89 @@
#!/usr/bin/env python3
from timfuz import Benchmark
import json
def write_state(state, fout):
j = {
'names': dict([(x, None) for x in state.names]),
'drop_names': list(state.drop_names),
'base_names': list(state.base_names),
'subs': dict([(name, values) for name, values in state.subs.items()]),
'pivots': state.pivots,
}
json.dump(j, fout, sort_keys=True, indent=4, separators=(',', ': '))
def gen_rows(fn_ins):
for fn_in in fn_ins:
try:
print('Loading %s' % fn_in)
j = json.load(open(fn_in, 'r'))
group0 = list(j['subs'].values())[0]
value0 = list(group0.values())[0]
if type(value0) is float:
print("WARNING: skipping old format JSON")
continue
else:
print("Value OK")
for sub in j['subs'].values():
row_ds = {}
# TODO: convert to gcd
# den may not always be 1
# lazy solution: just multiply out all the fractions
n = 1
for _var, (_num, den) in sub.items():
n *= den
for var, (num, den) in sub.items():
num2 = n * num
assert num2 % den == 0
row_ds[var] = num2 // den
yield row_ds
except:
print("Error processing %s" % fn_in)
raise
def run(fnout, fn_ins, verbose=0):
print('Loading data')
with open(fnout, 'w') as fout:
fout.write('ico,fast_max fast_min slow_max slow_min,rows...\n')
for row_ds in gen_rows(fn_ins):
ico = '1'
out_b = [1e9, 1e9, 1e9, 1e9]
items = [ico, ' '.join(['%u' % x for x in out_b])]
for k, v in sorted(row_ds.items()):
items.append('%i %s' % (v, k))
fout.write(','.join(items) + '\n')
def main():
import argparse
parser = argparse.ArgumentParser(
description=
'Convert substitution groups into .csv to allow incremental rref results'
)
parser.add_argument('--verbose', action='store_true', help='')
parser.add_argument('--out', help='Output csv')
parser.add_argument('fns_in', nargs='+', help='sub.json input files')
args = parser.parse_args()
bench = Benchmark()
fns_in = args.fns_in
try:
run(fnout=args.out, fn_ins=args.fns_in, verbose=args.verbose)
finally:
print('Exiting after %s' % bench)
if __name__ == '__main__':
main()

View File

@ -0,0 +1,61 @@
#!/usr/bin/env python3
import timfuz
from timfuz import loadc_Ads_bs, Ads2bounds
import sys
import os
import time
import json
def run(fns_in, fnout, tile_json_fn, verbose=False):
# modified in place
tilej = json.load(open(tile_json_fn, 'r'))
for fnin in fns_in:
Ads, bs = loadc_Ads_bs([fnin], ico=True)
bounds = Ads2bounds(Ads, bs)
for tile in tilej['tiles'].values():
pips = tile['pips']
for k, v in pips.items():
pips[k] = bounds.get('PIP_' + v, [None, None, None, None])
wires = tile['wires']
for k, v in wires.items():
wires[k] = bounds.get('WIRE_' + v, [None, None, None, None])
timfuz.tilej_stats(tilej)
json.dump(
tilej,
open(fnout, 'w'),
sort_keys=True,
indent=4,
separators=(',', ': '))
def main():
import argparse
parser = argparse.ArgumentParser(
description=
'Substitute timgrid timing model names for real timing values')
parser.add_argument(
'--timgrid-s',
default='../../timgrid/build/timgrid-s.json',
help='tilegrid timing delay symbolic input (timgrid-s.json)')
parser.add_argument(
'--out',
default='build/timgrid-vc.json',
help='tilegrid timing delay values at corner (timgrid-vc.json)')
parser.add_argument(
'fn_ins', nargs='+', help='Input flattened timing csv (flat.json)')
args = parser.parse_args()
run(args.fn_ins, args.out, args.timgrid_s, verbose=False)
if __name__ == '__main__':
main()

View File

@ -0,0 +1,849 @@
#!/usr/bin/env python
import math
import numpy as np
from collections import OrderedDict
import time
import re
import os
import datetime
import json
import copy
import sys
import random
import glob
from fractions import Fraction
from benchmark import Benchmark
NAME_ZERO = set(
[
"BSW_CLK_ZERO",
"BSW_ZERO",
"B_ZERO",
"C_CLK_ZERO",
"C_DSP_ZERO",
"C_ZERO",
"I_ZERO",
"O_ZERO",
"RC_ZERO",
"R_ZERO",
])
# csv index
corner_s2i = OrderedDict(
[
('fast_max', 0),
('fast_min', 1),
('slow_max', 2),
('slow_min', 3),
])
# Equations are filtered out until nothing is left
class SimplifiedToZero(Exception):
pass
def allow_zero_eqns():
return os.getenv('ALLOW_ZERO_EQN', 'N') == 'Y'
def print_eqns(A_ubd, b_ub, verbose=0, lim=3, label=''):
rows = len(b_ub)
print('Sample equations (%s) from %d r' % (label, rows))
prints = 0
#verbose = 1
for rowi, row in enumerate(A_ubd):
if verbose or ((rowi < 10 or rowi % max(1, (rows / 20)) == 0) and
(not lim or prints < lim)):
line = ' EQN: p%u: ' % rowi
for k, v in sorted(row.items()):
line += '%u*t%d ' % (v, k)
line += '= %d' % b_ub[rowi]
print(line)
prints += 1
def print_name_eqns(A_ubd, b_ub, names, verbose=0, lim=3, label=''):
rows = len(b_ub)
print('Sample equations (%s) from %d r' % (label, rows))
prints = 0
#verbose = 1
for rowi, row in enumerate(A_ubd):
if verbose or ((rowi < 10 or rowi % max(1, (rows / 20)) == 0) and
(not lim or prints < lim)):
line = ' EQN: p%u: ' % rowi
for k, v in sorted(row.items()):
line += '%u*%s ' % (v, names[k])
line += '= %d' % b_ub[rowi]
print(line)
prints += 1
def print_names(names, verbose=1):
print('Names: %d' % len(names))
for xi, name in enumerate(names):
print(' % 4u % -80s' % (xi, name))
def invb(b_ub):
#return [-b for b in b_ub]
return -np.array(b_ub)
def check_feasible_d(A_ubd, b_ub, names):
A_ub, b_ub_inv = Ab_d2np(A_ubd, b_ub, names)
check_feasible(A_ub, b_ub_inv)
def check_feasible(A_ub, b_ub):
sys.stdout.write('Check feasible ')
sys.stdout.flush()
rows = len(b_ub)
cols = len(A_ub[0])
progress = max(1, rows / 100)
# Choose a high arbitrary value for x
# Delays should be on the order of ns, so a 10 ns delay should be way above anything realistic
xs = [10e3 for _i in range(cols)]
# FIXME: use the correct np function to do this for me
# Verify bounds
#b_res = np.matmul(A_ub, xs)
#print(type(A_ub), type(xs)
#A_ub = np.array(A_ub)
#xs = np.array(xs)
#b_res = np.matmul(A_ub, xs)
def my_mul(A_ub, xs):
#print('cols', cols
#print('rows', rows
ret = [None] * rows
for row in range(rows):
this = 0
for col in range(cols):
this += A_ub[row][col] * xs[col]
ret[row] = this
return ret
b_res = my_mul(A_ub, xs)
# Verify bound was respected
for rowi, (this_b, this_b_ub) in enumerate(zip(b_res, b_ub)):
if rowi % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
if this_b >= this_b_ub or this_b > 0:
print(
'% 4d Want res % 10.1f <= % 10.1f <= 0' %
(rowi, this_b, this_b_ub))
raise Exception("Bad ")
print(' done')
def Ab_ub_dt2d(eqns):
'''Convert dict using the rows as keys into a list of dicts + b_ub list (ie return A_ub, b_ub)'''
#return [dict(rowt) for rowt in eqns]
rows = [(dict(rowt), b) for rowt, b in eqns.items()]
A_ubd, b_ub = zip(*rows)
return list(A_ubd), list(b_ub)
# This significantly reduces runtime
def simplify_rows(Ads, b_ub, remove_zd=False, corner=None):
'''Remove duplicate equations, taking highest delay'''
# dict of constants to highest delay
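# e.g. (illustrative, slow_max corner): seeing the row {t0: 1, t1: 1} with b
# values of both 15 and 17 keeps a single copy with b = 17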
eqns = OrderedDict()
assert len(Ads) == len(b_ub), (len(Ads), len(b_ub))
assert corner is not None
minmax = {
'fast_max': max,
'fast_min': min,
'slow_max': max,
'slow_min': min,
}[corner]
# An outlier to make unknown values be ignored
T_UNK = {
'fast_max': 0,
'fast_min': 10e9,
'slow_max': 0,
'slow_min': 10e9,
}[corner]
sys.stdout.write('SimpR ')
sys.stdout.flush()
progress = int(max(1, len(b_ub) / 100))
zero_ds = 0
zero_es = 0
for loopi, (b, rowd) in enumerate(zip(b_ub, Ads)):
if loopi % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
# TODO: elements have zero delay (ex: COUT)
# Remove these for now since they make me nervous
# Although they should just solve to 0
if remove_zd and not b:
zero_ds += 1
continue
# I think I ran into these before when taking out ZERO elements
if len(rowd) == 0:
if b != 0:
assert zero_es == 0, 'Unexpected zero element row with non-zero delay'
zero_es += 1
continue
rowt = Ar_ds2t(rowd)
eqns[rowt] = minmax(eqns.get(rowt, T_UNK), b)
print(' done')
print(
'Simplify rows: %d => %d rows w/ zd %d, ze %d' %
(len(b_ub), len(eqns), zero_ds, zero_es))
if len(eqns) == 0:
raise SimplifiedToZero()
A_ubd_ret, b_ub_ret = Ab_ub_dt2d(eqns)
return A_ubd_ret, b_ub_ret
def A_ubr_np2d(row):
'''Convert a single row'''
#d = {}
d = OrderedDict()
for coli, val in enumerate(row):
if val:
d[coli] = val
return d
def A_ub_np2d(A_ub):
'''Convert A_ub entries in numpy matrix to dictionary / sparse form'''
Adi = [None] * len(A_ub)
for i, row in enumerate(A_ub):
Adi[i] = A_ubr_np2d(row)
return Adi
def Ar_di2np(row_di, cols):
rownp = np.zeros(cols)
for coli, val in row_di.items():
# Sign inversion due to way solver works
rownp[coli] = val
return rownp
# NOTE: sign inversion
def A_di2np(Adi, cols):
'''Convert A_ub entries in dictionary / sparse to numpy matrix form'''
return [Ar_di2np(row_di, cols) for row_di in Adi]
def Ar_ds2t(rowd):
'''Convert a dictionary row into a tuple with (column number, value) tuples'''
return tuple(sorted(rowd.items()))
def A_ubr_t2d(rowt):
'''Convert a tuple row of (column number, value) tuples back into an ordered dict'''
return OrderedDict(rowt)
def A_ub_d2t(A_ubd):
'''Convert rows as dicts to rows as tuples'''
return [Ar_ds2t(rowd) for rowd in A_ubd]
def A_ub_t2d(A_ubd):
'''Convert rows as tuples to rows as dicts'''
return [OrderedDict(rowt) for rowt in A_ubd]
def Ab_d2np(A_ubd, b_ub, names):
A_ub = A_di2np(A_ubd, len(names))
b_ub_inv = invb(b_ub)
return A_ub, b_ub_inv
def Ab_np2d(A_ub, b_ub_inv):
A_ubd = A_ub_np2d(A_ub)
b_ub = invb(b_ub_inv)
return A_ubd, b_ub
def sort_equations(Ads, b):
# Track rows with value column
# Hmm can't sort against np arrays
tosort = [(sorted(row.items()), rowb) for row, rowb in zip(Ads, b)]
#res = sorted(tosort, key=lambda e: e[0])
res = sorted(tosort)
A_ubtr, b_ubr = zip(*res)
return [OrderedDict(rowt) for rowt in A_ubtr], b_ubr
def derive_eq_by_row(A_ubd, b_ub, verbose=0, col_lim=0, tweak=False):
'''
Derive equations by subtracting whole rows
Given equations like:
t0 >= 10
t0 + t1 >= 15
t0 + t1 + t2 >= 17
When I look at these, I think of a solution something like:
t0 = 10
t1 = 5
t2 = 2
However, linprog tends to choose solutions like:
t0 = 17
t1 = 0
t2 = 0
To this end, add additional constraints by finding equations that are subsets of other equations
How to do this in a reasonable time span?
Also equations are sparse, which makes this harder to compute
'''
rows = len(A_ubd)
assert rows == len(b_ub)
# Index equations into hash maps so can lookup sparse elements quicker
assert len(A_ubd) == len(b_ub)
A_ubd_ret = copy.copy(A_ubd)
assert len(A_ubd) == len(A_ubd_ret)
#print('Finding subsets')
ltes = 0
scs = 0
b_ub_ret = list(b_ub)
sys.stdout.write('Deriving rows ')
sys.stdout.flush()
progress = max(1, rows / 100)
for row_refi, row_ref in enumerate(A_ubd):
if row_refi % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
if col_lim and len(row_ref) > col_lim:
continue
for row_cmpi, row_cmp in enumerate(A_ubd):
if row_refi == row_cmpi or col_lim and len(row_cmp) > col_lim:
continue
# FIXME: this check was supposed to be removed
'''
Every element in row_cmp is in row_ref
but this doesn't mean the constants are smaller
Filter these out
'''
# XXX: just reduce and filter out solutions with positive constants
# or actually are these also useful as is?
lte = lte_const(row_ref, row_cmp)
if lte:
ltes += 1
sc = 0 and shared_const(row_ref, row_cmp)
if sc:
scs += 1
if lte or sc:
if verbose:
print('')
print('match')
print(' ', row_ref, b_ub[row_refi])
print(' ', row_cmp, b_ub[row_cmpi])
# Reduce
A_new = reduce_const(row_ref, row_cmp)
# Did this actually significantly reduce the search space?
#if tweak and len(A_new) > 4 and len(A_new) > len(row_cmp) / 2:
if tweak and len(A_new) > 8 and len(A_new) > len(row_cmp) / 2:
continue
b_new = b_ub[row_refi] - b_ub[row_cmpi]
# Definitely possible
# Maybe filter these out if they occur?
if verbose:
print(b_new)
# Also inverted sign
if b_new <= 0:
if verbose:
print("Unexpected b")
continue
if verbose:
print('OK')
A_ubd_ret.append(A_new)
b_ub_ret.append(b_new)
print(' done')
#A_ub_ret = A_di2np(A_ubd2, cols=cols)
print(
'Derive row: %d => %d rows using %d lte, %d sc' %
(len(b_ub), len(b_ub_ret), ltes, scs))
assert len(A_ubd_ret) == len(b_ub_ret)
return A_ubd_ret, b_ub_ret
def derive_eq_by_col(A_ubd, b_ub, verbose=0):
'''
Derive equations by subtracting out all bounded constants (ie "known" columns)
'''
rows = len(A_ubd)
# Find and index equations with a single constraint
knowns = {}
sys.stdout.write('Derive col indexing ')
#A_ubd = A_ub_np2d(A_ub)
sys.stdout.flush()
progress = max(1, rows / 100)
for row_refi, row_refd in enumerate(A_ubd):
if row_refi % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
if len(row_refd) == 1:
k, v = list(row_refd.items())[0]
# Reduce any constants to canonical form
if v != 1:
row_refd[k] = 1
b_ub[row_refi] /= v
knowns[k] = b_ub[row_refi]
print(' done')
#knowns_set = set(knowns.keys())
print('%d constrained' % len(knowns))
'''
Now see what we can do
Rows that are already constrained: eliminate
TODO: maybe keep these if this would violate their constraint
Otherwise eliminate the original row and generate a simplified result now
'''
b_ub_ret = []
A_ubd_ret = []
sys.stdout.write('Derive col main ')
sys.stdout.flush()
progress = max(1, rows / 100)
for row_refi, row_refd in enumerate(A_ubd):
if row_refi % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
# Reduce as much as possible
#row_new = {}
row_new = OrderedDict()
b_new = b_ub[row_refi]
# Copy over single entries
if len(row_refd) == 1:
row_new = row_refd
else:
for k, v in row_refd.items():
if k in knowns:
# Remove column and take out corresponding delay
b_new -= v * knowns[k]
# Copy over
else:
row_new[k] = v
# Possibly reduced all usable constants out
if len(row_new) == 0:
continue
if b_new <= 0:
continue
A_ubd_ret.append(row_new)
b_ub_ret.append(b_new)
print(' done')
print('Derive col: %d => %d rows' % (len(b_ub), len(b_ub_ret)))
return A_ubd_ret, b_ub_ret
def col_dist(Ads, desc='of', names=[], lim=0):
'''Print frequency distribution of the number of elements in a given row'''
rows = len(Ads)
cols = len(names)
fs = {}
for row in Ads:
this_cols = len(row)
fs[this_cols] = fs.get(this_cols, 0) + 1
print(
'Col count distribution (%s) for %dr x %dc w/ %d freqs' %
(desc, rows, cols, len(fs)))
prints = 0
for i, (k, v) in enumerate(sorted(fs.items())):
if lim == 0 or (lim and prints < lim or i == len(fs) - 1):
print(' %d: %d' % (k, v))
prints += 1
if lim and prints == lim:
print(' ...')
def name_dist(A_ubd, desc='of', names=[], lim=0):
'''Print frequency distribution of the number of times an element appears'''
rows = len(A_ubd)
cols = len(names)
fs = {i: 0 for i in range(len(names))}
for row in A_ubd:
for k in row.keys():
fs[k] += 1
print('Name count distribution (%s) for %dr x %dc' % (desc, rows, cols))
prints = 0
for namei, name in enumerate(names):
if lim == 0 or (lim and prints < lim or namei == len(fs) - 1):
print(' %s: %d' % (name, fs[namei]))
prints += 1
if lim and prints == lim:
print(' ...')
fs2 = {}
for v in fs.values():
fs2[v] = fs2.get(v, 0) + 1
prints = 0
print('Distribution distribution (%d items)' % len(fs2))
for i, (k, v) in enumerate(sorted(fs2.items())):
if lim == 0 or (lim and prints < lim or i == len(fs2) - 1):
print(' %s: %s' % (k, v))
prints += 1
if lim and prints == lim:
print(' ...')
zeros = fs2.get(0, 0)
if zeros:
raise Exception("%d names without equation" % zeros)
def filter_ncols(A_ubd, b_ub, cols_min=0, cols_max=0):
'''Only keep equations with a few delay elements'''
A_ubd_ret = []
b_ub_ret = []
#print('Removing large rows')
for rowd, b in zip(A_ubd, b_ub):
if (not cols_min or len(rowd) >= cols_min) and (not cols_max or
len(rowd) <= cols_max):
A_ubd_ret.append(rowd)
b_ub_ret.append(b)
print(
'Filter ncols w/ %d <= cols <= %d: %d ==> %d rows' %
(cols_min, cols_max, len(b_ub), len(b_ub_ret)))
assert len(b_ub_ret)
return A_ubd_ret, b_ub_ret
def Ar_di2ds(rowA, names):
row = OrderedDict()
for k, v in rowA.items():
row[names[k]] = v
return row
def A_di2ds(Adi, names):
rows = []
for row_di in Adi:
rows.append(Ar_di2ds(row_di, names))
return rows
def Ar_ds2di(row_ds, names):
def keyi(name):
if name not in names:
names[name] = len(names)
return names[name]
row_di = OrderedDict()
for k, v in row_ds.items():
row_di[keyi(k)] = v
return row_di
def A_ds2di(rows):
names = OrderedDict()
A_ubd = []
for row_ds in rows:
A_ubd.append(Ar_ds2di(row_ds, names))
return list(names.keys()), A_ubd
def A_ds2np(Ads):
names, Adi = A_ds2di(Ads)
return names, A_di2np(Adi, len(names))
def loadc_Ads_mkb(fns, mkb, filt):
bs = []
Ads = []
for fn in fns:
with open(fn, 'r') as f:
# skip header
f.readline()
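# Data lines follow the header "ico,fast_max fast_min slow_max slow_min,rows..."
# e.g. (illustrative): 1,None None 123 None,1 WIRE_SOME_NODE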
for l in f:
cols = l.split(',')
ico = bool(int(cols[0]))
corners = cols[1]
vars = cols[2:]
def mkcorner(bstr):
if bstr == 'None':
return None
else:
return int(bstr)
corners = [mkcorner(corner) for corner in corners.split()]
def mkvar(x):
i, var = x.split()
return (var, int(i))
vars = OrderedDict([mkvar(var) for var in vars])
if not filt(ico, corners, vars):
continue
bs.append(mkb(corners))
Ads.append(vars)
return Ads, bs
def loadc_Ads_b(fns, corner, ico=None):
corner = corner or "slow_max"
corneri = corner_s2i[corner]
if ico is not None:
filt = lambda ico_, corners, vars: ico_ == ico
else:
filt = lambda ico_, corners, vars: True
def mkb(val):
return val[corneri]
return loadc_Ads_mkb(fns, mkb, filt)
def loadc_Ads_bs(fns, ico=None):
if ico is not None:
filt = lambda ico_, corners, vars: ico_ == ico
else:
filt = lambda ico_, corners, vars: True
def mkb(val):
return val
return loadc_Ads_mkb(fns, mkb, filt)
def loadc_Ads_raw(fns):
filt = lambda ico, corners, vars: True
def mkb(val):
return val
return loadc_Ads_mkb(fns, mkb, filt)
def index_names(Ads):
names = set()
for row_ds in Ads:
for k1 in row_ds.keys():
names.add(k1)
return names
def load_sub(fn):
j = json.load(open(fn, 'r'))
for name, vals in sorted(j['subs'].items()):
for k, v in vals.items():
vals[k] = Fraction(v[0], v[1])
return j
def row_sub_vars(row, sub_json, strict=False, verbose=False):
if 0 and verbose:
print("")
print(row.items())
delvars = 0
for k in sub_json['drop_names']:
try:
del row[k]
delvars += 1
except KeyError:
pass
if verbose:
print("Deleted %u variables" % delvars)
if verbose:
print('Checking pivots')
print(sorted(row.items()))
for group, pivot in sorted(sub_json['pivots'].items()):
if pivot not in row:
continue
n = row[pivot]
print(' pivot %u %s' % (n, pivot))
#pivots = set(sub_json['pivots'].values()).intersection(row.keys())
for group, pivot in sorted(sub_json['pivots'].items()):
if pivot not in row:
continue
# take the sub out n times
# note constants may be negative
n = row[pivot]
if verbose:
print('pivot %i %s' % (n, pivot))
for subk, subv in sorted(sub_json['subs'][group].items()):
oldn = row.get(subk, type(subv)(0))
rown = -n * subv
rown += oldn
if verbose:
print(" %s: %d => %d" % (subk, oldn, rown))
if rown == 0:
# only becomes zero if didn't previously exist
del row[subk]
if verbose:
print(" del")
else:
row[subk] = rown
row[group] = n
assert pivot not in row
# after all constants are applied, the row should end up positive?
# numeric precision issues previously limited this
# Ex: AssertionError: ('PIP_BSW_2ELSING0', -2.220446049250313e-16)
if strict:
# verify no subs are left
for subs in sub_json['subs'].values():
for sub in subs:
assert sub not in row, 'non-pivot element after group sub %s' % sub
# Verify all constants are positive
for k, v in sorted(row.items()):
assert v > 0, (k, v)
def run_sub_json(Ads, sub_json, strict=False, verbose=False):
'''
strict: complain if a sub doesn't go in evenly
'''
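# e.g. (illustrative): with pivots {'GRP_0': 'x0'} and
# subs {'GRP_0': {'x0': 1, 'x1': 1/2}} (Fractions as loaded by load_sub),
# the row {'x0': 2, 'x1': 1} rewrites to {'GRP_0': 2}: the pivot is
# replaced by its group and x1 cancels out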
nrows = 0
nsubs = 0
ncols_old = 0
ncols_new = 0
print('Subbing %u rows' % len(Ads))
prints = set()
for rowi, row in enumerate(Ads):
if 0 and verbose:
print(row)
if verbose:
print('')
print('Row %u w/ %u elements' % (rowi, len(row)))
row_orig = dict(row)
row_sub_vars(row, sub_json, strict=strict, verbose=verbose)
nrows += 1
if row_orig != row:
nsubs += 1
if verbose:
rowt = Ar_ds2t(row)
if rowt not in prints:
print('row', row)
prints.add(rowt)
ncols_old += len(row_orig)
ncols_new += len(row)
if verbose:
print('')
print("Sub: %u / %u rows changed" % (nsubs, nrows))
print("Sub: %u => %u non-zero row cols" % (ncols_old, ncols_new))
def print_eqns(Ads, b, verbose=0, lim=3, label=''):
rows = len(b)
print('Sample equations (%s) from %d r' % (label, rows))
prints = 0
for rowi, row in enumerate(Ads):
if verbose or ((rowi < 10 or rowi % max(1, (rows / 20)) == 0) and
(not lim or prints < lim)):
line = ' EQN: p%u: ' % rowi
for k, v in sorted(row.items()):
line += '%u*t%s ' % (v, k)
line += '= %d' % b[rowi]
print(line)
prints += 1
def print_eqns_np(A_ub, b_ub, verbose=0):
Adi = A_ub_np2d(A_ub)
print_eqns(Adi, b_ub, verbose=verbose)
def Ads2bounds(Ads, bs):
ret = {}
for row_ds, row_bs in zip(Ads, bs):
assert len(row_ds) == 1
k, v = list(row_ds.items())[0]
assert v == 1
ret[k] = row_bs
return ret
def instances(Ads):
ret = 0
for row_ds in Ads:
ret += sum(row_ds.values())
return ret
def acorner2csv(b, corneri):
corners = ["None" for _ in range(4)]
corners[corneri] = str(b)
return ' '.join(corners)
def corners2csv(bs):
assert len(bs) == 4
corners = ["None" if b is None else str(b) for b in bs]
return ' '.join(corners)
def tilej_stats(tilej):
stats = {}
for etype in ('pips', 'wires'):
tm = stats.setdefault(etype, {})
tm['net'] = 0
tm['solved'] = [0, 0, 0, 0]
tm['covered'] = [0, 0, 0, 0]
for tile in tilej['tiles'].values():
for etype in ('pips', 'wires'):
pips = tile[etype]
for k, v in pips.items():
stats[etype]['net'] += 1
for i in range(4):
if pips[k][i]:
stats[etype]['solved'][i] += 1
if pips[k][i] is not None:
stats[etype]['covered'][i] += 1
for corner, corneri in corner_s2i.items():
print('Corner %s' % corner)
for etype in ('pips', 'wires'):
net = stats[etype]['net']
solved = stats[etype]['solved'][corneri]
covered = stats[etype]['covered'][corneri]
print(
' %s: %u / %u solved, %u / %u covered' %
(etype, solved, net, covered, net))

View File

@ -0,0 +1,422 @@
#!/usr/bin/env python3
from timfuz import simplify_rows, print_eqns, print_eqns_np, sort_equations, col_dist, index_names
import numpy as np
import math
import sys
import datetime
import os
import time
import copy
from collections import OrderedDict
def lte_const(row_ref, row_cmp):
'''Return True if every constant in row_cmp appears in row_ref with no larger value'''
#return False
for k, vc in row_cmp.items():
vr = row_ref.get(k, None)
# Not in reference?
if vr is None:
return False
if vr < vc:
return False
return True
def shared_const(row_ref, row_cmp):
'''Return true if more constants are equal than not equal'''
#return False
matches = 0
unmatches = 0
ks = list(row_ref.keys()) + list(row_cmp.keys())
for k in ks:
vr = row_ref.get(k, None)
vc = row_cmp.get(k, None)
# At least one
if vr is not None and vc is not None:
if vc == vr:
matches += 1
else:
unmatches += 1
else:
unmatches += 1
# Will equation reduce if subtracted?
return matches > unmatches
def reduce_const(row_ref, row_cmp):
'''Subtract cmp constants from ref'''
#ret = {}
ret = OrderedDict()
ks = set(row_ref.keys())
ks.update(set(row_cmp.keys()))
for k in ks:
vr = row_ref.get(k, 0)
vc = row_cmp.get(k, 0)
res = vr - vc
if res:
ret[k] = res
return ret
def derive_eq_by_row(Ads, b, verbose=0, col_lim=0, tweak=False):
'''
Derive equations by subtracting whole rows
Given equations like:
t0 >= 10
t0 + t1 >= 15
t0 + t1 + t2 >= 17
When I look at these, I think of a solution something like:
t0 = 10
t1 = 5
t2 = 2
However, linprog tends to choose solutions like:
t0 = 17
t1 = 0
t2 = 0
To this end, add additional constraints by finding equations that are subsets of other equations
How to do this in a reasonable time span?
Also equations are sparse, which makes this harder to compute
'''
assert len(Ads) == len(b), 'Ads, b length mismatch'
rows = len(Ads)
# Index equations into hash maps so can lookup sparse elements quicker
assert len(Ads) == len(b)
Ads_ret = copy.copy(Ads)
assert len(Ads) == len(Ads_ret)
#print('Finding subsets')
ltes = 0
scs = 0
b_ret = list(b)
sys.stdout.write('Deriving rows (%u) ' % rows)
sys.stdout.flush()
progress = int(max(1, rows / 100))
for row_refi, row_ref in enumerate(Ads):
if row_refi % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
if col_lim and len(row_ref) > col_lim:
continue
for row_cmpi, row_cmp in enumerate(Ads):
if row_refi == row_cmpi or col_lim and len(row_cmp) > col_lim:
continue
# FIXME: this check was supposed to be removed
'''
Every element in row_cmp is in row_ref
but this doesn't mean the constants are smaller
Filter these out
'''
# XXX: just reduce and filter out solutions with positive constants
# or actually are these also useful as is?
lte = lte_const(row_ref, row_cmp)
if lte:
ltes += 1
sc = 0 and shared_const(row_ref, row_cmp)
if sc:
scs += 1
if lte or sc:
if verbose:
print('')
print('match')
print(' ', row_ref, b[row_refi])
print(' ', row_cmp, b[row_cmpi])
# Reduce
A_new = reduce_const(row_ref, row_cmp)
# Did this actually significantly reduce the search space?
#if tweak and len(A_new) > 4 and len(A_new) > len(row_cmp) / 2:
if tweak and len(A_new) > 8 and len(A_new) > len(row_cmp) / 2:
continue
b_new = b[row_refi] - b[row_cmpi]
# Definitely possible
# Maybe filter these out if they occur?
if verbose:
print(b_new)
# Also inverted sign
if b_new <= 0:
if verbose:
print("Unexpected b")
continue
if verbose:
print('OK')
Ads_ret.append(A_new)
b_ret.append(b_new)
print(' done')
#A_ub_ret = A_di2np(Ads2, cols=cols)
print(
'Derive row: %d => %d rows using %d lte, %d sc' %
(len(b), len(b_ret), ltes, scs))
assert len(Ads_ret) == len(b_ret)
return Ads_ret, b_ret
def derive_eq_by_near_row(Ads, b, verbose=0, col_lim=0, tweak=False):
'''
Derive equations by subtracting whole rows
Given equations like:
t0 >= 10
t0 + t1 >= 15
t0 + t1 + t2 >= 17
When I look at these, I think of a solution something like:
t0 = 10
t1 = 5
t2 = 2
However, linprog tends to choose solutions like:
t0 = 17
t1 = 0
t2 = 0
To this end, add additional constraints by finding equations that are subsets of other equations
How to do this in a reasonable time span?
Also equations are sparse, which makes this harder to compute
'''
rows = len(Ads)
assert rows == len(b)
rowdelta = int(rows / 2)
# Index equations into hash maps so can lookup sparse elements quicker
assert len(Ads) == len(b)
Ads_ret = copy.copy(Ads)
assert len(Ads) == len(Ads_ret)
#print('Finding subsets')
ltes = 0
scs = 0
b_ret = list(b)
sys.stdout.write('Deriving rows (%u) ' % rows)
sys.stdout.flush()
progress = int(max(1, rows / 100))
for row_refi, row_ref in enumerate(Ads):
if row_refi % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
if col_lim and len(row_ref) > col_lim:
continue
#for row_cmpi, row_cmp in enumerate(Ads):
for row_cmpi in range(max(0, row_refi - rowdelta),
min(len(Ads), row_refi + rowdelta)):
row_cmp = Ads[row_cmpi]
if row_refi == row_cmpi or col_lim and len(row_cmp) > col_lim:
continue
# FIXME: this check was supposed to be removed
'''
Every element in row_cmp is in row_ref
but this doesn't mean the constants are smaller
Filter these out
'''
# XXX: just reduce and filter out solutions with positive constants
# or actually are these also useful as is?
lte = lte_const(row_ref, row_cmp)
if lte:
ltes += 1
sc = 0 and shared_const(row_ref, row_cmp)
if sc:
scs += 1
if lte or sc:
if verbose:
print('')
print('match')
print(' ', row_ref, b[row_refi])
print(' ', row_cmp, b[row_cmpi])
# Reduce
A_new = reduce_const(row_ref, row_cmp)
# Did this actually significantly reduce the search space?
#if tweak and len(A_new) > 4 and len(A_new) > len(row_cmp) / 2:
#if tweak and len(A_new) > 8 and len(A_new) > len(row_cmp) / 2:
# continue
b_new = b[row_refi] - b[row_cmpi]
# Definitely possible
# Maybe filter these out if they occur?
if verbose:
print(b_new)
# Also inverted sign
if b_new <= 0:
if verbose:
print("Unexpected b")
continue
if verbose:
print('OK')
Ads_ret.append(A_new)
b_ret.append(b_new)
print(' done')
#A_ub_ret = A_di2np(Ads2, cols=cols)
print(
'Derive row: %d => %d rows using %d lte, %d sc' %
(len(b), len(b_ret), ltes, scs))
assert len(Ads_ret) == len(b_ret)
return Ads_ret, b_ret
def derive_eq_by_col(Ads, b_ub, verbose=0):
'''
Derive equations by subtracting out all bounded constants (ie "known" columns)
'''
rows = len(Ads)
# Find and index equations with a single constraint
knowns = {}
sys.stdout.write('Derive col indexing ')
sys.stdout.flush()
progress = max(1, rows / 100)
for row_refi, row_refd in enumerate(Ads):
if row_refi % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
if len(row_refd) == 1:
k, v = list(row_refd.items())[0]
# Reduce any constants to canonical form
if v != 1:
row_refd[k] = 1
b_ub[row_refi] /= v
knowns[k] = b_ub[row_refi]
print(' done')
#knowns_set = set(knowns.keys())
print('%d constrained' % len(knowns))
'''
Now see what we can do
Rows that are already constrained: eliminate
TODO: maybe keep these if this would violate their constraint
Otherwise eliminate the original row and generate a simplified result now
'''
b_ret = []
Ads_ret = []
sys.stdout.write('Derive col main ')
sys.stdout.flush()
progress = max(1, rows / 100)
for row_refi, row_refd in enumerate(Ads):
if row_refi % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
# Reduce as much as possible
#row_new = {}
row_new = OrderedDict()
b_new = b_ub[row_refi]
# Copy over single entries
if len(row_refd) == 1:
row_new = row_refd
else:
for k, v in row_refd.items():
if k in knowns:
# Remove column and take out corresponding delay
b_new -= v * knowns[k]
# Copy over
else:
row_new[k] = v
# Possibly reduced all usable constants out
if len(row_new) == 0:
continue
if b_new <= 0:
continue
Ads_ret.append(row_new)
b_ret.append(b_new)
print(' done')
print('Derive col: %d => %d rows' % (len(b_ub), len(b_ret)))
return Ads_ret, b_ret
# iteratively increasing column limit until all columns are added
def massage_equations(Ads, b, verbose=False, corner=None):
'''
Subtract equations from each other to generate additional constraints
Helps provide additional guidance to solver for realistic delays
Ex: given:
a >= 10
a + b >= 100
A valid solution is:
a = 100
However, a better solution is something like
a = 10
b = 90
This creates derived constraints to provide more realistic results
Equation pipeline
Some operations may generate new equations
Simplify after these to avoid unnecessary overhead on redundant constraints
Similarly some operations may eliminate equations, potentially eliminating a column (ie variable)
Remove these columns as necessary to speed up solving
'''
assert len(Ads) == len(b), 'Ads, b length mismatch'
def debug(what):
if verbose:
print('')
print_eqns(Ads, b, verbose=verbose, label=what, lim=20)
col_dist(Ads, what)
# XXX: check_feasible_d expects index-keyed rows and a names list; disabled here
#check_feasible_d(Ads, b)
# Try to (intelligently) subtract equations to generate additional constraints
# This helps avoid putting all delay in a single shared variable
dstart = len(b)
cols = len(index_names(Ads))
# Each iteration one more column is allowed until all columns are included
# (and the system is stable)
col_lim = 15
di = 0
while True:
print('')
n_orig = len(b)
print('Loop %d, lim %d' % (di + 1, col_lim))
# Meat of the operation
Ads, b = derive_eq_by_row(Ads, b, col_lim=col_lim, tweak=True)
debug("der_rows")
# Run another simplify pass since new equations may have overlap with original
Ads, b = simplify_rows(Ads, b, corner=corner)
print('Derive row: %d => %d equations' % (n_orig, len(b)))
debug("der_rows simp")
n_orig2 = len(b)
# Meat of the operation
Ads, b = derive_eq_by_col(Ads, b)
debug("der_cols")
# Run another simplify pass since new equations may have overlap with original
Ads, b = simplify_rows(Ads, b, corner=corner)
print('Derive col %d: %d => %d equations' % (di + 1, n_orig2, len(b)))
debug("der_cols simp")
# Doesn't help computation, but helps debugging
Ads, b = sort_equations(Ads, b)
debug("loop done")
col_dist(Ads, 'derive done iter %d, lim %d' % (di, col_lim), lim=12)
rows = len(Ads)
# possible that a new equation was generated and taken away, but close enough
if n_orig == len(b) and col_lim >= cols:
break
col_lim += col_lim // 5
di += 1
dend = len(b)
print('')
print('Derive net: %d => %d' % (dstart, dend))
print('')
# Was experimenting to see how much the higher order columns really help
# Helps debug readability
Ads, b = sort_equations(Ads, b)
debug("final (sorted)")
print('')
print('Massage final: %d => %d rows' % (dstart, dend))
return Ads, b

View File

@ -0,0 +1,174 @@
#!/usr/bin/env python3
from timfuz import simplify_rows, loadc_Ads_b, index_names, A_ds2np, run_sub_json, print_eqns, Ads2bounds, instances, SimplifiedToZero, allow_zero_eqns
from timfuz_massage import massage_equations
import numpy as np
import sys
def check_feasible(A_ub, b_ub):
'''
Put large timing constants into the equations
See if that would solve it
It's having trouble giving me solutions as this gets bigger
Make a terrible baseline guess to confirm we aren't doing something bad
'''
sys.stdout.write('Check feasible ')
sys.stdout.flush()
rows = len(b_ub)
cols = len(A_ub[0])
progress = max(1, rows / 100)
'''
Delays should be in order of ns, so a 10 ns delay should be way above what anything should be
Series can have several hundred delay elements
Max delay in ballpark
'''
xs = [1e9 for _i in range(cols)]
# FIXME: use the correct np function to do this for me
# Verify bounds
#b_res = np.matmul(A_ub, xs)
#print(type(A_ub), type(xs)
#A_ub = np.array(A_ub)
#xs = np.array(xs)
#b_res = np.matmul(A_ub, xs)
def my_mul(A_ub, xs):
#print('cols', cols
#print('rows', rows
ret = [None] * rows
for row in range(rows):
this = 0
for col in range(cols):
this += A_ub[row][col] * xs[col]
ret[row] = this
return ret
b_res = my_mul(A_ub, xs)
# Verify bound was respected
for rowi, (this_b, this_b_ub) in enumerate(zip(b_res, b_ub)):
if rowi % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
if this_b >= this_b_ub or this_b > 0:
print(
'% 4d Want res % 10.1f <= % 10.1f <= 0' %
(rowi, this_b, this_b_ub))
raise Exception("Bad ")
print(' done')
def filter_bounds(Ads, b, bounds, corner):
'''Given min variable delays, remove rows that won't constrain solution'''
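# e.g. (illustrative, a max corner): with bounds {'a': 10, 'b': 5},
# a row a + b >= 12 is dropped since the bounds already imply it,
# while a + b >= 20 can still tighten the solution and is kept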
#assert len(bounds) > 0
if 'max' in corner:
# Keep delays possibly larger than current bound
def keep(row_b, est):
return row_b > est
T_UNK = 0
elif 'min' in corner:
# Keep delays possibly smaller than current bound
def keep(row_b, est):
return row_b < est
T_UNK = 1e9
else:
assert 0
ret_Ads = []
ret_b = []
for row_ds, row_b in zip(Ads, b):
# some variables get estimated at 0
est = sum([bounds.get(k, T_UNK) * v for k, v in row_ds.items()])
# will this row potentially constrain us more?
if keep(row_b, est):
ret_Ads.append(row_ds)
ret_b.append(row_b)
return ret_Ads, ret_b
def run(
fns_in,
corner,
run_corner,
sub_json=None,
sub_csv=None,
dedup=True,
massage=False,
outfn=None,
verbose=False,
**kwargs):
print('Loading data')
Ads, b = loadc_Ads_b(fns_in, corner, ico=True)
# Remove duplicate rows
# is this necessary?
# maybe better to just add them into the matrix directly
if dedup:
oldn = len(Ads)
iold = instances(Ads)
Ads, b = simplify_rows(Ads, b, corner=corner)
print('Simplify %u => %u rows' % (oldn, len(Ads)))
print('Simplify %u => %u instances' % (iold, instances(Ads)))
if sub_json:
print('Sub: %u rows' % len(Ads))
iold = instances(Ads)
names_old = index_names(Ads)
run_sub_json(Ads, sub_json, verbose=verbose)
names = index_names(Ads)
print("Sub: %u => %u names" % (len(names_old), len(names)))
print('Sub: %u => %u instances' % (iold, instances(Ads)))
else:
names = index_names(Ads)
'''
Substitution .csv
Special .csv containing one variable per line
Used primarily for multiple optimization passes, such as different algorithms or additional constraints
'''
if sub_csv:
Ads2, b2 = loadc_Ads_b([sub_csv], corner, ico=True)
bounds = Ads2bounds(Ads2, b2)
assert len(bounds), 'Failed to load bounds'
rows_old = len(Ads)
Ads, b = filter_bounds(Ads, b, bounds, corner)
print(
'Filter bounds: %s => %s + %s rows' %
(rows_old, len(Ads), len(Ads2)))
Ads = Ads + Ads2
b = b + b2
assert len(Ads) or allow_zero_eqns()
assert len(Ads) == len(b), 'Ads, b length mismatch'
if verbose:
print('')
print_eqns(Ads, b, verbose=verbose)
#print
#col_dist(A_ubd, 'final', names)
if massage:
try:
Ads, b = massage_equations(Ads, b, corner=corner)
except SimplifiedToZero:
if not allow_zero_eqns():
raise
print('WARNING: simplified to zero equations')
Ads = []
b = []
print('Converting to numpy...')
names, Anp = A_ds2np(Ads)
run_corner(
Anp,
np.asarray(b),
names,
corner,
outfn=outfn,
verbose=verbose,
**kwargs)

1
fuzzers/007-timing/timgrid/.gitignore vendored Normal file
View File

@ -0,0 +1 @@
build

View File

@ -0,0 +1,12 @@
all: build/timgrid.json
build/timgrid.txt: generate.tcl
mkdir -p build
cd build && vivado -mode batch -source ../generate.tcl
build/timgrid.json: build/timgrid.txt
cd build && python3 ../tile_txt2json.py --speed-json ../../speed/build/speed.json timgrid.txt timgrid-s.json
clean:
rm -rf build

View File

@ -0,0 +1,16 @@
#!/bin/bash -x
source ${XRAY_GENHEADER}
vivado -mode batch -source ../generate.tcl
for x in design*.bit; do
${XRAY_BITREAD} -F $XRAY_ROI_FRAMES -o ${x}s -z -y $x
done
for x in design_*.bits; do
diff -u design.bits $x | grep '^[-+]bit' > ${x%.bits}.delta
done
python3 ../generate.py design_*.delta > tilegrid.json

View File

@ -0,0 +1,119 @@
proc build_project {} {
if 0 {
set grid_min_x -1
set grid_max_x -1
set grid_min_y -1
set grid_max_y -1
} {
set grid_min_x $::env(XRAY_ROI_GRID_X1)
set grid_max_x $::env(XRAY_ROI_GRID_X2)
set grid_min_y $::env(XRAY_ROI_GRID_Y1)
set grid_max_y $::env(XRAY_ROI_GRID_Y2)
}
create_project -force -part $::env(XRAY_PART) design design
read_verilog ../top.v
synth_design -top top
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_00) IOSTANDARD LVCMOS33" [get_ports clk]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_01) IOSTANDARD LVCMOS33" [get_ports di]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_02) IOSTANDARD LVCMOS33" [get_ports do]
set_property -dict "PACKAGE_PIN $::env(XRAY_PIN_03) IOSTANDARD LVCMOS33" [get_ports stb]
create_pblock roi
add_cells_to_pblock [get_pblocks roi] [get_cells roi]
resize_pblock [get_pblocks roi] -add "$::env(XRAY_ROI)"
set_property CFGBVS VCCO [current_design]
set_property CONFIG_VOLTAGE 3.3 [current_design]
set_property BITSTREAM.GENERAL.PERFRAMECRC YES [current_design]
set_param tcl.collectionResultDisplayLimit 0
set_property CLOCK_DEDICATED_ROUTE FALSE [get_nets clk_IBUF]
set luts [get_bels -of_objects [get_sites -of_objects [get_pblocks roi]] -filter {TYPE =~ LUT*} */A6LUT]
set selected_luts {}
set lut_index 0
# LOC one LUT (a "selected_lut") into each CLB segment configuration column (ie 50 per column)
# Also, if GRID_MIN/MAX is not defined, automatically create it based on used CLBs
# See caveat in README on automatic creation
foreach lut $luts {
set tile [get_tile -of_objects $lut]
set grid_x [get_property GRID_POINT_X $tile]
set grid_y [get_property GRID_POINT_Y $tile]
if [expr $grid_min_x < 0 || $grid_x < $grid_min_x] {set grid_min_x $grid_x}
if [expr $grid_max_x < 0 || $grid_x > $grid_max_x] {set grid_max_x $grid_x}
if [expr $grid_min_y < 0 || $grid_y < $grid_min_y] {set grid_min_y $grid_y}
if [expr $grid_max_y < 0 || $grid_y > $grid_max_y] {set grid_max_y $grid_y}
# 50 per column => 50, 100, 150, etc
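        # e.g. (illustrative names) SLICE_X4Y100/A6LUT matches, SLICE_X4Y23/A6LUT does not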
if [regexp "Y(0|[0-9]*[05]0)/" $lut] {
            set cell [get_cells roi/is\[$lut_index\].lut]
set_property LOC [get_sites -of_objects $lut] $cell
set lut_index [expr $lut_index + 1]
lappend selected_luts $lut
}
}
place_design
route_design
write_checkpoint -force design.dcp
write_bitstream -force design.bit
}
proc write_data {} {
if 0 {
set grid_min_x -1
set grid_max_x -1
set grid_min_y -1
set grid_max_y -1
} {
set grid_min_x $::env(XRAY_ROI_GRID_X1)
set grid_max_x $::env(XRAY_ROI_GRID_X2)
set grid_min_y $::env(XRAY_ROI_GRID_Y1)
set grid_max_y $::env(XRAY_ROI_GRID_Y2)
}
# Get all tiles in ROI, ie not just the selected LUTs
set tiles [get_tiles -filter "GRID_POINT_X >= $grid_min_x && GRID_POINT_X <= $grid_max_x && GRID_POINT_Y >= $grid_min_y && GRID_POINT_Y <= $grid_max_y"]
# Write tiles.txt with site metadata
set fp [open "timgrid.txt" w]
foreach tile $tiles {
set type [get_property TYPE $tile]
set grid_x [get_property GRID_POINT_X $tile]
set grid_y [get_property GRID_POINT_Y $tile]
set items {}
set wires [get_wires -of_objects $tile]
if [llength $wires] {
foreach wire $wires {
set name [get_property NAME $wire]
set speed_index [get_property SPEED_INDEX $wire]
lappend items wire $name $speed_index
}
}
set pips [get_pips -of_objects $tile]
if [llength $pips] {
foreach pip $pips {
set name [get_property NAME $pip]
set speed_index [get_property SPEED_INDEX $pip]
lappend items pip $name $speed_index
}
}
puts $fp "$type $tile $grid_x $grid_y $items"
}
close $fp
}
build_project
write_data

View File

@ -0,0 +1,80 @@
#!/usr/bin/env python3
import sys
import os
import time
import json
SI_NONE = 0xFFFF
def load_speed_json(f):
j = json.load(f)
# Index speed indexes to names
speed_i2s = {}
for k, v in j['speed_model'].items():
i = v['speed_index']
if i != SI_NONE:
speed_i2s[i] = k
return j, speed_i2s
def gen_tiles(fnin, speed_i2s):
for l in open(fnin):
# lappend items pip $name $speed_index
# puts $fp "$type $tile $grid_x $grid_y $items"
parts = l.strip().split()
tile_type, tile_name, grid_x, grid_y = parts[0:4]
grid_x, grid_y = int(grid_x), int(grid_y)
tuples = parts[4:]
assert len(tuples) % 3 == 0
pips = {}
wires = {}
for i in range(0, len(tuples), 3):
ttype, name, speed_index = tuples[i:i + 3]
name_local = name.split('/')[1]
{
'pip': pips,
'wire': wires,
}[ttype][name_local] = speed_i2s[int(speed_index)]
yield (tile_type, tile_name, grid_x, grid_y, pips, wires)
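# Example with hypothetical values: a timgrid.txt line like
#   CLBLL_L CLBLL_L_X2Y3 10 20 wire CLBLL_L_X2Y3/SOME_WIRE 950 pip CLBLL_L_X2Y3/SOME_PIP 65
# yields ('CLBLL_L', 'CLBLL_L_X2Y3', 10, 20, {'SOME_PIP': <model name>}, {'SOME_WIRE': <model name>})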
def run(fnin, fnout, speed_json_fn, verbose=False):
speedj, speed_i2s = load_speed_json(open(speed_json_fn, 'r'))
tiles = {}
for tile in gen_tiles(fnin, speed_i2s):
(tile_type, tile_name, grid_x, grid_y, pips, wires) = tile
this_dat = {'pips': pips, 'wires': wires}
if tile_type not in tiles:
tiles[tile_type] = this_dat
else:
if tiles[tile_type] != this_dat:
print(tile_name, tile_type)
print(this_dat)
print(tiles[tile_type])
assert 0
j = {'tiles': tiles}
json.dump(
j, open(fnout, 'w'), sort_keys=True, indent=4, separators=(',', ': '))
def main():
import argparse
    parser = argparse.ArgumentParser(description='Convert timgrid.txt tile data to .json')
parser.add_argument(
'--speed-json',
default='../../speed/build/speed.json',
help='Provides speed index to name translation')
    parser.add_argument('fnin', default=None, help='input timgrid.txt (as written by generate.tcl)')
parser.add_argument('fnout', default=None, help='output .json')
args = parser.parse_args()
run(args.fnin, args.fnout, speed_json_fn=args.speed_json, verbose=False)
if __name__ == '__main__':
main()

View File

@ -0,0 +1,49 @@
//Need at least one LUT per frame base address we want
`define N 100
module top(input clk, stb, di, output do);
localparam integer DIN_N = 6;
localparam integer DOUT_N = `N;
reg [DIN_N-1:0] din;
wire [DOUT_N-1:0] dout;
reg [DIN_N-1:0] din_shr;
reg [DOUT_N-1:0] dout_shr;
always @(posedge clk) begin
din_shr <= {din_shr, di};
dout_shr <= {dout_shr, din_shr[DIN_N-1]};
if (stb) begin
din <= din_shr;
dout_shr <= dout;
end
end
assign do = dout_shr[DOUT_N-1];
roi roi (
.clk(clk),
.din(din),
.dout(dout)
);
endmodule
module roi(input clk, input [5:0] din, output [`N-1:0] dout);
genvar i;
generate
for (i = 0; i < `N; i = i+1) begin:is
LUT6 #(
.INIT(64'h8000_0000_0000_0001 + (i << 16))
) lut (
.I0(din[0]),
.I1(din[1]),
.I2(din[2]),
.I3(din[3]),
.I4(din[4]),
.I5(din[5]),
.O(dout[i])
);
end
endgenerate
endmodule

View File

@ -0,0 +1,62 @@
`define N 100
module top(input clk, stb, di, output do);
localparam integer DIN_N = 8;
localparam integer DOUT_N = `N;
reg [DIN_N-1:0] din;
wire [DOUT_N-1:0] dout;
reg [DIN_N-1:0] din_shr;
reg [DOUT_N-1:0] dout_shr;
always @(posedge clk) begin
din_shr <= {din_shr, di};
dout_shr <= {dout_shr, din_shr[DIN_N-1]};
if (stb) begin
din <= din_shr;
dout_shr <= dout;
end
end
assign do = dout_shr[DOUT_N-1];
roi roi (
.clk(clk),
.din(din),
.dout(dout)
);
endmodule
module roi(input clk, input [7:0] din, output [`N-1:0] dout);
genvar i;
generate
for (i = 0; i < `N; i = i+1) begin:is
(* KEEP, DONT_TOUCH *)
RAMB36E1 #(.INIT_00(256'h0000000000000000000000000000000000000000000000000000000000000000)) ram (
.CLKARDCLK(din[0]),
.CLKBWRCLK(din[1]),
.ENARDEN(din[2]),
.ENBWREN(din[3]),
.REGCEAREGCE(din[4]),
.REGCEB(din[5]),
.RSTRAMARSTRAM(din[6]),
.RSTRAMB(din[7]),
.RSTREGARSTREG(din[0]),
.RSTREGB(din[1]),
.ADDRARDADDR(din[2]),
.ADDRBWRADDR(din[3]),
.DIADI(din[4]),
.DIBDI(din[5]),
.DIPADIP(din[6]),
.DIPBDIP(din[7]),
.WEA(din[0]),
.WEBWE(din[1]),
.DOADO(dout[0]),
.DOBDO(),
.DOPADOP(),
.DOPBDOP());
end
endgenerate
endmodule

View File

@ -0,0 +1,144 @@
#!/usr/bin/env python3
import sys
import os
import time
import json
from collections import OrderedDict
import timfuz
corner_s2i = OrderedDict(
[
('fast_max', 0),
('fast_min', 1),
('slow_max', 2),
('slow_min', 3),
])
corner2minmax = {
'fast_max': max,
'fast_min': min,
'slow_max': max,
'slow_min': min,
}
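# Merge rule implied by the table above: when the same element appears in several
# input files, *_max corners keep the larger estimate and *_min corners the
# smaller, so combined bounds only widen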
def build_tilejo(fnins):
'''
{
"tiles": {
"BRKH_B_TERM_INT": {
"pips": {},
"wires": {
"B_TERM_UTURN_INT_ER1BEG0": [
null,
null,
93,
null
],
'''
tilejo = {"tiles": {}}
for fnin in fnins:
tileji = json.load(open(fnin, 'r'))
for tilek, tilevi in tileji['tiles'].items():
# No previous data? Copy
tilevo = tilejo['tiles'].get(tilek, None)
if tilevo is None:
tilejo['tiles'][tilek] = tilevi
# Otherwise combine
else:
def process_type(etype):
for pipk, pipvi in tilevi[etype].items():
pipvo = tilevo[etype][pipk]
for cornerk, corneri in corner_s2i.items():
cornervo = pipvo[corneri]
cornervi = pipvi[corneri]
# no new data
if cornervi is None:
pass
# no previous data
elif cornervo is None:
pipvo[corneri] = cornervi
# combine
else:
minmax = corner2minmax[cornerk]
pipvo[corneri] = minmax(cornervi, cornervo)
process_type('pips')
process_type('wires')
return tilejo
def check_corner_minmax(tilej, verbose=False):
# Post processing pass looking for min/max inconsistencies
# Especially an issue due to complexities around under-constrained elements
# (ex: pivots set to 0 delay)
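    # e.g. (hypothetical values) an element reporting slow_min=120 but slow_max=100
    # gets the pair swapped below so that min <= max always holds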
print('Checking for min/max consistency')
checks = 0
bad = 0
for tilev in tilej['tiles'].values():
def process_type(etype):
nonlocal checks
nonlocal bad
for pipk, pipv in tilev[etype].items():
for corner in ('slow', 'fast'):
mini = corner_s2i[corner + '_min']
minv = pipv[mini]
maxi = corner_s2i[corner + '_max']
maxv = pipv[maxi]
if minv is not None and maxv is not None:
checks += 1
if minv > maxv:
if verbose:
print(
'WARNING: element %s %s min/max adjusted on corner %s'
% (etype, pipk, corner))
bad += 1
pipv[mini] = maxv
pipv[maxi] = minv
process_type('pips')
process_type('wires')
print('')
    print('minmax: %u / %u bad pairs adjusted' % (bad, checks))
timfuz.tilej_stats(tilej)
def check_corners_minmax(tilej, verbose=False):
# TODO: check fast vs slow
pass
def run(fnins, fnout, verbose=False):
tilejo = build_tilejo(fnins)
check_corner_minmax(tilejo)
check_corners_minmax(tilejo)
json.dump(
tilejo,
open(fnout, 'w'),
sort_keys=True,
indent=4,
separators=(',', ': '))
def main():
import argparse
parser = argparse.ArgumentParser(
description='Combine multiple tile corners into one .json file')
parser.add_argument(
'--out', required=True, help='Combined timgrid-v.json files')
parser.add_argument('fnins', nargs='+', help='Input timgrid-vc.json files')
args = parser.parse_args()
run(args.fnins, args.out, verbose=False)
if __name__ == '__main__':
main()

View File

@ -0,0 +1,293 @@
#!/usr/bin/env python3
from timfuz import Benchmark, A_di2ds
import glob
import math
import json
import sys
from collections import OrderedDict
# Speed index: some sort of special value
SI_NONE = 0xFFFF
# prefix to make it easier to track
# models do not overlap between PIPs and WIREs
PREFIX_W = 'WIRE_'
PREFIX_P = 'PIP_'
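# e.g. a wire speed model named SOME_MODEL (hypothetical name) is tracked as
# WIRE_SOME_MODEL, and a pip model of the same name as PIP_SOME_MODEL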
def parse_pip(s):
# Entries like
# CLK_BUFG_REBUF_X60Y117/CLK_BUFG_REBUF.CLK_BUFG_REBUF_R_CK_GCLK0_BOT<<->>CLK_BUFG_REBUF_R_CK_GCLK0_TOP
    # Convert to (site, instance, speed_index)
pipstr, speed_index = s.split(':')
speed_index = int(speed_index)
site, instance = pipstr.split('/')
#type, pip_junction, pip = others.split('.')
#return (site, type, pip_junction, pip)
    return site, instance, speed_index
def parse_node(s):
node, nwires = s.split(':')
return node, int(nwires)
def parse_wire(s):
# CLBLM_R_X3Y80/CLBLM_M_D6:952
wirestr, speed_index = s.split(':')
site, instance = wirestr.split('/')
return site, instance, int(speed_index)
# FIXME: these actually have a delay element
# Probably need to put these back in
def remove_virtual_pips(pips):
return pips
    # Disabled for now; re-enabling would also need an "import re" in this file
    #return filter(lambda pip: not re.match(r'CLBL[LM]_[LR]_', pip[0]), pips)
def load_timing3(f, name='file'):
# src_bel dst_bel ico fast_max fast_min slow_max slow_min pips
f.readline()
ret = []
bads = 0
for l in f:
# FIXME: hack
if 0 and 'CLK' in l:
continue
l = l.strip()
if not l:
continue
parts = l.split(' ')
# FIXME: deal with these nodes
if len(parts) != 11:
bads += 1
continue
net, src_bel, dst_bel, ico, fast_max, fast_min, slow_max, slow_min, pips, nodes, wires = parts
pips = pips.split('|')
nodes = nodes.split('|')
wires = wires.split('|')
ret.append(
{
'net': net,
'src_bel': src_bel,
'dst_bel': dst_bel,
'ico': int(ico),
# ps
'fast_max': int(fast_max),
'fast_min': int(fast_min),
'slow_max': int(slow_max),
'slow_min': int(slow_min),
'pips': remove_virtual_pips([parse_pip(pip) for pip in pips]),
'nodes': [parse_node(node) for node in nodes],
'wires': [parse_wire(wire) for wire in wires],
'line': l,
})
print(' load %s: %d bad, %d good' % (name, bads, len(ret)))
#assert 0
return ret
def load_speed_json(f):
j = json.load(f)
# Index speed indexes to names
speed_i2s = {}
for k, v in j['speed_model'].items():
i = v['speed_index']
if i != SI_NONE:
speed_i2s[i] = k
return j, speed_i2s
# Verify the nodes and wires really do line up
def vals2Adi_check(vals, names):
print('Checking')
for val in vals:
node_wires = 0
for _node, wiresn in val['nodes']:
node_wires += wiresn
assert node_wires == len(val['wires'])
print('Done')
assert 0
def vals2Adi(vals, speed_i2s, name_tr={}, name_drop=[], verbose=False):
def pip2speed(pip):
_site, _name, speed_index = pip
return PREFIX_P + speed_i2s[speed_index]
def wire2speed(wire):
_site, _name, speed_index = wire
return PREFIX_W + speed_i2s[speed_index]
# Want this ordered
names = OrderedDict()
print(
'Creating matrix w/ tr: %d, drop: %d' % (len(name_tr), len(name_drop)))
# Take sites out entirely using handy "interconnect only" option
#vals = filter(lambda x: str(x).find('SLICE') >= 0, vals)
# Highest count while still getting valid result
# First index all of the given pip types
    # Collected in an OrderedDict, then converted to a list to keep matrix order consistent
sys.stdout.write('Indexing delay elements ')
sys.stdout.flush()
    progress = max(1, len(vals) // 100)
for vali, val in enumerate(vals):
if vali % progress == 0:
sys.stdout.write('.')
sys.stdout.flush()
odl = [(pip2speed(pip), None) for pip in val['pips']]
names.update(OrderedDict(odl))
odl = [(wire2speed(wire), None) for wire in val['wires']]
names.update(OrderedDict(odl))
print(' done')
# Apply transform
orig_names = len(names)
for k in (list(name_drop) + list(name_tr.keys())):
if k in names:
del names[k]
else:
print('WARNING: failed to remove %s' % k)
names.update(OrderedDict([(name, None) for name in name_tr.values()]))
print('Names tr %d => %d' % (orig_names, len(names)))
# Make unique list
names = list(names.keys())
name_s2i = {}
for namei, name in enumerate(names):
name_s2i[name] = namei
if verbose:
for name in names:
print('NAME: ', name)
for name in name_drop:
print('DROP: ', name)
for l, r in name_tr.items():
print('TR: %s => %s' % (l, r))
# Now create a matrix with all of these delays
    # Each row maps delay element index => occurrence count
    # 2 means the element appears twice; missing keys mean absent
    # (a path could hit the same pip type twice)
print('Creating delay element matrix w/ %d names' % len(names))
Adi = [None for _i in range(len(vals))]
for vali, val in enumerate(vals):
def add_name(name):
if name in name_drop:
return
name = name_tr.get(name, name)
namei = name_s2i[name]
row_di[namei] = row_di.get(namei, 0) + 1
        # Start with 0 occurrences
#row = [0 for _i in range(len(names))]
row_di = {}
#print('pips: ', val['pips']
for pip in val['pips']:
add_name(pip2speed(pip))
for wire in val['wires']:
add_name(wire2speed(wire))
#A_ub.append(row)
Adi[vali] = row_di
return Adi, names
# TODO: load directly as Ads
# remove names_tr, names_drop
def vals2Ads(vals, speed_i2s, verbose=False):
Adi, names = vals2Adi(vals, speed_i2s, verbose=False)
return A_di2ds(Adi, names)
def load_Ads(speed_json_f, f_ins):
print('Loading data')
_speedj, speed_i2s = load_speed_json(speed_json_f)
vals = []
for avals in [load_timing3(f_in, name) for f_in, name in f_ins]:
vals.extend(avals)
Ads = vals2Ads(vals, speed_i2s)
def mkb(val):
return (
val['fast_max'], val['fast_min'], val['slow_max'], val['slow_min'])
b = [mkb(val) for val in vals]
ico = [val['ico'] for val in vals]
return Ads, b, ico
def run(speed_json_f, fout, f_ins, verbose=0, corner=None):
Ads, bs, ico = load_Ads(speed_json_f, f_ins)
fout.write('ico,fast_max fast_min slow_max slow_min,rows...\n')
for row_bs, row_ds, row_ico in zip(bs, Ads, ico):
        # like: 1,123 456 120 450,1 a,2 b
        # ico flag, then the four delay corners, then delay element count/name pairs
items = [str(row_ico), ' '.join([str(x) for x in row_bs])]
for k, v in sorted(row_ds.items()):
items.append('%u %s' % (v, k))
fout.write(','.join(items) + '\n')
def main():
import argparse
parser = argparse.ArgumentParser(
description=
        'Convert obscure timing3.txt into more readable but roughly equivalent timing3.csv'
)
parser.add_argument('--verbose', type=int, help='')
# made a bulk conversion easier...keep?
parser.add_argument(
'--auto-name', action='store_true', help='timing3.txt => timing3.csv')
parser.add_argument(
'--speed-json',
default='build_speed/speed.json',
help='Provides speed index to name translation')
parser.add_argument('--out', default=None, help='Output timing3.csv file')
parser.add_argument('fns_in', nargs='+', help='Input timing3.txt files')
args = parser.parse_args()
bench = Benchmark()
fnout = args.out
if fnout is None:
if args.auto_name:
assert len(args.fns_in) == 1
fnin = args.fns_in[0]
fnout = fnin.replace('.txt', '.csv')
assert fnout != fnin, 'Expect .txt in'
else:
# practically there are too many stray prints to make this work as expected
assert 0, 'File name required'
            #fnout = '/dev/stdout'  # unreachable while the assert above is in place
print("Writing to %s" % fnout)
fout = open(fnout, 'w')
fns_in = args.fns_in
if not fns_in:
fns_in = glob.glob('specimen_*/timing3.txt')
run(
speed_json_f=open(args.speed_json, 'r'),
fout=fout,
f_ins=[(open(fn_in, 'r'), fn_in) for fn_in in fns_in],
verbose=args.verbose)
if __name__ == '__main__':
main()

View File

@ -0,0 +1,102 @@
import re
def gen_nodes(fin):
for l in fin:
lj = {}
l = l.strip()
for kvs in l.split():
name, value = kvs.split(':')
'''
NAME:LIOB33_SING_X0Y199/IOB_IBUF0
IS_BAD:0
IS_COMPLETE:1
IS_GND:0 IS_INPUT_PIN:1 IS_OUTPUT_PIN:0 IS_PIN:1 IS_VCC:0
NUM_WIRES:2
PIN_WIRE:1
'''
if name in ('COST_CODE', 'SPEED_CLASS'):
value = int(value)
lj[name] = value
tile_type, xy, wname = re.match(
r'(.*)_(X[0-9]*Y[0-9]*)/(.*)', lj['NAME']).groups()
lj['tile_type'] = tile_type
lj['xy'] = xy
lj['wname'] = wname
lj['l'] = l
yield lj
def run(node_fin, verbose=0):
refnodes = {}
nodei = 0
for nodei, anode in enumerate(gen_nodes(node_fin)):
def getk(anode):
return anode['wname']
#return (anode['tile_type'], anode['wname'])
if nodei % 1000 == 0:
            print('Check node %d' % nodei)
# Existing node?
try:
refnode = refnodes[getk(anode)]
except KeyError:
# Set as reference
refnodes[getk(anode)] = anode
continue
        # Verify equivalence
for k in (
'SPEED_CLASS',
'COST_CODE',
'COST_CODE_NAME',
'IS_BAD',
'IS_COMPLETE',
'IS_GND',
'IS_VCC',
):
if k in refnode and k in anode:
                def fail():
                    print('Mismatch on %s' % k)
                    print(refnode[k], anode[k])
                    print(refnode['l'])
                    print(anode['l'])
#assert 0
                if k == 'SPEED_CLASS':
                    # Parameters known to affect SPEED_CLASS
                    # Verify at least one parameter is different
                    if refnode[k] != anode[k]:
                        for k2 in ('IS_PIN', 'IS_INPUT_PIN', 'IS_OUTPUT_PIN',
                                   'PIN_WIRE', 'NUM_WIRES'):
                            if refnode[k2] != anode[k2]:
                                break
                        else:
                            if 0:
                                print()
                            fail()
                elif refnode[k] != anode[k]:
                    print()
                    fail()
# A key in one but not the other?
elif k in refnode or k in anode:
assert 0
if __name__ == '__main__':
import argparse
parser = argparse.ArgumentParser(
description=
'Determines which info is consistent across nodes with the same name')
parser.add_argument('--verbose', type=int, help='')
parser.add_argument(
'node_fn_in', default='/dev/stdin', nargs='?', help='Input file')
args = parser.parse_args()
run(open(args.node_fn_in, 'r'), verbose=args.verbose)

View File

@ -0,0 +1,143 @@
#!/usr/bin/env python3
'''
Matrix solving performance tests: benchmarks sympy's rref() on random
matrices under different numeric encodings
'''
from timfuz import Benchmark, Ar_di2np, loadc_Ads_b, index_names, A_ds2np, simplify_rows
import numpy as np
import glob
import math
import json
import sympy
from collections import OrderedDict
from fractions import Fraction
import random
from sympy import Rational
def intr(r):
DELTA = 0.0001
for i, x in enumerate(r):
if type(x) is float:
xi = int(x)
assert abs(xi - x) < DELTA
r[i] = xi
def fracr(r):
intr(r)
return [Fraction(x) for x in r]
def fracm(m):
return [fracr(r) for r in m]
def symratr(r):
intr(r)
return [Rational(x) for x in r]
def symratm(m):
return [symratr(r) for r in m]
def intm(m):
[intr(r) for r in m]
return m
def create_matrix(rows, cols):
ret = np.zeros((rows, cols))
for rowi in range(rows):
for coli in range(cols):
ret[rowi][coli] = random.randint(1, 10)
return ret
def create_matrix_sparse(rows, cols):
ret = np.zeros((rows, cols))
for rowi in range(rows):
for coli in range(cols):
if random.randint(0, 5) < 1:
ret[rowi][coli] = random.randint(1, 10)
return ret
def run(
rows=35,
cols=200,
verbose=False,
encoding='np',
sparse=False,
normalize_last=True):
random.seed(0)
if sparse:
mnp = create_matrix_sparse(rows, cols)
else:
mnp = create_matrix(rows, cols)
#print(mnp[0])
if encoding == 'fraction':
msym = sympy.Matrix(fracm(mnp))
elif encoding == 'np':
msym = sympy.Matrix(mnp)
elif encoding == 'sympy':
msym = sympy.Matrix(symratm(mnp))
# this actually produces float results
elif encoding == 'int':
msym = sympy.Matrix(intm(mnp))
else:
assert 0, 'bad encoding: %s' % encoding
print(type(msym[0]), str(msym[0]))
    if verbose:
        print('Matrix')
        sympy.pprint(msym)
print(
'%s matrix, %u rows x %u cols, sparse: %s, normlast: %s' %
(encoding, len(mnp), len(mnp[0]), sparse, normalize_last))
bench = Benchmark()
try:
rref, pivots = msym.rref(normalize_last=normalize_last)
finally:
print('rref exiting after %s' % bench)
print(type(rref[0]), str(rref[0]))
if verbose:
print('Pivots')
sympy.pprint(pivots)
print('rref')
sympy.pprint(rref)
def main():
import argparse
parser = argparse.ArgumentParser(
description='Matrix solving performance tests')
parser.add_argument('--verbose', action='store_true', help='')
parser.add_argument('--sparse', action='store_true', help='')
    parser.add_argument('--rows', type=int, default=35, help='')
    parser.add_argument('--cols', type=int, default=200, help='')
    parser.add_argument('--normalize-last', type=int, default=1, help='')
parser.add_argument('--encoding', default='np', help='')
args = parser.parse_args()
run(
encoding=args.encoding,
rows=args.rows,
cols=args.cols,
sparse=args.sparse,
normalize_last=bool(args.normalize_last),
verbose=args.verbose)
if __name__ == '__main__':
main()

View File

@ -0,0 +1,86 @@
import re
def gen_wires(fin):
for l in fin:
lj = {}
l = l.strip()
for kvs in l.split():
name, value = kvs.split(':')
lj[name] = value
tile_type, xy, wname = re.match(
r'(.*)_(X[0-9]*Y[0-9]*)/(.*)', lj['NAME']).groups()
lj['tile_type'] = tile_type
lj['xy'] = xy
lj['wname'] = wname
lj['l'] = l
yield lj
def run(node_fin, verbose=0):
refnodes = {}
nodei = 0
for nodei, anode in enumerate(gen_wires(node_fin)):
def getk(anode):
return anode['wname']
            #return (anode['tile_type'], anode['wname'])
if nodei % 1000 == 0:
            print('Check node %d' % nodei)
# Existing node?
try:
refnode = refnodes[getk(anode)]
except KeyError:
# Set as reference
refnodes[getk(anode)] = anode
continue
k_invariant = (
'CAN_INVERT',
'IS_BUFFERED_2_0',
'IS_BUFFERED_2_1',
'IS_DIRECTIONAL',
'IS_EXCLUDED_PIP',
'IS_FIXED_INVERSION',
'IS_INVERTED',
'IS_PSEUDO',
'IS_SITE_PIP',
'IS_TEST_PIP',
)
k_varies = ('TILE', )
        # Verify equivalence
for k in k_invariant:
if k in refnode and k in anode:
                def fail():
                    print('Mismatch on %s' % k)
                    print(refnode[k], anode[k])
                    print(refnode['l'])
                    print(anode['l'])
#assert 0
if refnode[k] != anode[k]:
                    print()
fail()
# A key in one but not the other?
elif k in refnode or k in anode:
assert 0
elif k not in k_varies:
assert 0
if __name__ == '__main__':
import argparse
parser = argparse.ArgumentParser(
description=
'Determines which info is consistent across PIPs with the same name')
parser.add_argument('--verbose', type=int, help='')
parser.add_argument(
'node_fn_in', default='/dev/stdin', nargs='?', help='Input file')
args = parser.parse_args()
run(open(args.node_fn_in, 'r'), verbose=args.verbose)

View File

@ -0,0 +1,91 @@
import re
def gen_wires(fin):
for l in fin:
lj = {}
l = l.strip()
for kvs in l.split():
name, value = kvs.split(':')
lj[name] = value
tile_type, xy, wname = re.match(
r'(.*)_(X[0-9]*Y[0-9]*)/(.*)', lj['NAME']).groups()
lj['tile_type'] = tile_type
lj['xy'] = xy
lj['wname'] = wname
lj['l'] = l
yield lj
def run(node_fin, verbose=0):
refnodes = {}
nodei = 0
for nodei, anode in enumerate(gen_wires(node_fin)):
def getk(anode):
return anode['wname']
#return (anode['tile_type'], anode['wname'])
if nodei % 1000 == 0:
            print('Check node %d' % nodei)
# Existing node?
try:
refnode = refnodes[getk(anode)]
except KeyError:
# Set as reference
refnodes[getk(anode)] = anode
continue
k_invariant = (
'COST_CODE',
'IS_INPUT_PIN',
'IS_OUTPUT_PIN',
'IS_PART_OF_BUS',
'NUM_INTERSECTS',
'NUM_TILE_PORTS',
'SPEED_INDEX',
'TILE_PATTERN_OFFSET',
)
k_varies = (
'ID_IN_TILE_TYPE',
'IS_CONNECTED',
'NUM_DOWNHILL_PIPS',
'NUM_PIPS',
'NUM_UPHILL_PIPS',
'TILE_NAME',
)
        # Verify equivalence
for k in k_invariant:
if k in refnode and k in anode:
                def fail():
                    print('Mismatch on %s' % k)
                    print(refnode[k], anode[k])
                    print(refnode['l'])
                    print(anode['l'])
#assert 0
if refnode[k] != anode[k]:
                    print()
fail()
# A key in one but not the other?
elif k in refnode or k in anode:
assert 0
elif k not in k_varies:
assert 0
if __name__ == '__main__':
import argparse
parser = argparse.ArgumentParser(
description=
'Determines which info is consistent across wires with the same name')
parser.add_argument('--verbose', type=int, help='')
parser.add_argument(
'node_fn_in', default='/dev/stdin', nargs='?', help='Input file')
args = parser.parse_args()
run(open(args.node_fn_in, 'r'), verbose=args.verbose)