* SER_CLK support
* Update constids
* wip
* CLK_FEEDBACK
* Handle SER_CLK and SER_CLK_N
* clangformat
* Cleanup
* Use _ as separator for PLL CFGs
* Remove unused clocking cells
* Do not use same name for IO models
* Fix IDDR merge
* Cleanup
* Properly handle user global signals
* Move signal inversion in bitstream creation
* Start adding multi die support
* Display die location for pins used
* Do not use constant s as locations
* Cleanup SB_DRIVE handling
* Use DDR locations from chip database
* Place only in prefered die for now
* Set D2D
* Fixed typos
A segment router replaces the source-to-sink connection by
general-purpose PIPs with bus-branch segment network connections.
The problem arises when the source is connected to the sinks directly
without switching as in the case of LUT->DFF, such wires should be left
as is, which is what this PR does.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* gatemate: clock router
Co-authored-by: Miodrag Milanovic <mmicko@gmail.com>
* Re-add clock router pip binding
* Refactoring
* Require globals to use a BUFG
* Fix misunderstanding of GPIO/RAM clocking
* Add plane info to chipdb
* Force clock routing along a specific plane
* Remove overly-limiting condition
* Move clock router into its own file
* Clock router based on delay
* Refine clock router conditions
* More detailed clock routing output
* Clean up debug messages
* clangformat
---------
Co-authored-by: Miodrag Milanovic <mmicko@gmail.com>
Fill in the delays for PIP classes related to HCLK and IODELAY. Also:
- if clock routing fails, we try to use the next fastest mechanism - segment networks;
- fixing harmless typos.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Initial code for GateMate
* Initial work on forming bitstream
* Add CCF parsing
* Use CCF to set IO location
* Propagate errors
* Restructure code
* Add support for reading from config
* Start adding infrastructure for reading bitstream
* Fix script
* GPIO initial work
* Add IN1->RAM_O2 propagation
* Fixed typo
* Cleanup
* More parameter checks
* Add LVDS support
* Cleanup
* Keep just used connections for now
* Naive lut tree CPE pack
* Naive pack CC_DFF
* pack DFF fixes
* Handle MUX flags
* Fix DFF pack
* Prevent pass trough issues
* Cleanup
* Use device wrapper class
* Update due to API changes
* Use pin connection aliases
* Start work on BUFG support
* Fix CC_L2T5 pack
* Add CPE input inverters
* Constrain routes to have correct inversion state
* Add clock inversion pip
* Added MX2 and MX4 support
* Fix script
* BUFG support
* debug print if route found with wrong polarity
* Some CC_DFF improvements
* Create reproducible chip database
* Simplify inversion of special signals
* Few more DFF features
* Add forgotten virtual port renames
* Handle muxes with constant inputs
* Allow inversion for muxes
* cleanup
* DFF input can be constant
* init DFF only when needed
* cleanup
* Add basic PLL support
* Add some timings
* Add USR_RSTN support
* Display few more primitives
* Use pass trough signals to validate architecture data
* Use extra tile information from chip database
* Updates needed for a build system changes
* Implement SB_DRIVE support
* Properly named configuration bits
* autogenerated constids.inc
* small fix
* Initial code for CPE halfs
* Some cleanup
* make sure FFs are compatible
* reverted due to db change
* Merge DFF where applicable
* memory allocation issue
* fix
* better MX2
* ram_i handling
* Cleanup MX4
* Support latches
* compare L_D flag as well
* Move virtual pips
* Naive addf pack
* carry chains grouping
* Keep chip database reproducible
* split addf vectors
* Block CPEs when GPIO is used
* Prepare placement code
* RAM_I/RAM_O rewrite
* fix ram_i/o index
* Display RAM and add new primitives
* PLL wip code
* CC_PLL_ADV packing
* PLL handling cleanup
* Add PLL comments
* Keep only high fan-out BUFG
* Add skeleton for tests
* Utilize move_ram_o
* GPIO wip
* GPIO wip
* PLL fixes
* cleanup
* FF_OBF support
* Handle FF_IBF
* Make SLEW FAST if not defined as in latest p_r
* Make sure FF_OBF only driving GPIO
* Moved pll calc into separate file
* IDDR handling and started ODDR
* Route DDR input for CC_ODDR
* Notify error in case ODDR or IDDR are used but not with I/O pin
* cleanup for CC_USR_RSTN
* Extract proper RAM location for bitstream
* Code cleanup
* Allow auto place of pads
* Use clock source flag
* Configure GPIO clock signals
* Handle conflicting clk
* Use BUGF in proper order
* Connected CLK, works without but good for debugging
* CC_CFG_CTRL placement
* Group RAM data 40 bytes per row
* Write BRAM content
* RAM wip
* Use relative constraints from chipdb
* fix broken build
* Memory wip
* Handle custom clock for memories
* Support FIFO
* optimize move_ram_io
* Fix SR signal handling acorrding to findings
* set placer beta
* Pre place what we can
* Revert "debug print if route found with wrong polarity"
This reverts commit cf9ded2f18.
* Revert "Constrain routes to have correct inversion state"
This reverts commit 795c284d48.
* Remove virtual pips
* Implement post processing inversion
* ADDF add ability to route additional CO
* Merge two ADDFs in one CPE
* Added TODO
* clangformat
* Cleanup
* Add serdes handling in config file
* Cleanup
* Cleanup
* Cleanup
* Fix in PLL handling
* Fixed ADDF edge case
* No need for this
* Fix latch
* Sanity checks
* Support CC_BRAM_20K merge
* Start creating testing environment
* LVDS fixes
* Add connection helper
* Cleanup
* Fix tabs
* Formatting fix
* Remove optimization tests for now
* remove read_bitstream
* removed .c_str()
* Removed config parsing
* using snake_case
* Use bool_or_default where applicable
* refactored bitstream write code
* Add allow-unconstrained option
* Update DFF related messages
* Add clock constraint propagation
---------
Co-authored-by: Lofty <dan.ravensloft@gmail.com>
* Gowin. BUGFIX Use a separate net for segment gates
We use a temporary separate small network (typically 2 - 3 sinks) for
routing from the segment network source to the segment gate. This fixes
the rare but unpleasant case of self-intersection when a route to a gate
is routed using PIPs after the gate, this is no longer allowed when
using a separate small network.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix style.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
DLLDLY is the clock delay primitive that adjust the input clock
according to the DLLSTEP signal and outputs the delayed clock.
These primitives are associated with clock pins and are "tapped" between
the output of this IBUF and the clock networks, leaving the possibility
to connect to the original unshifted signal as well, although the latter
is not very practical because it is no longer possible to use fast
wires.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Gowin chips have an interesting mechanism - wires that run vertically
through several rows (at least 10) in each column of the chip. In each
row a particular wire has branches to the left and right, covering on
average 4 neighboring cells in the row. For lack of a better term, I
further call such a wire a segment.
So a segment can provide a direct connection in a local rectangle. There
are no special restrictions on the sinks, so segment networks can be
used for ClockEnable, LocalSetReset, as well as for LUT and DFF inputs.
The sources are not so simple - the sources can be the upper or lower
end of the segment, which in theory can lead to unfortunate consequences
if the signal is applied from both ends.
The matter is complicated by the fact that there are default
connections, i.e. in the absence of any set fuse the segment input is
still connected to something (VCC for example) and to disable the unused
end of the segment you need to set a special combination of fuses.
Taking into account which end of which segment is used is one of the
tasks of this router. In addition, segment ends can physically coincide
with PLL, DSP and BSRAM inputs, which can also lead to unexpected
effects. Some of these things are tracked when generating the base, some
in this router, some when packing in gowin_pack.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Adds the ability to use high-speed clock lines (together with CLKDIV2
type frequency dividers operating on them) as sieve signals for the
CLKIN and CLKFB inputs of the rPLL and PLLVR primitives (these cover the
full range of supported Gowin chips).
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Add I3C io buffer.
A buffer is added that can operate as a normal IOBUF in PUSH-PULL mode
or switch to open-drain IOBUF mode.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Turn a variable into a set of flags
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Adds output (MIPI_OBUF and MIPI_OBUF_A) and input (MIPI_IBUF) primitives
to allow the use of “real” MIPI (not emulation) ports capable of
operating in both HS and LP modes.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Due to the way CMake-generated Makefiles evaluate dependencies, this
calls the `.bba` generation custom command twice, which then fails as
they both use the same `.bba.new` file as an output and one of them
moves it first.
This broke builds using `make -j` but not builds using
`make -j nextpnr-himbaechel-example`.
This is useful for certain cross-compilation workloads, and to cache
rarely changing build products.
To use this functionality, build e.g. as follows:
cmake . -B build-export -DEXPORT_BBA_FILES=../bba-files -DARCH=all
cmake --build build-export -t nextpnr-all-bba
cmake . -B build-import -DIMPORT_BBA_FILES=../bba-files -DARCH=all
cmake --build build-import
Two user-visible changes were made:
* `-DUSE_RUST` is replaced with `-DBUILD_RUST`, by analogy with
`-DBUILD_PYTHON`
* `-DCOVERAGE` was removed as it doesn't work with either modern GCC
or Clang
Primarily, this commit makes both of them use the `BBAsm` functions
to build and compile `.bba` files.
In addition, Himbaechel targets are now aligned with the rest in
how they are configured: instead of having all uarches enabled with
all of the devices disabled (the opposite of the rest of nextpnr),
uarches must be enabled explicitly but they come with all devices
enabled (except for Xilinx, which does not have a list of devices).
While it served a purpose (granting the ability to build `.bba` files
separately from the rest of nextpnr), it made things excessively
convoluted, especially around paths.
This commit removes the ability to pre-generate chip databases. As far
as I know, I was the primary user of that feature. It can be added back
if there is demand for it.
In exchange the per-family `CMakeLists.txt` files are now much easier
to understand.
Erroneously created wires for specific IOs on the underside of some
chips.
Fixes https://github.com/YosysHQ/nextpnr/issues/1417
Also cosmetic edits.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Add the ability to place registers in IOB
IO blocks have registers: for input, for output and for OutputEnable
signal - IREG, OREG and TREG respectively.
Each of the registers has one implicit non-switched wire, which one
depends on the type of register (IREG has a Q wire, OREG has a D wire).
Although the registers can be activated independently of each other they
share the CLK, ClockEnable and LocalSetReset wires and this places
restrictions on the possible combinations of register types in a single
IO.
Register placement in IO blocks is enabled by specifying the command
line keys --vopt ireg_in_iob, --vopt oreg_in_iob, or --vopt ioreg_in_iob.
It should be noted that specifying these keys leads to attempts to place
registers in IO blocks, but no errors are generated in case of failure.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Registers in IO
Check for unconnected ports.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. IO regs. Verbose warnings.
If an attempt to place an FF in an IO block fails, issue a warning
detailing the reason for the failure, whether it is a register type
conflict, a network requirement violation, or a control signal conflict.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. BUGFIX. Fix FFs compatibility.
Flipflops with a fixed ClockEnable input cannot coexist with flipflops
with a variable one.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. FFs in IO. Changing diagnostic messages.
Placement modes are still specified by the command line keys
ireg_in_iob/oreg_in_iob/ioreg_in_iob, but also introduces more granular
control in the form of attributes at I/O ports:
(* NOIOBFF *) - registers are never placed in this IO,
(* IOBFF *) - registers must be placed in this IO, in case of failure
a warning (not an error) with the reason for nonplacement is issued,
_attribute_absence_ - no diagnostics will be issued: managed to place - good, failed - not bad either.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Registers in IO.
Change the logic for handling command line keys and attributes -
attributes allow routines to be placed in IO regardless of global mode.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Registers in IO. Fix style.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Adds additional restrictions on the first PIP after the clock source -
only connections to SPINEs are allowed. This allowed to correct the
behaviour of DQCEs since the latter can only disable/enable SPINEs.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* ng-ultra: new architecture
* Implementation as in D2 deliverable
* Support for nxdesignsuite-24.0.0.0-20240429T102300
* Save memory by directly outputing json
* Add support for bidirectional IOs
* cleanup
* Create BFRs properly
* Add IOM insertion
* Cleanup
* Block certain pips depending of DDFR mode
* Add LUT bypass to improve routability
* Add bypass for CSC mode of GCK
* Fix IOM case
* Initial memory support
* Better RF/XRF handling
* fix
* RF placement and legalization
* Disconnect non available ports for NX_RAM
* cleanup
* Add RFB/RAM context support for latest release
* Remove ports that must not be used
* Proper port used only on RFB
* Add structure for clock sinks
* Use cell type where applicable
* Add clock sinks for other cell types
* Validation check fixes
* Commented too restrictive placement
* Added more crossbar wire type
* Hande IO termination input
* Fail early due to NX tools limitation for now
* Validations and fixes for RAM I/Os
* Fix for latest version of tools
* Use ctx->idf where applicable
* warn if RAM ports are not actually used
* Fix IOM packing
* Fix CY packing
* Change how constants are handled on CY
* Post placement optimization for CY
* Address comments for PR
* pack and export GCK, WFG and PLL
* Cover more global routing cases
* Constraing to location if provided
* Place at LOC
* Pack and export DSP
* wip
* wip
* notes
* wip
* wip
* Validate DSPs
* DSP cascading
* Check mandatory parameters for DSP
* existing gck
* wip
* export all the rest for bitstream
* CDC packing
* add more sinks
* place FIFO
* map rest of FIFO ports
* enable pll by default
* cleanup
* Initial XLUT support
* Fix statistics
* Properly duplicate GCKs
* RRSTO and WRSTO are not used on XFIFO
* Fix for latest version of JSON format
* Implement GCK limitations
* cleanup
* cleanup
* Add more signals and use lowskew name
* cleanup code a bit
* Fix wfb
* detect cascaded GCKs
* Handle DFR
* Route dfr clock properly
* Cleanup
* Cleanup bitstream code
* Review issues addressed
* Move helper routines
* Expose private members for unit tests
* cleanup
* remove scale factor
* make all location helper arrays static
* Addressed review comments
* Support post-routing CSC and SCC
* Support NX_BFF
* Place CSS and SCC only on allowed locations
* Support latest Impulse
* ng_ultra: Expand bounding box further for left-edge IO
Signed-off-by: gatecat <gatecat@ds0.me>
* Export all IO parameters in bitstream
* Handle new CSV order or parameters and additional validation
* Add some more undocumented values for CSV
* Support for old and new CSV formats
* Initial DDFR support
* Display warning message once per file
* Address review issues
* Fix crash on memory access
* Make boundbox fit NG-Ultra internal design
* Update attributes after dff rewrite
* Implement basic NG-Ultra LUT-DFF unit tests
* Always use first seen xbar input
Signed-off-by: gatecat <gatecat@ds0.me>
* Simplified crossbar pip detection
* Change order to prevent issues with some unconnected constants
* Pack LUT and multiple DFF in stripe
* Place DFF chains
* Improve large DFF chains
* Rename to pack_dff_chains
* Better use XLUTs when possible
* pack output DFF together with XLUT
* option to disable XLUT optimiziations
* Make more optimizations optional
* fix to use pre-increment
* GCK for lowskew signals
* Bugfix for nets that are not part of lowskew network
* Fix bitstream export for PLL cell
* Remove separate route lowskew
* Allow WFG mode 2
* Merge inverter into GCK
* Add CSC per TILE when needed
* Improve reusage of existing cell for CSC
* Take preferred CSC
* Cleanup
* When in place CSC size not important
* Cleanup
* Reset and Load restriction
* make csc optimisation optional
* Proper count for IO resources
* Detect when there is no next cell for DSP chain
* Do not incorporate loops in XLUT
* Check if output exists
* Update copyright for delivery
* Make building NG-Ultra chip database optional, follow filename convention
* Ported drawing code to new API
* Update expandBoundingBox for NG-Ultra
* Copyright and license update
* Add README information
* cleanup and constids
* Using ctx->idf where applicable
* remove if_using_basecluster
* refactor extra data usage
* refactor to use create_cell_ptr only
* optimized getCSC
* optimize critical path a bit
* clangformat
* disable clangformat where applicable
---------
Signed-off-by: gatecat <gatecat@ds0.me>
Co-authored-by: Lofty <dan.ravensloft@gmail.com>
Co-authored-by: gatecat <gatecat@ds0.me>
* Gowin. Add IODELAY.
Input/Output delay (IODELAY) is programmable delay uint in IO block.
This delay line is enabled before/after the IO pad and allows the signal
to be delayed statically or dynamically during 0-127 stages each lasting
from 18 to 30 picoseconds depending on the chip family.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Replacing assertions with log_error.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Add sampling part to IO blocks (input only). This edge detector will
allow to dynamically adjust DDR decoding window in the future.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Extend Himbaechel API with gfx drawing methods
* Add bel drawing in example uarch
* changed API and added tile wire id in db
* extend API so we can distinguish CLK wires
* added bit more wires
* less horrid way of handling gfx ids
* loop wire range
* removed not needed brackets
* bump database version to 5
* Removed not used GfxFlags
* Gowin. FFs placement.
* Allow clusters to be created from FFs and LUTs;
* Immediately create pass-through LUTs from free LUTs adjacent to FF - at the same time ensure alternating use of LUT inputs;
* In case of constant networks, such pass-through LUTs are disconnected from networks altogether;
* Allow FF to be placed directly into SSRAM slides - this is useful when using synchronous reading.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix aux name creation
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Use I3 for pass-trough LUTs
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix the port check for connectivity.
What happens is that it's not enough to check for a network, we also
need to make sure that the network is functional: has src and sinks.
And the style edits - they get automatically when I make sure to run
clang-format10.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix the port check for connectivity.
What happens is that it's not enough to check for a network, we also
need to make sure that the network is functional: has src and sinks
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Implement the EMCU primitive.
Add support for the GW1NSR-4C's embedded Cortex-M3 processor. Since it
uses flash in its own way, we disable additional flash processing for
this case.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix merge.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Add DHCEN primitive.
This primitive allows you to dynamically turn off and turn on the
networks of high-speed clocks.
This is done tracking the routes to the sinks and if the route passes
through a special HCLK MUX (this may be the input MUX or the output MUX,
as well as the interbank MUX), then the control signal of this MUX is
used.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Change the DHCEN binding
Use the entire PIP instead of a wire - avoids normalisation and may also
be useful in the future when calculating clock stuff.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Some Clocks PIPS were not created due to a check for the presence of a
delay class, now all wires are attributed to the class so that there is
no longer any need for this check.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Implement the UserFlash primitive
Some Gowin chips have embedded flash memory accessible from the fabric.
Here we add primitives that allow access to this memory.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix cell creation
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Himbaechel Gowin: Add support for CLKDIV and CLKDIV2
* Himbaechel Gowin: Add support for CLKDIV and CLKDIV2
* Gowin Himbaechel: HCLK Bug fixes and corrections
DQCE and DCS primitives are added.
DQCE allows the internal logic to enable or disable the clock network in
the quadrant. When clock network is disabled, all logic drivern by this
clock is no longer toggled, thus reducing the total power consumtion of
the device.
DCS allows you to select one of four sources for two clock wires (6 and 7).
Wires 6 and 7 have not been used up to this point.
Since "hardware" primitives operate strictly in their own quadrants,
user-specified primitives are converted into one or more "hardware"
primitives as needed.
Also:
- minor edits to make the most of helper functions like connectPorts()
- when creating bases, the corresponding constants are assigned to the
VCC and GND wires, but for now huge nodes are used because, for an
unknown reason, the constants mechanism makes large examples
inoperable. So for now we remain on the nodes.
Compatible with older Apicula databases.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
It was not taken into account that there are only 6 ALUs per cell. As a
result, on complex designs where ALUs and LUT-based memory are involved
and there are many LUTs (like in the RISCV emulator), there were
sometimes false positives about placement conflicts.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
The statement in the Gowin documentation that in the reading mode
"READ_MODE=0" the output register is not used and the OCE signal is
ignored is not confirmed by practice - if the OCE was left unconnected
or connected to the constant network, then a change in output data was
observed even with CE=0, as well as the absence of such at CE=1.
Synchronizing CE and OCE helps and the memory works properly in complex
systems such as RISC-V emulation and i8080 emulation (with 32K RAM and
16K BSRAM based ROM), but there is no theoretical basis for this fix, so
it is a hack.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
For pROM(X9) primitives in images generated by Gowin IDE, there is an
interesting recommunication of inputs depending on the data bit depth.
For example, in some cases, a high logical level may be applied to the
Write Enable input, which, let’s say, is not entirely usual for Read
Only memory.
Here we will do similar manipulations.
In addition, several minor bug fixes are included:
- Fixed bit numbering for non-X9 series primitives.
- Fixed decoder generation for BLKSEL - do not assume unused inputs are
connected to GND.
- Use default values for BSRAM parameters - don't assume their
mandatory presence.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
As the board on the GW1N-1 chip becomes a rarity, its replacement is the
Tangnano1k board with the GW1NZ-1 chip. This chip has a unique mechanism
for turning off power to important things such as OSC, PLL, etc.
Here we introduce a primitive that allows energy saving to be controlled
dynamically.
We also bring the names of some functions to uniformity.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
In the images generated by Gowin IDE, the signals for dynamic BSRAM
block selection (BLKSEL[2:0]) are not always connected directly to the
ports - some chips add LUT2, LUT3 or LUT4 to turn these signals into
Clock Enable. Apparently there are chips with an error in the operation
of these ports.
Here we make such a decoder instead of using ports directly.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
It seems that the internal registers on the BSRAM output pins in
READ_MODE=1'b1 (pipeline) mode do not function properly because in the
images generated by Gowin IDE an external register is added to each pin,
and the BSRAM itself switches to READ_MODE=1'b0 (bypass) mode .
This is observed on Tangnano9k and Tangnano20k boards.
Here we repeat this fix.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Add description of BSRAM harness
In some cases, Gowin IDE adds a number of LUTs and DFFs to the BSRAM. Here we are trying to add similar elements.
More details with pictures: https://github.com/YosysHQ/apicula/blob/master/doc/bsram-fix.md
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
When multiplying 36 bits by 36 bits using four 18x18 multipliers, the
sign bits of the higher 18-bit parts of the multipliers were correctly
switched, but what was incorrect was leaving the sign bits of the lower
parts of the multipliers uninitialized. They now connect to VSS.
Addresses https://github.com/YosysHQ/apicula/issues/242
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Do not search for pads if the signal source for the PLL is something
other than the IO pin - these are guaranteed to already be placed and
have a bound Bel.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
If the CLKIN input of the PLL is connected to a special pin, then it
makes sense to try to place the PLL so that it uses a direct implicit
non-switched connection to this pin.
The transfer of information about pins for various purposes has been
implemented (clock input signal, feedback, etc), but so far only CLKIN
is used.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
For the following primitives:
- PADD9
- PADD18
- MULT9X9
- MULT18X18
- MULT36X36
- MULTALU18X18
- MULTALU36X18
- MULTADDALU18X18
- ALU54D
packing and processing of fixed wires between macro and between DSP
blocks is implemented.
Clusters of DSP and macro blocks are processed using custom placement of
cluster elements.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Corrects the situation when it is impossible to use IOBUF with two
IOLOGIC elements at the same time - input and output.
Addresses https://github.com/YosysHQ/nextpnr/issues/1275
This is done by dividing one IOLOGIC Bel into two - input IOLOGIC and
output IOLOGIC plus checking for compatibility of the cells located
there.
At the moment, this check is simple and allows only the combination of
DDR and DDRC primitives.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
A small improvement - do not waste time analyzing already processed
networks in the previous step (and possibly steps).
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Semi-dual port BSRAM (in Gowin terminology) has the same feature as
Single Port - the CE and OCE signals must be synchronized.
Such a sin has not yet been noticed for Dual Port.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Don't stop at the first bad "arc", but use the global network to the
maximum.
* Report partial/full use of global wires for the network.
* In case of complete routing failure, releasing the source - this is
actually a BUGFIX.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
This type setting is not needed here - the packer distinguishes memory
features by the X9 attribute, which will be correct anyway.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
The OCE signal in the SP(X)9B primitive is intended to control the
built-in output register. The documentation states that this port is
invalid when READ_MODE=0 is used. However, it has been experimentally
established that you cannot simply apply VCC or GND to it and forget it
- the discrepancy between the signal on this port and the signal on the
CE port leads to both skipping data reading and unnecessary reading
after CE has switched to 0.
Here we force these ports to be connected to the network, except in the
case where the user controls the OCE signal using non-constant signals.
Also:
* All PIPs for clock spines are made inaccessible to the common router
- in general, using these routes for signals that have not been
processed by a special globals router is fraught with effects that
are difficult to detect.
* The INV primitive has been added purely to speed up development -
this primitive is not generated by Yosys, but is almost always
present in vendor output files.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
The following primitives are implemented for the GW1NZ-1 chip:
* pROM - read only memory - (bitwidth: 1, 2, 4, 8, 16, 32).
* pROMX9 - read only memory - (bitwidth: 9, 18, 36).
* SDPB - semidual port - (bitwidth: 1, 2, 4, 8, 16, 32).
* SDPX9B - semidual port - (bitwidth: 9, 18, 36).
* DPB - dual port - (bitwidth: 16).
* DPX9B - dual port - (bitwidth: 18).
* SP - single port - (bitwidth: 1, 2, 4, 8, 16, 32).
* SPX9 - single port - (bitwidth: 9, 18, 36).
Also:
- The creation of databases for GW1NS-2 has been removed - this was not
planned to be supported in Himbaechel from the very beginning and
even examples were not created in apicula for this chip due to the
lack of boards with it on sale.
- It is temporarily prohibited to connect DFFs and LUTs into clusters
because for some reason this prevents the creation of images on lower
chips (placer cannot find the placement), although without these
clusters the images are quite working. Requires further research.
- Added creation of ALU with mode 0 - addition. Such an element is not
generated by Yosys, but it is a favorite vendor element and its
support here greatly simplifies the compilation of vendor netlists.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Now the clock router can place a buffer into the specified network,
which divides the network into two parts: from the source to the buffer,
routing occurs through any available PIPs, and after the buffer to the
sink, only through a dedicated global clock network.
This is made specifically for the Tangnano20k where the external
oscillator is soldered to a regular non-clock pin. But it can be used
for other purposes, you just need to remember that the recipient must be
a CLK input or output pin.
The port/network to set the buffer to is specified in the .CST file:
CLOCK_LOC "name" BUFG;
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Slightly change the Gowin device selection mechanism for database generation.
By default, nothing is generated as before.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>