With the release of Apicula 0.22, the GW5A series gained support for
simple IO, LUTs (including Widw LUTs), and DFFs (including flip-flops 6
and 7 specific to the GW5A series), so we can include the GW5A-25A among
Gowin devices.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
The GW-5A series has 8 flip-flops in a cell instead of 6. These
additional flip-flops can be used if the control network matches that
for the 4th and 5th DFFs in this cell.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Prior to the 5A series, pin functions (GPIO/SSPI/JTAG/DONE/etc) were
switched using fuses. This was done during the binary image formation
stage for loading into the FPGA using the command line keys of the
gowin_pack program.
The 5A series features certain ports that connect to VCC or GND
depending on whether the pin is used as SSPI or GPIO, for example. This
mechanism exists in parallel with fuses, but it is not described
anywhere, nor is there a corresponding primitive.
To generate working images, we have no choice but to simulate this thing
at the nextpnr stage, since VCC/GND routing is required.
For now, two flags are added, responsible for the SSPI and I2C pin
functions.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Preparing to support the 5A series.
Family recognition is added, as well as minor fixes, but base generation
itself is not allowed for GW5 - this gives the ability to test the next
Apicula release and still not break installations for those who simply
specify `HIMBAECHEL_GOWIN_DEVICES = "all"`.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Recognize GW5A family chips.
Construct chip base name for
- GW5A-LV25MG121C1/l0 - TangPrimer 25k
- GW5AT-LV60PG484A - TangMega 60k
- GW5AST-LV138PG484A - TangMega 138k
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Adds automatic connection of a general-purpose pin to the global clock
network.
The old behaviour, where such networks have to be explicitly specified,
can be activated with the command line key
"--vopt disable_gp_clock_routing".
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Use loop enumeration of PIPs instead of direct name construction for the
upper and lower ends of the segment wire.
Also do not allow clock wires for segments.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
A segment router replaces the source-to-sink connection by
general-purpose PIPs with bus-branch segment network connections.
The problem arises when the source is connected to the sinks directly
without switching as in the case of LUT->DFF, such wires should be left
as is, which is what this PR does.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Fill in the delays for PIP classes related to HCLK and IODELAY. Also:
- if clock routing fails, we try to use the next fastest mechanism - segment networks;
- fixing harmless typos.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. BUGFIX Use a separate net for segment gates
We use a temporary separate small network (typically 2 - 3 sinks) for
routing from the segment network source to the segment gate. This fixes
the rare but unpleasant case of self-intersection when a route to a gate
is routed using PIPs after the gate, this is no longer allowed when
using a separate small network.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix style.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
DLLDLY is the clock delay primitive that adjust the input clock
according to the DLLSTEP signal and outputs the delayed clock.
These primitives are associated with clock pins and are "tapped" between
the output of this IBUF and the clock networks, leaving the possibility
to connect to the original unshifted signal as well, although the latter
is not very practical because it is no longer possible to use fast
wires.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Gowin chips have an interesting mechanism - wires that run vertically
through several rows (at least 10) in each column of the chip. In each
row a particular wire has branches to the left and right, covering on
average 4 neighboring cells in the row. For lack of a better term, I
further call such a wire a segment.
So a segment can provide a direct connection in a local rectangle. There
are no special restrictions on the sinks, so segment networks can be
used for ClockEnable, LocalSetReset, as well as for LUT and DFF inputs.
The sources are not so simple - the sources can be the upper or lower
end of the segment, which in theory can lead to unfortunate consequences
if the signal is applied from both ends.
The matter is complicated by the fact that there are default
connections, i.e. in the absence of any set fuse the segment input is
still connected to something (VCC for example) and to disable the unused
end of the segment you need to set a special combination of fuses.
Taking into account which end of which segment is used is one of the
tasks of this router. In addition, segment ends can physically coincide
with PLL, DSP and BSRAM inputs, which can also lead to unexpected
effects. Some of these things are tracked when generating the base, some
in this router, some when packing in gowin_pack.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Adds the ability to use high-speed clock lines (together with CLKDIV2
type frequency dividers operating on them) as sieve signals for the
CLKIN and CLKFB inputs of the rPLL and PLLVR primitives (these cover the
full range of supported Gowin chips).
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Add I3C io buffer.
A buffer is added that can operate as a normal IOBUF in PUSH-PULL mode
or switch to open-drain IOBUF mode.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Turn a variable into a set of flags
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Adds output (MIPI_OBUF and MIPI_OBUF_A) and input (MIPI_IBUF) primitives
to allow the use of “real” MIPI (not emulation) ports capable of
operating in both HS and LP modes.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
This is useful for certain cross-compilation workloads, and to cache
rarely changing build products.
To use this functionality, build e.g. as follows:
cmake . -B build-export -DEXPORT_BBA_FILES=../bba-files -DARCH=all
cmake --build build-export -t nextpnr-all-bba
cmake . -B build-import -DIMPORT_BBA_FILES=../bba-files -DARCH=all
cmake --build build-import
Two user-visible changes were made:
* `-DUSE_RUST` is replaced with `-DBUILD_RUST`, by analogy with
`-DBUILD_PYTHON`
* `-DCOVERAGE` was removed as it doesn't work with either modern GCC
or Clang
Primarily, this commit makes both of them use the `BBAsm` functions
to build and compile `.bba` files.
In addition, Himbaechel targets are now aligned with the rest in
how they are configured: instead of having all uarches enabled with
all of the devices disabled (the opposite of the rest of nextpnr),
uarches must be enabled explicitly but they come with all devices
enabled (except for Xilinx, which does not have a list of devices).
Erroneously created wires for specific IOs on the underside of some
chips.
Fixes https://github.com/YosysHQ/nextpnr/issues/1417
Also cosmetic edits.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Add the ability to place registers in IOB
IO blocks have registers: for input, for output and for OutputEnable
signal - IREG, OREG and TREG respectively.
Each of the registers has one implicit non-switched wire, which one
depends on the type of register (IREG has a Q wire, OREG has a D wire).
Although the registers can be activated independently of each other they
share the CLK, ClockEnable and LocalSetReset wires and this places
restrictions on the possible combinations of register types in a single
IO.
Register placement in IO blocks is enabled by specifying the command
line keys --vopt ireg_in_iob, --vopt oreg_in_iob, or --vopt ioreg_in_iob.
It should be noted that specifying these keys leads to attempts to place
registers in IO blocks, but no errors are generated in case of failure.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Registers in IO
Check for unconnected ports.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. IO regs. Verbose warnings.
If an attempt to place an FF in an IO block fails, issue a warning
detailing the reason for the failure, whether it is a register type
conflict, a network requirement violation, or a control signal conflict.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. BUGFIX. Fix FFs compatibility.
Flipflops with a fixed ClockEnable input cannot coexist with flipflops
with a variable one.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. FFs in IO. Changing diagnostic messages.
Placement modes are still specified by the command line keys
ireg_in_iob/oreg_in_iob/ioreg_in_iob, but also introduces more granular
control in the form of attributes at I/O ports:
(* NOIOBFF *) - registers are never placed in this IO,
(* IOBFF *) - registers must be placed in this IO, in case of failure
a warning (not an error) with the reason for nonplacement is issued,
_attribute_absence_ - no diagnostics will be issued: managed to place - good, failed - not bad either.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Registers in IO.
Change the logic for handling command line keys and attributes -
attributes allow routines to be placed in IO regardless of global mode.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Registers in IO. Fix style.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Adds additional restrictions on the first PIP after the clock source -
only connections to SPINEs are allowed. This allowed to correct the
behaviour of DQCEs since the latter can only disable/enable SPINEs.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Add IODELAY.
Input/Output delay (IODELAY) is programmable delay uint in IO block.
This delay line is enabled before/after the IO pad and allows the signal
to be delayed statically or dynamically during 0-127 stages each lasting
from 18 to 30 picoseconds depending on the chip family.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Replacing assertions with log_error.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Add sampling part to IO blocks (input only). This edge detector will
allow to dynamically adjust DDR decoding window in the future.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. FFs placement.
* Allow clusters to be created from FFs and LUTs;
* Immediately create pass-through LUTs from free LUTs adjacent to FF - at the same time ensure alternating use of LUT inputs;
* In case of constant networks, such pass-through LUTs are disconnected from networks altogether;
* Allow FF to be placed directly into SSRAM slides - this is useful when using synchronous reading.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix aux name creation
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Use I3 for pass-trough LUTs
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix the port check for connectivity.
What happens is that it's not enough to check for a network, we also
need to make sure that the network is functional: has src and sinks.
And the style edits - they get automatically when I make sure to run
clang-format10.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix the port check for connectivity.
What happens is that it's not enough to check for a network, we also
need to make sure that the network is functional: has src and sinks
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Implement the EMCU primitive.
Add support for the GW1NSR-4C's embedded Cortex-M3 processor. Since it
uses flash in its own way, we disable additional flash processing for
this case.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix merge.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Add DHCEN primitive.
This primitive allows you to dynamically turn off and turn on the
networks of high-speed clocks.
This is done tracking the routes to the sinks and if the route passes
through a special HCLK MUX (this may be the input MUX or the output MUX,
as well as the interbank MUX), then the control signal of this MUX is
used.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Change the DHCEN binding
Use the entire PIP instead of a wire - avoids normalisation and may also
be useful in the future when calculating clock stuff.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
Some Clocks PIPS were not created due to a check for the presence of a
delay class, now all wires are attributed to the class so that there is
no longer any need for this check.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Implement the UserFlash primitive
Some Gowin chips have embedded flash memory accessible from the fabric.
Here we add primitives that allow access to this memory.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Gowin. Fix cell creation
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
---------
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
* Himbaechel Gowin: Add support for CLKDIV and CLKDIV2
* Himbaechel Gowin: Add support for CLKDIV and CLKDIV2
* Gowin Himbaechel: HCLK Bug fixes and corrections
DQCE and DCS primitives are added.
DQCE allows the internal logic to enable or disable the clock network in
the quadrant. When clock network is disabled, all logic drivern by this
clock is no longer toggled, thus reducing the total power consumtion of
the device.
DCS allows you to select one of four sources for two clock wires (6 and 7).
Wires 6 and 7 have not been used up to this point.
Since "hardware" primitives operate strictly in their own quadrants,
user-specified primitives are converted into one or more "hardware"
primitives as needed.
Also:
- minor edits to make the most of helper functions like connectPorts()
- when creating bases, the corresponding constants are assigned to the
VCC and GND wires, but for now huge nodes are used because, for an
unknown reason, the constants mechanism makes large examples
inoperable. So for now we remain on the nodes.
Compatible with older Apicula databases.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
It was not taken into account that there are only 6 ALUs per cell. As a
result, on complex designs where ALUs and LUT-based memory are involved
and there are many LUTs (like in the RISCV emulator), there were
sometimes false positives about placement conflicts.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
The statement in the Gowin documentation that in the reading mode
"READ_MODE=0" the output register is not used and the OCE signal is
ignored is not confirmed by practice - if the OCE was left unconnected
or connected to the constant network, then a change in output data was
observed even with CE=0, as well as the absence of such at CE=1.
Synchronizing CE and OCE helps and the memory works properly in complex
systems such as RISC-V emulation and i8080 emulation (with 32K RAM and
16K BSRAM based ROM), but there is no theoretical basis for this fix, so
it is a hack.
Signed-off-by: YRabbit <rabbit@yrabbit.cyou>