## Overview
Registers are a fundamental building block of digital hardware, and the IO core provides a simple way of interacting with them from the host machine. It allows you to define a set of inputs and outputs of arbitrary width, and then set values to the outputs and read values from the inputs.

This is a very simple task, and configuration is straightforward, but there are a few caveats. Both topics are covered below.

## Configuration

Just like the rest of the cores, the IO core is configured via an entry in a project's configuration file. This is easiest to show by example:

```yaml
---
the_muppets:
  type: io

  inputs:
    kermit: 3
    piggy: 1
    animal: 38
    scooter: 4

  outputs:
    fozzy: 1
    gonzo: 3
```
This configuration specifies four parameters:

- `name`: The name of the IO core, given by the top-level key (`the_muppets` in the example above). This name is used to reference the core when working with the API, and can be whatever you'd like.
- `type`: This signals to the parser that this is an IO core. All cores contain a `type` field, which must be set to `io` to be recognized as an IO core.
- `inputs`: This lists all inputs from the FPGA fabric to the host machine. Signals in this list may be read by the host, but ___cannot___ be written to.
- `outputs`: This lists all outputs from the host machine to the FPGA fabric. Signals in this list may be written by the host, but ___can___ also be read from, and doing so returns the value last written to the register.

Lastly, the name of the core and the names of the probes are referenced in the autogenerated Verilog. This means that while the names can be arbitrary, they must be unique within your project and not contain any characters that your synthesis engine won't appreciate. As an example, here's an instance of what the autogenerated module would look like for the configuration above:

```verilog
manta manta_inst (
    .clk(clk),

    .rx(rx),
    .tx(tx),

    .kermit(kermit),
    .piggy(piggy),
    .animal(animal),
    .scooter(scooter),
    .fozzy(fozzy),
    .gonzo(gonzo));
```
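
The naming constraints above can be checked before any Verilog is generated. The sketch below is one way to do that in Python. `check_io_core` and the `IDENTIFIER` pattern are hypothetical helpers, not part of Manta, and the pattern is a conservative guess at what most synthesis tools accept.

```python
import re

# Conservative identifier pattern. This is an assumption about what most
# synthesis tools accept, not a rule taken from Manta itself.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def check_io_core(name, core):
    """Return a list of problems found in one IO core entry (hypothetical helper)."""
    problems = []
    if not IDENTIFIER.match(name):
        problems.append(f"core name '{name}' is not a plain identifier")
    if core.get("type") != "io":
        problems.append(f"{name}: type must be 'io'")

    inputs = core.get("inputs", {})
    outputs = core.get("outputs", {})

    # Probe names become Verilog ports, so a name cannot be used twice.
    for probe in sorted(set(inputs) & set(outputs)):
        problems.append(f"{name}: probe '{probe}' is both an input and an output")

    for probe, width in {**inputs, **outputs}.items():
        if not IDENTIFIER.match(probe):
            problems.append(f"{name}: probe name '{probe}' is not a plain identifier")
        if not isinstance(width, int) or isinstance(width, bool) or width < 1:
            problems.append(f"{name}: probe '{probe}' needs a positive integer width")
    return problems


# The example configuration, written as the dictionary a YAML parser would produce.
the_muppets = {
    "type": "io",
    "inputs": {"kermit": 3, "piggy": 1, "animal": 38, "scooter": 4},
    "outputs": {"fozzy": 1, "gonzo": 3},
}

print(check_io_core("the_muppets", the_muppets))  # prints []
```

An empty list means no problems were found. This only checks names within a single core; uniqueness across the whole project would need the same comparison over every core in the file.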


## Python API

The IO core functionality is stored in the `Manta.IOCore` and `Manta.IOCoreProbe` classes in [src/manta/io_core/__init__.py](https://github.com/fischermoseley/manta/blob/main/src/manta/io_core/__init__.py), and it may be controlled with the following two methods:

`Manta.IOCoreProbe.set(int, bool data)`

- [`int`, `bool`] _data_: The value to write to an output probe. The value may be signed or unsigned, but an exception is raised if it is too large for the width of the port.
- _returns_: None

This method is blocking. When called, it dispatches a request to the FPGA and waits until a response has been received.
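
The width check described above can be pictured with a short sketch. `fits_in_width` is a hypothetical helper, not Manta's own implementation, and it assumes that negative values are stored as two's complement while non-negative values are treated as unsigned.

```python
def fits_in_width(value, width):
    """Return True if `value` can be represented in `width` bits.

    Assumption: negative values use two's complement, and non-negative
    values are treated as unsigned.
    """
    value = int(value)  # bools become 0 or 1
    if value >= 0:
        return value < (1 << width)
    return value >= -(1 << (width - 1))


print(fits_in_width(4, 3))     # True: a 3-bit probe holds 0..7 unsigned
print(fits_in_width(8, 3))     # False: 8 needs 4 bits
print(fits_in_width(-4, 3))    # True: the 3-bit signed range is -4..3
print(fits_in_width(True, 1))  # True: a bool fits a 1-bit probe
```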

---

`Manta.IOCoreProbe.get()`

- _returns_: The value of an input or output probe. In the case of an output probe, the value returned will be the last value written to the probe.

This method is blocking. When called, it dispatches a request to the FPGA and waits until a response has been received.

---


### Example

A small example is shown below, using the [example configuration](#configuration) above. More extensive examples can also be found in the repository's [examples/](https://github.com/fischermoseley/manta/tree/main/examples) folder.

```python
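>>> from manta import Manta
>>> m = Manta("manta.yaml")  # assumed path to the configuration file shown above
>>> m.the_muppets.fozzy.set(True)   # fozzy is a 1-bit output
>>> m.the_muppets.gonzo.set(4)      # gonzo is a 3-bit output
>>> m.the_muppets.scooter.get()     # scooter is a 4-bit input
5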
```

## Caveats

While the IO core performs a very simple task, it carries a few caveats.

- First, __it's not instantaneous__. Manta has been designed to be as fast as possible, but setting and querying registers relies on passing messages between the host and FPGA, which is slow relative to FPGA clock speeds! If you're trying to set values in your design with cycle-accurate timing, the IO core will not do that for you. However, the [Logic Analyzer's playback feature](./logic_analyzer.md#playback) might be helpful.

- Second, __the API methods are blocking__, and will wait for a response from the FPGA before resuming program execution. Depending on your application, you might want to run your IO core operations in a separate thread, but you can also decrease the execution time by using a faster interface between the host and FPGA. This means using a higher UART baudrate, or using Ethernet.
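
As a rough illustration of the threading suggestion above, the sketch below moves a blocking read into a worker thread using Python's standard `concurrent.futures` module. `SlowProbe` is a stand-in invented for this example so that it runs without hardware; its 50 ms delay is arbitrary and says nothing about Manta's real latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class SlowProbe:
    """Stand-in for a probe whose get() blocks while waiting on the FPGA."""

    def get(self):
        time.sleep(0.05)  # pretend round trip; real latency depends on the link
        return 5


probe = SlowProbe()

# A single worker keeps requests in order, one at a time.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(probe.get)  # the blocking call runs in the worker thread
    other_work = sum(range(1000))    # the main thread stays free in the meantime
    value = future.result()          # blocks only if the read has not finished yet

print(value, other_work)
```

Whether one connection can safely be shared by several threads is not covered here, which is why the sketch funnels every request through a single worker.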

## How It Works

The IO core is built on the architecture shown below:

<img src="/assets/io_core_block_diagram.png" alt="drawing" width="400"/>

A series of connections are made to the user’s logic. These are called _probes_, and each may be either an input or an output. If the probe is an input, then its value is taken from the user’s logic, and stored in a register that may be read by the host machine. If the probe is an output, then its value is provided to the user’s logic from a register written to by the host. The width of each probe is arbitrary, and is set by the user at compile-time.

However, the connection between these probes and the user’s logic is not direct. The state of each probe is buffered, and the buffers are updated when a _strobe_ register within the IO core is set by the host machine. During this update, new values for output probes are provided to user logic, and new values for input probes are read from user logic.

This is done to mitigate the possibility of an inconsistent system state. Although users may configure registers of arbitrary width, Manta’s internal bus uses 16-bit data words, meaning operations on probes larger than 16 bits require multiple bus transactions. These transactions occur over some number of clock cycles, with an arbitrary amount of time between each.

If the signals were unbuffered, this could easily cause data corruption. For instance, a read operation on an input probe would read 16 bits at a time, but the probe’s value may change in the time that passes between transactions. This would cause the host to read a value for which each 16-bit chunk corresponds to a different moment in time. Taken together, these chunks may represent a value that the input probe never had.

Similar corruption would occur when writing to an unbuffered output probe. The output probe would take multiple intermediate values as each 16-bit section is written by the host. During this time the value of the output probe is equal to neither the incoming value from the host nor the value the host had previously written to it. The user logic connected to the output probe has no way of knowing this, and will dutifully use whatever value it is provided. This can very easily induce undesired behavior in the user’s logic, as it is being provided inputs that the user did not specify.
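
The torn read described above can be reproduced with a few lines of arithmetic. The sketch below is purely illustrative: `split_words` and `join_words` are hypothetical helpers, and the least-significant-word-first ordering is an assumption, not a statement about how Manta lays out its registers.

```python
WORD_BITS = 16  # width of the internal bus data words


def split_words(value, width):
    """Split a `width`-bit value into 16-bit words, least significant first."""
    count = -(-width // WORD_BITS)  # ceiling division
    return [(value >> (WORD_BITS * i)) & 0xFFFF for i in range(count)]


def join_words(words):
    """Reassemble words produced by split_words()."""
    return sum(word << (WORD_BITS * i) for i, word in enumerate(words))


# A 32-bit probe increments from 0x0000FFFF to 0x00010000 between two bus reads.
before, after = 0x0000FFFF, 0x00010000
low_word = split_words(before, 32)[0]   # first transaction sees the old value
high_word = split_words(after, 32)[1]   # second transaction sees the new value

torn = join_words([low_word, high_word])
print(hex(torn))  # 0x1ffff, a value the probe never held
```

The same helper shows why wide probes are affected: a 38-bit probe such as `animal` spans three 16-bit words, so reading it takes three transactions.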

Buffering the probes mitigates these issues, but slightly modifies the way the host machine uses the core. When the host wishes to read from an input probe, it will set and then clear the strobe register, which pulls the current value of the probe into the buffer. The host then reads from the buffer, which is guaranteed not to change while it is being read. Writing to an output probe is done in much the same way. The host writes a new value to the buffer, which is flushed out to the user’s logic when the strobe register is set and cleared. This updates every bit in the output probe all at once, guaranteeing the user logic does not observe any intermediate values.
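
The read sequence above can be modelled in software. The class below is a toy model of a single buffered input probe, not Manta's actual hardware; in particular, capturing on the 0-to-1 transition of the strobe is an assumption made for the sake of the example.

```python
class BufferedInputProbe:
    """Toy model of one buffered input probe (not Manta's actual hardware)."""

    def __init__(self):
        self.live = 0     # value currently driven by user logic
        self.buffer = 0   # snapshot visible to the host
        self._strobe = 0

    def write_strobe(self, value):
        # Assumption: the snapshot is taken when the strobe goes from 0 to 1.
        if value and not self._strobe:
            self.buffer = self.live
        self._strobe = value

    def read_word(self, index):
        # Bus reads only ever see the snapshot, 16 bits at a time.
        return (self.buffer >> (16 * index)) & 0xFFFF


probe = BufferedInputProbe()
probe.live = 0x0000FFFF

probe.write_strobe(1)      # set the strobe: capture the live value
probe.write_strobe(0)      # clear it again
low = probe.read_word(0)

probe.live = 0x00010000    # user logic changes in the middle of the read
high = probe.read_word(1)  # still served from the snapshot

print(hex((high << 16) | low))  # 0xffff, the value at the moment of the strobe
```

Because both words come from the same snapshot, the host reassembles the value the probe held when the strobe was set, even though the live signal changed between the two reads.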

These buffers also provide a convenient location to perform clock domain crossing. Each buffer is essentially a two flip-flop synchronizer, which allows the IO core to interact with user logic on a different clock than Manta’s internal bus.