This commit is contained in:
Fischer Moseley 2023-02-24 19:44:05 -05:00
parent 3728a5263d
commit 2d92b6a290
4 changed files with 171 additions and 28 deletions


@ -1,28 +0,0 @@
![](assets/manta.png)
## Manta: An In-Situ Debugging Tool for Programmable Hardware
![functional_simulation](https://github.com/fischermoseley/manta/actions/workflows/functional_simulation.yml/badge.svg)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
Manta is a tool for debugging FPGA designs over an interface like UART or Ethernet. It has two modes for doing this, downlink and uplink. The downlink mode feels similar to a logic analyzer, in that Manta provides a waveform view of a configurable set of signals, which get captured when some trigger condition is met. The uplink mode allows a host machine to remotely set values of registers on the FPGA via a python interface. This permits rapid prototyping of logic in Python, and a means of incrementally migrating it to HDL. A more detailed description of each mode is below.
Manta is written in Python, and generates SystemVerilog HDL. It's cross-platform, and its only dependencies are pySerial and pyYAML. The SystemVerilog templates are included in the Python source, so only a single python file must be included in your project.
## Design Philosophy
- Things that are easy to break should be easy to fix. For instance, it's pretty easy to put the wrong number of clock cycles of holdoff in your configuration, but it's a lot harder to accidentally put the wrong number of stop bits in your serial port. Manta supports changing the former post-upload, but not the latter.
- Features are added when they're needed. We won't add features until there's been a use case shown that would benefit from them. This keeps manta lightweight.
## Downlink
Manta's downlink mode works by taking a YAML/JSON file describing the ILA configuration, and autogenerating a debug core with SystemVerilog. This gets included in the rest of the project's HDL, and is synthesized and flashed on the FPGA. It can then be controlled by a host machine connected over a serial port. The host can arm the core, and then when a trigger condition is met, the debug output is wired back to the host, where it's saved as a waveform file. This can then be opened and inspected in a waveform viewer like GTKWave.
This is similar to Xilinx's Integrated Logic Analyzer (ILA) and Intel/Altera's SignalTap utility.
## Getting Started
Manta is installed with `pip3 install mantaray`. Or at least it will be, once it's out of alpha. For now, it's installable with `pip install -i https://test.pypi.org/simple/ mantaray`, which just pulls from the PyPI testing registry.
## Examples
Examples can be found under `examples/`. These target the [Nexys4 DDR](https://digilent.com/reference/programmable-logic/nexys-4-ddr/start) and [Nexys A7-100T](https://digilent.com/reference/programmable-logic/nexys-a7/start) from Digilent, which are functionally equivalent.
## About
Manta was originally developed as part of my [Master's Thesis at MIT](dspace.mit.edu) in 2023, done under the supervision of Joe Steinmeyer. But I think it's a neat tool, so I'm still working on it :)

90
doc/cores.md Normal file

@ -0,0 +1,90 @@
# Cores
Manta has two types of debug cores: a logic analyzer core, and an IO core.
## Logic Analyzer Core
This emulates the look and feel of a logic analyzer, both benchtop and integrated. These work by continuously sampling a set of digital signals, and then when some condition (the _trigger_) is met, recording these signals to memory, which are then read out to the user.
Manta works exactly the same way, and the behavior of the logic analyzer is defined entirely in the Manta configuration file. Here's an example:
```yaml
---
logic_analyzer:
  sample_depth: 4096
  clock_freq: 100000000

  probes:
    larry: 1
    curly: 1
    moe: 1
    shemp: 4

  triggers:
    - larry && curly && ~moe

uart:
  baudrate: 115200
  port: "/dev/tty.usbserial-2102926963071"
  data: 8
  parity: none
  stop: 1
  timeout: 1
```
There are a few parameters that get configured here, including:
### Probes
Probes are the signals read by the core. These are meant to be connected to your RTL design when you instantiate your generated copy of Manta. These can be given whatever name and width you like (within reason). You can have up to 256 probes in your design.
### Sample Depth
Sample depth controls how many samples of the probes get read into the buffer.
### Triggers
Triggers are things that will cause the logic analyzer core to capture data from the probes. These get specified as a Verilog expression, and are partially reconfigurable on-the-fly. This will get elaborated on more as it's implemented, but if your trigger condition can be represented as a sum-of-products with each product being representable as an operator from the list [`==`, `!=`, `>`, `<`, `>=`, `<=`, `||`, `&&`, `^`] along with a configurable register and a probe, you won't need to rebuild the bitstream to update the trigger condition. Whew, that was a mouthful.
### Operating Modes
The logic analyzer can operate in a number of modes, which govern what trigger conditions start the capture of data:
* __Single-Shot__: When the trigger condition is met, grab the whole thing.
* __Incremental__: Only pull values when the trigger condition is met. Ignore values received while the trigger condition is not met.
* __Immediate__: Read the probe states into memory immediately, regardless of whether the trigger condition is met.
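
The three modes above can be modeled in software. The sketch below is only an illustration of the described behavior, not Manta's HDL: the function name `capture`, the mode strings, and the assumption of a holdoff of `0` in single-shot mode are all made up for this example.

```python
from typing import Callable, List, Sequence


def capture(
    samples: Sequence[int],
    trigger: Callable[[int], bool],
    mode: str,
    depth: int,
) -> List[int]:
    """Software model of the capture modes (hypothetical, not Manta's API)."""
    if mode == "immediate":
        # Record from the very first sample, ignoring the trigger entirely.
        return list(samples[:depth])
    if mode == "single_shot":
        # Wait for the first sample that meets the trigger, then record
        # `depth` consecutive samples (assumes a holdoff of 0).
        for i, sample in enumerate(samples):
            if trigger(sample):
                return list(samples[i : i + depth])
        return []  # the trigger never fired
    if mode == "incremental":
        # Keep only the samples that meet the trigger, up to `depth` of them.
        return [s for s in samples if trigger(s)][:depth]
    raise ValueError(f"unknown mode: {mode!r}")


samples = [0, 1, 5, 2, 7, 3, 8]
is_large = lambda s: s >= 5  # stand-in trigger condition

print(capture(samples, is_large, "immediate", 3))    # [0, 1, 5]
print(capture(samples, is_large, "single_shot", 3))  # [5, 2, 7]
print(capture(samples, is_large, "incremental", 3))  # [5, 7, 8]
```

Incremental mode is the only one where the recorded samples are not contiguous in time.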
### Holdoff
The logic analyzer has a programmable _holdoff_, which sets when probe data is captured relative to the trigger condition being met. For instance, setting the holdoff to `100` will cause the logic analyzer to start recording probe data 100 clock cycles after the trigger condition occurs.
Holdoff values can be negative! When this is configured, new probe values are continuously pushed to the buffer, while old ones are pushed off. This means that the probe data for the last `N` timesteps can be saved, so long as `N` is not larger than the depth of the memory.
Manta uses a default holdoff value of `-SAMPLE_DEPTH/2`, which positions the data capture window such that the trigger condition lives in the middle of it. Here's a diagram:
Similarly, a holdoff of `-SAMPLE_DEPTH` would place the trigger condition at the right edge of the capture window. A holdoff of `0` would place the trigger at the left edge of the window. Positive holdoff would look like this:
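
The window arithmetic in this section can also be written out in a few lines of Python. This is a sketch under the assumption that samples are indexed by clock cycle; the function `capture_window` is hypothetical and is not part of Manta.

```python
def capture_window(trigger_index: int, holdoff: int, sample_depth: int) -> range:
    """Return the clock-cycle indices that end up in the buffer.

    Hypothetical helper: `trigger_index` is the cycle on which the trigger
    condition was met, `holdoff` is in clock cycles and may be negative.
    """
    if holdoff < -sample_depth:
        # The buffer only remembers the last `sample_depth` cycles.
        raise ValueError("negative holdoff cannot exceed the sample depth")
    start = trigger_index + holdoff
    return range(start, start + sample_depth)


SAMPLE_DEPTH = 4096
TRIGGER = 10_000  # arbitrary cycle on which the trigger fires

centered = capture_window(TRIGGER, -SAMPLE_DEPTH // 2, SAMPLE_DEPTH)
print(centered.start, centered.stop)  # 7952 12048, trigger in the middle

delayed = capture_window(TRIGGER, 100, SAMPLE_DEPTH)
print(delayed.start, delayed.stop)    # 10100 14196, starts 100 cycles late
```

With a holdoff of `0` the window begins on the trigger cycle itself, and with `-SAMPLE_DEPTH` it ends just before it.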
## IO Core
_More details to follow here as this gets written out, for now this is just a sketch_
This emulates the look and feel of an IO pin, much like what you'd find on a microcontroller.
Manta provides a Python API to control these - which allows for behavior like:
```python
>>> import manta.api
>>> cores = manta.api.generate('manta.yaml')
>>> io = cores.my_io_core
>>> io.probe0.set(True)
>>> io.probe0.set(False)
>>> io.probe1.read()
True
```
The caveat being that Manta is limited by the bandwidth of PySerial, which is limited by your operating system and system hardware. These calls may take significant time to complete, and __they are blocking__. More details can be found in the API reference.
## Everything Else
Manta needs to know what clock frequency you plan on running it at so that it can properly generate the baudrate you desire. It also needs to know what serial port your FPGA is on, as well as how to configure the interface. Right now only standard 8N1 serial is supported by the FPGA.
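
To see why the clock frequency matters, here is the usual UART divider arithmetic as a Python sketch. It assumes the common approach of counting a whole number of clock cycles per bit; whether Manta's generated HDL does exactly this is an assumption, and both function names are hypothetical.

```python
def clocks_per_baud(clock_freq: int, baudrate: int) -> int:
    """Whole number of clock cycles spent on each UART bit (assumed scheme)."""
    return round(clock_freq / baudrate)


def baud_error(clock_freq: int, baudrate: int) -> float:
    """Relative error between the achieved and the requested baudrate."""
    achieved = clock_freq / clocks_per_baud(clock_freq, baudrate)
    return (achieved - baudrate) / baudrate


# Values taken from the example configuration above.
print(clocks_per_baud(100_000_000, 115_200))  # 868
print(abs(baud_error(100_000_000, 115_200)) < 0.001)  # True
```

If the clock frequency in the configuration doesn't match the real clock, the divider is wrong and the host will see garbage on the serial port.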

37
doc/roadmap.md Normal file

@ -0,0 +1,37 @@
# Planned Work:
- _Verify Manta on non-Xilinx FPGAs_: This is in progress for the Lattice iCE40 on the Icestick, and the Altera Cyclone IV on the DE0 Nano.
- _More Examples_, such as
    - SD card controller
    - BRAM controller
    - Pong, with controls played through Python on your machine.
- _Configurable Trigger Location:_ Instead of always centering the downlink core's waveform around where the trigger condition is met, you might want to grab everything before or after the trigger. Or even things that are some number of clock cycles ahead of or behind the trigger. Being able to specify this 'holdoff' or 'position' in the downlink core configuration would be nice. Especially if it's something as simple as `beginning`, `middle`, `end`, or just a number of clock cycles.
- _Incremental Triggering_: Only add things to the buffer when the trigger condition is met. Going to be super useful for audio applications.
- _Python API_: You should be able to run manta and scrape waveforms from the command line - but let's say you're working on a project that loads audio from an SD card, and you want to have a downlink core in incremental mode to pull your audio samples, but you want to export that as a .wav file. Or you want to do some filtering of the data with numpy. You should have a python API that lets you do that.
- _OpenCores Listing_: Might want to chuck this up on [https://opencores.org/projects](https://opencores.org/projects), just for kicks.
- _FuseSoC integration_: This will probably exist in some headless-ish mode that separates manta's core generation and operation, but it'd be kinda nice for folks who package their projects with FuseSoC.
# Potential Future Work:
The guiding principle behind adding features here is to just do a bunch of projects, run into annoying bugs, and see what'd be useful to have as a tool, and then implement that. That said, there's a few ideas I've been kicking around at the moment:
* _Reconfigurable Trigger Modes_: Being able to switch between an incremental trigger and a single-shot trigger while the HDL's on the board might be useful.
* _Configurable Clock Edge:_ Right now when we add a waveform to a VCD file, we assume that all the values change on the rising edge of the ILA clock. And that's true - we sample them on the rising edge of the input clock. I don't know if we'd want to add an option for clocking in things on the falling edge - I think that's going to make timing hard and students confused.
* _Uplink Cores_: Similar to how a downlink core receives a trigger condition and dumps it to UART, an uplink core would be loaded with some values from the host machine, and then dump them onto a set of probes - one after another on every clock cycle. I don't know how useful this would be though.
* _Reconfigurable uplink cores:_ Instead of loading a BRAM with some fixed content and calling it a day, we should be able to load new data into that memory, and then dump it to the system when needed.
* _Clock Domain Crossing:_ You should be able to put cores in different clock domains - although I'm struggling to figure out where exactly this would be useful. Xilinx's ILA will let you have multiple cores and it doesn't care much which clock domain those are under, so some more investigation will be needed there.
# Completed Features:
* _Packaging_: Manta should fundamentally be out of the way of the hardware developer, so it needs to live on the system, not as source code in the project repo. We learned this with `lab-bc` last semester - we couldn't update it easily and it ended up living in people's git repos. Which shouldn't be necessary since they're not responsible for versioning it - we are. Same mentality here.


@ -0,0 +1,44 @@
# Theory of Operation
- Manta works by having a set of configurable debug cores daisy-chained together across a simple bus. Each core exposes some region of addressable memory that can be controlled by sending read and write commands to the FPGA over UART.
- These registers are 32 bits wide, and have a configurable address width. Manta will default to using the smallest possible address bus to minimize the burden on the place and route engine, but this can be overridden. This might be desirable if you wish to put other devices on the bus, such as a softcore.
- The regions of memory assigned to each core are determined by Manta when it autogenerates the Verilog HDL. Address space is assigned sequentially.
- Some registers, like captured sample data in a logic analyzer core, are not writeable by the host machine.
- Reading from a register will return the contents of the register over serial. Writing to a register will return nothing over serial. If you want to verify that the data you wrote to some location is valid, read from it after the write. This lack of a return makes things simpler for the state machines and faster for the user, since the OS on the host machine doesn't have to empty its UART RX buffer before moving on in the Python.
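
As a rough illustration of sequential address assignment and picking the smallest address bus, here's a Python sketch. The core names, the register counts, and the function `assign_addresses` are hypothetical and are not taken from Manta's source.

```python
from typing import Dict, Tuple


def assign_addresses(core_sizes: Dict[str, int]) -> Tuple[Dict[str, range], int]:
    """Give each core a contiguous block of addresses, in declaration order.

    Returns the address range of each core and the narrowest address bus
    (in bits) that can reach every register.
    """
    regions: Dict[str, range] = {}
    next_addr = 0
    for name, size in core_sizes.items():
        regions[name] = range(next_addr, next_addr + size)
        next_addr += size
    # The highest used address is next_addr - 1; keep at least one bit.
    addr_width = max(1, (next_addr - 1).bit_length())
    return regions, addr_width


# Hypothetical cores with made-up register counts.
regions, width = assign_addresses({"my_io_core": 4, "my_logic_analyzer": 16})
print(regions["my_io_core"])         # range(0, 4)
print(regions["my_logic_analyzer"])  # range(4, 20)
print(width)                         # 5
```

Twenty registers need addresses 0 through 19, which fit in 5 bits.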
These registers exist within whatever core you've asked manta to generate - be that a logic analyzer, or I/O. Each core is daisy-chained after the previous one in the arrangement shown below. This is done to provide maximum flexibility for place-and-route, as the critical timing path only exists between adjacent cores. If a hub-and-spoke arrangement were used, the critical timing path would exist between the hub and every spoke. For designs that span multiple clock domains and need to use BRAMs on the edges of clock domains for CDC, this makes designs that are very difficult to route.
# Block Diagram
# Message-Passing Format
Data moves between the host computer and the FPGA over UART. UART's just an interface though, so the choice of what data to send is arbitrary. Manta encodes data exchanged between devices as messages, which are ASCII text in the following format:
```[preamble] [address] [data (optional)] [EOL]```
- The __preamble__ is just the character `M`, encoded as ASCII.
- The __address__ is the memory location we wish to access. This must exist somewhere in the address space consumed by the cores. If it does not, then a write operation addressed here will do nothing, and a read operation addressed here will return nothing. The address itself is transmitted as hex values, encoded as ASCII using the characters `0-9` and `A-F`. All addresses are a single byte, so there cannot be more than 256 register locations onboard.
- The __data__ gets stored in the memory location provided by __address__. The presence of any number of data bytes indicates a write operation, while no data bytes indicates a read operation.
- An __EOL__ indicates the end of the message. CR, LF, or both are considered valid delimiters for messages sent to the FPGA. For messages sent to the host machine, the FPGA will send CRLF.
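
On the host side, building a message in this format takes only a few lines of Python. The helpers below are a sketch written from the description above; `encode_read` and `encode_write` are hypothetical names and are not Manta's API.

```python
def encode_read(address: int) -> bytes:
    """Build a read request: preamble, two hex digits of address, CRLF."""
    if not 0 <= address <= 0xFF:
        raise ValueError("address must fit in a single byte")
    return f"M{address:02X}\r\n".encode("ascii")


def encode_write(address: int, data: bytes) -> bytes:
    """Build a write request: like a read, with hex-encoded data appended."""
    if not 0 <= address <= 0xFF:
        raise ValueError("address must fit in a single byte")
    if not data:
        raise ValueError("a write needs at least one data byte")
    return f"M{address:02X}{data.hex().upper()}\r\n".encode("ascii")


print(encode_read(0xBE))             # b'MBE\r\n'
print(encode_write(0xBE, b"\xef"))   # b'MBEEF\r\n'
```

The resulting bytes could then be handed to a serial port object's `write` method.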
## Example Messages
Some examples of valid messages to the FPGA are:
```MBEEF\r\n```, which writes `0xEF` to the memory at location `0xBE`.
```MBE\r\n```, which reads the value of the memory at location `0xBE`.
Some examples of invalid messages to the FPGA are:
```MBEEEF\r\n```, which contains 12 bits of data, which isn't a multiple of 8.
```NBEEF\r\n```, which contains the wrong preamble.
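
The valid and invalid examples above can be checked with a small software model of the receiver. This is a sketch based only on the format described in this document, not the FPGA's actual state machine; `parse_message` is a hypothetical name.

```python
import re
from typing import Optional, Tuple

# 'M', one address byte (two hex digits), then zero or more whole data bytes.
_MESSAGE = re.compile(r"M([0-9A-F]{2})((?:[0-9A-F]{2})*)")


def parse_message(line: str) -> Optional[Tuple[str, int, bytes]]:
    """Return (operation, address, data), or None if the message is invalid."""
    body = line.rstrip("\r\n")  # CR, LF, or both may end a message
    match = _MESSAGE.fullmatch(body)
    if match is None:
        return None
    address = int(match.group(1), 16)
    data = bytes.fromhex(match.group(2))
    # Any data bytes make it a write; none makes it a read.
    return ("write" if data else "read", address, data)


print(parse_message("MBEEF\r\n"))   # ('write', 190, b'\xef')
print(parse_message("MBE\r\n"))     # ('read', 190, b'')
print(parse_message("MBEEEF\r\n"))  # None (12 bits of data)
print(parse_message("NBEEF\r\n"))   # None (wrong preamble)
```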
# AXI-ish Interfaces