finish most of docs ahead of v1
This commit is contained in:
parent
23418066f9
commit
f60ec0c0e3
Binary file not shown.
|
After Width: | Height: | Size: 653 KiB |
|
|
@ -68,59 +68,29 @@ A Block Memory core is used in the [video_sprite](https://github.com/fischermose
|
|||
|
||||
Each Block Memory core is actually a set of 16-bit wide BRAMs with their ports concatenated together, with any spare bits masked off. Here's a diagram:
|
||||
|
||||
<img src="/assets/block_memory_architecture.png" alt="drawing" width="800"/>
|
||||
|
||||
|
||||
|
||||
This has one major consequence: if the core doesn't have a width that's an exact multiple of 16, Vivado will throw some warnings during synthesis as it optimizes out the unused bits. This is expected behavior (and rather convenient, actually).
|
||||
|
||||
The warnings are a little annoying, but not having to manually deal with the unused bits simplifies the implementation immensely - no Python is needed to generate the core, and it'll configure itself just based on Verilog parameters. This turns the block memory core from a complicated beast requiring a bunch of conditional instantiation in Python to a simple ~_100 line_ [Verilog file](https://github.com/fischermoseley/manta/blob/main/src/manta/block_memory.v).
|
||||
This has one major consequence: if the core doesn't have a width that's an exact multiple of 16, synthesis engines (Vivado in particular) will throw some warnings as they optimize out the unused bits. This is expected behavior, and while the warnings are a little annoying, not having to manually deal with the unused bits simplifies the implementation immensely. No Python is needed to generate the core, and it'll configure itself just based on Verilog parameters. This turns the block memory core from a complicated conditionally-instantiated beast to a simple ~_100 line_ [Verilog file](https://github.com/fischermoseley/manta/blob/main/src/manta/block_memory.v).
|
||||
|
||||
### Address Assignment
|
||||
|
||||
Since each $n$-bit wide block memory is actually $\lceil n/16 \rceil$ BRAMs under the hood, addressing the BRAMs correctly from the bus is important. BRAMs are organized such that the 16-bit words that make up each entry in the Block Memory core are next to each other in bus address space. For instance, if one were to configure a core of width 34, then the memory map would be:
|
||||
Since each $n$-bit wide block memory is actually $\lceil n/16 \rceil$ BRAMs under the hood, addressing the BRAMs correctly from Manta's internal bus is important. BRAMs are organized such that the 16-bit slices of each $n$-bit word in the Block Memory core are placed next to each other in bus address space. For instance, a 34-bit wide block memory would exist on Manta's internal bus as:
|
||||
|
||||
```
|
||||
bus address : | bram address
|
||||
BUS_BASE_ADDR + 0 : address 0, bits [0:15]
|
||||
BUS_BASE_ADDR + 1 : address 0, bits [16:31]
|
||||
BUS_BASE_ADDR + 2 : address 0, bits [32:33]
|
||||
BUS_BASE_ADDR + 3 : address 1, bits [0:15]
|
||||
BUS_BASE_ADDR + 4 : address 1, bits [16:31]
|
||||
...
|
||||
```
|
||||
|
||||
corresponding to each
|
||||
| Bus Address Space | BRAM Address Space |
|
||||
| ----------- | -------------------- |
|
||||
| BASE_ADDR + 0 | address 0, bits 0-15 |
|
||||
| BASE_ADDR + 1 | address 0, bits 16-31|
|
||||
| BASE_ADDR + 2 | address 0, bits 32-33|
|
||||
| BASE_ADDR + 3 | address 1, bits 0-15 |
|
||||
| BASE_ADDR + 4 | address 1, bits 16-31|
|
||||
| BASE_ADDR + 5 | address 1, bits 32-33|
|
||||
|
||||
...and so on.
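The mapping in the table above can be computed directly. Here's a minimal sketch of that arithmetic; the function names and the `base_addr` argument are made up for illustration and aren't part of Manta's API:

```python
import math

BUS_WIDTH = 16  # Manta's internal bus carries 16-bit data words


def slices_per_entry(width: int) -> int:
    """Number of 16-bit BRAMs (and bus addresses) behind one entry."""
    return math.ceil(width / BUS_WIDTH)


def bus_address(base_addr: int, bram_addr: int, slice_index: int, width: int) -> int:
    """Bus address of one 16-bit slice of the entry at bram_addr."""
    n = slices_per_entry(width)
    if not 0 <= slice_index < n:
        raise ValueError("slice_index out of range for this width")
    # Slices of the same entry sit next to each other in bus address space.
    return base_addr + bram_addr * n + slice_index


def slice_bits(slice_index: int, width: int) -> tuple[int, int]:
    """Inclusive (low, high) bit range held by a slice; the last may be partial."""
    low = slice_index * BUS_WIDTH
    high = min(low + BUS_WIDTH, width) - 1
    return low, high
```

For the 34-bit example, `bus_address(0, 1, 2, 34)` lands on `BASE_ADDR + 5`, and `slice_bits(2, 34)` gives bits 32-33, matching the last row of the table.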
|
||||
|
||||
### Synchronicity
|
||||
|
||||
Since Manta's [data bus](../system_architecture) is only 16 bits wide, it's only possible to manipulate the BRAM core in 16-bit increments. This means that if you have a BRAM that's ≤16 bits wide, you'll only need to issue a single bus transaction to read/write one entry in the BRAM. However, if you have a BRAM that's >16 bits wide, you'll need to issue a bus transaction to update each 16-bit slice of it. For instance, updating a single entry in a 33-bit wide BRAM would require sending 3 messages to the FPGA: one for bits 0-15, another for bits 16-31, and one for bit 32. If your application expects each BRAM entry to update instantaneously, this could be problematic. Here are some examples:
|
||||
Since Manta's [data bus](../system_architecture) is only 16 bits wide, it's only possible to manipulate the BRAM core in 16-bit increments. This means that if you have a BRAM that's ≤16 bits wide, you'll only need to issue a single bus transaction to read/write one entry in the BRAM. However, if you have a BRAM that's >16 bits wide, you'll need to issue a bus transaction to update each 16-bit slice of it. For instance, updating a single entry in a 33-bit wide BRAM would require sending 3 messages to the FPGA: one for bits 0-15, another for bits 16-31, and one for bit 32. If your application expects each BRAM entry to update instantaneously, this could be problematic.
|
||||
|
||||
!!! warning "Choice of interface matters here!"
|
||||
There's a few different ways to solve this - you could use an IO core to signal when a BRAM's contents are valid - or you could ping-pong between two BRAMs while one is being modified. The choice is yours, and Manta makes no attempt to prescribe any particular approach.
|
||||
|
||||
The interface you use (and to a lesser extent, your operating system) will determine the space between bus transactions. For instance, 100Mbit Ethernet is roughly a thousand times faster than 115200bps UART, so issuing three bus transactions will take roughly a thousandth of the time.
|
||||
|
||||
### Example 1 - ARP Caching
|
||||
For instance, if you're making a network interface and you'd like to peek at your ARP cache that lives in a BRAM, it'll take three bus transactions to read each 48-bit MAC address. This will take time, during which your BRAM cache could update, leaving you with 16-bit slices that correspond to different states of the cache.
|
||||
|
||||
In a situation like this, you might want to pause writes to your BRAM while you dump its contents over serial. Implementing a flag to signal when a read operation is underway is simple - adding an [IO core](../io_core) to your Manta instance would accomplish this. You'd assert the flag in Python which disables writes to the user port on the FPGA, perform your reads, and then deassert the flag.
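The pause-then-read sequence described above could be sketched as follows. This is only an illustration: `set_pause` and `read_bus` are hypothetical callables standing in for whatever sets your IO core flag and performs one 16-bit bus read, and they aren't part of Manta's API.

```python
from typing import Callable, List

BUS_WIDTH = 16  # one bus transaction moves 16 bits


def read_entries_paused(
    set_pause: Callable[[int], None],
    read_bus: Callable[[int], int],
    base_addr: int,
    depth: int,
    width: int,
) -> List[int]:
    """Read every BRAM entry while user-side writes are paused."""
    slices = -(-width // BUS_WIDTH)  # ceiling division
    entries = []
    set_pause(1)  # assert the flag: the FPGA stops writing to the BRAM
    try:
        for entry in range(depth):
            value = 0
            for s in range(slices):
                word = read_bus(base_addr + entry * slices + s)
                value |= (word & 0xFFFF) << (BUS_WIDTH * s)
            entries.append(value & ((1 << width) - 1))
    finally:
        set_pause(0)  # always deassert, even if a read fails
    return entries
```

With a 48-bit wide ARP cache, each entry costs three calls to `read_bus`, and the flag stays asserted until the whole dump is done. The `try`/`finally` is there so that a failed read doesn't leave the cache frozen.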
|
||||
|
||||
### Example 2 - Neural Network Accelerator
|
||||
This problem would also arise if you were making a NN accelerator, with 32-bit weights stored in a BRAM updated by the host machine. Each entry would need two write operations, and during the time between the first and second write, the entry would contain a MSB from one weight, and a LSB from another. This may not be desirable - depending on what you do with your inference results, running the network with the invalid weight might be problematic.
|
||||
|
||||
If you can pause inference, then the flag-based solution with an IO core described in the prior example could work. However, if you cannot pause inference, you could use a second BRAM as a cache. Run inference off one BRAM, and write new weights into another. Once all the weights have been written, assert a flag with an IO Core, and switch the BRAM that weights are obtained from. This guarantees that the BRAM contents are always valid.
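To make the two-BRAM idea concrete, here's a small software model of it. It's a sketch of the behavior only, not HDL and not anything Manta generates; the class and method names are invented for illustration, and on real hardware `swap()` would correspond to toggling an IO core output.

```python
class PingPongBuffer:
    """Software model of two BRAMs, one read by inference, one written by the host."""

    def __init__(self, depth: int):
        self.banks = [[0] * depth, [0] * depth]
        self.active = 0  # bank the user logic currently reads from

    def read(self, addr: int) -> int:
        """What the inference logic sees: always the active bank."""
        return self.banks[self.active][addr]

    def write_inactive(self, addr: int, value: int) -> None:
        """What the host does: update the bank that isn't in use."""
        self.banks[1 - self.active][addr] = value

    def swap(self) -> None:
        """Switch banks once every new weight has been written."""
        self.active = 1 - self.active
```

Until `swap()` is called, reads only ever see the old, complete set of weights, no matter how many bus transactions the update takes.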
|
||||
|
||||
\section{Block Memory Core}
|
||||
\subsection{Description}
|
||||
Block memory, also referred to as block RAM (BRAM), is a staple of FPGA designs. It consists of dedicated blocks of memory spaced throughout the FPGA die, and is very commonly used in hardware designs due to its configurability, simplicity, and bandwidth. Although each block memory primitive is made of fixed-function silicon, EDA tools allow them to be mapped to logical memories of arbitrary width and depth, combining and masking off primitives when necessary. These are exposed to the user’s logic over \textit{ports}, which contain four signals for reading and writing to the BRAM. These signals specify the address, input data, output data, and the desired operation (read/write) to the core. Most BRAM primitives include two ports, each of which may live on a separate clock domain, making them useful for clock domain crossing in addition to data storage. Each port can handle a memory operation on every clock edge, which is practically the maximum memory bandwidth possible in any digital system.
|
||||
|
||||
Central to Manta’s design objectives is the ability to debug user logic in an intuitive and familiar manner. Practically, this means being able to interact with bits on the FPGA in whatever method they’re presented. Block memory is one such method, and their pervasive use is acknowledged by the inclusion of a Block Memory Core in Manta. This core takes a standard dual-port, dual-clock BRAM and connects one port to Manta’s internal bus, and gives the other port to the user. This means that both the host machine and the user’s logic have access to the BRAM, allowing large amounts of data to be shared between both devices.
|
||||
|
||||
This is accomplished by architecting the Block Memory Core as shown in Figure \ref{fig_block_mem_core_arch}. Internally, the Block Memory Core consists of multiple BRAMs connected in parallel. This is done to maintain the ability to create block memory of arbitrary width and depth. Manta’s internal bus uses 16-bit data words, so if a user wishes to create a BRAM of width $N$ where $N$ is larger than 16 bits, then multiple addresses in Manta’s memory are required to contain the data at a single BRAM address. These multiple addresses are created by creating many smaller block memories, each of which stores a 16-bit slice of the $N$-bit wide data. As a result, $\lceil \frac{N}{16} \rceil$ smaller BRAMs are needed to present a BRAM of width $N$ to the user. One set of ports on these smaller BRAMs is concatenated together, which presents an $N$-bit wide BRAM to the user. The ports in the other set are each connected individually to Manta’s internal bus.
|
||||
|
||||
\begin{figure}[h!]
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{block_memory_architecture.png}
|
||||
\caption[Block diagram of the Block Memory Core.]{Block diagram of the Block Memory Core. Blocks in blue are clocked on the bus clock, and blocks in orange are clocked on the user clock.}
|
||||
\label{fig_block_mem_core_arch}
|
||||
\end{figure}
|
||||
Lastly, the interface you use (and to a lesser extent, your operating system) will determine the space between bus transactions. For instance, 100Mbit Ethernet is roughly a thousand times faster than 115200bps UART, so the time during which the BRAM is invalid is roughly a thousand times shorter.
|
||||
|
|
|
|||
26
doc/index.md
26
doc/index.md
|
|
@ -14,6 +14,7 @@ You may find this core useful for:
|
|||
|
||||
* _Verifying specification adherence for connected hardware_ - for instance, you're writing a S/PDIF decoder that works in simulation, but fails in hardware. The logic analyzer core can record a cycle-by-cycle capture of what's coming off the cable, letting you verify that your input signals are what you expect. Even better, Manta will let you play that capture back in your preferred simulator, letting you feed the exact same inputs to your module in simulation and check your logic.
|
||||
|
||||
* _Capturing arbitrary data_ - you're working on a DSP project, and you'd like to grab some test data from your onboard ADCs to start prototyping your signal processing with. Manta will grab that data, and export it for you.
|
||||
|
||||
### __I/O Core__
|
||||
|
||||
|
|
@ -21,10 +22,11 @@ _More details available on the [full documentation page](./io_core.md)._
|
|||
|
||||
This core presents a series of user-accessible registers to the FPGA fabric, which may be configured as either inputs or outputs. The value of an input register can be read off the FPGA by the host machine, and the value of an output register on the FPGA may be set by the host machine. This is handy for getting small amounts of information into and out of the FPGA, debugging, configuration, or experimentation. This concept is very similar to the Xilinx [Virtual IO](https://docs.xilinx.com/v/u/en-US/pg159-vio) and Intel [In-System Sources and Probes](https://www.intel.com/content/www/us/en/docs/programmable/683552/18-1/in-system-sources-and-probes-66964.html) tools.
|
||||
|
||||
You may find this core useful for:
|
||||
|
||||
* _Prototyping designs in Python, and incrementally migrating them to hardware_ - you're working on some real-time signal processing, but you want to prototype it with some sample data in Numpy before meticulously implementing everything in Verilog.
|
||||
|
||||
* _Making dashboards_
|
||||
* _Making dashboards_ - you'd like to get some telemetry out of your existing FPGA design and display it nicely, but you don't want to implement an interface, design a packetization scheme, and write a library.
|
||||
|
||||
### __Block Memory Cores__
|
||||
|
||||
|
|
@ -32,15 +34,25 @@ _More details available on the [full documentation page](./block_memory_core.md)
|
|||
|
||||
This core creates a two-port block memory on the FPGA, and gives one port to the host machine, and the other to your logic on the FPGA. The width and depth of this block memory are configurable, allowing large chunks of arbitrarily-sized data to be shuffled onto and off of the FPGA by the host machine, via the Python API. This lets you establish a transport layer between the host and FPGA that treats the data exactly as it exists on the FPGA.
|
||||
|
||||
* _Moving generic data between a host and connected FPGA_ - you're working on a cool new ML accelerator, but you don't want to think about how to get training data and weights out of TensorFlow, across some interface, and into your core.
|
||||
You may find this core useful for:
|
||||
|
||||
* _Hand-tuning image sprites_
|
||||
* _Moving data between a host and connected FPGA_ - you're working on a cool new machine learning accelerator, but you don't want to think about how to get training data and weights out of TensorFlow, and into your core.
|
||||
|
||||
* _Hand-tuning ROMs_ - you're designing a digital filter for a DSP project and would like to tune it in real-time, or you're developing a soft processor and want to upload program code without rebuilding a bitstream.
|
||||
|
||||
|
||||
## Dependencies
|
||||
|
||||
Mant is written in Python, and generates Verilog-2001 HDL. It's cross-platform, and its only strict dependency is pyYAML. However, [pySerial](https://github.com/pyserial/pyserial) is required for using UART, [scapy](https://github.com/secdev/scapy) is required for using Ethernet, and [pyvcd](https://github.com/westerndigitalcorporation/pyvcd) is required if you want to export a waveform from the Logic Analyzer core to a `.vcd` file.
|
||||
|
||||
Manta is written in Python, and generates Verilog-2001 HDL. It's cross-platform, and its only strict dependency is pyYAML. However, [pySerial](https://github.com/pyserial/pyserial) is required for using UART, [scapy](https://github.com/secdev/scapy) is required for using Ethernet, and [pyvcd](https://github.com/westerndigitalcorporation/pyvcd) is required if you want to export a waveform from the Logic Analyzer core to a `.vcd` file.
|
||||
|
||||
## About
|
||||
Manta was originally developed as part of my [Master's Thesis at MIT](./thesis.pdf) in 2023, done under the supervision of Dr. Joe Steinmeyer. But I think it's a neat tool, so I'm still working on it :)
|
||||
Manta and its source code are released under a [GPLv3 license](https://github.com/fischermoseley/manta/blob/main/LICENSE.txt), and it was originally developed as part of my [Master's Thesis at MIT](https://hdl.handle.net/1721.1/151223) in 2023, done under the supervision of [Dr. Joe Steinmeyer](https://www.jodalyst.com/). The thesis itself is copyrighted by Fischer Moseley (me!), but feel free to use the following Bibtex if you'd like to cite it:
|
||||
|
||||
```
|
||||
@misc{manta2023,
|
||||
author={Fischer Moseley},
|
||||
title={Manta: An In-Situ Debugging Tool for Programmable Hardware},
|
||||
year={2023},
|
||||
month={may},
|
||||
howpublished={\url{https://hdl.handle.net/1721.1/151223}}
|
||||
}
|
||||
```
|
||||
|
|
@ -1,17 +0,0 @@
|
|||
## Welcome Back!
|
||||
|
||||
We're going to jump right on in on this one. Today's testing is going to focus on one of the cornerstones of our medium-scale FPGA projects - the BRAM! Manta's been designed primarily as a debugging tool - but more generally its purpose is to shuffle data about. And a BRAM is one of the more useful places on a FPGA that it can go.
|
||||
|
||||
In today's exercise, we'll be revisiting our lab03 (popcat pong) code, which used a BRAM to store the contents of an image, which we rendered as a sprite. Here we'll be doing almost exactly the same thing, except we'll be hooking our BRAM up to Manta, which will let us put whatever image we'd like into the BRAM. We'll just be sending data _into_ the BRAM, but we could just as easily pull data out of it - say if we had a VGA camera connected to our board that dumped images into a framebuffer, which we wanted to dump to a host machine.
|
||||
|
||||
This should hopefully be nice and quick. Go ahead and grab the starter code from here:
|
||||
|
||||
|
||||
And just like last time, we'll need to create a config file that defines our BRAM - what it's called, how many bits wide the input is, and how many entries it has (depth). Here's an example configuration:
|
||||
|
||||
|
||||
```yaml
|
||||
mam: bro
|
||||
```
|
||||
|
||||
Go ahead and make a configuration of your own like this, and name it something super creative and interesting. I named mine `manta.yaml`.
|
||||
|
|
@ -1,64 +0,0 @@
|
|||
You'll need a working installation of Manta, which you can get by following the [installation instructions](./installation.md). You'll also likely want a copy of the GitHub repo, which contains the code for this tutorial in the `examples/` folder.
|
||||
|
||||
This tutorial configures Manta with an __IO Core__, which creates a `manta` module in Verilog that exposes a set of registers. These registers connect to your HDL, and each may be configured as either an input or an output. This module is configured from a YAML file, which looks like this:
|
||||
|
||||
```yaml
|
||||
---
|
||||
cores:
|
||||
my_io_core:
|
||||
type: io
|
||||
|
||||
inputs:
|
||||
spike: 1
|
||||
jet: 12
|
||||
valentine: 6
|
||||
ed: 9
|
||||
ein: 16
|
||||
|
||||
outputs:
|
||||
shepherd: 10
|
||||
wrex: 1
|
||||
tali: 5
|
||||
garrus: 3
|
||||
|
||||
uart:
|
||||
port: "auto"
|
||||
baudrate: 115200
|
||||
clock_freq: 100000000
|
||||
```
|
||||
|
||||
There's two things going on in this file. First, we've added an IO core to our Manta module, and named it `my_io_core`. We've also specified what registers we'd like it to expose, and provided names and bit widths for each. Feel free to name these however you'd like, but for simplicity it's usually best to give them the same name as what they connect to in your code. If you'd like to know more about what the IO core can do, check out its [docs](./io_core.md).
|
||||
|
||||
Second, this file specifies that we'll be using UART to communicate between the host machine and the FPGA. We've asked Manta to try and find which serial port on the host machine is connected to the FPGA by specifying `port: "auto"`, but if this doesn't work you can specify `"COM1"`, `"/dev/ttyUSB0"`, or whatever descriptor your operating system gives it. Because of the way UART works, the baudrate must be set beforehand, and Manta needs to know how fast the FPGA clock is so that it can generate that baudrate on the FPGA. If you'd like to know more about the UART interface, check out the [docs](./uart.md)!
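To see why both `baudrate` and `clock_freq` are needed, here's the arithmetic a typical UART uses to derive its bit timing from the system clock. This is a sketch of the usual approach, assuming a simple integer clock divider; it isn't taken from Manta's source, and the function names are made up for illustration.

```python
def clocks_per_baud(clock_freq: int, baudrate: int) -> int:
    """FPGA clock cycles per UART bit, rounded to the nearest integer."""
    return round(clock_freq / baudrate)


def baud_error(clock_freq: int, baudrate: int) -> float:
    """Relative error between the achievable and the requested baudrate."""
    actual = clock_freq / clocks_per_baud(clock_freq, baudrate)
    return (actual - baudrate) / baudrate
```

With the values in the config above, `clocks_per_baud(100_000_000, 115200)` comes out to 868 clock cycles per bit, and the resulting baudrate is off by well under 0.1%.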
|
||||
|
||||
It's worth noting that we could also add more cores to our Manta configuration. Depending on your application, a [Logic Analyzer](./logic_analyzer_core.md) core or [Block Memory](./block_memory_core.md) core might be useful! Manta supports any number of cores of any type, so you could even add another IO core (although you might want to consider just expanding your existing one!).
|
||||
|
||||
The snippet shown above is just an example, and our actual configuration is in the `examples/` folder of the GitHub repo. Feel free to grab either the UART or Ethernet variant - the only difference is the interface used. Both variants create a Manta instance with an IO core, where the onboard switches and buttons are wired as inputs, and the LEDs are connected as outputs.
|
||||
|
||||
Once the configuration has been specified, we'll need to generate the Verilog source for the module we'd like to instantiate on the FPGA. This is done by:
|
||||
|
||||
`manta gen <path_to_config_file> <path_to_output_verilog>`
|
||||
|
||||
In the case of the example code in the GitHub repo, this is:
|
||||
|
||||
`manta gen manta.yaml src/manta.v`
|
||||
|
||||
Go ahead and have a look at the Verilog file it just spat out - it contains a definition for a module called `manta`, which we'll instantiate in our design. There's also a copy-and-pasteable module instantiation at the top of the generated Verilog file. The GitHub example does this in the top-level module, where it wires the IO core to the Nexys A7's onboard IO.
|
||||
|
||||
Feel free to build this however you'd like - we like running Vivado in batch mode with the provided build script, which you can do with `vivado -mode batch -source build.tcl`. Upload the generated bitstream to your board.
|
||||
|
||||
### Using the Python API
|
||||
|
||||
Now that Manta's on the FPGA, we can control the IO core from our host machine. Using the API looks about like the following:
|
||||
|
||||
```python
|
||||
from manta import Manta
|
||||
m = Manta('manta.yaml')
|
||||
m.my_io_core.led.set(1)
|
||||
|
||||
print(m.my_io_core.btnc.get())
|
||||
```
|
||||
|
||||
This creates a Manta object from the same configuration file we used earlier, which contains all of the cores we specified. In this case it's just the single IO core, which can have its outputs registers written to (and input registers read from) with the methods above. The [`examples/api_example.py`](https://github.com/fischermoseley/manta/tree/main/examples/nexys_a7/io_core_uart/api_example.py) script uses this to display a pattern on the onboard LEDs, and report the status of the onboard buttons and switches.
|
||||
|
||||
This is just a quick example! More details about the IO core can be found on [its page](./io_core.md).
|
||||
|
|
@ -1,133 +0,0 @@
|
|||
## Welcome back!
|
||||
Howdy and welcome back to the party! Today's format is going to be a little different - we'll be splitting y'all up into two groups and having one set test Manta, and one set test lab-bc 2.0 - in _parallel_. This will hopefully take less time, give us some more meaningful tests, and just be better on the whole. It'd be kinda embarrassing if this doesn't work, parallelization is kind of our _whole thing_.
|
||||
|
||||
## Update Manta
|
||||
There's been a fair bit of work since this time last week! Actually I'm curious how many commits there've been...
|
||||
|
||||
```
|
||||
$ git log --since='1 week ago' | grep Fischer | wc -l
|
||||
|
||||
41
|
||||
```
|
||||
|
||||
Oh god - we should definitely update Manta. And I should definitely, like, go outside or something. Go ahead and use the instructions on the [installation](../installation) page to get yourself up to date.
|
||||
|
||||
## Boilerplate
|
||||
While you've got a terminal open, go ahead and grab the starter code from Fischer's [super exclusive, boutique, and bougie code hosting site](https://github.com/fischermoseley/tutorial_2_template).
|
||||
|
||||
## The fun part!
|
||||
Today we'll be experimenting with the most powerful Manta feature - the Logic Analyzer core. If we ever connected an ILA to your code last semester (or used a proper, benchtop logic analyzer like the ones on top of the tables), then this will feel pretty familiar. But if not, perfect :)
|
||||
|
||||
The logic analyzer core connects to a set of signals that you want to investigate, which you do by _capturing_ them. When a _trigger condition_ is met, the logic analyzer core will record the value of each signal to internal memory, until that memory is full. That memory is then read back by the host machine, and exported to a `.vcd` file which we can open in GTKWave and poke around with.
|
||||
|
||||
And later, we'll "play back" that capture data in our own simulation, where we'll prototype a PS/2 decoder. And if it works there on data captured from the real world, it should work just dandy when we go to implement it in hardware.
|
||||
|
||||
We'll be kicking the tires on the Logic Analyzer core in the context of the PS/2 keyboards we used in [lab02](https://fpga.mit.edu/6205/F22/labs/lab02). If you remember, we had a catsoop checker on the page that ran a testbench on your code, but it was a little unreliable and would often fail code that would actually work perfectly fine in hardware. This was our fault - our testbench didn't model how the keyboard worked completely correctly - but in this exercise we'll work around that by just yoinking data from the real world.
|
||||
|
||||
## Quick Blast to the Past
|
||||
|
||||
Here's a quick refresher on PS/2, lifted straight from the [lab02](https://fpga.mit.edu/6205/F22/labs/lab02#section_8) text:
|
||||
|
||||
PS/2 works by representing each character on the keyboard as a one-byte value from a predefined table, and sending that across the interface when a key is pressed.
|
||||
|
||||
Whoa whoa whoa, what does "sent" mean?
|
||||
|
||||
Basically, the PS/2 protocol runs over two connections / 'lines' between the device (i.e. your keyboard) and the PS/2 controller. The first of these connections is a clock, driven at a few kilohertz by the keyboard. The second of these connections is where the actual data flows. When the clock line drops from high to low, we can grab the value of the data line and store it for later use - eventually stacking up a full eight bits of information. This byte of information corresponds to a "scancode" in the above table, which maps to the character that you just pushed on your keyboard. However, as is often the case in communication of data, more than just the message must be sent in order to avoid ambiguity.
|
||||
|
||||
The transmission of an entire byte therefore looks like the following. When you press down a key (or release a key...see below), you're gonna see the following happen:
|
||||
|
||||
- The clock line, which is high at idle time, is going to start ticking at its frequency of a few kilohertz. Meanwhile, a start bit, which signals the beginning of the transmission (and is always zero), is asserted on the data line. So at the first falling edge of the clock in a given sequence, you're going to see a 'zero' value asserted on the data line.
|
||||
- The next eight falling clock edges will bring along 8 data bits, which contain the byte that represents the key you pressed.
|
||||
- Next you'll see a parity bit, which is used for error checking. This uses the same method as you saw in pset 03. PS/2 uses odd parity, which means that if there is an even number of 1's in the 8 bits of the actual message, a 1 is in the parity bit slot. If not, then a 0 is in the parity bit slot.
|
||||
- Finally, you'll see a stop bit which is always a one. This signals the end of the transmission, and its receipt corresponds with the last falling clock edge you'll see until the next key is pressed.
|
||||
|
||||
For you visual folks, here's a quick diagram summarizing the above.
|
||||
|
||||

|
||||
|
||||
## Adding a logic analyzer
|
||||
Just like last time, we'll be configuring our `manta` instance with a configuration file called `manta.yaml`. There's a template in the starter code, go ahead and tweak it to add in a logic analyzer core according to the [documentation](../logic_analyzer_core). There's a few parameters we'll want to pay close attention to:
|
||||
|
||||
- __Probes__: The signals we want to record. In our case, that's the PS/2 clock and data lines.
|
||||
- __Sample Depth__: How many samples of them we want to record. In this particular configuration we can have up to ~64k samples, but transferring data is a little slow, so let's crank it down to 32k.
|
||||
- __Triggers__: We want our capture to contain a valid PS/2 scancode, so we'll want to trigger when it starts transmitting it. What signal does what in order to begin the transaction?
|
||||
- __Trigger Position__: Let's set this to 200 or so, so that we can make sure that our bus is idling properly before it starts sending data.
|
||||
|
||||
If you've got any questions about the configuration - let me know! Once you're happy with it, go ahead and generate the core, synthesize it, and flash the FPGA:
|
||||
|
||||
```
|
||||
manta gen manta.yaml src/manta.v
|
||||
vivado -mode batch -source build.tcl # or python3 lab-bc.py
|
||||
openFPGALoader -b arty_a7_100t obj/out.bit
|
||||
```
|
||||
|
||||
## Running the Logic Analyzer
|
||||
Lovely! Now we'll want to run our core and __capture__ our signals. We'll throw these into a `.vcd` file, as well as a `.mem` file with the following:
|
||||
|
||||
```
|
||||
manta capture manta.yaml my_logic_analyzer capture.vcd capture.mem
|
||||
```
|
||||
|
||||
This assumes your config file is named `manta.yaml` and your logic analyzer core is named `my_logic_analyzer`. The command tells Manta that you'd like to run your logic analyzer, set the triggers, and wait for the trigger condition - you pressing a key on the keyboard. Once you do, the core will capture the signals, and your computer will read the data. Neato.
|
||||
|
||||
Go ahead and open the `capture.vcd` file with `gtkwave capture.vcd`. This should look like our diagram from above! If it doesn't, lemme know.
|
||||
|
||||
## Onto something useful...
|
||||
This is great! We can see our data and clock line, and it looks like what we expect. If we were working with something less standard than a PS2 keyboard, we could use Manta to double check that the signals received by the FPGA are the signals it expects.
|
||||
|
||||
Let's go one step further - we're going to write a PS/2 decoder in Verilog (I know, I know it's been a while - I tried to pick something easy), but we're going to bypass the annoyance of setting up a testbench. Instead, we're just going to use the capture data we got from before, chuck that into our decoder, and see if we can get it to work in simulation. And once we do, we'll put our module on the FPGA, and see how we did.
|
||||
|
||||
## Playing Back Capture Data
|
||||
There's a little bit of Verilog required to load our capture data from `capture.mem` into the simulation, but conveniently Manta will auto-generate this wrapper for you. If you run
|
||||
|
||||
```
|
||||
manta playback manta.yaml my_logic_analyzer sim/playback.v
|
||||
```
|
||||
|
||||
It'll create a module that outputs `ps2_clk` and `ps2_data`, and place it in `sim/playback.v`. There's an empty testbench in `sim/ps2_decoder_tb.sv`, go ahead and instantiate a copy of the playback module in the testbench. If you have a look at top of `sim/playback.v`, there's a little instantiation template you can copy-paste. Easy peasy.
|
||||
|
||||
While you're there, go ahead and instantiate a copy of ps2_decoder too, and wire it up to the output of the playback module.
|
||||
|
||||
## Writing ps2_decoder
|
||||
Once you've got the plumbing all sorted, go ahead and type up the `ps2_decoder` module itself. VS Code should be installed on all the machines, and if you'd like to simulate with your captured data, just run:
|
||||
|
||||
```
|
||||
iverilog -g2012 -o sim.out sim/ps2_decoder_tb.sv sim/playback.v src/ps2_decoder.sv
|
||||
vvp sim.out
|
||||
```
|
||||
|
||||
!!! tip
|
||||
|
||||
Feel free to ignore the parity bit - just making sure that there's a start and stop bit is sufficient for the time being.
|
||||
|
||||
|
||||
Make sure that the output of your `ps2_decoder` matches the key you pressed when you made the capture. Once you think you've got it working, feel free to throw it on the FPGA and build. The decoder is already instantiated in `top_level.sv`, so you shouldn't have to change anything outside of `ps2_decoder.sv`.
|
||||
|
||||
Once you've got it working on hardware, congratulations!
|
||||
|
||||
## Debrief
|
||||
We made something that worked right the first time, without spending a ton of time making a testbench beforehand. That's pretty cool. But it's got some big caveats:
|
||||
|
||||
- __It only tests the nominal case__: What if we received data whose parity bit didn't match? What if we only received 10 bits instead of 11? We won't know how our ps2_decoder behaves in this case - we'd have to code those cases in manually, and Manta can't help us with that.
|
||||
|
||||
- __It requires hardware__: This is somewhat obvious, but we needed to have our FPGA and a PS/2 keyboard next to us before we ever started writing Verilog. That's not the most convenient thing - especially when PS/2 is simple enough that I could probably write a PS/2 testbench in about the same amount of time it'd take to get the capture data into a simulation.
|
||||
|
||||
However, there were a few things that were pretty cool:
|
||||
|
||||
- __We had an exact representation of the nominal case__: If all data looks like our capture data did, we _know_ our design will work. That's powerful.
|
||||
- __We didn't need to know what PS/2 looks like to start writing the simulation__: If this was your first time working with PS/2, this changes the development process from:
|
||||
- Understand PS/2.
|
||||
- Write PS/2 signal generator in testbench.
|
||||
- Write PS/2 decoder in testbench.
|
||||
|
||||
- and changes it to:
|
||||
- Import PS/2 signal generator to testbench.
|
||||
- Understand PS/2.
|
||||
- Write PS/2 decoder in testbench.
|
||||
|
||||
And that might be useful, especially if you've got a device that you think is misbehaving. Being able to check it while you're writing your decoder is powerful.
|
||||
|
||||
|
||||
Anyway, neither approach is perfect, and Manta's not meant to deliver something that is. It's meant to give you more options, all of which have tradeoffs.
|
||||
|
||||
That's about all I've got for now. Thanks for coming, and grab some pizza and stickers if you don't have some already :) Catch ya next week!
|
||||