Fast Versatile DMA
Copyright (c) 2019-2021 Antmicro
FastVDMA is a DMA controller designed with portability and customizability in mind.
- 2D transfers with configurable stride
- External frame synchronization inputs
FastVDMA performance was measured in synthetic tests that consisted of transferring an NxM buffer of data, where N is the number of 32-bit words per row and M is the number of N-word rows to transfer.
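To make the NxM layout concrete, here is a minimal Python model of the addresses such a transfer would touch. It is a sketch, not the controller's implementation: the function name `line_addresses` and the parameter `gap_words` are hypothetical, and it assumes the stride is expressed as the number of words skipped between the end of one row and the start of the next.

```python
from typing import Iterator

WORD_BYTES = 4  # N is counted in 32-bit words


def line_addresses(start: int, n_words: int, m_lines: int,
                   gap_words: int = 0) -> Iterator[int]:
    """Yield the byte address of every word in an N x M transfer.

    Assumption: gap_words is the number of words skipped between the
    end of one row and the start of the next one.
    """
    addr = start
    for _ in range(m_lines):
        for _ in range(n_words):
            yield addr
            addr += WORD_BYTES
        # Jump over the words that are not part of the transfer.
        addr += gap_words * WORD_BYTES


# 3 rows of 4 words each, skipping 2 words after every row.
addrs = list(line_addresses(start=0x1000, n_words=4, m_lines=3, gap_words=2))
print([hex(a) for a in addrs])
```

With `gap_words=0` the model degenerates to a plain linear transfer of N*M words.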
FastVDMA was verified on the xc7z030fbg676-2 chip, achieving an average throughput of 750MB/s when clocked at 250MHz and an average of 330MB/s at 100MHz under the same workload. Both measurements were taken in a Memory-Stream-Memory configuration using two controllers configured with AXI4 and AXI-Stream buses. The first controller reads data from memory and sends it out via an AXI-Stream interface, while the second receives the stream and writes the received data to a second buffer in memory.
Wishbone and AXI4 busses were connected to a LiteDRAM controller providing access to DDR3 memory. Both busses used a 32-bit data bus to connect to the DDR3 controller.
In both cases the data transferred consisted of a 4MB block of randomly produced data which was subsequently verified for possible transmission errors after each transfer.
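As a rough sanity check on the figures above, the short Python sketch below relates the reported throughput to the theoretical peak of the data path. It assumes a 32-bit data path that moves at most one word per clock cycle, decimal megabytes for the MB/s figures, and a 4MB block of 4 * 1024 * 1024 bytes; none of these assumptions is stated explicitly in the measurements, and the helper names are illustrative.

```python
WORD_BYTES = 4   # assumption: 32-bit data path, at most one word per clock
MB = 1_000_000   # assumption: MB/s figures use decimal megabytes


def bus_utilisation(throughput_mb_s: float, clock_mhz: float) -> float:
    """Fraction of the theoretical peak bandwidth that was achieved."""
    peak_mb_s = clock_mhz * WORD_BYTES  # MHz * bytes per cycle = MB/s
    return throughput_mb_s / peak_mb_s


def transfer_time_ms(size_bytes: int, throughput_mb_s: float) -> float:
    """Time needed to move size_bytes at the given average throughput."""
    return size_bytes / (throughput_mb_s * MB) * 1000


util_250 = bus_utilisation(750, 250)
util_100 = bus_utilisation(330, 100)
time_4mb = transfer_time_ms(4 * 1024 * 1024, 750)

print(f"250 MHz: {util_250:.1%} of peak")
print(f"100 MHz: {util_100:.1%} of peak")
print(f"4MB block at 750MB/s: {time_4mb:.2f} ms")
```

Under these assumptions the controller reaches about three quarters of the peak bandwidth at 250MHz and slightly more at 100MHz.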
The AXI4=>AXI-Stream (MM2S) configuration utilized 425 slices on the xc7z030fbg676-2 chip that was used for testing the design.
AXI-Stream=>AXI4 (S2MM) requires 455 slices on the same chip.
Both configurations were instantiated in the same design and connected in a back-to-back configuration that allowed memory-to-memory transfers while still using configurations equipped with AXI-Stream interfaces.
Because the controller is written in Chisel, it requires
java to be installed; additionally the tests require
FastVDMA can be simulated as a whole but certain components can be tested separately.
You can simulate the full design by running:
To run all tests, including the full test mentioned above, execute:
Each test run generates a
.vcd file which can be opened using GTKWave or any other
Output files are located in separate subdirectories inside the
The full test should generate an
out.png file demonstrating a 2D transfer with configurable stride. The resulting image should look similar to:
To generate a synthesizable Verilog file, run:
The generated file will be named
Current register layout is shown in the table below:
||Interrupt mask register|
||Interrupt status register|
||Reader start address|
||Reader line length|
||Reader line count|
||Reader stride between lines|
||Writer start address|
||Writer line length|
||Writer line count|
||Writer stride between lines|
For a detailed description of register fields check Register fields.
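As an illustration of how the four reader registers combine for a 2D transfer, the Python sketch below computes the values needed to read a rectangular region out of a larger image. It is only a sketch under stated assumptions: one pixel occupies one 32-bit word, the line length and stride are counted in words, the stride is the number of words skipped between lines, and the names `ReaderSettings` and `crop_settings` are hypothetical. Consult the register field description for the authoritative semantics.

```python
from dataclasses import dataclass

WORD_BYTES = 4  # assumption: one pixel per 32-bit word


@dataclass
class ReaderSettings:
    start_address: int  # byte address of the first word to read
    line_length: int    # words per line
    line_count: int     # number of lines
    stride: int         # words skipped between lines (assumed meaning)


def crop_settings(base: int, image_width: int,
                  x: int, y: int, width: int, height: int) -> ReaderSettings:
    """Settings for reading a width x height region at (x, y) of an image."""
    if x < 0 or y < 0 or width <= 0 or height <= 0:
        raise ValueError("region must have positive size and non-negative origin")
    if x + width > image_width:
        raise ValueError("region does not fit inside an image line")
    return ReaderSettings(
        start_address=base + (y * image_width + x) * WORD_BYTES,
        line_length=width,
        line_count=height,
        stride=image_width - width,
    )


settings = crop_settings(base=0x1000_0000, image_width=640,
                         x=16, y=8, width=128, height=64)
print(hex(settings.start_address), settings.line_length,
      settings.line_count, settings.stride)
```

The same calculation applies to the writer registers when the destination is a region inside a larger buffer.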
You can also check WorkerCSRWrapper for more details on how the CSRs are attached to the DMA logic (
io.csr(0) refers to
0x04 and so on).
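The index-to-address relationship hinted at above can be written down as a small Python helper. This is a sketch that assumes every CSR is a 32-bit register laid out contiguously, so consecutive `io.csr` indices are 4 bytes apart; the function names are illustrative and not part of the project.

```python
REGISTER_BYTES = 4  # assumption: contiguous 32-bit registers


def csr_offset(index: int) -> int:
    """Byte offset of the register behind io.csr(index)."""
    if index < 0:
        raise ValueError("CSR index must be non-negative")
    return index * REGISTER_BYTES


def csr_index(offset: int) -> int:
    """Inverse mapping: io.csr index for a word-aligned byte offset."""
    if offset < 0 or offset % REGISTER_BYTES != 0:
        raise ValueError("offset must be non-negative and word-aligned")
    return offset // REGISTER_BYTES


for i in range(4):
    print(f"io.csr({i}) -> 0x{csr_offset(i):02x}")
```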
Configuration for the DMA is located in the DMATop file.
Most of the settings are defined in the
DMATop companion object but to change which busses are used, the
DMATop class must be modified to contain correct
io bundles and
After changing the interfaces used in the DMATop class, verify that the companion object is still configured correctly.
Source code structure
src/main/scala/DMAController contains sources of the DMA controller
src/test/scala/DMAController contains tests