
Getting started with FPGA control development

Benoît Steinmann — Wed, 02 Jun 2021 11:37:17 +0000

This note explains how to get started with the implementation of power converter control algorithms in the FPGA of imperix power electronic controllers. The benefit of offloading all or parts of the computations from the CPU to the FPGA is that it often results in much faster closed-loop control systems.

First, the FPGA control starter template is presented and a tutorial on how to create this template is provided. Then, the reader will learn how to retrieve ADC results from the FPGA as well as how to exchange data with the CPU. The page ends with a simple hello-world example illustrating all the key steps of FPGA control implementation.

This page is the first of a 3-part tutorial explaining step-by-step how to implement the closed-loop control of a buck converter in FPGA without using VHDL or Verilog. The second note explains how to generate a PWM modulator in FPGA using the Simulink blockset Xilinx System Generator or MATLAB HDL Coder. The last note shows how to create the PI-based current control using high-level synthesis with Xilinx Model Composer (Simulink blockset) or Xilinx Vitis HLS (C++).

To find all FPGA-related notes, please visit the FPGA development homepage.

Presentation of the FPGA control starter template

The FPGA control starter template allows for easy integration of custom FPGA-based control algorithms in the sandbox area of the B-Box RCP or the B-Board PRO. As shown in the image below, it consists of:

the obfuscated “imperix firmware IP” which contains the FPGA logic required to operate imperix controllers (documented in PN116), and
the “ix axis interface” module which provides easy-to-use AXI4-Stream interfaces to exchange data with the user logic.

The provided AXI4-Stream interface module connects to the data interfaces and timing signals of the imperix firmware IP, interprets these signals, and converts them into much more user-friendly AXI4-Stream (AXIS) interfaces (ADC, CPU2FPGA and FPGA2CPU). The AXI4-Stream protocol is a widely used standard to interconnect components that exchange data. This means that the provided template can directly be connected to a wide range of Xilinx-provided IPs or to user-made algorithms developed using High-Level Synthesis (HLS) design tools such as Vitis HLS (C++) or Model Composer (Simulink).

This AXI4-Stream interface module is written in VHDL. It is provided as a starting point and will meet the needs of most applications. However, if required, it can be edited by the user to add extra inputs/outputs, rename them, change their data sizes, etc.

Creating the FPGA control starter template

This tutorial requires Xilinx Vivado, which is available at no cost.
The Vivado Design Suite installation guide explains how to download and install it.

Downloading the required sources

All the required sources are packed into the FPGA_Sandbox_template archive which can be downloaded from the Download and update imperix IP page.

Since version 3.9, the template file structure is the following:

constraints
- sandbox_pins_*.xdc: assignment of the top-level ports to physical package pins
hdl
- AXIS_interface.vhd: AXI4-Stream interface module, presented in a later section
- user_cb_pwm.vhd: simple carrier-based modulator, described in TN141
ix_repo: Vivado IP Catalog repository. Contains the imperix firmware IP and its interfaces.
scripts:
- create_project.bat: launches Vivado and calls create_project.tcl
- create_project.tcl: contains the TCL commands that create and configure the sandbox Vivado project
vivado: contains the Vivado projects generated by the create_project script

Starting a new imperix sandbox project

Download FPGA_Sandbox_template_*.zip
Unzip it and save the content somewhere on the PC
Rename the folder to something more explicit

Open scripts/create_project.bat using a text editor
Set the vivado_path variable to match the Vivado version installed on the PC

Double click on scripts/create_project.bat
Windows Defender SmartScreen may display a warning pop-up. Simply click More info then Run anyway.
Enter a project name and press Enter

The Vivado sandbox project will be created and configured. Its block design is shown below.

imperix sandbox template Vivado block design

If, for some reason, the script is not working properly then the Vivado project can be created manually by following the procedure below:

Open Vivado.
Click Create Project.
Choose a name and a location.
Select project type RTL Project and check the box Do not specify sources at this time.
Select the part named xc7z030fbg676-3.
Hit Finish. The project should open.
Go to the IP Catalog, right-click on Vivado Repository, hit Add repository…
Select /ix_repo/
The IMPERIX_FW IP, clock_gen, and user_regs interfaces should be found. Press OK.
Click on Create block design, name it “top” and click OK.
Open the freshly created block design, do a right-click in a blank area of the design, select Add IP… and search for “IMPERIX_FW” and hit ENTER.
Keep the [Ctrl] key pressed and select the IP pins flt, gpi, private_in, DDR, FIXED_IO, BBOS, USR, gpo, pwm and private_out. Hit [Ctrl+T] to create top-level ports.
By default, Vivado adds “_0” after each port name. Remove the “_0” from every port. To do so, click on each port and change the name property in the Block Pin Properties block (appearing on the left of the diagram by default). For instance, change “flt_0[15:0]” to “flt[15:0]”.
The user_fw_id input may be used to identify the firmware version. We recommend instantiating a Constant IP (Right-click, Add IP…, search for Constant) to give an identification number to the design. To change the constant width, double-click on the Constant block and set the Const Width to 16.
Go to the Sources tab, right-click on the block design file (top.bd) and select Create HDL Wrapper…
In the dialog box choose Let Vivado manage wrapper and auto-update and hit OK.
Right-click on the Design Sources folder
Choose Add Sources…
Check Add or create constraints
Click on Add Files
Select /sandbox_pins.xdc
Uncheck Copy constraints files into project
Hit Finish

From this point the project is synthesizable.

Adding the AXI4-Stream interface

Adding the AXI4-Stream interface is optional. It is useful if the FPGA control design uses AXI4-Stream interfaces.

Right-click on Design Sources
Choose Add Sources….
Check Add or create design sources.
Press Add Files. Go to your repository and select /AXIS_interface.vhd.
We recommend unchecking “Copy sources into project” and working directly from the files in the folder /hdl/ so the sources can be shared across multiple projects. Press Finish and wait for the update to finish.
Right-click somewhere in the block design, choose Add module… and select the ix_axis_interface module. Alternatively, the file listed in the Design Sources can be dragged and dropped onto the block diagram.
Connect the pins as follows (to get a clear layout, change “Default View” to “No Loops” in the top bar of the Diagram block, then right-click somewhere in the block design and press Regenerate Layout)

imperix sandbox template Vivado block design

Using the SBIO_BUS

The SBIO_BUS (SandBox IO bus) is a 16-bit memory-mapped bus allowing the CPU to address up to 1024 registers in the FPGA.

Memory-mapped SBIO_BUS signals

SBIO modules

Two SBIO modules are provided in the template:

the AXI4-Stream interface (AXIS_interface.vhd) which is described in the next section,
the SBIO registers (sbio_registers.vhd) shown below, which provides 16-bit registers that can easily be connected to user logic.

The user can read from and write to this bus using the SBI and SBO blocks as shown below. During execution, SBIs are read before each CPU task execution and SBOs are written at the end of each CPU task execution.

SBIO interconnect

The SBIO interconnect increases the number of SBIO_BUS interfaces, allowing multiple SBIO modules to be connected as illustrated below.

sbio_interconnect

The address mapping of the SBIO interconnect is shown below. It divides the SBIO addressable range into 4 smaller areas.

sbio_interconnect memory mapping

As an example, to write to SBO_reg_03 of an sbio_registers block connected to S2_SBIO_BUS, the user has to use an SBO block addressing register number 512+3=515.
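
The address arithmetic of the example above can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and the area size of 256 registers per port is an assumption deduced from the 1024-register range being divided into 4 areas and from the offset of 512 quoted for S2_SBIO_BUS.

```cpp
#include <cstdint>

// Assumption: the interconnect splits the 1024-register range into
// 4 equal areas of 256 registers, one per Sx_SBIO_BUS port.
constexpr uint16_t kRegistersPerPort = 256;

// Hypothetical helper: returns the register number to enter in the SBO/SBI
// block on the CPU side.
//   port      : interconnect port index (0 for S0_SBIO_BUS ... 3 for S3_SBIO_BUS)
//   local_reg : register index inside the module connected to that port
constexpr uint16_t sbio_global_address(uint16_t port, uint16_t local_reg) {
    return static_cast<uint16_t>(port * kRegistersPerPort + local_reg);
}
```

With these assumptions, `sbio_global_address(2, 3)` evaluates to 515, which matches the SBO_reg_03 example on S2_SBIO_BUS.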

How the AXI4-Stream interface operates

This section focuses on the AXI4-Stream interface module (ix_axis_interface). For further information on the imperix firmware IP (IMPERIX_FW), please refer to the imperix firmware IP product guide.

If required, AXIS_interface.vhd can easily be edited to improve the readability, by renaming the interfaces and removing the unused ones for instance. In this example, the interface is kept as it is.

Retrieving analog measurement with the M_AXIS_ADC interfaces

The Master AXI4-Stream interfaces M_AXIS_ADC_00 to M_AXIS_ADC_15 correspond to the 16 analog inputs of the imperix device.

They return the raw 16-bit signed integer result from the ADC each time conversion results are available. Consequently, users should manually perform the data-type conversion and apply the correct gains in their FPGA projects to transform the acquired value into its physical unit. To learn how to compute this gain, please refer to the last section of the ADC page.

The ADC sampling frequency can only be configured using the CONFIG block. ADC acquisition cannot be triggered from within the FPGA.

Exchanging data using M_AXIS_CPU2FPGA and S_AXIS_FPGA2CPU

The Master AXI4-Stream interfaces M_AXIS_CPU2FPGA and the Slave AXI4-Stream interfaces S_AXIS_FPGA2CPU serve to exchange 32-bit data between the CPU code and the FPGA.

To read/write values on the FPGA2CPU/CPU2FPGA ports, the user can download the Simulink model from the step-by-step hello world section below and re-use the following blocks:

write a float value from the CPU to the FPGA

The provided template uses the following mapping between the 16-bit SBI/SBO registers and the 32-bit AXI4-Stream interfaces:

To write a single-precision floating-point value on the AXI4-Stream interface M_AXIS_CPU2FPGA_00, the user has to:

use a MATLAB Function block to transform a single value into two uint16 values (see code below)
and then use the SBO block to send these two uint16 values to SBO_reg_00 and SBO_reg_01 (CPU2FPGA_00).

function [y1,y2] = single2sbo(u)
  temp = typecast(single(u),'uint16');
  y1 = temp(1);
  y2 = temp(2);

To read a result from the FPGA to the CPU (still in single-precision floating-point format) using S_AXIS_FPGA2CPU_01, the user has to:

use the SBI block to retrieve the two uint16 values from SBI_reg_02 and SBI_reg_03 (FPGA2CPU_01)
and then use a MATLAB Function block to transform these two uint16 values into a single value (see code below).

function y = sbi2single(u1,u2)
  y = single(0); % fix simulink bug: force compiled size of output
  y = typecast([uint16(u1) uint16(u2)], 'single');
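
For readers more familiar with C++ than with MATLAB, the two conversions above can be sketched as follows. This is not part of the imperix template: the type and function names are hypothetical, and the sketch assumes that the low 16-bit word goes to the even-numbered register, which is what typecast produces on a little-endian processor. The index helper assumes that 32-bit channel n maps to the 16-bit registers 2n and 2n+1, as suggested by the examples in the text (CPU2FPGA_00 uses registers 00 and 01, FPGA2CPU_01 uses registers 02 and 03).

```cpp
#include <cstdint>
#include <cstring>

// Two 16-bit words carrying one 32-bit value (hypothetical helper type).
struct Words {
    uint16_t low;   // assumed to go to the even register (e.g. SBO_reg_00)
    uint16_t high;  // assumed to go to the odd register  (e.g. SBO_reg_01)
};

// Reinterpret the bit pattern of a float as two 16-bit words.
inline Words single_to_words(float value) {
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // bit-exact copy, no numeric conversion
    return Words{static_cast<uint16_t>(bits & 0xFFFFu),
                 static_cast<uint16_t>(bits >> 16)};
}

// Rebuild a float from two 16-bit words.
inline float words_to_single(Words w) {
    const uint32_t bits = (static_cast<uint32_t>(w.high) << 16) | w.low;
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Assumption: channel n of CPU2FPGA/FPGA2CPU uses registers 2n and 2n+1.
constexpr unsigned first_register_of_channel(unsigned channel) {
    return 2u * channel;
}
```

For instance, 0.5f has the bit pattern 0x3F000000, so it is split into a low word of 0x0000 and a high word of 0x3F00, and the round trip returns exactly 0.5f.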

Getting the sample time Ts

The M_AXIS_Ts interface provides the sample period in nanoseconds in a 32-bit unsigned integer format. This signal may be used, for instance, by the integrators of PI controllers. This value is measured by counting the time elapsed between two consecutive adc_done_pulse events.

Using reset signals

The AXI4-Stream module also provides two reset signals:

nReset_sync: this reset signal is activated each time the user code is loaded through Cockpit. It can be used as a standard reset signal.
nReset_ctrl: this reset is triggered from the CPU through SBO_reg_63 using a core state block. Its intended use is, for instance, to reset the PI controller integrator when the converter is not operating (when the PWM outputs are disabled).

Both signals are active-low and activated for 4 periods of clk_250_mhz.

Step-by-step “hello world” example

This simple “hello world” example serves to showcase the complete FPGA development workflow in Vivado.

As illustrated in the image below, it does the following:

From the CPU, a gain value (single-precision) is transferred to the FPGA using the CPU2FPGA_00 interface (SBO_00 and SBO_01).
In the FPGA, the data coming from the ADC_00 interface is converted into a single-precision floating-point value.
The ADC value is then multiplied by the gain.
Finally, the multiplication result is sent back to the CPU through FPGA2CPU_00.
The raw value of the ADC is also sent to the CPU using FPGA2CPU_01.
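
As a point of comparison for the experimental results, the arithmetic performed by this data path can be written as a small software reference model. This is only a sketch with hypothetical names. It ignores the AXI4-Stream handshaking and the pipeline latency of the Xilinx IPs and only reproduces the numerical operations (int16 to single conversion, then multiplication by the gain).

```cpp
#include <cstdint>

// The two values that the FPGA sends back to the CPU in the example
// (hypothetical helper type).
struct HelloWorldOutputs {
    float adc_as_single;  // raw ADC value after the fixed-to-float conversion
    float result;         // raw ADC value multiplied by the gain
};

// Software model of the FPGA data path.
inline HelloWorldOutputs hello_world_model(int16_t adc_raw, float gain) {
    // Fixed-to-float conversion: every int16 value is exactly representable
    // as a single-precision float, so no rounding occurs here.
    const float adc_as_single = static_cast<float>(adc_raw);
    // Single-precision multiplication.
    return HelloWorldOutputs{adc_as_single, adc_as_single * gain};
}
```

With a gain of 0.5 and a raw ADC value of -1000, the model returns -1000.0 and -500.0.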

The FPGA logic will be implemented using exclusively readily available Xilinx Vivado IP, namely:

Floating-point IP configured for Fixed-to-float operation to convert an int16 to a single-precision floating-point value.
Floating-point IP configured for the Multiply operation to multiply two single-precision floating-point values.
AXI4-Stream Broadcast IP to duplicate the output of the int16 to single block.

Other useful Xilinx Vivado IPs are listed in the AXI4-Stream IPs from Xilinx page. Users can also implement their own IP blocks, either directly in VHDL or Verilog, or with high-level design tools such as Vitis HLS (C++) or Model Composer (Simulink).

CPU-side implementation

The CPU-side code provided below has been implemented using Simulink and the imperix ACG SDK. To make these variables available at run-time, the gain is set using a tunable parameter and the adc_raw and result values are read using probes. These 3 values are encoded as single (32-bit single-precision floating-point). The MATLAB Function blocks single2sbo and sbi2single make it easy to map single values to SBO and SBI blocks.

Click to download PN159_Getting_Started_With_FPGA.slx

FPGA-side implementation

As mentioned earlier, only standard Xilinx Vivado IP blocks are used to implement the algorithm in this example. The PN159_vivado_design.pdf file below shows the full Vivado FPGA design. Here are the step-by-step instructions to reproduce it.

Click to download PN159_vivado_design.pdf

Add a block to convert the int16 data of ADC_00 to a single-precision floating-point
1. Right-click somewhere in the block design and choose Add IP…
2. Search for the Floating-point IP, drag-and-drop it on the diagram.
3. Rename it as int16_to_single.
4. Double-click on the int16_to_single block. In the pop-up window, select the Fixed-to-float operation, change Auto to Manual and set the precision type to Custom. Then, set the integer width to 16, and select Single as precision for the result.

Add the multiplier block
1. Add another Floating-point IP and rename it as single_multiplier.
2. Double-click on the block.
3. Select the Multiply operation and Single input.

Broadcast one stream to two streams
1. Right-click somewhere in the block design and choose Add IP…
2. Search for the AXI4-Stream Broadcaster, drag it and drop it on the diagram.
3. Keep all options to auto.

Connect all the blocks as follows:
- M_AXIS_ADC_00 to S_AXIS_A of int16_to_single
- M_AXIS_RESULT of int16_to_single to S_AXIS of axis_broadcaster_0
- M00_AXIS of axis_broadcaster_0 to S_AXIS_FPGA2CPU_00
- M01_AXIS of axis_broadcaster_0 to S_AXIS_B of single_multiplier
- M_AXIS_CPU2FPGA_00 to S_AXIS_A of single_multiplier
- M_AXIS_RESULT of single_multiplier to S_AXIS_FPGA2CPU_01
- clk_250_mhz to all aclk inputs
- nReset_sync to aresetn of axis_broadcaster_0

Finally, the design can be synthesized and the bitstream generated. Click Generate bitstream. It will launch the synthesis, implementation and bitstream generation.
Always make sure that your design meets the timing requirements!
This information is available from the Project Summary
To learn more please visit the Xilinx documentation on Timing Closure.

Opening the Vivado Project Summary

If the timing requirements are met, click on File → Export → Export Bitstream File… to save the bitstream somewhere on the computer.

Loading the bitstream into the imperix controller

Using imperix Cockpit, the bitstream is loaded into the imperix controller device from the target configuration window.

Loading an FPGA bitstream in the controller

Experimental validation

Finally, the CPU code is generated from the Simulink model and loaded into the device, as explained in the programming and operating imperix controllers page.

To test the design, a sinusoidal signal is fed to the analog input of the B-Box. Then, using Cockpit’s scope, the result = 0.5*adc_raw is observed:

Going further

Testing M_AXIS_Ts

This section goes a step further and uses the M_AXIS_Ts interface to show how a uint32 value can be transferred from the FPGA to the CPU.

The modification of the FPGA bitstream is quite simple: simply connect M_AXIS_Ts to S_AXIS_FPGA2CPU_02 as shown in orange on the image below. This allows reading the sample time Ts value from the CPU.

On the CPU, the Ts value is read from S_AXIS_FPGA2CPU_02 (SBI_04 and SBI_05). Because the value is a 32-bit unsigned integer, the transformation is different from before, as shown below.

Option 1

Option 2

function y = sbi2uint32(u1,u2)
  y = uint32(0); % fix compiled size of output
  y = typecast([uint16(u1) uint16(u2)], 'uint32');

Finally, using Cockpit, the result can be observed. If the control task frequency (CLOCK_0) is set to 50 kHz, then M_AXIS_Ts will return 20’000 ns.

If CLOCK_0 is kept at 50 kHz and the oversampling is activated with an oversampling ratio of 20, then M_AXIS_Ts will return 1000 ns, which corresponds to the actual sampling period.
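
The two readings quoted above follow directly from the sampling frequency, as the sketch below illustrates. The function name is hypothetical. It assumes that the effective sampling frequency is the control task frequency multiplied by the oversampling ratio, and it uses integer division, so the value measured in the FPGA may differ by a few nanoseconds when the period is not an integer number of nanoseconds.

```cpp
#include <cstdint>

// Expected value of M_AXIS_Ts in nanoseconds (hypothetical helper).
//   control_freq_hz    : control task frequency (CLOCK_0)
//   oversampling_ratio : 1 when oversampling is disabled
inline uint32_t expected_ts_ns(uint32_t control_freq_hz, uint32_t oversampling_ratio) {
    // Assumption: the ADC samples at control_freq_hz * oversampling_ratio.
    const uint64_t sampling_freq_hz =
        static_cast<uint64_t>(control_freq_hz) * oversampling_ratio;
    return static_cast<uint32_t>(1000000000ULL / sampling_freq_hz);
}
```

`expected_ts_ns(50000, 1)` gives 20000 and `expected_ts_ns(50000, 20)` gives 1000, in line with the two cases described above.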

Using the USR pins

The imperix controllers feature 36 user-configurable 3.3V I/Os (the USR pins) that are directly accessible from the FPGA. Their physical locations are available in the B-Board PRO datasheet and B-Box RCP datasheet.

By default, the USR pins are connected to the imperix IP. Currently, they are only used to communicate with the motor interface. If the motor interface is not used, the USR port can safely be deleted.

These pins are then available for other uses. The screenshot shows an example where the USR pins 0, 1, and 2 are used.

The constraints\sandbox_pins.xdc file must be edited accordingly. As shown below, we recommend commenting out (#) the unused pins to avoid generating unnecessary warnings in Vivado. For more information on constraints in Xilinx FPGAs, please refer to the Using Constraints in Vivado Design Suite user guide.

When top level ports are modified, the block design wrapper must be updated accordingly to reflect the changes. To make sure top_wrapper.vhd is up-to-date, the recommended procedure is to remove the current one and generate a new one:
– Right click on top_wrapper -> Remove File from Project…
– Check “Also delete the project file from disk” -> OK
– Right click on top -> Create HDL Wrapper

The FPGA-based SPI communication IP for ADC page shows an example where USR pins are used to drive an external ADC using the SPI protocol.

Additional tutorials

The page custom FPGA PWM modulator explains how to drive PWM outputs or to use the CLOCK interfaces through a simple example. The PWM modulator sources are provided as VHDL, as a Xilinx System Generator model, and as a MATLAB HDL Coder model.

The page high-level synthesis for FPGA developments shows how to integrate HLS-generated IPs in an FPGA control implementation using a PI-based current control of a buck power converter as an example.


FPGA-based control of a grid-tied inverter

Shu Wang — Wed, 02 Jun 2021 11:40:35 +0000

This note presents an FPGA control implementation of a grid-tied current-controlled inverter. It combines several control modules presented in different Technical Notes to form a complete converter control, executed entirely in the FPGA of a B-Box RCP controller.

Thanks to the FPGA programmability of the B-Box controller, complex control algorithms can be effectively executed at high rates and with minimal latency. In particular, this example shows that a grid-oriented current control algorithm can be executed as fast as 650 kHz, whereas the equivalent CPU-based execution is “limited” to 210 kHz (which is already an industry-leading figure amongst prototyping controllers).

Besides, the fast switching frequency used in this example takes full advantage of the imperix SiC phase leg module.

To find all FPGA-related notes, you can visit the FPGA development homepage.

Grid-tied inverter control

The controlled system is a standard current-controlled voltage-source inverter, connected to a 3-phase grid. This converter is built using imperix power modules in the experimental validation section.

Grid-tied voltage-source inverter

The control algorithm is entirely executed in the controller FPGA and implements a three-phase PLL for grid synchronization coupled to a standard dq current control in the grid-oriented reference frame. Based on user-defined current references, the controller computes the voltages that the inverter should produce in order to match the required current. These voltages are then modulated in PWM signals and fed to the gates of the inverter.

The overall inverter control algorithm is shown below, and each of the elementary control blocks is further described in the following sections.

Block diagram of the implemented FPGA-based control algorithm (simplified view)

Overview of the FPGA-based inverter control task

The Vivado block design used to generate the FPGA bitstream is shown below. The Creation of the Vivado block design section describes in more detail how to reproduce it. All the sources can be downloaded by clicking on the button below.

Download TN147_block_design.pdf

The design uses the following IPs

ADC conversion module (Vitis HLS)
Ts conversion module (Vitis HLS)
Grid synchronization module (Vitis HLS or Model Composer, documented in TN143)
Dq current controller (Vitis HLS or Model Composer, documented in TN144)
Duty cycle computation module (Vitis HLS)
Carrier-based PWM module (System Generator or HDL Coder, documented in TN141)

Download TN147_FPGA_Grid_Tied_Inverter.zip

These IPs have been implemented using High-Level Synthesis tools such as Vitis HLS (free of cost, C++) and Model Composer (~500$, requires MATLAB Simulink). These tools offer a simple yet powerful way of developing control algorithms in FPGA.

Performance analysis of the control task

Without any special optimization, the latency of the presented inverter control algorithm is roughly 1.5µs (see details below), which means that it can run above 650 kHz. Comparatively, the similar CPU-based algorithm presented in TN106 can run at up to 210 kHz.

Latency and control delay

The total latency can be estimated by adding up the latencies of the individual modules:

Module	Latency Vitis HLS	Latency Model Composer
ADC conversion	24 cycles	24 cycles
Grid synchronization	145 cycles	146 cycles
DQ current control	72 cycles	56 cycles
Duty cycles computation	138 cycles	137 cycles
Total latency	379 cycles (1.52µs)	363 cycles (1.45µs)

That estimated latency is comparable to the measured latency of 385 cycles (1.54µs) with the Vitis HLS implementation. The Model Composer approach achieves a slightly lower latency thanks to automatic optimization but at the expense of slightly higher resource utilization.
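
The figures in the table can be cross-checked with a few lines of code. The sketch below uses hypothetical names. It assumes that one cycle lasts 4 ns (the period of clk_250_mhz) and that the modules execute strictly one after the other, so that the maximum execution rate is simply the inverse of the total latency.

```cpp
// Period of the FPGA clock (clk_250_mhz), in nanoseconds.
constexpr double kClockPeriodNs = 4.0;

// Sum of the per-module latencies, in clock cycles.
inline unsigned total_cycles(const unsigned* module_cycles, unsigned count) {
    unsigned total = 0;
    for (unsigned i = 0; i < count; ++i) {
        total += module_cycles[i];
    }
    return total;
}

// Latency in microseconds for a given number of clock cycles.
constexpr double cycles_to_us(unsigned cycles) {
    return cycles * kClockPeriodNs / 1000.0;
}

// Maximum execution rate in kHz, assuming the task is not pipelined
// with itself.
constexpr double max_rate_khz(unsigned cycles) {
    return 1000.0 / cycles_to_us(cycles);
}
```

With the Vitis HLS latencies (24, 145, 72 and 138 cycles), the total is 379 cycles, i.e. about 1.52 µs, which corresponds to a maximum rate slightly below 660 kHz.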

Considering a conversion time of the ADC chip of 2µs, the total control delay is 3.54µs, which is larger than one sampling period (chosen 2.5µs – 400kHz). Therefore, the execution of the control task is pipelined with the ADC acquisition, as shown in the figure below. The control delay to consider when tuning the current controller is therefore $T_{d,ctrl}=2T_{s}$.

Further details on control delay identification can be found in PN142.

Controller tuning

The gains of the PI controllers are tuned using the Magnitude Optimum, as introduced in Vector current control. The total delay of the control loop is identified as:

Control delay: $T_{d,ctrl}=2T_{s}=5\,\text{µs}$
Modulator delay (double-rate update): $T_{d,\text{PWM}}=T_{sw}/4=T_{s}/2=1.25\,\text{µs}$
Sensing delay (16 kHz filter): $T_{d,sens}\approx 10\,\text{µs}$
Total loop delay: $T_{d,tot}=T_{d,ctrl}+T_{d,\text{PWM}}+T_{d,sens}\approx 16.25\,\text{µs}$
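
The delay budget above can be reproduced numerically as follows. The names are hypothetical, and the sensing delay is modeled here as the time constant of a first-order low-pass filter, $1/(2\pi f_c)$, which is an assumption that happens to give about 10 µs for a 16 kHz cut-off frequency. The note itself only states the rounded value.

```cpp
// Individual contributions to the loop delay, in microseconds
// (hypothetical helper type).
struct LoopDelays {
    double control_us;
    double pwm_us;
    double sensing_us;
    double total_us;
};

inline LoopDelays loop_delays(double sampling_freq_hz,
                              double switching_freq_hz,
                              double filter_cutoff_hz) {
    const double kPi = 3.14159265358979323846;
    const double ts_us = 1e6 / sampling_freq_hz;    // sampling period
    const double tsw_us = 1e6 / switching_freq_hz;  // switching period

    LoopDelays d;
    d.control_us = 2.0 * ts_us;  // pipelined acquisition and computation
    d.pwm_us = tsw_us / 4.0;     // double-rate update modulator
    // Assumption: first-order low-pass filter, delay taken as its time constant.
    d.sensing_us = 1e6 / (2.0 * kPi * filter_cutoff_hz);
    d.total_us = d.control_us + d.pwm_us + d.sensing_us;
    return d;
}
```

Calling `loop_delays(400e3, 200e3, 16e3)` yields 5 µs, 1.25 µs and roughly 9.95 µs, i.e. a total of about 16.2 µs, consistent with the 16.25 µs obtained when the sensing delay is rounded to 10 µs.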

Resource utilization

The resource utilization is estimated by Vivado after the implementation and is shown below, for the Vitis HLS approach.

Resource	Utilization (inverter control)	Utilization (imperix firmware 3.6)	Total utilization
LUT	18’733	24’220	42’953 (54.65%)
LUTRAM	115	798	913 (3.43%)
FF	30’713	50’732	81’445 (51.81%)
BRAM	2	37	39 (14.72%)
DSP	104	0	104 (26%)

FPGA resource utilization of the inverter control algorithm and the imperix firmware

Resource utilization and latency may vary depending on the software versions, the selected synthesis/placement optimizations, etc.

Experimental validation

Testbench description

A physical testbench is built to validate the developed control strategy experimentally, using the following equipment:

Controller: B-Box RCP with ACG SDK for Simulink
Inverter: 3x PEB8024 phase leg module with fast-switching SiC MOSFETs
Grid inductors: from passive filters box

The main operating parameters are summarized in the tables below.

Parameter	Value
DC bus voltage	750 V
Grid voltage	380 V
Grid inductor (from filter box)	2.36 mH

Parameter	Value
Sampling frequency	400 kHz
Control frequency (FPGA)	400 kHz
Switching frequency	200 kHz

Due to the high switching frequency of this application (200 kHz), the PWM modulators are configured with a small deadtime (200 ns). Therefore, only PEB8024 power modules, which support smaller deadtimes, can be used for this testbench. Using other types of modules may result in irreversible damage to the semiconductors.

Experimental results

Various reference steps are performed on the d- and q-axis current references to validate the reference tracking and perturbation rejection abilities of the developed algorithm.

The graph below shows the measured grid currents when the d-axis current reference takes the values 5, 12, and 8 A.

Measured grid currents when changing the d-axis current reference

When projected into the dq rotating reference frame, the measured grid currents give the following results. It can be seen that the current reference is successfully and rapidly tracked by the d-axis controllers and that the perturbation is well rejected on the q axis.

Measured grid currents in the dq reference frame when changing the d-axis current reference

If the same reference steps are performed on the q-axis, the same observations can be made.

Measured grid currents in the dq reference frame when changing the q-axis current reference

Creation of the Vivado block design

This section provides a step-by-step explanation of how to re-create the Vivado project to generate the FPGA bitstream of the Grid-tied inverter control.

Information on how to create an FPGA control template is available on the Getting started with FPGA control impl.

To find all FPGA-related notes, you can visit the FPGA development homepage.

Raw ADC data conversion

The ADC conversion results are available from the AXI4-Stream interfaces M_AXIS_ADC of the “ix axis interface” module. They return the raw 16-bit signed integer result from the ADC chips.

The ADC conversion IP shown below serves to convert that raw data into the actual measured quantities in the float datatype. Knowing the sensitivity $S$ of the sensor and the B-Box RCP front-end gain $G$, the formula below can be used:

$\alpha [\text{bit/A}] = S\cdot G\cdot 32768/10\,$

A numerical example of gain computation is available in the last section of the B-Box analog frontend configuration page.
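
As an illustration of the formula above, the sketch below computes $\alpha$ and converts a raw sample into amperes. All names and numerical values are hypothetical (a sensor sensitivity of 50 mV/A and a front-end gain of 4 are chosen arbitrarily). It also assumes that the gain fed to the ADC conversion IP is the reciprocal of $\alpha$, since that IP multiplies the raw value by its gain input.

```cpp
#include <cstdint>

// alpha in bit per physical unit: S * G * 32768 / 10.
//   sensitivity   : sensor sensitivity S, in V per physical unit (e.g. V/A)
//   frontend_gain : analog front-end gain G
constexpr double counts_per_unit(double sensitivity, double frontend_gain) {
    return sensitivity * frontend_gain * 32768.0 / 10.0;
}

// Assumption: the gain applied in the FPGA is 1/alpha, in unit per bit.
constexpr double adc_gain(double sensitivity, double frontend_gain) {
    return 1.0 / counts_per_unit(sensitivity, frontend_gain);
}

// Same arithmetic as the ADC conversion IP: raw * gain - offset.
inline float to_physical(int16_t raw, float gain, float offset) {
    return static_cast<float>(raw) * gain - offset;
}
```

With S = 0.05 V/A and G = 4, $\alpha$ is 655.36 bit/A, so a raw value of 6554 corresponds to approximately 10 A.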

#include "ADC_conversion.h"

#include <hls_stream.h>
#include <stdint.h>

void adc_conversion(
  hls::stream<int16_t>& adc_in,
  hls::stream<float>& adc_gain,
  hls::stream<float>& adc_offset,
  hls::stream<float>& adc_out)
{
// see https://docs.xilinx.com/r/en-US/ug1399-vitis-hls/pragma-HLS-interface
// "both" means registers are placed on TDATA, TVALID, and TREADY
#pragma HLS INTERFACE axis port=adc_in register_mode=both register
#pragma HLS INTERFACE axis port=adc_gain register_mode=both register
#pragma HLS INTERFACE axis port=adc_offset register_mode=both register
#pragma HLS INTERFACE axis port=adc_out register_mode=both register
// turns off block-level I/O protocols
#pragma HLS interface ap_ctrl_none port=return

  int16_t adc = adc_in.read();
  float gain = adc_gain.read();
  float offset = adc_offset.read();
  float result = (float)adc * gain - offset;
  adc_out.write(result);
}

This ADC conversion module is connected as shown in the two screenshots below. The AXI4-Stream Broadcaster IP (included with Vivado) serves to propagate an AXI4-Stream to multiple ports.

The “adc” hierarchy contains the conversion logic

Content of the “adc” hierarchy

Sample time (Ts) conversion

The M_AXIS_Ts port of the “ix axis interface” provides the sample period $T_s$ in nanoseconds in a 32-bit unsigned integer format. This signal is the time interval between two consecutive samples.

The Ts conversion module shown below converts this signal into a floating-point value in seconds. The result will be used by the PI controllers of the Grid synchronization module and the dq current control module.

#include "Ts_conversion.h"

#include <hls_stream.h>
#include <stdint.h>

void ts_conversion(
  hls::stream<uint32_t>& Ts_ns_in,
  hls::stream<float>& Ts_s_out)
{

// see https://docs.xilinx.com/r/en-US/ug1399-vitis-hls/pragma-HLS-interface
// "both" means registers are placed on TDATA, TVALID, and TREADY
#pragma HLS INTERFACE axis port=Ts_ns_in register_mode=both register
#pragma HLS INTERFACE axis port=Ts_s_out register_mode=both register
// turns off block-level I/O protocols
#pragma HLS interface ap_ctrl_none port=return

  uint32_t Ts_ns = Ts_ns_in.read();
  // convert from nanoseconds to seconds
  float Ts_s  = (float)Ts_ns * 0.000000001f; 
  Ts_s_out.write(Ts_s);
}

Grid synchronization

The grid synchronization is done using a dq-type PLL, which transforms both the grid voltages and currents into dq components that are then used in the dq current controller.

The grid synchronization IP shown below is documented in the FPGA impl. of a PLL for grid sync page.

In the Vivado project, the IP is connected as follows. An AXI4-Stream Broadcaster is used because the sample time (Ts) signal is also connected to the dq current controller as shown in the next section.

DQ current control

The implementation of the dq current controller is detailed in FPGA-based PI for dq current control. It consists of two identical PI controllers with a decoupling network for independent control of the d- and q-components of the grid current.

The kiTs_dq port takes as input the product of Ts and Ki (Ki is a parameter set from the CPU, using the port CPU2FPGA_10). The multiplication is performed using a Floating-Point IP as shown below.

The dq current controller is connected as shown below. The inputs Kp, Id_ref, and Iq_ref are connected to CPU2FPGA ports 11, 12, and 13.

Duty cycles computation

This block converts the voltage references computed by the dq current controller to abc quantities $E_{g,abc}$, and computes the corresponding duty cycles (in the range [0,1]) according to

$$ d_{abc} = \left(\frac{E_{g,abc}}{V_{dc}}+0.5\right)\cdot T_\text{clk},$$

where $T_\text{clk}$ is the period of the clock used in the PWM modulator, expressed in ticks (1 tick = 4 ns). The duty cycles are converted into uint16 numbers with a unit of ticks to be compatible with the PWM modulator.
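
The duty cycle formula can be checked on a desktop with the plain C++ sketch below, which does not use any HLS library. The function names are hypothetical, and the conversion to ticks truncates the result, which is a choice made for this sketch only. The example values assume a 750 V DC bus and a PWM clock period of 1250 ticks, which is what a 200 kHz carrier would give with 4 ns ticks.

```cpp
#include <cstdint>

// Clamp x to the range [lo, hi].
inline float saturate(float x, float lo, float hi) {
    return x < lo ? lo : (x > hi ? hi : x);
}

// Duty cycle of one phase, expressed in ticks of the PWM clock.
//   e_ref              : phase voltage reference E_g, in volts
//   v_dc               : DC bus voltage, in volts
//   clock_period_ticks : period of the PWM clock, in ticks
inline uint16_t duty_cycle_ticks(float e_ref, float v_dc, uint16_t clock_period_ticks) {
    // Normalized duty cycle, limited to [0, 1].
    const float duty = saturate(e_ref / v_dc + 0.5f, 0.0f, 1.0f);
    // Assumption: the fractional part is truncated when converting to ticks.
    return static_cast<uint16_t>(duty * clock_period_ticks);
}
```

A zero voltage reference gives half of the period (625 ticks out of 1250), while references beyond ±V_dc/2 saturate at 0 or at the full period.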

It is connected as shown below. in_CLOCK_period is connected to CLOCK_1_period of the IMPERIX_FW IP.

#include "duty_cycles.h"

#include <hls_stream.h>
#include <stdint.h>

#include "ap_fixed.h"
#include "hls_math.h"

void dq02abc(float d, float q, float zero, float wt, float& A, float& B, float& C)
{
#pragma HLS inline

  const ap_fixed<16,2> sqrt3_2 = 0.86602540378444;// sqrt(3)/2

  ap_fixed<32,16> d_fix    = (ap_fixed<32,16>)d;
  ap_fixed<32,16> q_fix    = (ap_fixed<32,16>)q;
  ap_fixed<32,16> zero_fix = (ap_fixed<32,16>)zero;
  ap_fixed<16,4>  wt_fix   = (ap_fixed<16,4>)wt;

  ap_fixed<16, 2> cos_wt = hls::cos(wt_fix);
  ap_fixed<16, 2> sin_wt = hls::sin(wt_fix);

  ap_fixed<32,16> alpha_fix = d_fix*cos_wt - q_fix*sin_wt;
  ap_fixed<32,16> beta_fix  = d_fix*sin_wt + q_fix*cos_wt;

  ap_fixed<32,16> A_fix = alpha_fix + zero_fix;
  ap_fixed<32,16> B_fix = zero_fix - alpha_fix/2 + sqrt3_2*beta_fix;
  ap_fixed<32,16> C_fix = zero_fix - alpha_fix/2 - sqrt3_2*beta_fix;

  A = (float)A_fix;
  B = (float)B_fix;
  C = (float)C_fix;
}

float sat(float input, float max_sat, float min_sat)
{
#pragma HLS inline

  if(input > max_sat) {
    return max_sat;
  } else if(input < min_sat) {
    return min_sat;
  } else {
    return input;
  }
}

void vitis_duty_cycles(hls::stream<float>& in_Udc,
  hls::stream<float>& in_Ed_ref,
  hls::stream<float>& in_Eq_ref,
  hls::stream<float>& in_E0_ref,
  hls::stream<float>& in_theta,
  uint16_t in_CLOCK_period,
  uint16_t& out_dutycycle_A,
  uint16_t& out_dutycycle_B,
  uint16_t& out_dutycycle_C)
{
#pragma HLS INTERFACE axis port=in_Udc register_mode=both register
#pragma HLS INTERFACE axis port=in_Ed_ref register_mode=both register
#pragma HLS INTERFACE axis port=in_Eq_ref register_mode=both register
#pragma HLS INTERFACE axis port=in_E0_ref register_mode=both register
#pragma HLS INTERFACE axis port=in_theta register_mode=both register
#pragma HLS interface ap_ctrl_none port=return

  float Udc = in_Udc.read();
  float Ed_ref = in_Ed_ref.read();
  float Eq_ref = in_Eq_ref.read();
  float E0_ref = in_E0_ref.read();
  float theta = in_theta.read();

  float A,B,C;
  dq02abc(Ed_ref, Eq_ref, E0_ref, theta, A, B, C);

  float next_d_A = A/Udc + 0.5;
  float next_d_B = B/Udc + 0.5;
  float next_d_C = C/Udc + 0.5;

  float d_A = sat(next_d_A, 1.0, 0.0);
  float d_B = sat(next_d_B, 1.0, 0.0);
  float d_C = sat(next_d_C, 1.0, 0.0);

  ap_fixed<16,2> d_A_fix = (ap_fixed<16,2>)d_A;
  ap_fixed<16,2> d_B_fix = (ap_fixed<16,2>)d_B;
  ap_fixed<16,2> d_C_fix = (ap_fixed<16,2>)d_C;

  out_dutycycle_A = (uint16_t)(d_A_fix * in_CLOCK_period);
  out_dutycycle_B = (uint16_t)(d_B_fix * in_CLOCK_period);
  out_dutycycle_C = (uint16_t)(d_C_fix * in_CLOCK_period);
}
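When writing a C simulation testbench for the listing above, a plain floating-point reference can be handy to compare against. The following is a minimal sketch of such a reference, not part of the original sources: the names `Abc`, `dq0_to_abc_ref`, and `duty_ticks_ref` are hypothetical, and only the sign convention of the transform and the clamp to [0,1] are taken from the HLS code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical host-side reference (not from the original sources).
struct Abc { float a, b, c; };

// Same sign convention as dq02abc in the HLS listing, in plain float.
inline Abc dq0_to_abc_ref(float d, float q, float zero, float wt)
{
  const float sqrt3_2 = 0.8660254f;  // sqrt(3)/2
  const float alpha = d * std::cos(wt) - q * std::sin(wt);
  const float beta  = d * std::sin(wt) + q * std::cos(wt);
  return { alpha + zero,
           zero - alpha / 2.0f + sqrt3_2 * beta,
           zero - alpha / 2.0f - sqrt3_2 * beta };
}

// Duty cycle in ticks: clamp(e/udc + 0.5, 0, 1) * period, truncated.
inline uint16_t duty_ticks_ref(float e, float udc, uint16_t period_ticks)
{
  assert(udc > 0.0f);  // a zero DC bus voltage would divide by zero
  float d = e / udc + 0.5f;
  if (d > 1.0f) d = 1.0f;
  if (d < 0.0f) d = 0.0f;
  return static_cast<uint16_t>(d * static_cast<float>(period_ticks));
}
```

With CLOCK_1 at 200 kHz, the period is 1250 ticks of 4 ns, so a zero voltage reference maps to 625 ticks. Because the HLS code quantizes the duty cycle to `ap_fixed<16,2>` before the multiplication, its output may differ from this reference by a tick.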

PWM generation

Finally, the duty cycles are transformed into PWM signals in the PWM modulator block. The implementation details are presented in PWM modulator implementation in FPGA.

The PWM block uses the clock signal CLOCK_1 as a reference to generate the PWM triangular carrier signal. The switching frequency is therefore equal to the frequency of CLOCK_1, which can be configured using a CLK block.

The high-side switching signals are obtained by comparing the 3 duty cycles $d_{abc}$ with the triangular carrier. The generation of the low-side switching signals is done by the SB-PWM driver incorporated into the imperix IP and the dead time duration is specified in the CPU block Sandbox PWM configurator.

Configuration of the SB-PWM block in Simulink

Mapping between sb_pwm and pwm ports of the imperix IP in Vivado

Thanks to the SB-PWM driver, the safety mechanism of the B-Box RCP is also available for custom-made PWM modulators. In case an over-value is detected during operation, the PWM outputs are immediately blocked and the operation is safely stopped.

CPU-side implementation

The CPU model shown below is only used for the following tasks:

Configuration:

Configure the sampling frequency using the Configuration block (CLOCK_0 = 100 kHz, oversampling = 4)
Configure the modulator clocks using the CLK block (CLOCK_1 = 200 kHz)
Configure the ADC conversion parameters (gain and offset)

Parameter tuning:

Tune the controller gains (Kp and Ki)
Transfer the dq current references $I_{g,d}^*$ and $I_{g,q}^*$ to the FPGA

The mapping between SBI/SBO registers and CPU2FPGA/FPGA2CPU ports is explained on the Getting started page.

Debugging and monitoring

For debugging and monitoring purposes, internal signals of the FPGA inverter control model can be split using AXI4-Stream Broadcaster IPs and routed to FPGA2CPU ports of the imperix IP. This way, the signals can be accessed from the CPU, connected to probe variable blocks, and observed using Cockpit.

Additionally, ADC blocks can still be used to observe the analog input signals. For accurate readings, make sure the sensitivities of the ADC blocks match the gain parameter sent to the FPGA!

Please note that if the FPGA control is running faster than the CPU, then the CPU will only see a downsampled version of the observed signals.

Vivado block design with debug probes

Download TN147_block_design_with_debug_probes.pdf

Complete CPU model with debug probes and ADC blocks

To find all FPGA-related notes, you can visit FPGA development homepage.


High-Level Synthesis for FPGA developments

Benoît Steinmann — Tue, 15 Jun 2021 11:49:48 +0000

High-level synthesis (HLS) tools greatly facilitate the implementation of complex power electronics control algorithms in FPGA. Indeed, HLS tools allow the user to work at a higher level of abstraction. For instance, the user can use Xilinx Vitis HLS to develop FPGA modules using C/C++, or the Model Composer plug-in for Simulink to use graphical programming instead.

This page shows how IPs generated using high-level synthesis tools can be integrated into the FPGA of an imperix power controller. To this end, the example of a PI-based current controller for a buck converter is used to illustrate all the required steps.

Power converter control FPGA

To find all FPGA-related notes, you can visit FPGA development homepage.

Integrating HLS designs in the FPGA

Description of the design

The image below shows the example that will be implemented on this page. It is a PI-based current controller for a buck converter, based on the algorithm presented in the PI controller implementation for current control technical note. This example uses the following resources:

the FPGA control starter template from the getting started with FPGA guide
the PWM modulator IP from the FPGA PWM modulator example
the high-level synthesis PI-based current control IP from either
- the C++ implementation presented in the Xilinx Vitis HLS guide
- or the Simulink implementation presented in the Model Composer guide

The axis interface provides the inputs of the current control algorithm in the form of AXI4-Stream ports. The following ports are used:

CPU2FPGA_00 for the current reference Il_ref (32-bit single-precision)
CPU2FPGA_01 for the parameter Kp (32-bit single-precision)
CPU2FPGA_02 for the parameter Ki (32-bit single-precision)
ADC_00 for the measured current Il (16-bit signed integer)
ADC_01 for the measured output voltage of the converter Vout (16-bit signed integer)
ADC_02 for the measured input voltage of the converter Vint (16-bit signed integer)
Ts for the sampling period in nanoseconds (32-bit unsigned integer)

Aside from AXI4-Stream data, the current control IP also uses the ports:

CLOCK_period for the PWM period in ticks (16-bit unsigned)
nReset_ctrl to reset the PI when the controller is not in OPERATING state

Using these signals, the HLS IP computes a 16-bit unsigned duty_cycle_ticks that is forwarded to the PWM IP. And finally, the PWM IP uses the sb_pwm driver to output the PWM signals to optical fibers of the B-Box RCP controller. The PWM IP and the SB-PWM driver are further documented on the FPGA PWM modulator page.

The ADC values provided by the starter template are the raw result from the ADC chips. They are multiplied by a gain inside the HLS IP to obtain physical values. An example of gain computation is available on the ADC block help page.
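To make the data flow concrete, the sketch below shows how raw ADC codes and the Ts value in nanoseconds could be turned into physical quantities and fed to one PI step. It is only an illustration under assumptions: the functions `adc_to_physical` and `pi_step`, the `PiState` struct, and the particular PI form are hypothetical, and the actual algorithm is the one documented in the Vitis HLS and Model Composer guides.

```cpp
#include <cassert>
#include <cstdint>

// Raw ADC code -> physical value. The gain is assumed to lump together
// the ADC resolution and the sensor sensitivity.
inline float adc_to_physical(int16_t raw, float gain)
{
  return static_cast<float>(raw) * gain;
}

struct PiState { float integral = 0.0f; };  // hypothetical controller state

// One PI step (illustrative form). ts_ns is the sampling period in
// nanoseconds, as delivered on the Ts port; n_reset is active low.
inline float pi_step(PiState& s, float ref, float meas,
                     float kp, float ki, uint32_t ts_ns, bool n_reset)
{
  assert(ts_ns > 0);
  if (!n_reset) {            // controller not operating: clear integrator
    s.integral = 0.0f;
    return 0.0f;
  }
  const float ts  = static_cast<float>(ts_ns) * 1e-9f;  // ns -> s
  const float err = ref - meas;
  s.integral += ki * ts * err;
  return kp * err + s.integral;
}
```

For example, with a hypothetical gain of 0.01 A per LSB, a raw code of 1000 corresponds to 10 A. Holding `n_reset` low keeps the integrator at zero, which mirrors the role of nReset_ctrl described above.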

CPU-side implementation using Simulink

The CPU-side model is quite simple, as the control algorithm runs entirely in the FPGA. The CPU code provides the current reference and Kp/Ki parameters, operates the PI reset signal, and configures the PWM outputs.

The single2sbo MATLAB Function blocks are used to map the current reference Il_ref and the Kp and Ki parameters to the CPU2FPGA ports.
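Conceptually, such a mapping reinterprets the 32 bits of a single-precision value as an integer and splits them across two 16-bit SBO registers. The C++ sketch below illustrates the idea. The names `SboPair`, `single_to_sbo`, and `sbo_to_single` are hypothetical, and the word order (low half in the first register) is an assumption that should be checked against the Getting started page.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Two 16-bit register values carrying one 32-bit float (hypothetical type).
struct SboPair { uint16_t low; uint16_t high; };

inline SboPair single_to_sbo(float value)
{
  static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
  uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);  // bit copy, no numeric conversion
  return { static_cast<uint16_t>(bits & 0xFFFFu),
           static_cast<uint16_t>(bits >> 16) };
}

inline float sbo_to_single(SboPair p)
{
  const uint32_t bits = (static_cast<uint32_t>(p.high) << 16) | p.low;
  float value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}
```

Since 1.0 is encoded as 0x3F800000 in IEEE 754 single precision, `single_to_sbo(1.0f)` yields a high word of 0x3F80 and a low word of 0x0000.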

The nReset_ctrl signal is used to keep the PI integrator in reset when the controller is not in OPERATING state. As documented in Getting started with FPGA, this reset signal is controlled using SBO_63. To obtain the desired behavior, we’ll simply connect the reset output of a Core state block to SBO_63.

And finally, the SB-PWM block is used to activate the output PWM channel 0 (CH0) (lane #0 and lane #1). The output is configured as Dual (PWM_H + PWM_L) with a deadtime of 1 µs. This configuration expects a PWM signal coming to sb_pwm[0] input of the imperix firmware IP and will automatically generate the complementary signals with the configured deadtime.

The ADC blocks are only used to retrieve the analog input signals at the CPU level for real-time monitoring. They do not affect the closed-loop control behavior.

FPGA-side implementation using Vivado

The TN142_vivado_design.pdf file below shows the full Vivado FPGA design. Here are the step-by-step instructions to reproduce it.

Create an FPGA control implementation starter template by following the Getting started with FPGA control implementation.

Add the PWM IP (from the custom PWM in FPGA page) and current control IP (from the Xilinx Vitis HLS guide or the Model Composer guide) into your Vivado project. In the screenshots of this example, we’ll use the IPs generated using System Generator and Vitis HLS, respectively.
To read the duty_cycle_ticks only when duty_cycle_ticks_ap_vld is ‘1’, the RAM-based Shift Register IP is used. With the configuration shown in the screenshot below, this block adds one register stage that acts as a buffer. It keeps the last computed duty cycle until a new value has been computed. When a new value is available, it replaces the old one.

Add a Constant IP to set all the 31 unused sb_pwm outputs to ‘0’. Set its Const Width to 31 and its Const Val to 0.
Add a Concat IP. It will serve to concat the pwm output of the PWM IP with the zeros of the Constant IP.

Add a Constant IP to set the update rate. ‘0’ = single rate, ‘1’ = double rate.

Connect the clock signals as below:

Connect the AXI4-Streams
- M_AXIS_CPU2FPGA_00 to Il_ref_V
- M_AXIS_CPU2FPGA_01 to Kp_V
- M_AXIS_CPU2FPGA_02 to Ki_V
- M_AXIS_ADC_00 to Il_raw_v
- M_AXIS_ADC_01 to Vout_raw_V
- M_AXIS_ADC_02 to Vint_raw_V
- M_AXIS_Ts to Ts_V

The provided Delay Counter VHDL module (delay_counter.vhd) measures the elapsed time between two signals and outputs a time in nanoseconds, encoded as a uint32.
In this design, the delay counter modules are used purely for debugging purposes. As shown in the image below, one is used to measure the FPGA processing delay, which is the delay between the adc_done_pulse and the duty_cycle_ticks_ap_vld. Another module is used to measure the FPGA cycle delay by measuring the delay between the sampling_pulse and the duty_cycle_ticks_ap_vld. More information on what these delays represent is available on the discrete control delay product note.

Connect the nReset_ctrl signal to ap_rst_n.

And finally, connect the clk signals to clk_250_mhz.
Click Generate bitstream. It will launch the synthesis, implementation, and bitstream generation.
Once the bitstream generation is completed, click on File → Export → Export Bitstream File… to save the bitstream somewhere on your computer.
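Two of the helper blocks used in the steps above can be summarized with small behavioral models: the register stage that holds the last valid duty cycle, and the delay counter that reports the time between two pulses. The sketch below is a cycle-level approximation under assumptions. `HoldRegister` and `DelayCounter` are hypothetical names, and the exact behavior of delay_counter.vhd (which is not shown here) may differ.

```cpp
#include <cassert>
#include <cstdint>

// Register that captures its input only while 'valid' is high, and
// otherwise keeps the last captured value.
struct HoldRegister {
  uint16_t q = 0;
  void clock(uint16_t d, bool valid) { if (valid) q = d; }
};

// Counts 250 MHz clock cycles between a start pulse and a stop pulse,
// and reports the elapsed time in nanoseconds (4 ns per cycle).
struct DelayCounter {
  uint32_t cycles   = 0;
  uint32_t delay_ns = 0;
  bool     running  = false;

  void clock(bool start, bool stop) {
    if (start) {               // a start pulse restarts the measurement
      cycles  = 0;
      running = true;
    } else if (running) {
      ++cycles;
      if (stop) {              // a stop pulse latches the result
        delay_ns = cycles * 4u;
        running  = false;
      }
    }
  }
};
```

Calling `clock()` once per 250 MHz cycle, a stop pulse arriving 10 cycles after the start pulse gives a reported delay of 40 ns.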


Custom PWM modulator implementation in FPGA

Benoît Steinmann — Thu, 10 Jun 2021 11:35:41 +0000

To implement power converter control algorithms in an FPGA, it is often required to develop an FPGA-based pulse-width modulation (PWM) module. Therefore, this note presents how to implement a custom PWM modulator in the Xilinx FPGA of the imperix controller (B-Box RCP or B-Board PRO).

The presented modulator uses FPGA pulse-width modulation with a triangular carrier. The sources of this example can be re-used in an FPGA-based power converter control design. Alternatively, it can be used as a starting point to develop more complex custom PWM modulators.

The first section of this page explains how the PWM module fits into an FPGA-based control design as well as the FPGA design of the pulse-width modulation. Next, the actual implementation of the FPGA modulator is presented using 3 different tools:

the Xilinx blockset for Simulink System Generator
the Simulink add-on MATLAB HDL Coder
hand-written VHDL coding

Then, a Simulink testbench is built to test the FPGA design in simulation. Finally, the modulator is integrated into the imperix controller FPGA to be validated.

This page addresses advanced content for users who require implementing converter control algorithms with FPGA logic or implementing non-standard modulation techniques.
For most use cases, using CPU-based control and the pre-implemented carrier-based PWM modulators of the imperix library is sufficient and should be preferred.

To find all FPGA-related notes, you can visit FPGA development homepage.

Design choices for the FPGA-based PWM modulator

The Pulse Width Modulator (PWM) is intended to be used in a larger design such as the FPGA-based buck converter control example shown in the image below. The imperix firmware IP and ix axis interface are explained in the getting started with FPGA control page and the current control module is presented in the high level synthesis for FPGA tutorial.

Power converter control in an imperix controller

The FPGA-based PWM modulator must be connected to one of the 4 clock generators (CLK). In this example, the CLK allows synchronizing the CPU control task with the PWM carrier. For further details on the CLK FPGA signals, please refer to the “CLOCK interface” section of the imperix firmware IP user guide.

The Sandbox PWM (SB-PWM) block makes it possible to drive the same PWM output chain as that used by other modulators (CB-PWM, PP-PWM, DO-PWM, and SS-PWM). This allows the user to generate complementary signals with dead-time, use the standard activate and deactivate functions and rely on the protection mechanism that blocks PWM outputs when a fault is detected.

The figure below shows how the FPGA modulator operates. The pulse-width modulation is obtained by comparing a duty cycle value with the triangular carrier wave.

PWM modulator based on a triangular carrier

The duty cycle can be updated using a single rate or double rate. When using the single-rate update, the duty cycle value is applied when the triangular carrier reaches its minimum. With the double-rate update, the duty cycle is updated twice per period: when the carrier reaches its maximum and when it reaches its minimum.
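The update rule can be expressed as a simple predicate on the clock timer. The sketch below assumes that the carrier is at its minimum when the timer is 0 and at its maximum when the timer equals half the period. The function name `update_now` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns true on the ticks where a new duty cycle is loaded.
// Assumption: carrier minimum at timer == 0, maximum at timer == period/2.
inline bool update_now(uint16_t timer, uint16_t period, bool double_rate)
{
  assert(timer < period);
  const bool at_min = (timer == 0);
  const bool at_max = (timer == period / 2);
  return at_min || (double_rate && at_max);
}
```

With a period of 1000 ticks, the single-rate mode loads a new value only at tick 0, whereas the double-rate mode loads one at ticks 0 and 500.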

In summary, this FPGA PWM modulator behaves as a carrier-based PWM (CB-PWM) modulator that is configured with a triangular carrier, and a phase of 0.

The FPGA-based PWM module is shown below. The screenshot shows the IP generated with System Generator, but the input and output ports are identical when using MATLAB HDL Coder or VHDL. The ports are the following:

CLOCK: the clock interface that is meant to be connected to the CLOCK output of imperix firmware IP. It contains:
- CLOCK_prescaler, the CLOCK_timer ticking rate, 1 tick = (4 ns/CLOCK_prescaler)
- CLOCK_clk_en, asserted to indicate a new tick.
- CLOCK_period, the PWM period in ticks
- CLOCK_timer, a counter that goes from 0 to CLOCK_period-1
next_dutycycle: the next duty cycle to be updated. This signal is a 16-bit unsigned integer with a unit of ticks.
update_rate: controls the update rate. ‘0’ is single-rate and ‘1’ is double-rate.
pwm: the pulse-width modulation output signal

FPGA-based PWM module developed using Xilinx System Generator

Below is shown the high-level schematic of the FPGA-implemented PWM modulator. The duty cycle is stored in a register, whose enable port is controlled by the update rate. The triangular carrier is generated from the CLOCK input using an up/down counter that behaves as follows:

it resets each time CLOCK_timer is equal to zero
after a reset, it counts up until CLOCK_timer reaches CLOCK_period/2
then, it counts down until it reaches zero.

High-level schematic of the PWM block
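The carrier and comparator described above can be approximated by an idealized model that ignores register latencies. In the sketch below, the function names are hypothetical, the carrier is assumed to move by 2 per tick so that it spans 0 to CLOCK_period, and the comparison rule mirrors the one used in the hand-written VHDL version of this page.

```cpp
#include <cassert>
#include <cstdint>

// Idealized triangular carrier derived from the clock timer: it rises
// from 0 to 'period' during the first half-period, then falls back to 0.
inline uint16_t carrier_value(uint16_t timer, uint16_t period)
{
  assert(period >= 2 && period % 2 == 0 && timer < period);
  const uint16_t half = period / 2;
  return static_cast<uint16_t>(timer <= half ? 2u * timer
                                             : 2u * (period - timer));
}

// Comparator: output high while the carrier does not exceed the duty
// cycle. A duty cycle of 0 forces the output low.
inline bool pwm_out(uint16_t duty_ticks, uint16_t carrier)
{
  return duty_ticks != 0 &&
         static_cast<uint32_t>(duty_ticks) + 1u > carrier;
}

// Number of ticks per period during which the output is high.
inline uint32_t high_ticks(uint16_t duty_ticks, uint16_t period)
{
  uint32_t n = 0;
  for (uint16_t t = 0; t < period; ++t)
    n += pwm_out(duty_ticks, carrier_value(t, period)) ? 1u : 0u;
  return n;
}
```

For a period of 1000 ticks and a duty cycle of 300 ticks, this model keeps the output high for 301 ticks, that is, roughly 30% of the period.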

How to implement pulse-width modulation in FPGA?

This section provides 3 possible approaches for implementing the FPGA PWM modulator, using Xilinx System Generator, MATLAB Simulink HDL Coder, or hand-written VHDL.

1) FPGA PWM using Xilinx System Generator

The implementation of the FPGA PWM modulator using Xilinx System Generator is given below. The sources are available on the System Generator introduction page.

Please note the following important points:

The input and output ports are represented with Gateway in and Gateway out blocks. The sample time is set to 4 ns to ensure that the model represents the real behavior of the FPGA.
All the input signals are registered to improve performance.
In System Generator, users can configure the latency for each block. If there is a timing violation in the generated IP, try to increase the latency or insert registers between operations.
The carrier is generated using a free-running counter and a state machine that controls the counter. System Generator provides an MCode block that lets users convert MATLAB code to VHDL. The MATLAB code for the state machine is given below.

FPGA-based modulator designed using Xilinx System Generator

function [up,rst] = state_machine(reg_HalfPeriodMinusOne, reg_Timer)
persistent state, state = xl_state(0,{xlUnsigned, 1, 0});

% default value
up = 1;
rst = 0;

switch state
  case 0 % counting up
    up = 1;
    rst = 0;
    if reg_Timer > reg_HalfPeriodMinusOne
      state = 1;
    end
  case 1 % counting down
    up = 0;
    if reg_Timer == 0
      state = 0;
      rst = 1;
    else
      rst = 0;
    end
end

2) FPGA PWM using HDL Coder

The implementation of the FPGA PWM modulator using HDL Coder is given below. The sources are available on the MATLAB HDL Coder introduction page.

Please note the following important points:

The input and output ports are represented with Simulink input and output ports. The sample time is set to 4 ns to ensure the model represents the real behavior of the FPGA.
The delay block can be used to represent the register in FPGA.
The carrier is generated using a free-running counter and a state machine that controls the counter. Here, the state machine is implemented using a MATLAB Function block. The MATLAB code for the state machine is given below.

FPGA-based PWM module designed using MATLAB HDL Coder

function [rst, up] = state_machine(reg_HalfPeriodMinusOne, reg_Timer)

% define states
state_up = uint8(0);
state_down = uint8(1);
persistent state
if isempty(state)
  state = state_up;   
end

% default value
up = true;
rst = false;

switch state
    
  case state_up % counting up
    up = true;
    rst = false;
    if reg_Timer > reg_HalfPeriodMinusOne
      state = state_down;
    end
    
  case state_down % counting down
    up = false;
    if reg_Timer == 0
      state = state_up;
      rst = true;
    else
      rst = false;
    end
end

3) FPGA PWM using VHDL

The following code implements the same FPGA pulse-width modulation design in hand-written VHDL.

----------------------------------------------------------------------------------
-- Create Date: 10/02/2021
----------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity UserCbPwm is

  Port (
    CLOCK_period : in std_logic_vector(15 downto 0);
    CLOCK_timer : in std_logic_vector(15 downto 0);
    CLOCK_prescaler : in std_logic_vector(15 downto 0);
    CLOCK_clk_en : in std_logic;

    -- must be between 0 and CLOCK_period
    -- loaded when the carrier reaches zero
    i_nextDutyCycle : in std_logic_vector(15 downto 0);
    i_enableDoubleRate : in std_logic;

    o_carrier : out std_logic_vector(15 downto 0);
    o_pwm : out std_logic;

    clk_250_mhz : in std_logic
  );
end UserCbPwm;

architecture impl of UserCbPwm is

  ATTRIBUTE X_INTERFACE_INFO : STRING;
  ATTRIBUTE X_INTERFACE_INFO of clk_250_mhz: SIGNAL is "xilinx.com:signal:clock:1.0 clk_250_mhz CLK";

  ATTRIBUTE X_INTERFACE_INFO of CLOCK_period:    SIGNAL is "imperix.ch:ix:clock_gen_rtl:1.0 CLOCK period";
  ATTRIBUTE X_INTERFACE_INFO of CLOCK_timer:     SIGNAL is "imperix.ch:ix:clock_gen_rtl:1.0 CLOCK timer";
  ATTRIBUTE X_INTERFACE_INFO of CLOCK_prescaler: SIGNAL is "imperix.ch:ix:clock_gen_rtl:1.0 CLOCK prescaler";
  ATTRIBUTE X_INTERFACE_INFO of CLOCK_clk_en:    SIGNAL is "imperix.ch:ix:clock_gen_rtl:1.0 CLOCK clk_en";

  attribute X_INTERFACE_MODE : string;
  attribute X_INTERFACE_MODE of CLOCK_timer : signal is "monitor";

  signal reg_Pwm : std_logic;
  signal reg_Carrier : unsigned(15 downto 0);

  signal reg_ClkEnable : std_logic;
  signal reg_HalfPeriodMinusOne : unsigned(15 downto 0);
  signal reg_DutyCycle : unsigned(15 downto 0);
  signal reg_DutyCyclePlusOne : unsigned(15 downto 0);

  signal reg_Timer : unsigned(15 downto 0);

  type t_CarrierStates is (COUNTING_UP, COUNTING_DOWN);
  signal reg_CarrierState: t_CarrierStates;

begin

  o_carrier <= std_logic_vector(reg_Carrier);
  o_pwm <= reg_Pwm;

  P_INPUT_SAMPLING : process(clk_250_mhz)
  begin
    if rising_edge(clk_250_mhz) then
      reg_Timer <= unsigned(CLOCK_timer);
      reg_ClkEnable <= CLOCK_clk_en;
      reg_HalfPeriodMinusOne <= shift_right(unsigned(CLOCK_period), 1) - 1;
      -- update the duty-cycle when the carrier hits zero
      if reg_Carrier = 0 or (i_enableDoubleRate = '1' and reg_Carrier = unsigned(CLOCK_period)) then
        reg_DutyCycle <= unsigned(i_nextDutyCycle);
        reg_DutyCyclePlusOne <= unsigned(i_nextDutyCycle) + 1;
      end if;
    end if;
  end process P_INPUT_SAMPLING;

  P_TRIANGLE_CARRIER: process(clk_250_mhz)
  begin
    if rising_edge(clk_250_mhz) then
      -- reg_ClkEnable serves to slow down the logic if the CLOCK_prescaler is used
      -- it is used only if the frequency is lower than 3.8 kHz
      if reg_ClkEnable = '1' then
        if reg_CarrierState = COUNTING_UP then
          reg_Carrier <= reg_Carrier + 2;
          if reg_Timer >= reg_HalfPeriodMinusOne then -- minus one because we'll switch to counting down in the next clock cycle
            reg_CarrierState <= COUNTING_DOWN;
          end if;
        else -- reg_CarrierState = COUNTING_DOWN
          if reg_Carrier >= 2 then
            reg_Carrier <= reg_Carrier - 2;
          end if;
          if reg_Timer = 0 then
            reg_CarrierState <= COUNTING_UP;
            reg_Carrier <= (others => '0');
          end if;
        end if;
      end if;
    end if;
  end process P_TRIANGLE_CARRIER;

  P_OUTPUT: process(clk_250_mhz)
  begin
    if rising_edge(clk_250_mhz) then
      if (reg_DutyCycle /= 0) AND (reg_DutyCyclePlusOne > reg_Carrier) then
        reg_Pwm <= '1';
      else
        reg_Pwm <= '0';
      end if;
    end if;
  end process P_OUTPUT;

end impl;

Testing the FPGA-based PWM module in simulation

The following test bench is used to validate the proper functioning of the FPGA pulse-width modulation with a Simulink simulation. The screenshots below show the test of PWM using the Xilinx System Generator implementation, but the testbench is identical for testing the MATLAB HDL Coder implementation.

Simulation test bench for the FPGA-based PWM block

The testbench generates CLOCK signals that are identical to the interface of the imperix firmware IP. The CLOCK frequency is set to 200 kHz. A sinusoidal signal ranging from 0 to CLOCK_period (in ticks) is connected to the dutycycle input. The following plot validates that the resulting PWM has a frequency of 200 kHz and that its duty cycle varies from 0 to 1 following the sinusoid.

As shown below, the behavior of internal signals of the PWM module can also be inspected by adding scopes inside the Xilinx System Generator design. System Generator flags such scopes with a red (!) marker. This does not cause any problem during simulation, but all scopes must be removed before generating the FPGA IP.

The internal signal UpdatedDutyCycle is the actual value compared to the triangular carrier to generate the PWM signal. Changing the constant signal applied to the update_rate input allows selecting between single-rate update (0) and double-rate update (1). These two modes are documented in the standard carrier-based PWM block help. Observing internal signals allows checking that the UpdatedDutyCycle behaves according to the selected update rate mode.

Duty cycle update in single-rate update mode

Duty cycle update in double-rate update mode

Integrating the PWM modulator into the Xilinx FPGA

Description of the FPGA-based PWM modulator example

The FPGA design illustrated below is used to test the generated FPGA PWM modulator. Its purpose is to manually select the duty cycle of the modulator from a PC by using Imperix Cockpit. To do so, a tunable parameter block is used on the CPU. It is configured with the name duty_cycle and the data type single. This value is transferred from the CPU to the FPGA using the CPU2FPGA_00 interface (SBO_00 and SBO_01) as explained in the getting started with FPGA control page. In the FPGA, this single-precision duty cycle is transformed into an integer value in ticks as follows:

The 32-bit single-precision floating-point value is converted into a 16-bit fixed-point value with a 1-bit integer part and a 15-bit fractional part (fix16_15). This format was chosen because the duty cycle is expected to range between 0.0 and 1.0, so a single bit is sufficient for the integer part.
To obtain a value in ticks, the result of the previous step is multiplied by CLOCK_period. Multiplying a fix16_15 by a uint16 yields a fix32_15 (32 bits in total, with a 17-bit integer part and a 15-bit fractional part).
Finally, only the 16 least significant bits of the integer part of this result are used as the duty cycle input of the FPGA PWM modulator IP.
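The three steps above can be reproduced on a PC to predict the value that reaches the modulator. The sketch below is an approximation under assumptions: the function names are hypothetical, and the rounding mode (truncation) and the saturation of 1.0 to the largest representable value are choices made for this illustration, which may not match the configuration of the Floating-point IP.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Step 1: single precision -> fix16_15 (stored as a 16-bit signed integer).
// Truncation and saturation are assumptions of this sketch.
inline int16_t float_to_fix16_15(float x)
{
  float scaled = std::floor(x * 32768.0f);   // 2^15 = 32768
  if (scaled > 32767.0f)  scaled = 32767.0f;
  if (scaled < -32768.0f) scaled = -32768.0f;
  return static_cast<int16_t>(scaled);
}

// Steps 2 and 3: multiply by the period (fix32_15 result), then keep
// the low 16 bits of the integer part.
inline uint16_t duty_to_ticks(float duty, uint16_t clock_period)
{
  assert(duty >= 0.0f && duty <= 1.0f);
  const int32_t product =
      static_cast<int32_t>(float_to_fix16_15(duty)) * clock_period;
  return static_cast<uint16_t>(product >> 15);  // drop the 15 fractional bits
}
```

With a 50 kHz clock (period of 5000 ticks), a duty cycle of 0.5 gives exactly 2500 ticks, while 0.3 gives 1499 ticks because of the quantization to 15 fractional bits.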

The CLOCK_0 will be used as a clock reference for the FPGA pulse-width modulation, which means that the PWM modulator will run at the same frequency as the CPU control task and that both will stay synchronized. (That is because the CPU interrupt rate is always defined by CLOCK_0.)

On the imperix firmware IP, the sb_pwm[31:0] port provides access to the same PWM output chain as that used by other modulators (CB-PWM, PP-PWM, DO-PWM and SS-PWM). This allows the user to generate complementary signals with dead-time, use the standard activate and deactivate functions and rely on the protection mechanism that blocks PWM outputs when a fault is detected.

When driving a PWM channel (two pseudo-complementary signals with dead time), the user only needs to generate the HIGH signal, which must be connected to the appropriate sb_pwm input (sb_pwm[0], sb_pwm[2], sb_pwm[4], etc.). Another example of such a configuration is available in the FPGA-based hysteresis current control example.

Imperix strongly discourages the user from directly driving the top-level pwm port, as this would bypass the enable/disable mechanism! Instead, the SB-PWM driver is meant to provide proper access to PWM outputs, which should be used in all cases. This is critical since this mechanism also handles fault management!
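The lane mapping described above can be illustrated with a small helper that builds the 32-bit sb_pwm word from per-channel HIGH signals. This is only a sketch: the function name `sb_pwm_word` and its bit-mask argument are hypothetical, and in the actual design this wiring is done with Constant and Concat IPs rather than in code.

```cpp
#include <cassert>
#include <cstdint>

// Bit n of 'channel_high' is the HIGH signal of PWM channel n (0..15).
// Channel n drives sb_pwm[2n]; the odd lanes are left at '0', since the
// complementary signals are generated by the SB-PWM driver.
inline uint32_t sb_pwm_word(uint16_t channel_high)
{
  uint32_t word = 0;
  for (unsigned ch = 0; ch < 16; ++ch) {
    if (channel_high & (1u << ch)) {
      word |= (1u << (2 * ch));
    }
  }
  return word;
}
```

For instance, driving only channel 0 sets bit 0 of the word, and driving only channel 1 sets bit 2.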

CPU-side implementation

As with any FPGA-based implementation, a CPU code is still required to configure the imperix IP and define real-time variables accessible from the various Cockpit modules. In the current example, the CPU code

configures the frequency of CLOCK_0,
configures the Sandbox PWM driver,
declares the duty_cycle variable and transmits it to the FPGA through the SBO interface.

Using ACG SDK on Simulink

The frequency of CLOCK_0 is defined in the Configuration block, and the duty_cycle (float) variable is created using a tunable parameter block. It is then mapped to M_AXIS_CPU2FPGA_00 using the MATLAB Function block single2sbo (as introduced in Getting started with FPGA control development).

The SB-PWM block is used to configure and activate/deactivate the output PWM channel 0 (CH0) (lane #0 and lane #1). The output is configured as Dual (PWM_H + PWM_L) with a deadtime of 1 µs. This configuration expects a PWM signal coming to sb_pwm[0] input of the imperix firmware IP and will automatically generate the complementary signals with the configured deadtime.

Click to download the Simulink model

Using CPP SDK

The equivalent functionalities can also be implemented in C code, using CPP SDK. The corresponding code is available for download below.

Click to download the C code example

FPGA-side implementation using Vivado

The TN141_vivado_design.pdf file below shows the full Vivado FPGA design. Here are the step-by-step instructions to reproduce it.

Click to download TN141_vivado_design.pdf

Create an FPGA control implementation starter template by following the Getting started with FPGA control implementation.

Add the FPGA PWM IP into your Vivado project. System Generator is taken as an example but the steps are identical for a VHDL or an HDL Coder module.

Add a Floating-point IP and select the Float-to-fixed operation. Select Single precision for input and Integer Width 1, Fraction Width 15 (fix16_15) for output. This module will convert the duty cycle sent by CPU from single to fix16_15.

Add a Multiplier IP and set the configuration as shown below.
- The input A is connected to the CLOCK_period so it is set to 16-bit unsigned
- The input B is connected to the fixed-point duty cycle so it is set to 16-bit signed
- The output range is set to return only the 16 first bits of the integer part of the result
- The clock enable (CE) input is enabled and will be connected to the tvalid output of the single_to_fix16_15 IP. This way, the multiplication is performed synchronously with the data coming from the AXI4-Stream.

Add a Constant IP to set all the 31 unused sb_pwm outputs to ‘0’. Set its Const Width to 31 and its Const Val to 0.

Add a Concat IP. It will serve to concat the pwm output of the PWM IP with the zeros of the Constant IP.

Connect the pins as follows:
- all the CLOCK_0 signals of the imperix firmware IP to the PWM block
- M_AXIS_CPU2FPGA_00 to S_AXIS_A of single_to_fix16_15
- CLOCK_0_period to A of the Multiplier
- tdata of M_AXIS_RESULT of single_to_fix16_15 to B of the Multiplier
- tvalid of M_AXIS_RESULT of single_to_fix16_15 to CE of the Multiplier
- P of the Multiplier to i_nextdutycycle
- o_pwm of the PWM IP to In0 of the Concat IP
- dout of the 31-bit zero Constant IP to In1 of the Concat IP
- dout of the Concat IP to the sb_pwm input of the imperix firmware IP
- all the IP clocks to clk_250_mhz

And finally, the design can be synthesized and the bitstream generated:

Click Generate bitstream. It will launch the synthesis, implementation, and bitstream generation.
Once the bitstream generation is completed, click on File → Export → Export Bitstream File… to save the bitstream somewhere on your computer.

Experimental validation

The bitstream can be generated and loaded into the device using Cockpit, as explained in the Getting started with FPGA control implementation page. Then, the Simulink model TN141_CPU_side.slx can be built and launched as explained in the Programming and operating imperix controllers getting started page.

Using the Variables module of Cockpit, the duty_cycle variable can be changed in real-time. After enabling the PWM outputs, the PWM signals at the output CH0 (lanes 0 & 1) can be observed using an oscilloscope or a logic analyzer.

imperix Cockpit user interface

The screenshot below shows the PWM signals measured with a logic analyzer. As expected, they are complementary 50 kHz PWM signals with a duty cycle of 30% and a dead time of 1 µs.

Going further

The high-level synthesis for FPGA developments page re-uses this FPGA PWM modulator in a PI-based current control of a buck converter scenario. It shows how to integrate and connect HLS-generated IPs in a realistic FPGA power converter control implementation.

Fixed point vs floating point arithmetic in FPGA

Benoît Steinmann — Fri, 13 Aug 2021 13:31:17 +0000

The choice of fixed-point vs floating-point arithmetic for an FPGA algorithm has a significant impact on FPGA resource usage, computation latency, and data precision. This page compares fixed-point and floating-point arithmetic and gives the advantages and drawbacks of each approach. Then, it shows how to use the typecast MATLAB function in MATLAB Simulink to transform a floating-point value into an integer without changing the underlying bits, which is useful when exchanging data between the CPU and FPGA of an imperix power converter controller.

To find all FPGA-related notes, you can visit FPGA development homepage.

Integers and fixed-point arithmetic in FPGA

A fixed-point number is represented with a fixed number of digits before and after the radix point. In an FPGA, a fixed-point number is stored as an integer with an implicit scaling factor. For example, the common notation fix16_10 used by Xilinx stands for a signed 16-bit integer whose stored value is the real value multiplied by $2^{10}$. In other words, 10 of the 16 bits represent the fractional part and the remaining 6 bits (sign included) represent the integer part.
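
As an illustration, the fix16_10 scaling can be reproduced on a PC with the short C++ sketch below. The function names are hypothetical, and the rounding and saturation choices are assumptions rather than the behavior of any specific Xilinx block.

```cpp
#include <cmath>
#include <cstdint>

constexpr int kFracBits = 10;  // fix16_10: 10 fractional bits

// Real value -> stored 16-bit integer (round to nearest, saturate on overflow).
int16_t to_fix16_10(double x) {
    double scaled = std::round(x * (1 << kFracBits));
    if (scaled > 32767.0) scaled = 32767.0;
    if (scaled < -32768.0) scaled = -32768.0;
    return static_cast<int16_t>(scaled);
}

// Stored 16-bit integer -> real value.
double from_fix16_10(int16_t raw) {
    return static_cast<double>(raw) / (1 << kFracBits);
}
```

With this format, 1.5 is stored as 1536, the resolution is $2^{-10} \approx 0.001$, and the representable range is roughly -32 to +32. A value such as 100.0 does not fit and saturates, which is the kind of range issue the designer has to anticipate.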

Fixed-point arithmetic is widely used in FPGA-based algorithms because it usually runs faster and uses fewer resources when compared to floating-point arithmetic.

However, one drawback of fixed-point arithmetic is that the user has to anticipate the range of the data and choose the scaling factor accordingly (the size of the fractional part), making the design more prone to errors.

The custom FPGA PWM modulator is a good example where it makes sense to use fixed-point arithmetic, given that the range of the duty-cycle parameter is restricted between 0.0 and 1.0.

Floating-point arithmetic in FPGA

A floating-point number is represented with a fixed number of significant digits and scaled using an exponent in some fixed base. Xilinx tools support three formats: half (16 bit), single (32 bit), and double (64 bit). For example, a single-format number (IEEE 754 binary32) consists of 1 sign bit, 8 exponent bits, and 23 fraction bits.

Floating-point-based algorithms are more complex to handle than fixed-point ones, especially when using HDL languages (VHDL, Verilog). Fortunately, various Xilinx tools (such as the Vivado Floating-Point IP and High-Level Synthesis (HLS) tools) make the development of floating-point-based algorithms much more convenient. Indeed, these tools add an abstraction level so that the user does not need to handle data down to its binary representation.

We recommend starting a design by using the single format everywhere and then switching to fixed-point arithmetic, where necessary, to improve latency and resource utilization.

How to typecast in MATLAB Simulink

To typecast a floating-point value to an integer in MATLAB Simulink, the following MATLAB Function can be used. It uses the typecast instruction to change the type of a variable without modifying the underlying bits. As the FPGA registers are 16 bits wide, this transformation is required to split a 32-bit floating-point value into two 16-bit registers and transfer it from the CPU to the FPGA.

function [y1,y2] = single2sbo(u)
  temp = typecast(single(u),'uint16');
  y1 = temp(1);
  y2 = temp(2);

Typecasting an integer to a floating point in MATLAB Simulink is the reverse operation. It consists of concatenating two uint16 values and interpreting the result as single-precision data using the typecast MATLAB function.

function y = sbi2single(u1,u2)
  y = single(0); % sets compiled size of output
  y = typecast([uint16(u1) uint16(u2)], 'single');
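
For users of the C++ SDK, the same bit-level reinterpretation can be sketched with std::memcpy. The function names below are hypothetical, and the word order (low word first) is an assumption that matches what typecast returns on a little-endian machine.

```cpp
#include <cstdint>
#include <cstring>
#include <utility>

static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");

// Split a single-precision value into two 16-bit words (low word first),
// without modifying the underlying bits.
std::pair<uint16_t, uint16_t> single_to_words(float u) {
    uint32_t bits;
    std::memcpy(&bits, &u, sizeof bits);
    return {static_cast<uint16_t>(bits & 0xFFFFu),
            static_cast<uint16_t>(bits >> 16)};
}

// Reverse operation: concatenate two 16-bit words and reinterpret as a float.
float words_to_single(uint16_t low, uint16_t high) {
    uint32_t bits = (static_cast<uint32_t>(high) << 16) | low;
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y;
}
```

For example, 1.0f has the bit pattern 0x3F800000, so it is split into the words 0x0000 (low) and 0x3F80 (high).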

A use-case example of the MATLAB typecast function can be seen in the High-Level Synthesis for FPGA developments example.

SBO – Sandbox output towards FPGA

Benoît Steinmann — Fri, 02 Apr 2021 13:36:41 +0000

The Sandbox Output towards FPGA (SBO) block writes the value of the SBO registers in the FPGA. It is used to transfer data from the CPU to the user-made code within the FPGA.

An SBO register can be configured as:

a configuration register: the value is written only once, at the code launch
a real-time register: the value can change anytime during the control

In Simulink and PLECS, configuration register values are defined from the block mask and real-time registers from the block input signal.

Information on editing the FPGA firmware is available in:

Editing the FPGA firmware (sandbox) (PN116)

Usage examples of the SBO block are available on:

Simulink block

Signal specification

The input expects a vector of 16-bit unsigned integer values to write to the SBO registers.

Up to 8 real-time registers and 8 configuration registers can be written from a single SBO block. Multiple SBO blocks can be used to write to more registers.

Parameters

Device ID selects which B-Box/B-Board to address when used in a multi-device configuration.
Real-time registers: Starting register number and Number of registers define the range of registers to write to.
Configuration registers: Starting register number and Number of registers define the range of registers to write to. Their values can be set from the Values tab.

PLECS block

Signal specification

The input expects a vector of 16-bit unsigned integer values to write to the SBO registers.

Parameters

Device ID selects which B-Box/B-Board to address when used in a multi-device configuration.
Real-time register(s) (vectorizable) defines the real-time registers to write to using the input signal.
Configuration register(s) (vectorizable) defines the configuration registers to write to and Configuration value(s) (vectorizable) sets their constant values.

C++ functions

void Sbo_WriteDirectly(unsigned int address, uint16_t data, unsigned int device=0);

Writes a constant value to an SBO register.

It can only be called in UserInit().

Parameters

address: address of the targeted register (0 to 63)
data: value to write
device: the id of the addressed device (optional, used in multi-device configuration only)

void Sbo_Write(unsigned int address, uint16_t data, unsigned int device=0);

Updates the value of an SBO register configured as real-time. It has to be called in the control interrupt.

For this function to work the addressed register must be set as real-time using Sbo_ConfigureAsRealTime().

Parameters

address: address of the targeted register (0 to 63)
data: value to write
device: the id of the addressed device (optional, used in multi-device configuration only)

void Sbo_ConfigureAsRealTime(unsigned int address, unsigned int device=0);

Tags an SBO register as real-time, meaning that its value can be updated from the interrupt routine using Sbo_Write() and is transferred to the FPGA at the end of the interrupt routine execution.

It has to be called in UserInit().

Parameters

address: address of the targeted register (0 to 63)
device: the id of the addressed device (optional, used in multi-device configuration only)
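
The calling pattern of these three functions can be sketched as follows. The register addresses, the constant written at initialization, and the ExampleUserInit()/ExampleInterrupt() wrappers are hypothetical. The mock definitions only exist so that the sketch compiles on a PC: on the target, the functions are provided by the SDK and the mocks must be removed.

```cpp
#include <array>
#include <cstdint>

// Host-side stand-ins (assumption) for the SDK functions documented above.
static std::array<uint16_t, 64> mock_sbo_regs{};
static std::array<bool, 64> mock_sbo_is_realtime{};

void Sbo_WriteDirectly(unsigned int address, uint16_t data, unsigned int device = 0) {
    (void)device;
    mock_sbo_regs.at(address) = data;
}
void Sbo_ConfigureAsRealTime(unsigned int address, unsigned int device = 0) {
    (void)device;
    mock_sbo_is_realtime.at(address) = true;
}
void Sbo_Write(unsigned int address, uint16_t data, unsigned int device = 0) {
    (void)device;
    // Mock behavior: writes to registers not tagged as real-time are ignored.
    if (mock_sbo_is_realtime.at(address)) mock_sbo_regs.at(address) = data;
}

// Hypothetical register allocation chosen for this example.
constexpr unsigned int kConfigReg = 0;    // written once at code launch
constexpr unsigned int kRealTimeReg = 1;  // updated at each interrupt

// Calls that would be placed in UserInit().
void ExampleUserInit() {
    Sbo_WriteDirectly(kConfigReg, 64);      // constant configuration value
    Sbo_ConfigureAsRealTime(kRealTimeReg);  // allow updates from the interrupt
}

// Call that would be placed in the control interrupt.
void ExampleInterrupt(uint16_t value) {
    Sbo_Write(kRealTimeReg, value);
}
```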

FPGA development on imperix controllers

Benoît Steinmann — Thu, 10 Jun 2021 09:43:16 +0000

Usually, the user programs the B-Box RCP or the B-Board PRO CPU using imperix ACG SDK or C++ SDK, and simply uses the pre-implemented FPGA peripherals such as the ADC drivers or PWM generators. Nevertheless, advanced users can also directly program the FPGA to implement high-performance algorithms, interface with specialized peripherals, or implement high-speed communications with other devices.

This page summarizes the documentation pages relating to FPGA development on imperix controllers.

Executing power converter control algorithms on an FPGA

For control frequencies of 250 kHz or less, the CPU of the B-Box RCP is usually more than fast enough. Controlling a power converter using the FPGA instead of the CPU allows for even higher execution rates and lower computation latency. Our FPGA-based control of a grid-tied inverter example shows how an entire control algorithm can be ported to the FPGA and reach a control frequency above 650 kHz.

TN147: FPGA-based control of a grid-tied inverter

This example also demonstrates how High-Level Synthesis tools can be leveraged to generate complex FPGA modules such as a grid synchronization module with dq-PLL or a dq current control.

TN143: FPGA implementation of a PLL for grid synchronization
TN144: DQ current control using FPGA-based PI controllers

Starting an FPGA development project

Imperix offers the possibility to customize its FPGA firmware by instantiating the imperix firmware IP in Xilinx Vivado and editing programmable logic around it. The first step to start editing the FPGA consists of installing Xilinx Vivado Design Suite, which is available for free as the ML Standard (or WebPACK) edition.

PN168: Xilinx Vivado Design Suite installation

Users should then familiarize themselves with the imperix firmware IP for Xilinx Vivado. It contains the logic operating the FPGA part of the B-Box and B-Board controllers. The IP also provides user interfaces to exchange data with the CPU, retrieve conversion results of the analog inputs, drive the PWM outputs, and more.

PN116: Imperix firmware IP product guide

PN117: Download and update imperix IP for FPGA sandbox

The user can then create the FPGA development template by following the instructions in the product note below. This product note also explains how the template facilitates the retrieval of the sampled analog inputs and data exchange with the CPU, by using the AXI4-Stream protocol. Finally, it provides a “hello-world” example explaining step-by-step how to add custom logic, synthesize the design and load it into the FPGA.

PN159: Getting started with FPGA control development

Implementing closed-loop control of a buck converter in FPGA

The following notes show a step-by-step example of how to implement the closed-loop control of a buck converter in FPGA.

FPGA-based control of a buck converter

The first page presents three ways of implementing a custom triangular carrier PWM modulator, using VHDL or code generation tools such as System Generator or MATLAB HDL Coder.

TN141: Custom FPGA PWM modulator implementation

The second page explains how to create a PI-based current controller for a buck converter using high-level synthesis (HLS) tools such as Model Composer and Vitis HLS.

TN142: PI-based current controller for a buck converter

Learning about automated generation tools for FPGA

Traditionally, FPGA designs are implemented using HDL languages such as VHDL or Verilog. However, the user can use automated code generation tools to design FPGA modules without writing a single line of HDL.

These tools can be separated into two main categories:

HDL-level tools, such as System Generator and HDL Coder, in which users can describe their design down to the flip-flop level. These tools are much closer to HDL languages (VHDL or Verilog).
High-Level Synthesis (HLS) tools, such as Xilinx Model Composer and Vitis HLS, which are particularly well suited to describing control algorithms using complex data types and math functions.

Only Vitis HLS is free of cost; the others require a paid license.

To implement peripherals, such as custom PWM modulators or communication interfaces, the HDL approach is recommended. The following pages deal with HDL code generation tools:

TN141: Custom FPGA PWM modulator implementation
PN161: Xilinx System Generator introduction
PN162: MATLAB HDL Coder introduction

To implement control algorithms, HLS tools are better suited. The following pages deal with HLS tools:

TN142: PI-based current controller for a buck converter
PN163: Xilinx Model Composer introduction
PN164: Xilinx Vitis HLS introduction

Implementing communication with other devices

The FPGA can also be used to implement high-speed communication with other devices such as hardware-in-the-loop (HIL) simulators or third-party FPGAs. The following pages detail how SFP ports can be used to communicate using the Aurora protocol.

PN118: Example of FPGA-based Aurora 8B/10B communication
PN122: SFP communication with an RTDS MMC simulator

Additional examples

These older examples were written before AXI4-Stream was introduced to the FPGA development template. As such, they may not implement all the recommendations provided in the pages above.

FPGA-based SPI communication IP for A/D converter
FPGA-based Direct Torque Control using Vivado HLS
FPGA-based hysteresis controller for three-phase inverter using HDL Coder
FPGA-based hysteresis current controller for three-phase inverter

Download and update imperix IP for FPGA sandbox

Benoît Steinmann — Thu, 05 Sep 2024 09:46:32 +0000

This page provides the imperix IP and other source files required for FPGA development on imperix controllers.

To learn how to use the imperix IP, please refer to the getting started with FPGA page and the imperix IP user guide page.

To find all FPGA-related notes, you can visit FPGA development homepage.

Download

The following table lists the available imperix IP versions. The minimal Vivado version required is 2022.1.

When upgrading an existing Vivado project, please refer to the upgrade section below.

C++ or ACG SDK	imperix IP version	Download
2024.3	3.10 Rev. 0	FPGA_Sandbox_template_3.10rev0.zip
2025.1	3.10 Rev. 0 3.10 Rev. 1*	FPGA_Sandbox_template_3.10rev1.zip

New features of the imperix IP version 3.10 are shown here.

*An upgrade of the imperix IP from 3.10 Rev. 0 to 3.10 Rev. 1 is only required for users who need to configure the excitation frequency and resolution of the resolver.

C++ or ACG SDK	imperix IP version	Minimal Vivado version required	Download
3.4.x.x 3.5.x.x	3.4 Rev. 1	2019.2	sandbox_sources_3.4_3.5.zip
3.6.x.x	3.6 Rev. 1	2019.2	sandbox_sources_3.6.zip
3.7.x.x	3.7 Rev. 1	2021.1	sandbox_sources_3.7.zip
3.8.x.x	3.8 Rev. 1	2022.1	sandbox_sources_3.8.zip
internal only	3.9 Rev. 1 to 3	2022.1	internal only
2024.1	3.9 Rev. 4	2022.1	FPGA_Sandbox_template_3.9rev4.zip ⚠️Read upgrade section below
2024.2	3.9 Rev. 5	2022.1	FPGA_Sandbox_template_3.9rev5.zip

Upgrade procedure

This section describes how to upgrade the imperix IP from an existing Vivado project.

Download the FPGA_Sandbox_template zip file for the targeted SDK version and unzip it.
In the /ix_repo/ directory, replace the interfaces and ips folders with the ones from the freshly downloaded sandbox template.

Copy the new constraint file to /constraints/.
⚠️If modifications were made to the constraint file (e.g. to use USR pins), these modifications need to be carried over to the new constraint file.

Open the Vivado project. Remove the old constraint file and add the new one.

Upgrade the imperix sandbox IP. It may generate warnings, which is expected.

Follow the sections below depending on the used imperix IP version.

Upgrading to 3.9 Rev. 4

When upgrading the imperix IP to version 3.9 Rev. 4 or later, the following changes must be made:

The BBOX interface was added. Make sure to connect it to the top-level interface.

Added BBOX interface

The size of private_in and private_out changed. Make sure to update the top-level ports accordingly.

The private_in and private_out size changed

After these changes, re-generate the top-level HDL wrapper. To make sure Vivado applies the changes properly, we recommend deleting top_wrapper.vhd and re-creating the HDL wrapper.

In SDK 2024.1, a new feature called synchronous averaging has been introduced. ADC channels with synchronous averaging enabled only output a new value once per CLOCK_0 period, regardless of the oversampling ratio set in the CONFIG block. When using an oversampling ratio, we recommend disabling the synchronous averaging option in the ADC blocks.

Upgrading to 3.9 Rev. 5

If upgrading from an older version, also follow the “Upgrading to 3.9 Rev. 4” section above.

The imperix IP version 3.9 Rev. 5 brings the following changes. Additional information is provided after the procedure.

The SBI and SBO interfaces were replaced by the memory-mapped SBIO_BUS.
The provided VHDL module sbio_registers.vhd bridges the SBIO_BUS and the traditional SBI and SBO interfaces that were present in the imperix IP 3.9 Rev. 4 and earlier.
The sbio_interconnect.vhd module was added for convenience.

Upgrade procedure

In /ix_repo/
– replace AXIS_interface.vhd
– add sbio_interconnect.vhd and sbio_registers.vhd

If using the AXI-Stream interface (AXIS_interface.vhd)

In Vivado, click Refresh Changed Modules then reconnect the SBIO_BUS interface.

SBIO_BUS between the imperix IP and the AXI4-Stream interface

If using SBI and SBO registers
The provided VHDL module sbio_registers.vhd makes the bridge between the SBIO_BUS and the traditional SBI and SBO interfaces that were present in the imperix IP 3.9 Rev. 4 and earlier.

sbio_register module that converts SBIO_BUS to SBI and SBO

What’s new in 3.9 Rev. 5 (SDK 2024.2)

The SBI and SBO interfaces were replaced by the memory-mapped SBIO_BUS.
In the future, this bus will allow addressing up to 1024 registers as well as using interconnects for increased flexibility.

Memory-mapped SBIO_BUS signals

The provided VHDL module sbio_registers.vhd bridges the SBIO_BUS and the traditional SBI and SBO interfaces.

sbio_register module that converts SBIO_BUS to SBI and SBO

The sbio_interconnect was added for convenience

The SBIO interconnect increases the number of SBIO_BUS interfaces, allowing multiple SBIO modules to be connected as illustrated below.

sbio_interconnect

The address mapping of the SBIO interconnect is shown below. It divides the SBIO addressable range into 4 smaller areas.

sbio_interconnect memory mapping

As an example, to write to SBO_reg_03 of an sbio_registers block connected to S2_SBIO_BUS, the user has to configure an SBO block to write to register number 512 + 3 = 515.
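
The register number computation can be sketched as a small helper. It assumes that the four areas have equal sizes (1024 / 4 = 256 registers each) and that the interconnect ports are numbered 0 to 3, which is consistent with the 512 offset of S2_SBIO_BUS but should be checked against the memory mapping figure. The function and constant names are hypothetical.

```cpp
#include <stdexcept>

constexpr unsigned kSbioTotalRegisters = 1024;  // addressable range of the bus
constexpr unsigned kSbioPorts = 4;              // areas of the interconnect
constexpr unsigned kSbioAreaSize = kSbioTotalRegisters / kSbioPorts;  // 256 (assumed)

// Register number to use in the SBO/SBI block for a register located behind
// port `port` of the interconnect.
unsigned sbio_register_number(unsigned port, unsigned local_register) {
    if (port >= kSbioPorts || local_register >= kSbioAreaSize) {
        throw std::out_of_range("port or register outside the SBIO range");
    }
    return port * kSbioAreaSize + local_register;
}
```

With these assumptions, sbio_register_number(2, 3) returns 515, as in the example above.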

What’s new in 3.10 (SDK 2024.3 and SDK 2025.1)

In version 3.10, the IP becomes configurable:

The SFP port can be repurposed by the user, to use the Aurora protocol for instance.
Unused modules can be removed to save FPGA resources.

Repurposing the SFP ports

Checking the box “Disable RealSync on SFP” will remove the imperix RealSync logic for the selected SFP and free the GTX transceiver. The SFP can then be repurposed by users to implement their own communication, using the Aurora 8B10B IP for instance.

An example of SFP port repurposing is available on the page Example of FPGA-based Aurora 8B/10B communication.

Settings to disable RealSync on specific ports

Checking the box “Disable RealSync on SFP 0” will reveal the txn, txp, rxn, and rxp ports, as well as the GT interface, which contains the Shared Logic ports.

The image below shows how an Aurora 8B10B IP can be used on the SFP port. Make sure that the “GT Refclk” parameter is set to 250 MHz and that the “include Shared Logic in example design” option is checked. The other parameters can be changed freely.

An example of block design showing how to set up communication with Aurora is available here.

Saving FPGA resources

Some modules can be removed from the design to free FPGA resources. For now, only PWM modulators can be removed, but in the future we intend to allow disabling additional modules (for instance modules only used by the TPI8032) to make even more FPGA resources available to the user.

Settings to remove unused PWM modulators to save resources in the FPGA sandbox

The table below summarizes the resources saved by each option. The percentages indicate the share of the total FPGA resources used by each component. Checking all the boxes frees around 18% of the FPGA resources. With all the boxes ticked, the imperix IP uses around 30% of the resources available in the FPGA.

Resource usage	Slice LUTs	Slice registers
CB-PWM lane 0 to 7	1619 (2.39%)	12165 (1.38%)
CB-PWM lane 8 to 15	1619 (2.39%)	12165 (1.38%)
CB-PWM lane 16 to 23	1619 (2.39%)	12165 (1.38%)
CB-PWM lane 24 to 31	1619 (2.39%)	12165 (1.38%)
SS-PWM	2732 (3.48%)	3305 (2.10%)
PP-PWM	4303 (5.47%)	9162 (5.83%)

FPGA resources saved for each option

FPGA-based decoder for a Delta-Sigma modulator

Shu Wang — Tue, 24 Aug 2021 14:40:38 +0000

This technical note shows how to build a decoder IP for a Delta-Sigma Modulator and establish communication with such a device through the USR ports of the B-Box RCP and B-Board PRO. This approach uses the user-programmable area inside the FPGA, also known as the sandbox.

Introduction

Delta-Sigma Modulators are a class of analog-to-digital converters (ADCs) that produce a high-frequency data stream (1-bit), whose pulse density represents the acquired analog value. In data acquisition applications, such devices are particularly useful when only a few (typ. 1-2) digital lines are available. This may notably be essential when data must be carried across a galvanic isolation barrier, such as in numerous power electronic applications.

Delta-sigma modulation may represent various modulation techniques, resulting in different types of data encoding methods for the digital stream. Common types are Non-Return-to-Zero (NRZ) and Manchester coding. These techniques differ in their bitrates, but may also offer (or not) the possibility of recovering the clock from the bit stream.

Clock recovery is often an essential feature. Indeed, by recovering the data clock directly from the stream itself, a separate clock becomes dispensable, which further reduces the number of required communication lines. In practice, a delta-sigma modulator can communicate with its associated demodulator with only one digital line!

This note provides an implementation example centered around the AMC1035 delta-sigma modulator, which supports two output encoding methods: Non-Return-to-Zero (NRZ) and Manchester coding. The data rate is 9 to 21 MHz with NRZ coding and 9 to 11 MHz with Manchester coding. The next section provides a decoder for each coding method.

Instructions on how to build an FPGA project template and how to exchange data between the CPU and the sandbox using the AXI4-Stream interface module can be found in Getting started with FPGA control development.
The page AXI4-Stream IP from Xilinx presents the AXI4-Stream interface and the Xilinx AXI4-Stream IPs.
Another example of how to expand the number of ADC inputs using SPI and user ports is available in FPGA-based SPI communication IP for ADC.

Software sources

AMC1035 Download

Delta-sigma decoder implementation

Synchronous decoder for NRZ encoding

In synchronous mode, the clock signal is generated by the FPGA, transmitted to the delta-sigma modulator, and used by the latter for the modulation. The same clock is then used for decoding the NRZ bit stream that is received by the FPGA. This approach uses two physical USR ports available from the sandbox and one third-order sinc filter as a decimator.

In general, the suggested system diagram for a synchronous decoder is shown below.

System diagram using a synchronous decoder

The VHDL code is given below.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.std_logic_unsigned.all;
use IEEE.NUMERIC_STD.ALL;

entity amc_driver is
    port(
        -- Configuration from CPU
        AMC_M : in std_logic_vector(15 downto 0); -- Decimation ratio

        -- Input from AMC1035
        DATA_IN : in std_logic; -- Input bit stream
        CLK     : in std_logic; -- AMC clock input

        -- sinc3 filter output using AXI4-Stream (non-blocking)
        M_AXIS_DATA_tdata  : out std_logic_vector(31 downto 0);
        M_AXIS_DATA_tvalid : out std_logic;

        -- Active-low reset
        RESN : in std_logic
    );
end amc_driver;

architecture rtl of amc_driver is
    
    ATTRIBUTE X_INTERFACE_INFO : STRING;
    ATTRIBUTE X_INTERFACE_INFO of CLK: SIGNAL is "xilinx.com:signal:clock:1.0 clk CLK";
	
    -- Divided CLK : f_CNR = f_CLK/M
    signal CNR : std_logic := '0';
    
    -- Intermediate signals of sinc3 filter
    signal DN0, DN1, DN3, DN5 : std_logic_vector(31 downto 0); 
    signal CN1, CN2, CN3, CN4, CN5 : std_logic_vector(31 downto 0); 
    signal DELTA1 : std_logic_vector(31 downto 0);
	
begin

    -- Generate divided CLK 
    P_CNR : process(CLK)
        variable CNR_cnt: unsigned(15 downto 0):=(others=>'0');
    begin
        if rising_edge(CLK) then
           M_AXIS_DATA_tvalid <= '0'; -- Default
           -- Toggle CNR
           if CNR_cnt+1 >= unsigned('0' & AMC_M(15 downto 1)) then
               CNR_cnt := (others => '0');
               CNR <= not CNR;
               if CNR = '0' then
                   M_AXIS_DATA_tvalid <= '1'; -- Data is valid at each CNR rising edge
               end if;
           else
               CNR_cnt := CNR_cnt + 1;
           end if;
        end if;
    end process P_CNR;
    
    -- sinc3 filter input
    process(CLK, RESN) 
     begin 
         if RESN = '0' then 
            DELTA1 <= (others => '0'); 
         elsif CLK'event and CLK = '1' then 
            if DATA_IN = '1' then 
                DELTA1 <= DELTA1 + 1; 
            end if; 
         end if; 
     end process;
     
     -- Integral
     process(RESN, CLK) 
     begin 
         if RESN = '0' then 
            CN1 <= (others => '0'); 
            CN2 <= (others => '0'); 
         elsif CLK'event and CLK = '1' then 
            CN1 <= CN1 + DELTA1; 
            CN2 <= CN2 + CN1; 
         end if; 
     end process;
     
     -- Comb
     process(RESN, CNR) 
     begin 
        if RESN = '0' then 
            DN0 <= (others => '0'); 
            DN1 <= (others => '0'); 
            DN3 <= (others => '0'); 
            DN5 <= (others => '0'); 
        elsif CNR'event and CNR = '1' then 
            DN0 <= CN2; 
            DN1 <= DN0; 
            DN3 <= CN3; 
            DN5 <= CN4; 
        end if; 
     end process;
     
     CN3 <= DN0 - DN1; 
     CN4 <= CN3 - DN3; 
     CN5 <= CN4 - DN5;
     M_AXIS_DATA_tdata <= CN5;

end rtl;

The AMC1035 sends an NRZ (Non-Return-to-Zero) coded bit stream.
The clock is generated by the FPGA and fed synchronously to both the AMC1035 and the decoder block.
The decimation ratio M can be configured from the CPU using an SBO register.
The output data is a 32-bit unsigned integer available on an AXI4-Stream interface.

The sinc³ filter is implemented using a CIC (cascaded integrator-comb) architecture, as shown below. A CIC filter is an efficient implementation of a moving-average filter that is built using adders and registers only. The relationship between the decimation ratio M and the output data width is given in the table below.

Xilinx sinc³ filter implementation (taken from TI document)

Decimation	Data Rate (kHz)	Gain_DC (bits)	Total Output Width (bits)
4	2500	6	7
8	1250	9	10
16	625	12	13
32	312.5	15	16
64	156.2	18	19

Summary of the sinc³ filter for a 10 MHz sampling clock
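
To relate the table to the filter structure, the sketch below is a bit-exact software model of a textbook third-order CIC decimator, together with a helper that computes the output width from the DC gain $M^3$. It is not a cycle-accurate model of the VHDL above (whose integrators are registered), and the function names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Number of bits needed to hold the full-scale output M^3 of a sinc3 filter.
unsigned sinc3_output_bits(unsigned M) {
    std::uint64_t full_scale = static_cast<std::uint64_t>(M) * M * M;
    unsigned bits = 0;
    while (full_scale > 0) { ++bits; full_scale >>= 1; }
    return bits;
}

// Third-order CIC decimator: three integrators running at the bit rate,
// followed by three differentiators (combs) running at the decimated rate.
std::vector<std::uint32_t> sinc3_decimate(const std::vector<bool>& bits, unsigned M) {
    std::uint32_t i1 = 0, i2 = 0, i3 = 0;  // integrator states
    std::uint32_t d1 = 0, d2 = 0, d3 = 0;  // comb delay elements
    std::vector<std::uint32_t> out;
    unsigned count = 0;
    for (bool b : bits) {
        i1 += b ? 1u : 0u;
        i2 += i1;
        i3 += i2;  // unsigned wrap-around is harmless in a CIC filter
        if (++count == M) {
            count = 0;
            std::uint32_t c1 = i3 - d1; d1 = i3;
            std::uint32_t c2 = c1 - d2; d2 = c1;
            std::uint32_t c3 = c2 - d3; d3 = c2;
            out.push_back(c3);
        }
    }
    return out;
}
```

With M = 4, a bit stream made only of ones settles to 64 = $4^3$ after the first two output samples, which requires 7 bits, in agreement with the first row of the table.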

Manchester decoder

In Manchester coding mode, the AMC1035 can be clocked with a local clock, whose phase and frequency are unknown. On the FPGA side, this approach needs only one user port, for receiving the data stream, and one third-order sinc filter as a decimator. In addition, a Manchester decoder is needed to translate Manchester-encoded data to NRZ. This mode has the advantage of requiring fewer user ports, but the drawback of a reduced data rate (9 to 11 MHz).

In general, the suggested system diagram using the Manchester decoder is shown below.

System diagram using Manchester decoder

The VHDL code is given below.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.std_logic_unsigned.all;
use IEEE.NUMERIC_STD.ALL;

entity amc_driver_md is
    port(
        -- Configuration from CPU
        AMC_M : in std_logic_vector(15 downto 0); -- Decimation ratio

        -- Input data from delta-sigma modulator
        DATA_IN : in std_logic; -- AMC DATA input

        -- sinc3 filter output using AXI4-Stream (non-blocking)
        M_AXIS_DATA_tdata  : out std_logic_vector(31 downto 0);
        M_AXIS_DATA_tvalid : out std_logic;

        -- Decoder clock
        CLK : in std_logic;

        -- Active-low reset
        RESN : in std_logic
    );
end amc_driver_md;

architecture rtl of amc_driver_md is

    ATTRIBUTE X_INTERFACE_INFO : STRING;
    ATTRIBUTE X_INTERFACE_INFO of CLK: SIGNAL is "xilinx.com:signal:clock:1.0 clk CLK";
    
    -- Intermediate signals of Manchester decoder
    signal Q0, Q1, Q2, Q3, Q4 : std_logic := '0';
    signal INV_1, INV_2, XOR2, OR2_1, AND2B1, AND2, OR2_2, AND3B2 : std_logic := '0';
    
    -- Manchester decoder output
    signal DATA_MD : std_logic; -- Output data
    signal STROBE : std_logic;  -- Output data valid
    
    -- Divided AMC_CLK : f_CNR = f_CLK/M
    signal CNR_rising_edge : std_logic;
    signal CNR_cnt : unsigned(15 downto 0) := (others=>'0');
     
    -- Intermediate signals of sinc3 filter
    signal DN0, DN1, DN3, DN5 : std_logic_vector(31 downto 0); 
    signal CN1, CN2, CN3, CN4, CN5 : std_logic_vector(31 downto 0); 
    signal DELTA1 : std_logic_vector(31 downto 0);

begin
 
    -- Manchester decoder
    DATA_MD <= INV_2; -- 0: falling 1: rising
    INV_1 <= not Q0;
    INV_2 <= not Q1;
    XOR2 <= Q0 xor INV_2;
    OR2_1 <= XOR2 or Q2;
    AND2B1 <= OR2_1 and (not Q4);
    AND2 <= Q3 and OR2_2;
    OR2_2 <= Q2 or Q4;
    AND3B2 <= (not Q2) and (not Q4) and XOR2;
    
    P_MD : process(CLK)
    begin
        if rising_edge(CLK) then
            Q0 <= DATA_IN;
            Q1 <= INV_1;
            Q2 <= AND2B1;
            Q3 <= Q2;
            Q4 <= AND2;
            STROBE <= AND3B2;
        end if;
    end process P_MD;
    
    -- Generate CNR
     P_CNR : process(CLK, RESN)
     begin
         if RESN = '0' then
             CNR_cnt <= (others=>'0');
         elsif rising_edge(CLK) then
             CNR_rising_edge <= '0';
             if STROBE = '1' then -- Data is valid
                 -- Toggle postscaled_CNR
                 if CNR_cnt+1 >= unsigned(AMC_M) then
                     CNR_rising_edge <= '1';
                     CNR_cnt <= (others => '0');
                 else
                     CNR_cnt <= CNR_cnt + 1;
                 end if;
             end if;
         end if;
     end process P_CNR;
     
     -- sinc3 filter
     P_SINC3 : process(CLK, RESN)
     begin
         if RESN = '0' then 
             DELTA1 <= (others => '0');
             CN1 <= (others => '0'); 
             CN2 <= (others => '0');
             DN0 <= (others => '0'); 
             DN1 <= (others => '0'); 
             DN3 <= (others => '0'); 
             DN5 <= (others => '0');
         elsif rising_edge(CLK) then
             M_AXIS_DATA_tvalid <= '0'; -- Default
             -- Integral
             if STROBE = '1' then -- Data is valid
                 if DATA_MD = '1' then 
                     DELTA1 <= DELTA1 + 1; 
                 end if;
                 CN1 <= CN1 + DELTA1; 
                 CN2 <= CN2 + CN1;
             end if;
             -- Comb
             if CNR_rising_edge = '1' then
                 M_AXIS_DATA_tvalid <= '1';
                 DN0 <= CN2; 
                 DN1 <= DN0; 
                 DN3 <= CN3; 
                 DN5 <= CN4;
             end if;
         end if;
     end process P_SINC3;
     
     CN3 <= DN0 - DN1; 
     CN4 <= CN3 - DN3; 
     CN5 <= CN4 - DN5;
     M_AXIS_DATA_tdata <= CN5;
    
end rtl;

The AMC1035 works in Manchester coding mode.
The clock used by the Manchester decoder can be asynchronous to the input data, but must be between 5 and 12 times (nominally 8 times) faster than the input data rate. For the AMC1035, an 80 MHz clock can simply be used.
The decimation ratio M can be configured by the CPU using an SBO register.
The output data is a 32-bit unsigned integer available on an AXI4-Stream interface.

This FPGA implementation of a Manchester decoder uses two registers and an XOR gate for transition detection, and one divide-by-six Johnson counter that locks up in the 000 state. Once a transition is detected, the STROBE flag is asserted, indicating valid data; the counter then inhibits STROBE for the following 5 clock periods. This procedure ensures that no between-bit transition is detected by mistake.

Manchester decoder circuit (taken from Manchester decoder in 3 CLBs)
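
To make the strobe-and-inhibit mechanism more concrete, the following C++ sketch models it in software. It is only a behavioral illustration, not a translation of the VHDL: the function names are hypothetical, the Johnson counter is reduced to a plain integer, the input is assumed to be sampled 8 times per bit, and the bit convention ('1' is low then high) is the one used by the Manchester testbench of this note.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: Manchester-encode bits at `osr` samples per bit.
// Assumed convention: '1' = low then high, '0' = high then low.
std::vector<int> manchester_encode(const std::vector<int>& bits, int osr = 8)
{
    assert(osr >= 2 && osr % 2 == 0);  // a mid-bit transition must exist
    std::vector<int> samples;
    for (int b : bits) {
        for (int k = 0; k < osr; ++k) {
            samples.push_back(k < osr / 2 ? 1 - b : b);
        }
    }
    return samples;
}

// Hypothetical behavioral model of the decoder: a transition raises the
// strobe, then transitions are ignored for the next 5 samples.
std::vector<int> manchester_decode(const std::vector<int>& samples)
{
    std::vector<int> bits;
    if (samples.empty()) {
        return bits;
    }
    int prev = samples.front();
    int blank = 0;  // remaining samples during which the strobe is inhibited
    for (int cur : samples) {
        const bool transition = (cur != prev);  // XOR of two successive samples
        prev = cur;
        if (blank > 0) {
            --blank;                // between-bit transitions fall in this window
            continue;
        }
        if (transition) {
            bits.push_back(cur);    // level after the mid-bit transition
            blank = 5;
        }
    }
    return bits;
}
```

For instance, `manchester_decode(manchester_encode({1, 0, 0, 1}))` returns the original bits: each mid-bit transition produces one strobe, and the between-bit transition that occurs 4 samples later falls inside the inhibit window.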

The implementation of sinc³ filter uses the previously introduced CIC architecture. However, due to the different data-valid mechanism, the sinc³ filter only accepts input data when STROBE is asserted, whereas in the previous design, it accepts data at each rising edge of the synchronous clock.

Both blocks use the same name, CLK, for their clock input. These two clocks must however not be confused:
For the synchronous decoder block, CLK is the clock used by the AMC1035, running at 9 to 21 MHz.
For the Manchester decoder block, CLK is a dedicated clock for Manchester decoding, running at 80 MHz.
The same name is used for both clocks because the AXI4-Stream interface is used to simplify the connection between blocks, and any clock associated with an AXI4-Stream interface must be named according to Xilinx conventions. Otherwise, the clock cannot be automatically recognized.

Delta-sigma modulator testbench

For each block, a VHDL testbench simulating NRZ/Manchester coded bit streams is provided to validate the behavior of the two decoders.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity AMC_driver_tb is
end entity AMC_driver_tb;

architecture rtl of AMC_driver_tb is

constant MCLK_PERIOD : time := 100 ns; -- DSM CLK
constant MCLK_LOW    : time := MCLK_PERIOD / 2;
constant MCLK_HIGH   : time := MCLK_PERIOD / 2;
signal mclk : std_logic;

signal RESN : std_logic;

signal DATA_IN : std_logic;
signal M : std_logic_vector(15 downto 0) := std_logic_vector(to_unsigned(8, 16)); -- Decimation ratio

component amc_driver is
	port(   
	-- Configuration from CPU
        AMC_M : in std_logic_vector(15 downto 0); -- Decimation ratio
        
	-- Input from AMC1035:
        DATA_IN : in std_logic;  -- Input bit stream
        CLK  : in std_logic; -- AMC clock input
        
        -- sinc3 filter output using AXI4-Stream (Non-blocking)
        M_AXIS_DATA_tdata : out std_logic_vector(31 downto 0);
        M_AXIS_DATA_tvalid : out std_logic;
        
        -- Active low reset
        RESN : in std_logic
	);
end component amc_driver;

-- input NRZ coded '0'
procedure input_0 (signal DATA_IN : inout std_logic) is
begin
    wait until rising_edge(mclk);
    DATA_IN <= '0';
    wait for MCLK_PERIOD;
end procedure input_0;

-- input NRZ coded '1'
procedure input_1 (signal DATA_IN : inout std_logic) is
begin
    wait until rising_edge(mclk);
    DATA_IN <= '1';
    wait for MCLK_PERIOD;
end procedure input_1;

begin
    dut: amc_driver
    port map (
        AMC_M => M,
        DATA_IN => DATA_IN,
        CLK  => mclk,
        M_AXIS_DATA_tdata => open,
        M_AXIS_DATA_tvalid => open,
        RESN => RESN
    );

    -- Reset process
    p_reset : process is
    begin
        wait until falling_edge(mclk);
        RESN <= '0';
        wait for 3*MCLK_PERIOD;
        RESN <= '1';
        wait;
    end process p_reset;
    
    -- MCLOCK process
    p_mclk: process is
    begin
        mclk <= '0';
        wait for 0.4 * MCLK_LOW;
        mclk <= '1';
        WAIT FOR MCLK_HIGH;
        mclk <= '0';
        wait for 0.6 * MCLK_LOW;
    end process p_mclk;
    
    -- Test process
    p_test : process is
    begin
    
        -- input "1001"
        input_1(DATA_IN);
        input_0(DATA_IN);
        input_0(DATA_IN);
        input_1(DATA_IN);
        
--        -- input "1000"
--        input_1(DATA_IN);
--        input_0(DATA_IN);
--        input_0(DATA_IN);
--        input_0(DATA_IN);
        
--        -- input "1110"
--        input_1(DATA_IN);
--        input_1(DATA_IN);
--        input_1(DATA_IN);
--        input_0(DATA_IN);
        
        -- Add more input patterns

    end process p_test;
    
end architecture rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity AMC_driver_md_tb is
end entity AMC_driver_md_tb;

architecture rtl of AMC_driver_md_tb is

constant CLK_PERIOD : time := 12.5 ns; -- Decoder CLK
constant CLK_LOW    : time := CLK_PERIOD / 2;
constant CLK_HIGH   : time := CLK_PERIOD / 2;
signal clk : std_logic;

constant MCLK_PERIOD : time := 100 ns; -- DSM CLK
constant MCLK_LOW    : time := MCLK_PERIOD / 2;
constant MCLK_HIGH   : time := MCLK_PERIOD / 2;
signal mclk : std_logic;

signal RESN : std_logic;

signal DATA_IN : std_logic;
signal M : std_logic_vector(15 downto 0) := std_logic_vector(to_unsigned(8, 16)); -- Decimation ratio

component amc_driver_md is
	port(
	-- Configuration from CPU
        AMC_M: in std_logic_vector(15 downto 0); -- Decimation ratio
        
	-- Input data from delta-sigma modulator
        DATA_IN  : in std_logic;  -- AMC DATA input
        
        -- sinc3 filter output using AXI4-Stream (Non-blocking)
        M_AXIS_DATA_tdata : out std_logic_vector(31 downto 0);
        M_AXIS_DATA_tvalid : out std_logic;
        
        -- Decoder clock
	CLK : in std_logic;
		
	-- Active low reset
	RESN : in std_logic
	);
end component amc_driver_md;

-- input Manchester coded '0'
procedure input_0 (signal DATA_IN : inout std_logic) is
begin
    wait until rising_edge(mclk);
    DATA_IN <= '1';
    wait until falling_edge(mclk);
    DATA_IN <= '0';
end procedure input_0;

-- input Manchester coded '1'
procedure input_1 (signal DATA_IN : inout std_logic) is
begin
    wait until rising_edge(mclk);
    DATA_IN <= '0';
    wait until falling_edge(mclk);
    DATA_IN <= '1';
end procedure input_1;

begin
    dut: amc_driver_md
    port map (
        AMC_M => M,
        DATA_IN => DATA_IN,
        M_AXIS_DATA_tdata => open,
        M_AXIS_DATA_tvalid => open,
		CLK => clk,
		RESN => RESN
    );

    -- Reset process
    p_reset : process is
    begin
        wait until falling_edge(clk);
        RESN <= '0';
        wait for 3*CLK_PERIOD;
        RESN <= '1';
        wait;
    end process p_reset;
    
    -- Decoder CLOCK process
    p_clk: process is
    begin
        clk <= '0';
        wait for CLK_LOW;
        clk <= '1';
        WAIT FOR CLK_HIGH;
    end process p_clk;
    
    -- DSM CLOCK process
    p_mclk: process is
    begin
        mclk <= '0';
        wait for 0.4 * MCLK_LOW;
        mclk <= '1';
        WAIT FOR MCLK_HIGH;
        mclk <= '0';
        wait for 0.6 * MCLK_LOW;
    end process p_mclk;
    
    -- Test process
    p_test : process is
    begin 
        
        -- input "1001"
        input_1(DATA_IN);
        input_0(DATA_IN);
        input_0(DATA_IN);
        input_1(DATA_IN);
        
--        -- input "1000"
--        input_1(DATA_IN);
--        input_0(DATA_IN);
--        input_0(DATA_IN);
--        input_0(DATA_IN);
        
--        -- input "1110"
--        input_1(DATA_IN);
--        input_1(DATA_IN);
--        input_1(DATA_IN);
--        input_0(DATA_IN);
        
        -- Add more input patterns
        
    end process p_test;
    
end architecture rtl;

In this testbench, we chose a decimation ratio of $M = 8$. The full-scale output of the sinc³ filter is then $M^3 = 2^9 = 512$, which requires a 10-bit word. We can select different input patterns, and the sinc³ filter output will be proportional to the number of ‘1’s in the bit stream.

Input pattern = “1001” (50% 1s), output = 256

Synchronous decoder

Manchester decoder

Input pattern = “1000” (25% 1s), output = 128

Synchronous decoder

Manchester decoder
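
The expected values above can be cross-checked with a small software model of the sinc³ (CIC) filter. The C++ sketch below is an illustration under stated assumptions: the function name is hypothetical, the three integrators and three differentiators are written in their textbook form, and the register-level pipeline delays of the VHDL are not reproduced, so only the settled output values are comparable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical software model of a sinc3 decimator fed by a repeating
// bit pattern such as "1001". Returns one output word per M input bits.
std::vector<std::uint32_t> sinc3_model(const std::string& pattern,
                                       std::uint32_t M,
                                       std::size_t n_bits)
{
    assert(M > 0 && !pattern.empty());
    std::uint32_t acc1 = 0, acc2 = 0, acc3 = 0;           // integrators
    std::uint32_t prev_d0 = 0, prev_c1 = 0, prev_c2 = 0;  // comb delays
    std::vector<std::uint32_t> out;

    for (std::size_t n = 0; n < n_bits; ++n) {
        const std::uint32_t bit = (pattern[n % pattern.size()] == '1') ? 1u : 0u;
        // Integrator stages run at the bit-stream rate.
        acc1 += bit;
        acc2 += acc1;
        acc3 += acc2;
        // Comb (differentiator) stages run once every M bits.
        if ((n + 1) % M == 0) {
            const std::uint32_t c1 = acc3 - prev_d0;
            const std::uint32_t c2 = c1 - prev_c1;
            const std::uint32_t c3 = c2 - prev_c2;
            prev_d0 = acc3;
            prev_c1 = c1;
            prev_c2 = c2;
            out.push_back(c3);
        }
    }
    return out;
}
```

With `M = 8` and 64 input bits, the last output word is 256 for the pattern "1001" and 128 for "1000". The first two output words are start-up transients and should be discarded, since the filter needs three decimation periods to settle.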

Deployment on the B-Board PRO

This Vivado project shows how to add the synchronous decoder and/or Manchester decoder block to the B-Board firmware and send output data to the CPU through ix_axis_interface. This project starts from the template introduced in Getting started with FPGA control development, and the following blocks are added to the project.

clk_10m is the 10 MHz clock source for the AMC1035; its output is connected to the physical pin USR[0].
clk_80m is the 80 MHz clock used for Manchester decoding.
amc_driver_0 and amc_driver_md_0 are the synchronous decoder and Manchester decoder blocks; their DATA_IN input is connected to the physical pin USR[1].
Two AXI4-Stream FIFOs handle the asynchronous data transfer between the different clock domains; their outputs are sent to the CPU through FPGA2CPU_00 and FPGA2CPU_01.
The decimation ratio M is sent to the FPGA through SBO_reg_32.
proc_sys_reset_10mhz and proc_sys_reset_80mhz provide active-low resets for the 10 MHz and 80 MHz clock domains; their ext_reset_n input is connected to the nReset_sync port of ix_axis_interface.

This project has both implementations integrated. In a real application, users can choose one approach or the other, according to their needs.

Without additional constraints, Vivado reports a timing failure at reg_M, because the data transfer between the 250 MHz FPGA main clock and the 10 MHz/80 MHz decoder clocks is asynchronous and may lead to a metastable state. However, since in practice there is enough time for the signal to settle to a new stable state, this issue can be safely ignored by setting a longer maximum delay. To do so, a new constraint file must be added to the Vivado project, following the instructions in Getting started with FPGA control development. The constraints below should be sufficient to circumvent this issue.

set_max_delay -from [get_pins {top_i/reg_M/U0/i_synth/i_bb_inst/gen_output_regs.output_regs/i_no_async_controls.output_reg[*]/C}] -to [get_pins {top_i/amc_driver_0/U0/P_CNR.CNR_cnt_reg[*]/R}] 4.0
set_max_delay -from [get_pins {top_i/reg_M/U0/i_synth/i_bb_inst/gen_output_regs.output_regs/i_no_async_controls.output_reg[*]/C}] -to [get_pins top_i/amc_driver_0/U0/CNR_reg/D] 4.0

set_max_delay -from [get_pins {top_i/reg_M/U0/i_synth/i_bb_inst/gen_output_regs.output_regs/i_no_async_controls.output_reg[*]/C}] -to [get_pins {top_i/amc_driver_md_0/U0/CNR_cnt_reg[*]/D}] 4.0
set_max_delay -from [get_pins {top_i/reg_M/U0/i_synth/i_bb_inst/gen_output_regs.output_regs/i_no_async_controls.output_reg[*]/C}] -to [get_pins top_i/amc_driver_md_0/U0/CNR_rising_edge_reg/D] 4.0

On the CPU side, a Simulink file is provided, which reads the sinc³ filter output and converts the uint32 data to volts. According to its datasheet, the full-scale input range of the AMC1035 is ±1.25 V, and the sinc³ filter with decimation ratio $M$ produces an output word of $1+3\log_{2}{M}$ bits. Based on this, the input voltage can be recovered from the raw data.
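
As a rough illustration of this conversion, the C++ sketch below maps a raw filter output to volts. It is not the provided Simulink model: the function name is hypothetical, and it assumes an ideal modulator whose density of '1's goes linearly from 0 at -1.25 V to 1 at +1.25 V, with a filter full-scale output of $M^3$. Offset and gain errors are ignored.

```cpp
#include <cassert>
#include <cstdint>

// Full-scale input of the modulator, as stated in the text (+/-1.25 V).
constexpr double kFullScaleVolts = 1.25;

// Hypothetical helper: convert a raw sinc3 output word to volts.
// Assumption: ones density = raw / M^3, and density 0.5 corresponds to 0 V.
double sinc3_raw_to_volts(std::uint32_t raw, std::uint32_t M)
{
    assert(M > 0);
    const double full_scale_code = static_cast<double>(M) * M * M;  // M^3
    const double ones_density = raw / full_scale_code;              // 0.0 .. 1.0
    return (2.0 * ones_density - 1.0) * kFullScaleVolts;
}
```

With `M = 8`, a raw value of 256 (50% of '1's) maps to 0 V and a raw value of 128 (25% of '1's) maps to -0.625 V.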

Experimental results

The following hardware was used:

B-Board evaluation kit
AMC1035 evaluation module
Function generator

A 50 Hz sine wave with a 2 V peak-to-peak amplitude is connected to both ADC 0 of the B-Board and the input port of the AMC1035, and the results are plotted in BB Control. The three signals overlap, showing that both the synchronous decoder and the Manchester decoder blocks for the delta-sigma modulator work properly.


FPGA-based SPI communication IP for ADC

Benoît Steinmann — Fri, 02 Apr 2021 12:28:46 +0000

This technical note shows how an SPI communication link can be established between an FPGA and an external Analog-to-Digital Converter (ADC). The development setup will consist of an imperix B-Board PRO evaluation kit and an LTC2314 demonstration circuit. The LTC2314 ADC driver will be developed using VHDL integrated into the user-programmable area (the sandbox) of the FPGA thanks to the FPGA customization feature of the imperix controllers. Three of the 36 user-configurable 3.3V I/Os of the B-Board will be used for the SPI communication with the ADC.

This note provides a VHDL implementation of the FPGA ADC driver. However, automated HDL code generation tools such as MATLAB HDL Coder or Xilinx System Generator can be used to create FPGA peripherals as shown on the custom FPGA PWM page.

To find all FPGA-related notes, you can visit FPGA development homepage.

Information on how to set up the toolchain for the FPGA programming is available on the Vivado Design Suite installation page.

Quick-start information on how to use the sandbox is provided on the getting started with FPGA page.

Software resources

The FPGA ADC driver resources can be downloaded by clicking on the button below. It contains the VHDL driver LT2314_driver.vhd, its associated testbench LT2314_tb.vhd, as well as the C++ drivers implemented using the C++ SDK.

Click to download TN130_LTC2314_ADC_FPGA_driver.zip

FPGA ADC implementation

This example implements a full-custom FPGA ADC SPI driver for the LTC2314-14 serial sampling ADC with the following settings:

It uses the LTC2314 SCK continuous mode (see next figure)
The SCK frequency is configurable using a postscaler (postscaler_in)
The conversion is started upon the assertion of sampling_pulse


LTC2314-14 Serial Interface Timing Diagram in SCK Continuous Mode (source LTC2314 datasheet)

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
 
entity LT2314_driver is
port(
    -- CLOCKS:
    clk_250: in std_logic; -- 250 MHz clock
    sampling_pulse: in std_logic; -- sampling strobe
 
    -- CONFIGURATION:
    -- spi_sck = clk_250 / (postscaler_in*2)
    postscaler_in: in std_logic_vector(15 downto 0);
 
    -- OUTPUT DATA:
    data_out: out std_logic_vector(15 downto 0) := (others => '0');
 
    -- SPI SIGNALS:
    spi_sck: out std_logic; -- communication clock
    spi_cs_n: out std_logic; -- chip select strobe / sampling trigger
    spi_din: in std_logic -- serial data in
);
end LT2314_driver;
 
architecture impl of LT2314_driver is
 
    TYPE states is (ACQ,CONV);
 
    SIGNAL state : states := ACQ; -- FSM state register
 
    -- Signal used as SPI communication clock
    -- spi_sck = postscaled_clk = clk_250 / (postscaler_in*2)
    SIGNAL postscaled_clk : std_logic := '0';
 
    -- Indicates a rising edge on postscaled_clk
    SIGNAL postscaled_clk_rising_pulse : std_logic := '0';
 
    -- Asserted when sampling_pulse = '1'
    -- Cleared when postscaled_clk_rising_pulse = '1'
    SIGNAL pulse_detected : std_logic := '0';
begin
 
    spi_sck <= postscaled_clk;
    spi_cs_n <= '1' when state=ACQ else '0';
 
    -- Generate postscaled_clk and postscaled_clk_rising_pulse
    POSTSCALER: process(clk_250)
        variable postscaler_cnt: unsigned(15 downto 0):=(others=>'0');
    begin
        if rising_edge(clk_250) then
            postscaled_clk_rising_pulse <= '0';
 
            -- Toggle postscaled_clk
            -- Assert postscaled_clk_rising_pulse if rising edge
            if postscaler_cnt+1 >= unsigned(postscaler_in) then
                if postscaled_clk = '0' then
                    postscaled_clk_rising_pulse <= '1';
                end if;
                postscaler_cnt := (others => '0');
                postscaled_clk <= not postscaled_clk;
            else
                postscaler_cnt := postscaler_cnt + 1;
            end if;
        end if;
    end process POSTSCALER;
 
    -- Generate pulse_detected
    SAMPLING: process(clk_250)
    begin
        if rising_edge(clk_250) then
            if sampling_pulse = '1' then
                pulse_detected <= '1';
            elsif postscaled_clk_rising_pulse = '1' then
                pulse_detected <= '0';
            end if;
        end if;
    end process SAMPLING;
 
    -- Finite State Machine
    -- Runs at SPI clock speed (using postscaled_clk_rising_pulse)
    FSM : process(clk_250)
        variable bit_cnt : unsigned(4 downto 0) := (others=>'0'); -- bit counter
    begin
        if rising_edge(clk_250) and postscaled_clk_rising_pulse = '1' then
            case state is
 
                when ACQ =>
                    bit_cnt := (others => '0');
                    if pulse_detected = '1' then
                        state <= CONV;
                    end if;
 
                when CONV =>
                    bit_cnt := bit_cnt + 1;
                    if bit_cnt >= 16 then
                        state <= ACQ;
                    end if;
 
                when others => null;
            end case;
        end if;
    end process FSM;
 
    -- Sample spi_din on each spi_sck rising edge while in the CONV state
    SHIFT_REG: process (clk_250)
        variable data_reg: std_logic_vector(15 downto 0):=(others=>'0');
    begin
        if rising_edge(clk_250) then
            if state = CONV and postscaled_clk_rising_pulse = '1' then
                data_reg := data_reg(14 downto 0) & spi_din;
            elsif state = ACQ then
                data_out <= "0" & data_reg(15 downto 1); -- re-align data
            end if;
        end if;
    end process SHIFT_REG;
end impl;
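
The timing implied by the driver above can be worked out on the CPU side with a few lines of C++. This is only a sketch: the helper names are hypothetical, the SCK formula is the one given in the VHDL comments (spi_sck = clk_250 / (postscaler_in*2)), and the frame length assumes that the CONV state lasts 16 SCK periods, as suggested by the bit counter of the FSM.

```cpp
#include <cassert>
#include <cstdint>

// Main FPGA clock frequency feeding the driver (clk_250).
constexpr double kClkHz = 250e6;

// Hypothetical helper: SCK frequency for a given postscaler_in value.
double spi_sck_hz(std::uint16_t postscaler)
{
    assert(postscaler > 0);  // the formula is undefined for zero
    return kClkHz / (2.0 * postscaler);
}

// Hypothetical helper: duration of one 16-bit SPI frame, in nanoseconds.
double spi_frame_ns(std::uint16_t postscaler)
{
    return 16.0 * 1e9 / spi_sck_hz(postscaler);
}
```

For `postscaler_in = 2`, this gives an SCK frequency of 62.5 MHz and a frame duration of 256 ns, which is the configuration used in the testbench and in the C++ test code of this note.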

FPGA ADC testbench

A VHDL testbench modeling the LTC2314 behavior has been written in order to validate the FPGA ADC driver behavior.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
 
entity LT2314_tb is end;
 
architecture bench of LT2314_tb is
     
    -- number of blank bits provided by the ADC
    constant NBLANKBITS : positive := 1;
     
    -- SCK = CLK_250_MHZ / (POSTSCALER*2) = 62.5 MHz
    constant SCK_POSTSCALER : std_logic_vector := "0000000000000010";
     
    -- main clock period
    constant CLK_PERIOD : time := 4.0 ns; -- 250 MHz
     
    -- simulated data sample produced by the ADC
    signal rawdata : unsigned(13 downto 0) := (others=>'0');
     
    -- clock signals
    signal clk_250, sampling_pulse : std_logic := '0';
     
    -- SPI signals
    signal SPI_DIN, SPI_nCS, SPI_CLK : std_logic := '0';
     
begin
         
    primary_clock: clk_250 <= not clk_250 after CLK_PERIOD / 2;
         
    --------------------------------------------------------------------------------
    -- DEVICE UNDER TEST
    --------------------------------------------------------------------------------
         
    DUT: entity work.LT2314_driver
    port map(
        clk_250 => clk_250,
        sampling_pulse => sampling_pulse,
        postscaler_in => SCK_POSTSCALER,
        spi_sck => SPI_CLK,
        spi_cs_n => SPI_nCS,
        spi_din => SPI_DIN,
        data_out => open);
         
    --------------------------------------------------------------------------------
    -- ANALOG-TO-DIGITAL CONVERTER MODEL
    --------------------------------------------------------------------------------
         
    DATA_SAMPLE: process
    begin
        wait for CLK_PERIOD*100;
             
        rawdata <= to_unsigned(12345,14);
        sampling_pulse <= '1';
        wait for CLK_PERIOD;
        sampling_pulse <= '0';
             
        wait for CLK_PERIOD*100;
             
        rawdata <= to_unsigned(5782,14);
        sampling_pulse <= '1';
        wait for CLK_PERIOD;
        sampling_pulse <= '0';
             
        wait for CLK_PERIOD*100;
             
        rawdata <= to_unsigned(777,14);
        sampling_pulse <= '1';
        wait for CLK_PERIOD;
        sampling_pulse <= '0';
             
    end process DATA_SAMPLE;
         
    SPI_TARGET: process(SPI_nCS,SPI_CLK,SPI_DIN)
    variable counter : integer := 0;
    begin
        if SPI_nCS='1' then
            SPI_DIN <= 'Z';
            counter := 13 + NBLANKBITS;
        elsif SPI_nCS='0' and falling_edge(SPI_CLK) then
            if (counter > 13 or counter < 0) then
                SPI_DIN <= '0';
            else
                SPI_DIN <= std_logic(rawdata(counter));
            end if;
            counter := counter - 1;
        end if;
    end process SPI_TARGET;
         
end architecture bench;

[Image: sandbox_spi_sim.png]
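
The "re-align data" step of the driver can be illustrated with a short C++ model of the shift register. This is a sketch with hypothetical function names, and it assumes the frame layout produced by the testbench model above: one blank bit, the 14 data bits MSB first, then one trailing zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: shift a 16-bit frame into a register, MSB first.
// Assumed layout: bit 15 = blank bit, bits 14..1 = D13..D0, bit 0 = zero.
std::uint16_t shift_in_frame(std::uint16_t sample)
{
    assert(sample < (1u << 14));  // the ADC model produces 14-bit samples
    std::uint16_t reg = 0;
    for (int i = 15; i >= 0; --i) {
        const unsigned bit =
            (i >= 1 && i <= 14) ? ((sample >> (i - 1)) & 1u) : 0u;
        reg = static_cast<std::uint16_t>((reg << 1) | bit);
    }
    return reg;
}

// Equivalent of data_out <= "0" & data_reg(15 downto 1) in the driver.
std::uint16_t realign(std::uint16_t frame)
{
    return static_cast<std::uint16_t>(frame >> 1);
}
```

Under these assumptions, `realign(shift_in_frame(12345))` returns 12345, which is the first sample value generated by the testbench.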

Deployment on the B-Board PRO FPGA

To learn how to add a VHDL module into the B-Board FPGA firmware using Xilinx Vivado, please read the getting started with FPGA page. The ADC SPI driver is interfaced as follows:

spi_sck is connected to the physical pin USR[0]
spi_cs_n is connected to the physical pin USR[1]
spi_din is connected to the physical pin USR[2]
postscaler_in is connected to SBO_reg_00 (configuration register)
data_out is connected to SBI_reg_00 (real-time register)

From SDK version 2024.2, ports SBI and SBO on the imperix IP are replaced by the SBIO_BUS. The sbio_register block must be used to access the SBI and SBO registers. More information about SBIO_BUS can be found on the Getting Started with FPGA Control Development page.

Furthermore, the signals spi_sck, spi_cs_n, spi_din, data_out and sampling_pulse are also connected to an Integrated Logic Analyzer (ILA), allowing them to be observed during run-time.


Interfacing of the ADC driver in the B-Board FPGA

Using the imperix 3.3V USR pins

The SPI signals (SCK, nCS, and MISO) of the ADC driver are connected to 3 of the 36 user-configurable 3.3V I/Os of the B-Board (usr_0, usr_1, and usr_2). The physical pin constraint file sandbox_pins.xdc must be edited by the user to match the external port names.

From version 3.7, a USR interface is present in the imperix firmware IP. This port must be disconnected to use USR pins for other applications. Imperix only uses USR for communication with the motor interface.

Experimental results

The following hardware was used:

B-Board evaluation kit
LTC2314 demonstration circuit
Xilinx JTAG Platform Cable USB II
DSLogic Plus logic analyzer


The following C++ code has been used to test the LT2314 driver.

#define ADC_GAIN (4.096/8192.0)
 
int adc_raw;
float Vmeas;
 
tUserSafe UserInit(void)
{
  Clock_SetFrequency(CLOCK_0, 20e3);
  ConfigureMainInterrupt(UserInterrupt, CLOCK_0, 0.5);
 
  Sbi_ConfigureAsRealTime(0); // SBI_reg_00 contains the ADC value (LT2314_driver data_out)
  Sbo_WriteDirectly(0, 2);    // SBO_reg_00 is the clk postscaler (LT2314_driver postscaler_in)
                              // postscaler = 2 -> SCK = 62.5 MHz
  return SAFE;
}
 
tUserSafe UserInterrupt(void)
{
  adc_raw = Sbi_Read(0); // read SBI_reg_00
  Vmeas = adc_raw * ADC_GAIN; // convert to Volts
 
  return SAFE;
}

The external SPI signals can be observed using a physical logic analyzer such as the DSLogic Plus:

[Image: sandbox_spi_dslogic.png]

In addition, the Xilinx Integrated Logic Analyzer (ILA) makes it possible to observe the internal signals:

[Image: sandbox_spi_ILA.png]

Finally, the end result can be plotted in the Cockpit monitoring software, attesting that the SPI module works correctly.

