Intel® FPGAs for OpenCL™ - Support Center

Welcome to the OpenCL™ BSP support page! Here you will find information on how to plan, design, and implement your OpenCL™ BSP, as well as learn a few tips and tricks for debugging purposes.

This page walks you through the process of developing an OpenCL™ Board Support Package (BSP), or of designing or migrating OpenCL kernels and algorithms, from start to finish. The Modify a Reference Design section provides resources on how to modify the Intel® reference platform into your own custom platform and how to compile flat designs without timing failures. The Floor Planning section provides guidance on how to partition your design to achieve the maximum operating frequency. The Timing Closure section describes techniques to close timing on your design so that timing closure is guaranteed when any kernel is compiled against the BSP. The Testing the Hardware section provides steps to test your design on your board and verify the results.

The Debug section provides you with some tools and resources for debugging issues you might encounter. There are documents and training courses listed in all the sections that are helpful during the BSP development process.

Intel® FPGA SDK for OpenCL™ enables software developers to accelerate their applications by targeting heterogeneous platforms with Intel® CPUs and FPGAs. You can also download the Intel® FPGA SDK for OpenCL™ separately from the Quartus® software.

To get started with the BSP development, ensure that you perform the following steps:

  1. Confirm that the Intel® FPGA SDK for OpenCL™ and the Intel® Quartus® software are installed.
  2. Verify that the tool version that matches the OpenCL™ reference BSP is available.
  3. Confirm access to the full Intel® Quartus® software license.
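The first step of the checklist above can be partly scripted. The following is a minimal sketch, assuming that the SDK and Quartus installations expose the `aoc`, `aocl`, and `quartus_sh` executables on your PATH; it does not check tool versions or licenses, which still need to be confirmed by hand.

```python
import os
import shutil

# Executable names assumed to be on PATH after a standard installation.
REQUIRED_TOOLS = ("aoc", "aocl", "quartus_sh")


def check_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: which(tool) for tool in tools}


def missing_tools(found):
    """Return the names of the tools that could not be located."""
    return [tool for tool, path in found.items() if path is None]


if __name__ == "__main__":
    found = check_tools()
    for tool, path in found.items():
        print(f"{tool:12s} {path or 'NOT FOUND'}")
    # Assumption: recent SDK releases use INTELFPGAOCLSDKROOT as the SDK root
    # variable; check the name used by your installed release.
    print("INTELFPGAOCLSDKROOT =", os.environ.get("INTELFPGAOCLSDKROOT", "<unset>"))
```

The `which` parameter exists only so that the lookup can be replaced in a test; in normal use the default is sufficient.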

Select the reference design that suits your custom platform

Intel supports the OpenCL™ reference BSP designs for the following platforms. You can also view the OpenCL™ BSP porting guide for a specific platform:

Start modifying the reference design for your platform by following the steps in the OpenCL™ BSP porting guides. After the design changes are complete, try compiling your first kernel. The kernel generally used for this purpose is Boardtest, which exercises the different interfaces of the BSP. Information on Boardtest and the generic BSP build steps is available in the following guide:

Intel® FPGA SDK for OpenCL™ Custom Platform Toolkit User Guide (PDF)

Recommended steps to build a BSP:

  1. Compile the Boardtest in "flat" flow to generate a timing closed ".aocx" file
  2. Validate the ".aocx" file by running the Boardtest and cross-checking the interface bandwidth expectations from the test
  3. Start working on floor-planning for "base" build to create a guaranteed timing-closed OpenCL BSP
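As an illustration of step 1, the sketch below only builds the compile command line so that you can review it before running it. The board name `my_custom_board` is hypothetical, and the `-bsp-flow=<flow>` spelling is an assumption: the option syntax has changed between SDK releases, so confirm it against the Custom Platform Toolkit User Guide for your version.

```python
import shlex


def boardtest_compile_cmd(board, flow="flat", kernel="boardtest.cl",
                          output="boardtest.aocx"):
    """Build (but do not run) an aoc command line for the given BSP flow."""
    if flow not in ("flat", "base"):
        raise ValueError(f"unexpected BSP flow: {flow!r}")
    return [
        "aoc",
        f"-board={board}",     # board name defined by your Custom Platform
        f"-bsp-flow={flow}",   # assumption: flag spelling varies by SDK release
        kernel,
        "-o", output,
    ]


if __name__ == "__main__":
    # Print the command for review; "my_custom_board" is a placeholder.
    print(shlex.join(boardtest_compile_cmd("my_custom_board")))
```

Passing `flow="base"` produces the corresponding command for the base revision once floor planning is under way.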

In OpenCL, you need to close timing for two different revisions of the project: the flat revision and the base revision. The flat revision has no partitions or Logic Lock regions and is implemented using the hardware/flat.qsf file. The base revision includes the partitioning and the Logic Lock regions and is implemented using the hardware/base.qsf file. We recommend that you first obtain a timing-clean flat revision as a starting point and then work on floor planning to obtain a timing-clean base revision of the design.

For more details on the compilation flow, refer to the OpenCL™ BSP Compilation Flow section in the Intel® FPGA SDK for OpenCL™ Board Support Package Floorplan Optimization Guide.

Begin with a flat compilation to understand where the main components of the BSP are placed naturally (especially the intellectual property (IP) blocks with I/O connections, such as PCIe* or DDR memory).

For more guidelines on this, refer to the Guidelines for OpenCL™ BSP Floorplanning section in the Intel® FPGA SDK for OpenCL™ Board Support Package Floorplan Optimization Guide.

For details, you can also refer to the Design Planning for Partial Reconfiguration chapter in volume 1 of the Intel® Quartus® Prime Standard Edition Handbook.

During base compilation, start with the Logic Lock region on the kernel, which contains freeze_wrapper_inst|kernel_system_inst. Use the flat compilation results and the Chip Planner to identify the size and location of the BSP hardware. Try to reserve as many resources as possible for the kernel_system by using the Logic Lock region.

For more guidelines on this, refer to the Guidelines for OpenCL™ BSP Floorplanning section in the Intel® FPGA SDK for OpenCL™ Board Support Package Floorplan Optimization Guide.

To fix timing violations in the design, you might need to add pipeline stages in between IP cores.

For more guidelines, refer to the following links:

The .failing_paths.rpt and .failing_clocks.rpt files in the output directory list the major failures in the design. If some paths fail consistently, you might want to set a minimum or maximum delay constraint for that critical path inside the /hardware/top.sdc file.
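To make these reports easier to locate after each compile, a small helper such as the following can be used. This is a minimal sketch that assumes only the file name suffixes mentioned above; it makes no assumption about the report contents, and the directory you pass in is whatever output directory your compile produced.

```python
from pathlib import Path

# File name suffixes of the timing failure reports named in the text.
REPORT_SUFFIXES = (".failing_paths.rpt", ".failing_clocks.rpt")


def find_failure_reports(output_dir):
    """Return every timing failure report under output_dir, sorted by path."""
    root = Path(output_dir)
    hits = []
    for suffix in REPORT_SUFFIXES:
        # Search subdirectories too, since the report location may vary.
        hits.extend(p for p in root.rglob(f"*{suffix}") if p.is_file())
    return sorted(hits)


def summarize(reports):
    """Return one 'path (size bytes)' line per report for quick review."""
    return [f"{path} ({path.stat().st_size} bytes)" for path in reports]
```

Comparing the listed reports across several compiles helps to identify which paths fail consistently and are therefore candidates for an explicit delay constraint.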

For related issues, refer to the workaround described on the following Knowledge Base page: How to close timing on competing hold and setup violations in Arria 10?

The MMD software library implements basic input/output (I/O) between the host and the acceleration board and provides interfaces such as open, read, and write. The MMD library driver is provided in Windows* 64-bit and Linux* 64-bit formats, and the source code is stored in the source folder.

For more information, refer to the Creating the MMD Library section in the Intel® FPGA SDK for OpenCL™ Custom Platform Toolkit User Guide.

OpenCL™ utilities allow you to perform board access using the Intel® FPGA SDK for OpenCL™. These utilities include aocl install, aocl uninstall, aocl diagnose, aocl program, and aocl flash.
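When bringing up a board, it can be convenient to drive these utilities from a script and capture their output. The following is a minimal sketch, assuming that `aocl` is on your PATH; the device name `acl0` and the file name `boardtest.aocx` in the usage note are examples only, and the arguments each utility accepts depend on your Custom Platform.

```python
import subprocess

# Utilities named in the text above.
AOCL_UTILITIES = ("install", "uninstall", "diagnose", "program", "flash")


def aocl_cmd(utility, *args):
    """Build an aocl command line for one of the board access utilities."""
    if utility not in AOCL_UTILITIES:
        raise ValueError(f"not a board access utility: {utility!r}")
    return ["aocl", utility, *args]


def run_aocl(utility, *args):
    """Run the utility and return (exit_code, combined stdout and stderr)."""
    proc = subprocess.run(aocl_cmd(utility, *args),
                          capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr
```

For example, `run_aocl("diagnose")` returns the exit code and text output of the diagnose utility, and `aocl_cmd("program", "acl0", "boardtest.aocx")` builds a programming command for review without running it.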

For more information, refer to the Providing Intel® FPGA SDK for OpenCL™ Utilities Support section in the Intel® FPGA SDK for OpenCL™ Custom Platform Toolkit User Guide. 

After you create the software utilities and the MMD layer, you need to test the hardware design. The standard way is to generate the Boardtest kernel and run it on the board.

For more information, refer to the Testing the Hardware Design section in the Intel® FPGA SDK for OpenCL™ Custom Platform Toolkit User Guide. 

This section helps you to troubleshoot issues while bringing up either Intel® FPGA development kits or your own custom boards. 

To find out some known issues that you might face while bringing up your boards, refer to the following sections in AN 807: Configuring the Intel® Arria® 10 GX FPGA Development Kit for the Intel® FPGA SDK for OpenCL™ Application Note:

For tips and tricks on using minimal area for the static logic and leaving more space for your OpenCL™ kernel, you can refer to the AN 824: Intel® FPGA SDK for OpenCL™ Board Support Package Floorplan Optimization Guide.

There are certain environment variables that can be set to get more debug information while running the host application. These are Intel® FPGA SDK for OpenCL™ specific environment variables, which can help diagnose problems with custom platform designs. 

The following table lists these environment variables and describes them in detail.

Environment Variable | Description
ACL_HAL_DEBUG | Set this variable to a value of 1 to 5 to increase debug output from the Hardware Abstraction Layer (HAL), which interfaces directly with the MMD layer.
ACL_PCIE_DEBUG | Set this variable to a value of 1 to 10000 to increase debug output from the MMD. This variable setting is useful for confirming that the version ID register was read correctly and the UniPHY IP cores are calibrated.
ACL_PCIE_JTAG_CABLE | Set this variable to override the default quartus_pgm argument that specifies the cable number. The default is cable 1. If there are multiple Intel® FPGA Download Cables, you can specify a particular cable by setting this variable.
ACL_PCIE_JTAG_DEVICE_INDEX | Set this variable to override the default quartus_pgm argument that specifies the FPGA device index. By default, this variable has a value of 1. If the FPGA is not the first device in the JTAG chain, you can customize the value.
ACL_PCIE_USE_JTAG_PROGRAMMING | Set this variable to force the MMD to reprogram the FPGA using the JTAG cable instead of Partial Reconfiguration.
ACL_PCIE_DMA_USE_MSI | Set this variable if you want to use MSI for direct memory access (DMA) transfers on Windows*.
CL_CONTEXT_COMPILER_MODE_INTELFPGA | Leave this variable unset or set it to a value of 3. When the variable is unset, the OpenCL™ host runtime reprograms the FPGA as needed, which it does at least once during initialization. To prevent the host application from programming the FPGA, set this variable to a value of 3.
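The debug verbosity variables can be set per run instead of in your shell profile. The following is a minimal sketch that builds an environment for a host application and checks the requested levels against the ranges listed in the table; the host executable name `./host` in the usage note is hypothetical.

```python
import os

# Value ranges taken from the table above.
DEBUG_RANGES = {
    "ACL_HAL_DEBUG": (1, 5),
    "ACL_PCIE_DEBUG": (1, 10000),
}


def debug_env(base=None, **levels):
    """Return a copy of the environment with the requested debug levels set."""
    env = dict(os.environ if base is None else base)
    for name, level in levels.items():
        if name not in DEBUG_RANGES:
            raise KeyError(f"no documented range for {name}")
        low, high = DEBUG_RANGES[name]
        if not low <= level <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
        env[name] = str(level)  # environment values must be strings
    return env
```

A host application could then be started with `subprocess.run(["./host"], env=debug_env(ACL_HAL_DEBUG=2))`, which leaves the environment of the calling shell unchanged.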

Because OpenCL™ designs do not support simulation, the Signal Tap Logic Analyzer is the best way to debug these designs.

Use the Signal Tap Logic Analyzer to debug any design that has a kernel hang, an issue related to the memory interface, or an aocl diagnose failure.

To learn more about the Signal Tap Logic Analyzer, refer to the Design Debugging with the Signal Tap Logic Analyzer section in volume 3 of the Intel® Quartus® Prime Pro Edition Handbook.

Perform the following steps to add the Signal Tap file into the BSP design:

  1. Open the Signal Tap GUI and add all the signals to be analyzed.
  2. Save the STP file in the same directory as the Intel® Quartus® software project file.
  3. Add the following assignments to your flat.qsf file:
    • set_global_assignment -name ENABLE_SIGNALTAP ON
    • set_global_assignment -name USE_SIGNALTAP_FILE <file_name>.stp
    • set_global_assignment -name SIGNALTAP_FILE <file_name>.stp
  4. Recompile the kernel from the AOCL command line.
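Step 3 can be scripted so that the assignments are not duplicated when the script is run more than once. The following is a minimal sketch that appends the three assignments listed above to a .qsf file; the file name `debug.stp` in the usage note is hypothetical, and the sketch only detects assignments that already appear verbatim on their own line.

```python
from pathlib import Path


def signaltap_assignments(stp_name):
    """Return the three Signal Tap assignments for the given STP file name."""
    return [
        "set_global_assignment -name ENABLE_SIGNALTAP ON",
        f"set_global_assignment -name USE_SIGNALTAP_FILE {stp_name}",
        f"set_global_assignment -name SIGNALTAP_FILE {stp_name}",
    ]


def add_signaltap(qsf_path, stp_name):
    """Append any missing assignments to the QSF; return the lines added."""
    qsf = Path(qsf_path)
    existing = {line.strip() for line in qsf.read_text().splitlines()}
    added = [line for line in signaltap_assignments(stp_name)
             if line not in existing]
    if added:
        with qsf.open("a") as handle:
            # The leading newline protects against a file that does not
            # end with a line break.
            handle.write("\n" + "\n".join(added) + "\n")
    return added
```

Calling `add_signaltap("hardware/flat.qsf", "debug.stp")` a second time returns an empty list and leaves the file unchanged.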


OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.