Stratix II FPGA Design Building Block Performance

Design building blocks are a set of simple circuits that perform basic logic or arithmetic operations. These building blocks are commonly found in larger, more complex designs.

To highlight the performance and logic efficiency of the Stratix® II FPGA family, twelve examples are benchmarked below. Figures 1 and 2 compare the performance and logic utilization, respectively, of Stratix and Stratix II devices across the twelve building block benchmarks.

A detailed analysis of the Stratix II architectural advancements and benefits is available in the Stratix II Device Performance & Logic Efficiency Analysis (PDF) white paper.

Figure 1. Stratix II vs. Stratix Devices—Design Building Block Performance Comparison

Notes to Figure 1:
1. LUT = Look-up table
2. DES = Data encryption standard
3. FSM = Finite state machine
4. DSP = Digital signal processing

Figure 2. Stratix II vs. Stratix Devices—Logic Utilization Comparison

In addition to the building block benchmark results, the design source code (in Verilog only) is also available for download. Test drive these designs yourself and experience the high performance of Stratix II FPGAs.

Adder Tree

An adder tree is often used in correlators within channel cards in 3G wireless basestations. Adder trees are usually implemented in a binary-tree structure—with two bits added together in each summation stage. Stratix II devices offer a tremendous performance boost and logic resource reduction by allowing the summation of three bits in a single entity. The benchmark results for a 128-number, 16-bit per number adder tree are shown in Table 1.

Table 1. 128-Number, 16-bit Adder Tree Performance & Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
Pipelined Adder Tree | 371 | 314 | 1,559 | 2,279 | Download | Download
Non-Pipelined Adder Tree | 109 | 64 | 1,209 | 2,279 | |
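
As a rough illustration of why wider adders shorten the tree, the following Python sketch models the summation stage by stage. It is a software model only, not the benchmarked Verilog design. The function name `adder_tree` and the `arity` parameter are hypothetical, and `arity=3` is an assumed stand-in for the three-input summation described above.

```python
def adder_tree(values, arity):
    """Sum `values` in stages, adding up to `arity` operands per adder.

    Returns (total, number_of_stages). This is a behavioral model only.
    """
    if arity < 2:
        raise ValueError("arity must be at least 2")
    level = list(values)
    stages = 0
    while len(level) > 1:
        # Each adder in this stage consumes up to `arity` operands.
        level = [sum(level[i:i + arity]) for i in range(0, len(level), arity)]
        stages += 1
    total = level[0] if level else 0
    return total, stages


# 128 sample numbers, each small enough to fit in 16 bits (assumed test data).
inputs = list(range(128))
binary_total, binary_stages = adder_tree(inputs, 2)
ternary_total, ternary_stages = adder_tree(inputs, 3)
```

For 128 inputs the model needs seven stages with two-input adders and five with three-input adders. Stage count is only a proxy for logic depth and does not by itself predict the fMAX figures in Table 1.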

Barrel Shifter

A barrel shifter rotates the input data bits by the amount specified by the input control signals. The direction (up or down) of the rotation is controlled by a separate input control signal. A barrel shifter is built from multiplexers, which can be implemented efficiently by the wide-input LUTs native to Stratix II devices. Table 2 shows the benchmark results for a 32-bit barrel shifter.

Table 2. 32-bit Barrel Shifter Performance & Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
Barrel Shifter | 322 | 177 | 219 | 284 | Download | Download
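
The rotation behavior described above can be sketched in Python as a chain of multiplexer stages, one per bit of the rotate amount. This is a behavioral model under stated assumptions, not the downloadable Verilog: the name `barrel_rotate` is hypothetical, and "up" is assumed to mean rotation toward the most significant bit.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1


def barrel_rotate(data, amount, rotate_up):
    """Rotate a 32-bit word by `amount` positions.

    rotate_up=True rotates toward the MSB (assumed meaning of "up").
    A downward rotation by n is modeled as an upward rotation by 32 - n.
    """
    data &= MASK
    amount %= WIDTH
    if not rotate_up:
        amount = (WIDTH - amount) % WIDTH
    # log2(32) = 5 stages; stage k either passes the word through or
    # rotates it by 2**k, which is a 2-to-1 multiplexer per bit.
    for stage in range(5):
        step = 1 << stage
        if (amount >> stage) & 1:
            data = ((data << step) | (data >> (WIDTH - step))) & MASK
    return data
```

For example, `barrel_rotate(0x80000001, 1, True)` wraps the top bit around to the bottom and yields `0x00000003`.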

Complex Arithmetic

The arithmetic capability of Stratix II devices is greatly enhanced because the adaptive logic modules (ALMs) allow both logic and arithmetic operations to be performed in one step. In this complex arithmetic example, S selects between inputs A and B. Depending on the AddSub signal, input C is then added to or subtracted from the selected input. Inputs A, B, and C are each 8 bits wide. See Table 3 for the complex arithmetic benchmark results.

Table 3. 8-bit Select-Add/Sub Performance & Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
Select-Add/sub | 686 | 422 | 9 | 16 | Download | Download
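
A minimal Python model of the select-add/sub operation is shown below. Several details are assumptions, because the text does not specify them: `s = 0` is taken to select A, `add_sub = 1` is taken to mean addition, and the result is taken to wrap to 8 bits. The function name `select_add_sub` is hypothetical.

```python
def select_add_sub(a, b, c, s, add_sub):
    """Select A or B with S, then add or subtract C (8-bit result).

    Assumed polarities: s == 0 selects A; add_sub == 1 adds, 0 subtracts.
    """
    selected = b if s else a
    result = selected + c if add_sub else selected - c
    return result & 0xFF  # wrap to the 8-bit data width
```

In hardware terms, the selection (a logic operation) and the add/subtract (an arithmetic operation) are the two steps that the text says an ALM combines into one.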

Crossbar Switch

Crossbar switches are commonly found in telecommunication designs, where they switch data from one port to another. The example given here is a crossbar switch with four data input ports and two data output ports. Two sets of input select signals (2 bits per set) are used to switch the data from an input port to an output port. Each data port is 8 bits wide.

Each ALM can be configured to implement two 6-variable functions that perform the same logic operation and share four variables. Thus, each bit of the 4x2 crossbar switch can be implemented within just one ALM. See Table 4 for the crossbar switch benchmark results.

Table 4. 4x2, 8-bit Crossbar Switch Performance & Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
Crossbar Switch | 727 | 531 (3) | 16 | 32 | Download | Download
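
The 4x2 crossbar can be modeled in Python as two 4-to-1 selections over the same four inputs. This is a behavioral sketch with a hypothetical function name, `crossbar_4x2`, and is not the benchmarked design. Per output bit, each selection depends on four data bits plus two select bits (six variables), and both outputs share the four data bits, which appears to be the sharing the text describes.

```python
def crossbar_4x2(inputs, sel0, sel1):
    """Route two of four 8-bit input ports to two output ports.

    `inputs` is a sequence of four 8-bit values; sel0 and sel1 are the
    2-bit select values for output ports 0 and 1.
    """
    if len(inputs) != 4:
        raise ValueError("expected exactly four input ports")
    out0 = inputs[sel0 & 0b11] & 0xFF
    out1 = inputs[sel1 & 0b11] & 0xFF
    return out0, out1
```

Both outputs may select the same input port, since the two select sets are independent.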

DIP-4 Parity Checker

A diagonal interleaved parity (DIP) checker calculates the parity by performing an “exclusive-or” operation diagonally on the data. With the Stratix II logic structure, each ALM can perform a logical function with up to 7 inputs, which improves device performance by reducing the number of logic levels. Table 5 compares the benchmark results between a Stratix II and Stratix device when implementing a DIP-4 parity checker.

Table 5. DIP-4 Parity Checker Performance & Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
CRC Checker | 286 | 194 | 173 | 222 | Download | Download

Data Encryption Standard (DES)

DES is a US government encryption standard that uses 56-bit keys to encode a plain text message. (Note: The DES design source code is not provided here. The original source was written by Rudolf Usselmann.) DES uses look-up tables with 4 to 6 inputs in its algorithm to shuffle the bits and encrypt the data. The flexible logic structure of the Stratix II ALM can be configured to fit exactly the various LUT sizes needed to implement the DES function. See Table 6 for a DES benchmark of the logic utilization differences between a Stratix and Stratix II device.

Table 6. DES Performance & Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
DES | 316 | 210 | 2,605 | 4,456 | NA | NA

Multiplier (DSP Block)

DSP applications generally require many multiplication operations, for example in finite impulse response (FIR) filters and fast Fourier transforms (FFTs). Stratix II devices include up to 384 18x18 multipliers that can operate at up to 420 MHz. Table 7 shows the benchmark results of an 18x18 signed, non-pipelined multiplier implemented in a DSP block.

Table 7. 18x18 Signed, Non-Pipeline Multiplier (DSP block) Performance & Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
Multiplier (DSP) | 420 | 279 | 0 (4) | 0 (4) | Download | Download

Multiplier (Logic-Based)

For very intensive DSP applications that need more multipliers than the DSP blocks provide, the Stratix II family offers a highly efficient way to implement high-performance logic-based multipliers. The arithmetic mode in the Stratix II family can absorb the partial product generation and the first stage of partial product summation into a single operation. In addition, the ternary adder tree support in the ALM further reduces the number of logic levels and the logic resources required to sum the partial products beyond the first stage. Table 8 compares the performance and logic utilization of a logic-based, 18x18 signed, non-pipelined multiplier between a Stratix and Stratix II device.

Table 8. 18x18 Signed, Non-Pipeline Multiplier (Logic-Based) Performance & Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
Multiplier (Logic-Based) | 160 | 87 | 312 | 409 | Download | Download
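
To make the partial-product idea concrete, the Python sketch below models an 18x18 signed multiply as partial product generation followed by a summation tree. It is a simplified behavioral model, not the benchmarked design: it keeps generation and summation as separate steps rather than merging the first summation stage into generation, and all function names (`to_signed`, `partial_products`, `reduce_tree`, `multiply`) are hypothetical.

```python
def to_signed(value, bits=18):
    """Interpret the low `bits` of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def partial_products(a, b, bits=18):
    """One partial product per bit of b; the sign bit has negative weight."""
    a = to_signed(a, bits)
    b_bits = b & ((1 << bits) - 1)
    products = []
    for i in range(bits):
        term = (a << i) if (b_bits >> i) & 1 else 0
        products.append(-term if i == bits - 1 else term)
    return products


def reduce_tree(terms, arity):
    """Sum terms in stages of `arity`-input adders; return (sum, stages)."""
    stages = 0
    while len(terms) > 1:
        terms = [sum(terms[i:i + arity]) for i in range(0, len(terms), arity)]
        stages += 1
    return terms[0], stages


def multiply(a, b, arity=3):
    """Signed 18x18 multiply; returns (product, summation stages)."""
    return reduce_tree(partial_products(a, b), arity)
```

In this model the 18 partial products reduce in three stages with three-input adders and in five stages with two-input adders. The stage counts illustrate the reduction in logic levels only and are not a prediction of the fMAX values in Table 8.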

Multiply and Accumulate

The DSP blocks in Stratix II devices are more than just multipliers. In addition to the embedded multiplier, the enhanced DSP block contains a dedicated accumulator and rounding capability. An 18-bit signed multiply and accumulate (MAC) example is provided here. A pipeline stage is inserted between the multiplication and accumulation. See Table 9 for the benchmark results.

Table 9. 18-bit MAC (DSP Block) Performance & Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
MAC | 420 | 279 | 0 (4) | 0 (4) | Download | Download
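
The effect of the pipeline stage between the multiplier and the accumulator can be sketched with a small cycle-by-cycle Python model. The class name `PipelinedMac` and its `clock` method are hypothetical, the inputs are assumed to be signed integers already within the 18-bit range, and the accumulator width is left unbounded because the text does not state it.

```python
class PipelinedMac:
    """Cycle-level model of a multiply-accumulate with one pipeline register."""

    def __init__(self):
        self.product_reg = 0   # pipeline register after the multiplier
        self.accumulator = 0   # running sum (width not modeled)

    def clock(self, a, b):
        """Advance one clock cycle with new inputs a and b.

        The accumulator adds the product registered on the previous
        cycle, so each product reaches the sum one cycle late.
        """
        self.accumulator += self.product_reg
        self.product_reg = a * b
        return self.accumulator


mac = PipelinedMac()
trace = [mac.clock(2, 3), mac.clock(4, 5), mac.clock(0, 0)]
```

Here `trace` is `[0, 6, 26]`: the first product (6) appears in the accumulator on the second cycle, and a final cycle is needed to flush the last product through the pipeline register.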

Multiplexer

Multiplexing is the basic mechanism for making logic decisions in digital logic designs. Multiplexers are typically implemented by cascading LUTs in FPGAs. With each additional cascaded LUT, the performance is penalized by both the added logic level and the programmable routing delay incurred in cascading the LUTs. The wide-input LUTs in the Stratix II device family greatly reduce the need for LUT cascading and hence reduce both the LUT and programmable routing delays. The benchmark results for a 32-bus-to-1, 16-bit per bus multiplexer are provided in Table 10.

Table 10. 32-Bus-to-1, 16-bit Per Bus Multiplexer Performance & Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
Multiplexer | 379 | 234 | 249 | 336 | Download | Download
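
The cascading penalty can be illustrated with a Python model that builds the 32-to-1 selection from smaller multiplexers and counts the levels. The function name `mux_tree` and the `fan_in` parameter are hypothetical. The fan-in values are illustrative: a 2-to-1 multiplexer needs three inputs (two data, one select) and a 4-to-1 multiplexer needs six (four data, two select), so a wider LUT can absorb a wider multiplexer. How the actual devices map this design is not modeled.

```python
def mux_tree(buses, select, fan_in):
    """Select buses[select] using a tree of fan_in-to-1 multiplexers.

    Returns (selected 16-bit value, number of cascaded levels).
    """
    level = list(buses)
    levels = 0
    sel = select
    while len(level) > 1:
        idx = sel % fan_in  # select bits consumed by this level
        next_level = []
        for i in range(0, len(level), fan_in):
            group = level[i:i + fan_in]
            # Short groups on non-selected paths contribute a don't-care 0.
            next_level.append(group[idx] if idx < len(group) else 0)
        level = next_level
        sel //= fan_in
        levels += 1
    return level[0] & 0xFFFF, levels


# 32 sample 16-bit buses (assumed test data).
buses = [0x1000 + i for i in range(32)]
```

With these definitions, `mux_tree(buses, 21, 2)` passes through five levels and `mux_tree(buses, 21, 4)` through three, and both return `buses[21]`.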

Generic 6-Input Logic Function

A generic 6-input logic function example has been implemented in both a Stratix and Stratix II device. While the implementation of this function requires five LEs in the Stratix architecture, it only needs one Stratix II ALM. Not only is the logic utilization dramatically reduced, the performance is also improved. Equation 1 shows the 6-input logic function implemented and Table 11 shows the benchmark results.

Equation 1:

Table 11. Generic 6-Input Logic Utilization (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
6-Input Checker | 1010 (3) | 534 (3) | 2 | 4 | Download | Download

Finite State Machine

In a finite state machine (FSM), there can be tens or hundreds of input signals used for next-state logic. Input stimuli (signals) are also often shared between the next-state logic of different states. In this benchmark, a generic FSM with 20 inputs and 20 outputs is tested in both a Stratix and Stratix II FPGA.

Stratix II devices allow any 6-input logic function to be implemented in a single ALM, which greatly enhances device performance by reducing the number of logic levels. Furthermore, Stratix II FPGAs reduce logic resource usage by packing logic functions and common inputs together. Table 12 provides the benchmark results for a 20-input, 20-output FSM.

Table 12. 20-Input, 20-Output FSM Performance & Logic Utilization Comparison Between Stratix & Stratix II Devices (Notes 1, 2)
Design | Stratix II fMAX (MHz) | Stratix fMAX (MHz) | Stratix II Logic Utilization (ALUT) | Stratix Logic Utilization (LE) | Stratix II Design Download | Stratix Design Download
FSM | 550 | 459 (3) | 54 | 58 | Download | Download

Notes to Tables 1 through 12:

  1. Some of the designs targeting Stratix II devices include placement constraints for Quartus II software. In these cases, Quartus II software places the used logic resources closer together so that it is easier to see the complete design on the floor planner.
  2. The logic utilization numbers do not include the consumption for input and output registers.
  3. The fMAX is derived from the propagation delay of the critical path.
  4. The design is implemented in a DSP block.

Table 13. Learn More about Stratix II FPGAs
Topic | Description
Performance Comparison | Compare Stratix II Performance with Competing Devices
Architecture | FPGA Architecture White Paper; Performance and Logic Efficiency Analysis White Paper; 8-Input Fracturable LUT in the ALM; Embedded Adders
Achieving More Performance | 3 Steps to Higher Performance
DSP | DSP Blocks; DSP Performance Center
Benchmarking | Benchmarking Methodology White Paper; Benchmarking Methodology
