By Ron Wilson, Editor-in-Chief, Altera Corporation
Active electronically scanned array (AESA) radar is now a key component of advanced weapons systems, especially in airborne warfighting. But the architecture’s future reaches beyond its military origins, into geophysical mapping, automotive driver assistance, autonomous vehicles, industrial robotics, and augmented reality: essentially, any application in which rich streams of sensor data are conditioned and fused into models for decision making.
As AESA architectures spread, they will leave the privileged enclave of radar signal-processing experts. In the outside world, these designs will encounter typical embedded design flows: CPU- and software-centric, C-based, and hardware-agnostic. In this article, we will outline an advanced scanned-array radar, and explore its architecture from the points of view of both seasoned radar signal processing experts and more traditional embedded-system designers.
Orientation to a Typical System
The difference between scanned arrays and conventional moving-dish radar begins at the antenna. Instead of the familiar, tirelessly rotating parabolic antenna, the scanned array is planar, and in most systems, stationary. Instead of a single element at the focus of the reflector, the array comprises hundreds or thousands of elements, each with its own transceiver module. The system electronics shape and aim the radar beam and receiver pattern by manipulating the amplitude and phase of the signals at each of the elements, setting up interference patterns that define the aggregate antenna pattern.
This approach, aside from eliminating a lot of big moving parts, allows the radar to do things that are physically impossible with a conventional antenna, such as changing the beam direction instantaneously, having multiple antenna patterns for transmitting and receiving simultaneously, or even subdividing the array into multiple antenna arrays and performing multiple functions—say, searching for targets, tracking a target, and following terrain—simultaneously. These tricks require only adding together a number of signals at each transmitter, and separating the signals out again at each receiver. Superposition is a wonderful thing.
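The interference arithmetic behind beam steering can be sketched in a few lines. The toy below assumes a uniform linear array with half-wavelength element spacing; the function names are illustrative, not part of any particular system:

```python
import cmath
import math

def steering_weights(n_elems, spacing_wl, theta_deg):
    """Per-element phase weights that steer a uniform linear array
    toward theta (degrees from broadside). spacing_wl is the element
    spacing in wavelengths."""
    theta = math.radians(theta_deg)
    return [cmath.exp(-1j * 2 * math.pi * spacing_wl * n * math.sin(theta))
            for n in range(n_elems)]

def array_factor(weights, spacing_wl, theta_deg):
    """Magnitude of the summed element responses in direction theta:
    the aggregate antenna pattern set up by the interference."""
    theta = math.radians(theta_deg)
    resp = sum(w * cmath.exp(1j * 2 * math.pi * spacing_wl * n * math.sin(theta))
               for n, w in enumerate(weights))
    return abs(resp)

w = steering_weights(16, 0.5, 30.0)   # steer a 16-element array to 30 degrees
print(array_factor(w, 0.5, 30.0))     # ~16.0: full coherent gain on the beam
print(array_factor(w, 0.5, -20.0))    # well off the beam: much smaller
```

Superposition shows up directly: steering to a second direction is just a second set of weights, and the two weight sets can be summed at each element.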
The complete system reaches from a CPU cluster to the antenna and back again (Figure 1). At the beginning of the process, a software-controlled waveform generator creates the chirp that the system will transmit. Depending on the application, the signal will be some compromise among the needs of noise reduction, Doppler processing, and stealth.
The waveform generator sends its signal into the beamforming network. Here the signal is routed to each of the channels that will transmit it. At this stage, digital multipliers can apply amplitude weights to the channels to implement a spatial filter that will sharpen the beam. Or that step may be done later. In many designs, the signals for each channel would now go through a digital-to-analog converter (DAC), and then into analog IF and RF upconverters. After RF upconversion, the signals arrive at the individual transmitter modules, where they are given their phase shift or time delay, amplitude adjustment (if that was not done at baseband), and final filtering and amplification.
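One common spatial filter of this kind is an amplitude taper across the channels. The Hamming-style weights below are a textbook choice, shown purely as a sketch; real systems pick the taper to balance beamwidth against sidelobe level:

```python
import math

def taper_weights(n_channels):
    """Hamming-style amplitude taper: near-unity weight at the array
    center, rolling off toward the edges. The taper widens the main
    beam slightly but suppresses sidelobes sharply."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_channels - 1))
            for n in range(n_channels)]

def apply_taper(channel_samples, weights):
    """Multiply each channel's baseband sample by its weight."""
    return [wt * s for wt, s in zip(weights, channel_samples)]

w = taper_weights(9)
print(w[0], w[4])                       # edge weight ~0.08, center weight ~1.0
tapered = apply_taper([1 + 0j] * 9, w)  # taper applied to one sample per channel
```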
Initially, the received signals go through essentially the same path in reverse, but then get a lot more processing at the back end. At each antenna element, a limiter and bandpass filter protect the low-noise amplifier. The amplifier drives an RF downconverter, which may be combined with analog amplitude and/or phase adjustment. Passing through the IF stage into baseband, the signal from each antenna element reaches its analog-to-digital converter (ADC). Then the beamforming module recombines the antenna signals into one or more streams of complex data samples, each stream representing the signal from a particular received beam. These signal streams pass on into heavy-duty digital signal processing (DSP) circuits that further condition the data, perform Doppler processing, and attempt to extract meaningful signals from the noise.
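The recombination step at the heart of the receive beamformer is just a weighted sum: one complex multiply-accumulate per element per output sample. A minimal sketch, where the 8-element half-wavelength array and the function name are assumptions for illustration:

```python
import cmath
import math

def beamform_rx(element_samples, weights):
    """Recombine one complex sample per antenna element into a single
    beam sample: a conjugate-weighted sum, i.e. a spatial dot product."""
    return sum(wt.conjugate() * s for wt, s in zip(weights, element_samples))

# A plane wave arriving from 15 degrees across an 8-element,
# half-wavelength-spaced line of elements:
phase = math.pi * math.sin(math.radians(15.0))
samples = [cmath.exp(1j * phase * n) for n in range(8)]

# Weights matched to that same direction add the elements coherently:
weights = [cmath.exp(1j * phase * n) for n in range(8)]
print(abs(beamform_rx(samples, weights)))   # ~8.0, the full array gain
```

Forming a second simultaneous receive beam is just a second weight set applied to the same element samples, which is why the digital approach scales to many beams so naturally.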
When to Do Data Conversion
In many designs, much of the signal processing is done in analog. But as digital speeds go up and power and cost go down, the data converters get pushed closer and closer to the antenna. An ideal system, suggests Altera application specialist Colman Cheung, would drive the antenna elements directly from DACs. But such a design is not technically feasible in 2013, especially for trans-GHz RF.
It is feasible today to put the data converters at IF, and to do the IF frequency conversions and all the baseband processing digitally (Figure 2). The time delays that create the interference patterns between antenna elements can be done digitally in a beamforming network at baseband as well, eliminating the need for analog phase-shifters or delay lines on each antenna element. This partitioning allows DSP designers to decompose the transmit and receive paths into discrete functions—multipliers, filters, FIFOs for delay, and adders—model them in MATLAB, and implement them from libraries. The most demanding functions can go into purpose-built ASICs, FPGAs, or perhaps GPU chips, while less-demanding operations can be grouped into code on DSP chips or accelerators.
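Doing the IF conversion digitally amounts to multiplying the sample stream by a numerically controlled oscillator and low-pass filtering the product. A toy version, with a crude boxcar average standing in for the decimating FIR filter a real design would use:

```python
import cmath
import math

def digital_downconvert(samples, if_freq, fs):
    """Mix a real IF sample stream to complex baseband by multiplying
    with a numerically controlled oscillator (NCO) at -if_freq."""
    return [s * cmath.exp(-2j * math.pi * if_freq * n / fs)
            for n, s in enumerate(samples)]

def boxcar_lowpass(samples, taps):
    """Moving-average filter to knock down the image at twice the IF.
    (A stand-in for a proper decimating FIR.)"""
    return [sum(samples[n:n + taps]) / taps
            for n in range(len(samples) - taps + 1)]

fs, f_if = 8000.0, 1000.0
if_stream = [math.cos(2 * math.pi * f_if * n / fs) for n in range(64)]
bb = boxcar_lowpass(digital_downconvert(if_stream, f_if, fs), 8)
print(abs(bb[0]))   # ~0.5: the IF tone has landed at complex baseband (DC)
```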
Receive-chain signal processing after the signal comes out of the beamforming network deserves particular attention, because its memory and processing needs can become enormous, and because the dynamic range involved—from staring into the mouth of a jamming transmitter to searching the very edge of detection range for a cloaked echo—can be huge. High-precision floating-point hardware may be a necessity, and it carries a significant cost in power.
In its final stages, the receive chain changes both in purpose and implementation. Through its filtering, beamforming, and pulse-compression stages, the role of the chain is to extract from the noise specifically those signals that might carry information about real objects in the outside world. But then the emphasis shifts from the signals to the objects they represent, and the nature of the tasks changes.
From Signals to Objects
Pulse compression is the beginning of this abstraction process. In either the time domain or frequency domain, the pulse compressor identifies, often simply by autocorrelation, waveforms that are likely to be reflections of the transmitted chirp. It then represents those waveforms with pulse objects—data packets that contain arrival-time, frequency and phase, and other pertinent data. From here on, the receive chain will work on this data packet rather than the received signal.
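A brute-force sketch of that correlation step makes the idea concrete. A real system would do this with FFTs and then emit the pulse objects described above; the quadratic-phase "chirp" and all the names here are illustrative stand-ins:

```python
import cmath

def pulse_compress(rx, chirp):
    """Correlate the received stream against a replica of the transmitted
    chirp. Peaks in the output mark likely echoes; the peak index is the
    echo's delay in samples, i.e. its arrival time."""
    m = len(chirp)
    ref = [c.conjugate() for c in chirp]
    return [sum(r * x for r, x in zip(ref, rx[lag:lag + m]))
            for lag in range(len(rx) - m + 1)]

# Bury an attenuated, delayed copy of the chirp in an otherwise empty stream:
chirp = [cmath.exp(1j * 0.1 * n * n) for n in range(32)]
rx = [0j] * 100
for n, c in enumerate(chirp):
    rx[40 + n] = 0.5 * c

out = pulse_compress(rx, chirp)
peak = max(range(len(out)), key=lambda i: abs(out[i]))
print(peak)   # 40: the delay bin where the echo was found
```

The pulse object would then record that arrival time (40 samples), the complex amplitude at the peak, and any other pertinent data for the downstream stages.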
The next step is usually Doppler processing. First, pulses get dropped into an array of bins (Figure 3). In the array, each column contains pulses returned from a particular transmitter chirp. There may be many columns in the array, depending on how much latency the system can tolerate. Rows in the array represent echo transit time: the further from the x-axis of the array, the longer the delay between the transmitter chirp and the arrival of the received pulse. Thus the delay bin also represents the range to the target that reflected that particular pulse.
Once the pulses for a series of chirps are dropped into the correct bins, Doppler processing routines can traverse the data horizontally—looking at the pulses returned from a single target over time—to refine information about the relative velocity and heading of the target. This processing approach requires a very large circular buffer able to hold all the bins for however many chirps the particular Doppler algorithm can process at one time.
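Traversing the data horizontally is, in most systems, a Fourier transform across "slow time," the sequence of returns from one range bin over many chirps. A naive DFT shows the principle; a fielded design would use an FFT, and all the names here are illustrative:

```python
import cmath

def doppler_spectrum(slow_time):
    """DFT across the pulses in one range bin (the 'slow time' axis).
    The index of the peak magnitude is the target's Doppler bin, which
    maps to its relative velocity."""
    n = len(slow_time)
    return [abs(sum(s * cmath.exp(-2j * cmath.pi * k * m / n)
                    for m, s in enumerate(slow_time)))
            for k in range(n)]

# 32 chirps; the echo's phase rotates 3 full cycles across the dwell,
# so the energy should land in Doppler bin 3:
row = [cmath.exp(2j * cmath.pi * 3 * m / 32) for m in range(32)]
spec = doppler_spectrum(row)
print(spec.index(max(spec)))   # 3
```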
Advanced systems add another dimension to the array. By subdividing the antenna into sub-arrays, the system can transmit a number of beams simultaneously, and then can set the receiver to listen using the same many-lobed antenna pattern. Alternatively, the system can scan the beam, either through beamforming or using synthetic-aperture techniques. Now, when binning the compressed pulses, the system can create a three-dimensional array of bins: transmit pulses on one axis, echo delay on a second, and beam azimuth on a third (or azimuth and elevation on a third and fourth) (Figure 4). For each pulse, then, we have a two- or three-dimensional array of bins that represents both range and direction: a representation of physical space. This arrangement of memory is the starting point for space-time adaptive processing (STAP).
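The bin structure itself is easy to picture as nested arrays, one axis per dimension. This toy version uses Python lists where real hardware would use large dedicated memories, and the dimensions are arbitrary:

```python
def make_data_cube(n_beams, n_ranges, n_pulses):
    """Allocate the multi-axis bin structure: beam azimuth x range x pulse,
    every cell starting empty (complex zero)."""
    return [[[0j] * n_pulses for _ in range(n_ranges)]
            for _ in range(n_beams)]

cube = make_data_cube(n_beams=4, n_ranges=512, n_pulses=32)
cube[2][100][7] = 0.3 + 0.1j   # one compressed pulse lands in its bin:
                               # beam 2, range bin 100, chirp 7

slow_time = cube[2][100]       # one cell's pulse history, the vector that
                               # Doppler and STAP routines traverse
```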
The phrase is descriptive: “space-time” because the data set unites the location of the target in 3D space with the time of the chirp that illuminated it. And “adaptive” because the algorithms derive adaptive filters from the data.
Conceptually, and often actually, forming the adaptive filter is a matrix-inversion process: what matrix would I have to multiply this data by to get the result that I think is hidden in the noise? Knowledge of the presumed hidden pattern may come from seeds found during Doppler processing, from data collected by other sensors, or from intelligence data, according to Altera senior technical marketing manager Michael Parker. Algorithms running on the CPU downstream insert the presumed pattern into the matrix equation and solve for the filter that would produce the expected data.
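In its simplest form, that step is a complex linear-system solve, w = R⁻¹s, where R is an estimate of the interference covariance and s the presumed hidden pattern. The toy dense solver below is only a sketch of the idea; real STAP hardware uses structured QR or Cholesky pipelines, and the 2x2 numbers are invented:

```python
def solve_linear(a, b):
    """Gaussian elimination with partial pivoting on a small complex
    system A w = b; a stand-in for the matrix inversion in the text."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0j] * n
    for r in range(n - 1, -1, -1):
        w[r] = (m[r][n] - sum(m[r][c] * w[c]
                              for c in range(r + 1, n))) / m[r][r]
    return w

# Toy 2x2 case: a diagonal "covariance" R and a presumed pattern s.
R = [[2 + 0j, 0j], [0j, 4 + 0j]]
s = [1 + 0j, 1 + 0j]
w = solve_linear(R, s)   # the adaptive filter weights, w = inverse(R) * s
print(w)                 # [0.5, 0.25] (as complex values)
```

The cost of this solve grows roughly with the cube of the matrix dimension, which is one reason the real-time load climbs so quickly as systems add beams and range bins.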
Obviously, at this point, the computing load is becoming huge. And the dynamic range required by the inversion algorithms nearly demands floating-point computations. Parker has estimated that for a modest actual system in a combat situation, where the processing must be performed in real time, the STAP load could reach several TFLOPS. In systems that employ lower resolution, lower dynamic range, and do not have to be strictly real time, such as simpler automotive driver-assist systems or synthetic-aperture mapping systems, this load can be reduced considerably.
From STAP, the information moves into general-purpose CPUs, where complex but less numerically intensive software attempts to categorize the targets, build a model of the situation, assess threats, and either advise the human operator or take emergency action directly. At this point, we have followed the signal beyond the realm of signal processing and into the world of artificial intelligence.
Two Views of the Architecture
We have taken a superficial tour of an AESA combat radar from the point of view of an experienced radar system architect. This frame of reference sees the system as a network of relatively static DSP chains, all terminating in the STAP block, which is itself a software-directed matrix-algebra unit. Beyond that, from the DSP expert’s point of view, lies a bunch of CPU cores.
In contrast, an automotive or robotics system designer might see the system quite differently. From an embedded designer’s point of view, the system could be just one big piece of software, with some rather specialized I/O devices and some tasks that will require acceleration. The experienced radar-signals engineer might dismiss this approach as nonsense, given the relative scale of the signal-processing and general-purpose hardware. Obviously the data rates, flexibility, and dynamic range of an airborne multifunction radar will require dedicated DSP pipelines and huge local buffers in order to stay real time. But in a different application with fewer antenna elements, a simpler environment, shorter range, and lower resolution, the CPU-centric viewpoint raises some interesting questions.
The first question, as Gene Frantz, professor in the practice at Rice University, puts it, is defining the I/Os to the real world. The second question is the choice of CPU. “Rarely will there be just one CPU,” Frantz observes. “More often it will be a heterogeneous multiprocessing system.” Frantz suggests this approach does not start out with DSP functions in MATLAB—it begins with the entire system described in C. Then instead of defining a hard boundary between the DSP and CPU regions of the design, the CPU-centric designer “successively optimizes and accelerates the C code.”
The results in practical terms can be quite different from the DSP-centric version. The CPU-centric approach, for example, begins with the assumption that everything gets executed on a single, general-purpose CPU. If that isn’t fast enough, the approach shifts to multiple CPUs sharing a coherent memory hierarchy. Only when multicore is not enough does the methodology turn to optimized hardware accelerators.
Similarly, the CPU-centric design starts with the assumption of a single unified memory. It splits off coherent caches for individual processors, and local working storage for accelerators. It does not start out assuming any hardware pipelines, or any fixed mapping of tasks onto hardware resources.
In the most demanding applications, the two architectural approaches may very well end in the same system design. Extreme bandwidth and computing requirements in almost every task will dictate dedicated hardware pipelines and memory instances. And the need for ruthless power minimization may force decisions about numeric precision that would further complicate any attempt to share hardware between tasks.
Precision is a point Frantz underlines. “Cutting the number of significant bits in half can give you an order of magnitude higher performance,” he points out. You can then trade some or all of that gain for lower power.
Frantz makes a related point about the analog/digital boundary. “We need to rethink analog signal processing,” he says. “Thirty years ago, we started telling system designers to just do good data conversion, and we’d handle everything else in digital. But in fact, at 8-bit resolution, analog and digital techniques are about equivalent. Can analog be better? That depends on what ‘better’ means at that point in your system.”
Lower-bandwidth systems, such as synthetic-aperture radar for geophysical mapping or systems for autonomous land vehicles, may end up in quite different architectures than combat radars would use. It might make sense to use analog filters, up-/downconverters, and beamforming, and to conduct all the subsequent processing from a single high-bandwidth memory system, using a heterogeneous pool of processors with floating-point accelerators and dynamic load balancing (Figure 5).
By virtualizing the signal-processing tasks and keeping them in software, the system designer gets new run-time options, such as shifting processing resources between tasks, shutting down unneeded processors, altering algorithms early in the process in response to data patterns, or even running multiple algorithms against each other to see which gives the best results.
AESA radar systems provide a rich environment for studying not just implementation strategies, but ways of thinking about signal-intensive systems. As these active arrays spread beyond the military into other design cultures, they will bump up against the limits of traditional embedded-design thinking. And the ideas thus generated may find use in entirely different kinds of signal-intensive fields, including signal intelligence and network security. This is an area worth watching.