Big Data Analytics

Data streams such as computer network traffic, data feeds, sensor data, and event stream processing (ESP) workloads involve large, complex data sets. Several applications have evolved to process this data. For example, the open-source Hadoop framework speeds up processing by distributing work across thousands of systems that run simultaneously. Another open-source framework, Spark, provides a Hadoop-compatible data processing engine. These tools, however, are processor-intensive, and the CPU can become a bottleneck. Using FPGAs, data centers can implement these specific applications at the hardware level, significantly increasing the system's performance per watt.

 

Today's CPUs contain more and more cores, but the bandwidth to external memory is not growing at the same pace as this multi-core computing power. FPGAs can relieve the CPU's data access bottlenecks by providing compression, filtering, and de-duplication functions. FPGAs can also compress data more efficiently, for example in Hadoop, where they accelerate the “shuffle” phase, in which intermediate results are transferred across the network to the servers that combine them. Advantages are also seen with Spark, where the FPGA can accelerate data streaming, real-time and predictive analytics, shuffle-phase compression, and other functions.
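To make the three data-reduction functions named above concrete, here is a minimal software sketch of filtering, de-duplication, and compression applied to a batch of records. It only illustrates the logic that would be offloaded; it is not FPGA code. The function name reduce_records, the keep predicate, and the sample records are hypothetical, and SHA-256 and zlib are assumed here simply because they ship with Python.

```python
import hashlib
import zlib


def reduce_records(records, keep):
    """Filter, de-duplicate, and compress a batch of byte records.

    `keep` is a caller-supplied predicate (hypothetical, for illustration).
    Returns the surviving records and their compressed form.
    """
    seen = set()
    kept = []
    for rec in records:
        if not keep(rec):
            continue  # filtering: drop records the consumer does not need
        digest = hashlib.sha256(rec).digest()
        if digest in seen:
            continue  # de-duplication: drop exact repeats of earlier records
        seen.add(digest)
        kept.append(rec)
    # compression: shrink what is left before it crosses the memory or network link
    payload = b"\n".join(kept)
    return kept, zlib.compress(payload)


# Sample input (made up): one duplicate and one record the filter rejects.
records = [
    b"sensor=1 temp=20",
    b"sensor=1 temp=20",
    b"debug: heartbeat",
    b"sensor=2 temp=35",
]
kept, blob = reduce_records(records, keep=lambda r: r.startswith(b"sensor="))
print(len(records), "records in,", len(kept), "records kept")
```

Each stage reduces the volume of data the CPU has to move, which is why these functions are natural candidates for offload when memory bandwidth is the limiting factor.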

With FPGAs, users can perform real-time analytics, such as predicting the class or value of new data instances, and can also filter data for storage. In these applications, using one or more FPGAs efficiently accelerates complex data processing.
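The combination of prediction and filtering for storage can be sketched in software as follows. This is an illustrative stand-in, not an FPGA implementation: the classify function is a hypothetical threshold rule used in place of a trained model, and the threshold value and sample readings are assumptions.

```python
def classify(reading, threshold=30.0):
    """Hypothetical stand-in model: assign a class to one sensor reading."""
    return "alert" if reading > threshold else "normal"


def filter_for_storage(stream, threshold=30.0):
    """Classify each reading as it arrives and yield only those worth storing."""
    for reading in stream:
        label = classify(reading, threshold)
        if label == "alert":
            yield reading, label


# Sample readings (made up): only values above the threshold are stored.
stored = list(filter_for_storage([21.5, 34.0, 29.9, 41.2]))
print(stored)
```

Because each reading is classified as it arrives, only the subset of interest reaches storage, which is the pattern the paragraph above describes.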