At the highest level, machine learning uses programs that extract features from data to solve predictive problems. Examples include detecting anomalies to prevent network intrusion or fraud, classifying objects such as tumors or pedestrians, analyzing social media sentiment and perception for marketing purposes, guiding self-driving cars, and evaluating traffic patterns to forecast congestion and control traffic flow optimally.

Machine learning applications span vertical markets including military, automotive, industrial, and data center. Imagine a person trying to rank the results of every web search manually each time someone issues a query; it is impossible for a person, but machines do it billions of times a day. This kind of job is exactly where machines can help humans. Machines can navigate the tsunami of data collected every second, automatically recognize complex patterns, and then make intelligent decisions based on that insight. To be accurate, models must be trained, tested, and calibrated on previously collected data so that they can detect those patterns.

Today, one of the most popular machine learning methods is the use of neural networks for object detection and recognition. Neural networks are modelled after the brain's interconnected neurons and are built from a series of layers, with each layer extracting progressively more abstract features from the output of the layer before it. The FPGA implements these layers very efficiently because it can retrieve the data and perform classification in real time. By using 8 TBps of on-die memory bandwidth and minimizing the need to interact with external memory, designers can leverage the flexible FPGA architecture to obtain very power-efficient implementations. FPGAs can also move data directly in and out of the network efficiently, so they can classify video, signal, or packet data in line as it is processed.
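To make the idea of layered feature extraction concrete, here is a minimal sketch in plain Python of the three operations a typical convolutional stage applies: a convolution, a ReLU activation, and 2x2 max pooling. The function names, the 5x5 sample image, and the edge-detecting kernel are all illustrative assumptions; none of this is taken from the Deep Learning Accelerator FPGA IP, Intel Caffe, or Intel MKL-DNN, and a real implementation would be heavily optimized.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid (unpadded) 2D cross-correlation, the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap: Matrix) -> Matrix:
    """Element-wise ReLU: negative responses are clamped to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool2x2(fmap: Matrix) -> Matrix:
    """Non-overlapping 2x2 max pooling: keeps the strongest response per window."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


# Hypothetical 5x5 image: dark on the left (0.0), bright on the right (1.0).
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

# Hypothetical 2x2 kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1.0, 1.0],
               [-1.0, 1.0]]

# One convolutional stage: convolve, activate, then downsample.
features = max_pool2x2(relu(conv2d(image, edge_kernel)))
print(features)
```

The pooled output is much smaller than the input and is nonzero only where the edge is, which illustrates how each stage trades raw pixel detail for a more compact description. A full network stacks many such stages, each with many learned kernels.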

To let you take advantage of the flexibility of the FPGA's architecture to build very power-efficient implementations of convolutional neural network (CNN) topologies for inference (i.e., scoring) without needing to become an FPGA expert, we are offering the Deep Learning Accelerator FPGA intellectual property (IP). With this IP loaded into the FPGA, you can leverage Intel® Caffe in the Data Analytics Acceleration Library (DAAL) and Intel MKL-DNN to build AlexNet-like or GoogLeNet-like topologies without even recompiling the FPGA.

To get started with an FPGA implementation using the Deep Learning Accelerator FPGA IP, please contact your local sales representative. You can also purchase a turnkey FPGA board that comes preloaded and ready to do inference, the Intel Deep Learning Inference Accelerator board. You can start using Intel Caffe and Intel MKL-DNN today to develop your CNN topologies and target the FPGA seamlessly.

This implementation was created using the Intel® FPGA SDK for OpenCL™ and the OpenCL language, which support flexible, scalable FPGA implementations by leveraging a software development flow.