Abstract:
As target detection algorithms have advanced in recent years, their computational complexity has grown steadily. In the forward inference stage, many practical applications must meet low-latency requirements under strict power budgets, so building a low-power, low-cost, high-performance target detection platform has attracted increasing attention. The Field Programmable Gate Array (FPGA), a high-performance, reconfigurable, and low-cost embedded platform, is becoming a key technology for deploying such algorithms. To address these requirements, this paper proposes a low-power target detection accelerator architecture for an FPGA+SoC (System on Chip) heterogeneous platform, combining several hardware acceleration methods: coarse-grained and fine-grained optimization, fixed-point conversion of parameters, and parameter reordering. To overcome the design limitations of existing work on Zynq 7000 series FPGAs, the paper proposes a new multi-dimensional hardware acceleration scheme for the YOLOv2 (You Only Look Once) algorithm, and analyzes and models the accelerator's performance and resource consumption in detail to verify the soundness of the architecture. To make full use of on-chip hardware resources in the design of each module, the accelerator's data access mechanism is improved, which reduces the system's transmission delay and raises the effective utilization of bus bandwidth. Converting floating-point numbers to fixed point reduces the computational load on the FPGA and further increases processing speed. Experiments show that the architecture achieves 26.98 GOPS on the PYNQ-Z2 platform, about 38.71% higher than the existing FPGA-based target detection platform, with a power consumption of only 2.96 W. These results indicate that the architecture is a practical option for deploying target detection algorithms in power-constrained applications.
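As an illustration of the fixed-point conversion mentioned above, the following is a minimal Python sketch and not the accelerator's actual implementation. The abstract does not state a bit width or a scaling scheme, so the 16-bit word length, the per-tensor choice of fractional bits, and the function names `choose_frac_bits`, `quantize`, and `dequantize` are all assumptions made for this example.

```python
def choose_frac_bits(values, total_bits=16):
    """Pick the most fractional bits that still leave room for max |value|.

    Assumption: one sign bit, and a single shared format for the whole tensor.
    """
    max_abs = max(abs(v) for v in values)
    int_bits = 0
    # Count the integer bits needed so that max_abs < 2**int_bits.
    while (1 << int_bits) <= max_abs:
        int_bits += 1
    return total_bits - 1 - int_bits


def quantize(values, frac_bits, total_bits=16):
    """Scale floats by 2**frac_bits, round, and saturate to the signed range."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    return [max(lo, min(hi, round(v * scale))) for v in values]


def dequantize(q_values, frac_bits):
    """Map fixed-point integers back to floats for error checking."""
    scale = 1 << frac_bits
    return [q / scale for q in q_values]


if __name__ == "__main__":
    weights = [0.5, -1.25, 0.03125]          # hypothetical layer weights
    frac_bits = choose_frac_bits(weights)    # fractional bits for this tensor
    q = quantize(weights, frac_bits)
    print(frac_bits, q, dequantize(q, frac_bits))
```

In hardware, the benefit comes from replacing floating-point multiply-accumulate units with narrower integer ones, which typically map to fewer DSP and logic resources; the rounding error per value in this sketch is at most half of one least significant bit, unless the value saturates.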