FPGA-QNN: Quantized Neural Network Hardware Acceleration on FPGAs

dc.authoridTASCI, MUSTAFA/0000-0002-8073-8587
dc.authoridTUMEN, VEDAT/0000-0003-0271-216X
dc.contributor.authorTasci, Mustafa
dc.contributor.authorIstanbullu, Ayhan
dc.contributor.authorTumen, Vedat
dc.contributor.authorKosunalp, Selahattin
dc.date.accessioned2025-07-03T21:25:20Z
dc.date.issued2025
dc.departmentBalıkesir Üniversitesi
dc.description.abstractRecently, convolutional neural networks (CNNs) have received massive interest due to their ability to achieve high accuracy in various artificial intelligence tasks. With the development of complex CNN models, a significant drawback is their high computational burden and memory requirements. The performance of a typical CNN model can be enhanced through improved hardware accelerators. Practical implementations on field-programmable gate arrays (FPGAs) have the potential to reduce resource utilization while maintaining low power consumption. Nevertheless, complex CNN models implemented on FPGAs may require computational and memory capacities exceeding those available on many current FPGAs. An effective solution to this issue is to use quantized neural network (QNN) models, which remove the burden of full-precision weights and activations. This article proposes an accelerator design framework for FPGAs, called FPGA-QNN, aimed particularly at reducing the high computational burden and memory requirements of CNN implementations. To approach this goal, FPGA-QNN exploits QNN models by converting the high burden of full-precision weights and activations into integer operations. The FPGA-QNN framework provides 12 accelerators based on multi-layer perceptron (MLP) and LeNet CNN models, each associated with a specific combination of quantization and folding. Performance evaluations on the Xilinx PYNQ Z1 development board demonstrated the superiority of FPGA-QNN in terms of resource utilization and energy efficiency in comparison to several recent approaches. The proposed MLP model classified the FashionMNIST dataset at a speed of 953 kFPS with 1019 GOPs while consuming 2.05 W.
dc.description.sponsorshipBalikesir University Scientific Research and Development Support Program (BAP) in Turkiye [2020/091]
dc.description.sponsorshipThis work has been supported by Balikesir University Scientific Research and Development Support Program (BAP) in Turkiye under project number 2020/091.
dc.identifier.doi10.3390/app15020688
dc.identifier.issn2076-3417
dc.identifier.issue2
dc.identifier.scopus2-s2.0-85215821024
dc.identifier.scopusqualityQ1
dc.identifier.urihttps://doi.org/10.3390/app15020688
dc.identifier.urihttps://hdl.handle.net/20.500.12462/21474
dc.identifier.volume15
dc.identifier.wosWOS:001403933300001
dc.identifier.wosqualityN/A
dc.indekslendigikaynakWeb of Science
dc.indekslendigikaynakScopus
dc.language.isoen
dc.publisherMDPI
dc.relation.ispartofApplied Sciences-Basel
dc.relation.publicationcategoryMakale - Uluslararası Hakemli Dergi - Kurum Öğretim Elemanı
dc.rightsinfo:eu-repo/semantics/openAccess
dc.snmzKA_WOS_20250703
dc.subjectaccelerator
dc.subjectFPGA
dc.subjectQNN
dc.subjectdeep learning
dc.subjectFINN
dc.titleFPGA-QNN: Quantized Neural Network Hardware Acceleration on FPGAs
dc.typeArticle