References

1 
K. He, X. Zhang, S. Ren, and J. Sun, ``Deep residual learning for image recognition,'' Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770-778, 2016.
2 
M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, ``MobileNetV2: Inverted residuals and linear bottlenecks,'' Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4510-4520, 2018.
3 
M. Tan and Q. Le, ``EfficientNet: Rethinking model scaling for convolutional neural networks,'' Proc. of International Conference on Machine Learning, pp. 6105-6114, 2019.
4 
S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, ``Aggregated residual transformations for deep neural networks,'' Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1492-1500, 2017.
5 
H. Zhang, C. Wu, Z. Zhang, Y. Zhu, H. Lin, and Z. Zhang, ``ResNeSt: Split-attention networks,'' Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2736-2746, 2022.
6 
N. J. Kim, J. Lee, and H. Kim, ``HyQ: Hardware-friendly post-training quantization for CNN-transformer hybrid networks,'' Proc. of International Joint Conference on Artificial Intelligence (IJCAI), pp. 4291-4299, 2024.
7 
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, ``An image is worth $16\times16$ words: Transformers for image recognition at scale,'' arXiv preprint arXiv:2010.11929, 2020.
8 
W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, ``SSD: Single shot multibox detector,'' Proc. of European Conference on Computer Vision, pp. 21-37, 2016.
9 
Y. Cai, Z. Wang, Z. Luo, B. Yin, A. Du, H. Wang, X. Zhang, X. Zhou, E. Zhou, and J. Sun, ``Learning delicate local representations for multi-person pose estimation,'' Proc. of European Conference on Computer Vision, pp. 455-472, 2020.
10 
S. I. Lee and H. Kim, ``GaussianMask: Uncertainty-aware instance segmentation based on Gaussian modeling,'' Proc. of International Conference on Pattern Recognition (ICPR), pp. 3851-3857, 2022.
11 
R. Singh and S. S. Gill, ``Edge AI: A survey,'' Internet of Things and Cyber-Physical Systems, vol. 3, pp. 71-92, 2023.
12 
D. T. Nguyen, T. N. Nguyen, H. Kim, and H.-J. Lee, ``A high-throughput and power-efficient FPGA implementation of YOLO CNN for object detection,'' IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 27, no. 8, pp. 1861-1873, 2019.
13 
Y. Ma, Y. Cao, S. Vrudhula, and J.-s. Seo, ``Optimizing the convolution operation to accelerate deep neural networks on FPGA,'' IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 26, no. 7, pp. 1354-1367, 2018.
14 
R. Zhao, H.-C. Ng, W. Luk, and X. Niu, ``Towards efficient convolutional neural network for domain-specific applications on FPGA,'' Proc. of International Conference on Field Programmable Logic and Applications (FPL), pp. 147-1477, 2018.
15 
Y. Chen, K. Zhang, C. Gong, C. Hao, X. Zhang, and T. Li, ``T-DLA: An open-source deep learning accelerator for ternarized DNN models on embedded FPGA,'' Proc. of IEEE Computer Society Annual Symposium on VLSI (ISVLSI), pp. 13-18, 2019.
16 
Q. Xiao and Y. Liang, ``Zac: Towards automatic optimization and deployment of quantized deep neural networks on embedded devices,'' Proc. of IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pp. 1-6, 2019.
17 
S. Kala and S. Nalesh, ``Efficient CNN accelerator on FPGA,'' IETE Journal of Research, vol. 66, no. 6, pp. 733-740, 2020.
18 
J. Wen, Y. Ma, and Z. Wang, ``An efficient FPGA accelerator optimized for high throughput sparse CNN inference,'' Proc. of IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), pp. 165-168, 2020.
19 
X. Xie, J. Lin, Z. Wang, and J. Wei, ``An efficient and flexible accelerator design for sparse convolutional neural networks,'' IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 68, no. 7, pp. 2936-2949, 2021.
20 
Y. Meng, C. Yang, S. Xiang, J. Wang, K. Mei, and L. Geng, ``An efficient CNN accelerator achieving high PE utilization using a dense-/sparse-aware redundancy reduction method and data-index decoupling workflow,'' IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 31, no. 10, pp. 1537-1550, 2023.
21 
O. Weng, G. Marcano, V. Loncar, A. Khodamoradi, G. Abarajithan, N. Sheybani, A. Meza, F. Koushanfar, K. Denolf, J. M. Duarte, and R. Kastner, ``Tailor: Altering skip connections for resource-efficient inference,'' ACM Transactions on Reconfigurable Technology and Systems, vol. 17, no. 1, pp. 1-23, 2024.
22 
M. Nagel, R. A. Amjad, M. Van Baalen, C. Louizos, and T. Blankevoort, ``Up or down? Adaptive rounding for post-training quantization,'' Proc. of International Conference on Machine Learning, pp. 7197-7206, 2020.
23 
D. T. Nguyen, H. Kim, and H.-J. Lee, ``Layer-specific optimization for mixed data flow with mixed precision in FPGA design for CNN-based object detectors,'' IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 6, pp. 2450-2464, 2020.
24 
B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, and A. Howard, ``Quantization and training of neural networks for efficient integer-arithmetic-only inference,'' Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2704-2713, 2018.
25 
S. Ki, J. Park, and H. Kim, ``Dedicated FPGA implementation of the Gaussian TinyYOLOv3 accelerator,'' IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 70, no. 10, pp. 3882-3886, 2023.