A Step Toward All-Optical Deep Neural Networks: Employing an Optical Nonlinear Unit
Subject areas: Electrical and Computer Engineering
Aida Ebrahimi Dehghanpour 1, Somayyeh Koohi 2 *
1 - Sharif University of Technology, Department of Computer Engineering
2 - Sharif University of Technology, Department of Computer Engineering
Keywords: optical processing, optical activation function, high speed, convolutional neural network, optical convolutional neural network
Abstract:
In recent years, optical neural networks have attracted considerable attention because of their high speed and low power consumption. These networks still face many limitations, however, one of which is the implementation of the nonlinear layer. This paper investigates the implementation of a nonlinear unit for optical convolutional neural networks, with the aim of realizing a deep all-optical convolutional neural network whose accuracy is comparable to that of electrical networks while offering higher speed and lower power consumption. First, different methods of implementing an optical nonlinear unit are reviewed. Next, the effect on network accuracy of using a saturable absorber as the nonlinear unit in different layers of a CNN is examined. Finally, a new and simple method is proposed to prevent the loss of accuracy in optical neural networks that use a saturable absorber as the nonlinear activation function.
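To make the idea of a saturable absorber acting as an activation function concrete, the following is a minimal sketch under stated assumptions. It uses a common idealized model in which the absorbed fraction of light falls as the input intensity rises; this model, the function name saturable_absorber, and the parameters alpha0 and i_sat are illustrative assumptions and are not taken from the paper, which may use a different transfer function.

```python
def saturable_absorber(intensity, alpha0=0.9, i_sat=1.0):
    """Output intensity of an idealized saturable absorber.

    Assumed model (hypothetical, not the paper's): the absorbed fraction
    is alpha0 / (1 + I / I_sat), so weak light is mostly absorbed and
    strong light passes almost unchanged.
    """
    if intensity < 0:
        raise ValueError("optical intensity cannot be negative")
    absorbed_fraction = alpha0 / (1.0 + intensity / i_sat)
    return intensity * (1.0 - absorbed_fraction)


# Compare the transmitted fraction for a weak and a strong input.
weak_in, strong_in = 0.1, 10.0
weak_out = saturable_absorber(weak_in)
strong_out = saturable_absorber(strong_in)
weak_transmission = weak_out / weak_in        # low: most light is absorbed
strong_transmission = strong_out / strong_in  # high: the absorber saturates
```

In this sketch the response suppresses low intensities and becomes nearly linear at high intensities. Because optical intensity is never negative, such a curve is not identical to the activation functions used in electrical networks, which is one reason replacing them with a saturable absorber can affect accuracy.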