Designing and Collecting a Speech Dataset as the First Step Toward Localizing Intelligent Autism Diagnosis in Iranian Children
Subject area: Electrical and Computer Engineering
Maryam Alizadeh 1, Shima Tabibian 2 *
1 - Cyberspace Research Institute, Shahid Beheshti University
2 - Cyberspace Research Institute, Shahid Beheshti University
Keywords: autism diagnosis, speech processing, machine learning, speech dataset, children, Persian language
Abstract:
Autism spectrum disorder is a developmental disorder that manifests itself through symptoms such as an inability to engage in social communication. Accordingly, the most prominent sign in individuals with autism is impaired speech. The first part of this paper reviews studies on the automatic diagnosis of autism based on processing the speech of individuals suspected of having the disorder. According to this review, the main speech-processing approaches for diagnosing autism fall into two groups. The first group detects individuals with autism by processing the subjects' answers or emotions in response to the examiner's questions or stories. The second group distinguishes individuals with autism from typically developing individuals based on the accuracy with which automatic speech recognition systems recognize their spoken utterances. Despite the large body of research conducted in this area outside Iran, few studies have been carried out inside Iran, mainly because no rich dataset exists that meets the requirements of speech-processing-based autism diagnosis. In the second part of this paper, we describe the design, collection, and evaluation of a speaker-independent speech dataset for autism diagnosis in Iranian children as the first step toward localizing this field.
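The second group of approaches can be illustrated with a small, self-contained sketch (not the authors' implementation): given the prompt a child was asked to read and the hypothesis returned by an automatic speech recognizer, compute a per-child word error rate (WER) and flag children whose recognition accuracy falls outside an assumed screening range. The speaker IDs, transcripts, and the 0.35 WER threshold below are hypothetical placeholders, used only so the example runs on its own.

```python
# Hypothetical sketch of the second group of approaches described above:
# children whose utterances are recognized poorly by an ASR system
# (high word error rate) are flagged for further autism screening.
# Speaker IDs, transcripts, and the threshold are illustrative only.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER via Levenshtein distance over word tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)


# Placeholder data: (speaker_id, prompt read aloud, ASR hypothesis).
samples = [
    ("child_01", "the cat sat on the mat", "the cat sat on the mat"),
    ("child_02", "the cat sat on the mat", "a hat sat the mat"),
]

WER_THRESHOLD = 0.35  # assumed screening threshold, not taken from the paper

for speaker, ref, hyp in samples:
    wer = word_error_rate(ref, hyp)
    flag = "flag for screening" if wer > WER_THRESHOLD else "typical range"
    print(f"{speaker}: WER = {wer:.2f} -> {flag}")
```

In a real pipeline the hypotheses would come from an ASR system evaluated on recordings such as those in the collected dataset; here they are hard-coded strings so the sketch stays self-contained.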