Identifying Cancer Driver Genes in a Gene Network Using a Feedforward Neural Network Architecture
Subject areas: Electrical and Computer Engineering
Mostafa Akhavan-Safar 1 *, Abbasali Rezaei 2
1 - Faculty of Computer Engineering and Information Technology, Payame Noor University
2 - Faculty of Computer Engineering and Information Technology, Payame Noor University
Keywords: deep learning, cancer driver genes, feedforward neural network, breast cancer
Abstract:
Identifying cancer driver genes, the genes that initiate cancer, is an important research topic in oncology and bioinformatics. Once a driver gene is mutated, the effect of the mutation propagates to other genes through protein-protein interactions, disrupting cell function and leading to disease and cancer. Many methods have been proposed to predict and classify cancer driver genes, but most rely on genomic and transcriptomic data and therefore achieve a low harmonic mean (F-measure). Work to improve accuracy is ongoing, and network-based and bioinformatics methods have been brought to bear on the problem. In this study, we propose an approach that does not rely on mutation data: it uses network methods for feature extraction and a three-layer feedforward neural network for gene classification. First, the biological network of interest, the breast cancer transcriptional regulatory network, was constructed. Next, the features of each gene were extracted as vectors. Finally, these vectors were passed to a feedforward neural network for classification. The results show that methods based on multilayer neural networks can improve accuracy and harmonic mean and outperform other computational methods.
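The pipeline described in the abstract (build a regulatory network, extract a feature vector per gene, classify with a feedforward network, report accuracy and harmonic mean) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy edge list, the gene names and labels, the three features (out-degree, in-degree, two-hop reach), the hidden-layer size, the sigmoid activations, and the training settings. The abstract does not state which features or hyperparameters the authors used.

```python
import math
import random

# Hypothetical toy regulatory network: (regulator, target) pairs.
EDGES = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G4"),
         ("G3", "G5"), ("G5", "G6"), ("G4", "G6")]
# Hypothetical labels for illustration only: 1 = driver, 0 = non-driver.
LABELS = {"G1": 1, "G2": 1, "G3": 1, "G4": 0, "G5": 0, "G6": 0}


def extract_features(edges):
    """Return {gene: [out_degree, in_degree, two_hop_reach]} from an edge list."""
    targets, in_deg = {}, {}
    for src, dst in edges:
        targets.setdefault(src, set()).add(dst)
        targets.setdefault(dst, set())
        in_deg[dst] = in_deg.get(dst, 0) + 1
    features = {}
    for gene, direct in targets.items():
        reach = set(direct)
        for t in direct:
            reach |= targets[t]  # add the targets of each direct target
        reach.discard(gene)
        features[gene] = [float(len(direct)), float(in_deg.get(gene, 0)),
                          float(len(reach))]
    return features


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


class FeedForwardNet:
    """Three layers: input -> one sigmoid hidden layer -> one sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def train(self, xs, ys, epochs=2000, lr=0.1):
        """Plain stochastic gradient descent on binary cross-entropy."""
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                hidden, out = self.forward(x)
                delta_out = out - y  # dLoss/d(output pre-activation)
                for j, h in enumerate(hidden):
                    delta_h = delta_out * self.w2[j] * h * (1.0 - h)
                    self.w2[j] -= lr * delta_out * h
                    self.b1[j] -= lr * delta_h
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * delta_h * xi
                self.b2 -= lr * delta_out

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0


def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F-measure (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return accuracy, f1


features = extract_features(EDGES)
genes = sorted(features)
xs = [features[g] for g in genes]
ys = [LABELS[g] for g in genes]

net = FeedForwardNet(n_in=3, n_hidden=4, seed=0)
net.train(xs, ys)
predictions = [net.predict(x) for x in xs]
print("accuracy, F-measure:", accuracy_and_f1(ys, predictions))
```

This toy run scores the network on the same six genes it was trained on, which only demonstrates the mechanics. A real evaluation would hold out genes (for example with cross-validation) and compare predictions against a curated driver gene list.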