A Pseudo-Covariance Wavelet-Based Feature Extraction Method for Biomarker Discovery from Proteomic Patterns of Ovarian Cancer
Subject areas: Electrical and Computer Engineering
Hossein Montazery Kordy 1 *, Mohammad Hossein Miranbaygi 2, Mohammad Hassan Moradi 3
1 - Tarbiat Modares University
2 - Tarbiat Modares University
3 - Amirkabir University of Technology
Keywords: proteomics, pattern recognition, discrete wavelet transform, pseudo-covariance weight function, biomarker
Abstract:
Pathological changes within a vital organ are reflected as proteomic patterns in the blood. Mass spectrometry is a powerful measurement tool for generating proteomic patterns from serum. The resulting profiles are high-dimensional, highly correlated data in which the features of scientific interest are the peaks. Given these properties, an appropriate analysis method such as the wavelet transform is needed. In this study, we propose a new feature extraction method for mass spectrometry data, based on the discrete wavelet transform and pseudo-covariance feature selection, that reduces both the dimension of the data and the correlation within it. The algorithm was applied to ovarian cancer datasets obtained from the National Cancer Institute of the USA, and a set of proteins was extracted from the reconstructed mass spectra of each dataset as potential biomarkers. The selected biomarkers distinguished ovarian cancer patients from non-cancer subjects with high accuracy under standard diagnostic evaluation criteria. Using different classification algorithms, the approach yielded an accuracy of 98%, a specificity of 97%, and a sensitivity of 98%.
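The three figures reported above are standard diagnostic measures computed from a classifier's confusion matrix. The following is a minimal sketch of those definitions, not the authors' evaluation code; the function name `diagnostic_metrics` and the counts in the usage line are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) as fractions in [0, 1].

    tp: cancer samples classified as cancer (true positives)
    tn: non-cancer samples classified as non-cancer (true negatives)
    fp: non-cancer samples classified as cancer (false positives)
    fn: cancer samples classified as non-cancer (false negatives)
    """
    positives = tp + fn   # all samples that truly have the disease
    negatives = tn + fp   # all samples that truly do not
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    accuracy = (tp + tn) / (positives + negatives)  # overall fraction correct
    sensitivity = tp / positives                    # fraction of patients detected
    specificity = tn / negatives                    # fraction of controls cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts, chosen only to illustrate the call.
acc, sens, spec = diagnostic_metrics(tp=49, tn=48, fp=2, fn=1)
print(f"accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%}")
```

Sensitivity and specificity are reported alongside accuracy because accuracy alone can look high on an unbalanced dataset even when one class is classified poorly.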