Dimensionality Reduction of the CDF Steganalysis Method Using a Graph-Theory-Based Feature Selection Method
Subject areas: Electrical and Computer Engineering
Saeid Azadifar 1, Seyed Hossein Khasteh 2 *, Mohammad Hadi Edrisi 3
1 - K. N. Toosi University of Technology
2 - K. N. Toosi University of Technology
3 - University of Isfahan
Keywords: steganalysis, steganography, feature selection, dimensionality reduction
Abstract:
Steganalysis is the science of detecting the presence of hidden data in a cover medium; its purpose is to keep steganography methods from achieving their goals. In steganography, new ideas are evaluated by subjecting them to known steganalysis attacks and comparing the results with those of existing methods. One of the best-known steganalysis methods is the CDF method, which is used in this research. A major challenge in image steganalysis is the large number of features extracted for the task. High-dimensional data sets degrade steganalysis performance in two ways: the volume of computation grows with the number of dimensions, and a model built on high-dimensional data generalizes poorly and is more prone to overfitting. Reducing the dimensionality of the problem can therefore both lower the computational complexity and improve steganalysis performance. This paper combines the concept of the maximum weighted clique with an edge centrality measure, and takes the relevance of each feature into account, to select influential features with minimum redundancy as the final feature set. Simulation results on the SPAM and CC-PEV data sets show that the proposed method performs well, reaching an accuracy of about 96% in detecting data embedding in images, and that it is more accurate than previously known methods.
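The selection idea summarized in the abstract can be illustrated with a rough sketch. It is not the paper's algorithm. It assumes the two-class Fisher score as the relevance weight of each feature, absolute Pearson correlation as the redundancy measure, and a simple greedy heuristic in place of an exact maximum weighted clique solver. The edge centrality step is omitted entirely. The names `select_features` and `max_abs_corr` are hypothetical.

```python
from itertools import combinations
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def fisher_score(values, labels):
    """Two-class Fisher score (assumed relevance measure): squared gap between
    class means divided by the sum of the within-class variances."""
    cover = [v for v, y in zip(values, labels) if y == 0]
    stego = [v for v, y in zip(values, labels) if y == 1]
    spread = pvariance(cover) + pvariance(stego)
    gap = (mean(cover) - mean(stego)) ** 2
    return gap / spread if spread else 0.0


def select_features(columns, labels, max_abs_corr=0.5):
    """Greedy heuristic for a heavy clique in a low-redundancy feature graph.

    columns: one list of values per feature; labels: 0 = cover, 1 = stego.
    Returns the sorted indices of the selected features.
    """
    n = len(columns)
    weight = [fisher_score(col, labels) for col in columns]

    # Connect two features only when they are weakly correlated, so that
    # any clique in this graph is a set of mutually non-redundant features.
    adjacent = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if abs(pearson(columns[i], columns[j])) < max_abs_corr:
            adjacent[i].add(j)
            adjacent[j].add(i)

    # Repeatedly take the most relevant remaining feature, then keep only
    # candidates adjacent to everything chosen so far (ties go to the lower index).
    clique = []
    candidates = set(range(n))
    while candidates:
        best = max(candidates, key=lambda i: (weight[i], -i))
        clique.append(best)
        candidates &= adjacent[best]
    return sorted(clique)
```

In this sketch, a feature that is strongly correlated with an already selected, more relevant feature is dropped, while a weakly correlated feature is kept even if its own relevance is low. A fuller version would need additional filtering to discard such low-relevance features.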