Automatic Image Annotation Using the Block Principal Pivoting Method
Subject areas: Electrical and Computer Engineering
Hazem Al-Rikabi 1, Nasrin Soufi 2, Hadi Sadoghi Yazdi 3*, Amir Hossein Taherinia 4
1 - Ferdowsi University of Mashhad
2 - Ferdowsi University of Mashhad
3 - Ferdowsi University of Mashhad
4 - Ferdowsi University of Mashhad
Keywords: non-negative matrix factorization, block principal pivoting, k-nearest neighbors, image annotation
Abstract:
Automatic image annotation systems describe the content of images by assigning tags to them. The aim of this research is to improve the accuracy and speed of an image annotation system. Recently, given the ever-growing number of images, annotation has been carried out on the bases of images rather than on the images themselves. One such recent approach applies the non-negative matrix factorization (NMF) algorithm to features extracted from the images. To increase the speed and efficiency of the annotation system, the proposed method uses, for the first time in annotation, a method called block principal pivoting to solve NMF. This method can add a new class of data to its knowledge online, learns that knowledge in a compact form, and can train on incoming data without reprocessing it; as a result, it outperforms previously proposed methods for solving NMF. In the training phase, the coefficient matrix and the basis of the input images are obtained with block principal pivoting. In the test phase, the membership coefficient of the test image for each class of training images is computed from the features extracted from the image and the coefficients obtained in training. This coefficient then improves accuracy when the training images are searched to assign tags to the test image. The search is performed with the KNN method on the image bases. The proposed method was evaluated on two datasets, Corel5K and real animal data (taken from 500px), and compared with existing methods. It reached a precision of 50.20 on Corel5K and 62.89 on the real data, a considerable improvement in precision.
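The training and test phases described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it factors a feature matrix V (features x images) as V ≈ WH with Lee-Seung multiplicative updates as a simple stand-in for block principal pivoting, and the function names (`nmf`, `project`, `knn_tags`), the inverse-distance vote weighting, and the toy data are all hypothetical.

```python
import math
import random
from collections import defaultdict


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frob_error(V, W, H):
    """Squared Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra))


def update_H(V, W, H, eps=1e-9):
    """One multiplicative update of H with W fixed: H <- H * (W^T V) / (W^T W H)."""
    Wt = transpose(W)
    num = matmul(Wt, V)
    den = matmul(matmul(Wt, W), H)
    return [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
            for rh, rn, rd in zip(H, num, den)]


def nmf(V, rank, iters=200, seed=0):
    """Training phase: nonnegative basis W and coefficients H with V ~ WH.

    Assumption: multiplicative updates replace block principal pivoting here.
    """
    rng = random.Random(seed)
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in V]
    H = [[rng.uniform(0.1, 1.0) for _ in V[0]] for _ in range(rank)]
    for _ in range(iters):
        H = update_H(V, W, H)
        # The W update is the H update applied to the transposed problem V^T ~ H^T W^T.
        W = transpose(update_H(transpose(V), transpose(H), transpose(W)))
    return W, H


def project(v, W, iters=200):
    """Test phase: nonnegative coefficients of one feature vector v on the fixed basis W."""
    H = [[1.0] for _ in W[0]]
    V = [[x] for x in v]
    for _ in range(iters):
        H = update_H(V, W, H)
    return [row[0] for row in H]


def knn_tags(h_test, train_coeffs, train_tags, k=3, n_tags=2):
    """Transfer tags from the k training images nearest to h_test in coefficient space."""
    nearest = sorted((math.dist(h_test, h), i) for i, h in enumerate(train_coeffs))[:k]
    votes = defaultdict(float)
    for d, i in nearest:
        for tag in train_tags[i]:
            votes[tag] += 1.0 / (1.0 + d)  # hypothetical weighting: closer neighbours count more
    return sorted(votes, key=lambda t: (-votes[t], t))[:n_tags]


if __name__ == "__main__":
    # Hypothetical toy data: 4 features x 6 training images, one tag list per image.
    V = [[5.0, 4.0, 6.0, 0.5, 0.2, 0.4],
         [4.0, 5.0, 5.0, 0.3, 0.5, 0.2],
         [0.4, 0.2, 0.5, 5.0, 4.0, 5.0],
         [0.3, 0.5, 0.2, 4.0, 6.0, 5.0]]
    tags = [["tiger", "grass"], ["tiger", "tree"], ["tiger", "grass"],
            ["sea", "sky"], ["sea", "boat"], ["sea", "sky"]]
    W, H = nmf(V, rank=2)
    h = project([5.0, 5.0, 0.3, 0.3], W)
    print(knn_tags(h, transpose(H), tags))
```

Searching in the low-dimensional coefficient space rather than the raw feature space is what keeps the nearest-neighbour step cheap. Replacing `update_H` with a non-negative least squares solver based on block principal pivoting would bring the sketch closer to the method the abstract describes.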