An Ontology-Based Near-Real-Time Data Warehouse Architecture
Subject areas: Electrical and Computer Engineering
Seyed Mostafa Shafaei 1, Seyed Majid Shafaei 2
1 - Shahid Rajaee Teacher Training University
2 - Islamic Azad University, Central Tehran Branch
Keywords: near-real-time data warehouse, ontology, external data, comparative query, sequential query, materialized view
Abstract:
A data warehouse does not provide support for external data that become necessary dynamically, after the warehouse has been designed and built. To carry out effective analyses, the analyst therefore needs to find correlations between external data and data warehouse data, and in some cases needs to compare the two with each other. The analyst is also forced to repeat earlier work in recurring situations, including defining terminology, creating measures, and making comparisons. To address these problems, this paper proposes an ontology-based near-real-time data warehouse architecture. It also proposes an algorithm that reduces the response time of users' analytical queries by using materialized views and parallel processing. Case studies were carried out to show how correlation between external data and data warehouse data is established, and the results show that such correlation is discovered. In the experiments, using materialized views under both the direct and the parent approach in the data warehouse of the existing architecture reduces the response time of users' sequential, comparative, and parallel combined queries.
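The abstract does not spell out the query-answering algorithm, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that the "direct" approach means answering a query from a materialized view with exactly the requested grouping, and that the "parent" approach means rolling up a finer-grained view whose dimensions include the requested ones. The fact table, the dimension names, and the `ViewStore` class are all hypothetical, and independent queries are run in parallel with a thread pool.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Toy fact table (hypothetical): (region, product, year, sales).
FACTS = [
    ("north", "tea", 2015, 10),
    ("north", "tea", 2016, 12),
    ("north", "rice", 2015, 7),
    ("south", "tea", 2015, 5),
    ("south", "rice", 2016, 9),
]
DIMS = ("region", "product", "year")


def aggregate(rows, src_dims, dims):
    """Sum the measure (last column) of rows, grouped by dims."""
    idx = [src_dims.index(d) for d in dims]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[-1]
    return dict(out)


class ViewStore:
    """Hypothetical store of materialized aggregate views."""

    def __init__(self):
        self.views = {}  # tuple of dims -> {group key: total}

    def materialize(self, dims):
        self.views[tuple(dims)] = aggregate(FACTS, DIMS, dims)

    def answer(self, dims):
        dims = tuple(dims)
        # Assumed "direct" approach: a view with exactly this grouping exists.
        if dims in self.views:
            return "direct", self.views[dims]
        # Assumed "parent" approach: roll up the smallest finer-grained view.
        # This is valid here because SUM is a distributive aggregate.
        parents = [v for v in self.views if set(dims) <= set(v)]
        if parents:
            parent = min(parents, key=lambda v: len(self.views[v]))
            rows = [key + (total,) for key, total in self.views[parent].items()]
            return "parent", aggregate(rows, parent, dims)
        # Fallback: scan the base fact table.
        return "base", aggregate(FACTS, DIMS, dims)


def run_parallel(store, queries):
    """Answer independent queries concurrently; results keep query order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(store.answer, queries))


store = ViewStore()
store.materialize(("region", "product"))
results = run_parallel(store, [("region", "product"), ("region",), ("year",)])
for source, table in results:
    print(source, table)
```

In this sketch the first query is served directly, the second is rolled up from the `(region, product)` view, and the third falls back to the fact table because no stored view contains `year`. Picking the parent with the fewest rows is one simple heuristic for keeping the roll-up cheap; a real system would also have to keep the views fresh as near-real-time data arrive.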