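Note by Lin Ziyu: This paper was published while the author was pursuing a doctoral degree at the Database Laboratory of Peking University.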
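[LYWST08] Lin Ziyu, Yang Dongqing, Song Guojie, Wang Tengjiao, Tang Shiwei. Materialized Views Selection of Multi-Dimensional Data in Real-Time Active Data Warehouses. Journal of Software (《软件学报》), Vol. 19(2), Feb. 2008, pp. 301-313.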
Materialized Views Selection of Multi-Dimensional Data in Real-Time Active Data Warehouses
LIN Zi-Yu1, YANG Dong-Qing1, SONG Guo-Jie2+, WANG Teng-Jiao1, TANG Shi-Wei2
1(School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China)
2(State Key Laboratory of Machine Perception, Peking University, Beijing 100871, China)
+ Corresponding author: Phn: +86-10-62755440, E-mail: gjsong@pku.edu.cn, http://db.pku.edu.cn
Lin ZY, Yang DQ, Song GJ, Wang TJ, Tang SW. Materialized views selection of multi-dimensional data in real-time active data warehouses. Journal of Software, 2008, 19(2): 301-313. http://www.jos.org.cn/1000-9825/19/301.htm
Abstract: This paper applies data mining to the log of the active decision engine to discover the CUBE usage patterns of analysis rules, which serve as an important basis for selecting materialized views of multi-dimensional data. On this basis, a 3A probability model is designed, and a greedy view selection algorithm called PGreedy (probability greedy) is proposed, which takes into account the access probability distribution of CUBEs. A dynamic view adjustment algorithm that incorporates a view keeping rule is also presented. Experimental results show that PGreedy achieves better performance than the BPUS (benefit per unit space) algorithm in a real-time active data warehouse environment.
Key words: view selection; materialized view; data warehouse; active decision engine; analysis rule; OLAP (online analytical processing)