第12章 基于闪存的键值存储

大数据之门

《闪存数据库概念与技术》

厦门大学数据库实验室  林子雨  编著

本网页是第12章内容,全书内容请点击这里访问本书官网,官网提供本书整本电子书PDF下载

第12章 基于闪存的键值存储

本章内容首先介绍基于闪存的键值存储的应用领域,然后介绍相关的研究。

12.1    基于闪存的键值存储的应用领域

文献[DebnathSL10]中列举了键值存储的两个重要的实际应用领域:多人游戏和重复数据删除。

1)多人游戏:目前网络在线多人游戏产业非常繁荣,吸引了大量的玩家。多人游戏中,要求处于不同地理位置的多个玩家,能够及时交互信息,这就需要服务器能够及时获得、存储和处理各个玩家的游戏状态信息。为了保证这些玩家信息处理的实时性,就要求存储这些数据的存储设备具有很高的访问速度。传统的磁盘的读写延迟比较高,无法有效满足这种需求,速度较快的RAM产品价格太高,无法大量使用。因此,性价比较高的闪存,就成为可以满足这类应用需求的首选产品。

2)重复数据删除:在云存储时代,百度网盘等云存储产品越来越受到网民的欢迎,每天都有大量的用户把相关数据传输到云存储中。另外,对于企业而言,为了保证数据的可用性和可靠性,也都建立了数据备份制度。无论是个人用户数据还是企业用户数据,都可能存在重复存储的可能。在当今数据大爆炸的时代,重复数据删除的作用就尤为凸显,可以帮助企业减少重复数据存储,节约存储空间,降低企业开销。重复数据删除的方法就是,把文件分成多个块,然后对每个块使用SHA-1哈希,从而判定两个块是否包含相同的数据。为了加快速度,一般都是放在RAM中运行重复数据删除算法。但是,随着数据的暴增,RAM有限的空间已经无法容纳如此多的数据,因此,就必须把数据保存到磁盘上,通过建立索引来快速获取磁盘中的数据。但是,磁盘的读写延迟较大,因此,如果采用性价比高的闪存来替代磁盘的角色,可以获得更好的重复数据删除性能。

12.2    相关研究

Debnath等人于2010年和2011年先后提出了两种相似的基于闪存的键值存储方案FlashStore[DebnathSL10]和SkimpyStash[DebnathSL11],两种方案很大的区别就在于,FlashStore用于磁盘和内存之间的中间存储,而SkimpyStash则可以直接作为底层的数据存储。

12.2.1      FlashStore

FlashStore[DebnathSL10]在闪存中以日志结构的方式来存储“键值对”,从而可以充分获得更好的顺序写性能。同时,采用常驻内存的哈希表来为这些“键值对”提供索引,当哈希表出现冲突时,采用一种“布谷鸟哈希”的变体来解决冲突。如图[FlashStore]所示,FlashStore包含了如下组件:

(1)RAM写缓冲区:所有更新操作都首先进入缓冲区保存,只有缓冲区的数据量可以正好填满一个闪存页时,才一次性把缓冲区的数据刷新到闪存中。

(2)RAM哈希表索引:这是为闪存中所存储的“键值对”建立的索引,通过这个索引,只需要一次闪存读操作,就可以获得闪存中的数据。采用了一种“布谷鸟哈希”的变体来解决哈希表冲突。

(3)RAM读缓存:保存最近读取过的“键值对”,并且采用最近最少使用策略进行替换。

(4)近期位向量:可以记录“哈希表索引”中近期有哪些条目被访问过。

(5)布隆过滤器[BroderM02]:用来判断一个“键值对”是否已经被保存到磁盘中。

(6)闪存:用来存储经常使用的“键值对”,闪存页被组织成循环链表的形式,系统会周期性地执行“回收”过程,从链表中回收无效页和不经常使用的页(这时会用到近期位向量)。

(7)磁盘:从闪存中驱逐出来的“键值对”会被保存到磁盘中,一般都是不常用的数据。

图[FlashStore]FlashStore的体系架构

图[FlashStore] FlashStore的体系架构

FlashStore采用了布隆过滤器来提升读写过程的速度,因此,这里简单介绍一下布隆过滤器的原理。布隆过滤器[BaiduBaike]是由布隆在1970年提出的,它实际上是一个很长的二进制向量和一系列随机映射函数。布隆过滤器可以用于检索一个元素是否在一个集合中,它的优点是空间效率和查询时间都远远超过一般的算法,缺点是有一定的误识别率。布隆过滤器采用的是哈希函数的方法,将一个元素映射到一个长度为m的阵列上的一个点,当这个点是 1 时,那么这个元素在集合内,反之则不在集合内。这个方法的缺点是,当检测的元素很多的时候可能有冲突,解决方法就是使用 k 个哈希函数对应 k 个点,如果所有点都是 1 的话,那么元素在集合内,如果有 0 的话,则判断元素不在集合内。

下面简单介绍FlashStore的读写操作过程:

(1)读操作过程:当使用一个键去读取数据时,首先到RAM读缓存中查找,如果找不到,就继续到RAM写缓冲区去查找,如果还是找不到,就查找RAM哈希表索引,如果仍然找不到,就利用布隆过滤器来判断这个键是否存在于磁盘中,如果布隆过滤器判断为存在,则到磁盘中寻找该键,如果判断为不存在,则返回空值。在上述过程中,如果在RAM读缓存之外的其他地方找到了该键,就把该键更新到RAM读缓存中,同时也会发生“近期位向量”的更新,从而记录最近被访问过的哈希表索引条目。

(2)写操作过程:首先把一个“键值对”写入到RAM写缓冲区,如果缓冲区中已经存在一个旧版本的数据,就让旧版本失效。如果发现缓冲区已满一页,就把数据刷新到闪存中,并更新RAM哈希表索引。

FlashStore的缺点是:(1)使用变种的布谷鸟哈希函数时,需要人为设定所使用的哈希函数的数目,这个值不同,所取得的性能也不同,而对于不同的负载类型,键值分布区间不同,为达到最好性能所需要设定的哈希函数的数目也是不同的。(2)随着键的数量的增加,布隆过滤器的误识别率也会增加,这是个不容忽视的问题。

下面进一步分析一下布隆过滤器的特性对于FlashStore的性能影响。对于布隆过滤器而言,在采用多个哈希函数时,如果至少有一个函数判断元素不在集合中,那肯定就不在。如果它们都说元素在集合中,却有一定的误判概率,也就是说实际上元素可能并不存在。这个特性对FlashStore有着很大的影响。

下面我们分析四种不同的读写操作情形。对于FlashStore的读操作而言,存在下面两种情形:

(1)情形一:如果布隆过滤器判断某个键不存在于磁盘中时,这个键就一定不会已经保存在磁盘中,所以返回空值;如果不采用布隆过滤器,就必须到磁盘进行查找,所以,在这种情形下,布隆过滤器可以帮助节省搜索磁盘的开销;

(2)情形二:如果布隆过滤器判断某个键存在于磁盘中,这个键未必真的已经保存在磁盘中,可能是误判。不过,不管是否误判,FlashStore都会到磁盘中去找这个键,如果找不到该键,就返回空值,如果找到了,就返回键值对。所以,在这种情形下,布隆过滤器不仅没有节省开销,而且增加了一个判断过程的开销。

对于FlashStore的写操作而言,也存在下面两种情形:

(1)情形三:如果布隆过滤器判断某个键不存在于磁盘中时,这个键就一定不会已经保存在磁盘中,所以,写操作必须把该键值对写入到磁盘中。如果不采用布隆过滤器,就必须先到磁盘查找该键是否存在,如果不存在就写入磁盘。可以看出,采用布隆过滤器以后,可以节省磁盘查找的开销。

(2)情形四:如果布隆过滤器判断某个键存在于磁盘中,这个键未必真的已经保存在磁盘中,可能是误判。不过,不管是否误判,FlashStore都会到磁盘中去找这个键,如果找不到该键,就写入磁盘,如果找到了,就不执行写操作。可以看出,采用过滤器以后,不仅没有节省开销,而且增加了判断过程的开销。

从上面的四种情形的讨论可以看出,情形一和情形三可以节省开销,而情形二和情形四则会增加开销。因此,FlashStore要想取得好的性能,必须保证情形一和情形三所节省开销,一定要大于情形二和情形四所增加的开销。也就是说,当布隆过滤器判断不存在的情形较多时,FlashStore收益会比较大。但是,从作者给出的大量实验结果来看,并没有关于这个方面的详细分析。

12.2.2      SkimpyStash

文献[DebnathSL11]提出了一个基于闪存的键值存储——SkimpyStash,只需要占用很少的RAM空间。如图[SkimpyStash]所示,SkimpyStash在RAM中使用了一个哈希表目录来索引“键值对”,而这些键值对是以日志结构的方式存储在闪存上的。为了确定每个键值对在闪存中的位置,需要为每个键值对设置闪存指针,这会带来大量的RAM空间开销,为了解决这个问题,作者采用如下的方法:(1)使用线性链表解决哈希表冲突,即当有多个键被映射到同一个哈希桶中时,就会被组织成一个链表;(2)把链表存储到闪存上,并且在每个哈希桶中存储一个指向链表头部的指针。

使用哈希函数把键分配到各个哈希桶中,可能会导致每个桶中包含的键的数量不同,有时候甚至有很大的扭曲分布,这就会使得部分哈希桶对应的链表长度过大,增加了平均查找时间。为了让每个哈希桶尽量包含相近数量的键,从而减少与其对应的链表的长度,加快查找速度,作者采用了基于双选的负载均衡策略[Azar94],该策略最初是用于把多个球均匀地投入到不同的箱子中。“双选”策略的基本思想是:每个键都使用两个不同的哈希函数f1f2,产生两个候选的哈希桶,然后,把该键放入到元素个数较少的那个哈希桶中。

但是,在采用双选策略后,又会增加查找过程的闪存读操作次数。因为,现在每个键都有两个可能存放的哈希桶,在查找时,如果在第一个哈希桶中找不到,就要再到第二个哈希桶中查找。在最坏的情况下,闪存读操作的次数会翻倍。为了解决查找延迟的问题,作者进一步采用了布隆过滤器(Bloom Filter)。

         采用布隆过滤器以后,针对某个键的读操作过程如下:使用两个不同哈希函数找到该键可能存在于其中的两个不同的候选桶,布隆过滤器判断认为该键在哪个桶中,则到该桶指向的哈希链表中继续寻找该键。但是,布隆过去器有一定的误识别率,即当它判断认为该键存在于某个哈希桶中时,实际上可能并不存在,这意味着,扫描一遍该桶对应的哈希链表以后,却找不到该键,浪费了查找时间。幸运的是,文献[DebnathSL11]通过大量实验显示,这种误识别率不会超过2%。因此,采用布隆过滤器总体上加快了键的平均查找速度。

图[SkimpyStash]SkimpyStash的体系架构

图[SkimpyStash] SkimpyStash的体系架构

SkimpyStash的缺点是:由于采用布隆过滤器判定一个键是否存在于某个哈希桶中,因此,不可避免地继承了布隆过滤器的缺点,即随着存入哈希桶的元素数量的不断增加,误识别率也会随之增加。

参考文献

[AilamakiDH02]Anastassia Ailamaki, David J. DeWitt, Mark D. Hill: Data page layouts for relational databases on deep memory hierarchies. VLDB J. (VLDB) 11(3):198-215 (2002)

[AgrawalGSDS09]Devesh Agrawal, Deepak Ganesan, Ramesh K. Sitaraman, Yanlei Diao, Shashi Singh: Lazy-Adaptive Tree: An Optimized Index Structure for Flash Devices. PVLDB 2(1):361-372 (2009)

[ArgeH02]L. Arge, K. Hinrichs et al. Efficient bulk operations on dynamic R-trees. In Algorithmica 33(1):104-128, 2002.

[AgrawalPWDMP08]N. Agrawal, V. Prabhakaran, T. Wobber, J. Davis, M. Manasse, and R. Panigrahy. Design Tradeoffs for SSD Performance. In USENIX ATC, 2008.

[AthanassoulisACGS10]Manos Athanassoulis, Anastasia Ailamaki, Shimin Chen, Phillip B. Gibbons, Radu Stoica: Flash in a DBMS: Where and How? IEEE Data Eng. Bull. (DEBU) 33(4):28-34 (2010)

[Azar94]Azar, Y., Broder, A., Karlin, A., Upfal, E. Balanced Allocations. In SIAM Journal on Computing (1994).

[Batory79]Don S. Batory: On Searching Transposed Files. ACM Trans. Database Syst. (TODS) 4(4):531-544 (1979)

[Ban95]A. Ban. Flash File System, Apr. 1995. United States Patent No. 5,404,485.

[Bentley79]J. L. Bentley, “Decomposable searching problems,” Inf. Process. Lett., vol. 8, no. 5, pp. 244–251, 1979.

[BaiduBaike]Bloom Filter. http://baike.baidu.com/view/4526149.htm

[BaiduBaikeSSD]http://baike.baidu.com/view/723957.htm

[BonnetB11]Philippe Bonnet, Luc Bouganim: Flash Device Support for Database Management. CIDR 2011:1-8

[BouganimJB09]Luc Bouganim, Björn Þór Jónsson, Philippe Bonnet: uFLIP: Understanding Flash IO Patterns. CIDR 2009

[BorodinLS87]Allan Borodin, Nathan Linial, Michael E. Saks: An Optimal Online Algorithm for Metrical Task Systems STOC 1987:373-382

[BroderM02]A. Broder and M. Mitzenmacher. Network Applications of Bloom Filters: A Survey. In Internet Mathematics, 2002.

[BlackS89]BLACK, D. L., AND SLEATOR, D. D. Competitive algorithms for replication and migration problems. Tech. Rep. CMU-CS-89-201, Carnegie Mellon University, 1989.

[Chang07] Li-Pin Chang. On Effieient Wear Leveling for Large-Scale Flash-Memory Storage Systems[C]. the 22st ACM Symposium on Applied Computing (ACM SAC), 2007.

[ChanO97]Chee Yong Chan, Beng Chin Ooi: Efficient Scheduling of Page Access in Index-Based Join Processing. IEEE Trans. Knowl. Data Eng. (TKDE) 9(6):1005-1011 (1997)

[Claburn08]T. Claburn. Google plans to use Intel SSD storage in servers. http://www.informationweek.com/news/207602745

[Chen09]Shimin Chen: FlashLogging: exploiting flash devices for synchronous logging performance. SIGMOD 2009:73-86

[Chenjianqiang10]陈坚强. 基于NAND Flash的数据库管理系统优化研究. 湖南大学, 硕士学位论文. 2010.

[CanimBMLR09]Mustafa Canim, Bishwaranjan Bhattacharjee, George A. Mihaila, Christian A. Lang, Kenneth A. Ross: An Object Placement Advisor for DB2 Using Solid State Storage. PVLDB 2(2):1318-1329 (2009)

[ChouD85]H.-T. Chou and D. J. DeWitt. An evaluation of buffer management strategies for relational database systems. In VLDB ’85, pages 127–141. VLDB Endowment, 1985.

[ChoudhuriG07]S. Choudhuri and T. Givargis. Performance Improvement of Block Based NAND Flash Translation Layer. In CODES+ISSS’07, Salzburg, Austria, 2007.

[ChazelleG86]B. Chazelle and L. J. Guibas, “Fractional cascading: I. a data structuring technique,” Algorithmica, vol. 1, no. 2, pp. 133–162, 1986.

[ChenGN10]Shimin Chen, Phillip B. Gibbons, Suman Nath: PR-join: a non-blocking join achieving higher early result rate with statistical guarantees. SIGMOD 2010:147-158

[ChangK04]Li-Pin Chang, Tei-Wei Kuo: An efficient management scheme for large-scale flash-memory storage systems. SAC 2004:862-868

[CopelandK85]George P. Copeland, Setrag Khoshafian: A Decomposition Storage Model. SIGMOD 1985:268-279

[ChenKZ09]Feng Chen, David A. Koufaty, Xiaodong Zhang: Understanding intrinsic characteristics and system implications of flash memory based solid state drives. SIGMETRICS/Performance 2009:181-192

[CuiLC10]崔斌,吕雁飞,陈学轩. 闪存环境下B+树索引重访. 计算机应用, 2010, Vol.30(1):1-4.

[ChenLZ11]Feng Chen, Rubao Lee, Xiaodong Zhang: Essential roles of exploiting internal parallelism of flash memory based solid state drives in high-speed data processing. HPCA 2011:266-277

[CanimMBRL10]Mustafa Canim, George A. Mihaila, Bishwaranjan Bhattacharjee, Kenneth A. Ross, Christian A. Lang: SSD Bufferpool Extensions for Database Systems. PVLDB 3(2):1435-1446 (2010)

[ChungPPLLS06]Tae-Sun Chung, Dong-Joo Park, Sangwon Park, Dong-Ho Lee, Sang-Won Lee, Ha-Joo Song: System Software for Flash Memory: A Survey. EUC 2006:394-404

[DeWittKOSSW84]David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD 1984:1-8

[DoP09] Jaeyoung Do, Jignesh M. Patel: Join processing for flash SSDs: remembering past lessons. DaMoN 2009:1-8

[DebnathSL10]Biplob K. Debnath, Sudipta Sengupta, Jin Li: FlashStore: High Throughput Persistent Key-Value Store. PVLDB 3(2):1414-1425 (2010)

[DatacenterSSD]White Paper: Datacenter SSDs: Solid Footing for Growth. White Paper: Datacenter SSDs: Solid Footing for Growth. http://www.samsung.com/global/business/semiconductor/products/flash/FlashApplicationNote.html.

[Facebook]Releasing Flashcache. http://www.facebook.com/note.php?note_id=388112370932.

[FusionIO]The FusionIO drive. Technical speci¯cations available at: http://www.fusionio.com/photos/fusion-io-use-and-terminology/

[FanLM12]范玉雷,赖文豫,孟小峰.基于固态硬盘内部并行的数据库表扫描与聚集. 计算机学报, Vol35(11),2012,pp:2327-2336.

[Graefe07]Goetz Graefe: The five-minute rule twenty years later, and how flash memory changes the rules. DaMoN 2007:6

[GemmellBL06]Jim Gemmell, Gordon Bell, Roger Lueder, MyLifeBits: A Personal Database for Everything, Communications of the ACM, Vol. 49, No. 1, January 2006.

[GrayF07]Jim Gray and Bob Fitzgerald. Flash Disk Opportunity for Server-Applications. http://www.research.microsoft.com/˜gray, January 2007.

[GrayF06]Jim Gray, Bob Fitzgerald: Flash Disk Opportunity for Server Applications. ACM Queue (QUEUE) 6(4):18-23 (2006)

[GrayG97]Jim Gray, Goetz Graefe: The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb. SIGMOD Record (SIGMOD) 26(4):63-68 (1997)

[GraefeHKSTW10]Goetz Graefe, Stavros Harizopoulos, Harumi A. Kuno, Mehul A. Shah, Dimitris Tsirogiannis, Janet L. Wiener: Designing Database Operators for Flash-enabled Memory Hierarchies. IEEE Data Eng. Bull. (DEBU) 33(4):21-27 (2010)

[GuptaKU09]Aayush Gupta, Youngjae Kim, Bhuvan Urgaonkar: DFTL: a flash translation layer employing demand-based selective caching of page-level address mappings. ASPLOS 2009:229-240

[GraefeLS94]Goetz Graefe, Ann Linville, Leonard D. Shapiro: Sort versus Hash Revisited. IEEE Trans. Knowl. Data Eng. (TKDE) 6(6):934-944 (1994)

[GanesanMS07] Yanlei Diao, Deepak Ganesan, Gaurav Mathur, Prashant J. Shenoy: Rethinking Data Management for Storage-centric Sensor Networks. CIDR 2007:22-31

[GrayP87]Jim Gray, Gianfranco R. Putzolu: The 5 Minute Rule for Trading Memory for Disk Accesses and The 10 Byte Rule for Trading Memory for CPU Time. SIGMOD 1987:395-398

[GalT05]Eran Gal, Sivan Toledo: Algorithms and data structures for flash memories. ACM Comput. Surv. (CSUR) 37(2):138-163 (2005)

[Hwang03]Chang-gyu Hwang. Nanotechnology enables a new memory growth model. Proceedings of the IEEE, 2003, Vol.91(11):1765 – 1771.

[HsiehCK05]Jen-Wei Hsieh, Li-Pin Chang, Tei-Wei Kuo: Efficient on-line identification of hot data for flash-memory management. SAC 2005:838-842

[HsiehK06]Jen-Wei Hsieh and Tei-Wei Kuo, “Efficient identification of Hot Data for Flash Memory Storage Systems”, ACM Transactions on Storage,Vol. 2, No. 1, pp. 22-40, 2006.

[HennessyP03]J. Hennessy and D. Patterson. Computer Architecture – A Quantitative Approach. Morgan Kaufmann, 2003.

[Intel98]Intel. Understanding the Flash Translation Layer (FTL) Specification. Technical report, Intel Corporation, Dec.1998.

[InoueW04]Atsushi Inoue and Doug Wong. NAND Flash Applications Design Guide. Technical Report Revision 2.0, Toshiba America Electronic Components, Inc., March 2004.

[JungCJKL07]Dawoon Jung, Yoon-Hee Chae, Heeseung Jo, Jinsoo Kim, Joonwon Lee: A group-based wear-leveling algorithm for large-capacity flash memory storage systems. CASES 2007:160-164

[JoKPKL06]H. Jo, J.-U. Kang, S.-Y. Park, J.-S. Kim, and J. Lee. FAB: Flash-aware buffer management policy for portable media players. IEEE Transactions on Consumer Electronics, 52(2):485–493, 7 2006.

[JinOHL12]Peiquan Jin, Yi Ou, Theo Härder, Zhi Li: AD-LRU: An efficient buffer replacement algorithm for flash-based databases. Data Knowl. Eng. (DKE) 72:83-102 (2012)

[JungSPKC08]H. Jung, H. Shim, S. Park, S. Kang, and J. Cha. LRU-WSR: Integration of lru and writes sequence reordering for flash memory. IEEE Transactions on Consumer Electronics, 54(3):1215–1223, 10 2008.

[JungYSPKC07]Hoyoung Jung, Kyunghoon Yoon, Hyoki Shim, Sungmin Park, Sooyong Kang, Jaehyuk Cha: LIRS-WSR: Integration of LIRS and Writes Sequence Reordering for Flash Memory. ICCSA 2007:224-237

[JiangZ02]Song Jiang, Xiaodong Zhang: LIRS: an efficient low inter-reference recency set replacement policy to improve buffer cache performance. SIGMETRICS 2002:31-42

[Kooi80]Robert Kooi: The Optimization of Queries in Relational Databases. Case Western Reserve University 1980

[KimA08]Hyojun Kim, Seongjun Ahn: BPLRU: A Buffer Management Scheme for Improving Random Writes in Flash Storage. FAST 2008:239-252

[KangJKK07]Dongwon Kang, Dawoon Jung, Jeong-Uk Kang, Jin-Soo Kim: mu-tree: an ordered index structure for NAND flash memory. EMSOFT 2007:144-153

[KangJKL06]Jeong-Uk Kang, Heeseung Jo, Jinsoo Kim, Joonwon Lee: A superblock-based flash translation layer for NAND flash memory. EMSOFT 2006:161-170

[KimKNMC02]J. Kim, J.M. Kim, S.H. Noh, S. Min, and Y. Cho. A Space-Efficient Flash Translation Layer for Compactflash Systems. IEEE Transactions on Consumer Electronics, 48(2):366–375, 2002.

[KimL99] Han-joon Kim, Sang-goo Lee: A New Flash Memory Management for Flash Storage System. COMPSAC 1999:284-289

[KangLM12]Woon-Hak Kang, Sang-Won Lee, Bongki Moon: Flash-based Extended Cache for Higher Throughput and Faster Recovery. PVLDB 5(11):1615-1626 (2012)

[KimLJB08] Hyojun KimKi Yong LeeJaeGyu JungKyoungil Bahng:A New Transactional Flash Translation Layer for Embedded Database Systems Based on MLC NAND Flash Memory.

[KawaguchiNM95]A. Kawaguchi, S. Nishioka, and H. Motoda. A flash-memory based file system. In Proc. of USENIX Winter, 1995.

[KooS09]D. Koo and D. Shin. Adaptive Log Block Mapping Scheme for Log Buffer-based FTL (Flash Translation Layer). In IWSSPS’09, Grenoble, France, Oct. 2009.

[KoltsidasV08s]Ioannis Koltsidas, Stratis Viglas: Flashing up the storage layer. PVLDB 1(1):514-525 (2008)

[Lawson09]S. Lawson. Cloud computing could be a boon for flash storage. http://www.businessweek.com/technology/content/aug2009/tc20090824 219491.htm, 2009.

[LvCHC11]Yanfei Lv, Bin Cui, Bingsheng He, Xuexuan Chen: Operation-aware buffer management in flash-based systems. SIGMOD 2011:13-24

[LiHLY09]Yinan Li, Bingsheng He, Qiong Luo, Ke Yi: Tree Indexing on Flash Disks. ICDE 2009:1303-1306

[LiJSCY09]Z. Li, P. Jin, X. Su, K. Cui, and L. Yue. CCF-LRU: A new buffer replacement algorithm for flash memory. IEEE Transactions on Consumer Electronics, 55(3):1351–1359.

[LeeK07]Sang-Won Lee, Won Kim: On Flash-Based DBMSs: Issues for Architectural Re-Examination. Journal of Object Technology (JOT) 6(8):39-49 (2007)

[LeeKWCK09]Ki Yong Lee, Hyojun Kim, Kyoung-Gu Woo, Yon Dohn Chung, Myoung-Ho Kim: Design and implementation of MLC NAND flash-based DBMS for mobile devices. Journal of Systems and Software (JSS) 82(9):1447-1458 (2009)

[LeeL10]Hyun-Seob Lee, Dong-Ho Lee: An efficient index buffer management scheme for implementing a B-tree on NAND flash memory. Data Knowl. Eng. (DKE) 69(9):901-916 (2010)

[LuoLMCZ12]Tian Luo, Rubao Lee, Michael P. Mesnier, Feng Chen, Xiaodong Zhang: hStorage-DB: Heterogeneity-aware Data Management to Exploit the Full Capability of Hybrid Storage Systems. PVLDB 5(10):1076-1087 (2012)

[LeeM07]Sang-Won Lee, Bongki Moon: Design of flash-based DBMS: an in-page logging approach. SIGMOD 2007:55-66

[LeeMPKK08]Sang-Won Lee, Bongki Moon, Chanik Park, Jae-Myung Kim, Sang-Woo Kim: A case for flash memory ssd in enterprise database applications. SIGMOD 2008:1075-1086

[LuMZ10] LU ZePing, MENG XiaoFeng, ZHOU Da. HV-Recovery: A High Efficient Recovery Technique for Flash-Based Database. Vol.33(12), 2010:2258-2266.

[LeePCLPS07]Sang-Won Lee, Dong-Joo Park, Tae-Sun Chung, Dong-Ho Lee, Sangwon Park, Ha-Joo Song: A log buffer-based flash translation layer using fully-associative sector translation. ACM Trans. Embedded Comput. Syst. (TECS) 6(3) (2007)

[LiR99]Zhe Li, Kenneth A. Ross: Fast Joins Using Join Indices. VLDB J. (VLDB) 8(1):1-24 (1999)

[LeeSKK08]Sungjin Lee, Dongkun Shin, Young-Jin Kim, Jihong Kim: LAST: locality-aware sector translation for NAND flash memory-based storage systems. Operating Systems Review (SIGOPS) 42(6):36-42 (2008)

[LvCC09]Lv Yanfei, Chen Xuexuan,Cui Bin. Performance Evaluation and Optimization Analysis on Flash—Based Database. Journal of Computer Research and Development.46(suppl.):307-312,2009.

[LofgrenNT03] Lofgren KM,Norman RD,Thelin GB,et al. 2003. Wear Leveling Techniques for Flash EEPROM Systems. USA,6594183[P].

[LiOXCH09]Yu Li, Sai Tung On, Jianliang Xu, Byron Choi, Haibo Hu: DigestJoin: Exploiting Fast Random Reads for Flash-Based Joins. Mobile Data Management 2009:152-161

[LiXCH10]Yu Li, Jianliang Xu, Byron Choi, Haibo Hu: StableBuffer: optimizing write performance for DBMS applications on flash devices. CIKM 2010:339-348

[LeeYL09]H.-S. Lee, H.-S. Yun, and D.-H. Lee. HFTL: Hybrid Flash Translation Layer based on Hot Data Identification for Flash Memory. IEEE Transactions on Consumer Electronics, 55(4), Nov. 2009.

[Myspace]White Paper: MySpace Uses Fusion Powered I/O to Drive Greener and Better Data Centers. http://www.fusionio.com/PDFs/myspace-case-study.pdf.

[MaFL11]Dongzhe Ma, Jianhua Feng, Guoliang Li: LazyFTL: a page-level flash translation layer optimized for NAND flash memory. SIGMOD 2011:1-12

[MengJCY12]孟小峰,金培权,曹巍,岳丽华. 闪存数据库研究进展及发展趋势. 中国科学基金, 2012年第3期, 142-145.

[MerrettKY81]T. H. Merrett, Yahiko Kambayashi, H. Yasuura: Scheduling of Page-Fetches in Join Operations VLDB 1981:488-498

[MTD]MTD, “Memory Technology Device (MTD) subsystem for Linux,” http://www.linux-mtd.infradead.org

[NathG10]Suman Nath, Phillip B. Gibbons: Online maintenance of very large random samples on flash storage. VLDB J. (VLDB) 19(1):67-90 (2010)

[NathK07]Suman Nath, Aman Kansal: FlashDB: dynamic self-tuning database for NAND flash. IPSN 2007:410-419

[One08]One, A., 2008. YAFFS: Yet another Flash file system. http://www.aleph1.co.uk/yaffs

[Omiecinski89]Edward Omiecinski: Heuristics for Join Processing Using Nonclustered Indexes. IEEE Trans. Software Eng. (TSE) 15(1):18-25 (1989)

[Oracle10] Oracle Corp. Oracle Databasse Concepts 11g Rel. 2. Feb 2010.

[OuH10]Yi Ou, Theo Härder: Issues of Flash-Aware Buffer Management for Database Systems. BNCOD 2010:127-130

[OuHJ09] Yi Ou, Theo Härder, Peiquan Jin: CFDC: a flash-aware replacement policy for database buffer management. DaMoN 2009:15-20

[OnHLX09]Sai Tung On, Haibo Hu, Yu Li, Jianliang Xu: Lazy-Update B+-Tree for Flash Devices. Mobile Data Management 2009:323-328

[OnLHWLX10]Sai Tung On, Yinan Li, Bingsheng He, Ming Wu, Qiong Luo, Jianliang Xu: FD-buffer: a buffer manager for databases on flash disks. CIKM 2010:1297-1300

[OuH10]Yi Ou, Theo Härder: Clean first or dirty first?: a cost-aware self-adaptive buffer replacement policy. IDEAS 2010:7-14

[OnXCHH12] Sai Tung OnJianliang XuByron ChoiHaibo HuBingsheng He: Flag Commit: Supporting Efficient Transaction Recovery in Flash-Based DBMSs. IEEE Trans. Knowl. Data Eng. (TKDE) 24(9):1624-1639 (2012)

[Patterson05]David A. Patterson: Latency Lags Bandwidth. ICCD 2005:3-6

[PucheralBVB01]Philippe Pucheral, Luc Bouganim, Patrick Valduriez, Christophe Bobineau: PicoDBMS: Scaling down database techniques for the smartcard. VLDB J. (VLDB) 10(2-3):120-132 (2001)

[ParkJKKL06]Seon-Yeong Park, Dawoon Jung, Jeong-Uk Kang, Jinsoo Kim, Joonwon Lee: CFLRU: a replacement algorithm for flash memory. CASES 2006:234-241

[PrabhakaranRZ08] Vijayan PrabhakaranThomas L. RodehefferLidong Zhou: Transactional Flash. OSDI 2008:147-160

[ParkSSML10]Seon-Yeong Park, Euiseong Seo, Ji-Yong Shin, Seungryoul Maeng, Joonwon Lee: Exploiting Internal Parallelism of Flash-based SSDs. Computer Architecture Letters (CAL) 9(1):9-12 (2010)

[RamakrishnanG00]R. Ramakrishnan, J. Gehrke (2000) Database management systems. WCB/McGraw-Hill, NewYork.

[RohLP10]Hongchan Roh, Daewook Lee, Sanghyun Park: Yet another write-optimized DBMS layer for flash-based solid state storage. CIKM 2010:1345-1348

[RosenblumO92]Mendel Rosenblum, John K. Ousterhout: The Design and Implementation of a Log-Structured File System. ACM Trans. Comput. Syst. (TOCS) 10(1):26-52 (1992)

[Sliberschantz04]A. Sliberschantz et al: Operating System Concepts. 6th ed., John Wiley & Sons, Inc. (2004)

[StoicaAJA09]Radu Stoica, Manos Athanassoulis, Ryan Johnson, Anastasia Ailamaki: Evaluating and repairing write performance on flash devices. DaMoN 2009:9-14

[SilberschatzC98]A. Silberschatz and P. B. Calvin, Operating System Concepts, John Wiley & Sons, Inc., New York, NY, 1998.

[Samsung06]Hachman, M.: New Samsung notebook replaces hard drive with flash. http://www.extremetech.com/article2/0,1558,1966644,00.asp, May 2006

[Samsung2006]Samsung.K9G8G08UOM Datasheet.Soul:Samsung,2006

[ShahHWG08]Mehul A. Shah, Stavros Harizopoulos, Janet L. Wiener, Goetz Graefe: Fast scans and joins using flash drives. DaMoN 2008:17-24

[SilberschatzKS05]Silberschatz, A., Korth, H.F., Sudarshan, S., 2005. Database System Concepts, fifth ed. McGraw-Hill.

[SuJXCY09]Xuan Su; Peiquan Jin; Xiaoyan Xiang; Kai Cui; Lihua Yue. Flash-DBSim: A simulation tool for evaluating Flash-based database algorithms. Proceedings of 2nd IEEE International Conference on Computer Science and Information Technology, 2009, pp: 185 – 189.

[SeoS08]D. Seo and D. Shin. Recently-evicted-first buffer replacement policy for flash storage devices. IEEE Transactions on Consumer Electronics, 54(3):1228–1235, 10 2008.

[TangMLL11] Tang X, Meng XF, Liang ZC, Lu ZP. Cost-Based buffer management algorithm for flash database systems. Journal of Software, 2011,22(12):2951−2964.

[TPC-H] TPC-H Benchmark. http://www.tpc.org/tpch/

[Valduriez87]Valduriez P Join indices. ACM Trans Database Syst,1987, 12(2):218–246

[Woodhouse01] David Woodhouse. JFFS: The Joumnaling Flash File System, Ottawa Linux Symposium, 2001.

[WuKC07]Chin-Hsien Wu, Tei-Wei Kuo, Li-Ping Chang: An efficient B-tree layer implementation for flash-memory storage systems. ACM Trans. Embedded Comput. Syst. (TECS) 6(3) (2007)

[WuCK03]Chin-Hsien Wu, Li-Pin Chang, Tei-Wei Kuo: An Efficient B-Tree Layer for Flash-Memory Storage Systems. RTCSA 2003:409-430

[Xiang09] 向小岩. 闪存数据库若干关键问题研究. 博士学位论文. 中国科技大学. 2009.

[Yaffs] Aleph One Ltd., Yet Another Flash File System(YAFFS), 2002. http://www.yaffs.net.

[YueXJL10] YUE Lihua, XIANG Xiaoyan, JIN Peiquan, LIU Zhanzhan.An Out-page logging approach to storage management in flash-based DBMS.JOURNAL OF UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA. Vol.40(5), 2010:526-532.

[YangWHGC11]Liang Huai Yang, Jing Wang, Zhifeng Huang, Weihua Gong, Lijun Chen: An Efficient Buffer Scheme for Flash-based Databases. JCP 6(7):1307-1318 (2011)

[Zeinalipour-YaztiLKGN05]Demetrios Zeinalipour-Yazti, Song Lin, Vana Kalogeraki, Dimitrios Gunopulos, Walis A. Najjar, MicroHash: An Efficient Index Structure for Flash-Based Sensor Devices, 4th USENIX Conference on File and Storage Technologies (FAST 2005), December 2005.

[ZhouCL04]Yuanyuan Zhou, Zhifeng Chen, Kai Li: Second-Level Buffer Cache Management. IEEE Trans. Parallel Distrib. Syst. (TPDS) 15(6):505-519 (2004)

[ZhengGSZJKW03]Fengzhou Zheng, Nitin Garg, Sumeet Sobti, Chi Zhang, Russell E. Joseph, Arvind Krishnamurthy, Randolph Y. Wang, Considering the Energy Consumption of Mobile Storage Alternatives, Proceedings of the 11th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, Orlando, Florida. Oct 2003.

[ZhangYKW02]Chi Zhang, Xiang Yu, Arvind Krishnamurthy, Randolph Y. Wang: Configuring and Scheduling an Eager-Writing Disk Array for a Transaction Processing Workload. FAST 2002:289-304

[LvCC09]吕雁飞,陈学轩,崔斌.基于闪存的数据库性能评测与优化分析.计算机研究与发展.46(增):307-312,2009.

[YueXJL10] 岳丽华,向小岩,金培权,刘沾沾.基于分离日志的闪存数据库系统存储管理方法.中国科学技术大学学报.Vol.40(5), 2010:526-532.

[LuMZ10] 卢泽萍,孟小峰,周大. HV-Recovery: 一种闪存数据库的高效恢复方法. Vol.33(12), 2010:2258-2266.

=================全书完====================

厦门大学数据库实验室

为中国科研事业贡献力量