
dm-cache

dm-cache is a component (more specifically, a target) of the Linux kernel's device mapper, which is a framework for mapping block devices onto higher-level virtual block devices. It allows one or more fast storage devices, such as flash-based solid-state drives (SSDs), to act as a cache for one or more slower storage devices such as hard disk drives (HDDs); this effectively creates hybrid volumes and provides secondary storage performance improvements.

dm-cache
Developer(s): Joe Thornber, Heinz Mauelshagen, Mike Snitzer and others
Initial release: April 28, 2013 (Linux 3.9)
Written in: C
Operating system: Linux
Type: Linux kernel feature
License: GNU GPL
Website: kernel.org

The design of dm-cache requires three physical storage devices for the creation of a single hybrid volume; dm-cache uses those storage devices to separately store actual data, cache data, and required metadata. Configurable operating modes and cache policies, with the latter in the form of separate modules, determine the way data caching is actually performed.

dm-cache is licensed under the terms of the GNU General Public License (GPL), with Joe Thornber, Heinz Mauelshagen and Mike Snitzer as its primary developers.

Overview

dm-cache uses solid-state drives (SSDs) as an additional level of indirection while accessing hard disk drives (HDDs), improving the overall performance by using fast flash-based SSDs as caches for the slower mechanical HDDs based on rotational magnetic media. As a result, the speed of costly SSDs is combined with the storage capacity offered by slower but less expensive HDDs.[1] Moreover, in the case of storage area networks (SANs) used in cloud environments as shared storage systems for virtual machines, dm-cache can also improve overall performance and reduce the load on SANs by providing data caching using client-side local storage.[2][3][4]

dm-cache is implemented as a component of the Linux kernel's device mapper, which is a volume management framework that allows various mappings to be created between physical and virtual block devices. The way a mapping between devices is created determines how the virtual blocks are translated into underlying physical blocks, with the specific translation types referred to as targets.[5] Acting as a mapping target, dm-cache makes it possible for SSD-based caching to be part of the created virtual block device, while the configurable operating modes and cache policies determine how dm-cache works internally. The operating mode selects the way in which the data is kept in sync between an HDD and an SSD, while the cache policy, selectable from separate modules that implement each of the policies, provides the algorithm for determining which blocks are promoted (moved from an HDD to an SSD), demoted (moved from an SSD to an HDD), cleaned, etc.[6]
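
On a running Linux system, the presence of the dm-cache target and the table of any existing mapped device can be inspected with the dmsetup utility; a minimal sketch follows, in which the device name vg0-root is only a placeholder for a device reported by dmsetup ls.

    # List the device-mapper targets registered with the running kernel;
    # "cache" appears once the dm-cache module has been loaded.
    dmsetup targets

    # Show the table (the target lines) of an existing mapped device.
    dmsetup table vg0-root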

When configured to use the multiqueue (mq) or stochastic multiqueue (smq) cache policy, with the latter being the default, dm-cache uses SSDs to store the data associated with random reads and writes, capitalizing on the near-zero seek times of SSDs and avoiding the I/O operations that are typical HDD performance bottlenecks. The data associated with sequential reads and writes is not cached on SSDs, which avoids undesirable cache invalidation during such operations; performance-wise, this is beneficial because sequential I/O operations are well suited to HDDs due to their mechanical nature. Not caching sequential I/O also helps extend the lifetime of SSDs used as caches.[7]

History

Another dm-cache project with similar goals was announced by Eric Van Hensbergen and Ming Zhao in 2006, as the result of internship work at IBM.[8]

Later, Joe Thornber, Heinz Mauelshagen and Mike Snitzer provided their own implementation of the concept, which resulted in the inclusion of dm-cache into the Linux kernel. dm-cache was merged into the Linux kernel mainline in kernel version 3.9, which was released on April 28, 2013.[6][9]

Design

In dm-cache, creating a mapped virtual block device that acts as a hybrid volume requires three physical storage devices:[6]

  • Origin device – provides slow primary storage (usually an HDD)
  • Cache device – provides a fast cache (usually an SSD)
  • Metadata device – records the placement of blocks and their dirty flags, as well as other internal data required by a cache policy, including per-block hit counts; a metadata device cannot be shared between multiple cache devices, and mirroring it is recommended
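
The cited kernel documentation (cache.txt) expresses this three-device layout as a single device-mapper table line. The following is a minimal sketch of creating such a hybrid volume with dmsetup; the device paths, the volume name hybrid0 and the 512-sector (256 KB) caching extent are illustrative assumptions rather than recommendations.

    # Length of the origin device in 512-byte sectors; the mapping must cover it fully.
    ORIGIN_SECTORS=$(blockdev --getsz /dev/sdb)

    # Table format (per Documentation/device-mapper/cache.txt):
    #   start length cache <metadata dev> <cache dev> <origin dev> <block size>
    #         <#feature args> [feature args] <policy> <#policy args> [policy args]
    # Here: 512-sector (256 KB) caching extents, no feature arguments (write-back
    # default) and the default policy with no policy arguments.
    dmsetup create hybrid0 --table \
      "0 $ORIGIN_SECTORS cache /dev/sdc2 /dev/sdc1 /dev/sdb 512 0 default 0"

In this sketch, /dev/sdb is the origin device, /dev/sdc1 the cache device and /dev/sdc2 the metadata device.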

Internally, dm-cache addresses each origin device as a number of fixed-size blocks; the size of these blocks, which equals the size of a caching extent, is configurable only during the creation of a hybrid volume. The size of a caching extent must be between 32 KB and 1 GB, and it must be a multiple of 32 KB; typically, the size of a caching extent is between 256 and 1024 KB. The choice of caching extents bigger than disk sectors acts as a compromise between the size of metadata and the possibility of wasting cache space. Having too small caching extents increases the size of metadata, both on the metadata device and in kernel memory, while having too large caching extents increases the amount of wasted cache space, because whole extents are cached even when high hit rates apply only to some of their parts.[6][10]
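
As a rough illustration of this trade-off, the number of cache blocks that have to be tracked for a given extent size can be computed directly; the 100 GiB cache device used below is an arbitrary assumption.

    # Number of caching extents to track for a hypothetical 100 GiB cache device.
    CACHE_BYTES=$((100 * 1024 * 1024 * 1024))
    for EXTENT_KB in 32 256 1024; do
        echo "$EXTENT_KB KB extents -> $((CACHE_BYTES / (EXTENT_KB * 1024))) cache blocks"
    done
    # Smaller extents mean more blocks to track (more metadata on disk and in kernel
    # memory); larger extents waste cache space when only part of an extent is hot.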

Operating modes supported by dm-cache are write-back, which is the default, write-through, and pass-through. In the write-back operating mode, writes to cached blocks go only to the cache device, while the corresponding blocks on the origin device are only marked as dirty in the metadata. In the write-through operating mode, write requests are not returned as completed until the data reaches both the origin and cache devices, with no clean blocks becoming marked as dirty. In the pass-through operating mode, all reads are performed directly from the origin device, avoiding the cache, while all writes go directly to the origin device; any cache write hits also cause invalidation of the cached blocks. The pass-through mode allows a hybrid volume to be activated when the state of the cache device is not known to be consistent with the origin device.[6][11]
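
In terms of the dmsetup table sketched above, the operating mode is selected through the feature arguments; the sketch below reuses the hypothetical devices and extent size from that example, and the feature argument names follow cache.txt.

    # Write-through: one feature argument.
    dmsetup create hybrid0 --table \
      "0 $ORIGIN_SECTORS cache /dev/sdc2 /dev/sdc1 /dev/sdb 512 1 writethrough default 0"

    # Pass-through: activates the volume without trusting the cache contents.
    dmsetup create hybrid0 --table \
      "0 $ORIGIN_SECTORS cache /dev/sdc2 /dev/sdc1 /dev/sdb 512 1 passthrough default 0"

With zero feature arguments, as in the earlier example, the default write-back mode is used.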

The rate of data migration that dm-cache performs in both directions (i.e., data promotions and demotions) can be throttled down to a configured speed so regular I/O to the origin and cache devices can be preserved. Decommissioning a hybrid volume or shrinking a cache device requires use of the cleaner policy, which effectively flushes all blocks marked in metadata as dirty from the cache device to the origin device.[6][7]

Cache policies

As of August 2015 and version 4.2 of the Linux kernel,[12] the following three cache policies are distributed with the Linux kernel mainline, of which dm-cache uses the stochastic multiqueue policy by default:[6][7]

multiqueue (mq)
The multiqueue (mq) policy has three sets of 16 queues, using the first set for entries waiting for the cache and the remaining two sets for entries already in the cache, with the latter separated so the clean and dirty entries belong to each of the two sets. The age of cache entries in the queues is based on their associated logical time. The selection of entries going into the cache (i.e., becoming promoted) is based on variable thresholds, and queue selection is based on the hit count of an entry. This policy aims to take different cache miss costs into account, and to make automatic adjustments to different load patterns.
This policy internally tracks sequential I/O operations so they can be routed around the cache, with different configurable thresholds for the differentiation between random I/O and sequential I/O operations. As a result, large contiguous I/O operations are left to be performed by the origin device because such data access patterns are suitable for HDDs, and because they avoid undesirable cache invalidation.
stochastic multiqueue (smq)
The stochastic multiqueue (smq) policy performs in a similar way to the multiqueue policy, but requires fewer resources to operate; in particular, it uses substantially smaller amounts of main memory to track cached blocks. It also replaces the hit counting from the multiqueue policy with a "hotspot" queue, and decides on data promotion and demotion on a least-recently used (LRU) basis. As a result, this policy provides better performance than the multiqueue policy, adapts automatically to different load patterns, and eliminates the configuration of various thresholds.
cleaner
The cleaner policy writes back to the origin device all blocks that are marked as dirty in the metadata. After the completion of this operation, a hybrid volume can be decommissioned or the size of a cache device can be reduced.
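
In practice this means reloading the mapping with the cleaner policy and waiting until no dirty blocks remain; a minimal sketch follows, assuming the hypothetical hybrid0 volume from the Design section (the exact status field layout varies between kernel versions and is described in cache.txt).

    # Reload the same mapping with the cleaner policy and let it flush.
    dmsetup suspend hybrid0
    dmsetup reload hybrid0 --table \
      "0 $ORIGIN_SECTORS cache /dev/sdc2 /dev/sdc1 /dev/sdb 512 0 cleaner 0"
    dmsetup resume hybrid0

    # Watch the cache status until the reported number of dirty blocks reaches zero,
    # after which the hybrid volume can be removed or the cache device shrunk.
    watch dmsetup status hybrid0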

Use with LVM

Logical Volume Manager includes lvmcache, which provides a wrapper for dm-cache integrated with LVM.[13]
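
A minimal lvmcache sketch follows, assuming a volume group named vg with the origin logical volume already on a slow physical volume and a fast physical volume /dev/sdc available for the cache; the names and sizes are placeholders, and recent LVM releases also offer shorter ways to create the pool.

    # Create the cache data and cache metadata LVs on the fast (SSD) physical volume.
    lvcreate -n cache0      -L 100G vg /dev/sdc
    lvcreate -n cache0_meta -L 1G   vg /dev/sdc

    # Combine them into a cache pool, then attach the pool to the existing origin LV.
    lvconvert --type cache-pool --poolmetadata vg/cache0_meta vg/cache0
    lvconvert --type cache --cachepool vg/cache0 vg/origin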

See also

  • bcache – a block layer cache in the Linux kernel, developed by Kent Overstreet
  • Flashcache – a disk cache component for the Linux kernel, initially developed by Facebook
  • Hybrid drive – a storage device that combines flash-based and spinning magnetic media storage technologies
  • ReadyBoost – a disk caching software component of Windows Vista and later Microsoft operating systems
  • Smart Response Technology (SRT) – a proprietary disk storage caching mechanism, developed by Intel for its chipsets
  • ZFS – a cross-OS storage management system that has a similar integrated caching device support (L2ARC)

References

  1. ^ Petros Koutoupis (November 25, 2013). "Advanced Hard Drive Caching Techniques". Linux Journal. Retrieved December 2, 2013.
  2. ^ "dm-cache: Dynamic Block-level Storage Caching". visa.cs.fiu.edu. Archived from the original on July 18, 2014. Retrieved July 24, 2014.
  3. ^ Dulcardo Arteaga; Douglas Otstott; Ming Zhao (May 16, 2012). "Dynamic Block-level Cache Management for Cloud Computing Systems" (PDF). visa.cs.fiu.edu. Archived from the original (PDF) on December 3, 2013. Retrieved December 2, 2013.
  4. ^ Dulcardo Arteaga; Ming Zhao (June 21, 2014). "Client-side Flash Caching for Cloud Systems" (PDF). visa.cs.fiu.edu. ACM. Archived from the original (PDF) on September 6, 2015. Retrieved August 31, 2015.
  5. ^ "Red Hat Enterprise Linux 6 Documentation, Appendix A. The Device Mapper". Red Hat. October 8, 2014. Retrieved December 23, 2014.
  6. ^ a b c d e f g Joe Thornber; Heinz Mauelshagen; Mike Snitzer (July 20, 2015). "Linux kernel documentation: Documentation/device-mapper/cache.txt". kernel.org. Retrieved August 31, 2015.
  7. ^ a b c Joe Thornber; Heinz Mauelshagen; Mike Snitzer (June 29, 2015). "Linux kernel documentation: Documentation/device-mapper/cache-policies.txt". kernel.org. Retrieved August 31, 2015.
  8. ^ Eric Van Hensbergen; Ming Zhao (November 28, 2006). "Dynamic Policy Disk Caching for Storage Networking" (PDF). IBM Research Report. IBM. Retrieved December 2, 2013.
  9. ^ "Linux kernel 3.9, Section 1.3. SSD cache devices". kernelnewbies.org. April 28, 2013. Retrieved October 7, 2013.
  10. ^ Jake Edge (May 1, 2013). "LSFMM: Caching – dm-cache and bcache". LWN.net. Retrieved October 7, 2013.
  11. ^ Joe Thornber (November 11, 2013). "Linux kernel source tree: kernel/git/torvalds/linux.git: dm cache: add passthrough mode". kernel.org. Retrieved February 6, 2014.
  12. ^ Jonathan Corbet (July 1, 2015). "4.2 Merge window part 2". LWN.net. Retrieved August 31, 2015.
  13. ^ Red Hat, Inc. "lvmcache — LVM caching". Debian Manpages. A read and write hot-spot cache, using the dm-cache kernel module.

External links

  • Linux Block Caching Choices in Stable Upstream Kernel (PDF), Dell, December 2013
  • Performance Comparison among EnhanceIO, bcache and dm-cache, LKML, June 11, 2013
  • EnhanceIO, Bcache & DM-Cache Benchmarked, Phoronix, June 11, 2013, by Michael Larabel
  • SSD Caching Using dm-cache Tutorial, July 2014, by Kyle Manna
  • Re: [dm-devel] [PATCH 8/8] [dm-cache] cache target, December 14, 2012 (guidelines for metadata device sizing)
