What is deduplication good for?

I recently discovered a storage feature named “data deduplication”, also called “deduplication”.

Quoting Wikipedia:

In computing, data deduplication is a specialized data compression technique for eliminating coarse-grained redundant data, typically to improve storage utilization. In the deduplication process, duplicate data is deleted, leaving only one copy of the data to be stored, along with references to the unique copy of data. Deduplication is able to reduce the required storage capacity since only the unique data is stored.

At first I thought: “Well, a video, a document, … is a file, and a file is a set of 0s and 1s. Just as a ZIP archive doesn’t care whether it stores pictures or text files, I may be able to use deduplication to store my 32GB of personal pictures in a smaller amount of storage… sounds great!” That’s what I want to figure out.

POC environment

I read a bunch of documents and reports about deduplication claiming that it is a great way to reduce storage space. Many of them cited virtual machine storage; some pointed to email storage. I already knew that Microsoft Exchange stores emails and attachments in a way that saves space: basically, it stores the data once and stores links to that data wherever it can.

But I found nothing on using deduplication to store your 64GB of holiday HD movies on a 16GB USB stick… Doesn’t it work? Or would USB stick resellers rather hide this from the buying crowd?

The guys from the NexentaStor Project are kind enough to provide a VMware image of their free Community Edition. It is ready to use with VMware virtualisation products and offers “up to 12TB of storage”, much more than I require.

I downloaded the appliance, attached six 2GB SCSI virtual disks and ran it with VMware Fusion.
Depending on the test, I configured various volume sizes and layouts. The only requirement was that ZFS was used.

Note that deduplication is only available since ZFS pool version 21. This leaves out the current FreeBSD and OpenSolaris implementations. NexentaStor uses v22.

Most of the configuration is done through the Web interface. Here, it’s available at http://192.168.12.129:2000/.

I first created a RAIDZ volume with all available disks and deduplication set to “sha256,verify”:

nmc@nexenta:/$ zpool list
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
Stockage  11.9G   182K  11.9G     0%  1.00x  ONLINE  -
syspool   7.94G  2.37G  5.57G    29%  1.00x  ONLINE  -
nmc@nexenta:/$ zfs list
NAME                     USED  AVAIL  REFER  MOUNTPOINT
Stockage                 151K  9.70G  46.5K  /volumes/Stockage

Then I created a volume using the whole space, the default record size of 128K and deduplication set to “sha256,verify”:

nmc@nexenta:/$ zfs list -o name,used,avail,refer,mounted,quota,dedup,compress
NAME                     USED  AVAIL  REFER  MOUNTED  QUOTA          DEDUP  COMPRESS
Stockage                 222K  9.70G  48.1K      yes   none  sha256,verify       off
Stockage/Dropbox        46.5K  9.70G  46.5K      yes   none  sha256,verify       off

Finally, I enabled CIFS (SAMBA) sharing on this volume. This allows copying data from a remote computer; it’s the usual way to provide a shared folder to Windows computers. I’ll be copying from a Snow Leopard MacBook Pro, but that shouldn’t change anything at all.
The default configuration is “Anonymous Read-Write”, but I enabled full access for a dedicated user, mostly because OS X and Nexenta don’t work together out of the box with anonymous access. The shared folder gets mounted as //me@192.168.12.129/stockage_dropbox.

Deduplication and text documents

I dropped about 550MB of text documents. Those are TXT, HTML, PDF and DOC files grabbed from various technical sources:

nmc@nexenta:/$ zpool list Stockage
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
Stockage  11.9G   662M  11.2G     5%  1.00x  ONLINE  -

humpf… not really convincing…

Let’s make a pure full copy of that directory on the same ZFS volume:

nmc@nexenta:/$ zpool list Stockage
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
Stockage  11.9G   668M  11.2G     5%  1.99x  ONLINE  -

Looks better! Two copies of 550MB are stored using only 668MB.
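
Why does an exact copy deduplicate so well? ZFS deduplicates at the block level: each record gets a checksum (SHA-256 here), and a record whose checksum is already known is stored only once. The Python sketch below mimics that idea under simplifying assumptions: every file is cut into fixed 128K blocks starting at offset 0, random bytes stand in for the documents, and dedup_ratio is a hypothetical helper written for this illustration, not something ZFS provides.

```python
import hashlib
import os

BLOCK_SIZE = 128 * 1024  # same as the 128K record size used above


def dedup_ratio(files, block_size=BLOCK_SIZE):
    """Blocks referenced divided by unique blocks stored (illustrative helper)."""
    total = 0
    unique = set()
    for content in files:
        # Each file is split into fixed-size blocks starting at its own offset 0.
        for offset in range(0, len(content), block_size):
            block = content[offset:offset + block_size]
            unique.add(hashlib.sha256(block).digest())
            total += 1
    return total / len(unique) if unique else 1.0


# Stand-in for one document: random bytes, so no block repeats by itself.
document = os.urandom(300_000)

print(f"one copy:   {dedup_ratio([document]):.2f}x")
print(f"two copies: {dedup_ratio([document, document]):.2f}x")
```

A single copy comes out at 1.00x and two identical copies at 2.00x, which is the same pattern as the 1.00x and 1.99x reported by zpool list.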

That’s probably not useful in my environment… I rarely store the same file twice. But in an enterprise environment, this can make sense: people often attach a document to an email and send it to someone who ends up storing the attachment on the filer.

Let’s try with my full documentation repository (2.48GB). This adds PPT, RTFD and a bunch of other files. Maybe deduplication works better when you store a really large amount of stuff…

nmc@nexenta:/$ zpool list Stockage
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
Stockage  11.9G  2.72G  9.15G    22%  1.02x  ONLINE  -

I’ll consider this as a “NO”.

Deduplication and images

Let’s see what happens when I copy 8.4GB from my photo library onto the ZFS volume:

nmc@nexenta:/Stockage/Dropbox$ zpool list Stockage
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
Stockage  11.9G  9.90G  1.98G    83%  1.00x  ONLINE  -

Same thing as with text documents… After all, at the disk level, there is no way to guess whether a file is a DOC, JPEG, PPT or PNG… So we have to dig in other directions…

Deduplication and ISO files

I’ve read that dedup is nice in virtual machine environments. And to set up such environments, you have to store the installation ISO files. Let’s have a look at what we can get here.

I selected a few of the ISO files (3.1GB) we use (or used) here at work: Windows XP 32-bit and 64-bit, Windows XP with SP3 included, and Windows Server 2003 Standard and Enterprise editions. There must be redundant things in those files…

nmc@nexenta:/Stockage/Dropbox$ zpool list Stockage
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
Stockage  11.9G  3.49G  8.38G    29%  1.00x  ONLINE  -

Damn… I can’t understand how dedup doesn’t find redundant parts between a Standard and an Enterprise edition of Windows 2003… Come on, guys… Nothing to be deduplicated in those two 600MB ISO files???

Deduplication and virtual machines

They say deduplication is great for virtual machine storage… Well, I used VMware Fusion to create two Windows Server 2003 virtual machines, one Windows XP and one OpenBSD, and stored all of their files in the NexentaStor shared folder:

nmc@nexenta:/Stockage/Dropbox$ ls -alh
total 209
drwxr-xr-x+  7 root     root           8 Nov 11 16:50 .
drwxr-x---+  2 root     sys            3 Nov 10 15:15 .$EXTEND
drwxr-xr-x   3 root     root           3 Nov 10 15:06 ..
-rwx------+  1 jdoe    staff        39K Nov 11 16:50 .DS_Store
drwx------+  3 jdoe    staff         14 Nov 11 18:19 OpenBSD #1.vmwarevm
drwx------+  5 jdoe    staff         12 Nov 11 22:17 Win2K3 #1.vmwarevm
drwx------+  5 jdoe    staff         12 Nov 11 22:14 Win2K3 #2.vmwarevm
drwx------+  4 jdoe    staff         12 Nov 11 19:30 WinXP #1.vmwarevm

nmc@nexenta:/Stockage/Dropbox$ zfs list -o name,used,avail,refer,mounted,quota,dedup,compress
NAME                     USED  AVAIL  REFER  MOUNTED  QUOTA          DEDUP  COMPRESS
Stockage                6.19G  3.80G  48.1K      yes   none  sha256,verify       off
Stockage/Dropbox        6.16G  3.80G  6.16G      yes   none  sha256,verify       off

nmc@nexenta:/Stockage/Dropbox$ zpool list Stockage
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
Stockage  11.9G  7.10G  4.77G    59%  1.04x  ONLINE  -

There used to be a nice DEDUP value when the VMs were newly created. But after I finished the installation, configured “download but do not automatically install updates” and left the VMs running for two days, the DEDUP ratio dropped to this very low value…

The fact that VMware Fusion uses special techniques to keep the VM disks small may have something to do with the poor dedup ratio. The Windows Server 2003 VMs have a 3GB disk, but the whole set only takes 2.24GB in the shared folder.

Deduplication and backup archives

Backups are usually configured with a policy that combines incremental backups with periodic full backups.

The incremental parts of backups would probably not deduplicate well; after all, they contain only the bits of data that changed.
So I’ll look at the “archiving” part of backups, that is, the full backups, which have some bits changed but may also have a lot in common.

What I’ll do is create archive files (ZIP) that contain more and more of my personal “Documents” repository. For example, archive ZIP1 will contain directories DIR1 and DIR2; archive ZIP2 will contain directories DIR1, DIR2 and DIR3; etc. Let’s see what we get (when copying one archive after the other):

nmc@nexenta:/Stockage/Dropbox$ ls -alh
total 7203984
drwxr-xr-x+  3 root     root           8 Nov 12 00:10 .
drwxr-x---+  2 root     sys            3 Nov 10 15:15 .$EXTEND
drwxr-xr-x   3 root     root           3 Nov 10 15:06 ..
-rwx------+  1 jdoe    staff        39K Nov 12 00:10 .DS_Store
-rwx------+  1 jdoe    staff        13M Nov 11 23:54 Archive 1.zip
-rwx------+  1 jdoe    staff       494M Nov 11 23:55 Archive 2.zip
-rwx------+  1 jdoe    staff       1.4G Nov 11 23:58 Archive 3.zip
-rwx------+  1 jdoe    staff       1.5G Nov 12 00:03 Archive 4.zip

nmc@nexenta:/Stockage/Dropbox$ zfs list -o name,used,avail,refer,mounted,quota,dedup,compress
NAME                     USED  AVAIL  REFER  MOUNTED  QUOTA          DEDUP  COMPRESS
Stockage                3.48G  6.64G  48.1K      yes   none  sha256,verify       off
Stockage/Dropbox        3.46G  6.64G  3.46G      yes   none  sha256,verify       off

nmc@nexenta:/Stockage/Dropbox$ zpool list Stockage
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
Stockage  11.9G  3.68G  8.20G    30%  1.14x  ONLINE  -

Not much is deduplicated when you consider that archives 3 and 4 have 494MB in common with archive 2, and that archives 3 and 4 have 1.4GB in common… Well, at least from a functional point of view.
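
One mechanism that could explain the gap between “functionally in common” and “deduplicated” is alignment: block-level deduplication only matches blocks that are byte-for-byte identical, so the same content sitting at a slightly different offset inside another file produces entirely different blocks. The Python sketch below illustrates this with fixed 128K blocks. It is an assumption about what may be going on, not a diagnosis of these archives; block_hashes, archive_a and archive_b are hypothetical names, and random bytes stand in for the shared content.

```python
import hashlib
import os

BLOCK_SIZE = 128 * 1024  # the 128K record size used for these tests


def block_hashes(content, block_size=BLOCK_SIZE):
    """SHA-256 digests of the fixed-size blocks of one file (illustrative helper)."""
    return {
        hashlib.sha256(content[i:i + block_size]).digest()
        for i in range(0, len(content), block_size)
    }


payload = os.urandom(2 * 1024 * 1024)  # content both "archives" have in common
archive_a = payload
archive_b = b"\x00" + payload          # same content, pushed one byte further in

shared = block_hashes(archive_a) & block_hashes(archive_b)
print(f"blocks in archive_a:          {len(block_hashes(archive_a))}")
print(f"blocks shared with archive_b: {len(shared)}")
```

Although archive_b contains all of archive_a, the one-byte shift moves every block boundary and the two files share no block at all. Compression adds to this, since a small change in the input can alter the compressed bytes that follow it.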

Here are the results of a test with two MySQL dumps:

nmc@nexenta:/Stockage/Dropbox$ ls -alh
total 8292731
drwxr-xr-x+  3 root     root           6 Nov 12 07:36 .
drwxr-x---+  2 root     sys            3 Nov 10 15:15 .$EXTEND
drwxr-xr-x   3 root     root           3 Nov 10 15:06 ..
-rwx------+  1 jdoe    staff        39K Nov 12 00:10 .DS_Store
-rwx------+  1 jdoe    staff       2.0G Sep 24 21:14 zarafa.dump
-rwx------+  1 jdoe    staff       1.9G Nov 12 07:29 zarafa.dump.old

nmc@nexenta:/Stockage/Dropbox$ zfs list -o name,used,avail,refer,mounted,quota,dedup,compress
NAME                     USED  AVAIL  REFER  MOUNTED  QUOTA          DEDUP  COMPRESS
Stockage                4.00G  5.71G  48.1K      yes   none  sha256,verify       off
Stockage/Dropbox        3.98G  5.71G  3.98G      yes   none  sha256,verify       off

nmc@nexenta:/Stockage/Dropbox$ zpool list Stockage
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
Stockage  11.9G  4.82G  7.06G    40%  1.00x  ONLINE  -

Those dumps are full of e-mail (from Zarafa) and web stuff (from WordPress and Drupal sandbox).

Here’s what happens when I drop in the contents of the 2010 syslog:

nmc@nexenta:/Stockage/Dropbox$ ls -alh
total 159
drwxr-xr-x+ 11 root     root          12 Nov 12 07:55 .
drwxr-x---+  2 root     sys            3 Nov 10 15:15 .$EXTEND
drwxr-xr-x   3 root     root           3 Nov 10 15:06 ..
-rwx------+  1 jdoe    staff        39K Nov 12 00:10 .DS_Store
drwx------+ 10 jdoe    staff         10 Mar 24  2010 10.0.0.29
drwx------+  9 jdoe    staff          9 Feb 25  2010 airport
drwx------+ 28 jdoe    staff        241 Nov 12 00:05 akela
drwx------+ 23 jdoe    staff         23 Jun  1 22:04 guarana
drwx------+ 18 jdoe    staff         22 Nov  7 23:24 luuna
drwx------+ 17 jdoe    staff         17 Mar  8  2010 pak
drwx------+ 18 jdoe    staff         23 Oct 30 00:05 thundera
drwx------+ 18 jdoe    staff        299 Nov 12 00:09 zarafa

nmc@nexenta:/Stockage/Dropbox$ zfs list -o name,used,avail,refer,mounted,quota,dedup,compress
NAME                     USED  AVAIL  REFER  MOUNTED  QUOTA          DEDUP  COMPRESS
Stockage                 180M  9.53G  48.1K      yes   none  sha256,verify       off
Stockage/Dropbox         165M  9.53G   165M      yes   none  sha256,verify       off

nmc@nexenta:/Stockage/Dropbox$ zpool list Stockage
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
Stockage  11.9G   222M  11.7G     1%  1.02x  ONLINE  -

The logs are zipped, but there should still be a lot of redundancy in those files…

Out of scope

While copying the images, I had a look at I/O operations:

nmc@nexenta:/Stockage/Dropbox$ zpool iostat -v Stockage
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
Stockage    3.94G  7.94G      5     86   174K  2.27M
  raidz1    3.94G  7.94G      5     86   174K  2.27M
    c2t0d0      -      -      2     19  33.6K   471K
    c2t1d0      -      -      1     18  32.6K   470K
    c2t2d0      -      -      2     19  33.6K   471K
    c2t3d0      -      -      1     18  32.5K   470K
    c2t4d0      -      -      2     19  33.5K   472K
    c2t5d0      -      -      2     18  33.0K   470K
----------  -----  -----  -----  -----  -----  -----

It’s nice to see many more write than read operations when performing a bulk write.

RAIDZ1 gave a 1:17 ratio (r/w) and 7.94G of storage
RAID0 gave a 0:132 ratio (r/w) and 11GB of storage.
RAID1 gave a 1:54 ratio (r/w) and 1.85GB of storage.
RAID5 is not achievable with NexentaStor.
RAID10 gave a 1:83 ratio (r/w) and 5.49GB of storage.

Obviously RAID0 rocks for write operations… if you accept losing all your data on a single disk failure.
My RAID1 configuration is highly redundant… but the bandwidth is limited to what the slowest disk can achieve.
RAID10 still looks better for database access, but half the storage is used for redundancy.
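
The storage sizes above follow from simple arithmetic on the six 2GB disks. The Python sketch below computes nominal usable capacity per layout, before any file system overhead. Two things are assumptions inferred from the reported sizes rather than stated in the tests: that RAID1 was a single six-way mirror and that RAID10 was three two-way mirrors striped together. usable_capacity and its layout names are hypothetical, made up for this illustration.

```python
def usable_capacity(layout, disks, disk_size_gb):
    """Nominal usable capacity in GB, ignoring metadata and reserved space."""
    if layout == "stripe":           # RAID0: every disk holds distinct data
        return disks * disk_size_gb
    if layout == "raidz1":           # one disk's worth of space goes to parity
        return (disks - 1) * disk_size_gb
    if layout == "mirror":           # RAID1: every disk holds the same data
        return disk_size_gb
    if layout == "striped-mirrors":  # RAID10: stripe across two-way mirrors
        return (disks // 2) * disk_size_gb
    raise ValueError(f"unknown layout: {layout}")


for layout in ("stripe", "raidz1", "mirror", "striped-mirrors"):
    capacity = usable_capacity(layout, disks=6, disk_size_gb=2.0)
    print(f"{layout:16s} {capacity:4.1f} GB")
```

This gives 12, 10, 2 and 6 GB, in line with the 11GB, 1.85GB and 5.49GB reported for RAID0, RAID1 and RAID10 once overhead is taken out. For RAIDZ1, the nominal 10 GB is close to the 9.70G AVAIL shown by zfs list earlier; the 7.94G quoted above matches the free column of the zpool iostat output, taken while 3.94G was already allocated.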

Conclusion

I couldn’t find a scenario where deduplication really helps save storage.

Setting “deduplication” to “on” (rather than “sha256,verify”) doesn’t seem to change the dedup ratio.
Switching to a “record size” of “4KB” or “512B” doesn’t seem to change the dedup ratio either.
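
A possible reason why a smaller record size doesn’t help much: with fixed-size blocks, shrinking the block only recovers duplicates whose offsets differ by a multiple of the block size. The Python sketch below compares a payload with a copy shifted by 4096 bytes and by 1 byte, for three block sizes. It is a simplified model, not a measurement of ZFS; shared_fraction is a hypothetical helper and random bytes stand in for real data.

```python
import hashlib
import os


def shared_fraction(a, b, block_size):
    """Fraction of a's fixed-size blocks that also appear in b (illustrative)."""
    def hashes(content):
        return {
            hashlib.sha256(content[i:i + block_size]).digest()
            for i in range(0, len(content), block_size)
        }
    blocks_a = hashes(a)
    return len(blocks_a & hashes(b)) / len(blocks_a)


payload = os.urandom(1024 * 1024)  # 1 MiB of content present in both files

for shift in (4096, 1):
    shifted = os.urandom(shift) + payload  # same content behind a short prefix
    for block_size in (128 * 1024, 4096, 512):
        fraction = shared_fraction(payload, shifted, block_size)
        print(f"shift={shift:5d}  block={block_size:6d}  shared={fraction:.0%}")
```

With a 4096-byte shift, 4KB and 512B blocks line up again and everything is shared, while 128K blocks share nothing. With a 1-byte shift, none of the three block sizes finds a single common block.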

There’s probably something I missed here, but I can’t figure out what…

Sources

ZFS Deduplication, by Jeff Bonwick
Deduplication now in ZFS
Guide d’administration Oracle Solaris ZFS