Frequently Asked Questions

80. How do I get deduplication to work in Linux?

See also: How do I configure my resolver on a Linux machine?
See also: How do I install Ubuntu?

ZFS

ZFS is great for compression and snapshots, but as for deduplication: don't go there. ZFS on Linux does inline deduplication, which needs at least 5 GB of RAM for each TB of storage. It is usually better to buy more hard drives. If deduplication needs more RAM than the machine can spare, everything slows to a crawl.
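To make the rule of thumb concrete, here is a minimal sketch that turns it into a number. The function name and the sample pool sizes are made up for illustration; the 5 GB per TB figure is the lower bound quoted above, not a measured value.

```python
def zfs_dedup_ram_gb(storage_tb: float, gb_per_tb: float = 5.0) -> float:
    """Lower-bound RAM estimate in GB, using the 5 GB per TB rule of thumb."""
    if storage_tb < 0:
        raise ValueError("storage_tb must be non-negative")
    return storage_tb * gb_per_tb


# Hypothetical pool sizes, only to show how quickly the requirement grows.
for pool_tb in (1, 8, 40):
    print(f"{pool_tb} TB pool: at least {zfs_dedup_ram_gb(pool_tb):.0f} GB RAM")
```

By this estimate a 40 TB pool already calls for at least 200 GB of RAM, which is why extra drives are usually the cheaper option.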

Btrfs


Btrfs is not as old or as stable as ZFS, but it has compression, snapshots and deduplication. Deduplication in Btrfs is out-of-band: it runs as a separate pass over data that has already been written.

Compression is stable. Go ahead.

When using snapshots with Btrfs, we recommend keeping no more than 24+6+3+11 = 44 snapshots: one per hour for a day, one per day for a week, one per week for a month and one per month for a year. Otherwise (for example, taking a snapshot every day and never removing any) the snapshots may take too long to remove. It seems that when snapshots are removed, Btrfs checks each file against each snapshot in order to know whether the original file can be removed. There must be more than enough time (and IOPS to spare) to remove old snapshots before new ones are created.
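The retention scheme above can be sketched as a small selection function. This is only one possible reading of the scheme, not a tool we ship: the function name, the tier table and the sample timestamps are assumptions, and the code only decides which snapshot times to keep. Actually deleting the others is left to whatever snapshot tooling is in use.

```python
from datetime import datetime, timedelta

# (function mapping a timestamp to its bucket, number of buckets to keep)
TIERS = [
    (lambda t: (t.year, t.month, t.day, t.hour), 24),  # hourly for a day
    (lambda t: t.date(), 6),                           # daily for a week
    (lambda t: tuple(t.isocalendar())[:2], 3),         # weekly for a month
    (lambda t: (t.year, t.month), 11),                 # monthly for a year
]


def select_snapshots_to_keep(snapshots):
    """Return the snapshot times to keep, newest first (at most 44)."""
    remaining = sorted(snapshots, reverse=True)
    keep = []
    for bucket_of, limit in TIERS:
        # The bucket holding the oldest snapshot kept so far is already covered.
        seen = {bucket_of(keep[-1])} if keep else set()
        kept_in_tier = 0
        index = 0
        while index < len(remaining) and kept_in_tier < limit:
            snap = remaining[index]
            bucket = bucket_of(snap)
            if bucket not in seen:
                # Newest snapshot in a bucket not yet covered: keep it.
                seen.add(bucket)
                keep.append(snap)
                kept_in_tier += 1
            index += 1
        # Older tiers only look at snapshots older than what was just scanned.
        remaining = remaining[index:]
    return keep


# Sample data: one snapshot per hour for 400 days (made-up timestamps).
now = datetime(2017, 12, 12)
snaps = [now - timedelta(hours=i) for i in range(400 * 24)]
kept = select_snapshots_to_keep(snaps)
print(len(kept))  # 44 with this sample data
```

Everything not returned by the function is a candidate for removal. With fewer snapshots than the scheme allows, the function simply keeps fewer.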

Deduplication is run using an external tool. The easiest option is to run duperemove on the dataset; we have not, however, tried it on any larger datasets.
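As a rough illustration of running duperemove from a script, here is a minimal sketch. It assumes duperemove is installed and on the PATH, and it uses the -d (deduplicate), -r (recurse into directories) and --hashfile options. The helper names and the paths /mnt/data and /var/tmp/dedupe.hash are made up for this example. Try it on a small test directory first.

```python
import shutil
import subprocess


def build_duperemove_command(path, hashfile=None):
    """Build the argument list for a recursive deduplication run."""
    cmd = ["duperemove", "-d", "-r"]  # -d: deduplicate, -r: recurse
    if hashfile:
        # Store hashes on disk so that later runs can reuse them.
        cmd.append(f"--hashfile={hashfile}")
    cmd.append(path)
    return cmd


def run_duperemove(path, hashfile=None):
    """Run duperemove on path; raise if the tool is missing or fails."""
    if shutil.which("duperemove") is None:
        raise FileNotFoundError("duperemove is not installed or not on PATH")
    return subprocess.run(build_duperemove_command(path, hashfile), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_duperemove() to execute it.
    print(" ".join(build_duperemove_command("/mnt/data", "/var/tmp/dedupe.hash")))
```

Building the command separately from running it makes it easy to review what will be executed before touching a real dataset.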

Other ways...

There are most probably other ways to do this. Let us know.

 

This entry deduplication was last modified 2017-12-12

   

This documentation is covered by GNU Free Documentation License.