Data Deduplication

What is deduplication useful for?

       2112 words, 10 minutes

I recently discovered a storage feature named “data deduplication”, also called simply “deduplication”. Quoting Wikipedia:

“In computing, data deduplication is a specialized data compression technique for eliminating coarse-grained redundant data, typically to improve storage utilization. In the deduplication process, duplicate data is deleted, leaving only one copy of the data to be stored, along with references to the unique copy of data. Deduplication is able to reduce the required storage capacity since only the unique data is stored.”

My first thought was: “Well, a video, a document, or anything else is a file, and a file is just a set of 0s and 1s. Just as a ZIP archive doesn’t care whether it stores pictures or text files, I may be able to use deduplication to fit my 32GB of personal pictures into less storage space… sounds great!” That’s what I want to figure out.
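
To make the Wikipedia description a bit more concrete, here is a minimal sketch of the idea in Python. It is only an illustration under my own assumptions: data is cut into fixed-size blocks, blocks are identified by their SHA-256 digest, and everything lives in an in-memory dictionary. The names `deduplicate`, `restore`, `stored_size` and the 4096-byte block size are hypothetical choices for this example, and real storage systems may chunk and index data quite differently.

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep each distinct block once.

    Returns (store, references):
      store      maps a block's SHA-256 digest to the block itself
                 (the single stored copy of that block)
      references is the ordered list of digests needed to rebuild the data
    """
    store = {}
    references = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        # Only the first occurrence of a block is stored; later
        # duplicates just add another reference to the same digest.
        store.setdefault(digest, block)
        references.append(digest)
    return store, references


def restore(store, references) -> bytes:
    """Rebuild the original data by following the references in order."""
    return b"".join(store[digest] for digest in references)


def stored_size(store) -> int:
    """Number of bytes actually kept once duplicates are removed."""
    return sum(len(block) for block in store.values())
```

For example, feeding it three identical 4096-byte blocks of `A` followed by one block of `B` yields four references but only two stored blocks, and `restore` gives back the original bytes. Whether real files such as pictures contain enough identical blocks for this to pay off is exactly the open question.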
