EMC Glossary

Fixed-Length Deduplication

Fixed-length deduplication is a data deduplication algorithm that breaks the data in a file system into subfile, fixed-length segments in order to determine which segments are unique and which are repeated.


The main limitation of this approach is that when the data in a file is shifted, for example by adding a slide to a Microsoft PowerPoint deck, the contents of every subsequent block change, so those blocks are likely to be treated as different from the blocks in the original file. As a result, less of the data is recognized as duplicate and the reduction in stored data is smaller. Smaller blocks deduplicate better than larger ones, but they take more processing to deduplicate.
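
The shifting problem can be illustrated with a minimal Python sketch. It is not taken from any EMC product: the function names (fixed_chunks, fingerprint, shared_fraction), the 64-byte segment size, and the use of SHA-256 digests to compare segments are all assumptions made for illustration.

```python
import hashlib


def fixed_chunks(data: bytes, size: int) -> list[bytes]:
    """Split data into consecutive segments of `size` bytes (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """Identify a segment by its SHA-256 digest (an assumption for this sketch)."""
    return hashlib.sha256(chunk).hexdigest()


def shared_fraction(original: bytes, modified: bytes, size: int) -> float:
    """Fraction of the modified file's segments that already exist in the original."""
    known = {fingerprint(c) for c in fixed_chunks(original, size)}
    new_chunks = fixed_chunks(modified, size)
    if not new_chunks:
        return 0.0
    hits = sum(1 for c in new_chunks if fingerprint(c) in known)
    return hits / len(new_chunks)


if __name__ == "__main__":
    # 4 KiB of deterministic, non-repeating sample data.
    original = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(128))
    appended = original + b"new slide"  # data added at the end: nothing shifts
    shifted = b"X" + original           # one byte inserted at the front
    print(shared_fraction(original, appended, 64))
    print(shared_fraction(original, shifted, 64))
```

Appending leaves the existing segment boundaries in place, so nearly every segment is still recognized. Inserting a single byte at the front moves every boundary, so with non-repeating data like this sample, essentially no segment should match.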

Variable-length deduplication is a more advanced approach that anchors variable-length segments based on their interior data patterns. Because segment boundaries follow the data rather than fixed offsets, this solves the data-shifting problem of the fixed-length approach.
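
One common way to anchor segments on data patterns is content-defined chunking with a rolling hash. The Python sketch below is only an illustration under stated assumptions, not the algorithm of any particular product: the function names, the polynomial rolling hash, the 16-byte window, the bit mask, and the minimum and maximum segment sizes are all hypothetical choices.

```python
import hashlib


def content_defined_chunks(data: bytes, window: int = 16, mask: int = 0x3F,
                           min_size: int = 32, max_size: int = 1024) -> list[bytes]:
    """Cut a segment wherever the hash of the last `window` bytes matches a pattern."""
    base, mod = 257, (1 << 31) - 1
    top = pow(base, window - 1, mod)  # weight of the oldest byte in the window
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= window:
            # Drop the byte that just left the window.
            h = (h - data[i - window] * top) % mod
        h = (h * base + byte) % mod
        length = i - start + 1
        # Boundary: the hash pattern matches (and the segment is long enough),
        # or the segment has reached the maximum size.
        if (length >= min_size and (h & mask) == mask) or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial segment
    return chunks


def shared_fraction(original: bytes, modified: bytes) -> float:
    """Fraction of the modified file's segments that already exist in the original."""
    known = {hashlib.sha256(c).digest() for c in content_defined_chunks(original)}
    new_chunks = content_defined_chunks(modified)
    if not new_chunks:
        return 0.0
    hits = sum(1 for c in new_chunks if hashlib.sha256(c).digest() in known)
    return hits / len(new_chunks)


if __name__ == "__main__":
    # 16 KiB of deterministic, non-repeating sample data.
    original = b"".join(hashlib.sha256(i.to_bytes(2, "big")).digest() for i in range(512))
    shifted = b"X" + original  # one byte inserted at the front
    print(shared_fraction(original, shifted))
```

Because the hash depends only on the most recent bytes, the same boundaries tend to reappear shortly after an insertion, so most segments of the shifted file should still match segments of the original.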