VNXe Deduplication and Compression

I had to do some digging recently to get some technical details on the deduplication technology embedded in the EMC VNXe arrays, so I thought I’d share what I found.


The file-level dedup on the new VNX arrays is only for Shared Folders and NFS datastores.  This deduplication is policy driven and uses both file deduplication (single instancing) and compression.  The compression of files has NO impact on data sharing.  The deduplication acts on whole files and does not look at the metadata.  In addition, shared folders and NFS datastores can have different dedup settings, and dedup works in conjunction with replication, snapshots and file-level retention.

Policy Engine

The policy engine, which controls deduplication, runs automatically and is governed by high and low CPU watermarks for the Data Movers.  When CPU utilization is below the low watermark (40%), the policy engine runs at full speed.  When CPU utilization is between the low and high watermarks (40–75%), the policy engine is throttled.  CPU utilization above 75% causes the policy engine to pause until utilization drops.
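To make the watermark behavior concrete, here is a minimal Python sketch of the decision described above. It is only an illustration, not EMC's implementation: the names `EngineState` and `policy_engine_state` are made up for this post, and treating exactly 40% and exactly 75% as "throttled" is my reading of the thresholds.

```python
from enum import Enum

# Watermarks as described in this post (percent CPU utilization).
LOW_WATERMARK = 40.0
HIGH_WATERMARK = 75.0


class EngineState(Enum):
    FULL_SPEED = "full speed"
    THROTTLED = "throttled"
    PAUSED = "paused"


def policy_engine_state(cpu_percent: float) -> EngineState:
    """Map a CPU utilization reading to a (hypothetical) engine state."""
    if cpu_percent < LOW_WATERMARK:
        return EngineState.FULL_SPEED
    # Assumption: both boundary values count as "between the watermarks".
    if cpu_percent <= HIGH_WATERMARK:
        return EngineState.THROTTLED
    return EngineState.PAUSED


if __name__ == "__main__":
    for reading in (25.0, 55.0, 90.0):
        print(reading, policy_engine_state(reading).value)
```

The practical takeaway is that dedup work backs off as the array gets busy, so a heavily loaded system may take longer to realize its space savings.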

There are two main settings that can be configured in the policy engine: the File Extension Exclude list and the Path Exclude list.  These let you exclude certain file types or data paths from deduplication.
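As a rough illustration of how the two exclude lists might be applied, here is a small Python sketch. The matching rules are my assumptions (case-insensitive extensions, and a path entry excluding everything beneath it); I have not confirmed how the array matches entries. The function name `is_excluded` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def is_excluded(file_path, excluded_extensions, excluded_paths):
    """Return True if a file matches either exclude list.

    Assumptions: extensions compare case-insensitively, and an excluded
    path covers every file in that directory and its subdirectories.
    """
    path = PurePosixPath(file_path)
    if path.suffix.lower() in {ext.lower() for ext in excluded_extensions}:
        return True
    ancestors = set(path.parents)
    return any(PurePosixPath(p) in ancestors for p in excluded_paths)


if __name__ == "__main__":
    extensions = {".mp4", ".zip"}   # hypothetical File Extension Exclude list
    paths = ["/fs1/scratch"]        # hypothetical Path Exclude list
    for candidate in ("/fs1/media/clip.MP4",
                      "/fs1/scratch/job/out.dat",
                      "/fs1/docs/report.doc"):
        print(candidate, is_excluded(candidate, extensions, paths))
```

Already-compressed formats such as video or archives are the kind of file types one might put on the extension list, since they tend to gain little from further compression.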

Additional Thresholds

The age threshold for deduplication is 30 days.  Files that have been read or modified in the last 30 days are excluded from the deduplication process.  There are also file size thresholds to be aware of: files larger than 8TB or smaller than 24KB are excluded as well.
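Putting the age and size thresholds together, a file's eligibility could be sketched in Python as follows. This is a simplified model for illustration only: the function name `is_dedup_candidate` is made up, and I am assuming binary units (1KB = 1024 bytes) and that a file exactly at a limit is still eligible.

```python
from datetime import datetime, timedelta

MIN_AGE = timedelta(days=30)   # no reads or writes within the last 30 days
MIN_SIZE = 24 * 1024           # 24KB, assuming binary units
MAX_SIZE = 8 * 1024 ** 4       # 8TB, assuming binary units


def is_dedup_candidate(size_bytes, last_access, last_modified, now):
    """Return True if a file passes the age and size thresholds."""
    if size_bytes < MIN_SIZE or size_bytes > MAX_SIZE:
        return False
    cutoff = now - MIN_AGE
    # Both the last read and the last write must be older than the cutoff.
    return last_access <= cutoff and last_modified <= cutoff


if __name__ == "__main__":
    now = datetime(2011, 6, 1)
    old = now - timedelta(days=90)
    recent = now - timedelta(days=5)
    print(is_dedup_candidate(10 * 1024 ** 2, old, old, now))     # cold 10MB file
    print(is_dedup_candidate(10 * 1024 ** 2, recent, old, now))  # read 5 days ago
    print(is_dedup_candidate(4 * 1024, old, old, now))           # too small
```

In other words, only "cold" data of a reasonable size gets processed, which is why space savings show up gradually rather than right after you enable the feature.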

So, while you certainly won’t see deduplication ratios like you would on Data Domain or Avamar appliances, VNX deduplication can prove very useful and save you precious production storage space.  Best of all, it is easy to turn on and manage through the Unisphere GUI.


3 Responses to “VNXe Deduplication and Compression”

  1. Scott Pelletier (@scottpelletier) Says:

    awesome information. just so i understand, this is file/nas only? this does not apply to block based LUNs/volumes?

    you mention NFS specifically, are you saying this excludes CIFS, or you just didn’t test that? you did mention “shared folder” so maybe this is the CIFS piece?

    on another “block note” it sounded like this is more file based granularity versus block based granularity (whether block based LUN or file based presentation)?

    • bcalfo Says:

      Scott, Thanks for the reply. CIFS and NFS are both included in the dedupe process. I should have been more clear with “Shared Folders” but that is CIFS. The dedupe process does not apply to block, only file.

