Wednesday, December 4, 2013

Data Domain Solutions Design Overview

To review some key features and benefits of Data Domain:

Data Invulnerability Architecture (DIA) protects data from hardware and software failure.  DD refers to deduplication, which removes redundant data, as "compression."  It can also replicate data from one DD system to another.

DD uses variable-length segment deduplication, which gives the deduplication process maximum efficiency.  It deduplicates inline rather than post-process, which reduces disk utilization and the number of writes.  It uses less WAN bandwidth by sending only deduplicated data, and its Virtual Tape Library (VTL) feature provides easy migration from legacy tape systems.
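To make the idea of variable-length segments concrete, here is a minimal sketch of content-defined segmenting with fingerprint-based deduplication. It is not Data Domain's actual algorithm; the checksum, the size limits and the names split_segments, deduplicate and restore are all assumptions chosen for illustration.

```python
import hashlib


def split_segments(data: bytes, mask: int = 0x3F,
                   min_size: int = 16, max_size: int = 256) -> list:
    """Cut data into variable-length segments.

    A boundary is placed where a simple running checksum matches a bit
    pattern, so cut points depend on nearby content rather than on fixed
    offsets. Checksum, mask and size limits are illustrative assumptions.
    """
    segments = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        # Older bytes shift out of the low bits, so only recent bytes matter.
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (rolling & mask) == 0) or size >= max_size:
            segments.append(data[start:i + 1])
            start = i + 1
            rolling = 0
    if start < len(data):
        segments.append(data[start:])  # trailing partial segment
    return segments


def deduplicate(data: bytes, store: dict) -> list:
    """Store each unique segment once; return the list of fingerprints."""
    recipe = []
    for segment in split_segments(data):
        fingerprint = hashlib.sha256(segment).hexdigest()
        store.setdefault(fingerprint, segment)  # written only if unseen
        recipe.append(fingerprint)
    return recipe


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its fingerprint list."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


if __name__ == "__main__":
    store = {}
    backup = bytes((i * i + 7 * i) % 251 for i in range(3000))
    first = deduplicate(backup, store)
    unique_after_first = len(store)
    second = deduplicate(backup, store)  # same data again: nothing new stored
    print(len(first), unique_after_first, len(store) == unique_after_first)
    print(restore(second, store) == backup)
```

Because the work happens as the data arrives, this also mirrors the inline approach: a duplicate segment is recognized by its fingerprint before it is ever written.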

SISL, or Stream-Informed Segment Layout, is the technology used to deduplicate data.  See this post for more info on SISL.

Data Domain has three types of data streams: write, read and replication.  The maximum number of streams is dictated by the model.

Data Domain can replicate data at four levels.  Collection replication is a full system replication (like system mirroring).  Directory replication can be used to replicate to multiple targets or to replicate specific directories within the collection.  MTree replication uses snapshot technology and is faster than directory replication.  Pool replication handles VTL files and tapes.

A replication context is a pair - one source and one target.  Replication takes place over port 2051, using Data Domain's proprietary replication protocol.
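As a planning aid, the source/target pairing can be modeled in a few lines. This is only a sketch: the class ReplicationContext, the helper firewall_rules and the host names are hypothetical, and the only value taken from the text above is port 2051.

```python
from dataclasses import dataclass

REPLICATION_PORT = 2051  # replication port named in this post


@dataclass(frozen=True)
class ReplicationContext:
    """One source paired with one target (hypothetical planning model)."""
    source: str
    target: str
    port: int = REPLICATION_PORT

    def __post_init__(self):
        # A context is a pair of two different systems.
        if self.source == self.target:
            raise ValueError("source and target must be different systems")


def firewall_rules(contexts):
    """Return the unique (source, target, port) openings the contexts need."""
    return sorted({(c.source, c.target, c.port) for c in contexts})


if __name__ == "__main__":
    # Host names are made up for the example.
    contexts = [
        ReplicationContext("dd-site-a", "dd-site-b"),
        ReplicationContext("dd-site-a", "dd-site-c"),
    ]
    for rule in firewall_rules(contexts):
        print(rule)
```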

Data Domain has an Extended Retention feature.  It is available only on the DD860 and DD990 models and is claimed to be the industry's first online, long-term archive solution.  Archiving is most efficient when at least 2 years of data will be archived, and systems should be sized for 2-3 years of archiving.  When designing an extended retention system, make sure that the daily change rate is low (less than 10%) and that the long-term change rate is also low (less than 8% for best results).  A lower growth rate is also best for archiving, and the archive frequency (daily, weekly, monthly, yearly, etc.) needs to be taken into account as well.
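The design guidelines above can be turned into a quick sanity check. The sketch below simply encodes the thresholds quoted in this post (daily change under 10%, long-term change under 8%, at least 2 years archived); the names ArchiveProfile and check_extended_retention_fit are hypothetical and this is not an official sizing tool.

```python
from dataclasses import dataclass


@dataclass
class ArchiveProfile:
    """Workload figures for a candidate system (hypothetical structure)."""
    daily_change_pct: float
    long_term_change_pct: float
    retention_years: float


def check_extended_retention_fit(profile: ArchiveProfile) -> list:
    """Return a list of warnings; an empty list means all guidelines are met."""
    warnings = []
    if profile.daily_change_pct >= 10:
        warnings.append("daily change rate should be below 10%")
    if profile.long_term_change_pct >= 8:
        warnings.append("long-term change rate should be below 8%")
    if profile.retention_years < 2:
        warnings.append("archiving is most efficient with at least 2 years archived")
    return warnings


if __name__ == "__main__":
    candidate = ArchiveProfile(daily_change_pct=4.0,
                               long_term_change_pct=12.0,
                               retention_years=3)
    for warning in check_extended_retention_fit(candidate):
        print("WARNING:", warning)
```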

Extended Retention licensing must be applied to unused shelves only, as partial-shelf extended retention is not available.  Also, once shelves are defined as active or archive tier, they cannot be changed.  Each shelf in the system requires a capacity license, and there are different licenses for active and archive tiers.  An Expanded Storage license is also required if more storage than the controller provides is needed.  For the DD860 that is 64TB of raw storage, and the DD990 has 360TB of raw storage.

The Extended Retention replication type depends on the data to be protected.  When used as a source, a DD Extended Retention system supports collection, MTree and DD Boost replication; when used as a destination, it supports those as well as directory replication.
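The source/destination rules can be captured in a small lookup table, which is handy when validating a replication design. The table contents come straight from the paragraph above; the function name is_supported and the lowercase labels are assumptions made for this sketch.

```python
# Replication types an Extended Retention system supports, per the text above.
SOURCE_TYPES = frozenset({"collection", "mtree", "dd boost"})
DESTINATION_TYPES = SOURCE_TYPES | {"directory"}

SUPPORTED = {
    "source": SOURCE_TYPES,
    "destination": DESTINATION_TYPES,
}


def is_supported(role: str, replication_type: str) -> bool:
    """Check whether a replication type is allowed for the given role."""
    if role not in SUPPORTED:
        raise ValueError("role must be 'source' or 'destination'")
    return replication_type.lower() in SUPPORTED[role]


if __name__ == "__main__":
    for role in ("source", "destination"):
        print(role, "directory ->", is_supported(role, "directory"))
```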
