Thursday, December 5, 2013

Data Domain Operational Considerations

There are several factors to take into consideration when designing a Data Domain backup system.  Some are external to the DD, others are internal. 

Some of the external factors affecting system performance are the number of simultaneous client connections, or number of streams.  Also consider the connectivity between the client and the backup storage node as well as the number, type and connectivity of the storage nodes.  Is the node powerful enough to handle all the backup being sent to it?

Internal factors affecting the performance of a Data Domain are the number of simultaneous streams and the number of shelves in the system.  Also consider whether encryption is being used.  There are different methods of compressing data locally, and they have different performance characteristics.  Also be aware of whether cleaning or a RAID rebuild is taking place.  System performance will also take a hit if it is running in "Mixed Mode", where a large number of restores run while backups are in progress.  Furthermore, the number of MTrees plays a part.  In DDOS 5.3, there is a best practice limit of 32 MTrees for the DD880, DD890 and DD990; all other units are still recommended to use the regular 14.

There is also a need to take the host OS and connectivity method into account.  For instance, are you backing up Windows or Unix servers?  Are you using multiple streams or a single stream?  Different operating systems and backup methods will play a part in performance.

The cleaning process will impact the DD as well.  It is not recommended to run cleaning daily, and it is never a means to fix a DD that is nearing capacity.  Setting the cleaning throttle to greater than 50% will have a negative impact on performance.  Also, if you change compression methods the DD will have to uncompress and recompress the entire data set, which will take time and put additional load on the system.  The cleaning process will not restart itself if there is a power outage in the middle of the operation, nor will it restart itself if the filesystem is disabled during cleaning.  Replication after an extended period of disconnection will take a long time.  Full systems or systems with multiple shelves may also benefit from having multiple cleaning cycles.
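The two cleaning guidelines above (throttle no higher than 50%, and no daily runs) can be expressed as a simple sanity check.  This is only a sketch in Python: the function name and its parameters are hypothetical and are not part of DD OS or any Data Domain tool, and the thresholds are taken directly from the paragraph above.

```python
def cleaning_warnings(throttle_percent, runs_per_week):
    """Return warnings for a planned cleaning configuration.

    Hypothetical helper: both arguments are values you supply yourself;
    nothing here queries a real Data Domain system.
    """
    if not 0 <= throttle_percent <= 100:
        raise ValueError("throttle_percent must be between 0 and 100")
    if runs_per_week < 0:
        raise ValueError("runs_per_week must not be negative")

    warnings = []
    # A throttle above 50% is described above as hurting performance.
    if throttle_percent > 50:
        warnings.append("cleaning throttle above 50% may hurt performance")
    # Seven or more runs per week amounts to daily cleaning.
    if runs_per_week >= 7:
        warnings.append("daily cleaning is not recommended")
    return warnings


if __name__ == "__main__":
    for warning in cleaning_warnings(throttle_percent=75, runs_per_week=7):
        print(warning)
```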

The local compression methods are lz, gzfast and gz.  lz is the default and is quick.  It offers the best throughput of the three.  gzfast requires less space but more CPU (approximately 2x lz).  This method is typically used on systems where greater compression with lower performance is acceptable, like in a remote office.  gz offers the best compression but requires the most CPU (5x lz).  gz is often used for nearline storage.

When replication is going to be used, it is recommended to seed the offsite device first whenever possible.  If the replication is configured as a collection replication, the destination is read-only.  Before DDOS 5.1, MTrees could not be replicated, and even at the current level DD Boost and NDMP are not supported.  Last to consider is that, if the DD is the source of the replication, snapshots only replicate when using collection replication.
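Returning to the local compression methods, the CPU multipliers quoted above (lz as the baseline, gzfast at roughly 2x, gz at 5x) can be put into a small lookup to show the trade-off.  This is an illustrative Python sketch, not a sizing tool: the function names are made up for this example, and it assumes that a higher CPU cost always means better compression, which matches the ordering described above.

```python
# Relative CPU cost of each local compression method, normalized to lz = 1.
# The 2x and 5x values are the approximate figures quoted in the text.
RELATIVE_CPU_COST = {"lz": 1, "gzfast": 2, "gz": 5}


def methods_within_cpu_budget(max_cost):
    """Return the methods whose relative CPU cost fits the budget, cheapest first."""
    return sorted(
        (name for name, cost in RELATIVE_CPU_COST.items() if cost <= max_cost),
        key=RELATIVE_CPU_COST.get,
    )


def strongest_method_within_budget(max_cost):
    """Pick the best-compressing method that still fits the CPU budget.

    Assumes compression improves in the order lz < gzfast < gz.
    """
    candidates = methods_within_cpu_budget(max_cost)
    if not candidates:
        raise ValueError("no compression method fits a budget below 1x lz")
    return candidates[-1]


if __name__ == "__main__":
    # A system that can spare about three times the CPU of lz.
    print(strongest_method_within_budget(3))
```

With a budget of 3x, the sketch settles on gzfast, since gz at 5x would not fit.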

Some things to remember about encryption on a DD system are that MTrees can be individually encrypted and that Data-at-Rest encryption is not automatically turned on with Data-in-Flight encryption.  If data is not being sent via a backup application, be certain not to compress or encrypt it before it reaches the DD.  Encryption is the last step in the process of writing to disk.

Also remember that 14 MTrees is the best practice (unless using a DD880, DD890 or DD990, where the MTree limit is 32).  Also, DD Extended Retention is only supported as a replication target for MTree replication, and DDOS 5.1 does not include VTL or NDMP replication.
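The MTree best practice can be written down as a quick check.  The Python sketch below uses only the numbers given in this post (32 for the DD880, DD890 and DD990 under DDOS 5.3, 14 for everything else); the function names are hypothetical, and the model string is something you would supply yourself rather than read from the system.

```python
# Models with the higher best practice limit, per the text above.
HIGH_LIMIT_MODELS = {"DD880", "DD890", "DD990"}


def recommended_mtree_limit(model):
    """Return the best practice MTree limit for a given model name."""
    return 32 if model.strip().upper() in HIGH_LIMIT_MODELS else 14


def within_mtree_best_practice(model, mtree_count):
    """Return True if the MTree count stays within the best practice limit."""
    if mtree_count < 0:
        raise ValueError("mtree_count must not be negative")
    return mtree_count <= recommended_mtree_limit(model)


if __name__ == "__main__":
    print(recommended_mtree_limit("DD990"))
    print(within_mtree_best_practice("DD670", 20))
```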
