Friday, November 1, 2013

Data Domain Replication

Data Domain replication copies data from one Data Domain (DD) system to another. A replication pair is called a "context" in the DD GUI. The primary purpose of DD replication is disaster recovery, although it is also used to retain data offsite for longer periods than you want to keep it onsite.

There are 4 types of replication in Data Domain:

Collection replication is a full-system mirror of the col1 directory. The destination is read-only and can be the replica of only one source. Collection replication backs up snapshots as well as users and permissions, and is the fastest and lightest method of replicating one DD to another.

Directory replication is more flexible: data on one DD can be replicated to multiple destinations, and you select just the directories to be replicated. CIFS and NFS shares can be sent to the same destination DD, but not into the same directory. When directory replication is configured, the destination directory is created automatically if it does not exist, and ownership and permissions on files and folders are maintained. Replication is triggered when a file closes, but to conserve bandwidth during periods of high demand it can be forced to run at a specific time. Snapshots are not replicated automatically and must be configured separately.

MTree replication is a newer method that arrived with the introduction of MTrees. The destination MTree is set read-only, and the main difference from directory replication is in how the data to transfer is determined. MTree replication still sends only changed data, but instead of sending vector summaries back and forth, it creates a snapshot and compares it with the previous one. MTree replication can also be used with encryption and Retention Lock.
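
The snapshot comparison can be pictured with a minimal Python sketch. This only illustrates the idea and is not how DD OS implements it: the `Snapshot` type, the path-to-fingerprint mapping, and the `diff_snapshots` helper are all hypothetical names made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical model: a snapshot maps each file path to a content fingerprint.
Snapshot = Dict[str, str]


def diff_snapshots(previous: Snapshot, current: Snapshot) -> Tuple[List[str], List[str]]:
    """Return (paths to send, paths to delete) needed to bring a replica
    from the `previous` snapshot up to the `current` one."""
    # New files and files whose fingerprint changed must be sent.
    to_send = sorted(
        path for path, fingerprint in current.items()
        if previous.get(path) != fingerprint
    )
    # Files that existed before but are gone now must be removed on the replica.
    to_delete = sorted(path for path in previous if path not in current)
    return to_send, to_delete


if __name__ == "__main__":
    last_replicated = {"/backup/a.img": "f1", "/backup/b.img": "f2", "/backup/old.img": "f9"}
    latest = {"/backup/a.img": "f1", "/backup/b.img": "f3", "/backup/c.img": "f4"}
    send, delete = diff_snapshots(last_replicated, latest)
    print("send:", send)
    print("delete:", delete)
```

Unchanged files (`/backup/a.img` here) never appear in either list, which is the point of comparing snapshots: only the differences cross the wire.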

Pool replication works just like directory replication, but the source is VTL data. A separate VTL license is not needed on the destination DD if VTL restores will not be done directly from it.

DD systems can replicate in the following topologies:
  1. One to one
  2. Many to one
  3. Cascading
  4. Bi-directional
  5. One to many
  6. Cascading one to many

Offsite replicas can be used to restore the primary source in the event of a failure, and the replicated data is available to the DR site immediately.
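
To make the topologies listed above concrete, here is a small Python sketch that treats each replication context as a (source, destination) pair and labels the resulting shape. The host names and the `classify_topology` helper are hypothetical, and the rules are a simplification for illustration, not something DD OS reports.

```python
from collections import defaultdict
from typing import Iterable, Tuple

Context = Tuple[str, str]  # (source host, destination host), hypothetical model


def classify_topology(contexts: Iterable[Context]) -> str:
    """Label a set of replication contexts with one of the six topologies."""
    pairs = set(contexts)
    if not pairs:
        raise ValueError("at least one replication context is required")

    fan_out = defaultdict(set)  # source -> its destinations
    fan_in = defaultdict(set)   # destination -> its sources
    for src, dst in pairs:
        fan_out[src].add(dst)
        fan_in[dst].add(src)

    # Two systems that replicate to each other.
    bidirectional = any((dst, src) in pairs for src, dst in pairs)
    # A cascade forwards data onward: A -> B and B -> C, with C different from A.
    cascading = any(
        nxt != src for src, dst in pairs for nxt in fan_out.get(dst, ())
    )
    one_to_many = any(len(dsts) > 1 for dsts in fan_out.values())
    many_to_one = any(len(srcs) > 1 for srcs in fan_in.values())

    if cascading and one_to_many:
        return "cascading one to many"
    if cascading:
        return "cascading"
    if bidirectional:
        return "bi-directional"
    if one_to_many:
        return "one to many"
    if many_to_one:
        return "many to one"
    return "one to one"


if __name__ == "__main__":
    print(classify_topology([("dd-site1", "dd-dr")]))
    print(classify_topology([("dd-site1", "dd-dr"), ("dd-site2", "dd-dr")]))
    print(classify_topology([("dd-site1", "dd-hub"), ("dd-hub", "dd-dr")]))
```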

4 comments:

  1. So which do you recommend? Collection, Directory, or MTree? I don't plan to use pools since I don't have the VTL license. What's least impactful on the target DD?

  2. Hi Chad - I apologize for the delay in responding. I've been out of touch for a while.

    What I have found is that the right replication method really depends on how you're using your Data Domain and what you want the remote replica to accomplish. For instance, are you performing two-way replication, where each DD is both a source and a destination for the other? If that's the case, you need to use MTree or directory replication.

    Collection replication is a faster method, but is a root-to-root replication and you end up with a duplicated system. This works if your only objective is to duplicate one site to another or in the case of upgrading to a new system.

    One other point to note when migrating a Data Domain to a newer box is that replication takes a snapshot of the existing Data Domain and begins moving data. It will not release that snapshot until it has a valid replica. This can cause a problem when you are nearing capacity on the source Data Domain and need to keep backing up to it. It will add data, but cleanup never takes place until the replication is completed. Keep that in mind if you are hitting that 90% utilization and need to upgrade instead of expand.

    Not that it ever happened to me... ;)

  3. Hello Greg,

    I need to migrate a DD660 to a new DD2500 and am backing up with NetWorker 8.1.
    How would I migrate the data from the DD660 to the DD2500 and still have NetWorker see the data from the DD660 on the new DD2500?

    Thanks
    T

    Replies
    1. Hello T -

      As with everything, it depends a little bit on how your environment is configured and what you are migrating. If you just want to do a full appliance migration, you could use collection replication. This would actually duplicate your DD660 onto the DD2500. I used this method when I replaced my DD660s with DD2500s, and when the 2500s came online they had the same IP addresses and system configuration.

      BE SURE TO READ THE DOCUMENTATION CAREFULLY AND UNDERSTAND WHAT YOU ARE DOING, HOWEVER.

      Other factors to consider would be (a) how much bandwidth you have between systems, and (b) how close to capacity your 660 is. When you begin the initial replication, the Data Domain is going to take a snapshot of the 660 and start to replicate that over to the 2500. That allows it to be consistent when the 2500 comes online, but it also means that it is not going to allow that data to be cleaned up and removed until the migration is complete. If you're sitting at 90% utilization and start that process, and continue to write backups to your 660, you could very well end up in a jam where you fill the Data Domain and aren't able to back up any more data.
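
A rough back-of-the-envelope check for that scenario could look like the Python sketch below. Every name and number in it is a hypothetical input you would have to measure yourself (`days_to_replicate`, `will_source_fill`, the daily growth figure). It assumes a constant, fully available link and that no space at all is reclaimed until replication finishes.

```python
def days_to_replicate(used_tb: float, link_mbps: float) -> float:
    """Days needed to push `used_tb` terabytes over a link of `link_mbps`
    megabits per second, assuming the whole link is available."""
    bits = used_tb * 1e12 * 8            # decimal terabytes to bits
    seconds = bits / (link_mbps * 1e6)   # megabits per second to bits per second
    return seconds / 86400


def will_source_fill(capacity_tb: float, used_tb: float,
                     daily_growth_tb: float, link_mbps: float) -> bool:
    """True if new backups would use up the remaining free space before the
    initial replication completes (no cleanup assumed in the meantime)."""
    free_tb = capacity_tb - used_tb
    growth_during_copy = daily_growth_tb * days_to_replicate(used_tb, link_mbps)
    return growth_during_copy >= free_tb


if __name__ == "__main__":
    # Hypothetical source: 20 TB usable, 18 TB used (90%), growing 0.2 TB per day.
    for mbps in (1000, 100):
        days = days_to_replicate(18, mbps)
        fills = will_source_fill(20, 18, 0.2, mbps)
        print(f"{mbps} Mbps: about {days:.1f} days, source fills first: {fills}")
```

With these made-up figures the copy finishes comfortably on a gigabit link, but on a 100 Mbps link the source would run out of space first, which is the jam described above.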

      Also remember that you can contact EMC support for assistance with migration. I worked with a few engineers there who knew this process like they know their own names, and they were excellent.

      Good luck, and let us know how it turns out!
