Friday, December 6, 2013

Avamar System Sizing

Industry standard is to report capacity in decimal units (TB), whereas the Avamar OS reports in binary (TiB).  Thus, the system reports 1.8TB on a 2TB system.  Also, RAIN parity capacity is already factored into the capacity report.

When planning for Avamar capacity, it's important to understand that there are different "views."  The OS view is the total amount of disk space; of this, 20-25% is reserved for checkpoints and 15% is reserved for the OS itself.  The GSAN view is the total allocated to stripes, minus space freed by garbage collection, plus RAIN parity.  User capacity is 65% of the GSAN view, and when user capacity hits 100% the system reaches its read-only threshold.  There is also the CUR view, which is all the current working directories.
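To make the views concrete, here is a rough back-of-envelope sketch of where raw disk goes.  The percentages (15% OS reserve, 20% checkpoint reserve, 65% user capacity) are the figures from this post, not from an EMC sizing tool, and may vary by release:

```python
def capacity_views(raw_tb_decimal):
    """Rough sketch: decimal raw TB -> OS, GSAN, and user capacity views."""
    # Decimal TB -> binary TiB (1000^4 / 1024^4 ~ 0.909), which is why
    # a 2 TB system shows up as roughly 1.8 TB.
    os_view = raw_tb_decimal * (1000**4) / (1024**4)
    os_reserve = os_view * 0.15          # reserved for the OS itself
    checkpoint_reserve = os_view * 0.20  # low end of the 20-25% checkpoint reserve
    gsan_view = os_view - os_reserve - checkpoint_reserve
    user_capacity = gsan_view * 0.65     # 100% of this is the read-only threshold
    return os_view, gsan_view, user_capacity

os_v, gsan_v, user_v = capacity_views(2.0)   # the "2 TB system" example
print(f"OS view: {os_v:.2f} TiB, GSAN view: {gsan_v:.2f} TiB, user: {user_v:.2f} TiB")
```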

The major factors when planning required capacity are the amount of primary storage (how much are we backing up?), commonality (which increases as the retention period increases), and the daily change rate.  Remember to plan for 80% of the user capacity to follow best practice.
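A hypothetical estimate tying those factors together might look like the following.  The formula is an illustration of the inputs named above (primary storage, change rate, retention, commonality expressed as a dedupe ratio, and the 80% rule), not an EMC sizing formula:

```python
def required_user_capacity(primary_gb, daily_change_rate, retention_days, dedupe_ratio):
    """Very rough user-capacity estimate in GB (illustrative only)."""
    initial = primary_gb / dedupe_ratio                        # deduped initial backup
    daily_unique = primary_gb * daily_change_rate / dedupe_ratio  # deduped daily new data
    needed = initial + daily_unique * retention_days
    return needed / 0.80   # size so steady state sits at 80% of user capacity

# e.g. 5 TB primary, 3% daily change, 30-day retention, 10:1 dedupe
print(f"{required_user_capacity(5000, 0.03, 30, 10.0):.0f} GB of user capacity")
```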

The number of nodes should be kept to 12-14 for the initial implementation to allow for growth to the maximum of 16 nodes in the future.  Capacity cannot be added to a single-node server, either: a root-to-root migration needs to take place to increase a single node, so plan for growth.  The number of client connections is also a consideration, in that an Avamar node can only handle 27 at a time.  In environments with a large number of clients it may be best to have several smaller nodes as opposed to one larger node.

The type of data to be backed up also plays a part in the sizing requirements.  File system, database, rich media and virtual machine files all have different dedupe ratios, and a high daily change rate and lots of files (millions of smaller files) will also increase backup time and disk utilization.

RAIN (Redundant Array of Independent Nodes) provides failover and fault tolerance across nodes in the Avamar grid.  The minimum configuration is 1x3 (1 utility node and 3 storage nodes) plus a spare storage node; with premium hardware support on Gen4S and later, the spare is no longer required.

Checkpoints require 20% of the OS view; they are snapshots of the system that allow rollback in case of corruption or operational difficulty.

Encryption adds roughly 33% overhead, which comes out of user capacity: instead of handling 65% of the GSAN view, the system will handle about 43%.

The longer the retention period you are able to implement, the better your dedupe will be.  A weekly backup has about the same amount of unique data as 3 daily backups, and a monthly backup about the same as 6 daily backups.  Therefore, using a regular daily/weekly/monthly rotation will decrease the amount of disk utilized.  For instance, 30 dailies + 2 monthlies result in the initial backup plus 42 days of unique data, compared to 90 days for 90 daily backups.  The minimum retention period should be longer than the time between backups, so that you don't have to constantly redo the initial backup because the data aged off.
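The retention arithmetic above can be sketched with the post's rules of thumb (a weekly carries ~3 days of unique data, a monthly ~6; these are rough equivalences, not measured values):

```python
# Approximate days of unique data carried per backup tier (rule of thumb).
DAYS_OF_UNIQUE = {"daily": 1, "weekly": 3, "monthly": 6}

def unique_days(schedule):
    """schedule: dict mapping tier -> number of backups retained."""
    return sum(DAYS_OF_UNIQUE[tier] * count for tier, count in schedule.items())

# 30 dailies + 2 monthlies vs. 90 dailies:
print(unique_days({"daily": 30, "monthly": 2}))  # 42 days of unique data
print(unique_days({"daily": 90}))                # 90 days of unique data
```

Both schedules cover roughly three months, but the rotation stores less than half the unique data.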

When clients are distributed geographically, it may make sense from an RTO perspective to place several smaller nodes closer to the clients at each site, rather than centralized nodes for all, as long as there is adequate bandwidth and system resources to handle them.

Best practice is to replicate every Avamar server to another Avamar server.  When a single node is implemented, such as with the AVE, replication is required.  Be sure to size appropriately to accommodate the replication type and amount.

Be sure to consider what types of data are being backed up when sizing an AVE.  Will this be file system, applications (like Exchange or SQL), or mixed?  The daily change rate needs an appropriate amount of I/O performance, and when initially implemented the read-only threshold should be set to 40% until actual system metrics can be gathered.  The maximum daily change rates and total protected data per AVE size are:

AVE size   Unstructured (e.g. file servers)   Structured or mixed
0.5 TB     <2 GB/day, 650 GB protected        <5 GB/day, 500 GB protected
1 TB       <4 GB/day, 1.3 TB protected        <10 GB/day, 1 TB protected
2 TB       <8 GB/day, 2.6 TB protected        <20 GB/day, 2 TB protected

So the difference between them is that you can have more daily change in a structured or mixed backup environment, but the total protected data is reduced.
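The AVE limits above can be restated as a lookup table for a quick fit check.  The values are the ones given in this post; check current EMC documentation before sizing a real deployment:

```python
# ave_size_tb: {workload: (max_daily_change_gb, max_protected_gb)}
AVE_LIMITS = {
    0.5: {"unstructured": (2, 650),  "structured": (5, 500)},
    1.0: {"unstructured": (4, 1300), "structured": (10, 1000)},
    2.0: {"unstructured": (8, 2600), "structured": (20, 2000)},
}

def fits(ave_tb, workload, daily_change_gb, protected_gb):
    """True if the workload stays within this AVE size's stated limits."""
    max_change, max_protected = AVE_LIMITS[ave_tb][workload]
    return daily_change_gb < max_change and protected_gb <= max_protected

print(fits(1.0, "unstructured", 3, 1200))  # True
print(fits(1.0, "structured", 12, 900))    # False: too much daily change
```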

PAT is a 24-hour test to ensure acceptable I/O and that the Avamar server will not only have sufficient resources to run, but will not have a negative impact on other VMs on the ESX server.  AVE performance improves as available I/O increases, as well as with a lower daily change rate and less of the AVE's capacity used.  When deploying an AVE, be sure to thick-provision and eager-zero the disks, which should be RAID 1 or RAID 10 physically.
