
Configuring EMC DataDomain Boost FC with Veeam B&R

The idea behind this blog post came up after a meeting with a customer who had configured a physical Veeam proxy to write VM backups to an EMC DataDomain system.

While checking the proxy configuration, I noticed that the customer had presented 32 DD Boost FC devices to the proxy, so I asked why that number. The answer was: “because 32 looks like a good number”.

In my past working experience I configured a lot of DataDomain systems, but I never used DD Boost via FC because I always found it much easier to configure it via IP: no SAN, no zoning, just an LACP port channel on the switch and, most important, no HCL to check. I used to configure FC devices only with the very first generations of DataDomain, when the only way to write via FC was using the VTL.

So I started wondering what a good number of DD Boost FC devices to present to a proxy would be. After some googling, I found this very confusing official document, but before digging into it, it is important to understand how tasks are assigned to a proxy in Veeam B&R.

 

Task assignment in Veeam B&R

In Veeam B&R, with parallel processing enabled, every virtual disk being backed up takes 1 core and around 500 MB of RAM on the backup proxy server.

So, if we have a backup proxy with 2 CPUs of 10 cores each, we can back up up to 20 virtual disks at the same time (2*10).

You can check (and modify) how many concurrent tasks a proxy can handle by editing the proxy properties:
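As a quick sanity check, the rule of thumb above can be written in a few lines of Python. This is only a sketch: the function name `proxy_capacity` and its parameters are made up for illustration, and 500 MB per task is the approximate figure quoted above, not an exact value.

```python
def proxy_capacity(cpu_sockets, cores_per_socket, ram_per_task_mb=500):
    """Estimate concurrent tasks and RAM for a proxy (hypothetical helper).

    Assumes one task (one virtual disk) per core and roughly
    ram_per_task_mb of RAM per task, as described in the text.
    """
    tasks = cpu_sockets * cores_per_socket
    ram_mb = tasks * ram_per_task_mb
    return tasks, ram_mb


# The proxy from the text: 2 CPUs with 10 cores each.
tasks, ram_mb = proxy_capacity(cpu_sockets=2, cores_per_socket=10)
print(tasks, ram_mb)  # 20 10000
```

So the 2 x 10 core proxy handles about 20 disks in parallel and needs roughly 10 GB of RAM for those tasks alone.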

Proxy properties

There is another detail to consider when using a DataDomain DD Boost repository: since Veeam B&R v9, when using this kind of repository, the default write method is per-VM backup chain. It means that every VM has its own backup file instead of having one backup file per job. It is a huge improvement, especially when restoring from dedupe appliances: the smaller the backup file, the lower the amount of data to be re-hydrated.

Repository settings

When the gateway server writes data to the DataDomain DD Boost repository with the per-VM backup chain option enabled, every VM has its own write stream. This way the throughput is higher, because the writes run in parallel.

 

How many devices?

The first thing to consider here is that operating systems differ a lot in how they handle SCSI requests over a SCSI pass-through interface.

Linux can handle a huge number of SCSI requests over a single device. The Windows SCSI pass-through mechanism, on the other hand, can only conduct 1 SCSI request at a time through each of its generic SCSI devices. This impacts the performance of DD Boost over FC when multiple connections (backup jobs) try to use the same generic SCSI device.

Veeam B&R uses Windows proxies to process and move data from the virtual infrastructure to the backup repository, so tuning the number of devices will help to increase performance.

 

The calculation should be performed in two steps: first we need to determine the maximum number of concurrent connections from all the proxies to the DataDomain and then the number of devices for every single proxy.

Part 1 – On the DataDomain system

The Data Domain system imposes a limit on the number of simultaneous requests to a single DFC SCSI device. Because of this limit, the number of devices advertised needs to be tuned depending on the maximum number of simultaneous jobs to the system at any given time.

Here is the formula:

D = minimum(64, 2*(S/128)), rounded up

Where:

J = maximum number of simultaneous jobs running, using DFC, to the Data Domain system at any given time

C = maximum number of connections per job (3 for a DD Extended Retention system, 1 for other types of DataDomain systems)

S = J * C

64 = maximum number of DFC devices that a single system can advertise
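The formula above can be sketched in Python as follows. The function name `datadomain_devices` is invented for this post, and the sketch assumes S = J * C, as used in the sizing example further down; it uses integer arithmetic for the round up.

```python
MAX_DFC_DEVICES = 64  # maximum number of DFC devices a single system can advertise


def datadomain_devices(jobs, connections_per_job=1):
    """D = minimum(64, 2*(S/128)) rounded up, with S = J * C (hypothetical helper)."""
    s = jobs * connections_per_job
    # Integer ceiling of 2*S/128, avoiding floating point.
    rounded_up = (2 * s + 127) // 128
    return min(MAX_DFC_DEVICES, rounded_up)


print(datadomain_devices(60))      # 1
print(datadomain_devices(60, 3))   # 3
print(datadomain_devices(10000))   # 64
```

Note how slowly D grows: with 1 connection per job, a single device is enough for up to 64 simultaneous jobs.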

 

Part 2 – On the Windows system

The Data Domain server path management logic spreads connections across the available logical paths (initiator, target endpoint, DFC device). The goal here is to determine how many devices we need to be sure that every data stream can use a DFC device.

Here is the formula:

D = max(X, (S/P))

Where:

X = number of devices configured on the DataDomain system (see part 1)

P = number of physical paths between media server and Data Domain system

J = maximum number of simultaneous jobs

C = maximum number of connections per job (3 for a DD Extended Retention system, 1 for other types of DataDomain systems)

S = J * C (round up, up to a maximum of 64)

D = number of devices, to be calculated
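Here is the same calculation as a small Python sketch. The function name `windows_devices` is made up for illustration, and the cap of 64 is applied to S exactly as the definitions above state it; take it as my reading of the formula rather than an official implementation.

```python
import math

MAX_STREAMS = 64  # cap on S, as stated in the definitions above


def windows_devices(x, jobs, paths, connections_per_job=1):
    """D = max(X, S/P) rounded up, with S = J * C capped at 64 (hypothetical helper)."""
    s = min(MAX_STREAMS, jobs * connections_per_job)
    return max(x, math.ceil(s / paths))


# A proxy running 20 simultaneous jobs over 4 FC paths,
# with 1 device configured on the DataDomain side.
d = windows_devices(x=1, jobs=20, paths=4)
print(d)  # 5
```

With 5 devices and 4 paths there are 5 * 4 = 20 device/path combinations, which matches the 20 streams. That is one way to read the goal of giving every data stream its own DFC device.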

 

Example of sizing

Suppose that we want to configure three Veeam backup proxies, each with 2 CPUs of 10 cores and 4 FC paths available to the DataDomain.

On the DataDomain

J = 60 (3 proxies, 20 concurrent disks each)

C = 1

S = J * C = 60 * 1 = 60

D = minimum(64, 2*(S/128)) = minimum(64, 2*(60/128)) = minimum(64, 0.94), rounded up = 1

 

On the Windows system

X = 1 (the result of the calculation above)

P = 4

J = 20

C = 1

S = J * C = 20 * 1 = 20

D = max(X, (S/P)) = max(1, (20/4)) = 5

In this example, every proxy should be presented with 5 DD Boost FC devices to be correctly sized.
