Spictera


Climate impact of taking backups

We will try to describe the environmental impact of taking backups of data.

The illustrations are not intended to be exact, but can be useful to give an understanding of how different backup and restore methods can impact the climate.

First we describe our environment. Let's play with a database of 10TB usable capacity, of which we take one full copy per week with a retention of 30 days (approximately 6 full copies plus all incrementals). Without any data reduction techniques, the backup storage is then ~60TB.

When the backup is running, we will probably get ~500MB/s (~2TB/h), so a full backup will take about 5h. A single SAS HDD delivers on average 130MB/s, so we need at least 6 disks just for the backup performance.

Each disk consumes roughly 6W when it is reading or writing, and an additional 1,5W is lost as heat, which ends up at ~7,5-8W per disk during reads and writes.

6 disks x 8W = 48W

It is normal to have some kind of RAID protection, let's say RAID-5, where one extra disk per group is needed for parity, so there is additional power consumption for these parity disks. As an example, we assume one RAID-5 group per 5 disks, which means 2 extra disks if we are using 1TB disks, or an additional 2 x 8W for the parity disks.

10 disks x 8W = 80W

The RAID controllers also need power for their buffers and RAID functionality. These normally use about 15W.

80W for disk + 15W RAID controller = 95W

Mirroring storage capacity across sites is also very common, which means that the storage capacity is duplicated in two datacenters. The power consumption is then doubled for the storage disks and the adapters, including the additional 15W for the RAID controller.

95W x 2 mirrors = 190W

If the storage is connected over a SAN, then this uses 3W per port, with probably at least two ports per site (as there are likely at least two SAN switches, not counting the ones in the middle). In total 3W x 2 ports per site x 2 sites = 12W. You will probably say we need these anyway, which is true, but the traffic additionally consumes 0,1W per GB/s.

4 ports x 0,1W per GB/s x 0,5GB/s x 2 sites = 0,4W

190,4W for two mirrors
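The client-side storage estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the text: the constant and function names are made up for illustration, and the wattages are the rough figures quoted above, not measurements.

```python
# Rough figures from the estimate above; none of them are measurements.
DISK_W = 8               # watts per active disk (~7,5-8W in the text)
RAID_CONTROLLER_W = 15   # watts per RAID controller
DISKS_PER_SITE = 10      # disk count used in the text, parity disks included
SITES = 2                # storage is mirrored to two datacenters
SAN_PORTS = 4            # port count used in the text
SAN_W_PER_GBPS = 0.1     # extra watts per port for each GB/s of traffic
BACKUP_GBPS = 0.5        # ~500MB/s backup stream


def site_storage_watts(disks: int, disk_w: float, controller_w: float) -> float:
    """Power for one site's disks plus its RAID controller."""
    return disks * disk_w + controller_w


def san_traffic_watts(ports: int, w_per_gbps: float, gbps: float, sites: int) -> float:
    """Extra SAN power caused by the backup traffic, multiplied out as in the text."""
    return ports * w_per_gbps * gbps * sites


per_site_w = site_storage_watts(DISKS_PER_SITE, DISK_W, RAID_CONTROLLER_W)
mirrored_w = per_site_w * SITES
total_w = mirrored_w + san_traffic_watts(SAN_PORTS, SAN_W_PER_GBPS, BACKUP_GBPS, SITES)

print(f"per site: {per_site_w} W, mirrored: {mirrored_w} W, with SAN: {total_w:.1f} W")
```

Changing a single constant, for example the number of disks or the per-disk wattage, shows how sensitive the total is to each assumption.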

Assume we use disk as the backup medium too; then we need additional storage for that as well. Without any data reduction techniques, we need at least 6 times the size of the database, but we can go with larger disks (perhaps 3-4 times larger): 10TB x 6 full copies / 3TB per disk = 20 disks, or about the same power consumption as the database server storage (10TB mirrored to two sites).

20 disks x 8W + 15W RAID controllers = 175W
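As a minimal sketch of the backup storage sizing above, assuming 3TB disks and the same 8W per disk and 15W controller figures used earlier (the function names are hypothetical):

```python
import math


def backup_disks(db_tb: float, full_copies: int, disk_tb: float) -> int:
    """Disks needed to hold all retained full copies, without data reduction."""
    return math.ceil(db_tb * full_copies / disk_tb)


def storage_watts(disks: int, disk_w: float = 8, controller_w: float = 15) -> float:
    """Power for the backup disks plus one RAID controller."""
    return disks * disk_w + controller_w


disks = backup_disks(db_tb=10, full_copies=6, disk_tb=3)
watts = storage_watts(disks)
print(f"{disks} disks, {watts} W")
```

Raising `disk_tb` to 4 lowers the disk count, which is why the choice of disk size matters for the estimate.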

A 10Gbit network card consumes about 20W, and can use up to 400% CPU. We need at least one for the client and one for the backup server.

2 NIC x 20W = 40W

40W + 175W = 215W

Let us skip the other power consumers and try to sum it up.

NIC for backup server: 20W

Storage for backup server: 175W

==========================

Backup power: 195W

 

NIC for client: 20W

Storage for client: 185W

SAN ports: 0,4W

============================

client power: 205,4W

 

If the backup runs for 5h, then the combined power draw during that time is 205,4W + 195W = 400,4W.

400,4W x 5h = ~2,0kWh

There will be 52 full copies per year, so the total energy would be 2,0kWh x 52 = 104kWh/year.

How much pollution?

According to an article in France, each kWh produces 80-100g of carbon dioxide pollution. That would mean 104kWh x 100g = 10400g (about 10,4kg) of CO2 pollution yearly.

I know what you are thinking: we need this disk performance anyway.

But why design it for the backup?

Can this pollution be reduced?

With the Spictera storage brick solution, there is no need to take periodic full copies of the system.

This reduces not only the amount of backup storage needed, but also all the resources needed to read the data and send the data.

So if the daily change on the system is 10%, then the power needed just to back it up is reduced by 90% compared to when a full backup is performed. As there are 52 full copies per year, we put the magnitude of the reduction at 52 x 90% = 46,8 times less energy consumption.

So the 104kWh/year needed can be reduced to about 2,2kWh/year, or 2,2kWh x 100g = 220g of CO2 pollution per year.
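The yearly energy and CO2 estimate can be written as a small calculation. This is a sketch under the assumptions stated in the text: the two subtotals from the summary, 5h per full backup, 52 full backups per year, 100g of CO2 per kWh, and the rough 52 x 90% reduction factor. The function names are hypothetical.

```python
def annual_energy_kwh(power_w: float, hours_per_backup: float, backups_per_year: int) -> float:
    """Energy in kWh: power (W) x duration (h) x number of backups / 1000."""
    return power_w * hours_per_backup * backups_per_year / 1000


def co2_grams(energy_kwh: float, grams_per_kwh: float = 100) -> float:
    """CO2 in grams for a given energy, using the upper figure of 100g per kWh."""
    return energy_kwh * grams_per_kwh


CLIENT_W = 205.4   # client subtotal from the summary
BACKUP_W = 195.0   # backup server subtotal from the summary
REDUCTION_FACTOR = 52 * 0.9   # rough factor used in the text, not a measurement

full_kwh = annual_energy_kwh(CLIENT_W + BACKUP_W, hours_per_backup=5, backups_per_year=52)
reduced_kwh = full_kwh / REDUCTION_FACTOR

print(f"weekly fulls: {full_kwh:.1f} kWh/year, {co2_grams(full_kwh):.0f} g CO2")
print(f"without periodic fulls: {reduced_kwh:.1f} kWh/year, {co2_grams(reduced_kwh):.0f} g CO2")
```

Keeping watts (power) separate from kilowatt-hours (energy) in the function signatures makes it harder to mix up the two units.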

Note that these calculations are not intended to be exact, but they show that the solution not only saves money and time, it also reduces carbon dioxide pollution.

That is why our Spictera solutions are climate-smart solutions.

How much energy can you save during a restore?

A restore is often performed in several steps: restore the last full copy, then all daily incrementals, and lastly all transactional data.

That means that the actual transferred volume is larger than the total volume of the system.

On top of that, you have to restore all the data even if you only need a small portion of it, such as a field, a row, or a table.

Assume a daily change of 10%, a weekly full backup, and a crash that happens as far as possible from the last full backup. Then you need the full copy (100%), plus 6 x 10% for the incremental copies, plus all of the transactional data, which can also be up to 10%. In total 100 + 6×10 + 10 = 170% has to be transferred, just to have your application restored to the given point in time.

With the Spictera storage brick solution, backups can be taken more often, which means less transactional data is needed for restoration and recovery.

Besides that, each backup represents a view of how the system looks at that point in time, which means that the transferred volume is at most 100%, regardless of the point in time selected. If we want to restore all data, then we need 100% plus transactional data, let's say 10% to be kind here. That is 110% in total compared to 170% in the example above, a reduction of 60 percentage points (about 35% less data transferred) just here.

But if only a portion of the data needs to be restored, much less data has to be transferred.
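The two restore volumes can be compared with a short calculation. This is a sketch using the percentages assumed in the text (10% daily change, 6 incrementals, up to 10% transactional data); the function names are made up for illustration.

```python
def traditional_restore_pct(daily_change_pct: float, incrementals: int, log_pct: float) -> float:
    """Full copy (100%) + every incremental + transactional data, in % of system size."""
    return 100 + incrementals * daily_change_pct + log_pct


def point_in_time_restore_pct(log_pct: float) -> float:
    """One point-in-time view (at most 100%) + transactional data."""
    return 100 + log_pct


traditional = traditional_restore_pct(daily_change_pct=10, incrementals=6, log_pct=10)
point_in_time = point_in_time_restore_pct(log_pct=10)

saved_points = traditional - point_in_time    # difference in percentage points
relative_saving = saved_points / traditional  # share of the traditional transfer avoided

print(f"{traditional}% vs {point_in_time}%: "
      f"{saved_points} points, {relative_saving:.0%} less data transferred")
```

The difference in percentage points and the relative saving are reported separately, since they answer slightly different questions.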

Spictera storage brick has a technique called instant restore, where each snapshot is provisioned from the IBM Spectrum Protect storage via normal Spectrum Protect Client API calls to the server; and bound to the volume.

This snapshot is then reverted to by the software using the normal OS "revert to previous snapshot" feature (merging back to the previous snapshot in the background), which means that the data is available nearly instantly regardless of its size.

The administrator can then start to pull out the data of interest, while the restoration is performed in the background by the operating system.

When the administrator has pulled out the data of interest, the restore can be terminated. This means that the amount of data actually transferred during the restoration is only the time it takes the administrator to pull out the data of interest multiplied by the restore speed. If it takes 10 minutes to pull out the data, compared with 5h for a full restore, then the saving is 10 minutes versus 300 minutes (5h x 60 minutes) plus the incremental backups (70% x 300 minutes = 210 minutes), so roughly 10 versus 510 minutes in time, and about the same reduction in energy consumption and pollution.
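The time comparison above works out as follows. This is a sketch with the figures from the text (5h full restore, incrementals adding 70% on top, 10 minutes to pull out the data); the variable names are illustrative only.

```python
FULL_RESTORE_HOURS = 5    # time to restore the full copy
INCREMENTAL_PCT = 70      # incrementals plus transactional data, in % of the full copy
EXTRACT_MINUTES = 10      # time the administrator needs to pull out the data

full_minutes = FULL_RESTORE_HOURS * 60
incremental_minutes = full_minutes * INCREMENTAL_PCT / 100
traditional_minutes = full_minutes + incremental_minutes

# With instant restore, the transfer only runs while the data is being extracted.
saved_minutes = traditional_minutes - EXTRACT_MINUTES

print(f"traditional: {traditional_minutes:.0f} min, instant restore: {EXTRACT_MINUTES} min, "
      f"saved: {saved_minutes:.0f} min")
```

If energy use is assumed to scale with transfer time, the same ratio gives a rough idea of the energy saved.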
