Hello and Happy Monday to you all!
This is quite a long tale, but it involves a serious issue, so I hope you can bear with it.
I have been using (initially) HP Data Protector Express and (more recently) Yosemite Server Backup to look after the tape backups that I rely on at my site. The initial software came with our HP 1/8 G2 Tape Autoloader, which has an LTO-4 tape drive.
The tape system is used for weekly off-site data and archive purposes, using weekly, monthly and yearly tape sets. The tape sets started off by running to two volumes, but five years later we are now up to 5 tapes. The hardware has pretty much performed flawlessly, but in general it only runs for around 36 hours in any given week.
Until a week ago, I had always been able to restore my data without incident and in a time not dissimilar to the actual backup time. I had enough faith in the system (and no other option) that I was confident enough to trust the setup when I needed to remove and then recover some very large VMs after a serious SAN issue and subsequent rebuild.
To the point. Last week I was asked to recover some Excel files for a user. The data was stored on tapes from August 2016, but no problem, I thought. Due to storage limitations, I have to restore a full VM from tape back to my backup server running Dell's (now Quest) vRanger software. I then basically run a file-level restore from the manifest file and the job's a good one. This process would usually take around 6-8 hours, which is not a problem. Last week, however, the tape recovery for the files was still running three days later. This is not acceptable.
At this point I called support, and the help I got ranged from outstanding to poor. The upshot, though, is that I am still no further on with the problem after exhausting all the support available from Barracuda / Yosemite Server Backup.
With further research, I noticed that this problem seemed to have started when we moved from 4 to 5 tapes. The first time that the backup ran with the additional tape, the backup seemed to run as usual, but the verify would linger on the last tape with a very slow throughput. The next time that the job ran, however, everything would be back to normal with regard to backup and verify times.
My testing suggests that this only causes a problem on the last tape in a set; the other tapes can be read as quickly as usual, at about 4GB/minute. The final tape, though, only gets read at between 60 and 100MB/minute, and as the VM I am trying to recover is >600GB, this takes too long.
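As a rough sanity check on those figures, here is a small back-of-the-envelope sketch in Python. It assumes the worst case, where the whole 600GB is read at the slow rate (in practice only the last tape is affected), and it assumes 1GB = 1024MB. The function name `restore_hours` is just something made up for illustration.

```python
def restore_hours(size_gb, rate_mb_per_min, mb_per_gb=1024):
    """Estimate hours needed to read size_gb from tape at rate_mb_per_min.

    Assumption: the rate is constant for the whole read, so this is an
    upper bound when only the last tape in the set is slow.
    """
    total_mb = size_gb * mb_per_gb
    minutes = total_mb / rate_mb_per_min
    return minutes / 60


if __name__ == "__main__":
    # Slow rates seen on the last tape, plus the normal ~4GB/minute rate.
    for rate in (60, 100, 4 * 1024):
        hours = restore_hours(600, rate)
        print(f"{rate:>5} MB/min -> {hours:6.1f} h ({hours / 24:.1f} days)")
```

At 60 to 100MB/minute this works out to somewhere between roughly 102 and 171 hours (about four to seven days) for 600GB, against about 2.5 hours at the normal rate, which is in line with a restore that was still running after three days.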
The extra tapes, by the way, are brand-new HP tapes and work perfectly in all other circumstances. No errors are being captured in the OS (Windows Server 2008 R2) event logs, the application software logs or the HP tape hardware logs. The data is all coming from the same server(s) and components, and the network is not to blame.
Can anybody advise me where to go from here? I would love to read the tapes with a third-party product to identify where the throughput problem lies or, even better, to hear from somebody who has gone through something similar, even with a different software application.