I have been using DPM for about 7 months now. (I tested with it for a few months before that.) I never installed 2006, but 2007 seems to be working ok. I have a few complaints, but I have complaints about all the backup software that I have ever used. None of it really makes me happy. But on to the story…
I have 3 production DPM servers. One of them has a large number of protection group members: 8 protection groups with 328 members. That is to protect just 39 computers, though one of the SQL servers has about 150 databases.
I noticed the problem because I kept running out of space on the recovery point volumes. For one particular 2008 domain controller, I had to extend the system state recovery point volume every couple of days. I was keeping the recovery points on disk for 5 days, so it finally occurred to me that it should not take more than 200 GB to keep 5 days' worth of recovery points for the system state.
I called and opened a ticket with Microsoft, and we have been working on this for almost 2 months. So far, the best that I can tell is that the process that clears the old recovery points slowly eats up memory. This, coupled with the fact that I have a lot of PG members, means that the job frequently fails before it completes. As the number of recovery points continues to grow, the job that clears them (pruneshadowcopies) takes longer and uses more memory, which increases the chance that it will fail…
I don’t have a solution to this problem yet, other than a few work-arounds and a way to manually run the process:
- Add more RAM to your DPM server, especially if you are running SQL locally on the box.
- Reduce the number of PG members. Fewer members means fewer recovery points and less chance that the prune job will fail.
- Open the DPM Management Shell (DPM PowerShell) and run “pruneshadowcopies.ps1”. This manually runs the job that DPM triggers at midnight every night. If you have a lot of recovery points that haven’t been pruned, it will probably fail (crash) a few times before it finishes. I have had it run all weekend and then crash, and I have seen it run for just an hour and then crash. Keep running it, and it will eventually finish.
- Hope that Microsoft comes up with a real fix soon…
To see if you have this problem, you can use a version of the pruneshadowcopies script that just shows the recovery points, without actually expiring them. The tech that I have been working with on my case sent it to me.
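As a rough cross-check that does not depend on that script, the standard DPM Management Shell cmdlets can show how many recovery points each data source is holding. The following is only a sketch and is not the script from Microsoft. It assumes that it is run in the DPM Management Shell on the DPM server itself, that the retention range is the 5 days described above, and that each recovery point's `RepresentedPointInTime` value can be compared directly against the local clock (if it is reported in UTC, the counts near the cutoff will be slightly off).

```powershell
# Sketch: count recovery points per data source and flag those older than
# the retention range. Assumption: 5 days, as in the setup described above.
$retentionDays = 5
$cutoff = (Get-Date).AddDays(-$retentionDays)

# Assumption: the shell is running on the DPM server, so the local
# computer name is the DPM server name.
foreach ($pg in @(Get-ProtectionGroup -DPMServerName $env:COMPUTERNAME)) {
    foreach ($ds in @(Get-Datasource -ProtectionGroup $pg)) {
        # @() forces an array even when zero or one recovery point comes back.
        $points  = @(Get-RecoveryPoint -Datasource $ds)
        $expired = @($points | Where-Object { $_.RepresentedPointInTime -lt $cutoff })

        "{0} / {1}: {2} recovery points, {3} older than {4} days" -f `
            $pg.FriendlyName, $ds.Name, $points.Count, $expired.Count, $retentionDays
    }
}
```

A handful of points just past the cutoff is probably normal, since pruning only runs once a night. A data source with far more old recovery points than the retention range allows would be consistent with the prune job not completing.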