We see the same failure in StorageCraft's design. The fact that an admin decommissioning a single server caused them to lose all metadata suggests a lack of geo-redundancy and fault tolerance in their backup storage architecture. They had a single server with a single copy of the one piece of data that would allow them to piece together all of their customers' backups. Again, a fire, electrical short, or flood could have just as easily wiped out this company. In the end, however, it was a single human error. As a reminder: human error and malfeasance are the number one and number two reasons why we back up in the first place.
As a backup expert with decades of experience, these incidents make me cringe. The 3-2-1 backup rule exists for a reason: 3 copies of your data, on 2 different media, with 1 copy off-site. Any responsible backup provider should be architecting their cloud with multiple layers of redundancy, geo-replication, and fault isolation. Anything less is putting customer data at unacceptable risk. The loss of a single copy of any data in any backup environment should not result in the loss of all copies.
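The 3-2-1 rule and the "losing one copy must not lose everything" requirement can both be checked mechanically against a storage inventory. The Python sketch below is an added example, not code from the original article or from any vendor; the inventory, server names, site names and the `primary_site` default are constructed and hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Copy:
    server: str   # host holding this copy
    site: str     # physical location or region
    media: str    # e.g. "disk", "tape"

# Constructed inventory: dataset name -> copies. All names are hypothetical.
INVENTORY = {
    "customer-backups": [
        Copy("stor-a1", "dc-east", "disk"),
        Copy("stor-b1", "dc-west", "disk"),
        Copy("vault-1", "offsite-vault", "tape"),
    ],
    # The failure mode described above: one copy of the metadata on one server.
    "backup-metadata": [Copy("meta-01", "dc-east", "disk")],
}

def check_321(copies, primary_site):
    """Return the list of 3-2-1 rule violations for one dataset."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies (need 3)")
    if len({c.media for c in copies}) < 2:
        problems.append("fewer than 2 media types")
    if not any(c.site != primary_site for c in copies):
        problems.append("no off-site copy")
    return problems

def single_points_of_failure(inventory):
    """Simulate losing each server and each site; report datasets left with no copy."""
    findings = []
    for attr in ("server", "site"):
        domains = {getattr(c, attr) for cs in inventory.values() for c in cs}
        for domain in sorted(domains):
            for name, copies in inventory.items():
                if all(getattr(c, attr) == domain for c in copies):
                    findings.append((attr, domain, name))
    return findings

def audit(inventory, primary_site="dc-east"):
    report = {n: check_321(c, primary_site) for n, c in inventory.items()}
    return report, single_points_of_failure(inventory)

if __name__ == "__main__":
    print(*audit(INVENTORY), sep="\n")
```

The metadata catalog is audited as its own dataset because, as in the incident above, well-replicated backups are useless if the index needed to reassemble them lives in one place.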
When something like this happens, you also look at how the company handled it. Carbonite's response was to sue their storage vendor, pointing fingers instead of taking responsibility. They saw nothing wrong with their design; it was their storage vendor's storage array that caused them to lose customer data. (The lawsuit was settled out of court with no public record of what happened.) Carbonite's CEO also tried to publicly downplay the incident, saying it was only backup data, not production data, that had been lost. This was a point that was probably lost on the 54 companies who did lose production data because they needed to perform a restore that would have been possible only with the backup data.