C2 Storage Data Encryption and Durability Strategy

Data Durability

Cloud storage providers are expected not only to prevent unauthorized access to private files, but also to guarantee that data remains continuously available and intact over long periods of time.

Users must be able to retrieve the files at a moment’s notice, free of any data errors, even after years of storage. The ability to keep stored data consistent and intact, without the influence of bit rot, drive failures, or any form of corruption, is called durability.

Many cloud storage providers list a number of “nines” of durability. Synology follows procedures preferred by industry leaders to offer an estimated “nine nines” (99.9999999%) to “twelve nines” (99.9999999999%) of data durability according to widely used definitions. The protection provided exceeds that of available RAID configurations.

For a critical discussion of the calculation and use of this statistic, refer to this blog about data durability by our research and development staff.
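
As a rough illustration of what a given number of nines means, the sketch below converts a nines count into a durability fraction and an expected number of lost objects per year. It assumes the figure is read as an annual, per-object probability, which is one common interpretation and not necessarily the definition Synology uses. The function names and the object count are hypothetical.

```python
def durability_from_nines(nines: int) -> float:
    """Return durability as a fraction, e.g. 3 nines -> 0.999."""
    return 1.0 - 10.0 ** (-nines)


def expected_annual_losses(object_count: int, nines: int) -> float:
    """Expected number of objects lost per year.

    Assumption: the durability figure is an annual, per-object probability
    and object losses are independent of one another.
    """
    return object_count * 10.0 ** (-nines)


if __name__ == "__main__":
    # Hypothetical workload: one billion stored objects.
    for nines in (9, 12):
        losses = expected_annual_losses(1_000_000_000, nines)
        print(f"{nines} nines: about {losses:g} objects lost per year")
```

Under these assumptions, nine nines corresponds to roughly one lost object per year for every billion stored, and twelve nines to roughly one per trillion.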

Fault-Tolerant Storage

Synology C2 data center architecture ensures that no valuable data is lost. Highly available and redundant infrastructure minimizes risks by physically eliminating so-called "single points of failure." This means parallel systems stand ready to take over if the main configuration experiences downtime.

Meanwhile, strategic policies and coding measures prevent data loss or corruption if hardware failures nevertheless occur.

Erasure Coding for Data Durability

Figure 3: Data uploaded to C2 servers are split into pieces and encoded, generating parity pieces that keep data retrievable when one or more servers are down.

Synology employs erasure coding, the gold standard in data durability technologies, to safeguard data integrity in the face of server crashes, drive failures, and write errors. Like most RAID configurations, erasure coding distributes data together with redundant information so that damage can be detected and repaired.

Hyper Backup divides files selected for backup into data chunks of about 50 MB for upload. After encryption and transmission (as well as optional compression and deduplication) each chunk is distributed over several data pieces hosted on as many discrete servers.

Several pieces out of each set are redundant, so a number of servers, drives, or pieces may be lost or damaged at any time without compromising the ability to retrieve the original chunk or to check its integrity. The configurations used reduce the likelihood of losing a chunk to a probability with many zeros after the decimal point, which makes such a loss practically impossible.
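
The principle can be shown with a deliberately simplified sketch. It uses a single XOR parity piece, the simplest possible erasure code, which can repair exactly one lost piece. This is not Synology's scheme, which tolerates several lost pieces and whose exact code is not described here. The function names and the piece count are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(chunk: bytes, k: int) -> List[bytes]:
    """Split a chunk into k equal data pieces plus one XOR parity piece."""
    piece_len = -(-len(chunk) // k)  # ceiling division
    padded = chunk.ljust(piece_len * k, b"\0")  # zero-pad the last piece
    pieces = [padded[i * piece_len:(i + 1) * piece_len] for i in range(k)]
    parity = reduce(xor_bytes, pieces)
    return pieces + [parity]


def recover(pieces: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the chunk when at most one piece (data or parity) is missing."""
    missing = [i for i, piece in enumerate(pieces) if piece is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost piece")
    repaired = list(pieces)
    if missing:
        present = [piece for piece in pieces if piece is not None]
        # XOR of all surviving pieces equals the missing one.
        repaired[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity piece and the padding.
    return b"".join(repaired[:-1])[:original_len]


if __name__ == "__main__":
    chunk = b"data chunk selected for backup"
    stored = encode(chunk, k=4)
    stored[1] = None  # simulate one server being down
    assert recover(stored, len(chunk)) == chunk
```

Codes that tolerate more than one lost piece follow the same idea but compute several independent parity pieces instead of one.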

Bit rot can also be detected. When data is written to the hard drive, the storage service calculates an MD5 checksum and records it in the file's extended attributes, which let a file carry metadata that the file system itself does not interpret. When the storage service later reads the data, it recalculates the MD5 checksum and compares it with the recorded value. If the two do not match, the affected data is quarantined, and the healthy redundant data is used to recover the data set.
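
The write-then-verify cycle can be sketched as follows. This is a minimal illustration, not the storage service's actual code: an in-memory dictionary stands in for the extended file attributes, and the names `write_piece`, `read_piece`, and `recorded_checksums` are hypothetical. On Linux, a real implementation could store the value with Python's `os.setxattr` and read it back with `os.getxattr`.

```python
import hashlib
from typing import Dict, Optional

# Hypothetical stand-in for extended file attributes:
# maps a piece ID to the checksum recorded at write time.
recorded_checksums: Dict[str, str] = {}


def write_piece(store: Dict[str, bytes], piece_id: str, data: bytes) -> None:
    """Store the data and record its MD5 checksum alongside it."""
    store[piece_id] = data
    recorded_checksums[piece_id] = hashlib.md5(data).hexdigest()


def read_piece(store: Dict[str, bytes], piece_id: str) -> Optional[bytes]:
    """Return the data if its checksum still matches, otherwise None.

    A None result signals that the caller should quarantine the piece
    and rebuild it from the redundant pieces.
    """
    data = store[piece_id]
    if hashlib.md5(data).hexdigest() != recorded_checksums[piece_id]:
        return None
    return data


if __name__ == "__main__":
    disk: Dict[str, bytes] = {}
    write_piece(disk, "piece-0", b"backup data")
    assert read_piece(disk, "piece-0") == b"backup data"

    # Simulate bit rot by flipping one bit on the "disk".
    corrupted = bytearray(disk["piece-0"])
    corrupted[0] ^= 0x01
    disk["piece-0"] = bytes(corrupted)
    assert read_piece(disk, "piece-0") is None
```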

Synology’s Erasure Coding Setup

Synology employs erasure coding setups that ensure at least three pieces of redundancy. This means that if each data chunk is distributed over 15 pieces on 15 nodes, only 12 of these are needed to reconstruct any file.

In other words, up to three devices or pieces can be compromised without users losing access. Hardware-level failures, if they occur, are thus highly unlikely to affect Synology C2 data.

In the above example, any file can be reconstructed from any combination of 12 data pieces. This means Synology’s cloud storage setup offers significantly higher redundancy than RAID 5 configurations (1 disk redundancy) or RAID 6 storage setups (which can tolerate 2 broken disks).
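
The effect of the extra redundancy can be estimated with a short calculation. The sketch below computes the probability that more pieces fail than a setup can tolerate, assuming each piece fails independently with the same probability. Both the independence assumption and the failure probability `p` are illustrative choices, not figures published by Synology, and the function name is hypothetical.

```python
from math import comb


def loss_probability(n: int, k: int, p: float) -> float:
    """Probability that a chunk is unrecoverable.

    n: total pieces, k: pieces needed to reconstruct,
    p: assumed independent failure probability of a single piece.
    The chunk is lost when more than n - k pieces fail at once.
    """
    tolerated = n - k
    return sum(
        comb(n, failed) * p ** failed * (1 - p) ** (n - failed)
        for failed in range(tolerated + 1, n + 1)
    )


if __name__ == "__main__":
    p = 0.01  # hypothetical per-piece failure probability
    # Same number of pieces, different numbers of tolerated failures.
    for label, k in (("1 redundant piece", 14),
                     ("2 redundant pieces", 13),
                     ("3 redundant pieces", 12)):
        print(f"{label}: {loss_probability(15, k, p):.3e}")
```

With the number of pieces held constant, each additional redundant piece lowers the estimated loss probability by a large factor, which is the reasoning behind the comparison with single- and double-redundancy RAID levels.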

Unlike with RAID configurations, which need time to rebuild following failure, recovery of files in C2 Storage's erasure-coded setup is fast and painless.

