n0derunner

Detecting and correcting hardware errors using Nutanix Filesystem.

Published: (Updated: ) in Nutanix, , by .

It’s good to detect corrupted data.  It’s even better to transparently repair that data and return the correct data to the user.  Here we will demonstrate how Nutanix filesystem detects and corrects corruption.  Not all systems are made equally in this regard.  The topic of corruption detection and remedy was the focus of this excellent Usenix paper Redundancy Does Not Imply Fault Tolerance: Analysis of Distributed Storage Reactions to Single Errors and Corruptions. The authors find that many systems that should in theory be able to recover corrupted data do not in fact do so.

Within the guest Virtual Machine

[root@gary]# od -x /dev/sdg

0000000 adde efbe adde efbe adde efbe adde efbe
*
10000000000
[root@gary]# sha1sum /dev/sdg

1c763488abb6e1573aa011fd36e5a3e2a09d24b0  /dev/sdg

Within the Nutanix CVM

nutanix@NTNX $ od -x 10808705.egroup

0000000 adde efbe adde efbe adde efbe adde efbe
*
20000000
nutanix@ $ dd if=/dev/zero of=10846352.egroup bs=1024k count=4

Back to the client VM

[root@gary-tpc tmp]# sha1sum /dev/sdg

1c763488abb6e1573aa011fd36e5a3e2a09d24b0  /dev/sdg. <— Same SHA1 digest as the "pre corruption" state.

Magic revealed

Logs from the Nutanix VM

Here are the logs from Nutanix:  notice group 10846352 is the one that we deliberately corrupted earlier

E0315 13:22:37.826596 12085 disk_WAL.cc:1684] Marking extent group 10846352 as corrupt reason: kSliceChecksumMismatch

I0315 13:22:37.826755 12083 vdisk_micro_egroup_fixer_op.cc:156] vdisk_id=10808407 operation_id=387450 Starting fixer op on extent group 10846352 reason -1 reconstruction mode 0 (gflag 0)  corrupt replica autofix mode  (gflag auto)  consistency checks 0 start erasure overwrite fixer 0

I0315 13:22:37.829532 12086 vdisk_micro_egroup_fixer_op.cc:801] vdisk_id=10808407 operation_id=387449 Not considering corrupt replica 38 of egroup 10846352