Tuesday, March 28, 2023

NAS file system repair: Terra-master F5-422

I'm writing this in case anyone else runs into the problem I had with the F5-422 NAS last week.

The Terra-master 50TB RAID1 F5-422 NAS here experienced its first power failure, probably while in active use, and did not take it well.

While I have had to reboot it a few times since it was deployed in 2020, it always came back up normally. This time was different.

Two things went against it this time:

  • Actual power failure vs a reboot via pressing the power button
  • Actively writing files during the power failure event

The immediate symptom was that mount requests from client devices were refused and the web UI showed a factory-reset state (it asked for an email address, etc., to start configuration).

Fortunately, both telnet and ssh as the root user on port 9222 worked with the previously set admin password.

I tried Terra-master support, and got exactly the level of support I expected. They suggested I reboot. I explained I did. They suggested I reboot again. And again. And again. Ok, enough of this. I'm on my own.

BTRFS is the file system used in this NAS. Attempting to manually mount the file system from the command line resulted in a superblock error:

mount -t btrfs -o ro,usebackuproot /dev/mapper/vg0-lv0 /mnt/jdraidrecovery/
mount: /mnt/jdraidrecovery: can't read superblock on /dev/mapper/vg0-lv0.

I tried doing a read-only check:

btrfs check /dev/mapper/vg0-lv0

That quickly failed due to RAM exhaustion. The NAS came with 4GB of RAM, and it would seem that is nowhere near enough to check a 50TB RAID. There is supposed to be a low-memory check mode, but it didn't appear to be available in the installed btrfs utilities. The RAM shortage would explain why a reboot wasn't recovering the NAS: from the logs, it looks like recovery was started but was quickly killed by the OS for exhausting RAM.
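For anyone wanting to confirm the same diagnosis, here is a rough sketch of the two checks described above: whether the installed btrfs build lists a low-memory mode, and whether the kernel log shows the OOM killer at work. The function names has_lowmem and oom_lines are made up for this example, and the grep patterns are my assumption about how the kernel words these messages, so adjust them to what your log actually contains.

```shell
#!/bin/sh
# Sketch only: helper names and grep patterns are illustrative assumptions.

# 1. Does this btrfs-progs build mention the low-memory check mode?
#    Newer releases list it in the help text as --mode=lowmem.
has_lowmem() {
    btrfs check --help 2>&1 | grep -q lowmem
}

# 2. Filter kernel log text (read from stdin) for OOM-killer activity.
oom_lines() {
    grep -i -E 'out of memory|oom-killer|oom_reaper'
}

if has_lowmem; then
    echo "lowmem mode listed: btrfs check --mode=lowmem may be worth a try"
else
    echo "lowmem mode not listed by this btrfs build"
fi

# Show any OOM kills recorded since boot, then the current memory and swap.
dmesg 2>/dev/null | oom_lines || echo "no OOM messages found in the kernel log"
free -m 2>/dev/null || true
```

If the OOM lines name the btrfs process, more memory (or swap) is the thing to fix before anything else.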

The fix:

Mount an external SATA SSD via a USB adapter, create a 16GB swap file on it, and then manually enable it with swapon.
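Roughly, the swap setup looks like the sketch below. The device name /dev/sdX1 and the mount point /mnt/usbssd are placeholders, not what I actually typed, so substitute whatever your system shows for the USB disk. As written it only prints the commands; set DRY_RUN=0 to actually run them as root.

```shell
#!/bin/sh
# Sketch only: SWAP_DEV and SWAP_MNT are hypothetical placeholders.
SWAP_DEV="${SWAP_DEV:-/dev/sdX1}"     # partition on the external SSD
SWAP_MNT="${SWAP_MNT:-/mnt/usbssd}"   # where to mount it
SWAP_MB="${SWAP_MB:-16384}"           # 16GB, expressed in MB

# Print each command; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_swap() {
    swap_file="$SWAP_MNT/swapfile"
    run mkdir -p "$SWAP_MNT"
    run mount "$SWAP_DEV" "$SWAP_MNT"
    # Fill the file with zeros so it has no holes, as swap requires.
    run dd if=/dev/zero of="$swap_file" bs=1M count="$SWAP_MB"
    run chmod 600 "$swap_file"
    run mkswap "$swap_file"
    run swapon "$swap_file"
    # The Swap line should now show roughly 16GB more than before.
    run free -m
}

setup_swap
```

Swap over USB is slow, which probably accounts for a good part of the run time, but it lets the check finish instead of being killed.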

With the swap active, clear the space cache and then run the repair:

btrfs check --clear-space-cache v2 /dev/mapper/vg0-lv0

btrfs check --repair /dev/mapper/vg0-lv0


This took more than 5 hours to complete. However, when it completed I was able to manually mount the RAID device. After a reboot, all was normal.

Generally I'm quite pleased with this device. It offered a 10Gbps Ethernet port at a very competitive price point. However, the price you pay for that cheapness is having to figure out problems like this on your own.