Recovering a RAID array in “[E]” state on a Synology NAS

Tuesday, May 19th, 2015

WARNING: If you encounter a similar issue, try contacting Synology first; they are ultra responsive and solved my issue in less than a business day (although I’m no enterprise customer). The commands that Synology gave me, which I mention below, can wipe away all your data, so you’ve been warned :)

TL;DR: If you have a RAID array in the [E] (DiskError) state, which is a Synology-specific error state, then the only option seems to be to re-create the array and run a file system check/repair afterwards (assuming that your disks are fine to begin with).

Recently I learned that Synology introduced Docker support in their 5.2 firmware (yay!), but unfortunately for me, just when I was about to try it out, I noticed an ugly ORANGE LED on my NAS where I always like to see GREEN ones..

The NAS didn’t respond at all, so I had no choice but to power it off. I first tried gently, but that didn’t help, so I had to do it the hard way. Once it had restarted, another disk had an ORANGE LED, and at that point I understood that I was in for a bit of command-line fun :(

The Web interface was pretty clear with me: my Volume2 was Crashed (that didn’t look like good news :o) and couldn’t be repaired (through the UI, that is).

After fiddling around for a while through SSH, I discovered that my NAS created RAID 1 arrays for me (with one disk in each), which I wasn’t aware of; I actually never wanted to use RAID in my NAS!

I guess it makes sense for beginner users, as it allows them to easily expand capacity/availability without having to know anything about RAID. In my case, though, I wasn’t concerned about availability, and since RAID is no backup solution (hope you know why!), I didn’t want that at all. I have proper backups (on & off-site).

Well in any case I did have a crashed RAID 1 single disk array so I had to deal with it anyway.. :)

Here’s the output of some commands I ran which helped me better understand what was going on.

The /var/log/messages showed that something was wrong with the filesystem:

May 17 14:59:26 SynoTnT kernel: [   49.817690] EXT4-fs warning (device dm-4): ext4_clear_journal_err:4877: Filesystem error recorded from previous mount: IO failure
May 17 14:59:26 SynoTnT kernel: [   49.829467] EXT4-fs warning (device dm-4): ext4_clear_journal_err:4878: Marking fs in need of filesystem check.
May 17 14:59:26 SynoTnT kernel: [   49.860638] EXT4-fs (dm-4): warning: mounting fs with errors, running e2fsck is recommended

Running e2fsck at that point didn’t help.

A check of the disk arrays gave me more information:

> cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [
md2 : active raid1 sda3[0]
      3902296256 blocks super 1.2 [1/1] [U]

md6 : active raid1 sdc3[0]
      3902296256 blocks super 1.2 [1/1] [U]

md5 : active raid1 sdf3[0]
      3902296256 blocks super 1.2 [1/1] [U]

md3 : active raid1 sde3[0](E)
      3902296256 blocks super 1.2 [1/1] [E]

md7 : active raid1 sdg3[0]
      3902296256 blocks super 1.2 [1/1] [U]

md4 : active raid1 sdb3[0]
      1948792256 blocks super 1.2 [1/1] [U]

md1 : active raid1 sda2[0] sdb2[2] sdc2[4] sde2[1] sdf2[3] sdg2[5]
      2097088 blocks [8/6] [UUUUUU__]

md0 : active raid1 sda1[0] sdb1[2] sdc1[4] sde1[1] sdf1[3] sdg1[5]
      2490176 blocks [8/6] [UUUUUU__]

unused devices: <none>

As you can see above, the md3 array was active but in a weird [E] state. After Googling a bit I discovered that the [E] state is specific to Synology, as that guy explains here. Synology doesn’t provide any documentation around this marker; they only state in their documentation that we should contact them if a volume is Crashed.
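To spot that marker without reading the whole listing, a small filter over /proc/mdstat can help. This is only a minimal sketch: the function name list_error_arrays is made up for illustration, and it assumes the status field at the end of the "blocks" line contains only the characters U, _ and E, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every md array whose status
# field (the last bracketed field, e.g. "[U]" or "[E]") contains an E.
# Reads mdstat-formatted text from the given file (default: /proc/mdstat).
list_error_arrays() {
    awk '
        # An array header such as "md3 : active raid1 sde3[0](E)"
        /^md[0-9]+ :/ { name = $1; next }
        # The following "blocks" line ends with the per-device status
        name != "" && $NF ~ /^\[[U_E]+\]$/ && index($NF, "E") > 0 {
            print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Run against the listing above, it prints md3 and nothing else, since md3 is the only array with an E in its status field.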

Continuing, I took a detailed look at the md3 array and the ‘partition’ attached to it, which looked okay; so purely from a classic RAID array point of view, everything was alright!

> mdadm --detail /dev/md3
        Version : 1.2
  Creation Time : Fri Jul  5 14:59:33 2013
     Raid Level : raid1
     Array Size : 3902296256 (3721.52 GiB 3995.95 GB)
  Used Dev Size : 3902296256 (3721.52 GiB 3995.95 GB)
   Raid Devices : 1
  Total Devices : 1
    Persistence : Superblock is persistent

    Update Time : Sun May 17 18:21:27 2015
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : SynoTnT:3  (local to host SynoTnT)
           UUID : 2143565c:345a0478:e33ac874:445e6e7b
         Events : 22

    Number   Major   Minor   RaidDevice State
       0       8       67        0      active sync   /dev/sde3

> mdadm --examine /dev/sde3
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 2143565c:345a0478:e33ac874:445e6e7b
           Name : SynoTnT:3  (local to host SynoTnT)
  Creation Time : Fri Jul  5 14:59:33 2013
     Raid Level : raid1
   Raid Devices : 1

 Avail Dev Size : 7804592833 (3721.52 GiB 3995.95 GB)
     Array Size : 7804592512 (3721.52 GiB 3995.95 GB)
  Used Dev Size : 7804592512 (3721.52 GiB 3995.95 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : a2e64ee9:f4030905:52794fc2:0532688f

    Update Time : Sun May 17 18:46:55 2015
       Checksum : a05f59a0 - correct
         Events : 22

   Device Role : Active device 0
   Array State : A ('A' == active, '.' == missing)

See above, all clean!

So at this point I realized that I only had a few options:

  • hope that Synology would help me fix it
  • try and fix it myself using arcane mdadm commands to recreate the array
  • get a spare disk and copy my data to it before formatting the disk, re-creating the shares and putting the data back (booooringgggggg)

To be on the safe side, I saved a copy of the output of each command so that I had at least the initial state of the array. To be honest, at this point I didn’t dare go further, as I didn’t know what re-creating the RAID array could do to my data if I did something wrong (which I probably would have :p).
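Saving that initial state can be scripted. The following is a minimal sketch, not what I actually ran: the snapshot function and the SNAPSHOT_DIR location are hypothetical, and the directory should live on a disk that is not part of the broken volume.

```shell
#!/bin/sh
# Hypothetical location for the saved outputs; override it as needed.
SNAPSHOT_DIR="${SNAPSHOT_DIR:-/root/raid-snapshot}"

# snapshot LABEL COMMAND [ARGS...]
# Runs COMMAND and stores its stdout and stderr in $SNAPSHOT_DIR/LABEL.txt.
snapshot() {
    label=$1
    shift
    mkdir -p "$SNAPSHOT_DIR" || return 1
    "$@" > "$SNAPSHOT_DIR/$label.txt" 2>&1
}

# Example calls for the commands shown in this post:
#   snapshot mdstat       cat /proc/mdstat
#   snapshot md3-detail   mdadm --detail /dev/md3
#   snapshot sde3-examine mdadm --examine /dev/sde3
```

Keeping one file per command makes it easy to look up the UUID, metadata version and RAID level later on.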

Fortunately for me, my NAS is still supported and Synology fixed the issue for me (they connected remotely through SSH). I insisted on getting the commands they used, and here’s what they gave me:

> mdadm -Cf /dev/md3 -e1.2 -n1 -l1 /dev/sde3 -u2143565c:345a0478:e33ac874:445e6e7b
> e2fsck -pvf -C0 /dev/md3

As you can see above, they’ve used mdadm to re-create the array, specifying the same options as those used to initially create it:

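  • create the array and force the operation: -Cf (-C selects create mode; -f is needed because mdadm otherwise refuses to create an array with a single device)
  • the 1.2 RAID metadata (superblock) style: -e1.2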
  • the number of devices (1): -n1
  • the RAID level (1): -l1
  • the component device (the partition backing the array): /dev/sde3
  • the UUID of the array to create (the same as the one that existed before!): -u2143565c….
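For readability, the same command can be spelled with mdadm’s long options. The sketch below only prints the command and never runs it. The function name print_recreate_cmd is made up, the metadata version, level and device count are hard-coded to the values from this post, and I’m assuming that --uuid behaves in create mode exactly like the short -u that Synology used.

```shell
#!/bin/sh
# Hypothetical helper: print (do NOT execute) the re-create command
# using long option names.
# Arguments: md device, component device, array UUID.
print_recreate_cmd() {
    md_dev=$1
    component=$2
    uuid=$3
    printf 'mdadm --create --force %s --metadata=1.2 --raid-devices=1 --level=1 %s --uuid=%s\n' \
        "$md_dev" "$component" "$uuid"
}

# Example with the values from this post:
#   print_recreate_cmd /dev/md3 /dev/sde3 2143565c:345a0478:e33ac874:445e6e7b
```

Printing instead of executing is deliberate: the real command rewrites the RAID superblock, so it’s worth comparing every value against the saved --examine output before typing it for real.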

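The second command runs a file system check on the re-created array: -f forces the check even if the file system looks clean, -p (“preen”) automatically repairs whatever can be fixed safely without asking, -v makes the output verbose and -C0 displays a progress bar.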
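e2fsck reports its result through its exit status, which is a bitmask documented in its man page. Here is a minimal sketch for decoding it; the function name describe_e2fsck_status is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: translate an e2fsck exit status into text.
# The status is a sum of the bit values listed in the e2fsck man page,
# so several lines can be printed for a single status.
describe_e2fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $((status & 1)) -ne 0 ]   && echo "errors corrected"
    [ $((status & 2)) -ne 0 ]   && echo "errors corrected, reboot recommended"
    [ $((status & 4)) -ne 0 ]   && echo "errors left uncorrected"
    [ $((status & 8)) -ne 0 ]   && echo "operational error"
    [ $((status & 16)) -ne 0 ]  && echo "usage or syntax error"
    [ $((status & 32)) -ne 0 ]  && echo "cancelled by user"
    [ $((status & 128)) -ne 0 ] && echo "shared library error"
    return 0
}

# Example:
#   e2fsck -pvf -C0 /dev/md3
#   describe_e2fsck_status $?
```

A status with bit 4 set means that preen mode could not fix everything and that a manual e2fsck run is needed.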

And tadaaaa, problem solved. Thanks Synology! :)

As a sidenote, here are some useful commands:

# Stop all NAS services except SSH
> syno_poweroff_task -d

# Unmount a volume
> umount /volume2

# Get detailed information about a given volume
> udevadm info --query=all --name=/dev/mapper/vol2-origin
P: /devices/virtual/block/dm-4
N: dm-4
E: DEVNAME=/dev/dm-4
E: DEVPATH=/devices/virtual/block/dm-4
E: ID_FS_LABEL=1.42.6-3211
E: ID_FS_LABEL_ENC=1.42.6-3211
E: ID_FS_TYPE=ext4
E: ID_FS_USAGE=filesystem
E: ID_FS_UUID=19ff9f2b-2811-4941-914b-ef8ea3699d33
E: ID_FS_UUID_ENC=19ff9f2b-2811-4941-914b-ef8ea3699d33
E: MAJOR=253
E: SYNO_PLATFORM=cedarview

That’s it for today, time to play with Docker on my Synology NAS!


