NSA325v2 RAID 1 recovery

Alan
Alan Posts: 2  Freshman Member
edited August 2018 in Personal Cloud Storage
Hi,

I have had my NSA325 for a few years now. It is configured with 2x 5TB WD Red HDs in a RAID 1 (mirror) configuration. Recently it reported that the RAID was degraded, and SMART was reporting that DISK 2 was bad. I replaced DISK 2 with another 5TB WD Red HD that I had, but when I go back into the web console it shows the internal volume (volume1) as Inactive. When I go to Storage->Volume, the only action I am presented with is Delete. I am able to create a new volume on the new DISK 2 (as JBOD), so the NAS is seeing the new HD, but it isn't allowing me to repair the RAID volume. When I re-installed the bad DISK 2 it showed the RAID as degraded. After a few more disk swap attempts it is now in a state where it shows as Inactive with both of the original disks, and attempting to repair it fails after a few minutes.

Any suggestions?

Thanks

Alan

#NAS_Aug

All Replies

  • Mijzelf
    Mijzelf Posts: 2,607  Guru Member
    edited September 2018
    Can you open the telnet backdoor, login over telnet as root, and post the output of
    cat /proc/partitions
    cat /proc/mdstat
    mdadm --examine /dev/sd?2
  • Alan
    Alan Posts: 2  Freshman Member
    Thanks for replying.

    I had also emailed Zyxel support about this issue. Their solution was to back up the data, destroy the RAID array and re-create it.

    Not a very helpful "solution". They didn't even ask to look at any output from mdadm to work out what was going on.

    I decided to dump the device and purchased a Synology.
  • Mr_C
    Mr_C Posts: 14  Freshman Member
    Hi - I've exactly the same problem.  Just raised a case with Zyxel.  Any suggestions would be life saving....
  • Mr_C
    Mr_C Posts: 14  Freshman Member
    BTW, output per the above in case anyone's keen...:

    ~ $ cat /proc/partitions
    major minor  #blocks  name
       7        0     143360 loop0
       8        0 1953514584 sda
       8        1 1953512448 sda1
       8       16 1953514584 sdb
       8       17     514048 sdb1
       8       18 1952997952 sdb2
      31        0       1024 mtdblock0
      31        1        512 mtdblock1
      31        2        512 mtdblock2
      31        3        512 mtdblock3
      31        4      10240 mtdblock4
      31        5      10240 mtdblock5
      31        6      48896 mtdblock6
      31        7      10240 mtdblock7
      31        8      48896 mtdblock8

    ~ $ cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1]
    md0 : inactive sdb2[2](S)
          1952996928 blocks super 1.2
    unused devices: <none>

    ~ $ mdadm --examine /dev/sd?2
    mdadm: cannot open /dev/sda2: Permission denied
    mdadm: cannot open /dev/sdb2: Permission denied
    mdadm: cannot open /dev/sdc2: Permission denied
    mdadm: cannot open /dev/sdd2: Permission denied
    mdadm: cannot open /dev/sde2: Permission denied
    mdadm: cannot open /dev/sdf2: Permission denied
    mdadm: cannot open /dev/sdg2: Permission denied
    mdadm: cannot open /dev/sdh2: Permission denied
    mdadm: cannot open /dev/sdi2: Permission denied
    mdadm: cannot open /dev/sdj2: Permission denied
    mdadm: cannot open /dev/sdk2: Permission denied
    mdadm: cannot open /dev/sdl2: Permission denied
    mdadm: cannot open /dev/sdm2: Permission denied
    mdadm: cannot open /dev/sdn2: Permission denied
    mdadm: cannot open /dev/sdo2: Permission denied
    mdadm: cannot open /dev/sdp2: Permission denied
    mdadm: cannot open /dev/sdq2: Permission denied
    mdadm: cannot open /dev/sdr2: Permission denied
    mdadm: cannot open /dev/sds2: Permission denied
    mdadm: cannot open /dev/sdt2: Permission denied
    mdadm: cannot open /dev/sdu2: Permission denied
    mdadm: cannot open /dev/sdv2: Permission denied
    mdadm: cannot open /dev/sdw2: Permission denied
    mdadm: cannot open /dev/sdx2: Permission denied
    mdadm: cannot open /dev/sdy2: Permission denied
    mdadm: cannot open /dev/sdz2: Permission denied
  • Mijzelf
    Mijzelf Posts: 2,607  Guru Member
       8        0 1953514584 sda
       8        1 1953512448 sda1
       8       16 1953514584 sdb
       8       17     514048 sdb1
       8       18 1952997952 sdb2
    For some reason your new disk sda isn't repartitioned. It still contains its factory partition. The old disk, sdb, has 2 partitions: sdb1, a small 514048 kB one containing some firmware stuff, and sdb2, spanning the rest of the disk.
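    If you want to double-check, something like this should list both partition tables from the root telnet shell (assuming the fdisk in this firmware's busybox supports -l):
    fdisk -l /dev/sda
    fdisk -l /dev/sdb
    On sda you should only see the single factory partition; on sdb the small firmware partition plus the big data partition.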

    Is your raid array degraded or down? If it's down, is it possible that you pulled the wrong disk?

    For mdadm you need root rights. Please repeat with
    su
    mdadm --examine /dev/sdb2

  • Mr_C
    Mr_C Posts: 14  Freshman Member
    Mijzelf - you are indeed a hero for getting back to me.

    So, the output for sdb2 is below.  In answer to the other question, I swapped out the disk which was failing (sdb1 presumably) as the volume wouldn't repair properly.  I've tried putting the failed drive back into the unit and trying to repair through the UI but this failed (hence popping in the replacement disk).

    Anyhow, output from mdadm is below (sorry for being dense). Any feedback would be awesome. Someone from Zyxel did get back to me and suggested switching the HDDs around and trying again, but this didn't work - their other suggestion was to replace the unit, but I'm less keen on that one, obviously.

     mdadm --examine /dev/sdb2
    /dev/sdb2:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x2
         Array UUID : 1ae146c1:423305c9:1ee2a382:b4dc443b
               Name : Nelly:0  (local to host Nelly)
      Creation Time : Sun Sep 14 09:15:30 2014
         Raid Level : raid1
       Raid Devices : 2
     Avail Dev Size : 1952996928 (1862.52 GiB 1999.87 GB)
         Array Size : 1952996792 (1862.52 GiB 1999.87 GB)
      Used Dev Size : 1952996792 (1862.52 GiB 1999.87 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
    Recovery Offset : 294400 sectors
              State : clean
        Device UUID : a3a2e13d:5c50a491:7795795c:485d74a7
        Update Time : Mon Oct  7 22:47:26 2019
           Checksum : b6d79d5c - correct
             Events : 34784074

       Device Role : Active device 1
       Array State : AA ('A' == active, '.' == missing)

  • Mijzelf
    Mijzelf Posts: 2,607  Guru Member
    I think this is the faulty disk. The raidmanager writes the array status to the header (which you dumped here), and it says "Array State : AA ('A' == active, '.' == missing)", so according to this header the array still consists of 2 healthy disks. Which means this is the disk that was dropped and no longer updated.
    Your array went down shortly after 'Update Time : Mon Oct  7 22:47:26 2019', correct?
    Can you insert the other 'old' disk and look at its raid header?

    In answer to the other question, I swapped out the disk which was failing (sdb1 presumably)
    No. sdb is the disk; sdb1 and sdb2 are the partitions on that disk. The disk you pulled was sda, which doesn't mean it will be sda again if you plug it back in. The first disk found on boot is sda, the second one sdb. With 2 disks inserted, the detection depends on the slot; with only one disk inside it's always sda, no matter which slot you use.
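    By the way, a quick way to compare the two headers is to filter the interesting fields from the same --examine output (run as root; this is just a convenience, nothing new):
    mdadm --examine /dev/sda2 /dev/sdb2 | grep -E 'dev|Update Time|Events|Device Role|Array State'
    The member with the higher 'Events' count and the newer 'Update Time' is the one the array last trusted.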

  • Mr_C
    Mr_C Posts: 14  Freshman Member
    Hi again - sorry for the delayed response and thanks again for your help.  

    So, I've put what I had thought to be the originally faulty disk back in slot 1 (where it came from) and the replacement disk in slot 2. The result is that I now have a red LED on slot 1, but the NAS is seemingly up.

    The UI is telling me that the volume is degraded, but it doesn't want to accept a repair command from the UI - I tried once, and it won't take the command again despite still showing degraded.

    I've rerun the mdadm commands and get the response below. I think I pulled the correct disk last time, but perhaps I should've put the original slot 2 disk into slot 1 to make it sda, with the replacement drive then becoming sdb (although it's entirely possible I've misunderstood your comments above...).


    Again, any help much appreciated.

     mdadm --examine /dev/sd?2
    /dev/sda2:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x0
         Array UUID : 1ae146c1:423305c9:1ee2a382:b4dc443b
               Name : Nelly:0  (local to host Nelly)
      Creation Time : Sun Sep 14 10:15:30 2014
         Raid Level : raid1
       Raid Devices : 2
     Avail Dev Size : 1952996928 (1862.52 GiB 1999.87 GB)
         Array Size : 1952996792 (1862.52 GiB 1999.87 GB)
      Used Dev Size : 1952996792 (1862.52 GiB 1999.87 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
              State : clean
        Device UUID : 90c7d407:3d4cb530:732a5902:96ebd9b1
        Update Time : Fri Oct 11 21:48:12 2019
           Checksum : d02191d0 - correct
             Events : 34785181

       Device Role : Active device 0
       Array State : AA ('A' == active, '.' == missing)
    /dev/sdb2:
              Magic : a92b4efc
            Version : 1.2
        Feature Map : 0x2
         Array UUID : 1ae146c1:423305c9:1ee2a382:b4dc443b
               Name : Nelly:0  (local to host Nelly)
      Creation Time : Sun Sep 14 10:15:30 2014
         Raid Level : raid1
       Raid Devices : 2
     Avail Dev Size : 1952996928 (1862.52 GiB 1999.87 GB)
         Array Size : 1952996792 (1862.52 GiB 1999.87 GB)
      Used Dev Size : 1952996792 (1862.52 GiB 1999.87 GB)
        Data Offset : 2048 sectors
       Super Offset : 8 sectors
    Recovery Offset : 215808 sectors
              State : clean
        Device UUID : 1e2e7d0c:2a220cdc:46664cf8:4edf3d6a
        Update Time : Fri Oct 11 21:48:12 2019
           Checksum : 2e7b48dc - correct
             Events : 34785181

       Device Role : Active device 1
       Array State : AA ('A' == active, '.' == missing)



    major minor  #blocks  name
       7        0     143360 loop0
       8        0 1953514584 sda
       8        1     514048 sda1
       8        2 1952997952 sda2
       8       16 1953514584 sdb
       8       17     514048 sdb1
       8       18 1952997952 sdb2
      31        0       1024 mtdblock0
      31        1        512 mtdblock1
      31        2        512 mtdblock2
      31        3        512 mtdblock3
      31        4      10240 mtdblock4
      31        5      10240 mtdblock5
      31        6      48896 mtdblock6
      31        7      10240 mtdblock7
      31        8      48896 mtdblock8

    cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1]
    md0 : active raid1 sdb2[2] sda2[0]
          1952996792 blocks super 1.2 [2/1] [U_]
    unused devices: <none>
       9        0 1952996792 md0






  • Mijzelf
    Mijzelf Posts: 2,607  Guru Member
    I suppose the last line of '/proc/mdstat' in your post is actually the last line in '/proc/partitions'?

    The disk is partitioned and assigned to the raid array. But the array isn't synced yet. Nor is it syncing: if it were syncing, /proc/mdstat would show a percentage. Now it only shows that the second member isn't up ([U_]).
    The header of sdb2 shows 'Recovery Offset : 215808 sectors', which I think means that it has synced 215808 sectors. That's about 100MB, which is about 2 seconds of syncing.
    (And now I look back, your 'original' sdb2 also showed a 'Recovery Offset' of 294400 sectors. Which is about the same number. Weird.)
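    (For reference, a sector is 512 bytes, so you can check that number in the shell:
    echo $(( 215808 * 512 / 1024 / 1024 ))
    which prints 105, i.e. roughly 100MB.)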

    Did you somehow interrupt the process after initiating the rebuild? You are aware that the rebuild will take about 40000 seconds, roughly 11 hours?

    Let's manually initiate a new rebuild, to see what happens. To do so we have to remove sdb2 (the partition on the new disk) from the array, zero out the raid header, and add it again. The raidmanager will start a new rebuild.
    mdadm --remove /dev/md0 /dev/sdb2
    mdadm --zero-superblock /dev/sdb2
    mdadm --add /dev/md0 /dev/sdb2
    After this you'll have to leave the box alone, until the sync is ready. 'cat /proc/mdstat' should show a percentage.
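    If you don't want to sit at the console, a simple loop like this should work in the busybox shell to check progress (just a sketch, adjust the interval as you like):
    while true ; do
        cat /proc/mdstat
        sleep 300
    done
    Once the recovery line disappears and md0 shows [UU], the rebuild is finished.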



  • Mr_C
    Mr_C Posts: 14  Freshman Member
    Hello again, so, various faffing about later and:

     mdadm --remove /dev/md0 /dev/sdb2
    mdadm: hot remove failed for /dev/sdb2: Device or resource busy

    I can't seem to get past this.
    I've also tried getting the array to repair again through the UI but it got precisely nowhere before it bombed out unaided.

    Below is the output from just after kicking off the repair - it gave up within a minute or two (instead of running for ~13 hours).

    ~ # cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1]
    md0 : active raid1 sdb2[2] sda2[0]
          1952996792 blocks super 1.2 [2/1] [U_]
          [>....................]  recovery =  0.0% (107840/1952996792) finish=88419.7min speed=368K/sec
    unused devices: <none>
    ~ # cat /proc/mdstat
    Personalities : [linear] [raid0] [raid1]
    md0 : active raid1 sdb2[2] sda2[0]
          1952996792 blocks super 1.2 [2/1] [U_]
    unused devices: <none>

    Any thoughts?
