Today the disks for my new ZFS NAS arrived, rejoice! 😍

Now I ask myself: if one of the drives fails some day, how am I supposed to know which of the physical ones it is? My preliminary plan is to plug them into the disk container one by one, writing down the newly appearing blkids and labeling the corresponding drive. This is somewhat time-consuming, so do you folks have a better idea?

Cheers!

@asciiphil@lemmy.ml

One super-easy way to identify disks on the fly is just to do a cat </dev/sdx >/dev/null and see which disk activity light stays on.
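A time-limited variant of the same trick, as a rough sketch: it reads from the device for a fixed number of seconds so the activity light stays on, then stops on its own. The function name `busy_read` and the 30-second default are made up for illustration, and it assumes the `timeout` utility from coreutils is available.

```sh
# Read from a device for a limited time so its activity LED stays lit.
# $1 = device to read, $2 = seconds to keep reading (default 30).
busy_read() {
    dev="$1"
    secs="${2:-30}"
    timeout "$secs" dd if="$dev" of=/dev/null bs=1M 2>/dev/null
    status=$?
    # 124: timeout stopped dd; 0: dd reached the end of the device.
    [ "$status" -eq 0 ] || [ "$status" -eq 124 ]
}
```

Called as `busy_read /dev/sdx 60`, it only ever reads from the disk, so it is safe to run on a disk that is part of a live pool.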

What I do is figure out which names in /dev/disk/by-path correspond to which disks. The by-path names are stable, even if you replace the disks (as long as the cabling doesn’t change). Then I set up aliases in /etc/zfs/vdev_id.conf to give the disks names that correspond to the external labels on the enclosure.

For example, disk /dev/disk/by-path/pci-0000:06:08.0-sas-0x5842b2b2167fc188-lun-0 might be the disk in slot zero in the array I’ve designated as “array0”. So /etc/zfs/vdev_id.conf would have:

alias  array0-0  pci-0000:06:08.0-sas-0x5842b2b2167fc188-lun-0

Then I create the pool with the /dev/disk/by-vdev names so I can tell immediately what each disk is. (If you’ve already created the pool, you can export it and then use zpool import -d /dev/disk/by-vdev to switch to the vdev names.)
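To work out the aliases, it helps to see which whole-disk by-path names exist and which kernel device each one currently points at. Below is a minimal sketch of that; the function name `list_by_path` is hypothetical, and it assumes partition links carry a `-partN` suffix, as udev normally names them.

```sh
# List whole-disk entries (no "-partN" suffix) in a by-path directory,
# together with the kernel device each symlink currently resolves to.
# $1 = directory to scan (default /dev/disk/by-path).
list_by_path() {
    dir="${1:-/dev/disk/by-path}"
    for link in "$dir"/*; do
        [ -L "$link" ] || continue          # skip non-symlinks and empty globs
        case "$link" in
            *-part[0-9]*) continue ;;       # skip partition links
        esac
        printf '%s\t%s\n' "$(basename "$link")" "$(readlink -f "$link")"
    done
}
```

After adding the aliases to /etc/zfs/vdev_id.conf, running `udevadm trigger` should make the new names show up under /dev/disk/by-vdev. For an existing pool (here called `tank`, a placeholder name), the switch would then be `zpool export tank` followed by `zpool import -d /dev/disk/by-vdev tank`.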

In theory, you can use some other settings in /etc/zfs/vdev_id.conf to get the system to enumerate the disks itself, rather than working out the aliases by hand. In my case, my enclosures don’t have stable numbering that the automatic settings can work with.

@asciiphil@lemmy.ml

A rather more sophisticated way to identify a disk, if it’s in an enclosure that has ID LEDs, is to use sg_ses.

The rough process for that is:

  • Run lsscsi -g to get the generic SCSI device (/dev/sgN) for the enclosure.
  • Run lsscsi -t to get the SAS address for a disk. (Not sure whether this will work if it’s a SATA enclosure; all of mine are SAS.)
  • Run sg_ses -p aes /dev/sgN | less, where /dev/sgN is the enclosure’s generic SCSI device. Look through the output to find the SAS address and, from that, get the index number of the disk.
  • Run sg_ses --set ident --index I /dev/sgN, where I is the disk index number and /dev/sgN is the enclosure’s device. This will turn on the ID LED for the disk.
  • Run sg_ses --clear ident --index I /dev/sgN to turn the LED off.

You can also use fault instead of ident to turn on the “drive fault” LED, in case the enclosure has those but not ID LEDs.
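The first and last steps can be wrapped in two small helpers. This is only a sketch: the function names are invented, and `enclosure_sg` assumes that `lsscsi -g` prints the device type in the second column as `enclosu` and the generic device in the last column, which is worth checking against the real output first.

```sh
# Read `lsscsi -g` output on stdin and print the generic SCSI device
# (last column) of every enclosure found.
enclosure_sg() {
    awk '$2 == "enclosu" { print $NF }'
}

# Hypothetical helper: light the ID LED of slot index $2 on enclosure $1
# for $3 seconds (default 30), then switch it off again.
blink_slot() {
    sg_ses --set ident --index "$2" "$1" || return 1
    sleep "${3:-30}"
    sg_ses --clear ident --index "$2" "$1"
}
```

Usage would be `lsscsi -g | enclosure_sg` to find the enclosure device, then `blink_slot /dev/sgN I` with the index taken from the `sg_ses -p aes` output.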

@nlfx@lemmy.ml

You can get the disk serial with smartctl -i /dev/.... The serial should also be printed on the disk’s label. Keep a mapping of disk ID -> serial.

If the serial is not visible without pulling all the disks out, it’s a good idea to put a sticker with a copy of it on the side of the disk or disk tray, depending on your NAS form factor.

@Aarkon
creator

I thought of something similar, but that again doesn’t save me from having to plug in the disks one by one. Don’t know what I expected though, because you can’t make a hard drive suddenly beep or turn a light on. ^^

@nlfx@lemmy.ml

> I thought of something similar, but that again doesn’t save me from having to plug in the disks one by one.

I just plug all disks in my server, then run the following script to get the mapping GPTID -> partition -> disk serial:

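#!/bin/sh
# Print one line per GPT label: gptid, partition, serial of the parent disk.

# glabel status columns: Name, Status, Components
glabel status | awk '/^gptid/ { print $1, $3 }' | while read -r gptid part; do
        # Strip the partition suffix (e.g. ada0p2 -> ada0) to get the disk device
        disk="/dev/${part%p*}"
        serial="$(smartctl -i "$disk" | awk '/^Serial Number:/ { print $3 }')"
        printf '%s\t%s\t%s\n' "$gptid" "$part" "$serial"
done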

Then, when a disk fails, I just check with zpool status which one is unavailable or completely missing, and see to which serial it corresponds in the previously stored output of the above script.
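The lookup itself can be a one-liner against the saved output. A small sketch, assuming the script's tab-separated output was stored in a file (the name `disk-map.tsv` below is just a placeholder) and that the function name `lookup_serial` is free to use:

```sh
# Look up the serial for a given gptid in a saved mapping file.
# $1 = mapping file with lines "gptid<TAB>partition<TAB>serial"
# $2 = gptid exactly as shown by zpool status
lookup_serial() {
    awk -F'\t' -v id="$2" '$1 == id { print $3 }' "$1"
}
```

For example, `lookup_serial disk-map.tsv gptid/...` with the identifier copied from the zpool status line of the failed disk.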

This script is for FreeBSD and assumes you add disks using their GPTID in your ZFS pool (default on TrueNAS), but it can easily be adapted to Linux with a mix of lsblk --nodeps -o +WWN,SERIAL and the symlinks in /dev/disk/by-id/.
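A possible Linux counterpart, as a sketch rather than a tested port: it asks lsblk for key="value" pairs, which keeps empty SERIAL or WWN columns from shifting the fields, and reformats them into a tab-separated table. The function name `format_disk_map` is invented here, and the parsing assumes none of the values contain a double quote.

```sh
# Read `lsblk --nodeps --noheadings --pairs -o NAME,SERIAL,WWN` output on
# stdin (lines like NAME="sda" SERIAL="..." WWN="...") and print
# "device<TAB>serial<TAB>wwn" for each disk.
format_disk_map() {
    # Splitting on double quotes puts the three values in fields 2, 4 and 6.
    awk -F'"' '{ printf "/dev/%s\t%s\t%s\n", $2, $4, $6 }'
}
```

Run as `lsblk --nodeps --noheadings --pairs -o NAME,SERIAL,WWN | format_disk_map` and keep the output somewhere safe. If the pool was created with /dev/disk/by-id names, those names usually embed the serial or the WWN, so a failed vdev in zpool status can be matched against this table.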

> Don’t know what I expected though, because you can’t make a hard drive suddenly beep or turn a light on. ^^

You can generate random reads to help identify a disk (using badblocks, for instance). If the bad disk is not completely dead, generate random reads on it and try to “feel” which disk is constantly seeking and vibrating. If the disk is completely dead, do the same on all the other disks and feel which one stays inactive.

But writing down the disk ID -> serial mapping is a lot easier and more reliable, provided the serial is printed on the hard drives.

poVoq

You can switch to device ID for the drive identifier. That has several advantages and also makes it easier to identify the actual drive.

@Aarkon
creator

That’s precisely what I’m planning to do. Sadly, the disk IDs are not printed on the outside of the drives. 😉
