Important Note: The definitive source for Lustre documentation is the Lustre Operations Manual available at https://wiki.hpdd.intel.com/display/PUB/Documentation. These documents are copied from internal SSEC working documentation that may be useful for some, but we provide no guarantee of accuracy, correctness, or safety. Use at your own risk.
How-To for Replacing Disks for Lustre on ZFS
Note that this procedure is specifically for ZFS with JBOD-attached disks. If you use a RAID card, your procedure is somewhat different.
The failing disk's enclosure and slot should be listed in the warning from your monitoring system (we use nagios/check_mk).
For example:
arch01e37s9
That translates to the Arch01 rack, the enclosure at unit 37, and the disk in slot 9. To identify the correct slot, look at the numbering on the side of the front of the enclosure. Before going upstairs to pull the disk, note the path to the disk recorded in /etc/zfs/vdev_id.conf and save it. We're trying to figure out whether new disks change that path.
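The entry in vdev_id.conf is a simple "alias <name> <device path>" line, so a quick grep shows the current mapping (alias name taken from the example above):
grep arch01e37s9 /etc/zfs/vdev_id.conf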
Pull the correct disk out of the slot and insert the replacement. The drive lights will not come on, so don't wait for the new disk to initialize.
ZFS will note the absence of the disk.
zpool status
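If the server hosts several pools, you can limit the output to those that are not healthy; the failed disk will typically show as UNAVAIL, REMOVED, or FAULTED:
zpool status -x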
We need to make the alias arch01e37s9 match up with the new disk path.
ls -l /dev/disk/by-path/
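A plain lsblk listing can also make the new disk easier to spot, since the disks already in use by ZFS will each have partitions on them and the new one will not:
lsblk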
The new disk will not have any partitions on it (it will show up as /dev/sdaf or a similar name with no partitions hanging off it). Copy that by-path entry and, if it differs from the old one, replace the path for that alias in /etc/zfs/vdev_id.conf. Afterwards, run
udevadm trigger
That re-runs the udev rules so the alias path takes effect. Now you can issue the rebuild command to ZFS.
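If you want to double-check first, the alias should now appear under /dev/disk/by-vdev/ and point at the new device (alias name from the example above):
ls -l /dev/disk/by-vdev/arch01e37s9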
If this disk were in the pool "server1-ost19", for example:
zpool replace server1-ost19 arch01e37s9
If you run a zpool status command, you'll see the disk resilvering into the pool, along with a rate and an estimated time to completion.
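To keep an eye on progress, re-run the status command for just that pool, or wrap it in watch (pool name from the example above):
zpool status server1-ost19
watch -n 60 zpool status server1-ost19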