I had a disk (/dev/sda) show signs of eventual failure in a RAID1 array, so I failed and then removed it from the array.
I then replaced the disk, booted back up, and began replicating the partition table so I could add the new disk to the array; however, something went wrong.
The final command I used was:
sgdisk -R /dev/sdb /dev/sda
Now lsblk shows the correct partitioning for /dev/sdb:
[root@server /]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
nvme0n1 259:1 0 477G 0 disk
├─nvme0n1p3 259:4 0 7.8G 0 part
│ └─md3 9:3 0 7.8G 0 raid1 /tmp
├─nvme0n1p1 259:2 0 511M 0 part /boot/efi
├─nvme0n1p4 259:5 0 7.8G 0 part [SWAP]
└─nvme0n1p2 259:3 0 460.8G 0 part
└─md2 9:2 0 460.8G 0 raid1 /
sdb 8:16 0 3.7T 0 disk
└─sdb1 8:17 0 3.7T 0 part
└─md4 9:4 0 3.7T 0 raid1 /var
nvme1n1 259:0 0 477G 0 disk
├─nvme1n1p4 259:9 0 7.8G 0 part [SWAP]
├─nvme1n1p2 259:7 0 460.8G 0 part
│ └─md2 9:2 0 460.8G 0 raid1 /
├─nvme1n1p3 259:8 0 7.8G 0 part
│ └─md3 9:3 0 7.8G 0 raid1 /tmp
└─nvme1n1p1 259:6 0 511M 0 part
sda 8:0 0 3.7T 0 disk
However, sda does not show the same. Worse, when I run:
sgdisk -p /dev/sdb
it does not show me a partition table, and the same is true for /dev/sda:
[root@server dev]# sgdisk -p /dev/sdb
Disk /dev/sdb: 7814037168 sectors, 3.6 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 34DA93D9-0A46-433D-BDE3-6AF2566E2183
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 7814037134
Partitions will be aligned on 2048-sector boundaries
Total free space is 7814037101 sectors (3.6 TiB)
Number Start (sector) End (sector) Size Code Name
[root@server dev]# sgdisk -p /dev/sda
Disk /dev/sda: 7814037168 sectors, 3.6 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): EBADBC60-3D20-48F7-880B-5CCF1B645A44
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 7814037134
Partitions will be aligned on 2048-sector boundaries
Total free space is 7814037101 sectors (3.6 TiB)
Number Start (sector) End (sector) Size Code Name
[root@server dev]#
When I ran partprobe, it gave me the following error:
[root@server dev]# partprobe
Error: Partition(s) 1 on /dev/sdb have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes.
Now I am not the most experienced Linux administrator (obviously), but I am guessing that somehow, instead of duplicating /dev/sdb to /dev/sda, I actually did the reverse and cleared the partition table of /dev/sdb.
Thankfully I have not rebooted the machine, so the system is live and functioning and I would expect that there would be some way to recover the working partition table?
The big problem is that this is a production server, and its going down or being offline for an extended period would be pretty devastating. So I'm hoping someone can guide me through getting this back to normal.
I'm not sure what else to share here to get help, so feel free to ask me to post results of anything.
Thanks in advance.
1 Answer
If the kernel still knows the correct partition table, you can query partition start offsets and sizes like this:
# partition start offsets
head /sys/block/sdf/sdf*/start
# partition sizes
head /sys/block/sdf/sdf*/size
Sample output:
$ head /sys/block/sdf/sdf*/start
==> /sys/block/sdf/sdf1/start <==
2048
==> /sys/block/sdf/sdf2/start <==
4198400
==> /sys/block/sdf/sdf3/start <==
8394752
==> /sys/block/sdf/sdf4/start <==
64
$ head /sys/block/sdf/sdf*/size
==> /sys/block/sdf/sdf1/size <==
4194304
==> /sys/block/sdf/sdf2/size <==
4194304
==> /sys/block/sdf/sdf3/size <==
52166656
==> /sys/block/sdf/sdf4/size <==
1984
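Both the start and size values above are counted in 512-byte sectors, and the last sector of a partition is start + size − 1 (the End column that parted prints is inclusive). A quick sanity check using the sdf1 numbers from the sample output above:

```shell
# Compute a partition's inclusive end sector from the sysfs values.
# Values are the sdf1 sample from above: start 2048, size 4194304.
start=2048
size=4194304
end=$(( start + size - 1 ))
echo "sdf1: ${start}s to ${end}s"   # -> sdf1: 2048s to 4196351s
```

That 4196351s end sector is exactly what parted reports for partition 1 in the listing below.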
What it looks like in parted:
# parted /dev/sdf unit s print
Model: Patriot Memory (scsi)
Disk /dev/sdf: 60566016s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: pmbr_boot
Number Start End Size File system Name Flags
4 64s 2047s 1984s grub bios_grub
1 2048s 4196351s 4194304s fat32 freedos msftdata
2 4198400s 8392703s 4194304s ext2 boot lvm
3 8394752s 60561407s 52166656s ext2 iso lvm
That way, you can easily re-create partitions at the correct offsets.
If there are any special partition flags (bios_grub, boot, esp, ...), they have to be provided manually; but in your case those seem to be on the SSDs, and the HDD just has a simple data partition, so that's one less thing to worry about.
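As a sketch of that recreation step, the whole parted invocation can be assembled from the recovered offsets and reviewed before anything is written to disk. This is only an illustration: /dev/sdX is a placeholder device name, and the start and size values here are examples matching a 2048-sector start and 7814035087-sector size; substitute the real numbers from your own /sys/block entries.

```shell
# Sketch only: assemble the parted command from recovered offsets and
# print it for review instead of running it directly against the disk.
dev=/dev/sdX          # placeholder -- substitute the real device
start=2048            # from /sys/block/<disk>/<part>/start
size=7814035087       # from /sys/block/<disk>/<part>/size
end=$(( start + size - 1 ))   # the end sector is inclusive
# --align=none stops parted from shifting the requested start sector.
cmd="parted --align=none $dev unit s mklabel gpt mkpart data ${start}s ${end}s"
echo "$cmd"
```

Double-check the computed end sector (7814037134 with these example values) against the disk's last usable sector before actually running the command; mklabel gpt replaces whatever partition table is on the device.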
Since your /dev/sdb only has a single partition, /dev/sdb1, chances are it starts at 1 MiB and extends to the full size of the disk. So re-creating that one partition couldn't be any simpler. Still, it's good to double-check.
Alternatively, you could use testdisk to deduce the partition table from the raw data.
- Thanks. Yes, /dev/sdb (and sda) are just one single partition (/dev/sdb1). Start is 2048 and size is 7814035087. Would you mind giving further instructions on recreating the partition table? – Luke Pittman, Aug 17, 2020 at 8:20
- Oh, and how will the array be affected by this? – Luke Pittman, Aug 17, 2020 at 8:23
- @LukePittman in that case try parted --align=none /dev/sdb, then unit s, print free, mklabel gpt, mkpart md4-b 2048s 7814037134s, print free, quit. If the partition is created correctly, the array won't be affected at all. Sorry, I can't advise regarding sgdisk, as I'm not too familiar with it. – frostschutz, Aug 17, 2020 at 10:14
- Thank you! Unfortunately I don't have quite enough confidence in how this will turn out after a reboot, so I've decided to deploy another server and migrate the data. Once that's done I shall try this so I can learn, though. Cheers. – Luke Pittman, Aug 17, 2020 at 18:38