I created a software RAID 5 array a year ago using mdadm v3.2.x, which shipped with CentOS 6.3. A few months later I moved the array to Fedora 19 (now Fedora 20) and reassembled it there.
It had three 3 TB disks (Seagate ST3000DM001) and was almost full, so I added two disks and grew the array to four active disks plus one hot spare. Its size is now 8383.55 GiB.
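For reference, growing from three active disks to four active disks plus one hot spare would have been done with commands roughly like the following (a sketch only; the exact invocation may have differed, device names as in the output below):
# mdadm /dev/md127 --add /dev/sde1 /dev/sdf1
# mdadm --grow /dev/md127 --raid-devices=4
The first command adds both new partitions as spares; the reshape then promotes one of them to an active member and leaves the other as the hot spare.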
# mdadm -D /dev/md127
/dev/md127:
Version : 1.2
Creation Time : Fri Jan 11 17:56:18 2013
Raid Level : raid5
Array Size : 8790792192 (8383.55 GiB 9001.77 GB)
Used Dev Size : 2930264064 (2794.52 GiB 3000.59 GB)
Raid Devices : 4
Total Devices : 5
Persistence : Superblock is persistent
Update Time : Tue Mar 25 11:04:15 2014
State : clean
Active Devices : 4
Working Devices : 5
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 512K
Name : RecordBackup01:127 (local to host RecordBackup01)
UUID : dfd3bbe7:4b0231fe:9007bc4a:e106acac
Events : 7264
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
3 8 49 2 active sync /dev/sdd1
5 8 81 3 active sync /dev/sdf1
4 8 65 - spare /dev/sde1
Then I created another array (RAID 6) using mdadm v3.3 (which shipped with Fedora 20) with five 3 TB disks (Toshiba DT01ACA300), but its size is 8383.18 GiB, slightly smaller than 8383.55 GiB.
# mdadm -D /dev/md127
/dev/md127:
Version : 1.2
Creation Time : Fri Mar 21 18:12:00 2014
Raid Level : raid6
Array Size : 8790402048 (8383.18 GiB 9001.37 GB)
Used Dev Size : 2930134016 (2794.39 GiB 3000.46 GB)
Raid Devices : 5
Total Devices : 5
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Tue Mar 25 11:18:51 2014
State : active
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : RecordBackup02:127 (local to host RecordBackup02)
UUID : 923c9658:12739258:506fc8b0:f8c5edf3
Events : 8172
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
2 8 49 2 active sync /dev/sdd1
3 8 65 3 active sync /dev/sde1
4 8 81 4 active sync /dev/sdf1
The partition sizes of all disks in the two arrays are identical (every partition has 5860531087 logical sectors; see the partition information below), so why do the array sizes differ? Is it caused by the different mdadm versions, by the different RAID levels, or by something else?
Array 1 (RAID 5) disks/partitions information
# LANG=en parted /dev/sdb "unit s print all"
Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sdb: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 5860533134s 5860531087s pri
Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sdc: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 5860533134s 5860531087s ext4 primary
Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sdd: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 5860533134s 5860531087s ext4 primary
Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sde: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 5860533134s 5860531087s primary
Model: ATA ST3000DM001-1CH1 (scsi)
Disk /dev/sdf: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 5860533134s 5860531087s primary
Model: Linux Software RAID Array (md)
Disk /dev/md127: 17581584384s
Sector size (logical/physical): 512B/4096B
Partition Table: loop
Disk Flags:
Number Start End Size File system Flags
1 0s 17581584383s 17581584384s xfs
Array 2 (RAID 6) disks/partitions information
# LANG=en parted /dev/sdb "unit s print all"
Model: ATA TOSHIBA DT01ACA3 (scsi)
Disk /dev/sdb: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 5860533134s 5860531087s primary
Model: ATA TOSHIBA DT01ACA3 (scsi)
Disk /dev/sdc: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 5860533134s 5860531087s primary
Model: ATA TOSHIBA DT01ACA3 (scsi)
Disk /dev/sdd: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 5860533134s 5860531087s primary
Model: ATA TOSHIBA DT01ACA3 (scsi)
Disk /dev/sde: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 5860533134s 5860531087s primary
Model: ATA TOSHIBA DT01ACA3 (scsi)
Disk /dev/sdf: 5860533168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 5860533134s 5860531087s primary
Model: Linux Software RAID Array (md)
Disk /dev/md127: 17580804096s
Sector size (logical/physical): 512B/4096B
Partition Table: loop
Disk Flags:
Number Start End Size File system Flags
1 0s 17580804095s 17580804096s xfs
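To narrow this down, the per-member md superblocks could be compared on both hosts with something like the following (a sketch; any member partition will do):
# mdadm --examine /dev/sdb1 | grep -E 'Version|Data Offset|Used Dev Size'
If the Data Offset differs between the two arrays, that alone would explain a different usable size per device, and therefore a different total array size.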
-
You did a good job forcing all the disks/partitions to be the same size. Still, I would not be surprised to find out that RAID 6 has a slightly bigger overhead than RAID 5. – Ouki, Mar 25, 2014 at 9:16
-
@Ouki, you mean the metadata size overhead, or the performance overhead? – LiuYan 刘研, Mar 26, 2014 at 3:19
2 Answers
The obvious difference is:
Intent Bitmap : Internal
Is it possible that the mdadm versions have different defaults for whether or not the intent bitmap is enabled?
As I understand it, the internal intent bitmap uses a portion of the disks to record which regions are about to be written, so the array doesn't need to revalidate every block when it resynchronizes, for example after an unclean shutdown or when a member disk is re-added.
Try explicitly creating your RAID with mdadm --bitmap=none ...
or mdadm --bitmap=internal ...
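For example (a sketch only; substitute your own devices and parameters), the two settings could be compared on scratch devices:
# mdadm --create /dev/md0 --level=6 --raid-devices=5 --bitmap=internal /dev/sd[b-f]1
# mdadm --detail /dev/md0 | grep 'Array Size'
then the same again with --bitmap=none (if your mdadm version accepts it at creation time), to see whether the bitmap setting changes the reported size.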
-
I'm not sure it's the write-intent bitmap that caused the difference; I highly doubt it (because it would not take much space). I asked the same question on the linux-raid mailing list, and Mikael Abrahamsson guessed it's because of a different data offset, which I believe (but there is still no explanation for it): spinics.net/lists/raid/msg46175.html – LiuYan 刘研, Apr 16, 2014 at 5:06
You are asking about size, so yes, size overhead. My best guess is that the extra parity requires some sort of extra index or similar (without knowing much about the internals of RAID 6). We are talking about 370+ MB out of 8.3 TB; that is less than 0.005% of the total space!
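To put numbers on that, using the Array Size values reported by mdadm -D above (a quick check in the shell):
# echo $(( 8790792192 - 8790402048 ))              # 390144 KiB, i.e. 381 MiB
# echo "scale=6; 390144 * 100 / 8790792192" | bc   # about 0.0044 %
which is in line with the rough figures quoted here.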
-
I just did a quick RAID-creation test on loop devices in CentOS 6.5 (mdadm v3.2.x) and Fedora 20 (mdadm v3.3) and found that it's not the size overhead of RAID 6: the RAID 5 and RAID 6 sizes are identical within the same OS, but they differ between CentOS and Fedora. Test script:
MAKEDEV /dev/loop
truncate -s 10M hdd5{1..5} hdd6{1..5}
for hdd in {1..5}; do losetup /dev/loop5$hdd hdd5$hdd; losetup /dev/loop6$hdd hdd6$hdd; done
mdadm -C /dev/md5 -l 5 -n 4 -x 1 /dev/loop5{1..5}
mdadm -C /dev/md6 -l 6 -n 5 /dev/loop6{1..5}
mdadm -D /dev/md5
mdadm -D /dev/md6
– LiuYan 刘研, Mar 26, 2014 at 10:24
-
Doubtful it is about the OS itself. It would be more about some overhead related to the mdadm versions. – Ouki, Mar 26, 2014 at 11:03
-
The extra parity of RAID 6 requires exactly one full drive of space, no more, no less. – Mark, Jun 10, 2015 at 1:26