If I have the following contents in a directory:
- empty_dir: Empty directory.
- empty_file: Empty file.
- one_char: File consisting of one character.
- several_blocks: File consisting of several blocks (but not "too" large or "sparse").
Then, ls will display the following †:
$ ls -Gghs
total 152K
8,0K drwxr-xr-x 2 4,0K dec 21 23:34 empty_dir
4,0K -rw-r--r-- 1 0 dec 21 23:21 empty_file
8,0K -rw-r--r-- 1 1 dec 21 23:22 one_char
132K -rw-r--r-- 1 127K dec 22 00:14 several_blocks
Secondly, stat displays the following:
$ stat empty_dir/
File: empty_dir/
Size: 4096 Blocks: 16 IO Block: 4096 directory
...
$ stat empty_file
File: empty_file
Size: 0 Blocks: 8 IO Block: 4096 regular empty file
...
$ stat one_char
File: one_char
Size: 1 Blocks: 16 IO Block: 4096 regular file
...
$ stat several_blocks
File: several_blocks
Size: 129760 Blocks: 264 IO Block: 4096 regular file
...
Thirdly, du displays the following:
$ du -h empty_dir/
8,0K empty_dir/
$ du -h empty_file
4,0K empty_file
$ du -h one_char
8,0K one_char
$ du -h several_blocks
132K several_blocks
Lastly:
$ tune2fs /dev/nvme0n1p2 -l
...
Block size: 4096
...
Inode size: 256
...
The size of the blocks reported by stat is 512 B, which means that the output of stat, ls, and du is consistent:
- empty_dir: 16 * 512 B = 8192 B = 4096 B + 4096 B = 8 KiB.
- empty_file: 8 * 512 B = 4096 B = 0 B + 4096 B = 4 KiB.
- one_char: 16 * 512 B = 8192 B = 4096 B + 4096 B = 8 KiB.
- several_blocks: 264 * 512 B = 135168 B = 129760 + 5408 = 129760 + 1312 + 4096 = 131072 + 4096 = 32 * 4096 + 4096 B = 132 KiB.
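The arithmetic can be rechecked with a short script. This is only a sketch: the sizes and block counts are copied from the stat output above, and the helper names (allocated_bytes, expected_bytes) are made up for illustration. It assumes each entry needs its size rounded up to whole 4096 B filesystem blocks and prints whatever is allocated beyond that.

```python
# Values copied from the stat output above: name -> (Size in bytes, Blocks).
STAT_OUTPUT = {
    "empty_dir": (4096, 16),
    "empty_file": (0, 8),
    "one_char": (1, 16),
    "several_blocks": (129760, 264),
}

STAT_BLOCK = 512  # unit of the "Blocks" field reported by stat
FS_BLOCK = 4096   # "Block size" reported by tune2fs


def allocated_bytes(blocks):
    """Convert a stat block count to bytes."""
    return blocks * STAT_BLOCK


def expected_bytes(size):
    """Bytes needed for `size` bytes of data, rounded up to whole filesystem blocks."""
    return -(-size // FS_BLOCK) * FS_BLOCK


for name, (size, blocks) in STAT_OUTPUT.items():
    allocated = allocated_bytes(blocks)
    extra = allocated - expected_bytes(size)
    print(f"{name}: {allocated // 1024} KiB allocated, {extra} B beyond the data blocks")
```

For all four entries the surplus comes out as exactly one 4096 B block, which is the pattern the questions below are about.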
Questions
- Why is the allocated size for empty_dir and one_char two blocks (of size 4096 B) and not one?
- Why is the allocated size for empty_file one block and not zero?
- Why is the allocated size for several_blocks (and larger files in general) more than one block larger than the apparent size ((264 * 512) - 129760 = 5408 > 4096)?
I suspect the additional block is the one containing the inode, like this questioner asks (but goes unanswered). Similarly, this questioner has observed the double size, but it is incorrectly formulated in the question and receives an answer to the other part of the question. However, this answer to a different question suggests that there should be no additional blocks (which was my intuition).
- Are our systems incorrectly configured?
- Assuming the block containing the inode is counted: When using du on multiple files, does it compensate for counting the inode block several times, should multiple inodes be in the same block (since one block can contain 16 inodes (4096 / 256 = 16))?
Appendix
@WumpusQ.Wumbley speculated that it could be extended attributes and this turned out to be the case! getfattr returns user.com.dropbox.attributes. Turns out the testing directory was a subdirectory deep down in a directory that was symbolically linked into my Dropbox folder. See the accepted answer below.
† This uses GNU Core Utilities 8.30 on GNU/Linux with kernel 4.19.1 (Manjaro) on ext4 on an NVMe SSD.
2 Answers
@WumpusQ.Wumbley pointed out the cause in a comment: extended attributes.
For completeness' sake, the answers are presented below.
Extended attributes, in this case applied by Dropbox (getfattr returns user.com.dropbox.attributes), use additional blocks for storage. Without these extended attributes, ls (and the other commands) returns:
$ ls -Gghs
total 136K
4,0K drwxr-xr-x 2 4,0K dec 22 20:11 empty_dir
0 -rw-r--r-- 1 0 dec 22 20:11 empty_file
4,0K -rw-r--r-- 1 1 dec 22 20:12 one_char
128K -rw-r--r-- 1 127K dec 22 20:13 several_blocks
As expected.
In addition, stat for the only interesting case of several_blocks returns:
$ stat several_blocks
File: several_blocks
Size: 129760 Blocks: 256 IO Block: 4096 regular file
...
Which is also as expected, since 256 * 512 - 129760 = 1312 < 4096, i.e., no extra block used.
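To see whether extended attributes are present on a given file, something along these lines could be used instead of getfattr. It is a minimal sketch, assuming Linux (os.listxattr and os.getxattr are not available elsewhere); the function names xattr_sizes and report are hypothetical.

```python
import os


def xattr_sizes(path):
    """Return {attribute name: value length in bytes} for `path`.

    Relies on os.listxattr/os.getxattr, which exist only on Linux.
    If the filesystem does not support extended attributes, or an
    attribute cannot be read, the entries collected so far are returned.
    """
    sizes = {}
    try:
        for name in os.listxattr(path, follow_symlinks=False):
            sizes[name] = len(os.getxattr(path, name, follow_symlinks=False))
    except OSError:
        pass
    return sizes


def report(path):
    """Print apparent size, allocated size, and extended attributes of `path`."""
    st = os.lstat(path)
    print(f"{path}: size={st.st_size} B, "
          f"allocated={st.st_blocks * 512} B, xattrs={xattr_sizes(path)}")
```

Calling report("several_blocks") inside and outside the Dropbox-linked directory should make the difference visible, provided the attribute really is stored outside the inode on that filesystem.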
- Due to extended attributes.
- Due to extended attributes.
- Due to extended attributes.
- No, but be aware of extended attributes added by applications.
- Incorrect assumption.
The "additional blocks" are not due to some inconsistency in configuration. (Hypothetically, it could always be wrong for some other reason though. Like cosmic rays that corrupted your kernel code :-)).
I say this because there is no option to manually tweak details of the calculation of the disk usage for these commands. The commands only convert the disk usage to different units, by multiplying or dividing. The disk usage is obtained by calling the stat() system call. The kernel returns a number of synthetic "blocks", which are always 512 bytes. Nor is there any kernel option that affects how stat() calculates the number of blocks.
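The conversion described here can be sketched in a few lines of Python. The function names are made up for illustration, and the rounding is an assumption about how a du-style tool would present the figure, not a copy of the coreutils code.

```python
import os

STAT_BLOCK = 512  # st_blocks is counted in 512-byte units on Linux


def disk_usage_bytes(path):
    """Allocated size of `path` in bytes, derived from the stat block count."""
    return os.lstat(path).st_blocks * STAT_BLOCK


def disk_usage_kib(path):
    """The same figure in KiB, rounded up, as a du-style tool might show it."""
    return -(-disk_usage_bytes(path) // 1024)
```

Nothing here consults the filesystem block size or the inode size: the only input is the block count the kernel reports, and everything else is multiplication and division.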
I can tell you the block which contains the inode is not supposed to be counted on your ext4 filesystem. In general, Giles says it is not counted on any filesystem that he is aware of. Perhaps in part due to the point you raise :-). Inodes tend to be smaller than the 512-byte blocks reported by stat. ext4 defaults to 256-byte inodes; ext3 defaulted to 128 bytes.
If we look through the related questions (right sidebar), we notice one case where there can be additional blocks. The extent tree (or indirect blocks, if extents are disabled) is counted on ext4. (Why is the difference in file size and it's size on disk bigger than 4 KiB?)
A second answer to the linked question suggests another case. Some uses of fallocate() might allow creating files with an arbitrarily large difference between their size, and the number of blocks allocated to them.
That said, I suspect the above is not sufficient to explain any of your examples.
- The fallocate hypothesis is interesting, but it doesn't appear to work on directories. – user41515, Dec 22, 2018 at 16:43