Ext3

From OSDev Wiki
Jump to navigation Jump to search

This article is a stub! This page or section is a stub. You can help the wiki by accurately contributing to it.

Filesystems
Virtual Filesystems

VFS

Disk Filesystems
CD/DVD Filesystems
Network Filesystems
Flash Filesystems

The Ext3 filesystem, or third extended filesystem, is an Ext2 filesystem with additional features. It most notably includes support for journaling and H-Tree indexed directories.

Since indexed directories are easier to implement than journaling, this is what we will focus on for now.

Indexed Directories

Normally, Ext2 directories store entries in a linked list, i.e. each directory block consists of a number of directory entries, one following the other in a linear way. Therefore, any directory operation (search, insertion, deletion) requires a linear search until the required entry is found. This results in an increasing cost as the directory grows larger.

H-Tree (or Hashed B-Tree) indexing was developed as an Ext2 extension to address this issue. In this scheme, file names are hashed and the hashes are used as keys to the H-Tree. To maintain backward compatibility with disk drivers that do not understand H-Trees, indexed directories contain linked list entries alongside the indexed entries. This includes the addition of fake entries, which are described below.

Indexed Directory Blocks

Like linked list directories, H-Tree indexed directories consist of disk blocks, with direct and indirect pointers stored in the inode structure (see the ext2 page for details). Unlike linear directories, an H-Tree indexed directory has 2 types of blocks:

  • Index blocks: store pairs of hashes and their associated block numbers.
  • Leaf blocks: store normal directory entries.

The first block (block #0) in the directory is the hash tree root. It begins with the Indexed Directory Root structure, which consists of 2 fake linked list entries, followed by indexed directory root information. After this structure, there are a number of Indexed Directory Entries, which continue until the end of the block, or until all files in the directory have been indexed.

Index blocks (other than the first block) start with a fake entry, followed by Indexed Directory Entries, similar to the first block.

Indexed Directory Root Structure

Offset Field Size Description
0 Fake dot entry 8 Normal linked list entry
8 Fake dot entry name 4 '.' and 3 padding bytes
12 Fake dot-dot entry 8 Normal linked list entry
20 Fake dot-dot entry name 4 '.', '.', and 2 padding bytes
24 Reserved 4 Must be zero
28 Hash version 1 Valid values in the table below
29 Info length 1 Size of index root info (must be 8)
30 Indirect levels 1 Max. tree depth (currently 1)
31 Reserved 1 Unused flags

Indexed Directory Hash Versions

Name Value
DX_HASH_LEGACY 0
DX_HASH_HALF_MD4 1
DX_HASH_TEA 2

Indexed Directory Entries

Indexed Directory entries are present in all index blocks, including the first (or root) block. In the case of the root block, these entries appear after the Indexed Directory Root structure. In index blocks other than the root block, they appear after a single fake directory entry.

Each indexed directory entry contains a hash value and its associated block number, which is calculated relative to the start of the directory (i.e. block 0). The entries are sorted by hash value, from smallest to largest.

The first indexed directory entry is special, it contains the limit (maximum number) and count (actual used number) of directory entries that fit in the block.

Indexed Directory Entry Structure

Offset Field Size Description
0 Hash value 4 32-bit hash of the filename represented by this entry
4 Block number 4 Block number (relative to block 0 or the beginning of the directory) of the block containing the directory entry corresponding to the hashed filename

Indexed Directory Entry Count & Limit Structure

Offset Field Size Description
0 Limit 2 Maximum number of entries per block
2 Count 2 Number of used entries in the block

Lookup Algorithm

To find a directory entry in an H-Tree indexed directory:

  1. Calculate the hash value of the filename (using the hash algorithm specified in the Indexed Directory Root Structure).
  2. Read block 0 (the index root).
  3. Iterate through the Indexed Directory Entries in the block until a suitable hash value is found. This is the hash value that equals the hash value calculated in step 1, or is the largest value that is smaller than the calculated value if an exact match is not found.
  4. Read the block that is referenced by the matched hash entry. Remember that block indices are relative, so you will have to find the actual disk block number from the directory's inode structure (e.g. for relative block #4, you will need to read the block indicated by direct block index #4 in the inode structure).
  5. If the H-Tree's depth is zero, the block will be a leaf block, containing linked list entries. Read the block and iterate through the entries until you find the requested entry, as you would for a normal linked list Ext2 directory.
  6. If the H-Tree's depth is one, the block will be an index block, which contains a fake directory entry, followed by a series of Indexed Directory Entries. Read the block and repeat steps 1-4 until you find your entry.

TODO

  • Insertion algorithm
  • Key collisions
  • Leaf block splitting
  • Journaling

See Also

Resources

External Links

Retrieved from "https://wiki.osdev.org/index.php?title=Ext3&oldid=29777"