mirror of
https://github.com/brain-hackers/linux-brain.git
synced 2024-06-09 23:36:23 +09:00
docs: ext4.rst: document case-insensitive directories
Introduces the case-insensitive features on ext4 for system administrators.
Explains the minimum of design decisions that are important for sysadmins
wanting to enable this feature.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
parent b886ee3e77
commit 0a790fe438
@@ -91,10 +91,48 @@ Currently Available
* large block (up to pagesize) support
* efficient new ordered mode in JBD2 and ext4 (avoid using buffer head to force
  the ordering)
* Case-insensitive file name lookups

[1] Filesystems with a block size of 1k may see a limit imposed by the
    directory hash tree having a maximum depth of two.

Case-insensitive file name lookups
======================================================

The case-insensitive file name lookup feature is supported on a
per-directory basis, allowing the user to mix case-insensitive and
case-sensitive directories in the same filesystem. It is enabled by
flipping the +F inode attribute of an empty directory. The
case-insensitive string match operation is only defined when we know how
text is encoded in a byte sequence. For that reason, in order to enable
case-insensitive directories, the filesystem must have the
casefold feature, which stores the filesystem-wide encoding
model used. By default, the charset adopted is the latest version of
Unicode (12.1.0, at the time of this writing), encoded in the UTF-8
form. The comparison algorithm is implemented by normalizing the
strings to the Canonical decomposition form, as defined by Unicode,
followed by a byte-by-byte comparison.
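
As a rough user-space illustration of the comparison described above (not the
kernel's code), the following Python sketch normalizes two names to canonical
decomposition (NFD), applies Python's ``str.casefold()``, and compares the
resulting bytes. The helper names ``fold_key`` and ``names_match`` are
hypothetical, and Python's ``unicodedata`` tables may track a different Unicode
version than the one recorded in the filesystem, so edge cases can differ from
ext4.

```python
import unicodedata


def fold_key(name: bytes) -> bytes:
    """Return an approximate comparison key for a UTF-8 file name.

    Hypothetical helper: decompose (NFD), case-fold, then decompose again
    because case folding can produce characters that decompose further.
    """
    text = name.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    folded = unicodedata.normalize("NFD", text).casefold()
    return unicodedata.normalize("NFD", folded).encode("utf-8")


def names_match(a: bytes, b: bytes) -> bool:
    """Byte-by-byte comparison of the two normalized, case-folded keys."""
    return fold_key(a) == fold_key(b)


if __name__ == "__main__":
    precomposed = "Caf\u00e9".encode("utf-8")   # e-acute as one code point
    decomposed = "CAFE\u0301".encode("utf-8")   # E plus combining acute
    print(names_match(precomposed, decomposed))
    print(names_match(b"README", b"readme"))
    print(names_match(b"notes", b"note"))
```

The two spellings of the accented name differ as raw bytes, which is why a
plain byte comparison is not enough and normalization has to come first.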

The case-awareness is name-preserving on the disk, meaning that the file
name provided by userspace is a byte-per-byte match to what is actually
written on the disk. The Unicode normalization format used by the
kernel is thus an internal representation, exposed neither to
userspace nor to the disk, with the important exception of disk hashes,
used on large case-insensitive directories with the DX feature. On DX
directories, the hash must be calculated using the casefolded version of
the filename, meaning that the normalization format used actually has an
impact on where the directory entry is stored.

When we change from viewing filenames as opaque byte sequences to seeing
them as encoded strings, we need to address what happens when a program
tries to create a file with an invalid name. The Unicode subsystem
within the kernel leaves the decision of what to do in this case to the
filesystem, which selects its preferred behavior by enabling or disabling
the strict mode. When ext4 encounters one of those strings and the
filesystem does not require strict mode, it falls back to considering the
entire string as an opaque byte sequence, which still allows the user to
operate on that file, but case-insensitive lookups won't work.
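
The fallback for invalid names can be sketched as follows. This is a minimal
sketch, not the kernel's logic: ``lookup_key`` and ``InvalidNameError`` are
hypothetical names, "invalid" is approximated here as "not decodable as
UTF-8", and the sketch assumes that strict mode rejects such a name outright,
which the text above implies but does not spell out.

```python
import unicodedata


class InvalidNameError(ValueError):
    """Hypothetical error for names rejected under strict mode."""


def lookup_key(name: bytes, strict: bool) -> bytes:
    """Return the key used to match a name in a case-insensitive directory.

    Valid UTF-8 names get a normalized, case-folded key. Invalid names are
    rejected in strict mode (an assumption of this sketch); otherwise they
    are treated as opaque bytes, so only an exact byte match will find them.
    """
    try:
        text = name.decode("utf-8")
    except UnicodeDecodeError:
        if strict:
            raise InvalidNameError(f"invalid UTF-8 file name: {name!r}")
        return name  # opaque fallback: no case folding is applied
    folded = unicodedata.normalize("NFD", text).casefold()
    return unicodedata.normalize("NFD", folded).encode("utf-8")


if __name__ == "__main__":
    bad_upper = b"\xffREPORT"
    bad_lower = b"\xffreport"
    # Valid names match regardless of case.
    print(lookup_key(b"REPORT", strict=False) == lookup_key(b"report", strict=False))
    # Invalid names are still usable, but only by their exact bytes.
    print(lookup_key(bad_upper, strict=False) == lookup_key(bad_lower, strict=False))
```

Returning the raw bytes as the key keeps the file reachable by its exact
name, matching the behavior described above where the user can still operate
on the file even though case-insensitive lookups no longer apply to it.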

Options
=======
