Disks, Partitions, & Filesystems

These notes look at managing disks, partitions, and filesystems. By and large, the context here emphasizes the prevalent DOS disk type. GNU/Linux also supports GPT (GUID Partition Table) and other disk types, and these notes provide a few pointers for these types. There is also some treatment of disks as simple block devices.

Quick Guide to Utilities

This is a quick guide to the utilities for managing disks, partitions, and filesystems.

GUI Utilities

The GUIs GParted and GNOME Disk Utility (Palimpsest) offer much of the same functionality as the command-line utilities listed below.

Packages: gparted, gnome-disk-utility.

Command-line Utilities

Disks: ATA parameters.
To get or set hardware parameters for ATA disk drives (PATA/IDE/SATA), use hdparm (package: hdparm). (Neither GParted nor Disk Utility addresses ATA hardware parameters.)
Disks: bad blocks.
To examine a disk for bad blocks, use badblocks. (See also: Bad Blocks HowTo.) (Neither GParted nor Disk Utility considers bad blocks.)
Disks: create MBR (DOS disk label)
To create an MBR (or DOS disk label) on a disk, use fdisk or parted. To restore an MBR suitable for Windows, use ms-sys.
Disks: create GPT or other disk label
To create a GPT disk label, use parted.
Partitions: create, edit, delete
To manage disk partitions, use fdisk, sfdisk, or parted (overlapping functionality). For GPT partitions, use gdisk.
Partitions: flags
To set partition flags, such as bootable (active), use fdisk, sfdisk, or parted.
Partitions: label
To display or change the label of an existing partition, use e2label, dosfslabel, ntfslabel.
Partitions: recovery
To recover lost partition information, use gpart, ddrescue, testdisk.
Partitions: resizing
To resize the filesystem on an existing partition, use resize2fs or ntfsresize.
Filesystems: creation
To make a filesystem on a partition, use mke2fs, mkdosfs, mkntfs.
Filesystems: integrity
To check and repair a filesystem, use e2fsck, dosfsck, ntfsck.
Filesystems: details
To view the parameters of an existing ext2/3/4 filesystem, use dumpe2fs.
Filesystems: parameters
To modify the parameters of an existing ext2/3/4 filesystem, use tune2fs.
Fragmentation and free space
To report information on free space and file fragmentation in an ext2/3/4 partition, use e2freefrag, filefrag.

Packages: e2fsprogs, dosfstools, ntfsprogs; ddrescue, testdisk.

Host Protected Area

The man page for hdparm has a clear and concise explanation under option -N ("Get/set max visible number of sectors").

-> hdparm -N  /dev/sda

/dev/sda:
 max sectors   = 78125000/78125000, HPA is disabled
-> hdparm -I /dev/sdb
...
Commands/features:
        Enabled Supported:
           *    SMART feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    DOWNLOAD_MICROCODE
                SET_MAX security extension
...
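
As a rough illustration of reading the hdparm -N report above, the following sketch splits a "max sectors" line into its visible and native sector counts and prints the difference, which is the size of any Host Protected Area in sectors. The helper name hpa_hidden is hypothetical, and the second sample line is made up for contrast; only the first line comes from the transcript. The sketch parses text and never touches a disk.

```shell
# hpa_hidden LINE: print native minus visible sectors from an
# "hdparm -N" line of the form "max sectors = VISIBLE/NATIVE, ...".
# (Hypothetical helper for illustration only.)
hpa_hidden() {
    pair=${1#*= }        # drop everything up to and including "= "
    pair=${pair%%,*}     # drop the trailing ", HPA is ..." remark
    visible=${pair%/*}   # count before the slash
    native=${pair#*/}    # count after the slash
    echo $(( native - visible ))
}

# Line taken from the transcript above: nothing is hidden.
hpa_hidden ' max sectors   = 78125000/78125000, HPA is disabled'

# Made-up line for a disk with some sectors set aside.
hpa_hidden ' max sectors   = 78000000/78125000, HPA is enabled'
```

The first call prints 0 and the second prints 125000.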

See also: Wikipedia

Advanced Format

General

Partitions

Miscellaneous Hardware Notes

WD Green Drives

To identify a disk's path, UUID, or label

To find the UUID, label, or type of a partition, use blkid:

-> /sbin/blkid /dev/sda1
/dev/sda1: UUID="E274F34374F3194D" LABEL="winxp" TYPE="ntfs" 

Use findfs (as root) to identify a filesystem's path by label or UUID.

-> findfs LABEL=backup
/dev/sdb2
-> findfs UUID=0e6c0610-e4ce-4c4a-9135-754eb401a5d5
/dev/sdb2

(The label is case-sensitive.) Alternatively, search directories under /dev/disk:

-> ls /dev/disk/
by-id/  by-label/  by-path/  by-uuid/
-> ls -l /dev/disk/by-label/backup
lrwxrwxrwx. [...] /dev/disk/by-label/backup -> ../../sdb2
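
To see every name in one of these directories at once, a small loop over the symbolic links can help. This is only a sketch: the function name show_links is hypothetical, and it assumes a readlink that supports -f (as GNU coreutils does).

```shell
# show_links DIR: print "NAME -> TARGET" for each symbolic link in DIR.
# (Hypothetical helper; assumes readlink supports -f.)
show_links() {
    for link in "$1"/*; do
        [ -L "$link" ] || continue        # skip anything that is not a symlink
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}

# Typical use on a system where /dev/disk/by-label exists:
if [ -d /dev/disk/by-label ]; then
    show_links /dev/disk/by-label
fi
```

The same call works for by-uuid, by-id, and by-path.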

Or, use GParted and look through devices.

See also: udisks(7), udisks(1).

Micro HowTo

To look up a disk drive's sector and size information:

-> hdparm -I /dev/sda | grep -E "LBA|Sector|device size"
	LBA    user addressable sectors:   78125000
	Logical/Physical Sector size:           512 bytes
	device size with M = 1024*1024:       38146 MBytes
	device size with M = 1000*1000:       40000 MBytes (40 GB)
	LBA, IORDY(can be disabled)
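
The two "device size" figures follow directly from the sector count and sector size. The sketch below redoes that arithmetic with the numbers from the output above; it assumes a shell with 64-bit integer arithmetic, which is the norm on current systems.

```shell
sectors=78125000     # "LBA user addressable sectors" from the output above
sector_bytes=512     # logical sector size from the output above

bytes=$(( sectors * sector_bytes ))
mb=$(( bytes / 1000000 ))     # M = 1000*1000
mib=$(( bytes / 1048576 ))    # M = 1024*1024, rounded down

echo "bytes:                $bytes"
echo "MBytes (M=1000*1000): $mb"
echo "MBytes (M=1024*1024): $mib"
```

This reports 40000000000 bytes, 40000 MBytes, and 38146 MBytes, matching hdparm.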

To make a partition active or bootable, use parted or sfdisk from the command line:

-> parted /dev/sdb set 2 boot on
-> sfdisk -A2 /dev/sde

For GUI alternatives, use GParted or Palimpsest Disk Utility.

Here's how to copy the Master Boot Record into a file, perhaps for backup. As user root:

-> dd if=/dev/hda of=hda.mbr bs=512 count=1
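
The backup is only useful if it can be written back, so here is a minimal sketch of the round trip. To keep it harmless it works on a scratch image file rather than a real disk; the file names and the "placeholder boot code" text are invented for the illustration. Keep in mind that restoring all 512 bytes also restores the partition table as it was at backup time.

```shell
# Scratch image standing in for a disk (8 sectors of 512 bytes).
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=8 2>/dev/null
printf 'placeholder boot code' | dd of="$img" conv=notrunc 2>/dev/null

# Back up the first sector, as in the command above.
dd if="$img" of="$img.mbr" bs=512 count=1 2>/dev/null

# Simulate damage by zeroing the sector, then restore it from the backup.
# conv=notrunc keeps dd from truncating the image file.
dd if=/dev/zero  of="$img" bs=512 count=1 conv=notrunc 2>/dev/null
dd if="$img.mbr" of="$img" bs=512 count=1 conv=notrunc 2>/dev/null

# Compare the restored sector with the backup.
if dd if="$img" bs=512 count=1 2>/dev/null | cmp -s - "$img.mbr"; then
    echo "first sector restored"
fi
```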

UUIDs for Disk Partitions

A UUID, or Universally Unique Identifier, is a hexadecimal number that uniquely identifies a single disk partition. It has 32 hex digits encoding 16 bytes (128 bits). Here's an example:

059f957a-c7f6-4d63-9cf2-5a6d4cefded9
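
As a quick check of the digit count, this sketch strips the hyphens from the example above and derives the size in bits and bytes (4 bits per hex digit). The variable names are arbitrary.

```shell
uuid=059f957a-c7f6-4d63-9cf2-5a6d4cefded9   # example from the text above

hex=$(printf '%s' "$uuid" | tr -d '-')      # strip the four hyphens
digits=${#hex}
bits=$(( digits * 4 ))
nbytes=$(( digits / 2 ))

echo "hex digits: $digits"
echo "bits:       $bits"
echo "bytes:      $nbytes"
```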

The libuuid library generates and parses UUIDs. Commands like uuid, blkid, dumpe2fs, fsck, mke2fs, mkntfs, and tune2fs use libuuid to handle UUIDs as needed in managing disk partitions. For example:

-> ldd `which mkntfs` | grep uuid
libuuid.so.1 => /lib/libuuid.so.1 (0x00788000)

A UUID identifies a single partition on a storage device but not the device as a whole. Two separate partitions on the same disk drive have different UUIDs:

-> blkid /dev/sdc1  /dev/sdc2
/dev/sdc1: UUID="1c1048ab-308f-481b-b236-8883264f5a56" TYPE="ext4" 
/dev/sdc2: UUID="059f957a-c7f6-4d63-9cf2-5a6d4cefded9" TYPE="ext2" 

A partition's UUID is assigned when a filesystem is made on the partition, and it is unique to that instance of the filesystem.

-> mke2fs -q /dev/sdc1
-> blkid /dev/sdc1
/dev/sdc1: UUID="53f48509-90f5-45bf-9b70-2797e69edd70" TYPE="ext2" 
-> mkntfs -qfI /dev/sdc1
-> blkid /dev/sdc1
/dev/sdc1: LABEL="" UUID="3F5F39343A765FE2" TYPE="ntfs"
-> mke2fs -q /dev/sdc1
-> blkid /dev/sdc1
/dev/sdc1: UUID="ebae56c4-9395-465c-ad99-0df72ab6831c" TYPE="ext2" 
-> uuidgen 
e0cbb0f8-ddd2-461c-9313-07dd05addc52
-> mke2fs -q -U e0cbb0f8-ddd2-461c-9313-07dd05addc52 /dev/sdc1
-> blkid /dev/sdc1
/dev/sdc1: UUID="e0cbb0f8-ddd2-461c-9313-07dd05addc52" TYPE="ext2" 
-> tune2fs -U `uuidgen` /dev/sdc1
tune2fs 1.41.9 (22-Aug-2009)
-> blkid /dev/sdc1
/dev/sdc1: UUID="0198cb76-23fb-4211-900d-e10f8b1ec392" TYPE="ext2" 

(Note the different type of UUID for the NTFS partition, above.)
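
The two kinds of identifier in the transcript can be told apart by shape alone. The sketch below is a rough classifier under that assumption: it checks only length and hyphen positions, not that every character is a hex digit, and the function name uuid_kind is hypothetical. The 16-digit form that blkid reports for NTFS is the volume's 64-bit serial number rather than a 128-bit UUID.

```shell
# uuid_kind ID: classify an identifier as reported by blkid, by shape only.
# (Hypothetical helper; it does not validate the individual characters.)
uuid_kind() {
    case $1 in
        ????????-????-????-????-????????????) echo "128-bit UUID" ;;
        ????????????????)                     echo "64-bit serial" ;;
        *)                                    echo "unrecognized" ;;
    esac
}

uuid_kind 0198cb76-23fb-4211-900d-e10f8b1ec392   # ext2 value from above
uuid_kind 3F5F39343A765FE2                       # NTFS value from above
```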

UUIDs have utility beyond disk partitions. The description in the libuuid package indicates their general utility:

A UUID is an identifier that is unique across both space and time, with respect to the space of all UUIDs. A UUID can be used for multiple purposes, from tagging objects with an extremely short lifetime, to reliably identifying very persistent objects across a network. [From libuuid-2.16.2-7.fc12.i686.rpm.]

Wikipedia's UUID article gives details and additional information.

Package util-linux-ng provides the uuidgen command. Library libuuid has its own eponymous package.

Excursion on Disk Label

The presence of a partition table transforms a disk from a simple block device to a bona fide substrate for the directories and files of everyday computing. A partition table divides the disk into distinct subsets of contiguous sectors; each subset constitutes a partition. Each partition can then house a separate filesystem, such as an instance of an ext4 filesystem. Each filesystem in turn gives life to directories and files, which are safely insulated from the goings-on of other partitions.

There are several partitioning schemes available. In the Linux world, the main types are DOS Partition Tables (DPTs) and GUID Partition Tables (GPTs). Schemes differ fundamentally in how they define their tables: which sectors hold the table, what fields are specified, and how many bytes compose a field. These differences among formats ultimately emerge as differences in properties. Limits on the number and size of partitions stand out. Notions of simplicity, robustness, and extensibility may matter as well. Since a disk has only one partition table, the format of the table imparts a format to the disk as a whole. For example, a disk with a DPT may be referred to as an MSDOS disk, while a disk with a GPT may be called a GPT disk. "Formatting" a disk means creating a partition table on it, an act that abandons and perhaps overwrites existing data on the disk. A partition table also includes some sort of signature that identifies its format. Disk utilities must know what sectors and offsets to examine for signatures. If the utility supports the disk's format, it can then further interpret and process the data on the disk according to its abilities.

For Linux systems running on x86 computers, the DOS Partition Table ("DPT" herein only) dominates. This is the familiar partitioning scheme that supports up to four primary partitions or up to three primary partitions plus one extended partition (of multiple logical partitions). A DPT defines these in four descriptors of 16 bytes each. The first sector of the disk holds the partition table as well as a two-byte signature, the famous 0xaa55. It may also include an initial boot loader. The whole sector is called the MBR (Master Boot Record), and thus DOS partitioning is sometimes referred to as MBR partitioning. If the table lacks an extended partition, then the MBR contains the entire table. Otherwise, a chain of extended boot records (EBR) specifies the logical partitions. Each EBR uses a single sector within the extended partition, but there is no standard for the location of these sectors therein. Disk utilities must traverse the chain to identify the logical partitions.
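
A little arithmetic shows how the four descriptors and the signature fit into the 512-byte sector. The sketch assumes the table starts at offset 0x1BE (446), the figure given in the MBR excursion later in these notes; the variable names are arbitrary.

```shell
table_start=446    # 0x1BE, start of the partition table within the MBR
entry_bytes=16     # size of one partition descriptor

i=0
while [ "$i" -lt 4 ]; do
    printf 'descriptor %d: offset 0x%X\n' $(( i + 1 )) $(( table_start + entry_bytes * i ))
    i=$(( i + 1 ))
done

sig_offset=$(( table_start + 4 * entry_bytes ))
sector_end=$(( sig_offset + 2 ))
printf 'signature:    offset 0x%X\n' "$sig_offset"
printf 'end of MBR:   byte %d\n' "$sector_end"
```

The descriptors land at 0x1BE, 0x1CE, 0x1DE, and 0x1EE, the signature at 0x1FE, and the total comes to exactly 512 bytes.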

The GUID Partition Table (GPT) scheme looks to be the up-and-coming disk format poised to end the yet vibrant reign of the DPT in Linux systems. The GPT scheme mandates support for at least 128 partitions, permits many more, and drops the notions of primary, extended, and logical partitions. Each partition has its own UUID and a standardized type indicating its function, such as "Linux/Windows data," "Linux swap," and "Linux LVM" (cf. sgdisk -L). For robustness, the main GPT table residing in the initial disk sectors is replicated in the final sectors. A GPT header embeds a checksum on its table as well. The GPT newcomer of course benefits from decades of hindsight unavailable to the DPT pioneer, and it reflects this experience internally with a clean and extensible format. Finally, the GPT format boasts the subtle but significant feature of standardization.

Disk and partition capacities under GPT and DPT

Comparisons of GPT with DPT usually point out that GPT accommodates much larger disk capacity than does DPT: 8 ZiB ($2^{73}$ B) for GPT versus 2 TiB ($2^{41}$ B) for DPT, or 9.4 ZB versus 2.2 TB. That's a huge increase, a factor of $2^{32}$ in favor of GPT. These numbers need a bit of attention because they stem from counting sectors and assuming the current norm of 512 bytes per sector. Advanced Format disks with sector sizes of 4096 bytes are beginning to appear, however, so sector counts are the sharper measure. Also, these figures overlook differences in the ways GPT and DPT specify partitions. So here follow some details.

A GPT allots 8 bytes (64 bits) for enumerating an LBA, and it specifies a partition by the two bounding LBAs. GPT thus supports $2^{64}$ sectors. Hence the 8 ZiB figure:

$2^{64}\ \text{sectors} \times 2^{9}\ \text{bytes/sector} = 2^{3} \times 2^{70}\ \text{B} = 8\ \text{ZiB}$

A DPT allots 4 bytes (32 bits) for an LBA and can thus enumerate up to $2^{32}$ sectors. Hence the 2 TiB figure:

$2^{32}\ \text{sectors} \times 2^{9}\ \text{bytes/sector} = 2 \times 2^{40}\ \text{B} = 2\ \text{TiB}$

But DPT specifies a partition by the first sector and subsequent number of sectors. It allots 4 bytes for the latter, too, so the theoretical capacity for DPT is actually $2^{33}$ ($2^{32} + 2^{32}$) sectors. These span 4 TiB on disks with 512 bytes per sector:

$2^{33}\ \text{sectors} \times 2^{9}\ \text{bytes/sector} = 2^{2} \times 2^{40}\ \text{B} = 4\ \text{TiB}$

Whatever the sector size and bookkeeping details, GPT capacity improves DPT capacity by a huge factor.
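
To see how the sector size changes these figures, the sketch below repeats the calculation for 512-byte and 4096-byte sectors. It works with exponents of two to stay within ordinary shell integers, and it uses the plain LBA widths (32 bits for DPT, 64 bits for GPT) without the start-plus-length refinement discussed above. The function names are hypothetical.

```shell
# Capacity = 2^lba_bits sectors x 2^shift bytes/sector, where shift is
# log2 of the sector size (9 for 512 bytes, 12 for 4096 bytes).
dpt_tib() { echo $(( 1 << (32 + $1 - 40) )); }   # result in TiB (2^40 B)
gpt_zib() { echo $(( 1 << (64 + $1 - 70) )); }   # result in ZiB (2^70 B)

for shift in 9 12; do
    echo "sector size $(( 1 << shift )) B:" \
         "DPT $(dpt_tib "$shift") TiB, GPT $(gpt_zib "$shift") ZiB"
done
```

With 4096-byte sectors the limits become 16 TiB and 64 ZiB, while the ratio between the two schemes stays the same.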

For either scheme, the size of the largest-possible partition falls short of the disk's full capacity, however, because the partition table itself consumes sectors. A DPT of primary partitions reserves just one sector, the MBR. An extended partition reserves an additional sector for each logical partition. A typical GPT supporting up to 128 partitions reserves 67 sectors on a disk with 512-byte sectors (1 for the protective MBR, 1 each for the main and backup headers, and 32 each for the main and backup tables). Disk editors may also skip sectors to align or space partitions. But these reductions in sectors available for partitions are minuscule relative to modern disk capacities. So bean counting notwithstanding, the size of the disk itself serves as a reasonable approximation for the size of the largest-possible partition.
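
The figure of 32 sectors per table copy can be checked with a line of arithmetic. The sketch assumes the usual entry size of 128 bytes, which is not stated above, and the function name gpt_table_sectors is hypothetical.

```shell
# gpt_table_sectors SECTOR_BYTES: sectors occupied by one copy of a
# 128-entry GPT table, assuming 128-byte entries.
gpt_table_sectors() {
    entries=128        # partition entries in a typical GPT (from the text)
    entry_bytes=128    # assumed size of one entry
    echo $(( entries * entry_bytes / $1 ))
}

for sector_bytes in 512 4096; do
    echo "$sector_bytes-byte sectors: $(gpt_table_sectors "$sector_bytes") sectors per table copy"
done
```

The table is 16 KiB either way: 32 sectors of 512 bytes, or 4 sectors of 4096 bytes.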

Tools

There are several utilities for creating and editing partition tables. GParted and GNOME Disk Utility (aka Palimpsest) are GUIs; parted, fdisk, and gdisk are interactive CLIs; sfdisk and sgdisk are typical shell commands. These programs use different terminology. For example, GParted talks about creating partition tables, Disk Utility talks about formatting drives, and parted talks about disk labels. GParted, Disk Utility, and parted can all create partition tables for DPT, GPT, and even other schemes. While fdisk can create and edit only DPT tables, it does recognize a GPT table when it sees one and smartly runs for the hills. sfdisk is a non-interactive version of fdisk for scripts (e.g. "s" for "script") and confines itself to editing existing DPT tables. gdisk is essentially fdisk for GPT, and sgdisk is its non-interactive sibling.

Package parted also provides libparted, which both GParted and GNOME Disk Utility require. Package util-linux-ng provides fdisk and sfdisk. Package gdisk provides sgdisk in addition to its eponymous utility. GNOME Disk Utility comprises packages gnome-disk-utility, gnome-disk-utility-libs, and gnome-disk-utility-ui-libs.

Terms of Entanglement

Some care must be taken with terminology. Disk format is not to be confused with filesystem format, and disk type is not to be confused with filesystem type or partition type. Disks, partitions, and filesystems are intertwined: directories and files live in filesystems, each filesystem lives in a single partition, and partitions live on disks. The workhorse terms "type," "format," and "label" have different meanings depending on which of these they describe.

A disk inherits its format from its partition table, and the correspondence is one-to-one. Accordingly, a disk lacking a partition table is not formatted. Sometimes "type" or "disklabel" is used as a synonym for format.

A partition has a type, but not really a format. A DPT partition may be primary, extended, or logical.


Excursion on the Master Boot Record

This section takes a closer look at the Master Boot Record, or MBR, on an x86 computer with the standard BIOS interface (rather than UEFI, for example). The MBR is the first sector of a disk (an HDD or UFD) and spans 512 bytes. If the disk is partitioned, the MBR contains the partition table. If the disk has a bootable partition, in particular, the MBR also holds the bootstrap code that the BIOS loads. It may optionally hold a disk identifier. Or it may hold nothing more than a two-byte magic number proclaiming itself an MBR. In light of this variability, there are other terms for the MBR: boot sector, disk label, partition table, partition table sector, partition map, etc. Sometimes "DOS," "MS-DOS," "IBM," or "x86" precede these terms for specificity. Here, we'll simply use "MBR."

Start by zeroing out the MBR on disk device /dev/sdd, for example. This removes the disk's identity as a storage device based on filesystems and reverts it to a block device for data chunks:

-> dd count=1 if=/dev/zero of=/dev/sdd
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.0280746 s, 18.2 kB/s

Here, the sector size is the standard 512 bytes, which is also the default block size for the dd command. Use hexdump to verify:

-> hexdump -C -n512 /dev/sdd
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200

Disk-management commands interpret this null MBR in various ways:

-> file -s /dev/sdd
/dev/sdd: data
-> fdisk -l /dev/sdd
...
Disk identifier: 0x00000000
Disk /dev/sdd doesn't contain a valid partition table
-> sfdisk -l /dev/sdd
...
sfdisk: ERROR: sector 0 does not have an msdos signature
 /dev/sdd: unrecognized partition table type
No partitions found
-> parted /dev/sdd print
Error: /dev/sdd: unrecognised disk label  

Next, use fdisk to write an MBR:

-> fdisk -cu /dev/sdd
Command (m for help): o
Building a new DOS disklabel with disk identifier 0x8ac0757f.
...
Command (m for help): w
...
-> file -s /dev/sdd
/dev/sdd: x86 boot sector, code offset 0x0
-> hexdump -C -n512 /dev/sdd
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  0b 0e 5f fc 00 00 00 00  |.........._.....|
000001c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200

This is a bare-bones MBR. It consists of a four-byte disk identifier at offsets 0x1B8-0x1BB, and it ends with the expected MBR signature of 55h and AAh as the final two bytes. The code section and partition table remain zeroed out. Alternatively, use parted to write an MBR:

-> dd count=1 if=/dev/zero of=/dev/sdd 2> /dev/null
-> parted -s /dev/sdd mklabel msdos
-> file -s /dev/sdd
/dev/sdd: x86 boot sector, code offset 0xb8

This writes to the bootstrap code area, as hinted by the code offset 0xB8 above, but leaves the partition table zeroed out. In an MBR that carries a disk identifier, the code area spans offsets 0x000--0x1B7, the 440 bytes ahead of the identifier.

Now, zero-out the MBR and then use fdisk to create four primary partitions on the disk. The corresponding partition table spans offsets 0x1BE--0x1FD. These 64 bytes lie between the disk identifier and the MBR signature. For example:

-> hexdump -C -n 512 /dev/sdd
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  bd 68 16 23 00 00 80 00  |.........h.#....|
000001c0  24 21 83 00 4a 2e 00 08  00 00 00 40 00 00 00 00  |$!..J......@....|
000001d0  4b 2e 06 00 ae 3a 00 48  00 00 00 40 00 00 00 00  |K....:.H...@....|
000001e0  af 3a 82 00 de 8a 00 88  00 00 00 50 00 00 00 00  |.:.........P....|
000001f0  df 8a de 00 fd ff 00 d8  00 00 00 1c 00 00 55 aa  |..............U.|
00000200
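
The dump above can be read by hand, one 16-byte descriptor at a time. The sketch below decodes a descriptor given as 16 hex bytes, using the standard layout: byte 0 is the boot flag, byte 4 the partition type, bytes 8-11 the starting LBA, and bytes 12-15 the sector count, with the two 32-bit fields stored little-endian. The function name decode_entry is hypothetical, and the CHS bytes are ignored.

```shell
# decode_entry B0 ... B15: decode one 16-byte partition descriptor,
# given as 16 two-digit hex bytes in on-disk order.
# (Hypothetical helper; CHS fields in bytes 1-3 and 5-7 are skipped.)
decode_entry() {
    boot=$1                                    # 80 = active, 00 = inactive
    ptype=$5                                   # partition type ID
    start=$(( 0x${12}${11}${10}${9} ))         # starting LBA, little-endian
    count=$(( 0x${16}${15}${14}${13} ))        # number of sectors, little-endian
    echo "boot=0x$boot type=0x$ptype start=$start sectors=$count"
}

# The four descriptors from the dump above, offsets 0x1BE through 0x1FD.
decode_entry 80 00 24 21 83 00 4a 2e 00 08 00 00 00 40 00 00
decode_entry 00 00 4b 2e 06 00 ae 3a 00 48 00 00 00 40 00 00
decode_entry 00 00 af 3a 82 00 de 8a 00 88 00 00 00 50 00 00
decode_entry 00 00 df 8a de 00 fd ff 00 d8 00 00 00 1c 00 00
```

The first partition is flagged bootable, has type 0x83, starts at sector 2048, and spans 16384 sectors. Each later partition starts where the previous one ends: 18432, 34816, and 55296.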

Finally, zero out the MBR again and subsequently put the signature in the last two bytes:

-> echo -e \\x55\\xaa > mbr-id
-> dd bs=512 count=1 if=/dev/zero of=/dev/sdd          2> /dev/null
-> dd bs=1   count=2 if=mbr-id    of=/dev/sdd seek=510 2> /dev/null
-> hexdump -C -n 512 /dev/sdd
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200
-> file -s /dev/sdd
/dev/sdd: x86 boot sector, code offset 0x0
-> parted /dev/sdd print | grep "Partition Table"
Partition Table: msdos
-> sfdisk -l -uS /dev/sdd
...
   Device Boot    Start       End   #sectors  Id  System
/dev/sdd1             0         -          0   0  Empty
/dev/sdd2             0         -          0   0  Empty
/dev/sdd3             0         -          0   0  Empty
/dev/sdd4             0         -          0   0  Empty

This exercise demonstrates the minimal MBR in the operational sense that file, fdisk, parted, and sfdisk interpret it reasonably and without griping.
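
The same experiment can be rehearsed without risking a disk. The sketch below uses a one-sector scratch file in place of /dev/sdd and tests for the signature with od; the function name has_mbr_signature and the scratch file are inventions for the illustration. Here printf writes exactly the two signature bytes, given as octal escapes.

```shell
# has_mbr_signature FILE: succeed if bytes 510-511 of FILE are 55 aa.
# (Hypothetical helper for illustration.)
has_mbr_signature() {
    sig=$(od -An -tx1 -j510 -N2 "$1" | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# One zeroed sector in a scratch file, standing in for a disk.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
has_mbr_signature "$img" || echo "zeroed sector: no signature"

# Write 55h AAh (octal 125 252) at offset 510 without truncating the file.
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null
has_mbr_signature "$img" && echo "signature present"
```

Pointing file -s at the scratch file is a convenient way to compare against the results above.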
