26

I have an SSD that can be configured to report its physical sector size to an OS in two different ways:

Option 1: Logical = 512 Bytes, Physical = 512 Bytes

Option 2: Logical = 512 Bytes, Physical = 4096 Bytes (4K)

What benefit does an OS gain by being aware of the 4K physical sector size, considering:

  • The OS must talk to the drive in 512-byte sectors regardless

  • All modern OSes align to 4K and utilize 4K or multiples of 4K I/O regardless

The setting seems pointless, because modern OSes are already optimized for 4K sector drives. Modern OSes don't need to "ask" a drive whether its sectors are 512 bytes or 4K, because the OS does everything in a 4K-friendly way by default.

For example, Windows 7 aligns partitions to 1MB (a multiple of 4K), NTFS cluster size is 4K or multiple thereof, and all I/O is done in 4K or multiple thereof. Windows doesn't give a damn what hard drive you have, it will apply the above behavior in all cases.

Anyway... my SSD has this "physical sector size" setting and so it must be there for some good reason... it's the reason for this I'm looking for.

BTW, for what it's worth, the drive is an Intel SSD DC S3510. The drive's datasheet says this (page 27):

By using SCT command 0xD801 with State=0, Option=1, ID Word 106 can be changed from 0x6003 to 0x4000 (4KB physical sector size to 512B physical sector size support change).
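For reference, the two values of ID Word 106 quoted above can be decoded by hand. The following is a minimal sketch, assuming the bit layout of word 106 from the ATA IDENTIFY DEVICE data (bits 15:14 = 01b means the word is valid, bit 13 means several logical sectors per physical sector, bits 3:0 hold the power-of-two ratio). The function name and the fixed 512-byte logical size are assumptions made for illustration, not something from the datasheet.

```python
def decode_word_106(word, logical_bytes=512):
    """Decode ATA IDENTIFY DEVICE word 106 into sector-size information.

    logical_bytes is assumed to be 512 here; a real decoder would take the
    logical sector size from words 117-118 when bit 12 is set.
    """
    # Bits 15:14 must read 01b for the rest of the word to be meaningful.
    valid = ((word >> 14) & 0b11) == 0b01
    # Bit 13: the device has multiple logical sectors per physical sector.
    multiple = bool(word & (1 << 13))
    # Bits 3:0: log2 of the number of logical sectors per physical sector.
    exponent = (word & 0xF) if multiple else 0
    logical_per_physical = 1 << exponent
    return {
        "valid": valid,
        "logical_per_physical": logical_per_physical,
        "physical_bytes": logical_bytes * logical_per_physical,
    }


for value in (0x6003, 0x4000):
    info = decode_word_106(value)
    print(hex(value), info)
```

Under these assumptions, 0x6003 decodes to eight logical sectors per physical sector (4096 bytes) and 0x4000 decodes to one (512 bytes), which matches the "4KB to 512B" wording in the datasheet.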

Pang
misha256

7 Answers

23

The 512-byte emulation is intended for compatibility with older systems. However, writes involving only part of a physical 4K sector can cause reduced performance because the sector needs to be read and modified before it can actually be written.

When a legacy operating system tries to write to an Advanced Format disk, performance issues can arise because the logical sectors written may not match up with the physical sectors.

  • When only part of a 4K physical sector is read, the data is simply read off the physical sector and there is no reduction in performance. However, when the system tries to write to part of a physical sector (e.g. an emulated 512-byte sector rather than the whole physical sector), the hard drive needs to read the whole physical sector, modify the changed portion in the hard drive's internal memory, and write it back to the platters. This is called read-modify-write (RMW), an operation which requires an extra rotation of the disk and therefore reduces performance. Seagate explains this as follows:

[...] the hard drive must first read the entire 4K sector containing the targeted location of the host write request, merge the existing data with the new data and then rewrite the entire 4K sector:

Read-modify-write cycle

In this instance, the hard drive must perform extra mechanical steps in the form of reading a 4K sector, modifying the contents and then writing the data. This process is called a read-modify-write cycle, which is undesirable because it has a negative impact on hard drive performance.

Disk partitions that are not aligned to a 4K boundary can cause degraded performance as well.

  • Traditionally, the first partition on a hard disk starts at sector 63. Windows XP and older operating systems partitioned disks in this manner. Newer versions of Windows will create partitions on a 1 MB boundary, ensuring proper alignment to the physical sectors. This is called Alignment 0.

  • Because LBA 63 is not a multiple of 8 (eight 512-byte legacy sectors fit into a 4K sector), an Advanced Format disk which is formatted in the old manner will have clusters (the smallest unit of filesystem data allocation, typically 4K in size) that are not aligned to the physical sectors on a 4K disk, a condition called Alignment 1. As a result, an I/O operation that otherwise involves 4K of data now spans two sectors leading to a read-modify-write operation that reduces performance.
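The alignment arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes 512-byte logical sectors, 4096-byte physical sectors, and 4K clusters as described in the bullets.

```python
LOGICAL = 512    # bytes per logical (emulated) sector
PHYSICAL = 4096  # bytes per physical sector


def is_aligned(start_lba, logical=LOGICAL, physical=PHYSICAL):
    """True if a partition starting at start_lba begins on a physical sector boundary."""
    return (start_lba * logical) % physical == 0


def physical_sectors_spanned(start_lba, cluster_index, cluster_bytes=4096,
                             logical=LOGICAL, physical=PHYSICAL):
    """Number of physical sectors touched by one cluster of the partition."""
    first_byte = start_lba * logical + cluster_index * cluster_bytes
    last_byte = first_byte + cluster_bytes - 1
    return last_byte // physical - first_byte // physical + 1


# Legacy start at LBA 63 versus the 1 MB boundary (LBA 2048 with 512-byte sectors).
for start in (63, 2048):
    print(start, is_aligned(start), physical_sectors_spanned(start, 0))
```

With a start at LBA 63, every 4K cluster straddles two physical sectors, so each cluster write turns into two partial-sector writes. With a start at LBA 2048, each cluster maps onto exactly one physical sector.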

While information about physical sector size is unnecessary if the OS always writes data on a 4K boundary, this information may still be needed by applications which perform low-level I/O.

  • When a drive reports that its physical sector size is 4K, the OS or application can tell that it is an Advanced Format drive and therefore must avoid performing I/O operations that do not span full physical sectors. A drive that reports 512-byte native sectors does not impose this restriction. While newer operating systems will usually try to read or write data in 4K units whenever possible (making this information irrelevant), applications which perform low-level I/O may need to know the physical sector size so that they can adjust accordingly and avoid misaligned or partial-sector writes that cause slow RMW cycles.
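As an illustration of how an application might look up the reported sizes, here is a minimal sketch that assumes a Linux system, where the kernel exposes both values under `/sys/block/<device>/queue/`. The function name and the `sysfs_root` parameter are hypothetical and exist only to keep the example testable; other operating systems expose the same information through different interfaces.

```python
from pathlib import Path


def read_block_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) sector sizes in bytes as reported by the kernel.

    Assumes the Linux sysfs layout /sys/block/<device>/queue/*_block_size.
    """
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text().strip())
    physical = int((queue / "physical_block_size").read_text().strip())
    return logical, physical


def write_granularity(logical, physical):
    """Smallest write size that avoids partial physical sector writes."""
    return max(logical, physical)
```

For example, `read_block_sizes("sda")` on a 512e drive would be expected to return `(512, 4096)`, and an application doing raw I/O could then round its buffers and offsets to `write_granularity(512, 4096)` bytes.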

Your SSD provides the ability to change the reported physical sector size because it is necessary for compatibility with certain storage arrays.

  • Datacenters often have storage arrays consisting of legacy 512n drives. 4K drives, even those that emulate 512-byte sectors, may not be compatible with such arrays, so this feature is necessary to ensure compatibility. See this forum thread:

    We can't just stick a 4K drive in an array formatted with 512b disks. Many arrays (most notably ZFS based storage, which is increasingly popular as software defined storage makes waves) will not accept a replacement drive with a different physical sector format.

    Note that better performance will be attained on modern systems if the drive is configured to use 4K sectors.

bwDraco
7

What benefit does an OS gain by being aware of the physical sector size when, regardless, the OS has to talk to the drive in 512-byte sectors.

The logical size is a minimum size to transfer data. Since this is a block device, any data transfer between host computer and drive will be in multiples of this logical block size.

The physical size is an optimal size to transfer data, and reflects the size of the actual read and write operations at the controller/drive level.

When the host computer requests a read of a logical sector, the controller/drive will perform a read operation of the physical sector that contains the logical sector.
When the logical sector size is equal to the physical sector size, the operation is simple. When the logical sector size is less than the physical sector size, the logical sector has to be extracted from the physical sector by the controller for transfer to the host computer.

When the host computer requests a write of a logical sector, the size of the physical sector matters.
When the logical sector size is equal to the physical sector size, the write operation is simple, and can proceed directly. The condition of the previous contents of the sector will not affect the write operation.

When the logical sector size is less than the physical sector size, the controller must first perform a read operation of the physical sector that contains the logical sector.
If the read is successful, then the logical sector is inserted into the physical sector, and the physical sector is written in its entirety.
If the read is not successful (even after retries), the write operation cannot be completed.

If the OS performs the read and write operations with the physical sector size (by utilizing the multi-sector operations available in the ATA command set), the write operations will be performed more efficiently (and without an unnecessary chance of incompletion).
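The write path described above can be modelled with a toy drive. This is a sketch for illustration only, not how any real controller firmware works: the class name, the counters, and the in-memory `bytearray` standing in for the media are all assumptions.

```python
class EmulatedDrive:
    """Toy model of a drive with 512-byte logical and 4096-byte physical sectors."""

    def __init__(self, physical_sectors, logical=512, physical=4096):
        self.logical = logical
        self.physical = physical
        self.media = bytearray(physical_sectors * physical)
        self.physical_reads = 0   # reads caused by partial-sector writes
        self.physical_writes = 0

    def write(self, lba, data):
        """Write data (a non-empty multiple of the logical size) starting at lba."""
        if not data or len(data) % self.logical:
            raise ValueError("data must be a non-empty multiple of the logical sector size")
        start = lba * self.logical
        end = start + len(data)
        first = start // self.physical
        last = (end - 1) // self.physical
        for p in range(first, last + 1):
            p_start = p * self.physical
            p_end = p_start + self.physical
            lo, hi = max(start, p_start), min(end, p_end)
            if hi - lo < self.physical:
                # Partial coverage: read the whole physical sector first.
                sector = bytearray(self.media[p_start:p_end])
                self.physical_reads += 1
            else:
                # Full coverage: previous contents do not matter.
                sector = bytearray(self.physical)
            # Merge the new data, then write the whole physical sector back.
            sector[lo - p_start:hi - p_start] = data[lo - start:hi - start]
            self.media[p_start:p_end] = sector
            self.physical_writes += 1


drive = EmulatedDrive(physical_sectors=4)
drive.write(0, b"\x01" * 512)    # one logical sector: read-modify-write
drive.write(8, b"\x02" * 4096)   # one full physical sector: direct write
print(drive.physical_reads, drive.physical_writes)
```

In this model the single 512-byte write costs one extra physical read, while the aligned 4096-byte write costs none, which is the efficiency difference the paragraph above refers to.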

The LOGICAL sector size entirely defines how an OS can talk to a drive. No exceptions. What use is it knowing the physical sector size, when you're only allowed to communicate in logical sector size?

Your contention of "no exceptions" is incorrect.
The ATA command set, which was introduced with IDE hard drives, has always had the capability to perform read and write operations with a sector count parameter. This is merely an extension of existing disk and floppy controller interfaces that were also capable of multi-sector read/write operations (so long as the sectors were on the same track).

sawdust
5

If the OS knows the underlying physical sector size, it can optimize its queries to require as few physical operations as possible. Particularly with SSDs, the physical operation limit (4KB IOPS limit) is often the ultimate limit of device speed, so being able to make best use of this capacity is important.

2

512/4096 (logical/physical) = the OS is responsible for alignment and optimization.

512/512 = the drive is responsible for this.

See also : http://support.microsoft.com/en-us/kb/2510009

Joe
1

There are two different ways of accessing a location within a drive, one is the CHS scheme and the other is the LBA scheme.

CHS stands for Cylinder, Head, Sector and is the most low-level method of determining where to read or write from the drive. You tell it to use cylinder x, head y, and sector z and read or write the contents of that location to or from an address in the memory (a buffer). It is derived from the actual, physical components of a (traditional, spinning rust) hard drive, where you have physical cylinders and read heads. The sector is the smallest addressable unit, and was traditionally fixed at 512 bytes.

LBA is logical block addressing, wherein the drive reads from and writes to a sector identified by its offset from the start of the disk (counting from zero): for example, read the 123837th sector on the disk, or write this to the 123734th sector on the disk.

The problem? Each of these values is limited in range. In fact, because of how severely limited CHS was, LBA had to be introduced. For CHS, the maximum value for C (the cylinder) is 1023, giving 1024 cylinders, while H (heads) can be 255 at most, and S (sector) can only go from 1 up to 63. That means you can have at most 1024 cylinders x 255 heads x 63 sectors x 512 bytes mapped in traditional CHS format, giving you a grand total of under 8 GiB! Using CHS, it's simply not possible to access a disk larger than 8 GiB!

So LBA was introduced with a 32-bit limit, giving you a 2^32 x 512 bytes or 2 TiB limit on disk size. This is the reason an MBR disk cannot exceed 2 TiB: it uses CHS and 32-bit LBA to specify partition sizes, and neither can support anything over 2 TiB.

Newer, better options have been introduced like the GPT partitioning scheme which extends LBA to 64 bits, giving you a heck of a lot more than you'll ever need at 2^64 x 512 bytes - but there's a catch: a lot of legacy hardware and legacy operating systems and legacy BIOS implementations and legacy drivers don't support UEFI or GPT, and a lot of people would like to have something that can be more-easily upgraded to go past the 2TiB limit without having to rewrite the entire stack from scratch. And, at long last, we reach the 4096 sector size.

See, throughout all the limitations discussed above, one thing has been a fixed assumption: the sector size. From day one, it has been 512 bytes and it's stayed that way ever since. But recently, hard disk manufacturers realized there's an opportunity to work some magic: take the traditional CHS or 32-bit LBA and simply replace the sector size with 4096 (4k) instead of 512 bytes. When an OS says "give me the 2nd sector on the disk" by requesting LBA 1 (because LBA 0 is the first), we aren't going to give it bytes 512 - 1023 but rather bytes 4096 - 8191.

Suddenly, our 2TiB limit is upgraded to 2^32 x 4096 bytes, or 16 TiB, without having to ditch MBR, switch to UEFI or GPT, or anything!
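The three limits mentioned above are easy to verify. This is just the arithmetic, written out as a sketch; it assumes the usual CHS geometry of 1024 cylinders, 255 heads and 63 sectors per track, and the variable names are made up for illustration.

```python
GIB = 2**30
TIB = 2**40

# Classic CHS geometry: 1024 cylinders, 255 heads, 63 sectors per track.
chs_limit = 1024 * 255 * 63 * 512

# 32-bit LBA with 512-byte sectors (the MBR limit).
lba32_512_limit = 2**32 * 512

# The same 32-bit LBA, but with 4096-byte sectors.
lba32_4k_limit = 2**32 * 4096

print(f"CHS:             {chs_limit / GIB:.2f} GiB")      # 7.84 GiB
print(f"LBA32 x 512 B:   {lba32_512_limit // TIB} TiB")   # 2 TiB
print(f"LBA32 x 4096 B:  {lba32_4k_limit // TIB} TiB")    # 16 TiB
```

The factor of eight between the last two figures is simply 4096 / 512.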

The only catch is that if the OS isn't aware that this is a magic disk that uses 4096 sectors instead of 512 byte sectors, there's going to be a mismatch. Each time the OS says "hey, you, disk, write me these 512 bytes to offset xxx" the disk will use up 4096 bytes to store these 512 bytes (the rest being zeros or junk data, assuming you don't end up with a memory underflow) because they don't communicate in bytes, they communicate in sectors.

So BIOSes now (sometimes) include an option to let you manually specify that a 512-byte sector size should be used instead of the native 4096-byte sector size that newer disks are using, with the caveat that you cannot use it to access more than 2 TiB of the disk on an MBR system, just like it was in the "good old days." But modern OSes that are 4K-aware can take advantage of all this to read and write in 4096-byte chunks and voilà!

(An additional advantage is that things are a lot faster because if you're reading and writing 4096 bytes at a time, it's fewer operations to read or write, say, 4GiB of data.)

0

Just wanted to let you know of a situation where 4K sectors are a problem for modern operating systems.

Microsoft's VSS writer (Shadow Copy) does not work well with 4K sectors. In order to back up a DFS Replication shared folder, our backup software, Backup Exec, needs to make a shadow copy of the DFS-replicated folder. The job fails if the DFS Replication folder is on a drive with 4K sectors, because VSS does not work correctly with 4K sectors.

Jim

-3

Physical means that of the actual drive itself, while Logical is that of the defined divisions within it. From PC Mag's Logical vs Physical:

In a Windows PC, a single physical hard drive is drive 0; however, it may be partitioned into several logical drives, such as C:, D: and E:.

To explain this in a digestible form, imagine an apple that is the width of your hand. That is the actual Physical size of the apple. Naturally, a whole apple will not fit in your mouth, so you decide to slice it into equal slices, each slice being the width of your finger. This is the Logical size, or size that your computer will utilize.

Several reasons for this are real-value capacity calculations and error mapping and correction, as explained by Wikipedia:

Typical hard disk drives attempt to "remap" the data in a physical sector that is failing to a spare physical sector provided by the drive's "spare sector pool" (also called "reserve pool"),[41] while relying on the ECC to recover stored data while the amount of errors in a bad sector is still low enough. The S.M.A.R.T (Self-Monitoring, Analysis and Reporting Technology) feature counts the total number of errors in the entire HDD fixed by ECC (although not on all hard drives as the related S.M.A.R.T attributes "Hardware ECC Recovered" and "Soft ECC Correction" are not consistently supported), and the total number of performed sector remappings, as the occurrence of many such errors may predict an HDD failure.

Just as you cannot have slices of the apple without the apple itself, you cannot have Logical without the Physical serving as its base.

xCare