16

I have encountered several USB sticks with a corrupted filesystem in only 2 years. In a Windows-only environment (Vista and newer), what can be done to reduce the chance of filesystem corruption and data loss on a single USB drive?

  • Which filesystem is the most robust?
  • Which technologies or labels (xyz certified, etc) indicate that USB sticks supporting them are less likely to become corrupted?
  • Is there something else to look out for?
Peter
  • 4,630

5 Answers

12

What can be done to reduce the chance of filesystem corruption and data loss on a single USB drive?

Commonly used filesystems like FAT32 or NTFS don't store validation information for your file data (at most for the filesystem's own internal structures). Keep backups of your data, validate it with checksums (you can generate MD5/SHA1 hashes of your files to check later whether any data has been corrupted), and/or store recovery archives.
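As a rough illustration of the checksum idea, here is a minimal Python sketch that records a digest for every file under a folder and later reports files that are missing or changed. The function names are made up for this example, and it uses SHA-256 rather than the MD5/SHA1 mentioned above; any hash available in `hashlib` would work the same way.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changes(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return relative paths that are missing or whose digest no longer matches."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

You would call `build_manifest` on a folder of the stick (for example a hypothetical `Path("E:/documents")`) right after copying files, keep the result somewhere other than the stick itself, and run `find_changes` before trusting the data again. A checksum only detects corruption; it cannot repair it, which is why backups are still needed.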

And lastly, regardless of the filesystem, you should always properly unmount the drive. This ensures that any existing file reads/writes are completed, and any read/write buffers have been flushed.

Which filesystem is the most robust?

Robustness comes at a price - compatibility. Arguably, you'd want a filesystem with built-in data validation and checksumming (or redundant data) like ZFS, but that isn't very portable with Windows/OSX. If performance is a concern, you might want to try exFAT, which appears to be supported in most major operating systems out of the box or with some slight configuration.

Which technologies or labels (xyz certified, etc) indicate that USB sticks supporting them are less likely to become corrupted?

Anything that keeps flash memory alive longer, most notably wear leveling and over-provisioning. If the drive supports wear leveling, a larger drive will have more spare sectors available in case some wear out.


At the end of the day, flash memory doesn't last forever. All current flash memory endures only a limited number of write (program/erase) cycles, so cells eventually wear out and data loss becomes more likely over time. You can mitigate this risk by taking regular backups, and validating your data with checksums to determine when a file has been corrupted.

It's also possible to use a filesystem with built-in data integrity and recovery, but these are uncommon in many non-UNIX environments as of writing this. They also may be slower and actually wear out the drive faster, due to the requirements of storing additional checksums & redundant information for each file.

For each case there's a solution; you just need to weigh the portability/integrity/speed considerations.

Breakthrough
  • 34,847
4

The most common reason flash drives get corrupted is impatience. I often refuse to wait to eject flash drives, and I know I'm not the only one. (In my defense, I also tend to make sure nothing critical is only on a flash drive, and you should, too.)

Drives get corrupted when you don't safely remove them because of something called "write caching." Write caching is a feature that improves write speeds: rather than writing each request as it is received and forcing you to wait, your OS caches these requests and fulfills them all in one fell swoop. When you tell your computer to safely remove or unmount your flash drive, you warn the OS that you're going to remove it, so it writes all requests in its cache to the disk and tells all background programs to stop accessing it. If you don't wait, items may still be waiting to be written to disk, which can result in a corrupt filesystem.
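If you write files to the stick from your own scripts, you can narrow the window in which data sits only in a cache. The sketch below (the helper name `write_durably` is made up for this example) flushes Python's buffer and then calls `os.fsync`, which asks the operating system to push that file's cached data to the device.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data and ask the OS to push it to the device before returning."""
    with open(path, "wb") as handle:
        handle.write(data)
        handle.flush()             # empty Python's own user-space buffer
        os.fsync(handle.fileno())  # ask the OS to flush its cache for this file
```

This only covers the one file you wrote. Other programs, and the filesystem's own bookkeeping, may still have pending writes, so it complements safe removal rather than replacing it.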

As for format, I personally prefer ext4 for my flash drives. For Windows, I would say go with NTFS, as ext4 tends to cause problems in Windows. NTFS supports large files and journaling, so it will work pretty well. Filesystem is largely a personal choice, and typically anything that is less prone to corruption is also going to be significantly slower. ZFS is becoming popular, though I don't know whether that works on Windows, and I don't know whether it can be put on a flash drive.

In terms of brands, I don't find a large difference in quality from one to the other. Some have better protection for the connectors, some certainly feel less 'flimsy' (although, surprisingly, I've found that the flimsy ones break less often). I usually just use whatever is cheap.

You should recognize that nothing important should ever be kept solely on flash memory. USB sticks are too easy to lose, step on, or drop into the toilet, etc. Important data should be backed up and kept on at least two distinct drives, and preferably in at least 2 separate physical locations (think fire risk, flood risk, etc).

4

Filesystem - if you only use your flash drive on one operating system then use the same filesystem that is on your computer:

Windows - NTFS

Mac OS X - HFS Plus

Linux - several options, here's an article

If you use your drive on multiple operating systems you need to use FAT32 because it is the most compatible filesystem, but also the least reliable. Any of the above choices is a better option if compatibility is not an issue. Note: if you use Linux and Windows then you can use NTFS for both computer filesystems and your flash drives, but if you use Mac OS X then getting NTFS working is a pain and not worth it.

Brand - never base any purchase ever on brand

Technologies - @Breakthrough listed some that you could look for, but just about all flash drives today will have these features or some other proprietary alias for them that isn't worth looking into

In general, don't spend too much time looking into it. As @SethCurry touched on, redundancy is always the better answer for keeping your data safe. Any storage device can and will fail eventually, so you don't want to get comfortable with one solution.

jjno91
  • 315
1

One thing not yet mentioned in the answers is the type of flash memory used. This will only be a criterion when purchasing a flash memory device.

There's SLC (Single Level Cell) and MLC (Multi Level Cell) flash memory.
MLC technology was developed to increase the density of data that can be stored.
It stores more than one bit per cell, usually two. The way it does it is by storing four voltage levels in the cell. That gives you two bits of data for every cell.

There are a couple of problems with MLC.

MLC can handle about a factor of 10 fewer write cycles. It can do roughly 10,000 writes before it begins to have problems, whereas SLC can do roughly 100,000.
As a cell's ability to store charge degrades, the ability to distinguish four levels in MLC decreases faster than the ability to distinguish just two levels in SLC.
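To put those figures in perspective, here is a back-of-the-envelope sketch in Python. The function names are invented for this example, and the endurance estimate assumes perfect wear leveling with no write amplification, which real drives do not achieve, so treat the result as an optimistic upper bound.

```python
def bits_per_cell(voltage_levels: int) -> int:
    """Bits stored by a cell that can distinguish this many voltage levels."""
    if voltage_levels < 2 or voltage_levels & (voltage_levels - 1):
        raise ValueError("voltage_levels must be a power of two, at least 2")
    return voltage_levels.bit_length() - 1


def ideal_total_writes_gb(capacity_gb: float, cycles_per_cell: int) -> float:
    """Upper bound on data written before wear-out (assumes perfect wear leveling)."""
    return capacity_gb * cycles_per_cell
```

For a hypothetical 8 GB stick, `ideal_total_writes_gb(8, 10_000)` gives 80,000 GB for MLC and `ideal_total_writes_gb(8, 100_000)` gives 800,000 GB for SLC, using the cycle counts quoted above. `bits_per_cell(4)` returns 2, matching the four voltage levels of MLC, and `bits_per_cell(2)` returns 1 for SLC.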

Although all of these technologies use ECC (Error Correction Code), there's a limit to what it can do. Also, wear leveling, as mentioned in another answer, will not 'fix' the difference. Maybe there are still some really cheap USB sticks that don't 'wear level', but that would be hard to tell from the outside.

Also, whether a device uses SLC or MLC will be hard to tell. In the race to cram ever more data into the same 'surface' I assume that most manufacturers will have switched to MLC. If reliability is a prime factor, maybe you can look around and still find some SLC memory devices.

Jan Doggen
  • 4,657
0

In a Windows-only environment you can also optimize the USB device for performance:

  1. Open Device Manager and find your USB stick under Disk drives.
  2. Right-click on your device and select Properties.
  3. In the Policies tab, select the option "Better performance" instead of "Quick removal" and confirm.

The two policies handle writes differently. With "Quick removal" (the default), Windows tries to write data to the USB stick immediately. With "Better performance", the system caches writes and flushes them all only before the volume is unmounted via Safely Remove Hardware (usually by left-clicking the USB icon in the tray).

Be aware that with the "Better performance" option you can lose data that has not yet been written if you unplug the stick without using Safely Remove Hardware, or if your machine shuts down suddenly.

See also Is There A Need To Safely Remove Device If "Quick Removal" Is Enabled?