26

I was watching FlyTech's video, and when he made a pair of illegal folders (one named "<" and the other named ">"), it made Windows think the folder containing them was corrupted. This behavior did not appear to occur for any of the other folders with illegal characters.

I was wondering why this specific combination makes Windows think that the containing folder is corrupted. I searched a little, and could not find why this might be. Could anyone explain?

4 Answers

46

The error isn't caused by angle brackets exactly, or by having two of them. Instead, it occurs when 1) a file name contains wildcard characters, and 2) the wildcard would match a previously seen file, which makes Windows think that the folder search isn't advancing forwards like it should.

First, as far as I understand, listing a directory on Windows is done by wildcard expansion (the opposite of how it would be done on Linux). To expand a wildcard pattern, you start by calling FindFirstFile() with the initial pattern, then repeat FindNextFile() while NTFS finds matching files one-by-one. To list the entire directory, you do the same with * as the pattern.
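For contrast, here is a minimal Python sketch of the Linux-style order of operations mentioned above: list every entry first, then filter the names against the pattern in user space. The helper name expand is made up for illustration, and fnmatch only stands in for shell globbing; it says nothing about how Windows itself implements the search.

```python
import fnmatch
import os
import tempfile


def expand(directory, pattern):
    """List all entries first, then filter them against the pattern."""
    with os.scandir(directory) as entries:
        names = sorted(entry.name for entry in entries)
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


if __name__ == "__main__":
    # Build a throwaway directory so the example does not touch real files.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.txt", "b.txt", "c.log"):
            open(os.path.join(tmp, name), "w").close()
        print(expand(tmp, "*.txt"))
        print(expand(tmp, "*"))
```

Here the pattern never reaches the file system at all, which is why a file whose name looks like a wildcard cannot confuse the listing on that side.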

Second, both < and > (as well as ") are actually treated as wildcards in the deeper parts of Windows file-handling code: they behave like the historical MS-DOS wildcard variants of * and ?. (For example, < aka DOS_STAR matches all characters up until the file extension.) The publicly available .NET source code contains a description of the algorithm, which is identical to the one found in leaked Windows NT kernel source.

So it's not just the angle brackets, but also " ? * that could be used to trigger this error – as long as they're used in combination with another file name that would be sorted before the wildcard, if the sorting is done by Unicode value (which is the order enforced by NTFS).

For example, you would also get the "folder corrupted" error if you had items named foo( and foo*. There is nothing special about the ( here, except that it goes before * in Unicode – while a character that sorts after * such as foo+ would not trigger the error. (You can open "Character Map" via charmap.exe if you want to see the Unicode positions of these characters.)

Similarly, a directory containing [foo<, foo=] or [foo?, fooo] would not trigger this situation, but a directory containing [foo=, foo>] or [foo+, foo?] would.
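The rule behind these examples can be written as a small Python check. This is only a sketch of the explanation given here, under loud assumptions: Python's code-point sort stands in for the NTFS order, < is treated like * and > like ? (the real DOS_STAR and DOS_QM rules have extra handling around the dot), the " wildcard is not modeled, and names are assumed not to contain [. The function name triggers_error is hypothetical.

```python
from fnmatch import fnmatchcase

# Simplification (assumption): '<' behaves like '*' and '>' like '?'.
TO_GLOB = str.maketrans("<>", "*?")
WILDCARDS = set("*?<>")


def triggers_error(names):
    """True if some wildcard-like name matches a name sorted before it."""
    ordered = sorted(names)  # code-point order stands in for NTFS order
    for i, name in enumerate(ordered):
        if WILDCARDS.isdisjoint(name):
            continue  # ordinary name, cannot match anything but itself
        pattern = name.translate(TO_GLOB)
        if any(fnmatchcase(earlier, pattern) for earlier in ordered[:i]):
            return True
    return False


if __name__ == "__main__":
    for pair in (["foo(", "foo*"], ["foo+", "foo*"], ["foo<", "foo="],
                 ["foo?", "fooo"], ["foo=", "foo>"], ["foo+", "foo?"],
                 ["<", ">"]):
        print(pair, triggers_error(pair))
```

Under those assumptions the check agrees with the pairs listed above, including the original "<" and ">" folders.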

So if I understand everything correctly, what seems to happen is:

  1. The directory has items [foo(, foo*], with NTFS enforcing this exact order.
  2. Kernel asks NTFS "Get first item, starting at *".
  3. NTFS finds and returns foo(.
  4. Kernel asks NTFS "Get next item, continuing at foo(".
  5. NTFS finds foo( (exact match) and returns the next item foo*.
  6. Kernel asks NTFS "Get next item, continuing at foo*".
  7. NTFS finds foo* – which is recognized as a wildcard and matches foo( first, therefore the next item is foo* again – so an error is raised.
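The steps above can be modeled as a toy loop in Python. It is a simulation of the sequence as described, not kernel or NTFS code: the names resume_index and list_directory are invented, code-point sorting stands in for the NTFS order, and < and > are simply mapped to * and ? for matching.

```python
from fnmatch import fnmatchcase

# Simplification (assumption): '<' behaves like '*' and '>' like '?'.
TO_GLOB = str.maketrans("<>", "*?")


def resume_index(ordered, resume_name):
    """Index of the first entry that matches resume_name used as a pattern."""
    pattern = resume_name.translate(TO_GLOB)
    for i, entry in enumerate(ordered):
        if fnmatchcase(entry, pattern):
            return i
    raise LookupError(resume_name)


def list_directory(names):
    """Return items one by one, resuming each time from the last name."""
    ordered = sorted(names)  # code-point order stands in for NTFS order
    returned = []
    if not ordered:
        return returned
    returned.append(ordered[0])  # steps 2-3: get the first item
    while True:
        i = resume_index(ordered, returned[-1])  # steps 4 and 6: resume
        if i + 1 == len(ordered):
            return returned  # nothing left after the resume point
        nxt = ordered[i + 1]
        if nxt in returned:
            # Step 7: the "next" item was already returned.
            raise RuntimeError(f"search did not advance past {nxt!r}")
        returned.append(nxt)
```

With this model, list_directory(["foo+", "foo*"]) finishes normally, while list_directory(["foo(", "foo*"]) and list_directory(["<", ">"]) raise, because the resume lookup lands on the earlier item instead of the wildcard-named one.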

As > (aka DOS_QM) is handled similarly to the ? wildcard, a folder named ">" causes the same problem by matching the previous "<" item before itself.

grawity
  • 501,077
3

The characters < and > belong to the "reserved characters" group, which must not be used in the name of a file or directory on Windows.

https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file

The following reserved characters are not valid to name a file or directory:

< (less than) > (greater than) : (colon) " (double quote) / (forward slash) \ (backslash) | (vertical bar or pipe) ? (question mark) * (asterisk)

1

As noted in another answer, these characters are in the reserved characters group:

https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file

The following reserved characters are not valid to name a file or directory:

< (less than) > (greater than) : (colon) " (double quote) / (forward slash) \ (backslash) | (vertical bar or pipe) ? (question mark) * (asterisk)

There are very specific reasons for each of these. There are ways around them, but each has a fundamental function in command processing:

  • < (less than) - Input from a file
  • > (greater than) - Output to a file
  • : (colon) - Drive identifier (e.g., C:)
  • " (double quote) - Quote a file name that includes spaces
  • / (forward slash) - POSIX folder/directory separator
  • \ (backslash) - Windows folder/directory separator
  • | (vertical bar or pipe) - Use output of one process as input to another process
  • ? (question mark) - Single character wildcard
  • * (asterisk) - Multiple character wildcard
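
As a rough illustration, a script could check a single file or folder name against the nine reserved characters listed above before trying to create it. This is a sketch only: the helper name reserved_chars_in is made up, it looks at one name component rather than a full path, and it ignores the other naming rules on the linked Microsoft page.

```python
# The nine reserved characters from the list above.
RESERVED = set('<>:"/\\|?*')


def reserved_chars_in(name):
    """Return the reserved characters found in name, in order of appearance."""
    return [ch for ch in name if ch in RESERVED]


if __name__ == "__main__":
    for candidate in ("report.txt", "a<b>.txt", "foo*"):
        print(candidate, reserved_chars_in(candidate))
```

An empty result means the name contains none of the reserved characters.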
0

This confusing error may take many forms, such as:

  • Error code 0x80070570
  • Error code 0x570
  • Error code 1392 (570 in hex)
  • ERROR_FILE_CORRUPT
  • Message "The file or directory is corrupted and unreadable"
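
The first four items are the same value in different notations. A short Python sketch of the arithmetic, assuming the usual HRESULT_FROM_WIN32 packing (severity bit set, facility 7 for Win32, error code in the low 16 bits):

```python
ERROR_FILE_CORRUPT = 1392  # decimal Win32 error code
FACILITY_WIN32 = 7


def hresult_from_win32(code):
    """Pack a positive Win32 error code into HRESULT form."""
    return 0x80000000 | (FACILITY_WIN32 << 16) | (code & 0xFFFF)


if __name__ == "__main__":
    print(hex(ERROR_FILE_CORRUPT))                      # 0x570
    print(hex(hresult_from_win32(ERROR_FILE_CORRUPT)))  # 0x80070570
```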

In all cases, this means that the file or folder could not be accessed, although it exists in the NTFS Master File Table (MFT).

The other answers have explained well that some characters are illegal within a file name.

If one manages to create a file whose name contains these illegal characters, usually by using Linux, where these characters are legal, then we have an anomaly: on the one hand the file or folder exists in the file table, but on the other hand any attempt to open it is refused because its name doesn't pass the verification checks.

Faced with this contradiction, the Windows kernel gives up and returns the above error code, saying that something is corrupted, and it is up to the user to fix the bad file-entry.

This error message could have been worded more clearly, but what it means is that there is a contradiction between the contents of the file-table and the data on the disk, which for Windows means "corruption".

The message doesn't necessarily mean that the folder containing a file with such a name is corrupted. This generic message will be issued for a file and also for a folder whose name contained illegal characters.

harrymc
  • 498,455