How to identify file by "File Record Segment"?

Question

How to actually identify files by "File Record Segment" addresses as reported by chkdsk?

There was a blackout and my drives' filesystems were corrupted. The hardware and operating system files are fine (I have checked thoroughly), but I did the stupid thing of running chkdsk /f X: first, and it processed 107855 data files.

I was able to gain access to System Volume Information folder and find the logs inside Chkdsk folder there. The log is 11.2MiB and I have uploaded it to Google Drive.

I am trying to programmatically scrape the log file and identify all the files affected by it. Writing the program is extremely easy for me; I know what I am doing.

I want to check the affected files, delete the corrupted ones and possibly redownload them. The drive contains tebibytes of downloaded pirated games and tebibytes of data I created. I don't want to do data recovery and such.

Problem is, the terminology used by chkdsk is inconsistent and confusing:

Checking file system on D:
The type of the file system is NTFS.
Chkdsk cannot run because the volume is in use by another
process.  Chkdsk may run if this volume is dismounted first.
ALL OPENED HANDLES TO THIS VOLUME WOULD THEN BE INVALID.
Would you like to force a dismount on this volume? (Y/N) Volume dismounted.  All opened handles to this volume are now invalid.
Volume label is Tremillia.
Stage 1: Examining basic file system structure ...
Attribute record of type 0x80 and instance tag 0x4 is cross linked
starting at 0x9383fc2 for possibly 0xfe clusters.
Some clusters occupied by attribute of type 0x80 and instance tag 0x4
in file 0x46c4 is already in use.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x46C4.
Attribute record of type 0x80 and instance tag 0x4 is cross linked
starting at 0x93840c0 for possibly 0x95 clusters.
Some clusters occupied by attribute of type 0x80 and instance tag 0x4
in file 0x46c5 is already in use.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x46C5.
Attribute record of type 0x80 and instance tag 0x4 is cross linked
starting at 0x9384489 for possibly 0xc5 clusters.
Some clusters occupied by attribute of type 0x80 and instance tag 0x4
in file 0x4706 is already in use.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x4706.
Attribute record of type 0x80 and instance tag 0x4 is cross linked
starting at 0x938454e for possibly 0x90 clusters.
Some clusters occupied by attribute of type 0x80 and instance tag 0x4
in file 0x470b is already in use.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x470B.
Attribute record of type 0x80 and instance tag 0x3 is cross linked
starting at 0x93799ea for possibly 0x40 clusters.
Some clusters occupied by attribute of type 0x80 and instance tag 0x3
in file 0x4f3f is already in use.
...

There are 3 types of hexadecimal addresses, termed "file record segment", "file" and "cluster". Right away you can see the cluster address starts very big and is way larger than addresses of "file" and "file record segment".

And then you can see "file" and "file record segment" addresses are of similar range and the same addresses are referred to as both "file" and "file record segment". So it is evident that "file" and "file record segment" are the same thing.
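Since "file" and "file record segment" carry the same number, a scraper can collect both spellings into one set. The following is a minimal sketch, assuming the log lines look exactly like the excerpt above; the function name and the regular expression are illustrative, and the encoding of the log file on disk is not shown here, so it should be checked before reading the real file.

```python
import re

# Matches both spellings chkdsk uses for the same number:
#   "in file 0x46c4 is already in use."
#   "from file record segment 0x46C4."
FRS_PATTERN = re.compile(r"\b(?:file record segment|in file)\s+(0x[0-9A-Fa-f]+)")

def extract_frs_numbers(log_text):
    """Return the sorted, de-duplicated FRS numbers mentioned in a chkdsk log."""
    return sorted({int(m.group(1), 16) for m in FRS_PATTERN.finditer(log_text)})

sample = """\
Some clusters occupied by attribute of type 0x80 and instance tag 0x4
in file 0x46c4 is already in use.
Deleting corrupt attribute record (0x80, "")
from file record segment 0x46C4.
Some clusters occupied by attribute of type 0x80 and instance tag 0x4
in file 0x46c5 is already in use.
"""

print([hex(n) for n in extract_frs_numbers(sample)])  # ['0x46c4', '0x46c5']
```

Attribute types and instance tags such as 0x80 and 0x4 are not captured, because the pattern only accepts a hex number that directly follows one of the two phrases.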

I tried to Google how to identify a file by file record segment, but most results are completely irrelevant. The top results are useless tutorials made by questionable software companies to sell their products: stuff like "how to fix", "what to do when file record segment is unreadable"...

Google is completely useless and the results are not what I want at all; I want to identify the files by "file record segment" number.

The only thing relevant I have found is this post.

Well, I finally had a chance to try out some Linux-based ntfs utilities. I am not sure whether they are from ntfs-3g, ntfsprogs, or ntfsutils, but anyone who has a favorite linux distro should be able to fire it up and get the information I needed pretty easily. The numbers are inode numbers, and two commands were very helpful. This one provided what I needed:

ntfscluster -I <inode # / file record segment #>

It included the full path to the file in question and very little other information. This command provided a whole lot of information I didn't need:

ntfsinfo -i <inode # / file record segment #>

It didn't provide the path, but did provide the inode number of the parent directory, so you could reverse engineer the path by using this command over and over.

ntfscluster is a Linux utility. Searching for "ntfscluster windows" turns up results about the NTFS filesystem and its cluster size, with the message "Did you mean: ntfs cluster windows". The only relevant hit is the third result, NtfsProgs for Windows - GnuWin32 - SourceForge. It is an ancient program from 2004, and the download is missing libintl3.dll and libiconv2.dll. Even after pasting the .dlls into the folder, it doesn't work:

PS C:\Users\Xeni> C:\Users\Xeni\Downloads\ntfsprogs-1.9.0-bin\bin\ntfscluster.exe -I 154681282 D:
Failed to set locale, using default '(null)'.
win32_io.c(199): ntfs_device_win32_open The handle is invalid.
 ioctl failed
Couldn't mount device 'D:': Invalid argument

NTFSInfo from SysInternals doesn't seem to provide the functionality to identify file by file record segment.

After searching for a replacement I found nfi.exe which is even more ancient, it is from 1999! It does work:

PS C:\Users\Xeni> nfi.exe D: 0x937950
NTFS File Sector Information Utility.
Copyright (C) Microsoft Corporation 1999. All rights reserved.
***Logical sector 9664848 (0x937950) on drive D is in file number 312188.
\Games\SPACEE~1\data\textures\galaxies\GALAXI~1.PAK
    $DATA (nonresident)
        logical sectors 8738944-10045631 (0x855880-0x9948bf)

But it takes logical-sector-number as argument instead of FILE RECORD SEGMENT:

PS C:\Users\Xeni> nfi.exe /?
NTFS File Sector Information Utility.
Copyright (C) Microsoft Corporation 1999. All rights reserved.
Dumps information about an NTFS volume, and optionally determines
which volume and file contains a particular sector.
Usage: D:\CliExec\nfi.exe drive-letter [logical-sector-number]
        Drive-letter can be a single character or a character followed
        by a colon (i.e., C or C: are acceptable).

        Logical-sector-number is a decimal or 0x-prefixed hex
        number, specifying a sector number relative to the volume
        whose drive letter is given by drive-letter. If not
        specified, then information about every file on the volume
        is dumped.

   D:\CliExec\nfi.exe NT-device-path physical-sector-number

        Determines which volume a given physical sector on a drive is
        within, and then which file on the volume it is in.

        NT-device-path is the NT-style path to a physical device.
        It must not include a partition specification.

        Physical-sector-number is a decimal or 0x-prefixed hex
        number, specifying a sector number relative to the physical
        drive whose device path is given by NT-device-path.

    D:\CliExec\nfi.exe full-win32-path

        Dumps information about a particular file. full-win32-path
        must start with a drive letter and a colon.

I thought logical-sector-number and file-record-segment are the same thing, but it seems they aren't:

PS C:\Users\Xeni> fsutil fsinfo ntfsInfo D:
NTFS Volume Serial Number :        0xfc2bea5043264555
NTFS Version      :                3.1
LFS Version       :                2.0
Total Sectors     :                7,810,824,157  (  3.6 TB)
Total Clusters    :                  976,353,019  (  3.6 TB)
Free Clusters     :                  275,935,695  (  1.0 TB)
Total Reserved Clusters :                 36,875  (144.0 MB)
Reserved For Storage Reserve :                 0  (  0.0 KB)
Bytes Per Sector  :                512
Bytes Per Physical Sector :        512
Bytes Per Cluster :                4096
Bytes Per FileRecord Segment    :  1024
Clusters Per FileRecord Segment :  0
Mft Valid Data Length :            2.14 GB
Mft Start Lcn  :                   0x0000000000000002
Mft2 Start Lcn :                   0x000000000a31f5fd
MFT Zone Size  :                   200.13 MB
Max Device Trim Extent Count :     0
Max Device Trim Byte Count :       0
Max Volume Trim Extent Count :     62
Resource Manager Identifier :      E548A579-8A28-11EB-BA87-F4B52033630B

It seems a file record segment is 1024 bytes and a logical sector is 512 bytes, so maybe I need to double the number to get the actual file referenced. But that may very well be wrong, as the cluster numbers are very high.

So how do I actually identify the file by file record segment?

Update

I used some Google-fu and came up with this atrocity of a search keyword: "file record segment" -unreadable -fix -corrupt and I have finally found something useful:

file record segment: A record in the master file table that contains attributes for a specific file on an NTFS volume. The file record segment is always 1,024 bytes (1 kilobyte) in size.

Just two sentences containing crucial information. That is it. The result is the top result and all other results are irrelevant.

Now I need to find a way to programmatically query Master File Table and hope that I don't corrupt it accidentally, which amounts to more futile Google searching...

score 1 · Accepted Answer · 2024-01-22T12:57:11.967

I see you already found that the LBA sector number has nothing to do with the file record segment.

So:

There are 3 types of hexadecimal addresses, termed "file record segment", "file" and "cluster".

The MFT basically consists of numbered 1024-byte records. The record number is your file record segment (FRS) number, and it is the same number chkdsk calls "file".

(source: https://flatcap.github.io/linux-ntfs/ntfs/concepts/file_record.html)
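Because the records are fixed-size and numbered, the position of a record inside the $MFT data stream is plain arithmetic. Here is a small sketch using the 1024-byte record size and 4096-byte cluster size reported by fsutil in the question; the function name is made up for illustration. Note that the result is an offset within the MFT's own data (a virtual cluster number, VCN), not a position on the volume: if the MFT is fragmented, the VCN still has to be mapped to an on-disk cluster through the data runs of FRS 0.

```python
RECORD_SIZE = 1024        # "Bytes Per FileRecord Segment" from fsutil
BYTES_PER_CLUSTER = 4096  # "Bytes Per Cluster" from fsutil

def locate_in_mft(frs, record_size=RECORD_SIZE, bytes_per_cluster=BYTES_PER_CLUSTER):
    """Return (byte offset in $MFT, VCN within $MFT, byte offset inside that cluster)."""
    offset = frs * record_size
    vcn, within = divmod(offset, bytes_per_cluster)
    return offset, vcn, within

# FRS 0x46C4 from the chkdsk log: four 1024-byte records fit in one 4096-byte cluster.
print(locate_in_mft(0x46C4))  # (18550784, 4529, 0)
```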

A cluster is a group of sectors allocated to a file. A sector is addressed by LBA, counting from 0 at the first sector of the physical drive. Clusters are counted from 0 at the first sector of the volume (file system). So to convert a cluster number to an LBA sector address: LBA = (LBA offset of the volume) + cluster × (sectors per cluster).
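The conversion can be sketched as a one-line function. The sector and cluster sizes come from the fsutil output in the question (4096 / 512 = 8 sectors per cluster); the volume's starting LBA is not given anywhere in the question, so it is left as a parameter, and the 2048 used in the second call is purely a made-up value. With a start of 0 the result is a sector number relative to the volume, which is how the nfi.exe help text describes its logical-sector-number argument.

```python
def cluster_to_lba(cluster, sectors_per_cluster, volume_start_lba=0):
    """Convert a volume-relative cluster number to a sector address.

    With volume_start_lba=0 the result is relative to the volume;
    pass the partition's first sector to get a physical-drive LBA.
    """
    return volume_start_lba + cluster * sectors_per_cluster

SECTORS_PER_CLUSTER = 4096 // 512  # 8, from "Bytes Per Cluster" / "Bytes Per Sector"

# First cross-linked cluster reported in the chkdsk log.
print(hex(cluster_to_lba(0x9383fc2, SECTORS_PER_CLUSTER)))  # 0x49c1fe10

# Same cluster with a hypothetical partition start at sector 2048.
print(cluster_to_lba(0x9383fc2, SECTORS_PER_CLUSTER, volume_start_lba=2048))
```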

I approach this from a file recovery point of view because that's what I do. I use CreateFile to open a drive and interpret the boot sector to find the MFT (the boot sector gives the MFT's first cluster as well as the cluster factor, i.e. sectors per cluster). The MFT is self-referencing, so the first FRS gives us all clusters allocated to the MFT. From there I 'parse' file record segments and stuff key info into an array. FRS 197571 is then simply array element 197571, from which you can determine the filename.
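The boot sector step might look roughly like this in Python. This is a sketch, not the code I actually use: the field offsets are the standard NTFS boot sector layout, the function name is illustrative, and the demo runs on a synthetic 512-byte sector filled with the values fsutil reported in the question rather than on a real volume. It assumes the common case where the sectors-per-cluster byte is a plain count. Reading a real volume (for example by opening \\.\D: read-only with administrator rights) is left out on purpose.

```python
import struct

def parse_ntfs_boot_sector(boot):
    """Decode the fields needed to find the start of the MFT."""
    if boot[3:11] != b"NTFS    ":
        raise ValueError("not an NTFS boot sector")
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", boot, 0x0B)
    total_sectors, mft_lcn, mftmirr_lcn = struct.unpack_from("<QQQ", boot, 0x28)
    (raw_record_size,) = struct.unpack_from("<b", boot, 0x40)
    bytes_per_cluster = bytes_per_sector * sectors_per_cluster
    # Positive value: clusters per record. Negative value: record is 2**(-value) bytes.
    if raw_record_size > 0:
        record_size = raw_record_size * bytes_per_cluster
    else:
        record_size = 1 << -raw_record_size
    return {
        "bytes_per_cluster": bytes_per_cluster,
        "total_sectors": total_sectors,
        "mft_lcn": mft_lcn,
        "mftmirr_lcn": mftmirr_lcn,
        "record_size": record_size,
        # Byte offset of FRS 0 from the start of the volume.
        "mft_byte_offset": mft_lcn * bytes_per_cluster,
    }

# Synthetic boot sector built from the fsutil numbers in the question.
demo = bytearray(512)
demo[3:11] = b"NTFS    "
struct.pack_into("<HB", demo, 0x0B, 512, 8)
struct.pack_into("<QQQ", demo, 0x28, 7810824157, 2, 0x0A31F5FD)
struct.pack_into("<b", demo, 0x40, -10)  # 2**10 = 1024-byte records

info = parse_ntfs_boot_sector(bytes(demo))
print(info["mft_byte_offset"], info["record_size"])  # 8192 1024
```

Only the first run of the MFT is guaranteed to start at that offset. The remaining runs have to be read from the $DATA attribute of FRS 0, which is the self-referencing step described above.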

Of course this gives you only a filename. To determine the full path you need to read the parent reference in the $FILE_NAME attribute, go to that FRS, determine its filename, and so on until you reach the root directory.
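Once the parsed records are in an array or dictionary, walking the parent chain is short. A minimal sketch, assuming each entry holds the name and the 64-bit parent file reference taken from the $FILE_NAME attribute: the low 48 bits of a file reference are the FRS number, the high 16 bits are a sequence number, and the root directory is FRS 5. The table below is entirely made up for illustration; only the file number 312188 echoes the nfi.exe output in the question.

```python
ROOT_FRS = 5               # the NTFS root directory lives in file record 5
FRS_MASK = (1 << 48) - 1   # low 48 bits of a file reference = FRS number

def resolve_path(frs, records):
    """Build a full path by following parent references up to the root.

    records maps FRS number -> (name, parent_file_reference).
    """
    parts = []
    seen = set()
    current = frs
    while current != ROOT_FRS:
        if current in seen:
            raise LookupError(f"parent loop at FRS {current}")
        if current not in records:
            raise LookupError(f"FRS {current} was not parsed")
        seen.add(current)
        name, parent_ref = records[current]
        parts.append(name)
        current = parent_ref & FRS_MASK  # drop the sequence number
    return "\\" + "\\".join(reversed(parts))

# Hypothetical parsed records: names, parents and sequence numbers are invented.
records = {
    312188: ("example.pak", (3 << 48) | 9001),
    9001: ("textures", (1 << 48) | 9000),
    9000: ("Games", (1 << 48) | ROOT_FRS),
}

print(resolve_path(312188, records))  # \Games\textures\example.pak
```

The loop check matters on a damaged volume, where a corrupted parent reference could otherwise send the walk in circles.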