28

After calling unrar on a RAR archive on my 1 TB NTFS drive, I am left with a file that reportedly has a size of 86 TB.

Is it safe to delete such a file? What is the best way to safely get rid of it?

Of course, when removed, the file should be unlinked and any actual data belonging to other files should be left unaffected, but I really want to be sure on this one…

Edit 1:

du -h output in the directory containing the archive + the extracted file:

26M .
Edit 2: Progress and Findings
  • chkdsk F: /scan did find corruption in the extracted file.
  • chkdsk F: /f reported Deleting corrupt attribute record (0x80, "") (most likely belonging to this file, as the record segment number matches)
  • The file had a size of 0 afterwards
  • The 0-byte file could then be deleted
  • chkdsk F: /scan now finds no problems (the full sequence is summarized below)
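For reference, the sequence above boils down to the following (run from an elevated Command Prompt):

rem read-only scan: reported corruption in the extracted file
chkdsk F: /scan
rem fix pass: deleted the corrupt attribute record, leaving a 0-byte file that could then be removed normally
chkdsk F: /f
rem re-scan: no problems found afterwards
chkdsk F: /scan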
Edit 3: For the Curious

Before fixing the file with chkdsk F: /f:

  • the file could not be deleted with Double Commander (it reported the file as in use by none other than Double Commander itself, probably because it was trying to read the whole file)
  • the file could not be deleted with rm (on Windows) due to a permission denied error (even with cmd running as Administrator)
IsawU
  • 391

2 Answers

40

That's not necessarily erroneously bigger than your whole drive. Many filesystems including NTFS and ext4 support sparse files, in which areas consisting entirely of 'zero' bytes (00 00 00 00 ...) do not have any disk extents allocated to them – such files can easily have an "apparent" size larger than the filesystem, while the real data allocation (aka Windows "size on disk") is smaller.

You can check whether the file is sparse by comparing du with du --apparent-size, or by listing files with the ls -s/--size option, or by using xfs_io to list the individual extents:

$ echo Test > large.bin
$ truncate -s 10G large.bin
$ ls -l -s -h
4.0K -rw-r--r-- 1 root users 10G Jan 10 12:44 large.bin

$ du -h large.bin; du -h --apparent-size large.bin
4.0K    large.bin
10G     large.bin

$ xfs_io -r -c "fiemap -v" large.bin
large.bin:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..7]:          4118216..4118223     8 0x1
   1: [8..20971519]:   hole             20971512
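If the drive is attached to a Windows machine instead, a rough equivalent of the sparse check (the path below is only a placeholder for the extracted file) is:

rem ask NTFS whether the file carries the sparse flag
fsutil sparse queryflag F:\path\to\extracted-file

A sparse file will also show a huge "Size" but a small "Size on disk" in its Explorer Properties dialog.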

If in doubt, connect the filesystem to a Windows system (a VM might do) and run chkdsk X: /scan or chkdsk X: /f and let it verify that no files overlap – or, after deleting the file, use it to verify that the "free space bitmap" doesn't disagree with existing files.

In general, programs cannot create files that are actually larger than the filesystem can hold: even file archivers do not have that kind of direct disk access. If you do end up with an erroneously large file that isn't sparse, it can only be the fault of the OS filesystem driver, in which case nothing you do with that filesystem from that point onwards can be guaranteed to be safe (deletion is done by the same "bad" driver, after all). Use Windows' CHKDSK to verify the filesystem.

For NTFS on Linux, consider switching between the built-in ntfs3 and the earlier ntfs-3g drivers to verify whether they behave the same. Try also extracting your archive on a Linux native filesystem (e.g. on ext4) to see whether it also creates a large file.
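A minimal way to try both drivers, assuming the partition is /dev/sdX1 and /mnt/ntfs is an existing empty mountpoint (the in-kernel ntfs3 driver needs Linux 5.15 or newer):

$ sudo mount -t ntfs3 /dev/sdX1 /mnt/ntfs     # in-kernel ntfs3 driver
$ sudo umount /mnt/ntfs
$ sudo mount -t ntfs-3g /dev/sdX1 /mnt/ntfs   # FUSE-based ntfs-3g driver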

grawity
  • 501,077
10

In case there is a problem with the disk, I suggest as a first step ensuring that you have backups of the data on the disk.

As the disk is NTFS, it's best handled on Windows as follows.

As the second step, open an elevated Command Prompt (Run as administrator) and enter the following command:

chkdsk C:

If it finds any errors, the next step is to fix them using the command:

chkdsk /f C:

As the last step, if everything completes correctly, you could delete this file.

harrymc
  • 498,455