103

Is there a tool to split a large text file (9 GB) into smaller files, so that I can open them and look through them?

Is there anything usable from the command line that comes with Windows (XP)?

Or what's the best way to split it? Can I use 7-Zip to create separate volumes and then unzip one of them separately? Will it be readable or does it need all the other parts to unzip into the big file again?


I put together a quick 48-line Python script that splits the large file into 0.5 GB files, which are easy to open even in Vim. I just needed to look through data towards the end of the log (yes, it is a log file). Each record is split across multiple lines, so grep would not do.
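The script itself is not reproduced here. Purely as an illustration of the approach (not the original script), a minimal Python sketch might look like the following. The function name `split_by_size`, the `.partNNN` naming, and the choice to break only at line ends are all assumptions.

```python
import os

def split_by_size(path, chunk_bytes=512 * 1024 * 1024, out_dir=None):
    """Split `path` into numbered parts of at most `chunk_bytes` each,
    breaking only at line ends. Returns the list of part paths.
    (Hypothetical helper; names and naming scheme are illustrative.)"""
    out_dir = out_dir or os.path.dirname(os.path.abspath(path))
    base = os.path.basename(path)
    parts = []
    out = None
    written = 0
    # Binary mode: no decoding and no newline translation, so the parts
    # concatenate back to exactly the original bytes.
    with open(path, "rb") as src:
        for line in src:  # streams one line at a time, never the whole file
            # Start a new part when the current one would overflow.
            # A single line longer than chunk_bytes still gets a part of its own.
            if out is None or (written > 0 and written + len(line) > chunk_bytes):
                if out is not None:
                    out.close()
                part_path = os.path.join(out_dir, "%s.part%03d" % (base, len(parts) + 1))
                out = open(part_path, "wb")
                parts.append(part_path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return parts
```

Calling `split_by_size("big.log")` would write `big.log.part001`, `big.log.part002`, and so on next to the input. Because it only breaks at line ends, a multi-line record can still straddle two parts.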

stefanB

11 Answers

51

There is a freeware Windows file splitter called HJSplit.

It is available here. The website claims it can split files of any type and size, but 9 GB is a big file.

pavium

44

The GNU Core Utils package (available here for Windows) includes the Split utility.

The --help documentation is as follows:

Usage: split [OPTION] [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
size is 1000 lines, and default PREFIX is `x'.  With no INPUT, or when INPUT
is -, read standard input.

Mandatory arguments to long options are mandatory for short options too.
  -a, --suffix-length=N   use suffixes of length N (default 2)
  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of lines per output file
  -d, --numeric-suffixes  use numeric suffixes instead of alphabetic
  -l, --lines=NUMBER      put NUMBER lines per output file
      --verbose           print a diagnostic to standard error just
                          before each output file is opened
      --help     display this help and exit
      --version  output version information and exit

SIZE may have a multiplier suffix: b for 512, k for 1K, m for 1 Meg.

For example, to split input.txt into 100 MB chunks, splitting only at the ends of lines:

split input.txt -C 100m

will give you output files named xaa, xab, xac, etc.
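To get a feel for how many pieces to expect and what they will be called, here is a small Python sketch of the default naming scheme and the arithmetic. It assumes the `m` suffix means 1,048,576 bytes and reads "9 GB" as 9 GiB; `split_names` and `min_pieces` are illustrative helpers, not part of split.

```python
import itertools
import math
import string

def split_names(prefix="x", suffix_length=2):
    """Yield names in split's default alphabetic order: xaa, xab, ..., xzz."""
    for letters in itertools.product(string.ascii_lowercase, repeat=suffix_length):
        yield prefix + "".join(letters)

def min_pieces(total_bytes, piece_bytes):
    """Lower bound on the number of pieces. With -C there may be more,
    because each piece ends at a line boundary and is rarely filled exactly."""
    return math.ceil(total_bytes / piece_bytes)

MIB = 1024 * 1024              # assumption: 'm' means 1,048,576 bytes
total = 9 * 1024 * MIB         # assumption: "9 GB" taken as 9 GiB
n = min_pieces(total, 100 * MIB)
names = list(itertools.islice(split_names(), n))
print(n, names[0], names[-1])  # prints: 93 xaa xdo
```

Two-letter suffixes allow 26 × 26 = 676 names, so about 93 pieces fits comfortably; for much smaller pieces, a longer suffix via -a would be needed.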

ZygD
Flyto

33

You can use 7-Zip to cut a text file into segments of a given size (e.g. 100 MB segments out of a 1.5 GB log file).

The key options are:
  • Use "Store" as opposed to "Compress"
  • Use "Split to volumes"

You should be able to see text in .001 (.nnn) files.


bummi
Mehul

15

Another is GSplit. According to their site it can split very large files (larger than 4 GB); since it gets past the 4 GB limit, I guess it can handle 9 GB as well.

But, another thing—you say you want to split it into smaller parts, so you can open it up and look at it. That sounds like a very big file, perhaps a log file.

In any case, for opening large text files, may I recommend EmEditor. They claim it can open very large files (up to approx. 250 GB), and I've used it in the past for files up to 2 GB. I think it may be a better solution than splitting.
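In the same spirit of not splitting at all: if you only need to look at one region of the file, such as the end of a log, a few lines of Python can read just that window. This is a minimal sketch; `read_window` and `tail_text` are hypothetical helper names, and UTF-8 is an assumed encoding.

```python
import os

def read_window(path, offset, length=64 * 1024):
    """Return up to `length` bytes of `path` starting at `offset`.
    A negative offset counts back from the end of the file."""
    size = os.path.getsize(path)
    if offset < 0:
        offset = max(size + offset, 0)
    with open(path, "rb") as f:
        f.seek(offset)            # jumps directly; nothing before it is read
        return f.read(length)

def tail_text(path, length=64 * 1024, encoding="utf-8"):
    """Decode the last `length` bytes, dropping the first (probably partial) line."""
    data = read_window(path, -length, length)
    text = data.decode(encoding, errors="replace")
    # Only drop the first line if the window started mid-file.
    if os.path.getsize(path) > length:
        text = text.split("\n", 1)[-1]
    return text
```

For example, `print(tail_text("big.log", 1024 * 1024))` would show roughly the last megabyte, starting at a line boundary, without touching the rest of the file.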

Rook

9

Check out Large Text File Viewer, it's great for things like this. Most archivers and splitters will cut the file into pieces that cannot be read properly on their own; you need to extract them all to get the file back.


Large Text File Viewer is free and portable.

Gareth

7

You can use 7-Zip itself to split the files. (You can save as a .zip or .7z format.) When you go to create the archive there is an option called "Split volume, bytes". Just select how large you want the chunks.

And yes, you can unzip them individually if you wish.

Split files in 7-Zip

Felix

4

There's an online tool that splits text files, if anyone is looking to split files quickly: http://www.textfilesplitter.com

Works great for me, and it splits files respecting lines, which is what I was looking for. It also says it's all HTML5 and client-side, so it's safe to use. I'm not sure how big a file it can handle, but I think it depends on your machine's RAM.

Joe One

2

Splitting files is also a function of Total Commander, the tool I can't do my work without.

Get your 30-day trial here: https://www.ghisler.com/ Licenses are dirt-cheap, concurrent and permanent.

In Total Commander, highlight the file you want to split. Select [file][split file] from the menu. In the pop-up, select your target-directory and "bytes per file". Choose from: 1.44 MB, 1.2 MB, 720 K, 360 K, 100 MB, 250 MB, 650 MB or 700 MB. Press OK and watch the magic happen...

0

The idea of seeing part of the file before deciding what to do with it is for me the best option.

The Large Text Viewer app can be installed on Windows through the Microsoft Store, and it offers an option to cut the file into chunks of a given size. It may well be that it uses the same editor previously mentioned (behind the scenes), but the option to install it from a known source is better IMHO than the alternative links offered. It worked great for me.

The only issue with splitting by size is that it does not necessarily break the file at a good place, so you may have to edit the pieces to capture the exact content you want.
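If the records have a recognizable first line, one way around this is to allow a break only where a new record begins. The Python sketch below is a rough illustration under a loud assumption: that every record starts with a line beginning with a date such as 2009-08-26. The pattern `RECORD_START` and the function `split_on_records` are hypothetical and would need adapting to the real log format.

```python
import re

# Assumption: a new record starts on a line that begins with an ISO-style date.
RECORD_START = re.compile(rb"^\d{4}-\d{2}-\d{2}")

def split_on_records(path, chunk_bytes, is_start=RECORD_START.match):
    """Write path.0001, path.0002, ... and begin a new part only at a line
    that starts a record, once the current part has reached chunk_bytes.
    Parts can therefore run slightly over chunk_bytes."""
    parts, out, written = [], None, 0
    with open(path, "rb") as src:
        for line in src:
            if out is None or (written >= chunk_bytes and is_start(line)):
                if out is not None:
                    out.close()
                part = "%s.%04d" % (path, len(parts) + 1)
                out = open(part, "wb")
                parts.append(part)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return parts
```

The trade-off is that the pieces are no longer exactly the requested size, but no multi-line record is cut in half.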

Leo

0

I have found the program ffsj very useful. There doesn't seem to be a homepage around currently. But there is a download page here. Be careful with the download clicks, as they try to get you to download additional software, as well.

0

Batch file (split_file_by_length.bat) to split a file into parts by length. I needed it to work around the SET /P input limit of 1023 characters.

@echo off
setlocal enableextensions enabledelayedexpansion

rem %~1 / %~2 strip surrounding quotes from the arguments
set "file=%~1"
set "max_length=%~2"
if "%max_length%"=="" set "max_length=1023"

if "%file%"=="" (
    echo Usage: %0 filename [max_chars_per_file]
    echo   filename: File to split
    echo   max_chars_per_file: Maximum characters per split file ^(default: 1023^)
    echo.
    echo Example: %0 myfile.txt 500
    goto :eof
)

echo Splitting %file% into chunks of %max_length% characters each...

rem Clean up any existing split files
for %%f in ("%file%_*") do (
    echo Deleting existing split file: %%f
    del "%%f"
)

set /a cnt=1
set /a char_count=0

for /F "usebackq tokens=*" %%A in ("%file%") do (
    set "line=%%A"
    call :process_line "!line!"
)

echo Split complete. Created !cnt! files.
goto :eof

:process_line
rem Don't use setlocal here - we need to modify global variables
set "remaining_line=%~1"

:process_chunk
if not defined remaining_line goto :eof

rem Get length of remaining line
call :get_length "!remaining_line!" line_length

rem Calculate how much space we have left in current file
set /a space_left=%max_length% - !char_count!

rem If line fits completely in current file
if !line_length! leq !space_left! (
    echo|set /p="!remaining_line!" >> "%file%_!cnt!"
    set /a char_count+=!line_length!
    goto :eof
)

rem If current file has no space or we need to split the line
if !space_left! leq 0 (
    rem Start new file
    set /a cnt+=1
    set /a char_count=0
    set /a space_left=%max_length%
    echo Starting new file: %file%_!cnt!
)

rem Take what fits in current file
if !line_length! leq !space_left! (
    rem Entire remaining line fits
    echo|set /p="!remaining_line!" >> "%file%_!cnt!"
    set /a char_count+=!line_length!
    set "remaining_line="
) else (
    rem Split the line
    set "chunk=!remaining_line:~0,%space_left%!"
    echo|set /p="!chunk!" >> "%file%_!cnt!"
    set "remaining_line=!remaining_line:~%space_left%!"
    set /a char_count=%max_length%
)

goto process_chunk

:get_length
rem Counts characters one at a time and returns the result in the variable named by %2
setlocal enabledelayedexpansion
set "str=%~1"
set /a len=0
:loop
if defined str (
    set /a len+=1
    set "str=!str:~1!"
    goto loop
)
endlocal & set "%~2=%len%"
goto :eof

BSalita