
My question is different because I need to merge the files into one first and then remove the duplicate lines from that file, which will be over 50 GB. I have several large .txt files of 10 GB+ each.

I want to merge them into one .txt file,

then remove all the duplicate lines from that combined .txt file, which will be around 50 GB or 100 GB.

So what can handle a file that large and remove the duplicates from it smoothly?

I need the fastest way, because I tried both Notepad++ and EmEditor and they become extremely slow with files this size, whether merging or removing duplicates, and take forever.

I have 12 GB of RAM.

1 Answer


If you are using Linux, you could do it like that:

cat aa.txt bb.txt | sort -u > newfile.txt

Here aa.txt is the first text file and bb.txt the second one.

sort sorts the combined input alphabetically, and the -u flag (see also https://stackoverflow.com/a/9377125/7311363) eliminates duplicate lines. With > newfile.txt you write the result to newfile.txt.
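
If memory is a concern (you have 12 GB of RAM and the combined file may reach 100 GB), GNU sort does not need to hold everything in memory: it spills sorted chunks to temporary files on disk and merges them. A minimal sketch, assuming GNU coreutils; the buffer size, thread count, and temporary directory (/bigdisk/tmp is a placeholder path) are illustrative values you would tune for your machine:

# LC_ALL=C        : byte-wise comparison, much faster than locale-aware sorting
# -u              : output only unique lines
# -S 4G           : use up to ~4 GB of RAM for the sort buffer
# -T /bigdisk/tmp : put temporary spill files on a disk with enough free space (placeholder path)
# --parallel=4    : sort with 4 threads
LC_ALL=C sort -u -S 4G -T /bigdisk/tmp --parallel=4 aa.txt bb.txt > newfile.txt

Note that sort accepts multiple input files directly, so the cat step is not strictly needed, and it will want scratch space in the -T directory roughly the size of its input.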

chloesoe