5

What I want to do is copy 500K files.

I want to copy within the server, from one location to another. It is mostly emails, so lots of small files.

It's only about 23 GB, but it takes very long (over 30 minutes and not done yet), and the Linux cp command also only uses 1 CPU.

So if I script it to use multiple cp processes, would that make it faster?

The system has 16 cores, 16 GB RAM, and 15K drives (15,000 RPM SATA).

What are other options?

I believe tarring and untarring would take even longer and won't use multiple cores.

Oliver Salzburg
  • 89,072

5 Answers

7

Your bottleneck is hard-drive speed. Multi-core can't speed this up.
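One way you could check this, assuming the sysstat package is installed (an assumption, not part of the original answer), is to watch per-device utilisation while the copy runs:

    # Extended per-device statistics, refreshed every second.
    # If %util sits near 100% while the CPUs stay mostly idle, the drive is the limit.
    iostat -x 1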

Pubby
  • 364
3

Copying a single large file is faster than copying lots of small files, as there is a lot of latency in the setup and teardown of each operation, and the disk and OS can do plenty of read-ahead with a single large file. So tarring it first would make the copy itself quicker, though once you factor in the time taken to tar, it may not speed things up much overall.

Note that you are only reading from a single disk, so parallelising your calls to the disk may actually slow things down as it tries to serve multiple files at the same time.
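As a rough sketch of the streaming-tar approach (assuming GNU tar; /src and /dst are hypothetical paths), you can pipe the tree through tar without writing an intermediate archive:

    # Read the source tree as a single stream and unpack it at the destination.
    tar -C /src -cf - . | tar -C /dst -xpf -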

Paul
  • 61,193
0

Although the question is quite old, I think the best way is to compress using a multi-core tool such as lbzip2 or pbzip2, transfer the compressed file, and then decompress it, again using multiple cores. You can find more about these commands on the Internet.
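For example, something along these lines (a sketch assuming GNU tar and an installed lbzip2; pbzip2 can be substituted, and /src, /dst, and the archive path are hypothetical):

    # Pack the tree and compress it on all cores.
    tar -C /src -cf - . | lbzip2 > /tmp/files.tar.bz2
    # Decompress on all cores and unpack at the destination.
    lbzip2 -dc /tmp/files.tar.bz2 | tar -C /dst -xf -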

Dharma
  • 103
0

Compression may halve the size of the data that needs to be written. If you can fully and efficiently utilize the cores, and most of the compression happens in fast memory, this could (theoretically) cut your write time almost in half. Writes are also usually slower than reads. Half is just a guess, of course; a lot depends on the type, size, and number of "small" files you are trying to compress. Large log files seem to compress best because they are all text with lots of spaces, whereas already-compressed image files will yield little if any improvement at all.

Just as with compilation, any terminal I/O from the copying program is extremely slow and should be redirected to a file, or, for pure speed, to null using the >& sequence. Null of course saves no error information and puts the onus on the user to ensure the file(s) actually got copied. This works best for a few large files, unless the file(s) can be verified by some other method.
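For instance, a small illustration of that redirection (the /src and /dst paths are hypothetical):

    # Verbose copy with all terminal output sent to a log file instead of the screen...
    cp -rv /src /dst >& copy.log
    # ...or discarded entirely for pure speed (no error information is kept).
    cp -rv /src /dst >& /dev/null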

0

Is it all in the same directory? There is a script that starts multiple cp processes: http://www.unix.com/unix-dummies-questions-answers/128363-copy-files-parallel.html

For a tree you need to adjust it.
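A similar effect can be sketched with xargs (assuming GNU cp and findutils; /src and /dst are hypothetical, and -P sets how many cp processes run in parallel):

    # Copy files from one flat directory using 4 cp processes, 100 files per batch.
    cd /src && find . -maxdepth 1 -type f -print0 | xargs -0 -n 100 -P 4 cp -t /dst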

ott--
  • 2,251