Questions tagged [gnu-parallel]

GNU Parallel is a command-line tool that allows one to run multiple commands in parallel.

From the website:

GNU Parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.

GNU Parallel was written by Ole Tange, who also happens to have a Super User account.

The documentation is available online.

50 questions
12 votes, 3 answers

How can I install GNU Parallel alongside Moreutils?

Homebrew has formulae for both moreutils and GNU parallel. GNU Parallel conflicts with Moreutils, since it also ships a binary called parallel, which is just less useful. However, I'd still like to install both formulae at the same time. How can I do…
slhck
  • 235,242
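One hedged possibility, assuming Homebrew will still install the conflicting formula as long as it stays unlinked: keep GNU parallel linked as parallel and reach Moreutils' binary through its keg path when you need it.

    # GNU parallel owns the `parallel` name; moreutils is installed but left unlinked
    brew install parallel
    brew install moreutils          # may have to remain unlinked because of the conflict

    # call the moreutils version explicitly via its opt prefix
    "$(brew --prefix moreutils)/bin/parallel" -j 3 -- ls pwd 'echo hi'
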
10 votes, 2 answers

How to use GNU parallel with gunzip

I have a directory full of .gz files, and I want to expand each archive in parallel with GNU parallel. However, I haven't managed to get it to work. I tried parallel 'gunzip {}' ::: `ls *.gz` and parallel gunzip `ls *.gz` with no results; bash tells me: /bin/bash:…
gc5
  • 340
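For reference, a minimal sketch of the usual pattern: let the shell glob supply the file names to ::: instead of wrapping ls in backticks.

    # one gunzip per job slot, one .gz file per job
    parallel gunzip ::: *.gz
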
5 votes, 2 answers

Suppressing stderr in GNU Parallel

I'm using GNU Parallel to concurrently run a command several thousand times. To get logs of the execution I'm using --files and --results. To get a nice progress bar while it's running I'm using --eta and --progress. Now, my problem is that while…
Jasiu
  • 171
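One hedged approach, assuming the goal is simply to drop each job's stderr while keeping the --results logs and the --eta display: put the redirection inside the quoted job command so it is applied per job (my_command and its arguments are placeholders here).

    # stderr of every job goes to /dev/null; stdout is still captured under outdir/
    parallel --results outdir --eta 'my_command {} 2>/dev/null' ::: arg1 arg2 arg3
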
4 votes, 3 answers

Gnu parallel and ack not playing nicely due to stdin, pipe

I'm trying to use parallel and ack together to do some searching in parallel. However, ack seems to insist on using stdin if it finds itself in a pipe, even if you give it files to search: $ echo hello > test.txt $ ack hello test.txt hello $ echo…
mgalgs
  • 2,472
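A common workaround, sketched on the assumption that ack only falls back to stdin when stdin looks like a pipe: give each job an empty stdin so ack sticks to the file arguments.

    # search every .txt file for "hello" without ack grabbing the pipe
    parallel "ack hello {} < /dev/null" ::: *.txt
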
4 votes, 2 answers

How do I use GNU split's "filter" option with GNU parallel?

I am trying to split a number of huge gz files into N-line gzipped chunks. To demonstrate, let us consider the following: seq 100 | gzip > big_file0.gz I can split this into multiple 10-line compressed files as follows: zcat big_file0.gz…
saffsd
  • 143
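As a point of comparison, a sketch of doing the chunking entirely inside GNU parallel rather than through split's --filter, assuming 10-line chunks and numbered output files are acceptable:

    # --pipe splits stdin into records; -N10 hands each job 10 lines, {#} is the job number
    zcat big_file0.gz | parallel --pipe -N10 'gzip > chunk_{#}.gz'
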
4 votes, 2 answers

GNU "parallel --pipe" doesn't process stdin by lines

I'm super confused about how to use GNU parallel to pass stdin to the job command. I have what I imagined to be a really common use case. I have some process xxd that does something with stdin and outputs to stdout. I have some way to generate or…
ThorSummoner
  • 1,240
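For context, a minimal illustration of line-at-a-time behaviour: by default --pipe hands each job a large block (roughly 1 MB), so it is -N1 that forces one record, i.e. one line, per job (wc -l stands in for the real filter).

    # prints "1" three times, confirming each job received exactly one line on stdin
    printf 'foo\nbar\nbaz\n' | parallel --pipe -N1 wc -l
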
4 votes, 2 answers

GNU Parallel - global variables and function

I have this script: GLOBAL_VAR="some global value" function test { echo $1 echo ${GLOBAL_VAR} } export -f test parallel --jobs 5 --linebuffer test ::: "${files[@]}" How can I have $GLOBAL_VAR visible from parallel?
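A sketch of the two usual fixes, assuming a reasonably recent GNU parallel: export the variable as well as the function, or use env_parallel, which copies the calling shell's variables and functions into the jobs.

    # option 1: exported variables are inherited by the child shells parallel starts
    export GLOBAL_VAR
    export -f test
    parallel --jobs 5 --linebuffer test ::: "${files[@]}"

    # option 2: env_parallel ships with GNU parallel
    # (it must be enabled first, e.g. by sourcing env_parallel.bash in your shell startup file)
    env_parallel --jobs 5 --linebuffer test ::: "${files[@]}"
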
3 votes, 1 answer

How can I force GNU parallel to carry out a command set sequentially?

GNU parallel is a powerful tool that I use to run many independent Bash commands as a single set in parallel. I would like to be able to run the same commands SEQUENTIALLY without significant changes to the command I use. I know there is a switch to…
Mr Purple
  • 455
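For reference, a sketch of the switch in question: limiting parallel to a single job slot makes it run the same command set strictly one at a time (my_command and the inputs are placeholders).

    # -j1 (--jobs 1) forces sequential execution without touching the command list
    parallel -j1 my_command ::: input1 input2 input3
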
3 votes, 1 answer

Using sed with parallel gives empty output when redirecting to file

I'm using the zsh shell. I am trying to use sed to substitute some text in many files, using parallel to speed up the process. When I tested this on one file and let the command output go to stdout I saw the expected result. When I tried to redirect…
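One hedged sketch of the usual remedy: an unquoted > is consumed by the calling shell and redirects parallel's own output, so either quote the whole job so the redirection happens inside each job, or avoid redirection with sed -i (the patterns and file names below are placeholders).

    # per-file output, redirection evaluated by each job's shell
    parallel "sed 's/old/new/g' {} > {}.out" ::: *.txt

    # or edit in place (GNU sed syntax assumed)
    parallel "sed -i 's/old/new/g' {}" ::: *.txt
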
3 votes, 1 answer

GNU Parallel hangs as one process is "sleeping"

I am running a command in parallel using GNU Parallel, which has two parameters as input, a directory and a conf file: parallel --gnu my_command ::: (ls -d dir*test) ::: properties.conf I run it on top of a multi-core CPU (24 cores) and…
Randomize
  • 543
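For reference, a hedged sketch of the documented way to combine two input sources, with {1} and {2} as positional placeholders (my_command stands in for the real command):

    # every directory matching dir*test is paired with properties.conf
    parallel --gnu my_command {1} {2} ::: dir*test ::: properties.conf
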
3 votes, 1 answer

Batch download of URLs from command line multithreaded

I have 100,000 URLs of small files to download. I would like to use 10 threads, and pipelining is a must. I concatenate the result to one file. My current approach is: cat URLS | xargs -P5 -- curl >> OUTPUT Is there a better option that will show…
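A hedged sketch of the GNU parallel equivalent, assuming the URL list lives in a file named URLS and the concatenated responses go to OUTPUT; --bar draws a progress bar on stderr, and parallel only prints a job's stdout once that job has finished, so downloads are not interleaved mid-file.

    # 10 downloads at a time, URLs read from stdin, bodies appended to one file
    parallel -j10 --bar 'curl -s {}' < URLS > OUTPUT
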
3 votes, 1 answer

GNU parallel does not split work evenly

My understanding is that the -X option should distribute arguments evenly among the jobs. Yet, I get a very skewed distribution: user@host:/tmp/ptest$ count() { > echo $# > } user@host:/tmp/ptest$ export -f count user@host:/tmp/ptest$ count…
Zoltan
  • 249
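As a hedged workaround when an even spread matters more than filling the command line, -n (--max-args) caps the number of arguments handed to each job; the count function is from the question's transcript and the figure of 25 is only illustrative.

    count() { echo $#; }
    export -f count
    # exactly 25 arguments per invocation: four equally sized jobs for 100 inputs
    parallel -n 25 count ::: $(seq 100)
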
3 votes, 2 answers

Achieve better compression for multiple gzipped files

I have several directories containing thousands of gzip files (overall we are talking about 1M files). Some of these files are corrupted and most of them are really small in size (a couple of KB). Almost all of them are highly similar in content,…
nopper
  • 131
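A hedged sketch of one preparatory step for this: weed out the corrupted archives in parallel before recompressing anything (gzip -t checks integrity without writing any output).

    # list every archive that fails an integrity check
    find . -name '*.gz' | parallel 'gzip -t {} 2>/dev/null || echo {}' > corrupted.txt
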
3 votes, 2 answers

Multiple reads from a txt file in bash (parallel processing)

Here is a simple bash script for checking HTTP status codes: while read url do urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "${url}" --max-time 5 ) echo "$url $urlstatus" >> urlstatus.txt done < $1 I am…
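A hedged translation of that loop into GNU parallel, assuming the URL list is the script's first argument (urls.txt below stands in for it); :::: reads the arguments from a file, and curl's own %{url_effective} avoids having to echo the URL separately.

    # 10 HEAD requests at a time; each job prints "<url> <status code>"
    parallel -j 10 'curl -o /dev/null --silent --head --write-out "%{url_effective} %{http_code}\n" --max-time 5 {}' :::: urls.txt >> urlstatus.txt
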
3 votes, 1 answer

GNU parallel removes escapes before space characters in the command

I'm currently testing GNU parallel to distribute a compare command across multiple servers using bash. In its most basic form this compare command takes two inputs to compare (Oracle database accessions) and requires an output filename via -o.…
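A hedged sketch of the usual remedy when parallel's own processing strips the escaping: -q (--quote) makes parallel quote the command it builds, so an argument containing spaces reaches the job as a single word (compare_tool and the accession names are placeholders).

    # without -q the space in the -o value would be split by the remote shell
    parallel -q compare_tool {1} {2} -o "{1} vs {2}.txt" ::: accA accB ::: accC accD
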