26

Sequential: for i in {1..1000}; do do_something $i; done - too slow

Parallel: for i in {1..1000}; do do_something $i& done - too much load

How can I run commands in parallel, but with no more than, say, 20 instances running at any moment?

Currently I usually use a hack like for i in {1..1000}; do do_something $i& sleep 5; done, but this is not a good solution.

Update 2: Converted the accepted answer into a script: http://vi-server.org/vi/parallel

#!/bin/bash

NUM=$1; shift

if [ -z "$NUM" ]; then
    echo "Usage: parallel <number_of_tasks> command"
    echo "    Sets environment variable i from 1 to number_of_tasks"
    echo "    Defaults to 20 processes at a time, use like \"MAKEOPTS='-j5' parallel ...\" to override."
    echo "Example: parallel 100 'echo \$i; sleep \`echo \$RANDOM/6553 | bc -l\`'"
    exit 1
fi

export CMD="$@";

true ${MAKEOPTS:="-j20"}

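cat << EOF | make -f - -s $MAKEOPTS
# seq, not {1..N}: make runs its shell function with /bin/sh, which may lack brace expansion
jobs=\$(shell seq 1 $NUM)
.PHONY: all \${jobs}

all: \${jobs}

\${jobs}:
	i=\$@ sh -c "\$\$CMD"
EOF

Note that the recipe line (the one starting with "i=") must begin with a tab character, not spaces, or make will reject it.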

ᔕᖺᘎᕊ
  • 6,393
Vi.
  • 17,755

7 Answers

17

GNU Parallel is made for this.

seq 1 1000 | parallel -j20 do_something

It can even run jobs on remote computers. Here's an example that re-encodes MP3 files to OGG on server2 and the local computer, running one job per CPU core:

parallel --trc {.}.ogg -j+0 -S server2,: \
     'mpg321 -w - {} | oggenc -q0 - -o {.}.ogg' ::: *.mp3

Watch an intro video to GNU Parallel here:

http://www.youtube.com/watch?v=OpaiGYxkSuQ

Gareth
  • 19,080
Ole Tange
  • 446
4

Not a bash solution, but you should use a Makefile, possibly with -l to not exceed some maximum load.

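NJOBS=1000

# seq rather than {1..N}: make runs $(shell ...) with /bin/sh, which may not support brace expansion.
jobs = $(shell seq 1 $(NJOBS))

# Mark the targets phony so that files named 1, 2, ... cannot make them look up to date.
.PHONY: all $(jobs)

all: $(jobs)

# The recipe line must start with a tab character.
$(jobs):
	do_something $@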

Then to start 20 jobs at a time do

$ make -j20

or to start as many jobs as possible without exceeding a load of 5

$ make -j -l5
2

One simple idea:

Check whether i modulo 20 is zero and, if so, execute the wait shell builtin before do_something.
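
A minimal sketch of that idea in bash. The names run_in_batches and do_something are made up for illustration (do_something is a hypothetical placeholder function standing in for the real command), and the task count and batch size are passed as arguments:

```shell
#!/bin/bash
# Hypothetical stand-in for the real command; replace with your own.
do_something() { echo "task $1"; }

# Run tasks 1..total in the background, in fixed batches of batch_size.
run_in_batches() {
    local total=$1 batch_size=$2 i
    for ((i = 1; i <= total; i++)); do
        # Before starting task 21, 41, ... wait for the previous batch to exit.
        if (( i > 1 && (i - 1) % batch_size == 0 )); then
            wait
        fi
        do_something "$i" &
    done
    wait    # collect the final, possibly partial, batch
}

run_in_batches 100 20
```

The price of this simplicity is that every batch waits for its slowest job before the next batch starts. With bash 4.3 or newer, wait -n (which returns as soon as any one job exits) could be used instead to keep the pool full.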

harrymc
  • 498,455
2

posting the script in the question with formatting:

#!/bin/bash

NUM=$1; shift

if [ -z "$NUM" ]; then
    echo "Usage: parallel <number_of_tasks> command"
    echo "    Sets environment variable i from 1 to number_of_tasks"
    echo "    Defaults to 20 processes at a time, use like \"MAKEOPTS='-j5' parallel ...\" to override."
    echo "Example: parallel 100 'echo \$i; sleep \`echo \$RANDOM/6553 | bc -l\`'"
    exit 1
fi

export CMD="$@";

true ${MAKEOPTS:="-j20"}

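cat << EOF | make -f - -s $MAKEOPTS
# seq, not {1..N}: make runs its shell function with /bin/sh, which may lack brace expansion
jobs=\$(shell seq 1 $NUM)
.PHONY: all \${jobs}

all: \${jobs}

\${jobs}:
	i=\$@ sh -c "\$\$CMD"
EOF

Note that the recipe line (the one starting with "i=") must begin with a tab character, not spaces.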

Vi.
  • 17,755
warren
  • 10,322
1
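for i in {1..1000}; do
    # Placeholder job: print the index, then sleep for 0 to 4 seconds.
    ( echo "$i"; sleep $((RANDOM % 5)) ) &
    # Block while 20 or more background jobs are still running.
    while (( $(jobs -r | wc -l) >= 20 )); do
        sleep 1
    done
done
wait    # let the last jobs finish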
msw
  • 3,779
1

You could use ps to count how many processes you have running, and whenever this drops below a certain threshold you start another process.

Pseudo code:

i = 1
MAX_PROCESSES=20
NUM_TASKS=1000
do
  get num_processes using ps
  if num_processes < MAX_PROCESSES
    start process $i
    $i = $i + 1
  endif
  sleep 1 # add this to prevent thrashing with ps
until $i > NUM_TASKS
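
One way the pseudocode might look in bash, as a sketch rather than a definitive implementation. Assumptions: do_something is a hypothetical placeholder function, run_throttled and count_running are names invented here, and the script remembers the PIDs it started so that ps -p counts only its own jobs rather than every process on the machine. Unlike the pseudocode, it sleeps only when the pool is full.

```shell
#!/bin/bash
# Hypothetical stand-in for the real command; replace with your own.
do_something() { sleep 0.2; echo "task $1"; }

# Print how many of the PIDs in the caller's "pids" array ps still lists.
count_running() {
    if (( ${#pids[@]} == 0 )); then
        echo 0
        return
    fi
    local list
    list=$(IFS=,; echo "${pids[*]}")    # comma-separated PID list for ps -p
    ps -o pid= -p "$list" | wc -l
}

# Run tasks 1..num_tasks with at most max_processes alive at once.
run_throttled() {
    local num_tasks=$1 max_processes=$2 i=1
    local -a pids=()
    while (( i <= num_tasks )); do
        if (( $(count_running) < max_processes )); then
            do_something "$i" &
            pids+=("$!")
            i=$((i + 1))
        else
            sleep 1    # poll interval, so ps is not called in a tight loop
        fi
    done
    wait               # let the last jobs finish
}

run_throttled 12 5
```

The PID list only grows in this sketch, which is fine for a thousand tasks; a longer-running script would probably want to prune finished PIDs from it.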
Paul R
  • 5,708
0

You can do it like this:

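threads=20
# PMS_HOME comes from the original poster's environment; fall back to /tmp if it is unset.
tempfifo=${PMS_HOME:-/tmp}/$$.fifo

# On Ctrl-C (SIGINT), close the semaphore descriptor and exit.
# Descriptor 9 is used instead of 1000, which can exceed the open-file limit on some systems.
trap "exec 9>&-; exit 0" INT
mkfifo "$tempfifo"
# Open the FIFO read-write on descriptor 9; the path itself can then be removed.
exec 9<>"$tempfifo"
rm -f "$tempfifo"

# Preload one token (an empty line) per allowed parallel job.
for ((i = 1; i <= threads; i++))
do
    echo >&9
done

for ((j = 1; j <= 1000; j++))
do
    # Take a token; this blocks while all tokens are in use.
    read -r -u 9
    {
        echo "$j"
        # Return the token when the job finishes.
        echo >&9
    } &
done

wait
echo "done!!!!!!!!!!"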

This uses a named pipe as a counting semaphore, so at most 20 subshells run in parallel at any time.

Hope it helps :)