Unix & Linux
shell-script parallelism
Updated Mon, 03 Oct 2022 02:24:51 GMT

Start 100 processes at a time in a bash script


In a bash script, I have a loop like this:

for i in {1..1000}
do
   foo "$i"
done

Here I call the function foo 1000 times with the parameter i.

I want to run these calls as parallel processes, but not all at once. What should I do?

If I write

for i in {1..1000}
do
   foo "$i" &
done

it starts all 1000 processes at once, which is not what I want.

Is there a way to make sure that there are always 100 processes running? When some processes finish, new ones should start, until all 1000 iterations are done. Alternatively, I could wait until all 100 have finished and then run another 100.




Solution

#!/bin/bash
jobs_to_run_num=10
simult_jobs_num=3
have_runned_jobs_cntr=0
check_interval=0.1
while ((have_runned_jobs_cntr < jobs_to_run_num)); do
    # count the background jobs that are currently running
    cur_jobs_num=$(wc -l < <(jobs -r))
    if ((cur_jobs_num < simult_jobs_num)); then
        ./random_time_script.sh &
        echo -e "cur_jobs_num\t$((cur_jobs_num + 1))"
        ((have_runned_jobs_cntr++))
    else
        # sleep is needed to reduce the frequency of the while loop,
        # otherwise the loop itself will eat a lot of processor time
        # by checking restlessly
        sleep "$check_interval"
    fi
done

A better way is to use wait -n. It removes the need to check the number of jobs on every iteration and to call sleep.

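#!/bin/bash
jobs_to_run_num=10
simult_jobs_num=3
have_runned_jobs_cntr=0
while ((have_runned_jobs_cntr < jobs_to_run_num)); do
    # once simult_jobs_num jobs have been started, wait for one
    # of them to finish before starting the next
    if ((have_runned_jobs_cntr >= simult_jobs_num)); then
        wait -n   # wait for any job to complete. New in bash 4.3
    fi
    ./random_time_script.sh &
    ((have_runned_jobs_cntr++))
    # For demonstration
    cur_jobs_num=$(wc -l < <(jobs -r))
    echo -e "cur_jobs_num\t${cur_jobs_num}"
done
wait   # wait for the remaining jobs to finish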

The idea comes from this question: "I want to process a bunch of files in parallel, and when one finishes, I want to start the next. And I want to make sure there are exactly 5 jobs running at a time."
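
Applied to the loop from the question (1000 calls of foo, at most 100 at a time), the wait -n approach could look roughly like the sketch below. It assumes bash 4.3 or newer. The body of foo and the results scratch file are hypothetical stand-ins, added only so that the number of completed calls can be counted.

```bash
#!/usr/bin/env bash
# Sketch: keep at most max_jobs calls of foo running (needs bash >= 4.3 for wait -n).

total=1000          # number of iterations, as in the question
max_jobs=100        # concurrency limit, as in the question
results=$(mktemp)   # hypothetical scratch file used to count completed calls

# Hypothetical stand-in for the asker's foo: it only records its argument.
foo() {
    echo "$1" >> "$results"
}

for ((i = 1; i <= total; i++)); do
    # Once max_jobs jobs have been started, wait for any one of them
    # to finish before launching the next.
    if ((i > max_jobs)); then
        wait -n
    fi
    foo "$i" &
done
wait   # let the remaining jobs finish

done_count=$(wc -l < "$results")
echo "completed: $done_count"
rm -f "$results"
```

Replace the body of foo with the real work. The final wait matters: without it the script can exit while the last jobs are still running.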

Testing

$ ./test_simult_jobs.sh 
cur_jobs_num    1
cur_jobs_num    2
cur_jobs_num    3
cur_jobs_num    3
cur_jobs_num    3
cur_jobs_num    3
cur_jobs_num    3
cur_jobs_num    3
cur_jobs_num    3
cur_jobs_num    3
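
The question also mentions the simpler alternative of running one batch of 100, waiting for the whole batch, and then starting the next. A minimal sketch of that, using plain wait (which works in any bash version), is below. As before, the body of foo and the results file are hypothetical placeholders. The trade-off is that each batch is only as fast as its slowest job, so fewer than 100 processes are running towards the end of each batch.

```bash
#!/usr/bin/env bash
# Sketch: run foo in batches of batch_size, waiting for each batch to finish.

total=1000          # number of iterations, as in the question
batch_size=100      # jobs per batch, as in the question
results=$(mktemp)   # hypothetical scratch file used to count completed calls

# Hypothetical stand-in for the asker's foo: it only records its argument.
foo() {
    echo "$1" >> "$results"
}

for ((i = 1; i <= total; i++)); do
    foo "$i" &
    # After every batch_size launches, block until the whole batch is done.
    if ((i % batch_size == 0)); then
        wait
    fi
done
wait   # catch a final partial batch if total is not a multiple of batch_size

done_count=$(wc -l < "$results")
echo "completed: $done_count"
rm -f "$results"
```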