linux - Bash: Loop through file and read substring as argument, execute multiple instances
How it is now
I have a script running under Windows that retrieves recursive file tree listings from a list of servers.
I use an AutoIt (job manager) script to execute 30 parallel instances of lftp (still on Windows), each doing this:
lftp -e "find .; exit" <serveraddr>
The file used as input for the job manager is a plain text file, with each line formatted like this:
<serveraddr>|...
where "..." is unimportant data. I need to run multiple instances of lftp in order to achieve maximum performance, because the performance of a single instance is determined by the response time of the server.
Each lftp.exe instance pipes its output to a file named
<serveraddr>.txt
How it needs to be
Now I need to port the whole thing to a Linux (Ubuntu, lftp installed) dedicated server. From my previous, very(!) limited experience with Linux, I guess this will be quite simple.
What do I need to write, and with what? For example, do I still need a job manager script, or can this be done in a single script? How do I read from the file (I guess this will be the easy part), and how do I keep a maximum of 30 instances running (maybe with a timeout, because extremely unresponsive servers can clog the queue)?
Thanks!
Parallel processing
I'd use GNU parallel. It isn't installed by default, but it can be installed from the default package repositories of most Linux distributions. It works like this:
parallel echo ::: arg1 arg2
will execute echo arg1 and echo arg2 in parallel.
So an easy approach is to create a script that synchronizes your server in bash/perl/python (whatever suits your fancy) and execute it like this:
parallel ./script ::: server1 server2
The script would look like this:
#!/bin/sh
# $0 holds the program name, $1 holds the first argument.
# $1 is passed in by GNU parallel; save it to a variable.
server="$1"
lftp -e "find .; exit" "$server" >"$server-files.txt"
lftp seems to be available for Linux as well, so you don't need to change the FTP client.
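The question also asks about a timeout for unresponsive servers, which the script above does not handle. One possible sketch, assuming the coreutils timeout command is available (it is on a stock Ubuntu), wraps the lftp call so that a hung server is killed after a fixed number of seconds. The helper name list_server and the 1-second limit are purely illustrative, and sleep stands in for a server that never answers so the example runs without lftp.

```shell
#!/bin/sh
# Hypothetical limit in seconds; pick a value that fits your servers.
limit=1

# list_server is an illustrative helper, not part of the original script.
list_server() {
    server="$1"
    # Real use would be:
    #   timeout "$limit" lftp -e "find .; exit" "$server" >"$server-files.txt"
    # Here sleep stands in for a server that never responds.
    timeout "$limit" sleep 10
}

list_server "slow.example.com"
status=$?

# timeout exits with status 124 when it had to kill the command.
echo "exit status: $status"
```

Checking for status 124 in the worker script lets you log which servers were skipped instead of silently producing an incomplete listing.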
To run at most 30 instances at a time, pass -j30 like this:
parallel -j30 echo ::: 1 2 3
Reading the file list
Now, how do you transform a specification file containing <server>|... entries into GNU parallel arguments? Easy: first, filter the file so that it contains just the host names:
sed 's/|.*$//' server-list.txt
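sed is used to replace things using regular expressions, and more. This command strips the first | and everything (.*) after it up to the line end ($). (While | normally means the alternation operator in regular expressions, in sed's default basic regular expression syntax it needs to be escaped to work like that; otherwise it just means a plain |.)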
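To see the effect of the sed expression, here is a small self-contained sketch. The two host names and the trailing data are made-up sample values in the <serveraddr>|... format; the first line deliberately contains two | characters to show that everything from the first one onward is removed.

```shell
#!/bin/sh
# Hypothetical sample input in the <serveraddr>|... format from the question.
sample='ftp.example.com|some|extra data
mirror.example.org|more data'

# Strip the first | and everything after it on each line.
hosts=$(printf '%s\n' "$sample" | sed 's/|.*$//')

# Prints one host name per line.
printf '%s\n' "$hosts"
```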
So now you have a list of servers. How do you pass them to your script? With xargs! xargs will append each line as an additional argument to your executable. For example
echo -e "1\n2"|xargs echo fixed_argument
will run
echo fixed_argument 1 2
So in your case you should do
sed 's/|.*$//' server-list.txt | xargs parallel -j30 ./script :::
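If GNU parallel cannot be installed, a rough alternative is to let xargs cap the number of concurrent processes itself with -P, which GNU and BSD xargs support although it is not part of POSIX. The sketch below is not the pipeline from the answer but a stand-in under stated assumptions: the sample server-list.txt contents are invented, and worker.sh is a hypothetical replacement for ./script that writes a placeholder line instead of calling lftp, so the example runs anywhere.

```shell
#!/bin/sh
# Work in a scratch directory so the example does not touch real files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input in the <serveraddr>|... format.
printf '%s\n' 'ftp.example.com|foo' 'mirror.example.org|bar' > server-list.txt

# Stand-in worker; in real use, replace the echo with the lftp call.
cat > worker.sh <<'EOF'
#!/bin/sh
server="$1"
echo "listing for $server" > "$server-files.txt"
EOF

# -n 1: one host per worker invocation; -P 30: at most 30 workers at once.
sed 's/|.*$//' server-list.txt | xargs -n 1 -P 30 sh ./worker.sh

ls "$workdir"
```

Each worker still writes to its own per-server file, so no locking is needed. With GNU parallel installed, the host list can also be piped straight into parallel -j30 ./script, since parallel reads arguments from standard input when no ::: is given.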
Caveats
Be sure not to save the results to the same file in each parallel task, otherwise the file will get corrupted: the coreutils are simple and don't implement any locking mechanisms unless you implement them yourself. That's why I redirected the output to $server-files.txt rather than files.txt.