Bash: Loop through file and read substring as argument, execute multiple instances


How it is now

I have a script running under Windows that retrieves recursive file trees from a list of servers.

I use an AutoIt (job manager) script to execute 30 parallel instances of lftp (still on Windows), each doing this:

lftp -e "find .; exit" <serveraddr> 

The file used as input for the job manager is a plain text file, with each line formatted like this:

<serveraddr>|... 

where "..." is unimportant data. I need to run multiple instances of lftp in order to achieve maximum performance, because the performance of a single instance is determined by the response time of the server.

Each lftp.exe instance pipes its output to a file named

<serveraddr>.txt 

How it needs to be

Now I need to port the whole thing to Linux (Ubuntu, with lftp installed) on a dedicated server. From my previous, very(!) limited experience with Linux, I guess this will be quite simple.

What do I need to write, and with what? For example, do I still need a job manager script, or can it be done in a single script? How do I read from the file (I guess this will be the easy part), and how do I keep a maximum of 30 instances running (maybe with a timeout, because extremely unresponsive servers can clog the queue)?

Thanks!

Parallel processing

I'd use GNU parallel. It isn't installed by default, but it can be installed on most Linux distributions from the default package repositories. It works like this:

parallel echo ::: arg1 arg2 

This will execute echo arg1 and echo arg2 in parallel.

So an easy approach is to create a script that synchronizes your server in bash/perl/python, whatever suits your fancy, and execute it like this:

parallel ./script ::: server1 server2

The script could look like this:

#!/bin/sh
# $0 holds the program name, $1 holds the first argument.
# $1 is passed in by GNU parallel. Save it to a variable.
server="$1"
lftp -e "find .; exit" "$server" >"$server-files.txt"

lftp seems to be available for Linux as well, so you don't need to change the FTP client.

To run at most 30 instances at a time, pass -j30 like this: parallel -j30 echo ::: 1 2 3

Reading the file list

Now how do you transform the specification file containing <server>|... entries into GNU parallel arguments? Easy. First, filter the file so that it contains just the host names:

sed 's/|.*$//' server-list.txt 

sed is used to replace things using regular expressions, and more. This strips everything (.*) from the first | up to the line end ($). (While | usually means the alternative operator in regular expressions, in sed it needs to be escaped to work like that; otherwise it just means a plain |.)

So now you have a list of servers. How do you pass them to your script? With xargs! xargs will put each line as if it were an additional argument to your executable. For example
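To see the filter in action without touching a real server list, here is a minimal sketch. The host names and the trailing data are made up for illustration; only the sed expression comes from the answer.

```shell
#!/bin/sh
# Sample input in the <serveraddr>|... format; the host names are made up.
input='ftp.example.com|some|extra data
192.0.2.10|more data
mirror.example.org|'

# Delete everything from the first "|" to the end of each line.
hosts=$(printf '%s\n' "$input" | sed 's/|.*$//')

# Print one host name per line.
printf '%s\n' "$hosts"
```

Note that the first sample line contains two | characters; because the match starts at the leftmost |, everything after the host name is removed.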

echo -e "1\n2"|xargs echo fixed_argument 

will run

echo fixed_argument 1 2 
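The difference between packing all items onto one command line and starting one command per item can be checked with a small sketch. The -n 1 option is not used in the answer; it is shown here only for contrast.

```shell
#!/bin/sh
# Default: xargs packs as many input items as fit onto one command line.
all_at_once=$(printf '1\n2\n' | xargs echo fixed_argument)

# With -n 1, xargs starts the command once per input item.
one_per_call=$(printf '1\n2\n' | xargs -n 1 echo fixed_argument)

printf '%s\n' "$all_at_once"
printf '%s\n' "$one_per_call"
```

The default packing behaviour is what allows xargs to append the whole server list after a trailing ::: in a parallel command line.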

So in your case you should do

sed 's/|.*$//' server-list.txt | xargs parallel -j30 ./script ::: 
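The question also asks about a timeout for unresponsive servers. One possible sketch, which swaps GNU parallel for xargs -P and wraps each job in timeout from GNU coreutils, is shown below. This is not the command from the answer: worker.sh, the example hosts, the sleep that simulates a hanging server, and the 1 second limit are all assumptions chosen so the sketch runs without lftp or a real server list.

```shell
#!/bin/sh
# Sketch: cap concurrency with xargs -P and bound each job with timeout(1).
# All names, hosts, and limits below are assumptions for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up server list in the <serveraddr>|... format.
printf '%s\n' 'a.example.com|x' 'b.example.com|y' 'slow.example.com|z' >server-list.txt

# Hypothetical worker script, one server per call.
cat >worker.sh <<'EOF'
#!/bin/sh
server="$1"
# Stand-in for: lftp -e "find .; exit" "$server"
# One host pretends to hang so the timeout has something to do.
if [ "$server" = "slow.example.com" ]; then sleep 10; fi
echo "listing for $server" >"$server-files.txt"
EOF

# -P 30: at most 30 workers at once; -n 1: one server per worker.
# timeout 1: give up on a worker after 1 second (use a larger value for real servers).
sed 's/|.*$//' server-list.txt |
    xargs -P 30 -n 1 timeout 1 sh ./worker.sh ||
    echo "at least one job failed or timed out"

ls
```

For real use, the stand-in lines in the worker would be replaced by the lftp call from the script above, and the limit raised to something a slow but healthy server can meet. If you stay with GNU parallel, its manual also documents a --timeout option, which may be the simpler route.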

Caveats

Be sure not to save the results to the same file in each parallel task, otherwise the file will get corrupted: coreutils are simple and don't implement any locking mechanisms unless you implement them yourself. That's why I redirected the output to $server-files.txt rather than files.txt.

