.TH swarm 1 "14 Feb 2013" "version 1.0"
.SH NAME
swarm - Tool for parallelising shell scripts
.SH SYNOPSIS
swarm [options] [file]
.SH DESCRIPTION
.B Swarm 
allows one to pass commands to simultaneously running instances of a shell such as
.BR bash "(1)."
.BR Swarm " can work with local shells and can also start shells on remote hosts."
.PP
The purpose of
.BR swarm " is to speed up the execution of shell scripts which involve a large number of time-consuming tasks, where the order in which the tasks are completed is unimportant."

.SH OPTIONS
.TP
.BI -n " processes"
Set the number of shells for
.B swarm
to spawn. 
By default, one shell is spawned for each CPU found by a call to 
.BR sysconf "(3)."
.TP
.BI -s " shell"
Set the type of shell which will be spawned. The default is
.BR bash "(1)."
.TP
.BI -c " command"
Treat command as a single line of input. This option is not very useful unless there is already an instance of
.B swarm
invoked with the
.B --daemon
option running in the current directory.
.TP
.BI -o " file"
.B Swarm
will send output to
.I file
instead of stdout.


.TP
.BI -l " [file][:level]"
Set the log file and optionally level.
If no file is given, 
.BR swarm " logs to stderr.

The levels are: 0 (errors), 1 (warnings), 2 (notices), 3 (info).
Each level will include messages of a lower level.

By default 
.BR swarm " logs to stderr, at level 2.

Any errors occurring in locally running shells will also be sent to the log file.

.BR Note: " If swarm runs as a daemon, stderr (and hence all log messages) is detached
.IR unless " the 
.BR -l " option is used.

.BR Note: " Remote shells will log messages and stderr to ~/.swarm.slave.log

.TP
.BI -p " port"
Sets the port number to use for connections to remote hosts. 

By default, swarm does not use a fixed port; the port numbers for connections are chosen dynamically. Initially swarm uses ssh (port 22) to start remote swarm slave instances, passing a port to connect to as an argument. Subsequent ports for shell processes are sent over this connection to the slave swarm instance, which connects to those ports and starts the shell processes.
.TP
.BI --daemon
Run
.B swarm
as a daemon in the current directory.
.B Swarm 
will background itself by using 
.BR fork "(2), and attach stdin to a " fifo "(7).
The daemon will continue to run regardless of any EOF sent to the input 
.BR fifo "(7) until it receives an EXIT directive."

Unless the
.BR -o " option is also used, the daemon's stdout will be redirected to /dev/null."

Unless the
.BR -l " option is also used, the daemon's stderr (and all log messages) will be redirected to /dev/null."

Subsequent instances of 
.BR swarm " invoked with the " -c " option or a script file in the same directory as the daemon will not start their own shells, and will send the command to the daemon's input
.BR fifo "(7). Certain directives will affect the behaviour of the 'wrapping' swarm. The daemon
.BR does " " not " send any output to the wrapping instance of " swarm ".

Subsequent instances of 
.BR swarm " invoked without the " -c " option or a script file in the same directory will report an error.

It is of course possible to write commands directly to the daemon's input fifo using 
.BR echo "(1) or any other program. Using 
.BR swarm " is preferred, because it will check input for directives that will affect its behaviour.

.BR Note: " The 'daemon' is not really an official Unix style daemon, but 'daemon' sounds much cooler than 'background process'.
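.PP
As a rough illustration of the daemon workflow described above, a shell script might drive a daemon as sketched below. The script convert.sh, the data files and results.txt are hypothetical names, and sending EXIT through a wrapping
.B swarm
is an assumption based on the description of the EXIT directive. All of the commands must be run in the same directory.
.PP
.RS
.nf
swarm --daemon -n 4 -o results.txt
swarm -c 'sh convert.sh a.dat'
swarm -c 'sh convert.sh b.dat'
swarm -c '#BARRIER BLOCK#'
swarm -c '#EXIT#'
.fi
.RE
.PP
The BARRIER BLOCK line makes the calling script wait until both conversions have finished before the daemon is told to quit.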
 

.SH ARGUMENTS
If the 
.BR -c " option was not supplied, a single argument remaining after option processing is assumed to name a file containing valid shell commands and 
.BR swarm " directives. If the 
.BR -c " option was also supplied, or there was more than a single additional argument, " swarm " will report an error and exit."

.SH INPUT
.B Swarm
treats each line of input as a separate command to send to a shell.
By default, each command is sent to the first available shell that isn't currently executing a command.
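.PP
For example, assuming a file called tasks.txt (a hypothetical name) with the contents below, each gzip line becomes one task and the first line is treated as a comment:
.PP
.RS
.nf
# compress three logs in no particular order
gzip first.log
gzip second.log
gzip third.log
.fi
.RE
.PP
Running the file with two local shells might then look like this:
.PP
.RS
.nf
swarm -n 2 tasks.txt
.fi
.RE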


.SH OUTPUT
.B Swarm
will output the results of completed shell commands in the order of completion.
.B Swarm
puts a prefix '%d: ' in front of the output to indicate the order in which commands were received.
.PP
It is also possible to direct the output of commands to specified files using the OUTPUT directive. This is probably more useful for scripting purposes.

.SH DIRECTIVES
A directive is any line that starts and ends with a '#' character.
.PP
If a line starts with a '#' but does not contain a second '#', it is treated as a comment. 
.PP
If a line contains characters 
.I after
the second '#' character, it is treated as a SHELL SPECIFIC COMMAND, 
.I not
a DIRECTIVE.
.PP
.B Swarm
understands several directives which can be used to control its behaviour at runtime. Any unrecognised directive is treated as a comment.

.TP
.B EXIT
Tell
.B swarm
to quit
.TP
.BI OUTPUT " [file]"
If
.I file
is supplied, the output of all commands received before the next OUTPUT directive will be sent to the file.
The file is created if it didn't exist, and overwritten if it did exist.

If 
.I file
contains a '%d' format string, the output of commands will be sent to individual files, where the filenames are created from a call to 
.B sprintf(3)
using the command's number as a format argument, and 
.I file
as the format string.

If 
.I file
is not supplied, the output of commands received before the next OUTPUT directive will not be saved to any file.

.TP 
.B BARRIER
.BR Swarm " will continue to read input and create tasks, but will not actually send any commands until all currently executing commands are finished.

.TP 
.B BARRIER BLOCK
This directive is intended for use with the 
.BR -c " option. If a shell script starts a swarm daemon, running swarm -c '#BARRIER BLOCK#' allows the script to block until the swarm daemon finishes all tasks."

.BR WARNING: " Sending BARRIER BLOCK to a swarm daemon 
.I without
using a wrapping instance of 
.BR swarm " will cause the swarm daemon to hang after it has completed all tasks.

For a non-daemon instance of swarm, this directive is essentially identical to BARRIER.

.TP
.BI ABSORB " host[:name] [processes]
.B Swarm 
will start and connect to shells on the remote host and name them accordingly. An exec'd 
.BR ssh "(1) is used to start the remote shells."

The default number of
.I processes
is equal to the number of CPUs on the remote host.
The default 
.I name
is the same as
.I host.

.BR Swarm " will start instances of " ssh "(1) and use remote port forwarding to secure the connections.

.TP
.BI "ABSORB UNSECURE" " host[:name] [processes]"
.B There are security issues associated with the use of this directive.

This directive is the same as ABSORB, except no 
.BR ssh "(1) instances will be spawned for remote port forwarding; all data will be sent unencrypted."

Obviously using unencrypted connections is faster, but dangerous.
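.PP
The directives above can be combined in a script file. The following is only a sketch: simulate.sh and the output file names are hypothetical, and it assumes that a directive's arguments are written between the two '#' characters, as in the BARRIER BLOCK example. It also assumes that the per-command output files are complete once the barrier is passed.
.PP
.RS
.nf
#OUTPUT run.%d.out#
sh simulate.sh 1
sh simulate.sh 2
sh simulate.sh 3
#BARRIER#
#OUTPUT summary.out#
cat run.*.out
#OUTPUT#
echo finished
.fi
.RE
.PP
Here each simulate.sh run writes to its own file, the BARRIER holds back the cat command until the runs have finished, and the final OUTPUT directive stops output from being saved to a file. No whitespace follows the closing '#' of any directive.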


.SH SHELL SPECIFIC COMMANDS
By default, any command is sent to the first available shell. 
.PP
A line containing two '#' characters followed by a command will be sent to any shells with names matching a POSIX regex between the '#' characters. 
.PP
To send the command only to the first available shell with name matching the regex, the regex should be followed with ' &'.
.PP
Shells are normally named according to the host on which they are running, and the order in which they were spawned. The format is 'host:X' where X is an integer greater than or equal to zero. Shells running locally are called 'local:X'. Using the ABSORB directive, it is possible to give remote shells a name that is not the same as the hostname of the remote host running the shell.
.PP
.TP 
To print the names of all shells, run:
#.*# echo $name
.TP
To print the name of the first available shell, run:
#.* &# echo $name

.TP
To run a command only on the shell called 'local:0':
#local:0# command
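.PP
Names given with ABSORB can be used in the same way. In the sketch below the host node1 and the script build.sh are hypothetical, and the remote shells are assumed to be named worker:0 and worker:1, following the 'host:X' format with the supplied name in place of the host:
.PP
.RS
.nf
#ABSORB node1:worker 2#
#worker:.*# uptime
#worker:.* &# sh build.sh
.fi
.RE
.PP
The second line runs uptime on both remote shells, while the third runs build.sh only on the first of them to become available.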

.SH EXITED SHELLS / SIGNAL HANDLING
.BR Swarm " detects when a locally running shell exits (using a handler for " SIGCHLD "). If the shell exits normally, regardless of error code,
.BR swarm " will create a new shell to replace it. If the shell exits because of a signal that caused it to terminate, 
.BR swarm " will send " SIGTERM " to itself, which will cause it (and, as a result, all other shells) to exit.

If a remote shell exits, for whatever reason, 
.BR swarm " will create a new shell to replace it. " Swarm " has no knowledge of and will not react to signals sent to remote shells on the remote host.

If
.BR swarm " itself exits for whatever reason, it will terminate all local and remote shells.

.SH SECURITY ISSUES
.B Never allow ssh access from across the public internet to a host with swarm installed.
.PP
When the ABSORB UNSECURE directive is read,
.B swarm
will use 
.BR ssh "(1)
to start slave processes on the remote host.
.PP
These will then open
.I unencrypted
connections to the master. Because commands are sent as plain text over the network,
the remote hosts become vulnerable to "man in the middle" attacks.

Never use ssh keys with empty passphrases, and restrict access to hosts with 
.BR swarm " installed."

.SH BUGS
.B Swarm
is recently developed and therefore probably has a lot of bugs, most likely related to buffering.
.PP
Do not place whitespace after a directive, or it will be treated as a shell specific command.
.PP
The stderr of remote shells will always be lost to the void.
.PP
Using things that aren't shells with the 
.BR -s " option won't work, because " swarm " automatically adds extra commands for the shells.
.PP
Writing three bell characters in a row will cause 
.BR swarm " to break, because it looks for three bell characters to signal the end of output from each command."
.PP
.BR Swarm " does try to be nice and tell shells to 'exit', but usually this doesn't work and it sends them a " kill "(2)."
.PP
There are probably bugs with parsing, like assuming you never use ':' as part of a filename or a name for a shell, etc.
Just don't do it and it won't break.
.PP
Report any other bugs to matches@ucc.asn.au
.SH AUTHOR
Sam Moore (matches@ucc.asn.au)
.SH SEE ALSO
.BR bash "(1), " ssh "(1)."