WEST Work Manager
Introduction
WWMGR is the parallel task distribution framework originally included as part of the WEMD source. It was extracted to permit independent development, and (more importantly) independent testing. A number of different schemes can be selected at run-time for distributing work across multiple cores/nodes, as follows:
Name | Implementation | Multi-Core | Multi-Node | Appropriate For
---|---|---|---|---
serial | None | No | No | Testing; minimizing overhead when dynamics is inexpensive
threads | Python “threading” module | Yes | No | Dynamics propagated by external executables; large amounts of data transferred per segment
processes | Python “multiprocessing” module | Yes | No | Dynamics propagated by Python routines; modest amounts of data transferred per segment
mpi | mpi4py, compiled and linked against the system MPI | Yes | Yes | Distributing calculations across multiple nodes. Start with this on your cluster of choice.
zmq | ZeroMQ messaging library (see below) | Yes | Yes | Distributing calculations across multiple nodes. Use this if MPI does not work properly on your cluster (particularly for spawning child processes).
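In the simplest cases, the scheme is chosen by setting the WM_WORK_MANAGER environment variable (described in the next section) before starting the run. A minimal sketch, using only variables documented below; the worker count is an arbitrary example value:

```bash
# Run entirely on the local machine with the multiprocessing-based work manager
export WM_WORK_MANAGER=processes
export WM_N_WORKERS=8      # example value: number of worker processes to start
w_run &
```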
Environment variables
For controlling task distribution
While the original WEMD work managers were controlled by command-line options and entries in wemd.cfg, the new work manager is controlled using command-line options or environment variables (much like OpenMP). These variables are as follows:
Variable | Applicable to | Default | Meaning
---|---|---|---
WM_WORK_MANAGER | (none) | processes | Use the given task distribution system: “serial”, “threads”, “processes”, or “zmq”
WM_N_WORKERS | threads, processes, zmq | number of cores in the machine | Use this number of workers. In the case of zmq, use this many workers on the current machine only (can be set independently on different nodes).
WM_ZMQ_MODE | zmq | server | Start as a server (“server”) or a client (“client”). Servers coordinate a given calculation, and clients execute tasks related to that calculation.
WM_ZMQ_TASK_TIMEOUT | zmq | 60 | Time (in seconds) after which a worker will be considered hung, terminated, and restarted. This must be increased for long-running dynamics segments. Set to zero to disable hang checks entirely.
WM_ZMQ_TASK_ENDPOINT | zmq | Random port | Master distributes tasks at this address
WM_ZMQ_RESULT_ENDPOINT | zmq | Random port | Master receives task results at this address
WM_ZMQ_ANNOUNCE_ENDPOINT | zmq | Random port | Master publishes announcements (such as “shut down now”) at this address
WM_ZMQ_SERVER_INFO | zmq | | A file describing the above endpoints can be found here (to ease cluster-wide startup)
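As an illustration, a zmq master (server) on a cluster head node might be configured along the following lines. This is a sketch using only the variables above; the worker count and timeout are placeholder values, not recommendations:

```bash
# Server-side zmq configuration (placeholder values)
export WM_WORK_MANAGER=zmq
export WM_ZMQ_MODE=server            # this process coordinates the calculation
export WM_N_WORKERS=16               # workers started on this node only
export WM_ZMQ_TASK_TIMEOUT=3600      # allow hour-long segments before a worker is declared hung
export WM_ZMQ_SERVER_INFO=$WEST_SIM_ROOT/wemd_server_info.json
w_run &
```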
For passing information to workers
One environment variable is made available by multi-process work managers (processes and ZMQ) to help clients configure themselves (e.g. select an appropriate GPU on a multi-GPU node):
Variable | Applicable to | Meaning
---|---|---
WM_PROCESS_INDEX | processes, zmq | Contains a zero-based integer identifying the process among the set of processes started on a given node.
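For example, a worker-side propagation script can use WM_PROCESS_INDEX to pin each worker to its own GPU. The sketch below assumes one GPU per worker and CUDA-based dynamics; CUDA_VISIBLE_DEVICES is a CUDA convention, not something set by the work manager:

```bash
# In the worker-side script that launches dynamics:
# give each worker process exclusive use of one GPU on a multi-GPU node
export CUDA_VISIBLE_DEVICES=$WM_PROCESS_INDEX
```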
The ZeroMQ work manager for clusters
The ZeroMQ (“zmq”) work manager can be used for both single-machine and cluster-wide communication. Communication occurs over sockets using the ZeroMQ messaging protocol. Within nodes, Unix sockets are used for efficient communication, while between nodes, TCP sockets are used. This also minimizes the number of open sockets on the master node.
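If the randomly chosen ports are unsuitable (for example, blocked by a firewall), the endpoints can be pinned explicitly before starting the master. The sketch below assumes the endpoint variables accept standard ZeroMQ address syntax; the host name and port numbers are placeholders:

```bash
# Pin the master's communication endpoints (placeholder host and ports)
export WM_ZMQ_TASK_ENDPOINT=tcp://headnode.cluster:23811      # task distribution
export WM_ZMQ_RESULT_ENDPOINT=tcp://headnode.cluster:23812    # task results
export WM_ZMQ_ANNOUNCE_ENDPOINT=tcp://headnode.cluster:23813  # announcements (e.g. shutdown)
```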
The quick and dirty guide to using this on a cluster is as follows:
```bash
source env.sh
export WM_WORK_MANAGER=zmq
export WM_ZMQ_COMM_MODE=tcp
export WM_ZMQ_SERVER_INFO=$WEST_SIM_ROOT/wemd_server_info.json
w_run &

# manually run w_run on each client node, as appropriate for your batch system
# e.g. qrsh -inherit for Grid Engine, or maybe just simple SSH
for host in $(cat $TMPDIR/machines | sort | uniq); do
    qrsh -inherit -V $host $PWD/node-ltc1.sh &
done
```
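The per-node client script (node-ltc1.sh above) is site-specific, but in essence it points each client at the server and starts workers there. A hypothetical sketch:

```bash
#!/bin/bash
# Hypothetical client-node script; details will vary by site and batch system
cd /path/to/simulation                # placeholder for the simulation root directory
source env.sh
export WM_WORK_MANAGER=zmq
export WM_ZMQ_MODE=client             # execute tasks only; do not coordinate
export WM_ZMQ_SERVER_INFO=$WEST_SIM_ROOT/wemd_server_info.json   # endpoints published by the server
w_run
```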