alex.components.nlg.tectotpl.tool package

Submodules

alex.components.nlg.tectotpl.tool.cluster module

class alex.components.nlg.tectotpl.tool.cluster.Job(code=None, header=u'#!/usr/bin/env python\n# coding=utf8\nfrom __future__ import unicode_literals\n', name=None, work_dir=None, dependencies=None)[source]

Bases: object

This represents a piece of code as a job on the cluster, holds information about the job and is able to retrieve job metadata.

The most important method is submit(), which submits the given piece of code to the cluster.

Important attributes (some may be set in the constructor or at job submission, but all may be set between construction and launch):

name – job name on the cluster (and the name of the created Python script; a default is generated if not set)
code – the Python code to be run (needs to have imports and sys.path set properly)
header – the header of the created Python script (may contain imports etc.)
memory – the amount of memory to reserve for this job on the cluster
cores – the number of cores needed for this job
work_dir – the working directory where the job script will be created and run (will be created on launch)
dependencies – list of Jobs this job depends on (must be submitted before submitting this job)

In addition, the following values may be queried for each job at runtime or later:

submitted – True if the job has been submitted to the cluster
state – current job state (‘qw’ = queued, ‘r’ = running, ‘f’ = finished; only if the job was submitted)
host – the machine where the job is running (short name)
jobid – the numeric id of the job in the cluster (NB: type is string!)
report – job report using the qacct command (a dictionary, available only after the job has finished)
exit_status – numeric job exit status (if the job is finished)
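The attributes above suggest the following usage pattern. Since the real class submits via qsub and needs a live SGE cluster, the sketch below uses a hypothetical stand-in class (`FakeJob` is not part of the library) that only records the calls, to illustrate how dependencies constrain submission order:

```python
# Hypothetical stand-in mimicking the documented Job interface, so the
# intended call pattern can be shown without an actual cluster. The real
# class generates a script and submits it via qsub; this one does not.
class FakeJob(object):

    def __init__(self, code=None, name=None, work_dir=None, dependencies=None):
        self.code = code
        self.name = name
        self.work_dir = work_dir
        self.dependencies = list(dependencies or [])
        self.submitted = False
        self.state = None

    def add_dependency(self, dependency):
        # The documented method accepts "the given Job(s)", i.e. one or many
        if isinstance(dependency, list):
            self.dependencies.extend(dependency)
        else:
            self.dependencies.append(dependency)

    def submit(self, memory=None, cores=None, work_dir=None):
        # Per the docs, all dependencies must already be submitted
        if not all(dep.submitted for dep in self.dependencies):
            raise RuntimeError('Dependencies must be submitted first')
        self.submitted = True
        self.state = 'qw'  # queued, per the documented state codes


# Build jobs, wire the dependency, then submit in dependency order
prepare = FakeJob(code="print('preparing data')", name='prepare')
train = FakeJob(code="print('training')", name='train')
train.add_dependency(prepare)

prepare.submit(memory=4, cores=1)
train.submit()
```

Submitting `train` before `prepare` would raise, mirroring the documented requirement that all jobs a job depends on are submitted first.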

DEFAULT_CORES = 1
DEFAULT_HEADER = u'#!/usr/bin/env python\n# coding=utf8\nfrom __future__ import unicode_literals\n'
DEFAULT_MEMORY = 4
DIR_PREFIX = u'_clrun-'
FINISH = u'f'
NAME_PREFIX = u'pyjob_'
QSUB_MEMORY_CMD = u'-hard -l mem_free={0} -l act_mem_free={0} -l h_vmem={0}'
QSUB_MULTICORE_CMD = u'-pe smp {0}'
TIME_POLL_DELAY = 60
TIME_QUERY_DELAY = 1
add_dependency(dependency)[source]

Adds a dependency on the given Job(s).

exit_status

Retrieve the exit status of the job via the qacct report. Throws an exception if the job is still running and the exit status is not known.

get_script_text()[source]

Join the header and the code to create a complete Python script.
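A plausible sketch of what this join amounts to (the real implementation may differ in details): the header, whose default is the DEFAULT_HEADER constant above, is prepended to the job code so the result is a runnable script.

```python
# The documented default header: shebang, source encoding, future import
DEFAULT_HEADER = ('#!/usr/bin/env python\n'
                  '# coding=utf8\n'
                  'from __future__ import unicode_literals\n')

def build_script_text(header, code):
    """Hypothetical helper: concatenate header and code into one script,
    making sure the result ends with a newline."""
    if not code.endswith('\n'):
        code += '\n'
    return header + code

script = build_script_text(DEFAULT_HEADER, "print('hello from the cluster')")
```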

host

Retrieve information about the host this job is/was running on.

jobid

Return the job id.

name

Return the job name.

remove_dependency(dependency)[source]

Removes the given Job(s) from the dependencies list.

report

Access to the qacct report. Please note that running the qacct command takes a few seconds, so the first access to the report is rather slow.

state

Retrieve information about current job state. Will also retrieve the host this job is running on and store it in the __host variable, if applicable.

submit(memory=None, cores=None, work_dir=None)[source]

Submit the job to the cluster. Override the pre-set memory and cores defaults if necessary. The job code, header and working directory must be set in advance. All jobs on which this job is dependent must already be submitted!

wait(poll_delay=None)[source]

Waits for the job to finish. Will raise an exception if the job did not finish successfully. The poll_delay variable controls how often the job state is checked.
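The wait described above is, in essence, a polling loop: re-check the job state every poll_delay seconds (the documented default is TIME_POLL_DELAY, 60 s) until it reaches the finished state ('f', cf. the FINISH constant), then check the exit status. A sketch under those assumptions, with the state and exit-status lookups passed in as callables so it runs without a cluster:

```python
import time

FINISH = 'f'          # the documented finished-state code
TIME_POLL_DELAY = 60  # the documented default polling delay in seconds

def wait_for(get_state, get_exit_status, poll_delay=None):
    """Hypothetical sketch of Job.wait(): poll until the job reaches the
    finished state, then raise if it did not finish successfully."""
    delay = poll_delay if poll_delay is not None else TIME_POLL_DELAY
    while get_state() != FINISH:
        time.sleep(delay)
    if get_exit_status() != 0:
        raise RuntimeError('Job did not finish successfully')

# Simulated job: queued, then running, then finished with exit status 0
states = iter(['qw', 'r', 'f'])
wait_for(lambda: next(states), lambda: 0, poll_delay=0)
```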

Module contents