hpc - Detect errors with Torque and Grid Engine and prevent execution of dependent tasks
I have a shell script that queues multiple tasks for execution on an HPC cluster. The same job submission script works with either Torque or Grid Engine, with minor conditional logic. It is a pipeline: the output of earlier tasks is fed to later tasks for further processing. I'm using qsub to define job dependencies, so later tasks wait for earlier tasks to complete before starting execution. So far, so good.
Sometimes a task fails. When a failure happens, I don't want any of the dependent tasks to attempt processing the output of the failed task. However, the dependent tasks have already been queued for execution long before the failure occurred. Is there a way to prevent the unwanted processing?
You can use the afterok dependency argument. For example, the qsub command may look like:
qsub -W depend=afterok:<jobid> submit.pbs
Torque will start the next job only if <jobid> exits without errors. See the documentation on the Adaptive Computing page.
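As a rough sketch of how this could fit into a submission script, the helper below captures the job id that Torque's qsub prints on standard output and passes it to the next submission. The function name submit_after_ok and the script names stage1.pbs and stage2.pbs are made up for illustration; they are not from the original question.

```shell
#!/bin/sh
# Sketch only: chain Torque jobs so each one runs only if the previous succeeded.
# submit_after_ok, stage1.pbs and stage2.pbs are hypothetical names.

# submit_after_ok PREV_JOBID SCRIPT
# Submits SCRIPT with qsub. If PREV_JOBID is non-empty, the new job depends on
# that job finishing with exit status 0 (afterok). qsub prints the new job id,
# so this function's output is the id of the submitted job.
submit_after_ok() {
    prev_jobid=$1
    script=$2
    if [ -n "$prev_jobid" ]; then
        qsub -W depend=afterok:"$prev_jobid" "$script"
    else
        qsub "$script"
    fi
}

# Example use (left commented out so that sourcing this file submits nothing):
# first=$(submit_after_ok "" stage1.pbs) || exit 1
# second=$(submit_after_ok "$first" stage2.pbs) || exit 1
```

This covers only the Torque branch. Grid Engine's qsub expresses dependencies with a different option (-hold_jid), so that branch of the script would need its own handling.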