[gridengine users] Debugging crash when running program through GridEngine

Reuti reuti at staff.uni-marburg.de
Fri May 4 14:50:20 UTC 2018


> On 04.05.2018 at 16:10, Skylar Thompson <skylar2 at uw.edu> wrote:
> 
> Do you have any memory limits (in particular, h_vmem) imposed on your batch
> jobs?

The stack size setting is also worth checking. Certain applications need a fixed stack size of 8 MB to 128 MB rather than having it left unlimited. You can compare the limits for interactive access with those of a submitted job by running the following in both environments:

echo hard limits
ulimit -aH
echo soft limits
ulimit -aS
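
To make the comparison easier, the commands above could be wrapped in a small script that writes the limits to a file, so that an interactive run and a qsub run can be diffed afterwards. This is only a sketch: the names dump_limits and run_with_stack_kb and the output file name are made up for illustration, it assumes bash is the job shell, and it relies on the JOB_ID variable that Grid Engine sets inside a job. The second helper starts a command with a fixed soft stack limit, which is one way to test whether the stack size is the culprit.

```shell
#!/bin/bash
# Sketch only: helper names and the output file name are hypothetical.

# Print the hard and soft limits of the current shell.
dump_limits() {
    echo "hard limits"
    ulimit -aH
    echo "soft limits"
    ulimit -aS
}

# Run a command with the soft stack limit set to the given size in KB.
# The subshell keeps the changed limit from leaking into the caller.
run_with_stack_kb() {
    local kb="$1"
    shift
    ( ulimit -S -s "$kb" && exec "$@" )
}

# JOB_ID is set by Grid Engine inside a job; outside a job we fall
# back to the label "interactive".
outfile="limits.${HOSTNAME:-unknown}.${JOB_ID:-interactive}.txt"
dump_limits > "$outfile"
```

Saved as, say, limits.sh, it can be run once on the command line and once through qsub with the same options as in the original report (-cwd -V -j y); diffing the two resulting files should show any limit that differs. If the stack size turns out to differ, something like "run_with_stack_kb 8192 htseq-count" inside the job script would be a quick way to test a fixed 8 MB stack.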

-- Reuti


> 
> On Fri, May 04, 2018 at 01:45:24PM +0000, Simon Andrews wrote:
>> I've got a strange problem on our cluster where some Python programs are segfaulting when run through qsub, but work fine on the command line, or even when run remotely through SSH.
>> 
>> Really simple (hello world) programs work OK, but anything which does a significant number of imports seems to fail. So for example:
>> 
>> htseq-count
>> 
>> works locally, but
>> 
>> qsub -o test.log -cwd -V -j y -b y htseq-count
>> 
>> Produces a segfault in the executed program.
>> 
>> ssh compute-0-0 htseq-count
>> 
>> ..works fine (we're using ssh to launch jobs on our cluster)
>> 
>> Any suggestions for how to go about trying to track this down?
>> 
>> Thanks
>> 
>> Simon.
>> 
>> The Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT Registered Charity No. 1053902.
> 
>> _______________________________________________
>> users mailing list
>> users at gridengine.org
>> https://gridengine.org/mailman/listinfo/users
> 
> 
> -- 
> -- Skylar Thompson (skylar2 at u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine




